<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by M. Rake Linggar A. on Medium]]></title>
        <description><![CDATA[Stories by M. Rake Linggar A. on Medium]]></description>
        <link>https://medium.com/@mrakelinggar?source=rss-6ad2c1beb234------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*trzx89jCgxwQZp-wIrwN_A.jpeg</url>
            <title>Stories by M. Rake Linggar A. on Medium</title>
            <link>https://medium.com/@mrakelinggar?source=rss-6ad2c1beb234------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Wed, 03 Jun 2026 09:20:53 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@mrakelinggar/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[The Best Two Tips on Professional Networking]]></title>
            <link>https://medium.com/practice-in-public/the-best-two-tips-on-professional-networking-5234141c73ce?source=rss-6ad2c1beb234------2</link>
            <guid isPermaLink="false">https://medium.com/p/5234141c73ce</guid>
            <category><![CDATA[leadership]]></category>
            <category><![CDATA[growth]]></category>
            <category><![CDATA[careers]]></category>
            <category><![CDATA[networking]]></category>
            <category><![CDATA[professional-development]]></category>
            <dc:creator><![CDATA[M. Rake Linggar A.]]></dc:creator>
            <pubDate>Wed, 11 Dec 2024 19:21:34 GMT</pubDate>
            <atom:updated>2024-12-11T19:21:34.431Z</atom:updated>
            <content:encoded><![CDATA[<h3>My Best Two Tips on Professional Networking</h3><h4>Probably the only tips one needs for their career</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*8JqkppC4CeBt0v-7" /><figcaption>Photo by <a href="https://unsplash.com/@priscilladupreez?utm_source=medium&amp;utm_medium=referral">Priscilla Du Preez 🇨🇦</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>Back in college, I was captivated by programming and machine learning. Coding late into the night and poring over tech blog tutorials became my world. Social events and networking felt like distractions from mastering my craft. But as my career progressed, I realized that technical skills alone wouldn’t get me where I wanted to go. Networking, meaning growing and nurturing relationships, proved to be just as vital for career success.</p><p>It’s not just “who you know” but “who knows you”, and the depth of that connection, that truly matters.</p><p>Don’t get me wrong, “what you know” is also important. It is the foundation everyone needs before they start networking. You don’t want the CEO of the company to know you only as someone who can’t do their job.</p><p>Once you have the skills to excel at your job, <em>who you know</em> will start to feel more and more important. You need to know people and listen to their insights to expand your understanding. You need to know who can grant you access to certain data or information within an organization, and who has the authority and needs to be convinced of your solution. Otherwise, it will be difficult to use <em>what you know</em> to make an impact.</p><p>Then, to move up in your career, <em>who knows you</em> will play a serious role.</p><p>In my previous company (as is the case in many others), promotions are decided at an annual meeting where a group of senior leaders votes on who gets promoted. 
If senior leaders don’t know you well enough, particularly your contributions, your chances of getting a promotion and a raise are pretty slim, as no one can advocate for you in these meetings. I was fortunate to have worked with several directors, including the managing director, when I was in consulting. This made my promotion to Senior and Principal Data Scientist much easier.</p><p>Getting a new job is somewhat similar. When I moved from consulting to Indonesia’s top private telecommunication company, I had already worked on a couple of projects with them, so HR and the VP could easily see the value I could bring to the table.</p><blockquote>But how does one grow and nurture their network to gain better opportunities?</blockquote><p>I have read several leadership books, watched career videos on YouTube, and asked several professionals with more experience than me about this. As of this post, I can summarize what I learned into two key points on how to grow and develop a professional network.</p><h3>To Grow Your Network: Just Go Out and Be Nice to People</h3><h4>If you can even help them, great!</h4><p>That is it.</p><p>Just go out and be kind.</p><p>Get to know people. Be curious about their stories. Everyone has a story to tell.</p><p>And just like that, you’ve started to become part of a new network.</p><p>One key trick that many people don’t state explicitly (at least when I asked them for networking advice) is: DON’T EXPECT ANYTHING IMMEDIATELY!</p><p>Over the years, several colleagues and friends have shared their struggles to grow their network with me. Some were looking for a new job. Some wanted to grow their business. Many of them shared one trait: they weren’t comfortable networking just for the sake of getting something out of it.</p><p>Networking is a long-term game.</p><p>Expecting immediate returns often leads to disappointment, especially in today’s competitive job market. 
Building connections and trust takes time; focus on forming genuine connections rather than asking for favors right away. Imagine you’re at a party, and you’ve just met someone for the first time. You barely know their name, and suddenly you ask them to lend you their car for a week. It’s awkward, right? Just like you wouldn’t ask a stranger for such a big favor without building some rapport first, it’s equally strange to ask someone you’ve just met for a job or referral.</p><p>Building a relationship takes time and trust, much like getting to know someone before asking to borrow their car.</p><p>You should just go out, meet new people, get to know each other, exchange stories, and that is it. Don’t ask for anything other than a contact detail such as a business number, work e-mail, or LinkedIn profile (and even these should be obtained with consent).</p><p>That said, suppose you’re at a networking event and you recently read in the news about a company facing some challenges. Someone senior from that company is also attending. If you are in a relevant role or company that is capable of helping, you can bring that up in your discussion and hint at a potential partnership. This is one of the key ideas behind attending networking events.</p><h3>To Nurture Your Network: Stay Connected, and Exchange Stories or Insights of Value</h3><p>How you nurture your network depends on where you stand with each contact.</p><p>Nurturing your network is easier when you’re in the same company. Collaborate well on projects or, at the very least, greet colleagues warmly when you cross paths, regardless of their role.</p><p>If one of you has moved on to another opportunity, you can keep in contact by asking how they are doing, congratulating them on their latest professional achievements or milestones, or wishing them a Happy New Year. When I share my career advancements, some network contacts congratulate me, and others even suggest a coffee catch-up. 
Other times, I share news or updates from the industry (tech and artificial intelligence) and ask for their opinions.</p><p>The point here is to keep the relationship alive by maintaining communication. You never know what sort of collaboration you can have with your network in the future, and how you can help each other grow.</p><p>A few years back, I teamed up with a senior colleague from my first company in a national machine learning hackathon on Kaggle. We managed to finish in the top ranks. This experience became a good story for me to tell during an interview for a project in the US (yes, I got the project). Recently, I also reached out to an old work friend from Indonesia who is now working in the US. I told him I was in the city and asked if we could meet up. No project came out of this encounter, but it was a fun and insightful reunion. We exchanged experiences and learned from each other.</p><h3>Stay Connected</h3><p>Growing and nurturing your network is an ongoing process that requires genuine effort and patience. It also requires practice. By staying connected, exchanging valuable insights, and building relationships based on trust and mutual respect, you can create a strong and supportive professional network.</p><p>Remember, networking is not about immediate gains but about fostering long-term connections that can lead to meaningful opportunities. So, go out, be kind, and take the time to get to know people. 
Your future self will thank you for the relationships you build today.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=5234141c73ce" width="1" height="1" alt=""><hr><p><a href="https://medium.com/practice-in-public/the-best-two-tips-on-professional-networking-5234141c73ce">The Best Two Tips on Professional Networking</a> was originally published in <a href="https://medium.com/practice-in-public">Practice in Public</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[From Big Chunks to Tiny Pieces: Microservices for Beginners]]></title>
            <link>https://mrakelinggar.medium.com/from-big-chunks-to-tiny-pieces-microservices-for-beginners-38649337fc67?source=rss-6ad2c1beb234------2</link>
            <guid isPermaLink="false">https://medium.com/p/38649337fc67</guid>
            <category><![CDATA[software-architecture]]></category>
            <category><![CDATA[microservices]]></category>
            <category><![CDATA[distributed-systems]]></category>
            <category><![CDATA[scalability]]></category>
            <category><![CDATA[software-development]]></category>
            <dc:creator><![CDATA[M. Rake Linggar A.]]></dc:creator>
            <pubDate>Sun, 24 Mar 2024 10:38:37 GMT</pubDate>
            <atom:updated>2024-03-24T10:38:37.917Z</atom:updated>
            <content:encoded><![CDATA[<h4>Breaking down the tech jargon</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*i2R2r9jdS_NBoQGQ" /><figcaption>Photo by <a href="https://unsplash.com/@theshubhamdhage?utm_source=medium&amp;utm_medium=referral">Shubham Dhage</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>If you are a new software developer or engineer, or just someone interested in learning about software architectures, you have probably heard of microservices at least once. This article aims to shed some light on the subject.</p><h3>What are Microservices?</h3><p>Microservices is an architectural style for developing software applications as a collection of small, independent, and loosely coupled functions, or services. Each service in a microservices architecture fulfills one specific function and works together with other services to fulfill the overall software’s requirements.</p><h3>What existed before Microservices?</h3><p>Before microservices, IT organizations typically developed and ran applications using what is called a monolithic architecture. In a monolithic architecture, the entire software application is developed as a single, tightly integrated unit. All the components of the application, such as the user interface, business logic, and data storage, are packaged together into a single codebase and deployed as one unit.</p><p>This posed several problems:</p><ul><li><strong>Tight Integration</strong>: Components and functions are tightly coupled within the codebase. Changes to one part of the application may require changing several other parts and redeploying the entire application. 
Failure in one part of the application can impact neighboring parts.</li><li><strong>Scalability Challenges</strong>: Since the entire application is tightly coupled, scaling the application or introducing new features and changes can be very inefficient and resource-intensive.</li><li><strong>Development and Deployment Challenges</strong>: Monolithic applications are harder to manage, more so as they grow in size and complexity. Developers will likely face difficulties coordinating changes, and deploying updates may require more effort than it appears.</li></ul><p>These problems are costly enough that, according to a survey conducted by <a href="https://www.statista.com/statistics/1236823/microservices-usage-per-organization-size/">Statista</a> in 2023, 81.5% of companies were already using microservices in 2021, and a further 17.5% of businesses planned to switch to this architecture type.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1000/0*jeMzq4pv-SxeYTDL.jpg" /><figcaption>Microservices implementation statistics (<a href="https://codeit.us/blog/benefits-of-microservices#microservices-implementation-statistics">Source</a>)</figcaption></figure><h3>What are the characteristics of Microservices?</h3><p>Several key characteristics of microservices are:</p><ul><li><strong>Loose Coupling</strong>: Microservices are loosely coupled, meaning they interact with each other through well-defined interfaces without relying on the internal implementation details of other services. This reduces dependencies and promotes flexibility in making changes or updates.</li><li><strong>Scalability</strong>: Microservices architecture enables horizontal scaling, where individual services can be scaled independently based on demand. This scalability is essential for handling varying workloads and ensuring optimal performance.</li><li><strong>Resilience</strong>: Failure in one microservice does not necessarily impact the entire application. 
Services are designed to be resilient, with built-in fault tolerance and mechanisms for handling errors gracefully. This reduces the risk of a single point of failure.</li><li><strong>Technology Diversity</strong>: Different services within a microservices architecture can be implemented using different technologies, programming languages, and frameworks. This allows teams to choose the most appropriate tools for each specific service based on its requirements.</li></ul><p>Overall, microservices have become increasingly popular due to their ability to address the challenges of developing large, monolithic applications in rapidly changing environments.</p><h3>Examples of Microservices</h3><p>Let’s look at an example.</p><p>In a simple web application, a single piece of functionality like login would generally be handled by a single microservice: take the user’s username, email, or mobile number and password, check them against the database, and either deny or grant access.</p><p>However, with increasing complexity, such as demands for security and third-party integrations (for example, Google and Facebook logins), the login functionality becomes bigger and harder to manage. With the microservices architecture, we can break down the now huge login functionality into several more specific ones. In other words, instead of having one big microservice, it’s a good idea to break it into multiple smaller, easier-to-manage microservices.</p><p>Here are some example microservices that handle specific functions within the broader login functionality:</p><ol><li><strong>Authentication Service</strong>: One microservice might be responsible for authenticating users based on their credentials (e.g., username/password or OAuth tokens). 
This service verifies the user’s identity and returns an authentication token or session identifier.</li><li><strong>Authorization Service</strong>: Another microservice might handle authorization, determining what actions or resources a user is allowed to access after they have been authenticated. This service checks the permissions associated with the user’s account and the requested resources.</li><li><strong>User Management Service</strong>: Separate microservice(s) might handle user management tasks, such as user registration, profile updates, or password resets. While not directly related to the authentication process, this service provides functionality that interacts with user accounts.</li><li><strong>Session Management Service</strong>: In systems requiring session management, there might be a microservice dedicated to managing user sessions. This service handles tasks like session creation, expiration, and invalidation.</li><li><strong>Single Sign-On (SSO) Service</strong>: In environments where multiple applications or services need to share authentication credentials, a dedicated microservice might handle single sign-on functionality. 
This service validates authentication requests and issues tokens that can be used across multiple applications.</li><li><strong>Identity Provider Integration</strong>: If the application integrates with external identity providers (e.g., social media logins, third-party authentication services), there might be microservices responsible for interfacing with these providers to handle authentication requests.</li></ol><h3><strong>How Microservices are Implemented</strong></h3><p>Microservices can be implemented in various ways depending on the organization’s technology landscape and business needs:</p><ol><li><strong>Containerization</strong>: Microservices can be packaged and deployed as lightweight, portable containers using containerization technologies such as Docker.</li><li><strong>Orchestration</strong>: Orchestration tools like Kubernetes or Docker Swarm can manage the deployment, scaling, and lifecycle of microservices deployed in containers.</li><li><strong>Serverless Computing</strong>: Microservices can be implemented using event-driven architectures like serverless functions, where code is executed in response to events triggered by external sources.</li><li><strong>RESTful APIs</strong>: Microservices can communicate with each other via RESTful APIs (Representational State Transfer), using HTTP requests and responses to exchange data.</li><li><strong>gRPC (Remote Procedure Call)</strong>: Services define their interfaces using Protocol Buffers, and communication between them, including across multiple servers, is performed via remote procedure calls.</li><li><strong>Message Brokers</strong>: Microservices can communicate asynchronously through message brokers like Apache Kafka, RabbitMQ, or Amazon SQS.</li><li><strong>Service Mesh</strong>: Service meshes, such as Istio or Linkerd, manage communication between microservices by injecting a sidecar proxy alongside each service instance.</li><li><strong>GraphQL</strong>: Microservices can expose GraphQL APIs, 
allowing clients to query and mutate data using a single endpoint and flexible queries.</li><li><strong>Domain-Driven Design (DDD)</strong>: Microservices architecture can be aligned with domain-driven design principles, where services are organized around specific business domains or subdomains.</li><li><strong>Polyglot Persistence</strong>: Microservices can use different databases or storage solutions optimized for specific data models or access patterns, following the polyglot persistence approach.</li><li><strong>Continuous Integration/Continuous Deployment (CI/CD)</strong>: Microservices can be integrated and deployed using CI/CD pipelines, where automated tests are run and new versions are deployed to production rapidly and frequently.</li></ol><p>Several companies have published how they incorporated microservices into their business, unlocking scalability and growth. In the case of Netflix, you can read about it <a href="https://netflixtechblog.com/tagged/microservices">here</a>.</p><h3>Challenges</h3><p>Microservices architecture is a paradigm shift in software development, offering a modular, scalable, and resilient approach to building complex applications. By breaking down applications into smaller independent components, it supports iterative delivery and easier scalability. However, there is no such thing as a free lunch. Microservices have their own challenges, such as maintaining distributed systems, ensuring reliable communication between services, and implementing effective monitoring and observability.</p><h3>Closing</h3><p>If you found this post insightful and want to continue the conversation, feel free to connect with me on LinkedIn. I’d love to hear your thoughts and discuss further. :)</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=38649337fc67" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Monitoring Model Performance]]></title>
            <link>https://medium.com/data-science/monitoring-model-performance-51635c044f52?source=rss-6ad2c1beb234------2</link>
            <guid isPermaLink="false">https://medium.com/p/51635c044f52</guid>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[data-science]]></category>
            <category><![CDATA[business-intelligence]]></category>
            <category><![CDATA[statistics]]></category>
            <dc:creator><![CDATA[M. Rake Linggar A.]]></dc:creator>
            <pubDate>Mon, 12 Sep 2022 14:07:33 GMT</pubDate>
            <atom:updated>2022-09-12T14:28:18.924Z</atom:updated>
            <content:encoded><![CDATA[<h4>Is your model continuously performing as expected?</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*IEmiB1reCIl6ftkM" /><figcaption>Photo by <a href="https://unsplash.com/@ibrahimboran?utm_source=medium&amp;utm_medium=referral">Ibrahim Boran</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><h3>Here’s the story</h3><p>So you’ve built and deployed your model, be it using simple logistic regression, SVM, random forest, or the infamous deep learning.</p><p>The business users are also excited to see its impact, whether that is keeping customers around with the new personalized and targeted campaigns, increasing transaction volumes and sales through up-sell/cross-sell, or hitting whatever KPIs you promised.</p><p>Congrats.</p><p>For the first couple of months, everything was going great.</p><p>Then you check the company dashboard/reports and see that KPIs are suddenly going back to their pre-model state, or perhaps worse. Stakeholders are bombarding you with notifications demanding answers. They are questioning your model’s performance.</p><h3>What happened?</h3><p>The most common explanations are:</p><ol><li>Your model overfitted. Perhaps it did not take into consideration some key factors, like seasonality. Perhaps you did not properly sample the data.</li><li>Late data issue. Perhaps a load balancer malfunctioned and the system did not update the data for a whole day. So either the latest reports are not accurate or the model made its inference on incomplete data. Simply do a count(*) on the tables in question and escalate it to the IT/data engineering team. Or, it could be…</li><li>The data itself has shifted.</li></ol><h3>What does it mean for the data to be shifted?</h3><p>It means the data has changed so fundamentally that the model you’ve built can no longer represent the current situation of the business. 
The cause can be internal or external. In other words, the data on which the model was trained is no longer relevant, and therefore the model is outdated.</p><p>This is very likely to happen to businesses everywhere. Take a look at COVID. Remember how much change it brought? Or the current hot news: inflation. Both cause customers and businesses to change the way they behave and work in a significant way.</p><p>Or let’s take a simpler, more common, and less apocalyptic example. Say you are working in a big telecommunications company. One of your competitors is offering a huge discount on their prepaid packages with generous benefits, something no other company has ever done before. It turns out your market loves this so much that customers decide to abandon you for your competitor. It’s not you, it’s them.</p><p>All of these are external factors. What about internal ones?</p><p>Well, these are definitely your company’s doing. A change in policy or management. The business is growing, or losing more money than it is making overall, so there are new products or cutbacks. Perhaps they soft-launched a new product variation that is very different from existing ones. Or your star retail employees of 10 years have left, retired, or finally used all their holiday allowance at once, and customers are not loving the replacements’ service.</p><p>Can this be avoided? Yes, if your business captures all of these events in your data storage. But as you will understand if you’ve worked for any company, that is extremely difficult and costly to do. So you’ve just got to make do with what you have.</p><p>This is why it is very important to have a model monitoring practice in place BEFORE the results are sent out. Every time the model runs inference on the latest data, you need to look out for these data shifts before the results are given to business users. If all is good, then the results can be sent out. 
If not, then depending on the severity, you can either fix it quickly or raise awareness that something has changed, together with measurable proof.</p><h3>How can you send measurable proof that the data has changed?</h3><p>There are three easy ways to do it:</p><ul><li>Descriptive Statistics</li></ul><p>A simple time series report can easily tell you if there is a shift in your data. For instance, a dip in a simple month-over-month (MoM) revenue trend is a clear indicator that your business is not doing so well overall. If the business keeps doing worse every month, it’s a matter of time until the sales data no longer looks like the data the model was trained on.</p><ul><li>Population Stability Index (PSI)</li></ul><p>This index measures how much of the population in the model’s output has shifted between classes/groups. You need to compare the latest data to a period when the model performed well, e.g. its training data, or the month after that (if the model also performed well then).</p><p>Let’s say the model produces N classes/groups/categories. It could also be a binary classification like churn, in which case you can, for example, take the customers’ churn probabilities and bin them into N equal or unequal groups. For instance, 0–10% as group 1, 11–20% as group 2, etc. It could also be 0–50% as group 1, 51–60% as group 2, and so on. The important thing is consistency throughout the entire process. Determining these bins can require some business acumen as well, as different bins applied to the same data and model can change the monitoring metrics significantly.</p><p>Simply count how many cases/customers fall within each group for both the training data and the latest data. Convert each count to a percentage of its dataset’s total. 
Then, for each group, multiply the difference between the training percentage (DT) and the latest percentage (DL) by the natural log of DT/DL.</p><p>This is an example of the calculations; you can try to recreate the formula in Excel.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/924/1*-nDiNpqwsufPn6fnpKKR8g.png" /><figcaption>Source: Image from Author</figcaption></figure><p>The rule of PSI is:</p><ul><li>At least one bin &gt;20%: a data shift has definitely occurred. Retrain the model. If not, then</li><li>At least one bin is 10–20%: a slight change is required. This would likely cause a small drop in the model’s performance. If not, then</li><li>Less than 10%: no significant data shift. Carry on.</li></ul><p>We can see the above example already has two groups whose PSI is &gt;20%. Therefore we need to investigate what happened to our customers’ behavior and retrain the model.</p><p>Bear in mind that these thresholds are not fixed. They depend on how much change you and the business are willing to tolerate. For instance, in bin 3 you can clearly see there is a huge difference in absolute numbers but the PSI is small.</p><ul><li>Characteristic Stability Index (CSI)</li></ul><p>If PSI tells you whether there is a data shift in the population, CSI tells you which features contributed to it. The calculation is exactly the same as for PSI. The only difference is that we drill down into the problematic bins (in the example above, groups 5, 8, and 9) and group them further based on the features.</p><p>Let’s say you have 10 features. Age. Monthly expenses. Outstanding bills. Whatever the case may be. 
Group each feature into bins accordingly and do the same calculations as you did for PSI.</p><p>The example below shows the CSI of 2 features for bin 8.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/924/1*sUMt20RgEkUuVBbQ3cvKFQ.png" /><figcaption>Source: Image from Author</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/924/1*5fvRcfaLp2fbtGLvWJrqyg.png" /><figcaption>Source: Image from Author</figcaption></figure><p>From the above example, we can speculate that for bin 8, the number of young customers has risen and the number of older ones has declined, so much so that it has caused a shift in our data for this bin. Monthly expenses for this group have also dropped significantly.</p><h3>Key Takeaways</h3><p>Descriptive statistics, PSI, and CSI are very simple and quite effective metrics for monitoring your model’s performance. But one thing is better than these metrics at determining whether the data has shifted: business and market understanding that is regularly updated. Always stay up to date on your business strategies, market, and customers’ needs.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=51635c044f52" width="1" height="1" alt=""><hr><p><a href="https://medium.com/data-science/monitoring-model-performance-51635c044f52">Monitoring Model Performance</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Set up configs like memory limits in Docker for Windows and WSL2]]></title>
            <link>https://mrakelinggar.medium.com/set-up-configs-like-memory-limits-in-docker-for-windows-and-wsl2-80689997309c?source=rss-6ad2c1beb234------2</link>
            <guid isPermaLink="false">https://medium.com/p/80689997309c</guid>
            <category><![CDATA[data-engineering]]></category>
            <category><![CDATA[docker]]></category>
            <category><![CDATA[windows]]></category>
            <category><![CDATA[docker-compose]]></category>
            <dc:creator><![CDATA[M. Rake Linggar A.]]></dc:creator>
            <pubDate>Wed, 19 Jan 2022 01:58:17 GMT</pubDate>
            <atom:updated>2022-01-19T01:58:17.051Z</atom:updated>
            <content:encoded><![CDATA[<h3>Set up config like memory limits in Docker for Windows and WSL2</h3><h4>Configuring Docker is different on Windows than it is on Mac and Linux</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*stjto-KdXBi80ZNw" /><figcaption>Photo by <a href="https://unsplash.com/@carrier_lost?utm_source=medium&amp;utm_medium=referral">Ian Taylor</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>If you are starting to learn about setting up a system environment using Docker, then you definitely want to make sure it uses the right amount of memory and processing power.</p><p>After installing Docker on Windows, you might think that changing these settings would be as easy as changing configurations in VirtualBox (via the UI).</p><p>Well, not quite.</p><p>Unlike on Linux and Mac, we cannot directly change this configuration via Docker’s UI.</p><p>The latest release of Docker for Windows requires the Windows Subsystem for Linux version 2 (WSL2), which lets you configure global options that are used by all WSL2 Linux distributions installed in Windows 10.</p><p>So changing these settings for Docker means creating a <strong>.wslconfig</strong> file. Note that there are two types of config files that we can create. The <strong>.wslconfig</strong> file lets us create a global configuration. The other one, <strong>wsl.conf</strong>, lets us configure settings per distribution for the Linux distros on our machine. You can learn more about it <a href="https://docs.microsoft.com/en-us/windows/wsl/wsl-config#configure-global-options-with-wslconfig">here</a>.</p><ol><li>Make sure you shut down Docker and any other instances of WSL2. 
You can do this by opening a command prompt and running the command wsl --shutdown.</li><li>To create a .wslconfig file, simply open your File Explorer, then type <strong>%UserProfile%</strong> in the address bar and press Enter to go to your profile directory in Windows.</li><li>Neither Docker nor WSL2 creates these config files by default, so we have to do it ourselves. Create a new file called .wslconfig (make sure there is no .txt extension at the end).</li></ol><figure><img alt="" src="https://cdn-images-1.medium.com/max/422/1*fu4PovZNWtF4ecobr_e64Q.png" /><figcaption>Image by Author</figcaption></figure><p>4. Add all or some of the following settings to that file to configure Docker</p><pre># Settings apply across all Linux distros running on WSL 2<br>[wsl2]</pre><pre># Limits VM memory to use no more than 4 GB, this can be set as whole numbers using GB or MB<br>memory=4GB</pre><pre># Sets the VM to use two virtual processors<br>processors=2</pre><pre># Specify a custom Linux kernel to use with your installed distros. 
The default kernel used can be found at <a href="https://github.com/microsoft/WSL2-Linux-Kernel">https://github.com/microsoft/WSL2-Linux-Kernel</a><br>kernel=C:\\temp\\myCustomKernel</pre><pre># Sets additional kernel parameters, in this case enabling older Linux base images such as CentOS 6<br>kernelCommandLine = vsyscall=emulate</pre><pre># Sets amount of swap storage space to 8GB, default is 25% of available RAM<br>swap=8GB</pre><pre># Sets swapfile path location, default is %USERPROFILE%\AppData\Local\Temp\swap.vhdx<br>swapfile=C:\\temp\\wsl-swap.vhdx</pre><pre># Disable page reporting so WSL retains all allocated memory claimed from Windows and releases none back when free<br>pageReporting=false</pre><pre># Turn on default connection to bind WSL 2 localhost to Windows localhost<br>localhostforwarding=true</pre><pre># Disables nested virtualization<br>nestedVirtualization=false</pre><pre># Turns on output console showing contents of dmesg when opening a WSL 2 distro for debugging<br>debugConsole=true</pre><p>You can configure it however you want depending on your system’s needs and available resources.</p><p>Once that is done, you can start Docker again and continue development.</p><p>I hope this is helpful in your journey to learn about Docker. Let me know if you have questions by replying to this post.</p><h3>Sources</h3><ol><li><a href="https://docs.microsoft.com/en-us/windows/wsl/wsl-config#configure-global-options-with-wslconfig">https://docs.microsoft.com/en-us/windows/wsl/wsl-config#configure-global-options-with-wslconfig</a></li><li><a href="https://stackoverflow.com/questions/44533319/how-to-assign-more-memory-to-docker-container">https://stackoverflow.com/questions/44533319/how-to-assign-more-memory-to-docker-container</a></li></ol><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=80689997309c" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Tagging a Location to a Shapefile Area using Geopandas]]></title>
            <link>https://medium.com/data-science/tagging-a-location-to-a-shapefile-area-using-geopandas-5d74336128bf?source=rss-6ad2c1beb234------2</link>
            <guid isPermaLink="false">https://medium.com/p/5d74336128bf</guid>
            <category><![CDATA[python]]></category>
            <category><![CDATA[pandas]]></category>
            <category><![CDATA[geospatial]]></category>
            <category><![CDATA[programming]]></category>
            <dc:creator><![CDATA[M. Rake Linggar A.]]></dc:creator>
            <pubDate>Fri, 13 Aug 2021 14:06:42 GMT</pubDate>
            <atom:updated>2021-08-13T14:06:42.803Z</atom:updated>
            <content:encoded><![CDATA[<h4>Another geospatial use case made easy by GeoPandas</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*5Ayc368ba7jlolxu" /><figcaption>Photo by <a href="https://unsplash.com/@marjan_blan?utm_source=medium&amp;utm_medium=referral">Marjan Blan | @marjanblan</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>I previously wrote about a simple way to plot maps using GeoPandas and Matplotlib.</p><p><a href="https://towardsdatascience.com/a-beginners-guide-to-create-a-cloropleth-map-in-python-using-geopandas-and-matplotlib-9cc4175ab630">A Beginners Guide to Create a Cloropleth Map in Python using GeoPandas and Matplotlib</a></p><p>Another very common geospatial analytics use case is tagging a specific location to a predetermined area or region. For example, let’s say you are an analyst for a brick-and-mortar store that wants to map the amount of competition in its key areas. These key areas can be either business-defined areas or existing administrative regions (like a province, district, etc.).</p><p>Your team has already collected a list of competitor addresses and their lat-long coordinates, either purchased from a third-party company, gathered through an arduous Google Search exercise, or both. So now you are tasked with mapping these addresses’ latitude and longitude to, let’s say, a custom city name. Your company already has the shapefile containing the strategically drawn borders. All you need to do is tag the city name to the competitors’ locations.</p><p>You don’t need any fancy algorithm; the GeoPandas package has got your back!</p><p>For this exercise, we will use the Indonesian province shapefile, which you can get <a href="https://github.com/mrakelinggar/data-stuffs/tree/master/cloropleth_python">here</a>, and some random Indonesian malls’ addresses that I gathered. 
You can get both, along with the complete source code, at my GitHub <a href="https://github.com/mrakelinggar/data-stuffs/tree/master/latlong_in_shp">repo</a>.</p><p>The first step is to load our required packages, then our reference shapefile and the list of addresses that we would like to tag.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/c2abb2efa0cba20e5e3c094834c675a0/href">https://medium.com/media/c2abb2efa0cba20e5e3c094834c675a0/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*w_ovC0_1lUp9QWvc5IpVMQ.png" /><figcaption>Indonesian Province reference shapefile — Source from Author</figcaption></figure><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/a7900ce274ae0972ff70aee1a1514e23/href">https://medium.com/media/a7900ce274ae0972ff70aee1a1514e23/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/723/1*ZfKA_FfBZr4hPSb444JnAg.png" /><figcaption>Addresses Coordinates to be Tagged — Source from Author</figcaption></figure><p>Now, GeoPandas is actually a very powerful geospatial package that makes this task easy. It already provides us with the within method, so we only need a few short lines of code!</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/e8a48e63a512952ee54aa86fcaa9e4ae/href">https://medium.com/media/e8a48e63a512952ee54aa86fcaa9e4ae/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/843/1*wfkaNAEAx8T9-X8lwcwKKg.png" /><figcaption>Address Tagging Result — Source from Author</figcaption></figure><p>The within method checks whether our addresses’ latitude and longitude points fall inside a shape from our reference shapefile. Note that we need to check it with one province reference at a time. 
Otherwise, we would not be able to tag the addresses properly (since two addresses can be in two different provinces, we would have difficulty tagging them correctly if we checked against all references at once).</p><p>And that’s it! I hope this tutorial is of use in your geospatial analysis. For the full source code and data, you can check them out on my <a href="https://github.com/mrakelinggar/data-stuffs/tree/master/latlong_in_shp">GitHub repo</a>.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=5d74336128bf" width="1" height="1" alt=""><hr><p><a href="https://medium.com/data-science/tagging-a-location-to-a-shapefile-area-using-geopandas-5d74336128bf">Tagging a Location to a Shapefile Area using Geopandas</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Switch from Anaconda to Miniconda for your Data project environment]]></title>
            <link>https://medium.com/data-science/switch-from-anaconda-to-miniconda-for-your-data-project-environment-8786c9e2dc95?source=rss-6ad2c1beb234------2</link>
            <guid isPermaLink="false">https://medium.com/p/8786c9e2dc95</guid>
            <category><![CDATA[data]]></category>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[platform]]></category>
            <category><![CDATA[tools]]></category>
            <dc:creator><![CDATA[M. Rake Linggar A.]]></dc:creator>
            <pubDate>Fri, 09 Apr 2021 15:30:49 GMT</pubDate>
            <atom:updated>2021-04-09T15:30:49.178Z</atom:updated>
            <content:encoded><![CDATA[<h4>Opinion</h4><h4>Mini can get the job done, even better than the big snake</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*m0L-KVieNlhY-506" /><figcaption>Photo by <a href="https://unsplash.com/@tannerboriack?utm_source=medium&amp;utm_medium=referral">Tanner Boriack</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>When I first started my career as a data scientist, one of the tools people kept recommending to me was Anaconda. Even the official docs say so.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/818/1*yvLA7OLSg-4qmqiKTSrQ7w.png" /><figcaption>Why the Anaconda docs say you should choose it — Screenshot of <a href="https://conda.io/projects/conda/en/latest/user-guide/install/download.html#anaconda-or-miniconda">docs</a> by Author</figcaption></figure><p>For a while, I also thought Anaconda was cool. I mean, look at the first page of Anaconda Navigator that you see once you open the tool.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*1AhYUYPMIvbck17t.png" /><figcaption>Anaconda Navigator — Image from <a href="https://docs.anaconda.com/anaconda/navigator/">Official Docs</a></figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*-F4DtPvoTuFSuBJB" /><figcaption>Photo by <a href="https://unsplash.com/@nci?utm_source=medium&amp;utm_medium=referral">National Cancer Institute</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>When I first saw that landing page, during my first tenure as a data scientist, I felt the way a kid imagines a real scientist at work, in front of complex and awesome tools.</p><p>All I needed then to complete the set was a pair of glasses.</p><p>At the time, I also understood why Anaconda was so widely recommended for data scientists. 
All the tools, including extra packages, code editors, and viz tools, were already provided.</p><h3><strong>One or Two Years later…</strong></h3><p>After working on dozens of analytics use cases, each with its unique challenges and requirements, I have grown wearier and wearier of Anaconda.</p><p>In summary, it became one bloated, useless app on my computer.</p><ol><li>The thousands of packages are a total storage and memory eater. And even worse, useless. All those preinstalled packages really weighed my computer’s performance down. They took up a few GB of storage, which can make a pretty big difference, especially on a 128GB MacBook Pro. Even after running conda clean --all, it still took up quite a lot of space. I did not use many of the packages that came with Anaconda, like pomegranate, proj4, pyopengl, and so many more. I’m not sure what they are all for.</li><li>Managing the Python packages became a very slow process. Even updating a single package with conda update [some package] felt like it dragged on for way too long. I have tried everything from using only one environment (<em>not really recommended, by the way</em>) to using different virtual environments with specific packages for different use cases (one for data exploration, one for numerical analysis and modeling, and another one for image processing). Both were still slow.</li><li>The Navigator became obsolete. I can just open my preferred code editor either from the Start menu/Applications folder or via command prompt/Terminal. In fact, opening the app through Navigator is way slower. Updating and managing packages is also more convenient via command prompt. Making new virtual environments? Open command prompt/Terminal and type in the conda command. Why bother with the Environments page in Anaconda Navigator? Even more so, why bother with Navigator at all?</li><li>It made my job a lot harder. 
Once the deployment schedule arrives for some of my clients/users, I have to create a virtual environment with the same required packages and config to ensure the model(s) can run smoothly on their production server(s). For the exact same reasons that I have just listed above, there was no way I was going with Anaconda (then, or likely ever).</li></ol><h3>So what’s the alternative…</h3><p>Miniconda!</p><p>Basically, it is just the conda package management system + Python + its base packages.</p><p>That’s it.</p><ul><li>No extra (useless!) tools and installations.</li><li>Need some packages for a specific use case/project? You can still create virtual environments and install just the packages you need there.</li><li>Need some code editors installed? Just download them directly from the official website. They are managed directly by the developers, and are just as good as (or even better than) the ones installed via Anaconda Navigator.</li></ul><p>With Miniconda, I actually felt like I was building lean data science and analytics projects.</p><h3>Conclusions</h3><p>I am not saying Anaconda is an outdated tool that no data scientist/analyst should ever use. It still has its potential. There is even an <a href="https://www.anaconda.com/products/enterprise">Enterprise version</a> of it. I think first-time data scientists/analysts could benefit from starting their career/training with Anaconda, for example to learn what an optimal virtual environment for their data projects looks like. They might even have some uses for those 1,500 extra packages. Who knows.</p><p>I am saying that if you have a firm grasp of what you want to build or use with Python (e.g. 
you want to be a time series expert, deep learning engineer, or a data-driven marketing specialist), Miniconda is an efficient and recommended tool to use.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=8786c9e2dc95" width="1" height="1" alt=""><hr><p><a href="https://medium.com/data-science/switch-from-anaconda-to-miniconda-for-your-data-project-environment-8786c9e2dc95">Switch from Anaconda to Miniconda for your Data project environment</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Finding Most Common Colors in Python]]></title>
            <link>https://medium.com/data-science/finding-most-common-colors-in-python-47ea0767a06a?source=rss-6ad2c1beb234------2</link>
            <guid isPermaLink="false">https://medium.com/p/47ea0767a06a</guid>
            <category><![CDATA[image-processing]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[computer-vision]]></category>
            <category><![CDATA[opencv]]></category>
            <category><![CDATA[machine-learning]]></category>
            <dc:creator><![CDATA[M. Rake Linggar A.]]></dc:creator>
            <pubDate>Fri, 09 Oct 2020 03:10:43 GMT</pubDate>
            <atom:updated>2020-10-09T03:10:43.595Z</atom:updated>
            <content:encoded><![CDATA[<h3>Finding the Most Common Colors in Python</h3><h4>A standard, but crucial, functionality in image processing tasks</h4><p>There are several use cases in image processing that can be solved if we know what the most common color(s) of an image or object are. For example, in the field of agriculture, we might want to determine the maturity of a fruit, an orange or strawberry for instance. We can simply check if the color of the fruit falls in a predetermined range and see if it is mature, rotten, or too young.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*px1mjBN-Ked4n-1V" /><figcaption>Photo by <a href="https://unsplash.com/@sarahjgualtieri?utm_source=medium&amp;utm_medium=referral">Sarah Gualtieri</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>As usual, we can solve this case using Python plus simple yet powerful libraries like NumPy, Matplotlib, and OpenCV. I will demonstrate several ways to find the most frequent color in an image using these packages.</p><h4>Step 1 — Load Packages</h4><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/1f1cd1ec64ae46d231d8a9de42da6bed/href">https://medium.com/media/1f1cd1ec64ae46d231d8a9de42da6bed/href</a></iframe><p>We’ll load the basic packages here. We’ll load some more packages as we go along. Also, since we are programming in Jupyter, let’s not forget to include the %matplotlib inline command.</p><h4>Step 2 — Load and show sample images</h4><p>In this tutorial, we will be showing two images side by side a lot. 
So, let’s make a helper function to do so.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/76315ca735877196370517888220b25b/href">https://medium.com/media/76315ca735877196370517888220b25b/href</a></iframe><p>Next, we’ll load some sample images that we’ll be using in this tutorial and show them using the function above.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/44c6ba40546533daf5155c78c6c65038/href">https://medium.com/media/44c6ba40546533daf5155c78c6c65038/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/714/1*zBJIhb2dGTZVA6WMX8MhyA.png" /><figcaption>Source: Images by Author</figcaption></figure><p>Now we are ready. Time to find out the most common color(s) in these images.</p><h4>Method 1 — Average</h4><p>The first method is the easiest (but least effective): simply find the average pixel values.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/291045f7d4d0b32acea16e95d776fb25/href">https://medium.com/media/291045f7d4d0b32acea16e95d776fb25/href</a></iframe><p>Using numpy&#39;s average function, we can easily get the average pixel value across the image’s height and width by passing axis=(0,1).</p><figure><img alt="Most common color #1 — average method" src="https://cdn-images-1.medium.com/max/714/1*vBT608ysePkOmeEvUnidEQ.png" /><figcaption>Most common color #1 — average method</figcaption></figure><p>We can see that the average method can give misleading or inaccurate results, as the most common colors it gave are a bit off. This is because the average takes all pixel values into consideration. This is really problematic when we have images with high contrast (both “light” and “dark” in one image). 
This is much clearer in the second image.</p><blockquote>It gave us a somewhat new color that is not visibly clear/noticeable in the image.</blockquote><h4>Method 2 — Highest Pixel Frequency</h4><p>The second method will be a bit more accurate than the first one. We’ll simply count the number of occurrences of each pixel value.</p><p>Fortunately for us, numpy again gives us a function that produces this exact result. But first, we must reshape the image data structure into a list of pixels with 3 values each (one for each R, G, and B channel intensity).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/714/1*hCNrQGhEXipO7k_b0DKkrg.png" /></figure><p>We can simply use numpy’s reshape function to get the list of pixel values.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/714/1*LJtQo0fvzc5HVuM1Y0Gc-A.png" /></figure><p>Now that we have the data in the right structure, we can start counting the frequency of the pixel values. We can just use numpy&#39;s unique function, with the parameter return_counts=True.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/714/1*aG862Pvng1syu1RFTMI8Uw.png" /></figure><p>Done, let’s run it on our images.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/0eb130db43b0096193e7c9fbd226926e/href">https://medium.com/media/0eb130db43b0096193e7c9fbd226926e/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/723/1*dK7b_wYbmJ2-kUXzAeg09Q.png" /><figcaption>Most common color #2 — frequency method</figcaption></figure><p>This makes more sense than the first one, right? The most common colors are in the black area. But we can go further. What if we take not just the single most common color, but several? Using the same concept, we can take the top N most common colors. 
Except, if you look at the first image, many of the colors with the highest frequencies would most likely be neighboring colors, probably differing by only a few intensity values.</p><blockquote>In other words, we want to take the most common, different <strong>color clusters</strong>.</blockquote><h4>Method 3 — Using K-Means clustering</h4><p>The scikit-learn package comes to the rescue. We can use the famous K-Means clustering to cluster groups of colors together.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/723/1*IY9azogc-j302cqYUh343Q.png" /></figure><p>Easy, right? Now, all we need is a function to display the clusters of colors above.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/7c39c05e5496e04c59d7c4ea35422939/href">https://medium.com/media/7c39c05e5496e04c59d7c4ea35422939/href</a></iframe><p>We simply create an image with a height of 50 and a width of 300 pixels to display the color groups/palette. Then we assign each color cluster to a section of our palette.</p><figure><img alt="Most common colors #3 — K-means clustering" src="https://cdn-images-1.medium.com/max/723/1*NIt9jA-Zs0Plo53tVpDy6w.png" /><figcaption>Most common colors #3 — K-means clustering</figcaption></figure><p>Beautiful, isn’t it? K-Means clustering gives great results in terms of the most common colors in the images. In the second image, we can see that there are too many shades of brown in the palette. This is most likely because we picked too many clusters. Let’s see if we can fix it by choosing a smaller value of <em>k</em>.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/3ec210ff50d3dbee8edbf84646853062/href">https://medium.com/media/3ec210ff50d3dbee8edbf84646853062/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/723/1*zbESJqoWlAf_PuopNeJc1A.png" /></figure><p>Yep, that solved it. 
Since we use K-Means clustering, we still have to determine the appropriate number of clusters ourselves. Three clusters seem to be a good choice.</p><p>But we can still improve upon these results and, at the same time, address the question of how many clusters to use.</p><blockquote>How about we also show the proportion of the clusters towards the whole image?</blockquote><h4>Method 3.1 — K-Means + Proportion display</h4><p>All we need to do is modify our palette function. Instead of using fixed steps, we change the width of each cluster to be proportional to how many pixels are in that cluster.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/dde9f8025c6487bb7f377d3d0188eaf3/href">https://medium.com/media/dde9f8025c6487bb7f377d3d0188eaf3/href</a></iframe><figure><img alt="Most common colors #3.1 — K-means clustering + proportions" src="https://cdn-images-1.medium.com/max/630/1*9S_7r23Z3Vd2zNeYOV4s5Q.png" /><figcaption>Most common colors #3.1 — K-means clustering + proportions</figcaption></figure><p>Much better.</p><p>Not only does it give us the most common colors in the images, it also gives us the proportion of pixels that belong to each color cluster.</p><p>It also helps answer how many clusters we should use. In the case of the top image, two to four clusters seem reasonable. In the case of the second image, it looks like we need at least two clusters. The reason we don’t use one cluster (<em>k=1</em>) is that we’ll run into the same problem as the average method.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/728/1*T1g9qhqU-ImCb0x-cwus2g.png" /><figcaption>K-Means with k=1 result</figcaption></figure><h3>Conclusion</h3><p>We have covered several techniques to get the most common colors in images using Python and several well-known libraries. Plus we’ve also seen the advantages and disadvantages of those techniques. 
So far, finding the most common colors using K-Means with <em>k &gt; 1</em> is one of the best solutions to finding the most frequent colors in images (at least compared to the other methods we’ve gone through).</p><p>Let me know if you have problems with the script in the comments, or in my <a href="https://github.com/mrakelinggar/data-stuffs/tree/master/frequent_color">Github</a>.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=47ea0767a06a" width="1" height="1" alt=""><hr><p><a href="https://medium.com/data-science/finding-most-common-colors-in-python-47ea0767a06a">Finding Most Common Colors in Python</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[My Experience as a Bertelsmann Tech and Deep Learning Nanodegree Graduate]]></title>
            <link>https://medium.com/data-science/my-experience-as-a-bertelsmann-tech-and-deep-learning-nanodegree-graduate-459ab27db477?source=rss-6ad2c1beb234------2</link>
            <guid isPermaLink="false">https://medium.com/p/459ab27db477</guid>
            <category><![CDATA[udacity-nanodegree]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[deep-learning]]></category>
            <category><![CDATA[bertelsmann]]></category>
            <category><![CDATA[udacity]]></category>
            <dc:creator><![CDATA[M. Rake Linggar A.]]></dc:creator>
            <pubDate>Sat, 19 Sep 2020 17:36:03 GMT</pubDate>
            <atom:updated>2020-09-19T17:36:35.042Z</atom:updated>
            <content:encoded><![CDATA[<h4>As hard as it is to believe, 2020 is almost at an end. This is one of the amazing experiences I had in it.</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*IPNvWBtK0p96PHMR" /><figcaption>Photo by <a href="https://unsplash.com/@belchev?utm_source=medium&amp;utm_medium=referral">Dimitar Belchev</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>One of the responsible things to do when a year is ending is to reflect on it: what accomplishments you made, what challenges you faced, what you learned, and how you can make the remainder of the year count.</p><p>One experience that I can definitely share, and hopefully it will be beneficial to readers, is being awarded the 2019 Bertelsmann Tech Scholarship and receiving the Deep Learning Nanodegree from Udacity, completely free of charge. And this year, Bertelsmann Tech is opening <a href="https://www.udacity.com/bertelsmann-tech-scholarships">another</a> scholarship application, which you should definitely try if you have a passion for data and cloud tech.</p><p>Many people have asked online what it was like to apply for the Bertelsmann Tech scholarship, win it, and complete the Nanodegree from Udacity. Also, what benefit can one receive from achieving all that? So hopefully this article can help answer some of those questions that you might have.</p><p>Let’s begin!</p><h3>1. What exactly is the Bertelsmann Tech scholarship</h3><p>Well, Bertelsmann is a media, services, and education company that operates in about 50 countries around the world. Their CEO, Thomas Rabe, said that their mission was to equip individuals across the world with the digital skills that are increasingly in demand. 
They work together with Udacity to provide free nanodegree courses that would otherwise be expensive to pay for out of pocket.</p><p>In 2019, they offered nanodegrees in Data Analysis, AI/Deep Learning, and Data Science. This year, they are offering three new programs. I’ve included the link below if you want to find out more.</p><figure><img alt="2020 Bertelsmann Tech Scholarship programs" src="https://cdn-images-1.medium.com/max/1024/1*4dY7jlHKg2aQsO6NVTOnvw.png" /><figcaption>2020 Bertelsmann Tech Scholarship programs. <a href="https://www.udacity.com/bertelsmann-tech-scholarships">Source</a></figcaption></figure><h3>2. Applying for the Scholarship</h3><p>Last year’s application required applicants to write short essays about (1) why you should receive the scholarship, and (2) what you plan to do with the skills you acquire, or how you would benefit from the scholarship. We were also asked about our confidence in our Python skills.</p><p>This year though, the application is a lot simpler. You only need to answer a few questions about yourself and your current skills in Python and SQL.</p><h3>3. How the selection is done</h3><p>There are two main stages that you have to pass in order to receive the full scholarship.</p><p>The first one is the application stage. Thousands and thousands of people around the world apply, and only about 10k-15k are accepted into the three programs. That means about 3–5k applicants per ND program pass the application process.</p><p>The second stage is called Phase 1. Phase 1 is where you are given the basic or fundamental courses of the ND program that you applied for, to see which candidates can actually finish the entire course. The duration is about 3 months, the same as last year. Of the 3–5k applicants, only about 300–500 (10%) are awarded the full scholarship, i.e. 
Phase 2, where the awardees begin the full ND program and finish the remaining courses.</p><p>That’s it. After that, it is up to the scholars to finish the ND in 6 months.</p><p>Below is the timeline for this year’s scholarship.</p><figure><img alt="2020 Bertelsmann Tech scholarship timeline" src="https://cdn-images-1.medium.com/max/1024/1*POBxOdYW40Jb9Ki3mAi5BA.png" /><figcaption>2020 Bertelsmann Tech scholarship timeline. <a href="https://www.udacity.com/bertelsmann-tech-scholarships">Source</a></figcaption></figure><p>Do not be fooled: these time windows may seem long, but participating in Phase 1 and Phase 2 is not as easy as it seems, especially during the Covid-19 pandemic. It took me longer than I expected to finish the course thanks to Covid-19 and the “new normal” everyone had to adapt to.</p><h3>4. My Experience</h3><p>Now, we get to the fun part. I will share with you my story from applying all the way to graduating.</p><h4>4.1 Application process</h4><p>In 2019, I chose the Deep Learning program because I believed the resources (learning materials, forums, tips and tricks, best practices, etc.) for Deep Learning were harder to find than those for the other two if I had chosen to learn by myself through other online courses or websites.</p><p>I wrote the essays and submitted my application. One or two months later, I received this.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/667/1*hpZs8VnpEOJR4kDcr6JYXw.png" /></figure><p>I was ecstatic that my application stood out and I got accepted into Phase 1.</p><p>The key was that I had to really know what I wanted out of the program and why I (and also Bertelsmann) would benefit from my receiving (and their giving) the scholarship.</p><blockquote><em>So, if there is another scholarship with a similar application process, the key here is that </em>you really, really have to sell it.</blockquote><p>But this year is a bit simpler, and therefore in my opinion, harder. Why? 
Well, with no essays in the application and only a few checkboxes, it can be difficult to set yourself apart from the competition.</p><p>But that shouldn’t stop anyone from trying, am I right? :)</p><h4>4.2 Phase 1</h4><p>This is where the fun of the Bertelsmann Tech scholarship begins.</p><p>The students were gathered into three global Slack groups, one for each ND program they applied for. Each student was given access to the basic or foundational courses for their ND program.</p><p>The instructions for this phase, and for getting to Phase 2, were simple and straightforward.</p><blockquote>We had to pass the fundamental courses in time AND actively participate in the Slack groups.</blockquote><p>Sounds simple, right? Complete the course and participate in the Slack group for 3–4 months. Easy, right?</p><p>Not so fast.</p><p>Because it was “easy” and “simple”, everyone could do it, which meant the competition was fierce and it was easy to slip up. The most common slip-ups people made were:</p><ol><li>They finished the course so fast that they got bored and no longer participated in the group</li><li>They finished the course in time and participated in the group, but not enough</li></ol><p>Mistake #1 was likely due to boredom, urgent matters in the office, etc. The point is they did not consistently participate in the Slack group. Mistake #2 is the most dangerous. Many people did finish their course in time and did participate in the Slack groups, but they got drowned out by those who were way more active. Some of them even complained to Udacity, protesting the selection outcomes for their friends. 
These highly active people usually participated in study group sessions (we called them Study Jams) and posted in the Slack groups daily or a few days a week: useful things, answers to other students’ questions, motivational quotes, or even a funny meme to relax.</p><blockquote>The point is, the ones who passed on to Phase 2 were those who “madly” wanted these scholarships AND showed it. In ALL 3–4 MONTHS of Phase 1.</blockquote><p>What did I do in Phase 1? I gathered with my fellow Indonesian scholars after office hours, worked through the course together with them, organized and participated in regular meetings, and kept working on it. I even published one of the projects I made with the skills I got in Phase 1.</p><p><a href="https://towardsdatascience.com/chinese-zodiac-sign-classification-challenge-with-pytorch-d89a8897d00b">Chinese Zodiac Sign Classification Challenge with Pytorch</a></p><p>With all that hard work, those 3–4 months felt way longer. But it was worth it, especially with this e-mail.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/631/1*XH_g5P56M_tkpylKfnBArA.png" /></figure><h4>4.3 Phase 2</h4><p>Now the real hard work begins.</p><p>There were 5 main courses and 5 projects that we had to submit before the 17th of Sept, plus one extracurricular for anyone feeling extremely motivated. Working on these 5 projects was not the biggest challenge. It was Covid-19.</p><p>A huge wrench was thrown into the world’s machinery, and everyone around the globe had to adapt to a “new normal”, including in Indonesia. Working from the office from 9 AM to 6 PM turned into working from home with longer, random hours. Sometimes it was still the same; other times you had to work from the moment you opened your eyes until 10 PM. It was really unpredictable. I couldn’t plan my time properly. I had to prioritize my work first, of course. 
After all, many people were being laid off because of the pandemic.</p><p>I had no choice but to work on my nanodegree late at night and full time on weekends. It was exhausting.</p><p>Nearly six months passed. In the end, with the help of prayer, coffee, Google, and Udacity’s mentors and student hub, I managed to finish before the big deadline and graduate. Plus, I managed to do it without sacrificing my health.</p><p><a href="https://confirm.udacity.com/P2VR7KDA">Certificate of graduation from a Udacity Nanodegree program</a></p><h4>4.4 What did I get</h4><p>That was the process and journey in 2019’s Bertelsmann Tech Scholarship and Udacity’s Deep Learning ND. I have also been asked what benefits I got from doing all this. Here are some of them:</p><ol><li>I met and befriended lots of awesome people with the same drive and passion during the scholarship and nanodegree. We exchanged ideas, discussed awesome topics, and motivated each other to progress. I am still in contact with some of them. Who knows how we will benefit from this network in the future? And who knows whom you might meet in 2020’s Bertelsmann Tech Scholarship, too.</li><li>No doubt, the skills and knowledge were as good as I expected. The lecturers, mentors, and student hub were amazing. The lectures were relatively easy to follow and understand. The practice projects were also very intuitive, and solutions were provided in case we got stuck on one or two lines of code or the concept behind them.</li><li>The projects were unique. The experience I got from working on the Nanodegree projects was unbelievably good. The problems were unique. The cases were real. Udacity also provided students with 99 hours of GPU-enabled workspace to run our code and models faster than on our own laptops. They also gave us a set of clear objectives to achieve in each project (min accuracy, max loss, clear and concise explanation of our logic/thought process). 
No other online course could have provided the same experience for me.</li></ol><p>Overall, it was an amazing experience. Udacity did not exaggerate the term nanoDEGREE for their Deep Learning ND. It almost felt like a mini master’s program, and one that I recommend to others.</p><h3>Last words</h3><p>Hopefully, this post is helpful to those who want to apply to the Bertelsmann Tech scholarship and learn through a Udacity ND, or at least motivates you to keep learning.</p><blockquote>“Stay hungry, stay foolish” — Steve Jobs</blockquote><blockquote>“Stay safe and healthy” — Anyone with common sense</blockquote><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=459ab27db477" width="1" height="1" alt=""><hr><p><a href="https://medium.com/data-science/my-experience-as-a-bertelsmann-tech-and-deep-learning-nanodegree-graduate-459ab27db477">My Experience as a Bertelsmann Tech and Deep Learning Nanodegree Graduate</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[6 Actions You Can Take to Deal With Monday Dreads]]></title>
            <link>https://mrakelinggar.medium.com/6-actions-you-can-take-to-deal-with-monday-dreads-ea3663c7c1c?source=rss-6ad2c1beb234------2</link>
            <guid isPermaLink="false">https://medium.com/p/ea3663c7c1c</guid>
            <category><![CDATA[mental-health]]></category>
            <category><![CDATA[stress-management]]></category>
            <category><![CDATA[career-advice]]></category>
            <category><![CDATA[corporate]]></category>
            <category><![CDATA[startup]]></category>
            <dc:creator><![CDATA[M. Rake Linggar A.]]></dc:creator>
            <pubDate>Sun, 02 Aug 2020 01:01:01 GMT</pubDate>
            <atom:updated>2021-02-27T08:15:36.311Z</atom:updated>
            <content:encoded><![CDATA[<h4>Get out of bed and actually do something on Mondays</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*ywWPszRapNPQL_CF" /><figcaption>Photo by <a href="https://unsplash.com/@anniespratt?utm_source=medium&amp;utm_medium=referral">Annie Spratt</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>If you have worked for a company or startup for many years, it is very likely that you have dreaded Mondays, be it because of looming deadline(s), annoying co-worker(s), demanding boss(es)/client(s), or the nasty traffic caused by construction work that has gone on for who knows how long. We have all felt that discomfort about going back to work after a nice, relaxing weekend, the kind that makes us want to roll back into the comfort of our beds without a care in the world.</p><p>To make it worse, the mere thought that tomorrow is a regular, non-holiday Monday can ramp up the anxiety. There is even a name for that: the <a href="https://www.themuse.com/advice/5-ways-to-shut-your-sunday-scaries-down-for-good">Sunday Scaries</a>. It can prove to be an even bigger annoyance. You waste your Sundays thinking about Mondays and stress yourself out even more.</p><blockquote>Dude, I never felt anything like that. I love my work so much. It is my passion. It is my calling on this earth.</blockquote><p>Well, either you are a fresh graduate and just starting your first job, or you are about to feel the dread. 
<strong><em>No matter how much you love something, you will eventually get bored/find something wrong with it that makes you love it less.</em></strong></p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*jb3JpwSRKCMcYta8.gif" /><figcaption><a href="https://garfield.com/comic?keywords=mondays">Source</a></figcaption></figure><h3>Why is this such a big deal?</h3><p>Worrying about or hating something so much causes stress, which can obviously be bad for our health, both physical and mental, especially when we do it almost every week. Our immune systems get weaker the more stressed we are.</p><p>We can also develop more negative thoughts about our work: anxiety, paranoia, etc. It can make your productivity suffer, or put a strain on your relationship with coworkers or (worse) your boss.</p><p>These are pretty good reasons to manage our Monday dreads. It may not seem like much of a big deal now. But if it carries on for a long period of time, it will affect your day-to-day life. Work productivity. Health. Personal relationships. You name it.</p><h3>So what do we do about it?</h3><p>Back in the early days of high school, when I started feeling these Monday dreads, I would tell my school counselors, close friends, or my mom, hoping to get some advice on how to deal with it. But even in the career world, the advice we get for dealing with Monday dreads usually falls into at least one of the following archetypes.</p><p>The <strong><em>motivational speaker/saint </em></strong>— “Try looking at it in a new way. Today is a gift, you should be grateful you are given the chance for another adventure”</p><p>The <strong><em>pragmatic/tough guy</em></strong> — “You are not the only one, deal with it and get the job done”, “Don’t whine, you’re a man, so shut up and take it like one”</p><p>They are all good, reasonable points. 
But if you are the kind of person who prefers doable, realistic actions along with the shared wisdom/advice, then we are pretty much alike. I mean, what good is advice if you can’t act on it, right? Yes, a smile can help, but sometimes our negativity weighs a lot more than a frown (cause, you know, it takes more muscles to frown than to smile). I needed something more than that.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*J6sf8i6gQttnSohC.gif" /><figcaption><a href="https://garfield.com/comic?keywords=mondays">Source</a></figcaption></figure><p>Fortunately, with some years of experience and a little bit of research (especially after I started focusing on my career), I learned to better manage my Monday dreads.</p><p>Here are some of the things that help me, and hopefully you as well.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*a3WjR8uQItOnfQmD.gif" /><figcaption><a href="https://garfield.com/comic?keywords=mondays">Source</a></figcaption></figure><h3>1. Acknowledge It</h3><p>Feeling nervous, panicky, or anxious that Monday is near is completely fine. It is completely natural. As I’ve said before, everyone has felt it.</p><p>What’s not fine is responding to it in an unproductive or negative way. This is what makes all the difference.</p><p>What good does incessantly looking at the clock all day on a Sunday do? Hoping you still have more hours left in the day? Nothing. In fact, it can be worse than nothing. You end up worrying all day, with nothing to show for it. You could have spent more time relaxing, going to the mall to shop for new clothes, eating at a fancy restaurant, playing with your kids, spending quality time with your family; the list goes on.</p><p>If you acknowledge that you are stressed and hate Mondays, you have a better chance of actually dealing with it. 
This is how so many professionals (among their many other qualities) are able to return on Mondays and kick ass at work. They know there is an issue that needs to be addressed and they do just that.</p><blockquote>Acknowledging that there is a problem is the first step towards fixing said problem.</blockquote><h3>2. Plan for your next week</h3><blockquote>“If you fail to plan, you plan to fail” — Benjamin Franklin</blockquote><p>There are things in this world which we can control. And those which we cannot. That is an undeniable fact.</p><p>It can be somewhat depressing when we list all the things that can go wrong and are outside our control. So it makes perfect sense to do everything we can so that the things within our control go in our favor, not against us. Nothing short of proper planning from the beginning, followed by good execution, will do the job. Even if your week still sucks afterward, you can find comfort in knowing you did what you could.</p><p>A piece of good advice I got was to use my Friday to plan my next week. The reason is that on Friday, I have just finished work for the week. My brain is still humming with all the tasks I have to do next, and all the notes and feedback from my supervisors are still fresh in my head. I can still think about work, and the momentum from the day helps keep me going.</p><h3>3. Keep away from anything that reminds you of work</h3><p>We likely feel the Monday dread because we are in dire need of quality time outside of work, either with friends, family, or just ourselves. Go to the mall, to the movies to see the new Marvel film (oh, how I miss this in these hard times of Covid-19), or to that new restaurant you heard your friends talking about. Or take a short trip to someplace nice. Or just cover yourself with a warm blanket and watch Netflix or read one of your favorite books. 
Anything you enjoy.</p><p>Another pro tip I got from one of my friends is to shut down (or at least silence) my work e-mail and WhatsApp work group. I was very surprised at just how effective this was in reducing work stress. Before that, I wasn’t able to fully relax and enjoy my weekend because part of my brain was still allocated to work. I ended up having no real weekends, and no work done either. One or two days without thinking or being reminded about work turns out to be a necessity.</p><h3>4. Little to no work on weekends</h3><p>I know. This contradicts the previous point. No one likes to work on weekends. It is the time to rest and relax, starting after coming home from work on Friday.</p><p>But there are times when a coworker, client, or boss sends a late email with the materials you need to do your work, and asks you to present it first thing next week. No one likes that. I’m sure not even they like having to send a late email on a Friday night or, on hopefully rarer occasions, the weekend in the first place (or so I tell myself to stay positive and not complain).</p><p>Which is why I cram everything into Friday night. I prepare a fresh cup of coffee, order a pizza or a burger to aid me, shut off any distractions (YouTube, Netflix, Instagram, etc.), and just try to finish that last annoying report and send it to whoever is supposed to receive it. I do not want ANY work to do on weekends. If it is not possible to finish by Friday night, what can I do? But I won’t go crazy with it. I’ll just allocate three hours on Sunday night to do it.</p><h3>5. Communicate</h3><p>If you cannot change the situation yourself and still feel like the office is a bleak house, it’s time to ask for help. There is nothing wrong with talking to your supervisor/manager about your working conditions and asking for help. Ask them to appoint another team member to help carry the massive load. 
Or ask to be moved to another, less demanding project where you can still help. What’s wrong is clearly needing help and keeping the problem to yourself.</p><p>Often, it is not just a one-time quick meeting. Your boss may need some convincing that you need help. More than once. Show them your workload and the unreasonable demands behind it, and that it is having a negative impact on you: missing deadlines, getting sick more frequently, or whatever the case may be.</p><h3>6. Find a new place</h3><p>Sometimes the problems are embedded so deep that they become too difficult to fix, whether it is a poor budget allocation for new staff, difficult boss(es), or any other form of toxic work culture. If you cannot change it despite your best efforts, maybe the job isn’t right for you. Maybe you need to ask yourself if it’s time to move on.</p><p>Work is not always great or easy, but it should never make you feel uncomfortable, anxious, or depressed.</p><h3>Final words</h3><p>Hopefully, this post can help you. It may take time to grasp these actions, apply them to our lives, and keep practicing until they feel natural. But once you understand and practice them consistently, you will soon realize the Monday dread is getting smaller, by a lot. And this is obviously no small win, at the very least, for your career.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*ryai7ZsvU3TdbaDi.gif" /><figcaption><a href="https://garfield.com/comic?keywords=mondays">Source</a></figcaption></figure><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=ea3663c7c1c" width="1" height="1" alt="">]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Atom’s Hydrogen for writing data science projects using Python]]></title>
            <link>https://medium.com/data-science/atoms-hydrogen-for-writing-data-science-projects-using-python-46b6507fcdf7?source=rss-6ad2c1beb234------2</link>
            <guid isPermaLink="false">https://medium.com/p/46b6507fcdf7</guid>
            <category><![CDATA[python]]></category>
            <category><![CDATA[hydrogen]]></category>
            <category><![CDATA[atom]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[text-editor]]></category>
            <dc:creator><![CDATA[M. Rake Linggar A.]]></dc:creator>
            <pubDate>Sat, 02 May 2020 02:45:11 GMT</pubDate>
            <atom:updated>2020-05-02T02:45:11.363Z</atom:updated>
            <content:encoded><![CDATA[<h3>Atom’s Hydrogen for writing data science projects with Python</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*p8wh-mDAnJCfRy1C" /><figcaption>Photo by <a href="https://unsplash.com/@codestorm?utm_source=medium&amp;utm_medium=referral">Safar Safarov</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>Whenever I want to write Python code, whether it’s for testing or deployment purposes, I just open up Jupyter Notebook and start writing. And why not, right? It’s free, open-source, and everyone seems to love it (for <a href="https://www.nature.com/articles/d41586-018-07196-1">example</a>).</p><p>And if your company wants to deploy your code in their production environment, you can just use <a href="https://enterprise-docs.anaconda.com/en/docs-site-5.0.2/user-guide/tutorials/deploy-notebook-project.html">Anaconda Enterprise</a> to do it.</p><figure><img alt="Deploying Jupyter Notebooks in Anaconda Enterprise" src="https://cdn-images-1.medium.com/max/1024/1*0VPYAkroZHLKZl7wf2SYBQ.png" /><figcaption>Deploying Jupyter Notebooks in Anaconda Enterprise</figcaption></figure><p>But if you need to write Python modules in your project, it can be less than ideal to write and test the module in Jupyter, save it as a Python file, and then import it into your main Jupyter notebook and all that.</p><p>Plus, the downside of deploying with Anaconda Enterprise? Well…</p><figure><img alt="Google result for Anaconda enterpise cost" src="https://cdn-images-1.medium.com/max/587/1*zCy5HNbK_OvbqMteP8CO7g.png" /><figcaption>Google result for Anaconda Enterprise’s cost</figcaption></figure><p>Which is why I looked for an alternative tool. 
One that takes the hassle out of writing modular, production code, but in which I can still run code line by line and display the data and variables inline.</p><p>In comes the Hydrogen package for Atom!!!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/851/1*4nxNG2rilpM0TufI0U_1-w.png" /></figure><p>What is Hydrogen?</p><p>Well, like the tagline in the official documentation says:</p><blockquote>… All the power of Jupyter kernels, inside your favorite text editor.</blockquote><p>And so far, it is true and I’m loving it on my macOS 10.15. It turns your Atom text editor into a Jupyter-like notebook! Plus, Atom is a lightweight text editor with plenty of packages to make it an awesome code editor.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/960/1*i2Ck_p0X8dTspU-Qus66TA.gif" /><figcaption>Source: Hydrogen official page</figcaption></figure><p>This post will walk you through the installation steps and the workarounds for the issues I faced (and that some of you will likely face as well) on <strong>macOS 10.15</strong>.</p><h3>1. Install Atom</h3><p>This is, of course, a very easy step. Just head over to <a href="https://atom.io">https://atom.io</a> and download the latest version of Atom. Once downloaded, simply copy the application file to your, well, Applications folder.</p><p>“Then just double click to open it, right?” — Well, not really.</p><p>Many Mac users, including myself, found that we could not open the editor, and were greeted with a message like this.</p><blockquote>“Atom 2” can’t be opened because Apple cannot check it for malicious software.<br>This software needs to be updated. Contact the developer for more information.</blockquote><p>But don’t worry.</p><p>Just <strong>right-click the Application file, and then click “Open”.</strong></p><h3>2. 
Install Hydrogen Package</h3><p>Click on <strong><em>Atom &gt; Preferences &gt; Install &gt; type “hydrogen” </em></strong>and <strong>click Install</strong>.</p><figure><img alt="Install Hydrogen package" src="https://cdn-images-1.medium.com/max/910/1*P96GaWNAETy1NuTUR5JQAQ.png" /><figcaption>Install Hydrogen package</figcaption></figure><p>Once it is done, you will need to install a Python kernel; otherwise, Hydrogen will not work. Open up your Terminal and type in the following commands.</p><pre>python -m pip install ipykernel<br>python -m ipykernel install --user</pre><p>This installs ipykernel, the IPython kernel for Jupyter, so that Hydrogen can make Atom execute Python code the way Jupyter does.</p><p>Once it is done, restart Atom, and you will be able to run your code line by line and even display your pandas data frames and other visuals you normally make in Jupyter. To see the command shortcuts, go to <strong><em>Packages &gt; Hydrogen</em></strong>, where you will find the list of shortcuts you can use. 
You’ll find they are quite similar to those in Jupyter.</p><figure><img alt="Hydrogen shortcuts" src="https://cdn-images-1.medium.com/max/926/1*TJnkPghlaXbpkmcGIc-wlg.png" /><figcaption>Hydrogen shortcuts</figcaption></figure><p>To run only a selection of lines, just select the line(s) of code you want to execute and press the <strong>Run</strong> shortcut.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*U-SuJL900oa1ThmTMgeFfQ.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*7RiCYvEHR_mnVkC_aaw3rA.png" /></figure><p>Awesome, right?</p><p>To see the full set of Hydrogen’s capabilities, you can check them on the <a href="https://atom.io/packages/hydrogen">official page</a>.</p><h4>Plus — Autocomplete</h4><p>Lastly, it wouldn’t be much of an editor if there were no autocomplete.</p><p>Hydrogen already provides an autocomplete function, but when I tried it, several variables and built-in Python functions were not shown. Therefore, I recommend you add the autocomplete-python package.</p><p>Click on <strong><em>Atom &gt; Preferences &gt; Install &gt; type “autocomplete-python”</em></strong> and click Install.</p><p>If you have trouble autocompleting basic Python keywords and variables, it may be that the grammar file was not copied correctly. To fix it, you can try the following commands in your Terminal.</p><pre>cd ~/.atom/packages/autocomplete-python/lib/jedi/parser<br>cp grammar3.6.txt grammar3.7.txt</pre><h3>Final words</h3><p>Hope this post is useful and that everyone has a better time writing complex, modular machine learning, data preprocessing, or data analysis code. 
Especially during these hard times due to the Covid-19 pandemic.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=46b6507fcdf7" width="1" height="1" alt=""><hr><p><a href="https://medium.com/data-science/atoms-hydrogen-for-writing-data-science-projects-using-python-46b6507fcdf7">Atom’s Hydrogen for writing data science projects using Python</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>