<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Yashika Sharma on Medium]]></title>
        <description><![CDATA[Stories by Yashika Sharma on Medium]]></description>
        <link>https://medium.com/@yashika51?source=rss-d3deb7eb050c------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*M-W7w-X4PJhPvOq6aQk4Fg.png</url>
            <title>Stories by Yashika Sharma on Medium</title>
            <link>https://medium.com/@yashika51?source=rss-d3deb7eb050c------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Tue, 09 Jun 2026 04:20:43 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@yashika51/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Tech Certifications: Are they worth it?]]></title>
            <link>https://medium.com/@yashika51/tech-certifications-are-they-worth-it-bc6df7a55ff0?source=rss-d3deb7eb050c------2</link>
            <guid isPermaLink="false">https://medium.com/p/bc6df7a55ff0</guid>
            <category><![CDATA[career-development]]></category>
            <category><![CDATA[data-engineering]]></category>
            <category><![CDATA[technology]]></category>
            <category><![CDATA[certification]]></category>
            <category><![CDATA[software-development]]></category>
            <dc:creator><![CDATA[Yashika Sharma]]></dc:creator>
            <pubDate>Mon, 15 Jul 2024 17:28:51 GMT</pubDate>
            <atom:updated>2024-07-15T17:28:51.880Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*BsVbTz-Q-4ZNJB5t" /><figcaption>Photo by <a href="https://unsplash.com/@writecodenow?utm_source=medium&amp;utm_medium=referral">Boitumelo</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>There’s always been a lot of debate about whether tech professionals need certifications and if they’re even worth it. You’ll often hear that hands-on experience is what really counts, and many recruiters might not even look at certifications on your resume. While that’s definitely true, I want to share a different perspective.</p><h3>The Debate: Experience vs. Certifications</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*j6LpEbKhNkaxgG5l" /><figcaption>Photo by <a href="https://unsplash.com/@neonbrand?utm_source=medium&amp;utm_medium=referral">Kenny Eliason</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p><strong>First things first: </strong>Certifications are not meant to replace real-world experience or hands-on projects. They’re a good addition but not a necessity. So, why might it still be a good idea to get some certifications? Let me explain.</p><p>I’m a big advocate for learning by doing. I didn’t go to a fancy university; most of what I know comes from online courses, working on tons of projects, and my experience as an engineer with various companies over the past six years.</p><h3>Certifications vs. 
Certification Exams</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*WJeS39TgEGw-xCRM" /><figcaption>Photo by <a href="https://unsplash.com/@homajob?utm_source=medium&amp;utm_medium=referral">Scott Graham</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>There’s a difference between getting a certification just for finishing a course and passing a certification exam. Some courses give you a certificate for completing the modules and maybe a final project. For instance, when I started with machine learning and deep learning, I was all over the place with online documentation and YouTube videos. It was helpful, but I lacked structure.</p><p>That’s when I found the Deep Learning Nanodegree from Udacity. It offered a structured path with hands-on assignments and projects that really helped me grasp the material. This kind of certification is more about having a structure to follow and showing that you’ve completed a learning journey.</p><p>Certification exams are a different ball game. As a data engineer, I try to stay updated with new advancements. Some certification exams like the Azure Data Engineer Associate or Google Professional Data Engineer can be really beneficial if you are in this field. These exams are more recognised in the industry. While they might come with official learning modules, you can also pair them with YouTube tutorials or Udemy courses for a broader understanding.</p><p>When I was preparing for the Azure Data Engineer exam, I used my real-world experience with Data Engineering and Azure and supplemented it with hands-on exercises, which were part of the <a href="https://learn.microsoft.com/en-us/training/courses/dp-203t00">Data Engineering with Microsoft Azure course</a>. 
It was fun to test my knowledge and pass the exam.</p><h3>Why Bother with Certifications?</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*xDrqrL4V72QRwh4k" /><figcaption>Photo by <a href="https://unsplash.com/@emilymorter?utm_source=medium&amp;utm_medium=referral">Emily Morter</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>Some might say, “Why not just follow a course or look up documentation as needed? Why specifically do you need to pass the exams?”</p><p>That’s a fair point, but realistically, how many people actually finish the courses they start? Additionally, finding the right documentation can be time-consuming if you’re not sure where to look.</p><p>I see certifications as a motivator — an end goal to keep you on track. You don’t need a certificate for every tool or technology, but if you’re really interested in learning something in-depth or switching to a new platform, it can be beneficial. 
For example, if you’re experienced with AWS and need to work with Azure, a certification exam can help you quickly get up to speed with Azure-specific concepts.</p><h3>Things to Consider Before Getting a Certification</h3><ul><li><strong>Time:</strong> Do you have the extra hours to study?</li><li><strong>Relevance:</strong> Is the certification relevant to your role, field, or interests?</li><li><strong>Cost:</strong> If you can’t afford it, look for financial aid (for example, Coursera offers this) or see if your employer has a professional development budget.</li><li><strong>Practical Experience:</strong> Do you already have exposure to this area and want to take it to the next level, or will you have opportunities to apply what you learn?</li></ul><h3>Conclusion: Are Certifications Worth It?</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*BG8bM8AkN930yJKa" /><figcaption>Photo by <a href="https://unsplash.com/@sharonmccutcheon?utm_source=medium&amp;utm_medium=referral">Alexander Grey</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>In the end, whether or not to pursue certifications comes down to your personal goals and circumstances. They can be a great way to structure your learning and validate your skills, but they shouldn’t replace hands-on experience. Use them as a tool to complement your practical knowledge.</p><p>Think about whether a certification is worth it for you based on your time, interest, and goals. Certifications can enhance your knowledge and career if you approach them with the right mindset. Just remember, they shouldn’t replace hands-on experience and they shouldn’t be pursued just for the sake of credentials but rather for practical use. Sometimes, you might even benefit from having them on your resume, especially if you have corresponding hands-on experience.</p><p>Feel free to reach out if you have any questions or if you have a different opinion. 
I’m always up for a discussion! 😊</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Data Governance from your Terminal]]></title>
            <link>https://medium.com/alvin-ai/data-governance-from-your-terminal-a2e1af7cc331?source=rss-d3deb7eb050c------2</link>
            <guid isPermaLink="false">https://medium.com/p/a2e1af7cc331</guid>
            <category><![CDATA[data-lineage]]></category>
            <category><![CDATA[data-engineering]]></category>
            <category><![CDATA[dbt]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[cli]]></category>
            <dc:creator><![CDATA[Yashika Sharma]]></dc:creator>
            <pubDate>Thu, 02 Mar 2023 16:42:15 GMT</pubDate>
            <atom:updated>2023-03-02T16:54:56.202Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*u8q0uj4n6x9QR1iRIHKNDQ.jpeg" /></figure><p>Data engineers, software developers, sysadmins. To the untrained eye, we’re all just computer folks.</p><p>But to us in the know, we’re far apart when it comes to performing our day jobs, with different challenges and different toolkits. That said, there is one thing that we all tend to know and love: our beautiful terminals. Working from the command line instead of dragging your mouse through an interface is the kind of habit that, once you get used to it, there is no turning back from.</p><p>And the same applies to Alvin: once you get used to managing and consuming your metadata right from the terminal, well… let’s just say you’ll start asking yourself how you lived without it.</p><p>The power of Alvin’s metadata in your terminal? Let’s check out how it works!</p><h3>Introducing: the Alvin CLI</h3><p>With the Alvin CLI, you can use all the main features of our tool directly in your favorite terminal:</p><ul><li>Add, remove and modify platforms;</li><li>Perform impact analysis on your schema changes;</li><li>Run regression tests on your dbt models;</li><li>Bulk apply (and remove) tags;</li><li>Analyze upstream and downstream column-level data lineage of your assets;</li><li>Add and remove lineage for your assets;</li><li>View usage statistics of your columns, tables and dashboards.</li></ul><p>But hey, seeing is believing. So let me show you what are, in my humble opinion, some of the coolest features for analyzing and managing your metadata in the terminal.</p><h3>Regression Testing</h3><p>As I mentioned before: once you get used to the terminal, you never want to leave. 
And well, we get you.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*y0fBkBDnBRvXcTvZ" /><figcaption>Regression test for dropping a table, with the impacted entities listed by platform.</figcaption></figure><p>Whenever you need to drop or change a column or a table, you can test your SQL and reveal any downstream breaking changes without leaving your terminal.</p><h3>Support for dbt models</h3><p>Regression testing works for tables, columns and BI elements. And also for dbt models!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*5KLmQZa6UkqiAyLT" /></figure><p>When working with them, you can use the CLI to run tests and get a nice report of what you are going to break (hopefully nothing).</p><p><strong>Bulk applying (and removing) tags</strong></p><p>Want to apply the same tag to different entities at once? The terminal is your friend. You can apply tags to, or delete them from, a batch of entities based on keywords and rules!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*cuNLOIqJG-4EkQSY" /></figure><p>On the Alvin side, we’ll need a few arguments; based on that input, every matching entity will have the tag applied or deleted in bulk.</p><p><strong>Tag batch apply</strong></p><p>Let’s say I want to tag all the column entities on the platform <em>“bigquery”</em> that exactly match the rule text <em>“first_name”</em> in the domain <em>“name”</em>. The tag I want to bulk apply is <em>“cli_demo”</em>, with tag type business_term and classification type <em>“pii”</em> (the default classification type is “default”).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*Y54Uq5rD902Dq705" /></figure><p>Once the command executes successfully, you can go to the UI, check any entity matching the rule, and see the new tag applied to it.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/680/1*4KqqW30WvaYLAIl-bZrjJQ.png" /></figure><p><strong>Tag Batch 
Delete</strong></p><p>Similar to applying tags, you can also bulk delete them from entities based on the rule text and other parameters.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*kq4rt1rryQXSanQa" /></figure><p>You’ll be asked for confirmation before the bulk delete operation is applied, but once you confirm, the tag is deleted from all the matching entities right away. Pretty neat.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*gxsWW2ejHk-A0JJM" /></figure><h3>Usage Statistics</h3><p>Need to know if a specific entity is being used, how many times it was accessed, and by whom? Yeah, you can see that in the CLI too.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*qqOEB7n6M0WfKFnB" /></figure><p>Below is an example of usage statistics from the past 30 days for a column called <em>office</em>. You are also able to see the usage count by user:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*cNTpv6mWqHvY7I_B" /></figure><h3>Support for tabular, YAML and JSON formats</h3><p>Get your data in your preferred format!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*6rF6d7gWHxTAKd0P" /></figure><p>For lovers of the classic tabular format.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*muidoFoO-MvH5Kmj" /></figure><p>For all of us JSON lovers out there.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*Qk2MG3y_e5bsOXrf" /></figure><p>I don’t miss XML at all, but whatever floats your boat.</p><p>And you can not only view your data in CSV, JSON and YAML, but also save the output in those formats!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*7B_OtyEGsepB0pCE" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*Mn9jqPGcmHK6upOu" /></figure><p>Sample data in tabular format saved in a CSV.</p><h3>Want to try it out?</h3><p>If you are already an Alvin user, 
the documentation to install and use the CLI is <a href="https://docs.alvin.ai/getting-started/using-the-cli">here in our docs</a>.</p><p>If not, how about <a href="https://www.alvin.ai/signup?utm_source=medium&amp;utm_medium=article&amp;utm_campaign=Data+Governance+from+your%C2%A0Terminal">signing up for a free trial</a> or <a href="https://calendly.com/alvin-demo">booking a live demo</a>? :)</p><hr><p><a href="https://medium.com/alvin-ai/data-governance-from-your-terminal-a2e1af7cc331">Data Governance from your Terminal</a> was originally published in <a href="https://medium.com/alvin-ai">Alvin</a> on Medium.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[23 Life Lessons on the day I turn 23]]></title>
            <link>https://medium.com/@yashika51/23-life-lessons-on-the-day-i-turn-23-24b408ce5539?source=rss-d3deb7eb050c------2</link>
            <guid isPermaLink="false">https://medium.com/p/24b408ce5539</guid>
            <dc:creator><![CDATA[Yashika Sharma]]></dc:creator>
            <pubDate>Thu, 05 Jan 2023 15:36:57 GMT</pubDate>
            <atom:updated>2023-01-05T15:36:57.214Z</atom:updated>
            <content:encoded><![CDATA[<p>Today I turned 23, and I decided to take a moment to reflect on the things I have learned so far. Life never stops teaching you lessons, and there are tons of things you learn every day, but these are the 23 I found worth mentioning.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*gGfLzyGWkMHU841G" /><figcaption>Photo by <a href="https://unsplash.com/it/@danielpd?utm_source=medium&amp;utm_medium=referral">Daniel Huniewicz</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><ol><li>Count your blessings. Always be grateful for what you have, every day.</li><li>Don’t skip your meals.</li><li>Kindness is important, to everyone and anyone, including animals. Everything that breathes deserves respect and kindness.</li><li>Your only competition is you. Don’t compare yourself with others.</li><li>Progress is not always linear or fast-paced.</li><li>One thing that I have learned from my mother is to wake up every day and thank God for life before starting your day.</li><li>People come and go. Sometimes even blood relatives won’t be there for you, and that’s okay. Holding on to expectations will only let you down.</li><li>Patience is extremely important. It might not work today, but it might tomorrow.</li><li>Don’t say things when you’re angry. You will almost certainly regret them later.</li><li>Asking for too many opinions will confuse you. Trust your instincts.</li><li>Do things for others without expecting anything in return. You are doing it because you want to, not because you want something back.</li><li>Leave your nest. Stepping out of your comfort zone is the only way to long-term growth.</li><li>Home is where your heart belongs. A ceiling and a few walls can’t be called home; that’s just a living space.</li><li>You don’t have to be an early bird to get work done. I’ve been nocturnal my whole life.</li><li>Your university doesn’t matter as long as you have the hunger to learn and work hard on your own.</li><li>Trying something new means there is a chance of failing, but not trying at all means you have already lost the chance of success.</li><li>Maturity doesn’t come with age; it comes with experience.</li><li>Learning everything at once is a bad idea. Learning while doing is the best way.</li><li>“Don’t make perfect the enemy of good”. Things might not be perfect but can still be good enough to go forward with.</li><li>I love startups more than corporate roles. I thrive in a fast-paced environment.</li><li>Always look for opportunities to learn. It doesn’t have to be technical or related to your field; the world is full of interesting things to spark your curiosity.</li><li>Health is as important as wealth. A healthy person can work towards wealth, but wealth cannot buy your health back.</li><li>Embrace every experience you have, good or bad; it happens for a purpose. Falling is part of the process, but learning from it and moving forward is what doers do.</li></ol><p>I plan to write my reflections on each year going forward, but for now I’ll rush to eat pizza and spend time watching my favourite shows :)</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Acknowledge Your Privileges]]></title>
            <link>https://medium.com/@yashika51/acknowledge-your-privileges-4f6f67227ab0?source=rss-d3deb7eb050c------2</link>
            <guid isPermaLink="false">https://medium.com/p/4f6f67227ab0</guid>
            <category><![CDATA[acceptance]]></category>
            <category><![CDATA[motivation]]></category>
            <category><![CDATA[self-awareness]]></category>
            <category><![CDATA[life]]></category>
            <category><![CDATA[progress]]></category>
            <dc:creator><![CDATA[Yashika Sharma]]></dc:creator>
            <pubDate>Wed, 04 Jan 2023 15:01:45 GMT</pubDate>
            <atom:updated>2023-01-04T15:01:45.166Z</atom:updated>
            <content:encoded><![CDATA[<p>It is human nature to complain and to constantly seek better things. Whether you are at the start of something or have already come a long way, the feeling of not having enough is very common and likely to appear.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*TwjIGREwkVJXG5xO" /><figcaption>Photo by <a href="https://unsplash.com/@tateisimikito?utm_source=medium&amp;utm_medium=referral">Jukan Tateisi</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>While one should always aim for higher goals and keep progressing throughout life, feeling satisfied and acknowledging your privileges and advantages is extremely important. Not only does it help with self-fulfilment, it also gives you a chance to step out of your own little bubble and see how people in different parts of the world are struggling.</p><figure><img alt="Be grateful" src="https://cdn-images-1.medium.com/max/1024/1*29xxy78XWoRhfWAjMpMWQg.jpeg" /><figcaption>Photo by <a href="https://unsplash.com/@ann10?utm_source=medium&amp;utm_medium=referral">Ann</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>It’s easy to take things for granted and to pay too little attention to, or feel too little gratitude for, the things you already have in your life; they get overlooked in the race to get more.</p><p>Now, if you are wondering what these privileges are that are so obvious yet overlooked every day, let me give you some examples:</p><ul><li>Do you have a plate full of food for all meals in a day? 
Acknowledge your privilege: millions of people are struggling to put bread on the table and are losing their lives to hunger.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*I-2cVU_HpVR6bEfW" /><figcaption>Photo by <a href="https://unsplash.com/@henniestander?utm_source=medium&amp;utm_medium=referral">Hennie Stander</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><ul><li>Complaining about having a small house when you have a decent bed to sleep in and a place to live? Think about those who are homeless.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*JzbOzCObi68hv4FV" /><figcaption>Photo by <a href="https://unsplash.com/@evstyle?utm_source=medium&amp;utm_medium=referral">Ev</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><ul><li>Have a peaceful environment where you don’t have to fear for your life? Many people are still living in the middle of wars.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*TPRW5x1EIUBwpqcm" /><figcaption>Photo by <a href="https://unsplash.com/@jordymeow?utm_source=medium&amp;utm_medium=referral">Jordy Meow</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><ul><li>Got into a college and don’t like the university ranking? So many children still do not have the right to education.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*PdE7vlKwkRvt3djs" /><figcaption>Photo by <a href="https://unsplash.com/@yansphotobook?utm_source=medium&amp;utm_medium=referral">Yannis H</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><ul><li>Tired of listening to your parents giving you advice for your own good? 
Some people never get a chance to see their mom and dad, and some lose them too soon.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*6ygPzCNwb1PP7-7JN-lomw.jpeg" /><figcaption>Photo on <a href="https://unsplash.com/photos/7edWO30e32k">Unsplash</a></figcaption></figure><p>These are just some very obvious things that are right in front of our eyes, and yet we don’t appreciate them enough. There are countless things that you might have in your life that thousands of people are still struggling to have even once.</p><p>What might seem normal to you can be a huge privilege that someone else is struggling and wishing for day and night.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*D7Fd83YMKJzvCwIb" /><figcaption>Photo by <a href="https://unsplash.com/@aaronburden?utm_source=medium&amp;utm_medium=referral">Aaron Burden</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>Before taking anything for granted, think again about how it might be a privilege of yours. Doing so helps with self-reflection and satisfaction, and the gratitude will make you appreciate it even more. :)</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Write Robust APIs In Python With Three Layer Architecture, FastAPI and Pydantic Models]]></title>
            <link>https://medium.com/@yashika51/write-robust-apis-in-python-with-three-layer-architecture-fastapi-and-pydantic-models-3ef20940869c?source=rss-d3deb7eb050c------2</link>
            <guid isPermaLink="false">https://medium.com/p/3ef20940869c</guid>
            <dc:creator><![CDATA[Yashika Sharma]]></dc:creator>
            <pubDate>Fri, 18 Nov 2022 10:22:42 GMT</pubDate>
            <atom:updated>2022-11-18T10:22:42.979Z</atom:updated>
            <content:encoded><![CDATA[<p>GitHub is the ultimate pool of projects for software engineers. You can find plenty of ideas and implementations for almost anything that you can think of. While there are some amazing projects to take inspiration from, there are plenty of bad examples too.</p><p>Working on a software engineering project is more than just coding. Many people make the mistake of rushing straight into coding an idea in an unstructured way, skipping all the steps in between.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*-kB5yFEAzf_5fzib" /><figcaption>Photo by <a href="https://unsplash.com/@cdr6934?utm_source=medium&amp;utm_medium=referral">Chris Ried</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>Today we are going to talk about how to write more structured APIs by following a three-layer software architecture. Splitting a project into layers helps with abstraction and gives a more manageable structure. 
Another advantage is that if you want to change something in a particular layer, for example the database connection or a framework, you can do so in that layer alone without affecting the others.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*dAxEK8g7VgdqcFK4w_C0ew.png" /><figcaption>The three layers of API Architecture</figcaption></figure><p>As you can see above, the three layers are:</p><ul><li>API Layer</li><li>Service Layer</li><li>Database Layer</li></ul><p>While setting up the project, a structure like the one below can be a good starting point.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/798/1*kbOKiKgc_ijX-BtYY-zsWQ.png" /><figcaption>Initial project setup in PyCharm</figcaption></figure><p>Above, you can see we have directories set up for each layer, with an additional directory called schemas which will hold our Pydantic models (we’ll talk about these in detail).</p><p>Later on, we’ll walk through a hypothetical example of building the backend for a website that involves entities like users, orders and items. (To keep it short, we are not going to code the example fully; this post focuses on explaining the structure and the frameworks.)</p><h3>API Layer</h3><p>This is the topmost layer in the backend architecture and the one the user interacts with directly. As a good practice, no business logic should be present inside the API layer. Ideally, the API layer should interact with the service layer via Meta classes.</p><p>This layer should be the simplest, and deal with CRUD operations. We can also keep different versions inside the API layer, as shown below.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/720/1*Wh7mq1mZ0QJAQ0eweMnR3Q.png" /><figcaption>Different API versions</figcaption></figure><p>One layer that we did not mention before is the <strong>Schema Layer</strong>, which is a sub-layer of the Interface Layer. 
Broadly speaking, both the API layer and the schemas fall under the umbrella of the Interface Layer.</p><p>A good Python backend defines its schemas as custom classes that declare the data shape and types used for validation. We will say more about this when we get to Pydantic models.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/720/1*lICOBFqvFaTuFh4sJLCBpg.png" /><figcaption>Schemas for different entities</figcaption></figure><h3>Service Layer</h3><p>Most of the heavy lifting is done within this layer. All the business logic that shouldn’t be exposed to the user should go inside the service layer. The service layer acts as an intermediary between the API layer and the database layer. It handles the logic and mapping needed to prepare a request, so that a request never fires a query against the database directly; instead, the processed data is sent on to the DB layer. As a good practice, we can break down the service layer into meta and implementation directories. As mentioned before, it’s better to interact with the API endpoints through Meta classes that hold abstract methods, which are then implemented inside the corresponding service implementation class.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/720/1*nxJOS2ojjQBK5HFpEpuVwQ.png" /></figure><h3>Database Layer</h3><p>The DB layer, or database layer, is where all of the data ingestion, modification and deletion logic lives. It contains database connectors and models. Before working on any project, a good practice is to write a high-level design document with the expected data model defined. Especially on big projects where the task involves modifying an existing data model, a proposal document goes a long way.</p><p>The database layer accepts the processed data from the service layer and performs queries and operations to interact with the database. 
It then returns Response objects that are passed back through the service layer and eventually to the API layer.</p><p>The DB layer also holds the data models, which can be created using SQLAlchemy for example, along with the files for query operations; here we use Postgres as the example database.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/720/1*rMvEqJzVcpsDOhdI0XCnMA.png" /><figcaption>Database layer containing the models and postgres directories</figcaption></figure><p>Similar to the service layer, the database layer can also be divided into meta and impl for more granularity.</p><h3>FastAPI</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*cv18qjWN9VM3s5QauQxH6A.png" /><figcaption>FastAPI GitHub Repository</figcaption></figure><p>That covers setting up the structure; let’s talk about FastAPI now. FastAPI is a web framework for creating APIs based on standard Python type hints. It also works well with the layer-based structure that we just talked about.</p><p>FastAPI has multiple advantages. While it can be used for general web development, I find it most useful for building APIs.</p><p>The official documentation lists the following key features as reasons for choosing FastAPI:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*zByJ1YRuPB7F-4SoplM6Iw.png" /><figcaption>Key Features of FastAPI</figcaption></figure><p>The FastAPI documentation is very detailed, but in a nutshell, it’s a very fast framework built on top of other powerful tools and standards such as <a href="https://www.uvicorn.org/">Uvicorn</a>, <a href="https://www.starlette.io/">Starlette</a>, <a href="https://pydantic-docs.helpmanual.io/">Pydantic</a> and <a href="https://github.com/OAI/OpenAPI-Specification">OpenAPI</a>, with native async support. 
With FastAPI you can write your API function parameters with Python 3.6+ type declarations and get automatic data conversion, data validation, OpenAPI schemas (with JSON Schemas) and interactive API documentation UIs.</p><p>For example, you can define the data schema and types using the Pydantic library, which makes sure request and response payloads are validated properly. You can also browse the documentation of your APIs in Swagger UI without any extra effort, and extend it by writing docstrings in the API endpoint functions themselves.</p><p>One might question why to use FastAPI for REST API creation when we already have frameworks like Django REST framework and Flask RESTful.</p><p>Sure we do, but if you have tried to create an API with Django before, you’d know that the process is similar to creating a web application because you have to use the Django application model. In comparison, I find FastAPI far simpler and easier to maintain.</p><p>Performance-wise as well, FastAPI outperforms Django and Flask in the benchmarks below and is on par with NodeJS and Go.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*2m-DNAXcnI19rflG.png" /><figcaption><a href="https://www.techempower.com/benchmarks/">https://www.techempower.com/benchmarks/</a></figcaption></figure><h3>Pydantic</h3><p>If you know Python types well, Pydantic models are easy to understand as well.</p><p>The Pydantic library enforces type hints at runtime: you create classes with attributes and use Pydantic to define and validate the types of those attributes. You can also define default values while creating Pydantic classes (also called Pydantic models).</p><p>Some of the advantages of using Pydantic are:</p><ul><li>Editor support, linting and autocomplete work well. 
Running a lint script will help you catch errors in data types and schemas while you are still working on the project.</li><li>Faster than other similar libraries</li><li>Validating data, especially with complex, recursive Pydantic models, is easy.</li><li>Converts data types of attributes automatically wherever applicable. This means you can pass the same object you get from a request directly to the database, and similarly from the database directly to the client, as everything is validated automatically.</li></ul><p>Now that we have learned more about the architecture and frameworks, let’s return to our earlier example and see some FastAPI endpoints and Pydantic models in code.</p><p>Below is the schema prepared for the entity User. As you can see, we have created three classes that inherit from Pydantic’s BaseModel, each with attributes and their expected types. These classes can be used to validate the request and response objects for endpoints related to User.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*CpW_F9_CNtByzvGE6H5fYw.png" /><figcaption>Pydantic models for entity User</figcaption></figure><p>We can create endpoints for different entities structured like this and route them under v1 using router.py.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/718/1*Mm25w1NQ9xDQZxKKtrEraQ.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/854/1*_osalUrU-DqmeqZcIpqlaw.png" /><figcaption>Routing APIs under router.py</figcaption></figure><p>An example endpoint for getting user details based on email_id is shown below.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*oABAjPDdtSvBkBHQzH02CA.png" /><figcaption>Get user details based on email_id</figcaption></figure><p>As you can see, we are calling the get_user function from the Service Meta Layer (UserServiceMeta in this case).</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/946/1*0auzOSujenLDBY2iVv88Uw.png" 
/><figcaption>get_user abstract method in Meta Service class</figcaption></figure><p>This method can be fully implemented inside the user_service.py file in the service implementation, which in turn can call the final function from the postgres_user.py file in the database layer, where the read and write operations against the database happen.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/946/1*Czu394zWnysidfd2NIGVUQ.png" /><figcaption>postgres_user.py in database layer</figcaption></figure><p>One final thing in this section is the model inside the database layer. In this example I am using SQLAlchemy as the ORM to define the User model for the database.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*mHIR4RQ6hNXKgQNCPB3_zQ.png" /></figure><p>After implementing all the layers and connecting things together, the next step is to test the endpoints. I usually do that with Postman (this is out of scope for this post).</p><p>You can also see your API documentation using Swagger UI. Below is an example of a GET endpoint from the official documentation.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/960/0*YgyNR5SaS2ozLELE.png" /><figcaption>API documentation on Swagger UI</figcaption></figure><h3>Bonus Part</h3><p>Now that we have covered the primary focus of the post, I want to quickly talk about Typer.</p><p>If you want to build a CLI along with the FastAPI project, Typer can be a good option to consider. Both FastAPI and Typer are created by <br><a href="https://github.com/tiangolo">Sebastián Ramírez</a>. 
In his own words</p><blockquote>Typer is the FastAPI of CLIs</blockquote><p>Typer is based on Click (another tool for building CLIs), so you get all its benefits, plug-ins, robustness, etc., as well as Rich (a Python library for <em>rich</em> text and beautiful formatting in the terminal).</p><p>Commands like typer.secho output styled text in the terminal and let you choose the colors as well.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*P6_Xm7iXIgQCjvrLGEhk0w.png" /><figcaption>Example output from the documentation.</figcaption></figure><p>I have built CLIs with Typer for projects with a FastAPI backend in the past. Some of the advantages I found by using Typer are:</p><ul><li>It allows type hinting</li><li>Easy to use</li><li>Fast and short. You don’t have to write a lot of code to implement quick commands.</li><li>Very well written documentation</li></ul><p>Unlike FastAPI, Typer doesn’t use Pydantic directly as of now, but there have been discussions on GitHub such as this <a href="https://github.com/tiangolo/typer/issues/181">one</a> requesting the support. Perhaps someday we’ll see Pydantic in Typer as well.</p><p>Diving deep into “How to build CLIs with Typer” would be a topic for another post.</p><p>Thank you for reading!</p><h4>Resources:</h4><ul><li><a href="https://fastapi.tiangolo.com/">https://fastapi.tiangolo.com</a></li><li><a href="https://typer.tiangolo.com/">https://typer.tiangolo.com</a></li><li><a href="https://pydantic-docs.helpmanual.io/">https://pydantic-docs.helpmanual.io</a></li><li><a href="https://www.uvicorn.org">https://www.uvicorn.org</a></li><li><a href="https://github.com/OAI/OpenAPI-Specification">https://github.com/OAI/OpenAPI-Specification</a></li></ul>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Be The Bigger One]]></title>
            <link>https://medium.com/@yashika51/be-the-bigger-one-1b8881b8ad65?source=rss-d3deb7eb050c------2</link>
            <guid isPermaLink="false">https://medium.com/p/1b8881b8ad65</guid>
            <category><![CDATA[moving-on]]></category>
            <category><![CDATA[self-awareness]]></category>
            <category><![CDATA[forgiveness]]></category>
            <category><![CDATA[motivation]]></category>
            <dc:creator><![CDATA[Yashika Sharma]]></dc:creator>
            <pubDate>Tue, 02 Aug 2022 00:02:20 GMT</pubDate>
            <atom:updated>2022-08-02T00:02:20.110Z</atom:updated>
            <content:encoded><![CDATA[<p>In human life, as long as you are alive, there will be multiple instances of people lying to you, hurting you or harming you. Most of the time it’s all mental pain and suffering, which makes you think about things from a completely different perspective.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*aqnpZqch8H_28HN_" /><figcaption>Photo by <a href="https://unsplash.com/@fwed?utm_source=medium&amp;utm_medium=referral">Fred Moon</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>When you’re close to someone, you put all your faith in them, assured they will never harm you, but that doesn’t mean the other person has the same intentions.</p><p>Some of these incidents will leave long-lasting scars on your memory and will shape the biases you bring to reacting to situations, judging people and making decisions.</p><p>While all of this is not something that can be easily controlled, one thing that can be done is to <strong><em>“Be the bigger one”</em></strong>.</p><p>As long as you hold on to the feeling of what happened to you, it’s going to hurt even more.</p><p>These things sound very philosophical, but when you can’t control anything else, one thing you can control is <strong><em>“how you react to it”</em></strong>.</p><p>When <em>forgetting</em> is not an option, <strong><em>forgiving</em></strong> is. And if that’s too difficult, then acknowledging what happened and moving on is the one thing you can do in the present that your future self will thank you for.</p><p>It takes guts, maturity and self-awareness to be the bigger one. It’s important to accept how you’re feeling, even if it’s bad, and acknowledge it. Ignoring it is a temporary fix, if at all.</p><p>It takes multiple significant experiences to realise this, but once you become the bigger one, moving on will be easier. 
It might not seem like it in the moment but the future will bring you the assurance of the right decisions you make in the present.</p><p>PS: This is not a post break up draft, this is just a 3 am thought!</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Why what’s working for them might not work for you?]]></title>
            <link>https://medium.com/@yashika51/why-whats-working-for-them-might-not-work-for-you-bb63714522b0?source=rss-d3deb7eb050c------2</link>
            <guid isPermaLink="false">https://medium.com/p/bb63714522b0</guid>
            <category><![CDATA[self-improvement]]></category>
            <category><![CDATA[burnout]]></category>
            <category><![CDATA[life-lessons]]></category>
            <category><![CDATA[motivation]]></category>
            <category><![CDATA[hustle]]></category>
            <dc:creator><![CDATA[Yashika Sharma]]></dc:creator>
            <pubDate>Tue, 31 May 2022 21:02:22 GMT</pubDate>
            <atom:updated>2022-05-31T21:02:22.397Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*Eg1YO-UiXU9Sujs3" /><figcaption>The grass is greener on the other side (these are eggs, but one of them is looking at the grass)</figcaption></figure><p>The motivation behind this post is the feelings I had, and sometimes still have. I have to constantly remind myself about what I want, and ask whether doing things I might not want to do would really make me happy.</p><blockquote>Now these are of course just my thoughts and people might disagree. There might be edge cases, or your story/background might be different, and I totally respect that. This is just what I’ve experienced and what I’ve seen people around me feeling.</blockquote><p>With that disclaimer, let’s start by imagining two cases, two loops that you are entering into, not because you want to but because they look great from the outside.</p><h3>Loop 1</h3><figure><img alt="People enter into loops only because it looks good from outside but the world inside is entirely different(and the one you probably dislike a lot)." src="https://cdn-images-1.medium.com/max/1024/0*Mnov7WGtXMA3renG" /><figcaption>People enter loops only because they look good from the outside, but the world inside is entirely different (and one you will probably dislike a lot).</figcaption></figure><p>Imagine you are an aspiring undergraduate. You love learning new skills, working with communities and building good hackathon projects.</p><p>You are looking at a classmate of yours bagging a very good and well-paying internship at a FAANG (or MAANG now). You aren’t jealous of them (or you might be), but you are feeling bad for yourself and wondering why you didn’t get it in the first place.</p><p>You start the LeetCode grind, solving question after question, with no clear goal in mind but with just the flames within chanting “you didn’t get the damn FAANG internship”. 
You do it until you feel burned out and leave it midway.</p><p>Now you are back to what you enjoy, maybe open source contributions, maybe some development or something else, until you find another “pleasing loop” for yourself.</p><h3>Loop 2</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*XAMiopUp6pdLmCKL" /><figcaption>I did it before and now I am going to do it again. Loop 2, here I come…</figcaption></figure><p>A few days later, while scrolling YouTube, you see some videos with millions of views. Guess what? It’s time for the next “pleasing loop”.</p><p>You watch someone creating YouTube videos with high-quality content… you watch more… think more and decide to stalk more.</p><p>You see they have thousands of followers on Twitter, LinkedIn and every possible platform you can think of.</p><h4>Your mind to you:</h4><blockquote>“Why couldn’t I do this? I should have started posting content regularly, these concepts are the same stuff I learned. But… I think I might not have gotten so many views and followers, but… I should have done this. Let me try this now. I’ll create YouTube videos now!”</blockquote><p>You enter the second endless loop of following what pleases you, but not something that you love or wanted to do yourself.</p><p>You again waste a couple of days, weeks or months in the loop and leave it midway with nothing.</p><h4>Why are you doing what others are doing? 
Time to retrospect!</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*42_gUcWcQy0WdVAr" /><figcaption>Enough of loops, can we retrospect?</figcaption></figure><p>Because…</p><ol><li>You don’t know what you want to do (yet).</li><li>You know what you want to do but you don’t have the courage to pursue it further (thinking about the ‘what ifs’).</li><li>You are very jealous of the person you are stalking (it’s normal but also very harmful for your inner peace).</li><li>You think their path might be your path too and what worked for them will work for you as well. (No, this is not how it works.)</li></ol><h4><strong>BUT… The question is… Is that what you really want?</strong></h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*sRowRlq1N-XBDTKk" /><figcaption>Raise your hand to answer (to yourself)</figcaption></figure><p>Taking the two loops above that you were stuck in for months as examples, ask yourself these questions:</p><p><strong>Loop 1 (the big-company internship dream):</strong></p><ol><li>Is getting a FAANG job your ultimate goal?</li><li>Is this going to help you achieve what you want? (If you know what your goals are. If you don’t, it’s fine.)</li><li>Is it the work at FAANG or just the (so-called) status symbol? Do you want this for yourself or for your relatives, who would eventually say a few good words to your face but in reality don’t care at all?</li></ol><p><strong>Loop 2 (the short-term passion for gaining followers):</strong></p><ol><li>Is your ultimate goal getting famous?</li><li>Does being around people make you happy?</li><li>Is this something that you’ll do for months and really never get bored of?</li></ol><p><strong>If the answer to most of the questions above is <em>NO</em> or <em>I don’t think so</em>. 
My friend, what’s working for them might not work for you</strong>.</p><h4>WHY?</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*KDFf1Yoq4aA3dhHy" /><figcaption>Let the part-time, inspired passions run away and focus on what YOU want.</figcaption></figure><p>Because that is not what you want. The things that look fancy to you right now might be something that you think will make you happy, but that’s not the truth.</p><p>Even if you work until you burn out and match the pace of what the other person is doing, it’s not going to help.</p><p>At some point, you’ll almost certainly try to break out of the loop again, since you entered it with no real interest. This was expected to happen.</p><h3><strong>Takeaway</strong></h3><p>In case the loop examples didn’t seem very specific to your case, the overall idea is still the same. Oftentimes people enter endless loops just because things are working for others. There’s always more to the story. What you are able to see is only half the truth. Everyone has their hustles and hard work behind them before getting to where they are right now. And if the end goal is not what you actually want, you won’t be able to make progress in any case.</p><p>The path is built from motivation and passion, and if you don’t have these then you are just trying to follow in someone else’s footsteps. While this might work sometimes, most of the time you’ll find yourself stuck in the loop.</p><p>So always think before blindly running after anything. Your own dreams and passions deserve your energy.</p><p>This post is only meant to make you realise that what they are doing might not be what you want. And even if it is what you want, the exact same path might not work.</p><p>I hope this can contribute, even slightly, to helping you find answers to whatever you are looking for. 
I love hearing different perspectives, so if you’d like to chat or just share your views, feel free to comment or reach out to me on <a href="http://twitter.com/yashika51">Twitter</a>.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Decoding Word2Vec Part by Part]]></title>
            <link>https://medium.com/swlh/decoding-word2vec-part-by-part-9d5d7b8a946e?source=rss-d3deb7eb050c------2</link>
            <guid isPermaLink="false">https://medium.com/p/9d5d7b8a946e</guid>
            <category><![CDATA[nlp]]></category>
            <category><![CDATA[word2vec]]></category>
            <category><![CDATA[data-science]]></category>
            <dc:creator><![CDATA[Yashika Sharma]]></dc:creator>
            <pubDate>Sat, 20 Jun 2020 13:31:46 GMT</pubDate>
            <atom:updated>2020-06-25T23:13:51.280Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1000/0*sknegTKtoSIP_Iw8" /><figcaption><a href="https://unsplash.com/photos/JYBBcCbRaFc">Link</a></figcaption></figure><p>I was reading about Word2Vec and realized there is a lot of important information about it, but it is all scattered. I had to read several blogs before completely understanding the logic. If you are like me and love to supplement reading the paper with blogs, this is the right place.</p><h4>What is Word2Vec?</h4><p>If you are exploring NLP you must have come across this term. Let me first introduce some other topics to build the intuition.</p><p>If you want to do exciting things with text, including sentiment analysis, question-answering, text similarity, topic modeling, and whatnot, you are going to feed words/sentences/documents (corpus 😍) to the model.</p><p>But the machine doesn’t understand text, right? So you would convert it into a machine-readable form.</p><p>For that:</p><p>The <strong>word vector</strong> is the thing you are looking for.</p><p>A word vector is the representation of a word in vector form. Well, you could just one-hot encode the words and match the encoded vectors, but that won’t serve the purpose. Imagine having 1 million words in the vocabulary and then encoding all those words to pass to the network. Why not learn to encode similarities into the vectors instead?</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*tJ6r-l1ykivns-M208r1Mw.png" /><figcaption>Word representations in vector space (2-D)</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/680/1*UUYsXfzhCwKo8qWEaz-PZA.png" /><figcaption>Word representations in vector space (3-D)</figcaption></figure><p>Crazy computation! 
Not possible (and even if it were, why would I kill my machine with heavy math and 0&#39;s?)</p><h4>What came before Word2Vec?</h4><p>Before Word2Vec, N-grams were used to model a word given the n words that precede it, but N-grams cannot capture the wider context or the similarity between words.</p><p>The overall probability would be P(current_word|n_words).</p><p>To capture similarities we need embeddings. Don’t just cluster words; seek a representation that can capture the degree of similarity.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/889/1*937Q9KQUhTmjNAqVDLhn-A.png" /><figcaption>Context includes words in a fixed window around the word in a text</figcaption></figure><p><strong>What’s the measurement of quality in Word2Vec?</strong></p><p>It’s how well the vectors perform on a word similarity task.</p><p><strong>Advantages:</strong></p><ol><li>Computationally efficient</li><li>Accurate</li><li>Performs well on finding the semantic and syntactic similarity</li></ol><p>Okay, but how do these <strong>WORD VECTORS WORK</strong>?</p><p>Each word is encoded as a vector (a list of numbers across multiple dimensions) so that it ends up close to the vectors of words that appear in similar contexts. Hence a dense vector is formed for each word. Each dimension of the vector acts as a learned feature.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/482/1*lxmX1mjxYvZ19dCXr35tYg.png" /><figcaption>Word vectors are sometimes called word embeddings or word representations. They are a distributed representation</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/877/1*MlWtMs2tj2aW_51Qy_Dd-g.png" /><figcaption>See how related words cluster around <strong>‘expect’</strong>; it’s because those word vectors are similar</figcaption></figure><pre>Word embeddings capture the meanings of words with the help of vectors. 
Adding and subtracting the vectors of some words gives rise to meaningful relationships.</pre><pre>For Example:<br>King-Man+Woman=Queen<br>(VKing-VMan+VWoman=VQueen, where V=Vector)</pre><p>With word embeddings we use a fully connected layer whose weights are called embedding weights (their values are learned while training the model, just like the weights of other layers such as Dense or CNN layers).</p><p>And this embedding weight matrix turns out to be <strong>a lookup table</strong>.</p><p>Without word embeddings, you’d one-hot encode the text and then multiply that with the hidden layer’s weights. For a vocabulary of 200 words, each one-hot vector has 199 zeros, so almost all of those multiplications are by 0. How inefficient is that?</p><p>Word embeddings come to the rescue. Since multiplying any one-hot encoded vector with the weight matrix just returns the corresponding row, we assign a unique integer to each word and then take the corresponding row from the lookup table. Thus there&#39;s no need to multiply.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/743/1*iERQ62A74FVXhndwNpVXNg.png" /><figcaption>Multiplying any OHE vector gives just the corresponding row.</figcaption></figure><p>So an embedding layer is a layer having embedding weights that are learned during training.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*3AwJ7riJOCuK2m2aQldTTg.png" /><figcaption>From Udacity’s Deep Learning Nanodegree</figcaption></figure><p>Say there is a word “heart” whose index is 958. Now the 958th row of the embedding matrix will be the output and will be passed forward to the hidden layer.</p><p>The weights in the lookup table are just vector representations of words. The columns of the matrix correspond to the embedding dimensions. Words with similar meanings have similar representations.</p><h3><strong>Finally, let’s discuss Word2Vec</strong></h3><p>The Word2Vec model uses this concept of embedding and lookup. 
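</p><p>As a minimal sketch of why the lookup works (the four-word vocabulary and the weights below are made up for illustration, and plain Python lists stand in for a real embedding layer), multiplying a one-hot vector by the weight matrix gives the same result as reading the row directly:</p><pre># Hypothetical 4-word vocabulary with 3-dimensional embeddings (made-up numbers).
vocab = {"king": 0, "queen": 1, "man": 2, "woman": 3}
embedding_weights = [
    [0.8, 0.1, 0.9],
    [0.7, 0.9, 0.9],
    [0.9, 0.1, 0.1],
    [0.8, 0.9, 0.1],
]

def one_hot(index, size):
    """Return a one-hot vector with a 1 at the given index."""
    return [1 if i == index else 0 for i in range(size)]

def multiply(vector, matrix):
    """Multiply a row vector by a matrix using plain Python."""
    columns = len(matrix[0])
    return [
        sum(vector[r] * matrix[r][c] for r in range(len(matrix)))
        for c in range(columns)
    ]

index = vocab["queen"]
via_multiplication = multiply(one_hot(index, len(vocab)), embedding_weights)
via_lookup = embedding_weights[index]

# Both routes give the same vector, but the lookup skips the multiplications.
print(via_multiplication == via_lookup)</pre><p>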
Based on the word of interest and its context, it learns the weights that make up the matrix. This learned matrix is the embedding, which captures the similarity between words.</p><p>Words that appear in similar contexts have similar representations. Word2Vec finds these similarities and relationships during training and hence prepares a master vector representation called the embedding.</p><p>Similar words are near each other and dissimilar words are far apart in the representation.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/440/1*VAC6w53f2RK0vL_BG8lk3g.png" /></figure><p>This is all about Word Vectors, Embeddings, and Word2Vec. Keep an eye out for understanding SkipGram in part-2. Show your love by clapping 👏.</p><p>Drop your questions in the comments or reach out on <a href="https://twitter.com/yashika51">Twitter</a>.</p><p>Recommended Resources:</p><ol><li><a href="https://www.kaggle.com/pierremegret/gensim-word2vec-tutorial">https://www.kaggle.com/pierremegret/gensim-word2vec-tutorial</a></li><li><a href="http://web.stanford.edu/class/cs224n/slides/cs224n-2020-lecture01-wordvecs1.pdf">http://web.stanford.edu/class/cs224n/slides/cs224n-2020-lecture01-wordvecs1.pdf</a></li><li><a href="http://jalammar.github.io/illustrated-word2vec/">http://jalammar.github.io/illustrated-word2vec/</a></li></ol><hr><p><a href="https://medium.com/swlh/decoding-word2vec-part-by-part-9d5d7b8a946e">Decoding Word2Vec Part by Part</a> was originally published in <a href="https://medium.com/swlh">The Startup</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Understanding Count Vectorizer]]></title>
            <link>https://medium.com/swlh/understanding-count-vectorizer-5dd71530c1b?source=rss-d3deb7eb050c------2</link>
            <guid isPermaLink="false">https://medium.com/p/5dd71530c1b</guid>
            <category><![CDATA[text-preprocessing]]></category>
            <category><![CDATA[count-vectorizer]]></category>
            <category><![CDATA[naturallanguageprocessing]]></category>
            <category><![CDATA[data-preprocessing]]></category>
            <category><![CDATA[nlp]]></category>
            <dc:creator><![CDATA[Yashika Sharma]]></dc:creator>
            <pubDate>Thu, 21 May 2020 13:47:58 GMT</pubDate>
            <atom:updated>2020-05-24T02:26:52.411Z</atom:updated>
            <content:encoded><![CDATA[<p>Whenever we work on any NLP-related problem, we process a lot of textual data. After processing, the textual data needs to be fed into the model.</p><p>Since the model doesn’t accept textual data and only understands numbers, this data needs to be vectorized.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1000/0*XcNWxnBYDK1EBXof" /><figcaption><a href="https://unsplash.com/photos/BVyNlchWqzs">Reference</a></figcaption></figure><p><strong>What do I mean by vectorized?</strong></p><p>Before we use text for modeling we need to process it. The steps include removing stop words, lemmatizing, stemming, tokenization, and vectorization. Vectorization is the process of converting the text data into a machine-readable form. The words are represented as vectors.</p><p>However, our main focus in this article is on Count Vectorizer. Let’s get started by understanding the Bag of Words model:</p><h4>Bag of Words(BoW)</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1000/0*jUF1evoDMjVXQCaH" /><figcaption><a href="https://unsplash.com/photos/2NiVOHcIx4I">Reference</a></figcaption></figure><p>As already mentioned, we cannot process text directly, so we need to convert it into numbers. The Bag of Words (BoW) model is a fundamental (and old) way of doing this.</p><p>The model is very simple: it discards the grammar and word order of the text and only considers how often each word occurs. It converts each document to a fixed-length vector of numbers.</p><p>A unique number is assigned to each word. The vector has the length of the vocabulary (the collection of all the unique words), and each position holds the frequency of the corresponding word. This is the encoding of the words, in which we focus on the presence of a word and not on its order.</p><p>There are multiple ways in which we can define what this ‘encoding’ would be. 
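</p><p>As a rough sketch of the idea (this is not how scikit-learn implements it, and the two documents are made up for illustration), a count-based Bag of Words encoding can be built with plain Python:</p><pre>from collections import Counter

# Two made-up documents; real text would need proper tokenization.
documents = ["the cat sat on the mat", "the dog sat"]

# Vocabulary: every unique word, each assigned a fixed position.
vocabulary = sorted({word for doc in documents for word in doc.split()})

def encode(doc):
    """Return a fixed-length vector of word counts for one document."""
    counts = Counter(doc.split())
    return [counts[word] for word in vocabulary]

vectors = [encode(doc) for doc in documents]
print(vocabulary)
print(vectors)</pre><p>Every document ends up with a vector as long as the vocabulary, and word order plays no role in the result.</p><p>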
Our focus in this post is on <strong>Count Vectorizer.</strong></p><h4>Count Vectorizer:</h4><p>CountVectorizer tokenizes the text (tokenization means dividing the sentences into words) and performs some very basic preprocessing. It removes the punctuation marks and converts all the words to lowercase.</p><p>A vocabulary of known words is formed, which is also used for encoding unseen text later.</p><p>An encoded vector is returned with the length of the entire vocabulary and an integer count for the number of times each word appeared in the document. The image below shows what I mean by the encoded vector.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*edJflAjc-CTmsku0QyPt8g.png" /><figcaption>Count Vectorizer sparse matrix representation of words. (a) is how you visually think about it. (b) is how it is really represented in practice.</figcaption></figure><p>Each row of the above matrix represents a document, and the columns contain all the unique words with their frequencies. If a word did not occur in a document, the corresponding entry in that document’s row is zero.</p><p>Think of it like a one-hot encoded vector: with a large vocabulary, we naturally get a sparse matrix with a lot of zeros.</p><p>The scikit-learn library offers functions to implement Count Vectorizer. Let’s check out the code examples.</p><h4>Examples</h4><p>In the code block below we have a list of texts. Here each row is a document. We are keeping it short to see how Count Vectorizer works.</p><p>First things first, let’s do the import. 
Also, note the variable document, which contains the list of documents we are going to process:</p><pre>from sklearn.feature_extraction.text import CountVectorizer</pre><pre>document=[&quot;devastating social and economic consequences of COVID-19&quot;,<br>&quot;investment and initiatives already ongoing around the world to expedite deployment of innovative COVID-19&quot;,<br>&quot;We commit to the shared aim of equitable global access to innovative tools for COVID-19 for all&quot;,<br>&quot;We ask the global community and political leaders to support this landmark collaboration, and for donors&quot;,<br>&quot;In the fight against COVID-19, no one should be left behind&quot;]</pre><p>The second step is to initialize the object cv_doc for using Count Vectorizer and fit it on our document:</p><pre>cv_doc=CountVectorizer()</pre><pre>vocab=cv_doc.fit(document)</pre><p>Fitting preprocesses and tokenizes the text (word-level tokenization, meaning each word is a separate token) and builds the vocabulary. Calling transform (or fit_transform) then returns the counts as a sparse matrix. The best part is that it ignores single characters like I and a during tokenization.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/248/1*lfyBkoE-5EV1NQXmXNVbRA.png" /><figcaption>This is what our vocab looks like.</figcaption></figure><p>To see the complete vocabulary we can write vocab.vocabulary_ .</p><p>Note that the numbers here are not the counts; they are the column positions in the sparse matrix.</p><p>Further, there are some additional parameters you can play with.</p><ol><li><strong>Stop words: </strong>You can pass the stop_words list as an argument. The stop words are words that are not significant and occur frequently. For example ‘the’, ‘and’, ‘is’, ‘in’ are stop words. 
The list can be custom or predefined (for example, stop_words=&#39;english&#39;).</li></ol><p>Define your own list of stop words that you don’t want to see in your vocabulary.</p><pre>cv1=CountVectorizer(stop_words=[&#39;the&#39;,&#39;we&#39;,&#39;should&#39;,&#39;this&#39;,&#39;to&#39;])</pre><pre>#check out the stop_words you specified<br>cv1.stop_words</pre><p>2. <strong>min_df: </strong>min_df sets how frequent a word must be to stay in the vocabulary. There might be some words that appear only once or twice and may qualify as noise.</p><p><strong>What does min_df do?</strong></p><p>With min_df=2, only words that are present in at least 2 documents are kept. We can also pass a proportion instead of an absolute number.</p><p>For example, min_df=0.25 ignores words that are present in less than 25% of the documents.</p><pre>cv2=CountVectorizer(min_df=2)</pre><p>3. <strong>max_df: </strong>Similar to min_df, there is max_df, which controls how the most frequent words are treated. There might be some words that are very frequent and that you don’t want to include in your vocab; in that case, max_df is used.</p><p>It is the opposite of min_df: words that are present in more than the specified number (or proportion) of documents are ignored.</p><p>Let’s test the proportion instead of the absolute number here. If words are present in more than 25% of the documents, they are ignored.</p><pre>cv3=CountVectorizer(max_df=0.25)</pre><p>4. <strong>Tokenizer:</strong> If you want to specify your custom tokenizer, you can create a function and pass it to the count vectorizer during the initialization. We have used the NLTK library to tokenize our text.</p><pre>from nltk.tokenize import word_tokenize<br>def tok(text):<br> #return a list of tokens; word_tokenize is one NLTK option<br> return word_tokenize(text)</pre><pre>cv4=CountVectorizer(tokenizer=tok)</pre><p>5. 
<strong>Custom Preprocessing:</strong> The same goes for preprocessing: if you want to include a stemmer or lemmatizer when preprocessing the text, you can define a custom function just like we did for the tokenizer. Although our data is clean in this post, real-world data is very messy, and if you want to clean it as part of Count Vectorizer you can pass your custom preprocessor as an argument. Keeping the example simple, we are just lowercasing the text and then removing special characters.</p><pre>import re<br>def preprocess(text):<br> #lowercase the text, then remove special characters<br> return re.sub(r&#39;[^a-z0-9\s]&#39;, &#39;&#39;, text.lower())</pre><pre>cv5=CountVectorizer(preprocessor=preprocess)</pre><p>6. <strong>n-grams:</strong> Combinations of words are sometimes more meaningful. Let’s say we have the words <em>‘sunny’</em> and<em> ‘day’</em>;<em> ‘sunny day’</em> combined makes more sense. This is a bigram. We can use both character-level and word-level n-grams. ngram_range=(1,2) specifies that we want to consider both unigrams (single words) and bigrams (combinations of 2 words).</p><pre>cv6=CountVectorizer(ngram_range=(1,2))</pre><p>7. <strong>Limiting Vocabulary size: </strong>We can set the maximum vocabulary size we intend to keep using max_features. In this example we are going to limit the vocabulary size to 20.</p><pre>cv7=CountVectorizer(max_features=20)</pre><p>Phew! That’s all for now. CountVectorizer is just one of many methods to deal with textual data. TF-IDF and embeddings are often better methods to vectorize the data. 
More on that later.</p><p>To access the code used in this article, check out the repository <a href="https://github.com/yashika51/Understanding-Count-Vectorizer">here</a>.</p><p>Recommended Resources:</p><ul><li><a href="https://www.kaggle.com/pceccon/countvectorizer-and-tfidf-strategies">CountVectorizer and Tfidf strategies</a></li><li><a href="https://kavita-ganesan.com/how-to-use-countvectorizer/#.XsaDTmgzZPZ">10+ Examples for Using CountVectorizer | Kavita Ganesan</a></li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=5dd71530c1b" width="1" height="1" alt=""><hr><p><a href="https://medium.com/swlh/understanding-count-vectorizer-5dd71530c1b">Understanding Count Vectorizer</a> was originally published in <a href="https://medium.com/swlh">The Startup</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[EmoTorch]]></title>
            <link>https://medium.com/@yashika51/emotorch-448d40b1f56c?source=rss-d3deb7eb050c------2</link>
            <guid isPermaLink="false">https://medium.com/p/448d40b1f56c</guid>
            <category><![CDATA[facebook-ai-hackathon]]></category>
            <category><![CDATA[pytorch]]></category>
            <category><![CDATA[vgg19]]></category>
            <category><![CDATA[emotion-recognition]]></category>
            <category><![CDATA[artificial-intelligence]]></category>
            <dc:creator><![CDATA[Yashika Sharma]]></dc:creator>
            <pubDate>Mon, 16 Mar 2020 22:43:04 GMT</pubDate>
            <atom:updated>2020-03-16T22:43:04.800Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/626/0*tgtQC6nbKSM9GMvG.jpg" /><figcaption>Sample from FER2013 Dataset with labels</figcaption></figure><p><strong>EmoTorch</strong> is a project built as a part of the <a href="https://devpost.com/software/emotion-recognition-for-a-recommender-system">Facebook AI Hackathon 2020</a> using PyTorch. The project aims at predicting the emotion of a person based on an image of their face. The image can be anything from a selfie to an image captured via the mobile’s front camera or a webcam while the user scrolls their feed.</p><p>PyTorch is the main tool used in this project because of its diverse modules and packages.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/876/0*Hxn7SRSqC0YPeKlw.jpg" /></figure><p>This image is then fed to the neural network, which extracts features from the image and predicts the most likely emotion out of 7 common emotions.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/726/0*KQdCjTAg0qsZhXht.jpg" /><figcaption>Most likely predicted classes</figcaption></figure><p>This article explains the model; the motivation behind the idea and the future scope will be covered in the next article.</p><p>The <a href="https://github.com/AhMeDxHaMiDo/EmoTorch/">EmoTorch repository</a> is self-explanatory to a great extent, but we will give a top-level overview of the project here.</p><h3>Choosing the Dataset</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/932/1*JE0GJ-DBNa9CBfY6mQnnXQ.png" /><figcaption><a href="https://datarepository.wolframcloud.com/resources/FER-2013">https://datarepository.wolframcloud.com/resources/FER-2013</a></figcaption></figure><p>The first step in any project is to choose a dataset; we chose the publicly available FER dataset for our task. 
The reasons for choosing this dataset:</p><ul><li>Its images are categorized into one of seven emotions.</li><li>It is publicly available.</li><li>Its size is suitable for our task: 28,709 training examples, 3,589 test examples, and 3,589 validation examples.</li></ul><p>The data was pulled from a past Kaggle Competition.</p><p>There are two files available. The first file, train.csv, contains two columns, “emotion” and “pixels”. The “emotion” column contains a numeric code ranging from 0 to 6, inclusive, for the emotion that is present in the image. The “pixels” column contains a string surrounded in quotes for each image. The contents of this string are space-separated pixel values in row-major order. The second file, test.csv, contains only the “pixels” column, and our task is to predict the emotion column.</p><p>The emotions available are as follows:</p><pre>{&#39;0&#39;: &#39;angry&#39;,<br> &#39;1&#39;: &#39;disgust&#39;,<br> &#39;2&#39;: &#39;fear&#39;,<br> &#39;3&#39;: &#39;happy&#39;,<br> &#39;4&#39;: &#39;neutral&#39;,<br> &#39;5&#39;: &#39;sad&#39;,<br> &#39;6&#39;: &#39;surprise&#39;}</pre><p>However, the model is built to work with any dataset. We could use commercial datasets with our model and may get better results.</p><p>After choosing the dataset, we preprocessed the images. The images were already center-cropped with usable dimensions, so we chose not to resize them and only normalized the images and converted them to tensors.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/840/1*XPpl32f3rqU6huxVPUG4nw.png" /><figcaption>Data Augmentation</figcaption></figure><h3>Network Architecture</h3><p>The motivation for using transfer learning for our task came after we implemented a deep neural network from scratch. The model built from scratch gives an accuracy of only around 18%-20%. 
We boosted the accuracy with the help of transfer learning.</p><p>The torchvision subpackage <a href="https://pytorch.org/docs/stable/torchvision/models.html">models</a> has a variety of pre-trained networks that can be easily downloaded.</p><p>For EmoTorch, we tried multiple networks before settling on VGG19.</p><p>Initially, we used VGG16, which gave us an accuracy below 40%, followed by ResNet50 with 41% and DenseNet101 with 42.5%.</p><p>VGG19 yields an accuracy of 46%, which is better than the other pre-trained models we tried.</p><p>Therefore, we decided to choose VGG19 for the implementation.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/857/0*G1-p66vASJSsDwQH.jpg" /><figcaption>VGG19 Architecture</figcaption></figure><h3>Model &amp; Hyperparameters</h3><p>The pre-trained models are trained using the ImageNet dataset, which has 1000 classes. Our task is only to classify the images into one of the 7 emotions, so we had to alter the classification layer. We built our own classifier network to attach to the pre-trained VGG19 layers.</p><p>For this task we chose:</p><ul><li>A dense hidden layer with 1024 units</li><li>ReLU activation function</li><li>Dropout layers in between the hidden layers with p=0.2</li><li>Adam optimizer</li><li>25 epochs</li><li>Batch size of 64</li></ul><h3>The Imbalanced Dataset</h3><p>The distribution of samples per category in the FER dataset is not balanced. The category disgust is the least represented, with only 547 samples, whereas the category happy is the most represented, with 8989 samples.</p><p>Future scope lies in augmentation. Multiple balancing techniques can be used to give each category a similar number of samples, which should result in higher accuracy.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/492/0*4DSWvjGQeFKBR-tp" /><figcaption>Data Distribution</figcaption></figure><h3>Accuracy</h3><p>The model gives an accuracy of 46%, which we attribute to the dataset we used. 
A large commercial dataset with high-resolution pictures would likely give better accuracy.</p><h4>Some of the plots we made with TensorBoard:</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*7GC1t5DseGP_REVd" /><figcaption>Training Loss</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*dsPOBEwcet5eUUQz" /><figcaption>Validation Loss</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*BMKL3V1qxp5eWs4M" /><figcaption>Validation Accuracy</figcaption></figure><h3>A Few Examples from the Test Set</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/759/1*QNcNrZlcs0uVEk2TozGhFA.png" /><figcaption>Top 3 predicted classes</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/497/0*LhDrRP0j0EkOmMvg.jpg" /><figcaption>All class probabilities</figcaption></figure><h3>What’s Next?</h3><p>EmoTorch can be combined with recommendation systems. When an image is passed to our model, it returns the predicted emotion. This emotion can be used by the system to recommend products.</p><p>Recommendation systems often work based on a user’s watch history or buying history. EmoTorch gives real-time predictions that could help make recommendations more accurate. Users can either feed a selfie to the system, or the front camera can track their facial expressions with the user’s consent. In either case, the image will then be processed by EmoTorch, and the prediction will be used by the system to recommend songs to listen to, movies to watch, products to buy, places to visit and much more.</p><p>Contributors to EmoTorch:</p><ul><li>Yashika Sharma</li><li>Nathan Curtis</li><li>Ahmed Hamido</li></ul><p><a href="https://github.com/AhMeDxHaMiDo/EmoTorch/">Visit</a> the project repository.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=448d40b1f56c" width="1" height="1" alt="">]]></content:encoded>
        </item>
    </channel>
</rss>