<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Andrei Lapets on Medium]]></title>
        <description><![CDATA[Stories by Andrei Lapets on Medium]]></description>
        <link>https://medium.com/@lapets?source=rss-f6b147043b04------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/2*govjM3GR9J6ATUjNktS2zg.jpeg</url>
            <title>Stories by Andrei Lapets on Medium</title>
            <link>https://medium.com/@lapets?source=rss-f6b147043b04------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Wed, 08 Apr 2026 19:59:03 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@lapets/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Accessible and Scalable Secure Data Evaluation]]></title>
            <link>https://medium.com/nthparty/accessible-and-scalable-secure-data-evaluation-de13abfd086c?source=rss-f6b147043b04------2</link>
            <guid isPermaLink="false">https://medium.com/p/de13abfd086c</guid>
            <category><![CDATA[data]]></category>
            <category><![CDATA[data-privacy]]></category>
            <category><![CDATA[security]]></category>
            <category><![CDATA[advertising-and-marketing]]></category>
            <category><![CDATA[secure-computation]]></category>
            <dc:creator><![CDATA[Andrei Lapets]]></dc:creator>
            <pubDate>Wed, 21 Jul 2021 06:07:20 GMT</pubDate>
            <atom:updated>2021-07-22T20:11:07.203Z</atom:updated>
            <content:encoded><![CDATA[<h4>Infutor protects customer data using multi-party computation</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ooQTMHYhg0eYmNiVOW94sw.png" /></figure><p>Members of the Nth Party team have spent the <a href="https://multiparty.org/">last six years</a> building and enhancing libraries, applications, and products that deliver or incorporate secure multi-party computation (MPC) protocols and capabilities. More recently, browser-based tools released by Nth Party (such as <a href="https://nth.link/">nth.link</a>) have demonstrated that MPC is ready to secure common data workflows and transactions encountered in domains such as marketing and advertising.</p><p>Over the last several months, Nth Party has had the opportunity to collaborate with <a href="https://infutor.com/">Infutor Data Solutions</a> in building an accessible, scalable secure MPC solution for a common data workflow: evaluation of a prospective customer’s data against a full-scale identity graph. <a href="https://infutor.com/infutor-taps-nth-party-to-enhance-consumer-identity-data-match-test-capability/">Launched</a> as part of Infutor’s new <a href="https://infutor.com/test-drive">Test Drive</a> experience, the solution allows prospects and customers to explore how Infutor’s products can add value <em>without revealing their data to Infutor</em> <em>or any other third party</em>.</p><h3>Secure Data Evaluation at Scale with Infutor</h3><p>Within the fractured value chains of the data-driven marketing and advertising spaces, companies engage in an innumerable quantity of data transactions on a daily basis. Often, even transactions that may be exploratory in nature (such as those in which a prospective customer is evaluating a vendor’s product or service) involve the transfer of sensitive data about consumers. Conducting these transactions in a safe and compliant manner can require lengthy negotiations, the assembly of legal agreements, the use of data governance tools and platforms, and investment in secure data storage infrastructure or services. But these mitigation techniques are not addressing the root issue: <em>data must be transferred between parties and processed in its decrypted form</em>.</p><p>Secure MPC eliminates the root issue: only encrypted data is ever transferred, and <em>no one except the original data owner has a decryption key</em>. This is a key feature of secure MPC protocols, and is <a href="https://www.infosum.com/blog/assessing-googles-data-collaboration-tool-private-join-and-compute">sometimes misunderstood</a>: <strong>transfer of encrypted data to a recipient who has no key involves no transfer of information</strong> — and thus carries no risk— because the data cannot be decrypted.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/540/1*aZQByBR0p9vPkdsNbEfvNg.png" /></figure><p>Using MPC techniques and protocols, Infutor’s Test Drive experience ensures that data is encrypted in a prospective customer’s browser and never decrypted after that point. Quoting the <a href="https://infutor.com/infutor-taps-nth-party-to-enhance-consumer-identity-data-match-test-capability/">press release</a>:</p><blockquote><em>“Data security and privacy has always been our #1 priority,”</em> says Gary Walter, Chief Executive Officer of Infutor, <em>“With Nth Party’s encryption technology we don’t need to see the data to demonstrate value. It’s a win-win for us and our clients.”</em></blockquote><p>In only a few minutes, the secure computation compares the customer’s encrypted data to Infutor’s identity graph (which features hundreds of millions of records). The customer can then view the results to instantly explore how Infutor’s products can help them achieve their business objectives.</p><h3>Accessible and Scalable MPC</h3><p>Nth Party’s work with Infutor builds on <a href="https://medium.com/nthparty/introducing-nth-link-9f956d6a63c3">previous product releases</a> and demonstrates once again that given the right delivery mechanism and configuration, MPC protocols are more than ready to address real-world data analysis workflows at industrial scales. While some skepticism is sometimes expressed about the maturity of MPC techniques, we hope this work builds confidence in the marketplace that services providers can offer their customers the strong security benefits of well-studied technologies such as secure MPC and private set intersection (which underlie <a href="https://nth.link/">nth.link</a>, <a href="https://engineering.fb.com/2020/07/10/open-source/private-matching/">Facebook’s Private-ID</a>, and <a href="https://github.com/Google/private-join-and-compute">Google’s Private Join and Compute</a>, among others). Furthermore, the security benefits of MPC can be incorporated into on-premise software-only solutions…</p><ul><li>without the need for trusted third parties or clean rooms,</li><li>without use of any specialized hardware, and</li><li>without moving data (either that of the service providers or their customers) to expensive third-party SaaS platforms.</li></ul><p>How was the MPC solution for Infutor’s Test Drive experience designed and developed? Throughout its work on practical use cases involving MPC, the Nth Party team has focused on identifying low-hanging fruit opportunities: find the <a href="https://ieeexplore.ieee.org/document/7839796">simplest protocol</a> that leverages <a href="https://eprint.iacr.org/2017/803.pdf">asymmetry of participant roles</a> to offer the required privacy benefits at scale, combine it with <a href="https://en.wikipedia.org/wiki/Introduction_to_Algorithms">well-understood and long-established algorithms and data structures</a>, and build <a href="https://ieeexplore.ieee.org/document/8901552/">software libraries</a> and <a href="https://dl.acm.org/citation.cfm?id=3212701">applications</a> that leverage contemporary serverless computing platforms and operate within ubiquitous environments such as web browsers.</p><p>The Nth Party team looks forward to building more commercial MPC applications that help secure data-driven workflows, and to contributing <a href="https://github.com/nthparty">open-source libraries</a> that help everyone build production-quality, at-scale secure applications.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=de13abfd086c" width="1" height="1" alt=""><hr><p><a href="https://medium.com/nthparty/accessible-and-scalable-secure-data-evaluation-de13abfd086c">Accessible and Scalable Secure Data Evaluation</a> was originally published in <a href="https://medium.com/nthparty">Nth Party</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Privacy-Preserving Information Exchange Using Python]]></title>
            <link>https://medium.com/nthparty/privacy-preserving-information-exchange-using-python-1a4a11bed3d5?source=rss-f6b147043b04------2</link>
            <guid isPermaLink="false">https://medium.com/p/1a4a11bed3d5</guid>
            <category><![CDATA[open-source]]></category>
            <category><![CDATA[cryptography]]></category>
            <category><![CDATA[privacy]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[software-development]]></category>
            <dc:creator><![CDATA[Andrei Lapets]]></dc:creator>
            <pubDate>Fri, 30 Oct 2020 18:01:00 GMT</pubDate>
            <atom:updated>2021-09-20T06:02:09.484Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*qv_uRc6kpn91vhh9" /><figcaption>Photo by <a href="https://unsplash.com/@lucabravo?utm_source=medium&amp;utm_medium=referral">Luca Bravo</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><p>Imagine a scenario involving a vendor that offers two distinct digital products for sale at the same price (such as mobile apps or digital content) and a customer that would like to purchase exactly one of these two products. Furthermore, suppose that the customer and vendor are interested in performing this transaction in a way that preserves the privacy of the customer. To be more specific, whatever approach is used needs to satisfy the following two criteria:</p><ul><li>the customer does not disclose to the vendor which of the two digital products they are purchasing , but</li><li>the vendor allows the customer to purchase exactly one of the two digital products.</li></ul><p>Is this even possible? One approach the vendor and customer can employ is to recruit a <em>trusted third party</em>. This third party can provide a kind of escrow service: retrieve a copy of each of the two products from the vendor and accept the selection from the customer, and then deliver to the customer only a copy of the selected product.</p><p>This approach satisfies the two criteria as they were originally stated, but it introduces a number of potential issues. Two of these include (1) the need to recruit (and potentially compensate) a third party and (2) the necessary disclosure of the customer’s selection to the third party. The latter of these may be particularly problematic if the customer actually does not wish to disclose their selection to <em>any</em> third party (which just happens to include the vendor).</p><h3>Exchanging Information via Oblivious Transfer</h3><p>In fact, it possible for the vendor and customer to perform this transaction while satisfying the original two conditions and <em>without</em> relying on a third party. Instead, the vendor and customer could each use a simple piece of software to communicate with one another via a cryptographic protocol for exchanging information known as <em>oblivious transfer</em>, or OT.</p><p>A technique first published in 1985 called <a href="http://www.lix.polytechnique.fr/~catuscia/teaching/papers_and_books/SigningContracts.pdf">one-out-of-two oblivious transfer</a> is a form of secure computation that allows two parties to interact in exactly the way the vendor and customer in our scenario would like: a sender can deliver exactly one of two messages to a receiver without knowing which message it delivered. Since that time, generalized and streamlined variants of this technique have been developed, such as a <a href="https://eprint.iacr.org/2017/370">simple protocol</a> published in 2015 by Genç, Iovino, and Rial.</p><h3>Simple OT using Python</h3><p>The open-source <a href="https://pypi.org/project/otc/">otc</a> Python library published by Nth Party provides an encapsulated implementation of the protocol published by Genç, Iovino, and Rial. How can the library be used to perform a privacy-preserving transaction that satisfies the original criteria?</p><p>The library lets programmers construct one of two objects: a sender object or a receiver object. In the transaction scenario, the vendor is the sender and the customer is the receiver. For the purposes of this example, the two products available for purchase are each a 16-byte string (this is not implausible in practice, as the two 16-byte strings could be cryptographic keys that can be used to decrypt larger files).</p><p>At the beginning of a transaction, the vendor must create a sender object (assigned to the variable s in the example below) and send a public key s.public to the receiver. To learn more about why this is called a <em>public key</em> and how public keys are used in secure communication, you may want to delve deeper into the details of <a href="https://en.wikipedia.org/wiki/Public-key_cryptography">public-key cryptography</a>.</p><pre>&gt;&gt;&gt; <strong>import</strong> otc<br>&gt;&gt;&gt; s = otc.send()<br>&gt;&gt;&gt; s.public <em># Public key to send to the receiver.<br></em>b&#39;\x18\x91\xee\xc9\xe7|\x81k\xf5a\xd2\x9b\xdbc\x92\xe9\x8c\xc4\x1c)\xb6u\x90\xb0\xfc\x91\x04\xc7\x80\xcd~z&#39;</pre><p>Once they receive the public key s.public, the customer can create a receiver object r and use it to build a query byte string (assigned to the variable query below) that effectively represents an encrypted request for one of the two items. In the example below, the receiver is requesting item 1 (the choices are 0 or 1). Note that the sender <em>is not able to decrypt</em> this query and <em>cannot determine which item is being requested</em> by examining it.</p><pre>&gt;&gt;&gt; <strong>import</strong> otc<br>&gt;&gt;&gt; r = otc.receive()<br>&gt;&gt;&gt; selection = r.query(s.public, 1) <em># Use public key from sender.</em><br>&gt;&gt;&gt; selection <em># Selection to share with sender.</em><br>b&#39;z\x01T\xbc\xa8\r2\xf0@v\x16k\xb7_\x01\x1a:\xdd\x8d\xb2\x8du1\xee\x99\xd1\xe0\xd1|\xe5\xad\x11&#39;</pre><p>Once the sender receives the query byte string query, they can build a reply (assigned to the variable replies in the example below) consisting of a pair of byte strings. These two byte strings are the encrypted versions of the two products, but <em>only</em> the product originally selected by the receiver can be successfully decrypted. In the example below, the products offered by the vendor are two 16-letter words.</p><pre>&gt;&gt;&gt; replies = s.reply(<br>...     selection,<br>...     &#39;absentmindedness&#39;.encode(),<br>...     &#39;wholeheartedness&#39;.encode()<br>... )<br>... <br>&gt;&gt;&gt; replies <em># Encrypted replies to share with receiver.</em><br>(b&#39;\xd8\xda\xdf\xf0\x89JsJ\xb5\x9e0\x0b\xe8Kd\xcf\x1f\x92\xf2\x18\r\xc6r\xdc)\x04\xa0\x990\x93\xc1f&#39;, b&#39;\xd0iy3\xa2\xb8\xbf\xefI\x0eF\xf9\rI^\xf9\xaf\x7fwO\xbd\x18\x9cL\x12\xba&gt;\xd2V\xed\xec\xb4&#39;)</pre><p>The receiver can now use the original receiver object r constructed at the beginning to obtain the product (which is selection 1 from the two choices 0 and 1) to which they originally committed when they generated their query.</p><pre>&gt;&gt;&gt; r.elect(s.public, 1, *replies).decode()<br>&#39;wholeheartedness&#39;</pre><p>Note that reversing the two parts of the reply (in effect, attempting to decrypt the other product) will result in a decryption error.</p><pre>&gt;&gt;&gt; <strong>try</strong>:<br>...     r.elect(s.public, 1, *reversed(rs)).decode()<br>... <strong>except</strong> Exception <strong>as</strong> e:<br>...     print(e)<br>...<br>Decryption failed. Ciphertext failed verification</pre><p>Of course, having only two products that are each a 16-byte string is not a very realistic scenario. But these limitations can be overcome by creatively chaining multiple instances of the back-and-forth exchange presented in the example above. For now, we leave these generalizations of the approach as an exercise for the reader.</p><h3>Implementation Details</h3><p>The <a href="https://pypi.org/project/otc/">otc</a> library is an implementation of the <a href="https://eprint.iacr.org/2017/370">Genç, Iovino, and Rial protocol</a> that relies on cryptographic primitives that consist of operations involving <a href="https://en.wikipedia.org/wiki/Elliptic-curve_cryptography">elliptic curve</a> points and scalars. This OT library in turn leverages another library, <a href="https://pypi.org/project/oblivious/">oblivious</a>, that provides a single API that can seamlessly switch between <a href="https://libsodium.gitbook.io/doc/">libsodium</a> and pure-Python implementations of the necessary primitives. This means the OT library can be used either in conjunction with a compiled instance of libsodium (yielding much better performance) or on its own (ensuring better portability at the expense of performance).</p><h3>Practical Secure Computation</h3><p>Hopefully, this article illustrates that incorporating cryptographic techniques like oblivious transfer into software applications that are implemented using contemporary, ubiquitous programming languages such as Python is becoming more practical. Visit the <a href="https://github.com/nthparty">Nth Party GitHub</a> to find and/or contribute to <a href="https://github.com/nthparty/otc">otc</a>, <a href="https://github.com/nthparty/oblivious">oblivious</a>, and other libraries that you might find useful for your projects. And, of course, be sure to check back here in the future for more examples and tutorials like this one.</p><p><em>This article is also available as a </em><a href="https://github.com/nthparty/article-privacy-preserving-information-exchange/blob/master/privacy-preserving-information-exchange.ipynb"><em>Jupyter Notebook</em></a><em> on </em><a href="https://github.com/nthparty"><em>GitHub</em></a><em>.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=1a4a11bed3d5" width="1" height="1" alt=""><hr><p><a href="https://medium.com/nthparty/privacy-preserving-information-exchange-using-python-1a4a11bed3d5">Privacy-Preserving Information Exchange Using Python</a> was originally published in <a href="https://medium.com/nthparty">Nth Party</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Guide to Publishing Packages]]></title>
            <link>https://medium.com/python-supply/guide-to-publishing-packages-109dbbf41a77?source=rss-f6b147043b04------2</link>
            <guid isPermaLink="false">https://medium.com/p/109dbbf41a77</guid>
            <category><![CDATA[python]]></category>
            <category><![CDATA[software-distribution]]></category>
            <category><![CDATA[software-testing]]></category>
            <category><![CDATA[software-engineering]]></category>
            <category><![CDATA[open-source]]></category>
            <dc:creator><![CDATA[Andrei Lapets]]></dc:creator>
            <pubDate>Thu, 29 Oct 2020 21:51:00 GMT</pubDate>
            <atom:updated>2021-05-24T04:59:41.423Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*XDk0pzCUmWYmZrYFsqqQbw.png" /></figure><p>This article is a step-by-step guide to assembling and publishing a small, open-source Python package. While not all of the steps below will be appropriate or desirable for every package, each of these can contribute to the accessibility and maintainability of the package. Note that this material is meant to serve as a roadmap and overview; it is not a thorough review of all the nuances and trade-offs involved. You are encouraged to consult additional resources (links are included throughout the text) for other options and viewpoints on how to approach every portion of the process.</p><p>Each section covers a task or category of tasks related to organizing and publishing a package and, where appropriate, provides templates and examples. Note that depending on when you are reading this article, some of the steps, templates, and examples may be out of date.</p><h3>Project Organization and Directory Tree Template</h3><p>The directory structure below is one way in which a hypothetical project called <strong>published</strong> might be organized.</p><pre>├─ .gitignore ............. File name patterns for Git to ignore <br>├─ .travis.yml ............ Travis CI configuration<br>│<br>├─ LICENSE ................ Distribution license<br>├─ README.rst ............. Project README<br>├─ setup.cfg .............. Configuration with parameters for setup<br>├─ setup.py ............... Package file<br>│<br>├─ published/<br>│  ├─ __init__.py ......... Namespace module<br>│  └─ published.py ........ Library module<br>│<br>└─ test/<br>   └─ test_published.py ... Unit tests</pre><p>For the purposes of this article, it is assumed that the package contains just one source module (<em>i.e.</em>, published/published.py). Therefore, it is sufficient to place it (along with its <a href="https://docs.python.org/3/tutorial/modules.html#packages">namespace module</a> published/__init__.py) into its own directory. The sections below go into more detail about each of the files and their purpose.</p><h3>Package File Organization</h3><p>The setup.py <a href="https://docs.python.org/3/distutils/apiref.html?highlight=setup%20py#distutils.core.setup">package file</a> specifies relevant metadata associated with your project, the locations of some important project files, and the dependencies that your project requires. An example of a possible setup.py file for the hypothetical package called <strong>published</strong> is presented below.</p><pre><strong>from</strong> setuptools <strong>import</strong> setup<br><br><strong>with</strong> open(&quot;README.rst&quot;, &quot;r&quot;) <strong>as</strong> fh:<br>    long_description = fh.read()<br><br>setup(<br>    name=&quot;published&quot;,<br>    version=&quot;0.1.0&quot;,<br>    packages=[&quot;published&quot;,],<br>    install_requires=[],<br>    license=&quot;MIT&quot;,<br>    url=&quot;https://pypi.org/project/published&quot;,<br>    author=&quot;python.supply&quot;,<br>    author_email=&quot;contact@python.supply&quot;,<br>    description=&quot;Example illustrating how an open-source library &quot;+\<br>                &quot;can be organized and published.&quot;,<br>    long_description=long_description,<br>    long_description_content_type=&quot;text/x-rst&quot;,<br>    test_suite=&quot;nose.collector&quot;,<br>    tests_require=[&quot;nose&quot;],<br>)</pre><h3>Establishing and Checking Style Conventions</h3><p>One easy way to add <a href="https://en.wikipedia.org/wiki/Lint_(software)">linting</a> to your project is to create a default configuration file for <a href="https://www.pylint.org/">Pylint</a>.</p><pre>python -m pip install pylint<br>pylint --generate-rcfile &gt; .pylintrc</pre><p>You can then check your project source files in the following way.</p><pre>pylint published</pre><p>In some cases, you may find it necessary to alter certain parameters in the configuration.</p><ul><li>The variable-rgx, class-rgx, constant-rgx, and similar parameters can be used to specify an alternate standard for acceptable naming conventions. This can be reasonable to do in certain scenarios. For example, if you are implementing a specialized library that involves certain mathematical constructs, it may be more clear to your audience and in accordance with domain conventions to use single-letter variables such as x and y rather than longer identifiers that use snake case.</li><li>You might add rules that you want to ignore to the comma-separated list that follows the disable= parameter.</li></ul><p>You may also want to direct Pylint to ignore a rule in a specific location within a source file. This can be done using a directive inside a comment. An example is presented below.</p><pre><strong>class</strong> published(): <em># pylint: disable=C0103,R0903</em><br>    <strong>def</strong> __init__(self):<br>        self.published = <strong>True</strong></pre><h3>Defining Unit Tests and Measuring Test Coverage</h3><p>There is an extensive supply of tools, packages, and conventions for defining unit tests, running unit tests, and measuring unit test coverage. In this article, the focus is on techniques that are suitable for small Python libraries and packages, so only two approaches are presented.</p><h4>Using doctest</h4><p>Unit tests for individual classes and methods can be included inside the <a href="https://www.python.org/dev/peps/pep-0257/">docstrings</a> that appear at the top of their bodies. The built-in <a href="https://docs.python.org/3/library/doctest.html">doctest</a> library can then be used to find and validate them. In the example below, the definition for the class published includes one such collection of tests in its docstring (<em>i.e.</em>, a sequence of executed statements and the corresponding results, as might be observed when engaging a session via the <a href="https://docs.python.org/3/tutorial/interpreter.html#interactive-mode">interactive prompt</a>).</p><pre><strong>class</strong> published():<br>    &quot;&quot;&quot;<br>    A published package.<br><br>    &gt;&gt;&gt; p = published()<br>    &gt;&gt;&gt; p.is_published()<br>    <strong>True</strong><br>    &gt;&gt;&gt; p.published = <strong>False</strong><br>    &gt;&gt;&gt; p.is_published()<br>    Traceback (most recent call last):<br>      ...<br>    RuntimeError: package must be published<br>    &quot;&quot;&quot;<br>    <strong>def</strong> __init__(self):<br>        &quot;&quot;&quot;Build an instance.&quot;&quot;&quot;<br>        self.published = <strong>True</strong><br><br>    <strong>def</strong> is_published(self):<br>        &quot;&quot;&quot;Check publication status.&quot;&quot;&quot;<br>        <strong>if</strong> <strong>not</strong> self.published:<br>            <strong>raise</strong> RuntimeError(&quot;package must be published&quot;)<br><br>        <strong>return</strong> self.published</pre><p>To check if the outputs in these sequences are accurate descriptions of what is actually returned by the Python interpreter when it runs each statement, simply add the following to your main module (<em>e.g.</em>, inside published.py in this case):</p><pre><strong>if</strong> __name__ == &quot;__main__&quot;:<br>    doctest.testmod()</pre><p>It is then possible to <a href="https://docs.python.org/3/library/__main__.html">run these tests</a> as follows (the -v option ensures a report is displayed even if all tests succeed):</p><pre>python published/published.py -v</pre><p>Alternatively, you can allow <a href="https://nose.readthedocs.io/en/latest/">nose</a> to find and run doctests using appropriate parameters in a <a href="https://setuptools.readthedocs.io/en/latest/userguide/declarative_config.html?highlight=setup.cfg">setup.cfg</a> file. An example of asetup.cfg file that is sufficient for the examples above is presented below.</p><pre>[nosetests]<br>exe=True<br>tests=published/published.py</pre><h4>Using unittest</h4><p>If you would like to employ a more extensive test suite, you can create one using the built-in <a href="https://docs.python.org/3/library/unittest.html">unittest</a> library and put it in a location that the <a href="https://nose.readthedocs.io/en/latest/">nose</a> tool can locate automatically. Below is an example of the layout a test script test/test_published.py might have.</p><pre><strong>from</strong> unittest <strong>import</strong> TestCase<br><strong>from</strong> published.published <strong>import</strong> published</pre><pre><strong>class</strong> Test_published(TestCase):<br>    <strong>def</strong> test_is_published(self):<br>        p = published()<br>        self.assertTrue(p.is_published())</pre><p>To ensure nose can find the test script, the setup.cfg configuration file should specify the directory that contains the test script.</p><pre>[nosetests]<br>exe=True<br>with-doctest=1<br>tests=test/</pre><h4>Using both doctest and unittest with nose</h4><p>It is possible to use nosetests to run all tests, including both doctest unit tests and any testing scripts. An example of a setup.cfg file that enables this is provided below.</p><pre>[nosetests]<br>exe=True<br>with-doctest=1<br>tests=published/published.py, test/</pre><h4>Measuring coverage</h4><p>To determine how much of your code is covered by unit tests every time you use nose, you can add a few optional lines to the setup.cfg file.</p><pre>[nosetests]<br>exe=True<br>with-doctest=1<br>cover-package=published<br>cover-html=1<br>tests=published/published.py, test/</pre><p>By assigning your package name to cover-package, you are indicating that coverage should be measured (typically using <a href="https://coverage.readthedocs.io/">coverage</a>). By assigning 1 to cover-html, you are indicating that human-readable HTML files should be generated that highlight what portions of the module files are not covered by any unit tests.</p><h3>Continuous Integration and Coverage Reporting</h3><p>You can connect your GitHub personal or organization account with <a href="https://travis-ci.org/">Travis CI</a> and <a href="https://coveralls.io/">Coveralls</a> in order to automatically run tests and publish test coverage every time you or a contributor pushes to the package’s GitHub repository or makes a pull request. Travis CI and Coveralls provide extensive documentation describing how you can link your GitHub account with their services; this article focuses is on what configuration files you need to add to your project.</p><p>A template for a simple .travis.yml <a href="https://config.travis-ci.com/">Travis CI configuration file</a> for the hypothetical package called <strong>published</strong> is provided below.</p><pre>notifications:<br>  email:<br>    on_success: never<br>    on_failure: always<br>language: python<br>python:<br>  - &quot;3.8&quot;<br>cache: pip<br>install:<br>  - pip install pylint<br>  - pip install coveralls<br>  - pip install .<br>script:<br>  - pylint published<br>  - nosetests<br>after_success:<br>  - coveralls</pre><p>The notifications section indicates that email notifications should only be sent if any step in the script section produces any result other than 0 (usually indicating success) to the standard output. The pylint and nosetests commands conform to this convention. The after_success section includes an entry that publishes the coverage results via Coveralls if there are no linting issues and no errors arise when the unit tests are executed.</p><h3>README Organization and Format</h3><p>An effective README document might cover some of the following topics:</p><ul><li>the purpose of the package/libary;</li><li>a quick start guide showing how a first-time user can begin using the package/library;</li><li>how to run unit tests and measure test coverage;</li><li>how others can contribute;</li><li>the versioning standards or conventions being used.</li></ul><p>Two popular formats for a README file that are supported by GitHub and PyPI are <a href="https://daringfireball.net/projects/markdown/">Markdown</a> and <a href="https://docutils.sourceforge.io/rst.html">reStructuredText</a>. You can also learn more about <a href="https://packaging.python.org/guides/making-a-pypi-friendly-readme/">how to structure a README file for PyPI</a>.</p><h4>Badges</h4><p>You may want to add badges to your README that provide information about the status of your project (<em>e.g.</em>, the host package repository and version number of the latest release, the last Travis CI build outcome, test coverage statistics, and so on). This requires determining the image URL for the badge you want to display (for example, the <a href="https://docs.travis-ci.com/user/status-images/">badge image URL for last Travis CI build status</a> for the hypothetical package called <strong>published</strong> might be https://travis-ci.com/python-supply/published.svg) and then inserting that image into your README document using the appropriate syntax. For example, when using the reStructuredText format, the badge section of the README.rst file for the package <strong>published</strong> might look as follows if it uses a <a href="https://docutils.sourceforge.io/0.4/docs/ref/rst/directives.html#directives-for-substitution-definitions">substitution definition</a> for an <a href="https://docutils.sourceforge.io/0.4/docs/ref/rst/directives.html#image">images</a> in order to include a badge for PyPI.</p><pre>|pypi|</pre><pre>.. |pypi| image:: <a href="https://badge.fury.io/py/published.svg">https://badge.fury.io/py/published.svg</a><br>   :target: <a href="https://badge.fury.io/py/published">https://badge.fury.io/py/published</a></pre><h4>Versioning and Contributions</h4><p>When deciding how to assign version numbers to different versions or releases of your package, there are advantages to adopting a published standard. An example of a popular standard is <a href="https://semver.org/#semantic-versioning-200">Semantic Versioning 2.0.0</a>. One benefit of a standard is that it makes it easier for anyone working with your package to understand what the difference between two versions signifies (<em>e.g.</em>, whether they can expect one version to be backwards compatible with another). Another benefit is that it reduces the burden on you as the author and maintainer, as you do not need to invest effort in documenting your own conventions and resolving any corner cases that might arise. Of course, this may come at some cost or additional effort if your goal is to adhere to the standard consistently.</p><p>You may also want to specify in the README document any expectations you have of contributors who may be interested in reporting issues or making improvements to your package. For example, you might direct potential contributors to report problems via <a href="https://guides.github.com/features/issues/">GitHub Issues</a> and to submit suggested fixes or enhancements via <a href="https://docs.github.com/en/free-pro-team@latest/github/collaborating-with-issues-and-pull-requests/about-pull-requests">pull requests</a>.</p><h3>Publishing to PyPI</h3><p>Once you are ready to publish your package, you may want to test one last time that the package file and other project files are organized appropriately by installing the package locally.</p><pre>python -m pip install .</pre><p>You can then generate the archive files for distribution.</p><pre>python setup.py sdist bdist_wheel</pre><p>These can then be contributed to <a href="https://pypi.org/">PyPI</a> after you have set up an account. When you run the command below for the first time, you will be asked for your PyPI credentials.</p><pre>twine upload dist/*</pre><p>You can then try installing your package from PyPI. The example below is for the hypothetical package called <strong>published</strong>.</p><pre>python -m pip install published</pre><p>If you already installed a version locally, you may want to ensure you upgrade to the latest version using the --upgrade and --force-reinstall options.</p><pre>python -m pip install --upgrade --force-reinstall published</pre><h3>Further Reading</h3><p>This guide provides templates and examples that are just one path, involving a minimal number of steps, to organizing and publishing a small package. Not all of these steps may be appropriate for your particular package, and there are many variations and extensions of what is presented here. For a more detailed guide to packaging and publishing, you might want to refer to the <a href="https://packaging.python.org/">Python Packaging User Guide</a> and the <a href="https://setuptools.readthedocs.io/en/latest/">Setuptools documentation</a>. You may also want to explore alternatives that may exist for every step in this guide:</p><ul><li>other frameworks for unit testing (<em>e.g.</em>, <a href="https://behave.readthedocs.io/en/latest/">behave</a>), linting (<em>e.g.</em>, <a href="https://pycodestyle.pycqa.org/en/latest/">pycodestyle</a>), and static analysis (<em>e.g.</em>, <a href="http://mypy-lang.org/">mypy</a>);</li><li>other CI/CD workflow options such as <a href="https://packaging.python.org/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows/">GitHub Actions</a>; and</li><li>other <a href="https://packaging.python.org/guides/tool-recommendations/">packaging tools and frameworks</a>,</li></ul><p>When evaluating alternatives, it is a good idea to check the level of activity in the community surrounding them, in addition to their suitability for your particular package. The long-term maintainability and compatibility of your package may be impacted if the tools and frameworks on which it relies are not well-supported or are phased out.</p><p><em>This article is also available in </em><a href="https://github.com/python-supply/guide-to-publishing-packages"><em>Markdown form</em></a><em> on </em><a href="https://github.com/python-supply"><em>GitHub</em></a><em>. The hypothetical package called </em><strong><em>published</em></strong><em> used as an example throughout the article is actually not so hypothetical and can be found on </em><a href="https://github.com/python-supply/published"><em>GitHub</em></a><em> and </em><a href="https://pypi.org/project/published/"><em>PyPI</em></a><em>.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=109dbbf41a77" width="1" height="1" alt=""><hr><p><a href="https://medium.com/python-supply/guide-to-publishing-packages-109dbbf41a77">Guide to Publishing Packages</a> was originally published in <a href="https://medium.com/python-supply">Python Supply</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Privacy-Preserving Matching and Computation in the Browser]]></title>
            <link>https://medium.com/nthparty/privacy-preserving-matching-and-computation-in-the-browser-a2a311244fd2?source=rss-f6b147043b04------2</link>
            <guid isPermaLink="false">https://medium.com/p/a2a311244fd2</guid>
            <category><![CDATA[cryptography]]></category>
            <category><![CDATA[privacy]]></category>
            <category><![CDATA[web]]></category>
            <category><![CDATA[data]]></category>
            <category><![CDATA[open-source]]></category>
            <dc:creator><![CDATA[Andrei Lapets]]></dc:creator>
            <pubDate>Wed, 30 Sep 2020 19:55:35 GMT</pubDate>
            <atom:updated>2022-08-09T16:28:43.041Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*v9lvEJmfQDnjW11Xthoenw.png" /></figure><p>There are a variety of obstacles that can make it a challenge to deliver the benefits of privacy-preserving secure computation capabilities to organizations. These can include a lengthy process for obtaining approval to install new software, the cost and personnel effort associated with provisioning new cloud resources, and the technical hurdles involved in integrating existing infrastructure with new APIs. Fortunately, there is a way to sidestep these challenges for at least some secure computation workflows by delivering them via a web browser.</p><h3>Privacy-Preserving Matching and Computation</h3><p><a href="https://ai.facebook.com/">Facebook AI Research</a> (FAIR) recently released an <a href="https://engineering.fb.com/open-source/private-matching/">open-source framework</a> that makes it possible for two parties to match their respective data sets based on a common identifier — and to perform some aggregate computations over the results of that match — without requiring that either party share data with the other party. The framework is <a href="https://github.com/facebookresearch/Private-ID">implemented in Rust</a>, with Facebook’s stated rationale for the choice including “superior safety features and ease of writing multithreaded code”.</p><p>There are a variety of compelling use cases for a software solution that makes it possible to align the rows within two private or sensitive data sets (based on common row attributes) and to subsequently perform a computation such as a count or a sum over some of the columns in the combined data set. We enumerated some of these use cases in our <a href="https://www.nthparty.com/s/Nth_Party_MPC_Whitepaper.pdf">white paper</a>: enabling business-to-business decision-making without introducing the cost or complexity of third parties or data clean rooms, supporting benchmarking efforts involving multiple organizations, and the evaluation of sensitive or proprietary models or data.</p><p>Facebook’s release consists of a command-line software solution that allows two data owners to perform a <a href="https://en.wikipedia.org/wiki/Relational_algebra#Joins_and_join-like_operators">join</a> operation followed by an <a href="https://en.wikipedia.org/wiki/Relational_algebra#Aggregation">aggregation</a> on a pair of data sets (with each data owner contributing one of these data sets). For example, a hospital may have information about the length of each patient’s stay and an insurance provider may know whether each patient visits their primary care physician regularly. The hospital and insurance provider can use this solution to create a report that lists the average hospital stay for each of the two categories of patients (at least for the patients they both have in common). What is unique about this solution is that it uses a cryptographic approach known as <a href="https://en.wikipedia.org/wiki/Secure_multi-party_computation">secure multi-party computation</a> (including a specific technique known as <a href="https://en.wikipedia.org/wiki/Private_set_intersection">private set intersection</a> for the join step) to make this computation possible without requiring that either data owner reveal their data set to the other side.</p><h3>Secure Computation in the Browser</h3><p>The members of our team at Nth Party have extensive <a href="https://dl.acm.org/doi/10.1145/3209811.3212701">experience</a> developing and deploying secure computation libraries, frameworks, and software applications that <a href="https://eprint.iacr.org/2019/734.pdf">can run in a standard web browser</a>. We were excited by Facebook’s announcement of their framework, but we were particularly pleased with the choice of <a href="https://www.rust-lang.org/">Rust</a> for its implementation. This is because over the past few years, we have seen that it possible to improve the performance of browser-based secure computation solutions by implementing them using <a href="https://webassembly.org/">WebAssembly</a>. It so happens that it is possible to <a href="https://rustwasm.github.io/book/">compile Rust to WebAssembly</a>, with a variety of tools available for accomplishing this task.</p><p>Leveraging our team’s experience and combining FAIR’s framework with our own <a href="https://github.com/nthparty">JavaScript secure multi-party computation libraries</a>, we were able to assemble a solution that can deliver via a standard browser the same privacy-preserving matching and computation capabilities available in Facebook’s original solution. This is accomplished by first compiling the client-side and server-side components of the original Rust framework into WebAssembly. It is then possible to tie the client-side portion to a user interface that can run in a browser and to package the server-side component inside a <a href="https://nodejs.org/en/">Node.js</a> application.</p><h3>The Result: Scalable Browser-Based Join and Aggregation Workflows</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*rFh7NUep6kqzqLl3mQsHww.png" /><figcaption>Our browser-based application displaying some simulated input data (left-hand pane) and the output of a privacy-preserving computation (right-hand pane).</figcaption></figure><p>Our open-source solution is <a href="https://github.com/nthparty/private-id-web">available on GitHub</a>. The application as a whole scales reasonably well for a browser-based application, performing in under one minute a privacy-preserving join and aggregation workflow over data sets that number in the thousands. This demonstrates not only that secure computing techniques are ready to address real-world problems, but that they are relatively straightforward to adapt for modern software application stacks. This reinforces our views at Nth Party that secure computation is ready to help make business-to-business and consumer-facing data-oriented services and workflows more secure.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=a2a311244fd2" width="1" height="1" alt=""><hr><p><a href="https://medium.com/nthparty/privacy-preserving-matching-and-computation-in-the-browser-a2a311244fd2">Privacy-Preserving Matching and Computation in the Browser</a> was originally published in <a href="https://medium.com/nthparty">Nth Party</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Applications of Immutability]]></title>
            <link>https://medium.com/python-supply/applications-of-immutability-2580266ef57b?source=rss-f6b147043b04------2</link>
            <guid isPermaLink="false">https://medium.com/p/2580266ef57b</guid>
            <category><![CDATA[data-structures]]></category>
            <category><![CDATA[software-development]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[programming-languages]]></category>
            <dc:creator><![CDATA[Andrei Lapets]]></dc:creator>
            <pubDate>Wed, 30 Sep 2020 19:31:00 GMT</pubDate>
            <atom:updated>2020-12-27T07:48:28.308Z</atom:updated>
            <content:encoded><![CDATA[<p>Suppose you are implementing a service that works with data sets that represent routes on a map (<em>e.g.</em>, collections of driving directions or logs of past trips). The road network is represented as a graph with nodes and edges, and each route is conceptually a collection of edges. You are tasked with choosing an appropriate data structure for routes that meets at least the following criteria:</p><ul><li>it should be possible to deduplicate large collections of routes,</li><li>it should be possible to use individual routes as keys into a dictionary (<em>e.g.</em>, to build a cache that maps each route to its total distance or average trip time),</li><li>programmers should not be able to modify a route instance if they obtain a reference to it, and</li><li>the collections of edges found in a route may be supplied to your data structure’s constructor in any order (even though they represent the same route for purposes of caching or deduplication).</li></ul><p>What are your options in implementing a data structure for routes that meets these criteria, and what issues should you consider?</p><p>Both built-in and user-defined data structures in Python can be either mutable or immutable. This article explains why Python makes this distinction for built-in data structures, breaks down the independent characteristics that are often associated with immutable data structures, and explores several approaches you can employ when addressing the above use case.</p><h3>Mutable and Immutable Built-in Types</h3><p>Each of the <a href="https://docs.python.org/3/library/stdtypes.html">built-in types</a> found in Python is either mutable or immutable:</p><ul><li>instances of the collection types <a href="https://docs.python.org/3/tutorial/datastructures.html#more-on-lists">list</a>, <a href="https://docs.python.org/3/tutorial/datastructures.html#dictionaries">dict</a>, and <a href="https://docs.python.org/3/tutorial/datastructures.html#sets">set</a> are mutable,</li><li>instances of the collection types <a href="https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences">tuple</a>, <a href="https://docs.python.org/3/library/stdtypes.html#frozenset">frozenset</a>, and <a href="https://docs.python.org/3/library/stdtypes.html#ranges">range</a> are immutable,</li><li>instances of types such as <a href="https://docs.python.org/3/library/functions.html#bool">bool</a>, <a href="https://docs.python.org/3/library/functions.html#int">int</a>, <a href="https://docs.python.org/3/library/functions.html#float">float</a>, and <a href="https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str">str</a> are immutable, and</li><li>instances of <a href="https://docs.python.org/3/library/stdtypes.html#bytes">bytes</a> are immutable but instances of <a href="https://docs.python.org/3/library/stdtypes.html#bytearray">bytearray</a> are mutable.</li></ul><p>Mutable types are accompanied by methods that modify instances of the corresponding data structure in-place (usually returning None) while immutable types are usually accompanied by functions and methods that return a new instance of that type (such as string concatenation). But why does Python distinguish between mutable and immutable built-in types? The reasons are subtle and relate to an interplay between programming language design decisions and practical performance requirements. A brief overview is provided below, and you can find a detailed <a href="https://docs.python.org/3/faq/design.html#why-must-dictionary-keys-be-immutable">answer to this question</a> in the Python documentation.</p><p>Some programming languages such as <a href="https://en.wikipedia.org/wiki/Haskell_(programming_language)">Haskell</a> have only immutable values (and thus all new values are necessarily copies or entirely new objects). One significant benefit of this approach is that code in Haskell is much easier and safer to refactor and transform (<em>e.g.</em>, for purposes of optimization) because the <em>context</em> of an expression will never affect its meaning. As an example, consider the following for loop.</p><pre><strong>&gt;&gt;&gt;</strong> <strong>for </strong>i <strong>in </strong>range(3):<br><strong>...</strong>     x = 1 + 2 + 3 + 4<br>...     print(x)<br><strong>...</strong><br>10<br>10<br>10</pre><p>Because the expression 1 + 2 + 3 + 4 is immutable (<em>i.e.</em>, its value will not change depending on where it appears in a program), it can safely be moved up and outside of the for loop without changing the behavior of the program. Modern interpreters and compilers routinely use such an approach to perform performance optimizations directly on the <a href="https://medium.com/python-supply/analyzing-and-transforming-abstract-syntax-d7d3da23b14f">abstract syntax tree of the program</a>.</p><pre><strong>&gt;&gt;&gt;</strong> immutable_value = 1 + 2 + 3 + 4<br><strong>&gt;&gt;&gt;</strong> <strong>for </strong>i <strong>in </strong>range(3):<br><strong>...</strong>     x = immutable_value<br>...     print(x)<br><strong>...</strong><br>10<br>10<br>10</pre><p>Python set instances support an extremely fast lookup/membership operation (which can be invoked using the infix <a href="https://docs.python.org/3/library/stdtypes.html#set">in</a> operator) because the Python interpreter builds a <a href="https://en.wikipedia.org/wiki/Hash_table">hash table</a> that contains hashes of all the individual elements of a set. As a reference point, consider the performance when searching for an element in a list.</p><pre><strong>&gt;&gt;&gt;</strong> import time<br><strong>&gt;&gt;&gt;</strong> l = list(range(0,1000000))<br><strong>&gt;&gt;&gt; </strong>start = time.perf_counter()<br><strong>&gt;&gt;&gt;</strong> 999999 in l<br><strong>&gt;&gt;&gt; </strong>time.perf_counter() - start<br>0.07584109995514154</pre><p>When evaluating the in operator in an expression such as 4 in {1, 2, 3, 4, 5}, the interpreter hashes 4 and finds the hash value in the hash table for {1, 2, 3, 4, 5} in nearly constant time. As shown below, this is significantly faster than performing a search through all of the elements in the set as in the above example (which is the only option without some sort of alternative comparison and sorting mechanism).</p><pre><strong>&gt;&gt;&gt;</strong> s = set(l)<br><strong>&gt;&gt;&gt; </strong>start = time.perf_counter()<br><strong>&gt;&gt;&gt;</strong> 999999 in s<br><strong>&gt;&gt;&gt; </strong>time.perf_counter() - start<br>0.00026190001517534256</pre><p>Suppose that elements in a set instance were mutable. This would mean that their hash would also be mutable, which would in term mean that the hash table for the set instance would need to be updated. But how would the interpreter even know to update the hash table? Consider the example below.</p><pre>e = [1, 2, 3]<br>u = {e, &quot;a&quot;}<br>v = {e, True, False}<br>w = {e, 1.2, 2.3, 3.4}<br>e.pop()</pre><p>Every time a statement such as e.pop() is executed, the interpreter would need to check whether e is a member of any sets (there are three such sets in this case) and would need to update the hash table for <em>every</em> one of them to accurately reflect that the hash value corresponding to e is different. If the interpreter <em>did not</em> do this, then an expression such as [1, 2, 3] in u could not be evaluated both correctly and efficiently.</p><p>It is worth noting that Python’s set and dict data structures were intentionally designed this way under an assumption: programmers will usually want to perform lookup in set and dict instances based on the <em>value</em> and not the <em>particular instance</em> of a data structure. An alternate approach <em>could</em> have been to simply build the hash table using the memory address of the each element rather than its value. However, this approach begins to fall apart as soon as strings are used as elements: after the statements k = &quot;abc!&quot; and d = {&quot;abc!&quot;: 123}, a programmer probably expects d[k] == 123 to be True even though the address of the string instance &quot;abc!&quot; in k = &quot;abc!&quot; is <em>different</em> from the address of the distinct string instance &quot;abc&quot; in d = {&quot;abc!&quot;: 123}. This can be confirmed using the built-in <a href="https://docs.python.org/3/library/functions.html#id">id</a> function (though you should be aware of <a href="https://docs.python.org/3/library/sys.html#sys.intern">interning</a> to avoid confusion when testing your own examples).</p><pre><strong>&gt;&gt;&gt;</strong> k = &quot;abc!&quot;<br><strong>&gt;&gt;&gt; </strong>d = {&quot;abc!&quot;: 123}<br><strong>&gt;&gt;&gt; </strong>id(k) == id(list(d.keys())[0])<br><strong>False</strong></pre><p>The above explains why built-in immutable types exist and why Python requires them in certain contexts. This also indicates that to satisfy the criteria in the motivating scenario described in the introduction, the data structure you define for representing a route must be one that Python recognizes as being immutable.</p><h3>Defining an Immutable Data Structure</h3><p>In Python, there are a number of approaches available to you when you are defining a data structure for a use case such as the one described in the introduction.</p><ul><li>One approach is simply to adopt a <em>convention</em> of using an existing built-in type to represent instances of your data structure. For example, a route can be represented as a <a href="https://docs.python.org/3/library/stdtypes.html#frozenset">frozenset</a> of <a href="https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences">tuple</a> instances (with two <a href="https://docs.python.org/3/library/functions.html#int">int</a> components in each tuple representing the endpoints of the edge). This may be advantageous if you are trying to avoid unnecessary clutter or would like to make it easier for other libraries or components to use your data without dealing with (or introducing within their own code) application-specific boilerplate.</li><li>A second approach is to define a derived class that <a href="https://docs.python.org/3/tutorial/classes.html#inheritance">inherits</a> the features of a built-in type (as demonstrated in an <a href="https://medium.com/python-supply/embedded-languages-via-overloading-ac942c18a65b">another article on operator overloading</a>). This has most of the benefits of the first approach above, but gives you more control over the interface of the data structure. This is useful if you would like to add custom methods, to modify how certain default methods inherited from the built-in type behave, to <a href="https://lapets.medium.com/static-checking-via-metaclasses-cec368765b58">enforce type or value constraints on method arguments</a>, to throw application-specific exceptions, or simply to provide more user-friendly and application-specific synonyms for existing functions and methods.</li><li>A third approach is to define a brand new class that conceals its internal representation. This has the benefit of encapsulation (allowing you to modify the internal representation of the actual route in the future), but requires a more careful approach on the conceptual side and more boilerplate code on the practical side.</li></ul><h4>Using Built-in Types</h4><p>A route could be represented using an instance of frozenset containing instances of tuple (with each tuple representing one edge) that in turn each contain two integers (with each integer representing one of the two nodes that an edge connects). Because integers and tuples are immutable, they can be elements inside frozenset instances.</p><pre><strong>&gt;&gt;&gt;</strong> route_one = frozenset({(0, 1), (1, 2), (2, 3)})<br><strong>&gt;&gt;&gt;</strong> route_two = frozenset({(1, 2), (0, 1), (2, 3)})<br><strong>&gt;&gt;&gt;</strong> len({route_one, route_two}) <em># Two routes with the same edges.</em><br>1</pre><p>Because frozenset instances are immutable, all four criteria for the route data structure can be satisfied. In particular, routes can be deduplicated by inserting them into a set and a mapping from routes to their distances can be implemented using a dict. Because a frozenset behaves like a mathematical set, the order of the edges does not matter.</p><pre><strong>&gt;&gt;&gt;</strong> distances = {route_one: 3} <em># No exception is raised.</em><br><strong>&gt;&gt;&gt;</strong> len({route_one, route_two}) <em># Deduplication occurs.</em><br>1</pre><p>Note that an instance of an immutable type <em>can</em> contain a mutable object inside it. However, in order for a value to be used as a key in a dict instance or an element in a set instance, all elements inside the immutable type must also be immutable (and so on, down to the leaves of the data structure instance). In the example below, the mutable object [True, False] causes an exception even though it is inside an immutable frozenset instance that is itself inside an immutable tuple instance.</p><pre><strong>&gt;&gt;&gt;</strong> <strong>try</strong>:<br><strong>...</strong>     {tuple(&quot;a&quot;, frozenset({[True, False]})): 0}<br><strong>...</strong> <strong>except </strong>Exception <strong>as </strong>e:<br><strong>...</strong>     print(e)<br><strong>...</strong><br>unhashable type: &#39;list&#39;</pre><h4>Defining a Derived Class</h4><p>It is possible to take advantage of Python’s support for <a href="https://docs.python.org/3/tutorial/classes.html#inheritance">inheritance</a> to define a class that is <em>derived</em> from one of the immutable types. This ensures that the derived class has the same attributes and methods as the base class. In the example below, the route class inherits all the features of frozenset. In addition, it has a method distance that computes the distance of the route (defined as the number of hops that occur between two distinct nodes) and a custom definition of <a href="https://docs.python.org/3/library/functions.html#repr">__repr__</a> to display an instance in a friendly way.</p><pre><strong>class </strong>route(frozenset):<br>    <strong>def </strong>distance(self):<br>        <strong>return </strong>sum([1 <strong>for </strong>e <strong>in </strong>self <strong>if </strong>e[0] != e[1]])<br>    <br>    <strong>def </strong>__repr__(self):<br>        <strong>return </strong>&quot;route({&quot; + &quot;, &quot;.join([str(e) <strong>for </strong>e <strong>in </strong>self]) + &quot;})&quot;</pre><p>To create an instance of route, it is sufficient to wrap an instance of frozenset with the constructor.</p><pre><strong>&gt;&gt;&gt;</strong> route({(0,0), (0,1), (1,1)}).distance()<br>1</pre><p>With respect to its immutability, instances of route can be used in any context in which frozenset can be used.</p><pre><strong>&gt;&gt;&gt;</strong> {route({(0,1), (1,2)}): 2}<br>{route({(0, 1), (1, 2)}): 2}</pre><p>If your data structure were simpler (<em>e.g.</em>, a record with a fixed collection of named attributes), you could have your derived class inherit from a type generated using <a href="https://docs.python.org/3/library/collections.html#collections.namedtuple">namedtuple</a> from the built-in <a href="https://docs.python.org/3/library/collections.html">collections</a> library.</p><pre><strong>&gt;&gt;&gt;</strong> <strong>from </strong>collections <strong>import </strong>namedtuple<br><strong>&gt;&gt;&gt; class </strong>record(namedtuple(&quot;record&quot;, &quot;name age&quot;)):<br>...     <strong>pass</strong><br><strong>...<br>&gt;&gt;&gt; </strong>record(&quot;Alice&quot;, 32)<br>record(name=&#39;Alice&#39;, age=32)</pre><h4>Defining a New Class</h4><p>When creating your own class for the route data structure, you first need to determine which characteristics of built-in immutable types you would like your class to possess.</p><ol><li>If you would like to be able to use instances of the class inside a set instance or as a key in a dict instance, you must define appropriate methods to make this possible.</li><li>If you would like to ensure that it is not possible to modify or extend an instance of your class once it has been created, you will need to redefine specific methods and attributes in a particular way. However, it is worth noting that this approach does not enforce immutability to the same extent as the approach of defining a class derived from an immutable type.</li></ol><p>To satisfy the first requirement, it is sufficient to provide definitions for the <a href="https://docs.python.org/3/reference/datamodel.html#object.__hash__">__hash__</a> and <a href="https://docs.python.org/3/reference/datamodel.html#object.__eq__">__eq__</a> methods. The Python interpreter will invoke these methods when building a hash table (<em>e.g.</em>, for a set or dict instance) as well as when retrieving a value. The __eq__ method is required in addition to the __hash__ method because hashes are not guaranteed to be unique and the interpreter needs to be able to disambiguate between objects of your class in such scenarios. Note that the Python interpreter expects that two values that are equal according to __eq__ must have the same hash values according to __hash__.</p><pre><strong>class </strong>route():<br>    <strong>def </strong>__init__(self, edges):<br>        self.es = edges</pre><pre>    <strong>def </strong>__hash__(self):<br>        es = sorted(list(set(self.es)))<br>        <strong>import </strong>hashlib<br>        <strong>return </strong>int(hashlib.sha256(str(es).encode()).hexdigest(), 16)</pre><pre>    <strong>def </strong>__eq__(self, other):<br>        <strong>return </strong>set(self.es) == set(other.es)</pre><pre>    <strong>def </strong>distance(self):<br>        <strong>return </strong>sum([1 <strong>for </strong>e <strong>in </strong>self.es <strong>if </strong>e[0] != e[1]])</pre><p>It is now possible to use instances of route as elements of set instances and as keys for dict instances.</p><pre><strong>&gt;&gt;&gt; </strong>route_one = route([(0,0), (0,1), (1,1)])<br><strong>&gt;&gt;&gt; </strong>route_two = route([(0,1), (1,2), (2,3)])<br><strong>&gt;&gt;&gt; </strong>route_three = route([(0,1), (1,2), (2,3)])<br><strong>&gt;&gt;&gt; </strong>distances = {route_one: 3}<br><strong>&gt;&gt;&gt; </strong>len({route_one, route_two, route_three})<br>2</pre><p>Note that in the definition of the __hash__ method, a cryptographic hash function sha256 from the built-in <a href="https://docs.python.org/3/library/hashlib.html">hashlib</a> library is applied to a string version of a normalized representation (<em>i.e.</em>, sorted and deduplicated) of a route. Normalization ensures that the order and multiplicity of edges does not change the hash of a route. Use of sha256 ensures that the hash of any instance of the same string will always be the same in any environment and under any version of Python. This is not guaranteed by the built-in <a href="https://docs.python.org/3/library/functions.html#hash">hash</a> function, which may return different results across different Python sessions (even in the same environment). Such an inconsistency could be an issue if, for example, users of your data structure store instances of it on disk and load them again at a later time.</p><p>One way to satisfy the requirement that users cannot modify or extend instances of your data structure is to explicitly set the <a href="https://docs.python.org/3/reference/datamodel.html?highlight=__slots__#object.__slots__">__slots__</a> attribute to a tuple containing only those attributes that you require for your implementation.</p><pre><strong>class </strong>route():<br>    __slots__ = (&quot;es&quot;)<br>    <br>    <strong>def </strong>__init__(self, edges):<br>        self.es = edges</pre><p>It is now not possible to create new attributes for any route instance.</p><pre><strong>&gt;&gt;&gt; try</strong>:<br>...     route_example = route([(0,1), (1,2)])<br>...     route_example.duration = 123<br><strong>... except </strong>Exception <strong>as </strong>e:<br>...     print(e)<br>...<br>&#39;route&#39; object has no attribute &#39;duration&#39;</pre><p>Another option is to provide a definition for <a href="https://docs.python.org/3/reference/datamodel.html?highlight=__slots__#object.__setattr__">__setattr__</a> that raises an exception, ensuring that it is not possible to add new attributes or to assign new values to an existing attribute.</p><h3>Further Reading</h3><p>This article provides an overview of the distinction between mutable and immutable data structures in Python, why the distinction exists, and guidance on what options are available to a programmer when they are implementing an application-specific immutable data structure. The information about the <a href="https://docs.python.org/3/library/collections.html#collections.namedtuple">namedtuple</a> factory function in the built-in <a href="https://docs.python.org/3/library/collections.html">collections</a> library may be relevant to some scenarios. The third-party <a href="https://pypi.org/project/attrs/">attrs</a> library can help reduce the amount of boilerplate required to define new classes (even if you are not relying on inheritance) and has features that help introduce <a href="https://www.attrs.org/en/stable/api.html#attr.attr.frozen">immutability</a>. To learn more about the nuances of object and memory management within the Python interpreter, you can read about <a href="https://docs.python.org/3/library/sys.html#sys.intern">interning</a>. Those more familiar with C may find it useful to explore the <a href="https://docs.python.org/3/c-api/memory.html">Memory Management</a> documentation (targeting primarily at readers who are interested in implementing their own extensions to the Python interpreter).</p><p><em>This article is also available as a </em><a href="https://github.com/python-supply/applications-of-immutability/blob/master/applications-of-immutability.ipynb"><em>Jupyter Notebook</em></a><em> and in </em><a href="https://python.supply/applications-of-immutability"><em>HTML</em></a><em> form on </em><a href="https://github.com/python-supply"><em>GitHub</em></a><em>.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=2580266ef57b" width="1" height="1" alt=""><hr><p><a href="https://medium.com/python-supply/applications-of-immutability-2580266ef57b">Applications of Immutability</a> was originally published in <a href="https://medium.com/python-supply">Python Supply</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Comprehensions and Combinations]]></title>
            <link>https://medium.com/python-supply/comprehensions-and-combinations-664f1c178c24?source=rss-f6b147043b04------2</link>
            <guid isPermaLink="false">https://medium.com/p/664f1c178c24</guid>
            <category><![CDATA[python]]></category>
            <category><![CDATA[data-structures]]></category>
            <category><![CDATA[programming-languages]]></category>
            <category><![CDATA[software-engineering]]></category>
            <category><![CDATA[programming]]></category>
            <dc:creator><![CDATA[Andrei Lapets]]></dc:creator>
            <pubDate>Thu, 27 Aug 2020 03:51:00 GMT</pubDate>
            <atom:updated>2021-04-29T07:12:52.870Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*3FlTMgsgoSwq4A3S4CjBfQ.png" /></figure><p>If you are working on a project in which you must enumerate, traverse, and/or test every possible combination of elements from one or more finite (or even infinite) collections, Python provides a variety of ways to do so. This article reviews some of the <em>syntactically concise</em> ways to do so, while also addressing some relevant memory utilization aspects. In particular, the focus is on comprehension syntax as foundational building block that can be employed in conjunction with functions and recursion.</p><h3>Conventions for Terminology and Notation</h3><p>In order to maintain consistency across examples and to keep outputs deterministic, this articles follows the conventions enumerated below.</p><ul><li>Python lists are used to represent both collections that may have duplicates and sets that do not have duplicates. For example, the set {0, 1, 2} is represented using [0, 1, 2].</li><li>When including multiple collections (<em>e.g.</em>, <em>U</em>, <em>V</em>, and <em>W</em>) within a Cartesian product expression (<em>e.g.</em>, <em>U </em>× <em>V </em>× <em>W</em>), the collections are called <em>components</em> of the Cartesian product.</li><li>When an output evaluates to an iterable, it is sometimes immediately consumed and turned into a list using the list function so that it can be reused throughout an example multiple times. In practice, this may not be necessary and may even be unnecessarily expensive in terms of both memory utilization and running time. This distinction is noted explicitly where applicable.</li><li>A single-letter variable x usually refers to individual elements in a collection, a variable such as xs usually refers to collections of elements, and a variables such as xss usually refers to a collection of collections.</li></ul><h3>Cartesian Products</h3><p>One of the simplest scenarios is one in which it is necessary to generate every combination of elements from two or more collections, where each combination has one component from each collection. This corresponds to the <a href="https://en.wikipedia.org/wiki/Cartesian_product">Cartesian product</a> of the collections. Python’s <a href="https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions">comprehension syntax</a> is a powerful language feature that, under the right circumstances, provides a way to implement such operations involving collections in a way that is concise and closely resembles widely used and recognized forms of mathematical notation. The example below demonstrates how this syntax can be used to build a Cartesian product of two lists.</p><pre><strong>&gt;&gt;&gt;</strong> [(x, y) <strong>for</strong> x <strong>in</strong> [0, 1, 2] <strong>for</strong> y <strong>in</strong> [&quot;a&quot;, &quot;b&quot;]]<br>[(0, &#39;a&#39;), (0, &#39;b&#39;), (1, &#39;a&#39;), (1, &#39;b&#39;), (2, &#39;a&#39;), (2, &#39;b&#39;)]</pre><p>Comprehension syntax can become difficult to manage and read as the number of component collections involved increases. In such cases, it may make sense to build a function so that the repetitive aspects of building the definition are handled programmatically. It is demonstrated at the end of this section how such a function can be defined. But first, some examples of how it can be used are illustrated using the product function found in the built-in <a href="https://docs.python.org/3/library/itertools.html">itertools</a> library.</p><pre><strong>&gt;&gt;&gt; from</strong> itertools <strong>import</strong> product<br><strong>&gt;&gt;&gt; </strong>u = [0, 1, 2]<br><strong>&gt;&gt;&gt; </strong>v = [&quot;a&quot;, &quot;b&quot;]<br><strong>&gt;&gt;&gt; </strong>list(product(u, v)) <em># Wrapped in `list` for display purposes.</em><br>[(0, &#39;a&#39;), (0, &#39;b&#39;), (1, &#39;a&#39;), (1, &#39;b&#39;), (2, &#39;a&#39;), (2, &#39;b&#39;)]</pre><p>The product function takes any number of <a href="https://docs.python.org/3/glossary.html#term-iterable">iterable</a> arguments and creates an iterable result containing one <a href="https://docs.python.org/3/library/stdtypes.html#tuple">tuple</a> for every combination of elements from each of the arguments.</p><pre><strong>&gt;&gt;&gt; </strong>r = [<strong>False</strong>, <strong>True</strong>]<br><strong>&gt;&gt;&gt; </strong>s = [0, 1, 2]<br><strong>&gt;&gt;&gt; </strong>t = [&#39;a&#39;, &#39;b&#39;, &#39;c&#39;, &#39;d&#39;]<br><strong>&gt;&gt;&gt; </strong>p = list(product(r, s, t))<br><strong>&gt;&gt;&gt; </strong>len(p) == 2*3*4 <strong>and</strong> len(p[0]) == 3<br><strong>True</strong></pre><p>Using the <a href="https://docs.python.org/3/tutorial/controlflow.html#unpacking-argument-lists">unpacking</a> operator in conjunction with the list repetition operator (also <a href="https://docs.python.org/3/library/itertools.html#itertools.repeat">available as a method</a>), it is possible to concisely describe a Cartesian product of any number of instances of a finite collection. In the example below, a ten-dimensional discrete finite space is created (where each dimension is [0, 1, 2]).</p><pre><strong>&gt;&gt;&gt; </strong>s = [0, 1, 2]<br><strong>&gt;&gt;&gt; </strong>p = list(product(*[s]*10))<br><strong>&gt;&gt;&gt; </strong>len(p) == 3**10<br><strong>True</strong></pre><p>How much memory does an output of product consume? Because it is an iterable that generates data dynamically, it only takes about as much memory as the component collections provided as inputs to product.</p><pre><strong>&gt;&gt;&gt; import</strong> sys<br><strong>&gt;&gt;&gt; </strong>s = [0, 1, 2, 3, 4, 5, 6, 7]<br><strong>&gt;&gt;&gt; </strong>n = 10<br><strong>&gt;&gt;&gt; </strong>p = product(*[s]*n)<br><strong>&gt;&gt;&gt; </strong>(sys.getsizeof([s]*n), sys.getsizeof(p))<br>(68, 72)</pre><p>Now that it has been demonstrated how the function can be used, consider a recursive implementation of such a function. To understand how this can be accomplished, a concrete example may help. Suppose a Cartesian product p of two collections [False, True] and [0, 1] has already been built. How do you turn it into a Cartesian product of three collections? You can iterate over all combinations of elements from the third collection [&quot;a&quot;, &quot;b&quot;] and from the Cartesian product p, concatenating each element-combination pair.</p><pre><strong>&gt;&gt;&gt;</strong> p = [(x, y) <strong>for</strong> x <strong>in</strong> [<strong>False, True</strong>] <strong>for</strong> y <strong>in</strong> [0, 1]]<br><strong>&gt;&gt;&gt; </strong>q = [(z,) + t <strong>for</strong> z <strong>in</strong> [&quot;a&quot;, &quot;b&quot;] <strong>for</strong> t <strong>in</strong> p]<br><strong>&gt;&gt;&gt; </strong>q<br>[(&#39;a&#39;, <strong>False</strong>, 0),<br> (&#39;a&#39;, <strong>False</strong>, 1),<br> (&#39;a&#39;, <strong>True</strong>, 0),<br> (&#39;a&#39;, <strong>True</strong>, 1),<br> (&#39;b&#39;, <strong>False</strong>, 0),<br> (&#39;b&#39;, <strong>False</strong>, 1),<br> (&#39;b&#39;, <strong>True</strong>, 0),<br> (&#39;b&#39;, <strong>True</strong>, 1)]</pre><p>Below is a complete implementation of the function based on the above approach. Note that the base case is a collection containing a single tuple of length zero. The recursive case consists of a comprehension that prepends every element in the first collection to every tuple in the cartesian product of all the remaining collections.</p><pre><strong>def</strong> cart(xss):<br>    <strong>if</strong> len(xss) == 0:<br>        <strong>return </strong>[()]<br>    <strong>else</strong>:<br>        <strong>return </strong>[(x,) + ys <strong>for</strong> x <strong>in</strong> xss[0] <strong>for</strong> ys <strong>in</strong> cart(xss[1:])]</pre><p>You can confirm that cart produces the same output as product. However, note that its result is generated in its entirety when the function is called.</p><pre><strong>&gt;&gt;&gt;</strong> c = cart([range(0, 100), range(0, 100)])<br><strong>&gt;&gt;&gt; </strong>p = product(*[range(0, 100), range(0, 100)])<br><strong>&gt;&gt;&gt; </strong>set(c) == set(p)<br><strong>True<br>&gt;&gt;&gt;</strong> sys.getsizeof(c), sys.getsizeof(p)<br>43808 40</pre><p>A variant of this function that uses memory more efficiently can be created by turning the original definition into a <a href="https://wiki.python.org/moin/Generators">generator</a>. One added benefit of this approach is that the component collections can now be generators themselves (and, thus, can potentially <a href="https://medium.com/python-supply/iterators-generators-and-uncertainty-a3e7f85183b">contain an unknown number or even infinitely many elements</a>).</p><pre><strong>def </strong>cart(xss):<br>    <strong>if </strong>len(xss) == 0:<br>        <strong>yield </strong>()<br>    <strong>else</strong>:<br>        <strong>for</strong> t <strong>in </strong>((x,)+ys <strong>for</strong> x <strong>in </strong>xss[0] <strong>for</strong> ys <strong>in </strong>cart(xss[1:])):<br>            <strong>yield </strong>t</pre><p>The generator variant of cart function is nearly identical to product; the only difference is the absence of argument unpacking.</p><pre><strong>&gt;&gt;&gt;</strong> list(cart([[0, 1, 2], [&quot;a&quot;, &quot;b&quot;]]))<br>[(0, &#39;a&#39;), (0, &#39;b&#39;), (1, &#39;a&#39;), (1, &#39;b&#39;), (2, &#39;a&#39;), (2, &#39;b&#39;)]</pre><p>Because this function is a generator, it consumes approximately as much memory as is needed to keep track of the two component collections.</p><pre><strong>&gt;&gt;&gt;</strong> c = cart([range(0, 1000), range(0, 1000)])<br><strong>&gt;&gt;&gt; </strong>2 * sys.getsizeof(range(0, 1000))<br>48<br><strong>&gt;&gt;&gt;</strong> sys.getsizeof(c)<br>56</pre><h3>Power Sets</h3><p>A closely related scenario is one in which it may be necessary to generate every <em>subset</em> of a finite set (also known as the <a href="https://en.wikipedia.org/wiki/Power_set"><em>power set</em></a>). There are a number of approaches to building a power set. It is possible to use the Cartesian product as a building block by noting that every element in a set s of size len(s) is either <em>absent</em> (corresponding to False) or <em>present</em> (corresponding to True). Thus, you can first build a Cartesian product of len(s) instances of the set {False, True}.</p><pre><strong>&gt;&gt;&gt;</strong> s = {0, 1, 2}<br><strong>&gt;&gt;&gt;</strong> p = list(product(*[[<strong>False</strong>, <strong>True</strong>]]*len(s)))<br><strong>&gt;&gt;&gt;</strong> p<br>[(False, False, False),<br> (False, False, True),<br> (False, True, False),<br> (False, True, True),<br> (True, False, False),<br> (True, False, True),<br> (True, True, False),<br> (True, True, True)]</pre><p>It is then possible to use the built-in <a href="https://docs.python.org/3/library/functions.html#zip">zip</a> function and a comprehension (which employs an if clause for filtering) to associate the boolean values that make up each of the tuples in the Cartesian product with the corresponding elements in the original set.</p><pre><strong>&gt;&gt;&gt;</strong> ss = [{x <strong>for</strong> (b, x) <strong>in</strong> zip(bs, s) <strong>if</strong> b} <strong>for</strong> bs <strong>in</strong> p]<br><strong>&gt;&gt;&gt;</strong> len(ss) == 2**len(s)<br><strong>True</strong><br><strong>&gt;&gt;&gt;</strong> ss<br>[set(), {2}, {1}, {1, 2}, {0}, {0, 2}, {0, 1}, {0, 1, 2}]</pre><p>Alternatively, the Python documentation provides a <a href="https://docs.python.org/3/library/itertools.html#itertools-recipes">recipe</a> for building power sets using the built-in itertools library.</p><pre><strong>from</strong> itertools <strong>import</strong> chain, combinations<br><strong>def</strong> powerset(iterable):<br>    s = list(iterable)<br>    <strong>return</strong> chain.from_iterable(<br>        combinations(s, r) <strong>for</strong> r <strong>in</strong> range(len(s) + 1)<br>    )</pre><p>The above variant yields a collection of tuples rather than sets, but otherwise produces the same collection of combinations.</p><pre><strong>&gt;&gt;&gt;</strong> list(powerset([0, 1, 2]))<br>[(), (0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]</pre><p>The recursive definition below employs an approach that is nearly identical to that of the recursive definition of the Cartesian production function. In the recursive case, every subset of all the remaining elements (excluding the current first element) is included in the overall result. However, a <em>second</em> copy of each of these subsets is taken and then paired (as in the Cartesian product function) with the current first element. This accounts for all subsets that <em>do include</em> the first element and all subsets that <em>do not include</em> it.</p><pre><strong>def</strong> powerset(xs):<br>    <strong>if </strong>len(xs) == 0:<br>        <strong>return </strong>[tuple()]<br>    <strong>else</strong>:<br>        x = xs[0]<br>        yss = powerset(xs[1:])<br>        <strong>return </strong>yss + [(x,) + ys <strong>for </strong>ys <strong>in </strong>yss]</pre><p>This approach yields the same results as the definition in the recipe, though it is worth noting that the exact implementation above leads to a different order. This distinction may be important in some applications (such as when performing a search for the largest subset that satisfies some criteria).</p><pre><strong>&gt;&gt;&gt;</strong> powerset([0, 1, 2])<br>[(), (2,), (1,), (1, 2), (0,), (0, 2), (0, 1), (0, 1, 2)]</pre><p>Can you apply the same technique used in the implementations of the cart function to turn the above recursive definition of powerset into a generator?</p><h3>Further Reading</h3><p>This article reviews how comprehensions, in concert with other built-in functions and Python features, can be used to assemble concise implementations of algorithms for generating common kinds of combinations of items from a collection. Many variants and relevant peripheral techniques can be found in a list of <a href="https://docs.python.org/3/library/itertools.html#itertools-recipes">recipes</a> found in the documentation for the built-in <a href="https://docs.python.org/3/library/itertools.html">itertools</a> library. This includes functions such as <a href="https://docs.python.org/3/library/itertools.html#itertools.combinations">combinations</a>, which can be used to obtain all subsets of a specific size. Other common set operations such as union and intersection are available as methods of the <a href="https://docs.python.org/3/library/stdtypes.html#set">built-in </a><a href="https://docs.python.org/3/library/stdtypes.html#set">set type</a>, and the built-in <a href="https://docs.python.org/3/library/collections.html">collections</a> library provides a variety of operations on collections (such as a class <a href="https://docs.python.org/3/library/collections.html#collections.Counter">Counter</a> for concisely implementing counting workflows). Popular third-party libraries such as <a href="https://docs.scipy.org/">SciPy</a> contain <a href="https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.misc.comb.html#scipy.misc.comb">specialized functions</a> for common operations in combinatorics.</p><p><em>This article is also available as a </em><a href="https://github.com/python-supply/comprehensions-and-combinations/blob/master/comprehensions-and-combinations.ipynb"><em>Jupyter Notebook</em></a><em> and in </em><a href="https://python.supply/comprehensions-and-combinations"><em>HTML</em></a><em> form on </em><a href="https://github.com/python-supply"><em>GitHub</em></a><em>.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=664f1c178c24" width="1" height="1" alt=""><hr><p><a href="https://medium.com/python-supply/comprehensions-and-combinations-664f1c178c24">Comprehensions and Combinations</a> was originally published in <a href="https://medium.com/python-supply">Python Supply</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Tools for Organizations in a Rapidly Evolving Data Privacy Landscape]]></title>
            <link>https://medium.com/nthparty/tools-for-organizations-in-a-rapidly-evolving-data-privacy-landscape-7bd58d8fa5e1?source=rss-f6b147043b04------2</link>
            <guid isPermaLink="false">https://medium.com/p/7bd58d8fa5e1</guid>
            <category><![CDATA[compliance]]></category>
            <category><![CDATA[startup]]></category>
            <category><![CDATA[privacy]]></category>
            <category><![CDATA[cryptography]]></category>
            <category><![CDATA[data]]></category>
            <dc:creator><![CDATA[Andrei Lapets]]></dc:creator>
            <pubDate>Wed, 05 Aug 2020 20:40:57 GMT</pubDate>
            <atom:updated>2020-11-02T20:10:24.893Z</atom:updated>
            <content:encoded><![CDATA[<h4>Privacy-enhancing and secure computation technologies are ready today to transform your workflows and services.</h4><p>By building and deploying web services or data analysis workflows that employ emerging privacy-enhancing and secure computation technologies such as <em>secure multi-party computation </em>(MPC), organizations can provide new services, identify and leverage new business opportunities, reduce risks and costs for both themselves and their customers, and comply with evolving regulations. MPC is already being incorporated into some industry, government, and nonprofit software solutions and workflows. But leveraging its advantages to address an organization’s data privacy challenges requires an understanding of how its features can satisfy technical, business, and legal constraints. Keep reading for an introduction to MPC and its security and privacy advantages, some context around MPC and related cryptographic techniques, and an overview of scenarios and challenges for which MPC is ready today.</p><h3>Understanding MPC and its Security and Privacy Advantages</h3><p>MPC is a family of cryptographic techniques that allows organizations and individuals to enjoy the benefits of web-based services and data analysis workflows while mitigating or removing the risks normally associated with providing or sharing the data that those services and workflows require. Thus, MPC can reduce costs associated with existing workflows and can enable new opportunities in scenarios in which data sharing is encumbered or restricted due to the security and privacy concerns of individuals, policies maintained by companies and other organizations, and legal constraints and regulations.</p><p>In traditional services and workflows that require computation over sensitive or private data that may be encrypted at rest, it is usually necessary to decrypt that data <em>at some location </em>and for <em>some period of time.</em> This is done so that existing computational tools can be applied to it (<em>e.g.</em>, within a <a href="https://digiday.com/marketing/data-clean-room/">data clean room</a> that is set up temporarily so that two organizations can run analyses on their joint datasets inside it). The result of the computation might then be encrypted again before it leaves the location where the computation took place. MPC removes the requirement that sensitive or private data must be decrypted in such scenarios. This means that the risks, liabilities, and costs associated with protecting the data while it is in a decrypted form can be reduced or eliminated.</p><p>How is computation over encrypted data possible? That depends on the choice of MPC technique. However, there are some common characteristics:</p><ul><li>First, at least two distinct parties must be involved (this is usually already the case in almost all interesting applications).</li><li>Second, these parties must have the ability to generate random numbers privately (<em>i.e.</em>, the process they use to generate the random numbers cannot be observed or influenced by other parties), which allows them to encrypt any form of data by <a href="https://en.wikipedia.org/wiki/Shamir%27s_Secret_Sharing">separating it into constituent parts that <em>appear random </em>on their own</a> (for the curious, this is equivalent to the <a href="https://en.wikipedia.org/wiki/One-time_pad"><em>one-time pad</em></a>, which is an encryption technique that provides information-theoretic security).</li><li>Third, these parties should be incentivized not to simply share or publish the random values they create or exchange as part of a secure computation process (<em>e.g.</em>, because of legal or economic incentives, contractual obligations, regulations, and so on).</li></ul><p>These conditions are sufficient to allow <em>any </em>service or workflow to operate on encrypted data, and a variety of MPC techniques exist (each representing various trade-offs between performance and other factors) that rely on such an exchange of information that appears random to participants.</p><h3>MPC in the Context of Related Privacy-Enhancing Techniques</h3><p>Many related cryptographic techniques, including both those that have been ubiquitous for decades and those that are as novel as MPC, can be integrated with MPC. Traditional techniques such as symmetric and public-key cryptography can be leveraged to ease the burden of deploying MPC and thus to accommodate a broader range of scenarios, as has been <a href="https://dl.acm.org/doi/10.1145/3209811.3212701">demonstrated in multiple real-world use cases</a>.</p><p>But MPC can also be <em>combined</em> with other emerging techniques that provide <em>complementary </em>but <em>orthogonal </em>advantages, such as <a href="https://en.wikipedia.org/wiki/Differential_privacy">differential privacy</a> (DP) and blockchain:</p><ol><li>DP techniques that protect individual records within a dataset while still allowing aggregate analyses over that dataset to be computed and shared. An MPC implementation of a DP workflow can protect all input and intermediate data within a computation (thanks to MPC), and anything the output implies about individual records in the original inputs (thanks to DP).</li><li>A range of blockchain techniques allow individuals and organizations to collectively maintain a <a href="https://en.wikipedia.org/wiki/Distributed_ledger">distributed ledger</a> that is verifiable and cannot be modified. If a scenario requires both (1) the ability to store and compute over encrypted data and (2) permanent storage and/or verifiability of encrypted data, it may make sense to combine the two technologies.</li></ol><p>It is worth noting that the performance overheads and costs of MPC and other secure computation technologies almost always exceed those of traditional, non-secure solutions. When combining these technologies, their respective overheads and costs may be cumulative.</p><h3><strong>Contemporary Challenges and the Opportunities of MPC</strong></h3><p>Today, organizations operating in a broad range of domains (health, finance, education, government, and so on) face a number of existing and emerging challenges as they create and execute workflows within their own organizations and across their partners and customers: reducing the costs and liabilities of inter-organizational data exchange, protecting customer privacy, adhering to regulatory requirements, and others. MPC can help these organizations by introducing additional options and flexibilities where they might not otherwise exist.</p><p>MPC can act as a less expensive alternative to clean rooms and trusted third parties. It can address a variety of specific use cases involving one or more organizations, helping decision-makers analyze data to answer simple questions (<em>e.g.</em>, with “yes” or “no” answers) without first undertaking a burdensome negotiation process that may involve legal expenses, delays, and risks of data exposure or unauthorized data reuse. Example of use cases include:</p><ul><li>Partner organizations can evaluate the value or effectiveness of their partnerships, such as calculating conversion rates or identifying response correlations across audience segments and outreach strategies.</li><li>Consortia of organizations can create benchmarks using pooled data without sharing any of their constituent datasets.</li><li>Industry competitors can run fully confidential surveys across customer populations.</li></ul><p>MPC can also have a significant impact on the evolution of consumer-facing web applications. In addition to letting organizations offer customer-facing applications that are privacy-preserving (<em>i.e.</em>, allowing customers to enjoy the value of a service while not sharing their data with the service provider), MPC dramatically expands the range of available tools for navigating the rapidly evolving landscape of data protection and privacy regulations. For example, the <a href="https://eur-lex.europa.eu/legal-content/EN/TXT/?qid=1589662968663&amp;uri=CELEX:32016R0679">General Data Protection Regulation (GDPR)</a> within the European Union and the <a href="https://leginfo.legislature.ca.gov/faces/billTextClient.xhtml?bill_id=201720180AB375">California Consumer Privacy Act (CCPA)</a> in the US state of California impose requirements on organizations that collect or process personal data of individuals.</p><p>The key takeaway is that MPC can allow organizations to transform their workflows and services to operate only on encrypted, pseudonymized, or de-identified data <em>while retaining some or all of the utility of those services and workflows</em>. Organizations can continue to innovate and offer a wider range of new services to customers while maintaining compliance.</p><h3>Getting Started with Secure Computation</h3><p><a href="https://nthparty.com/">Nth Party</a> builds and offers products that help organizations introduce MPC into their web services and data analysis workflows. Leveraging its expertise and relying on years of experience developing MPC <a href="https://multiparty.org/">open-source frameworks, software, and applications</a> that have been proven in real-world deployments, the Nth Party team maintains a rich suite of libraries and tools for quickly assembling MPC solutions to address contemporary privacy and security challenges.</p><p>If you would like to see the material in this article covered in greater depth and with more detailed examples, take a look at our recent <a href="https://www.nthparty.com/learn">white paper</a> and check back for upcoming articles that will delve into each of the above topic areas.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=7bd58d8fa5e1" width="1" height="1" alt=""><hr><p><a href="https://medium.com/nthparty/tools-for-organizations-in-a-rapidly-evolving-data-privacy-landscape-7bd58d8fa5e1">Tools for Organizations in a Rapidly Evolving Data Privacy Landscape</a> was originally published in <a href="https://medium.com/nthparty">Nth Party</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Working with Foreign Functions]]></title>
            <link>https://medium.com/python-supply/working-with-foreign-functions-ee3b5e40f2ca?source=rss-f6b147043b04------2</link>
            <guid isPermaLink="false">https://medium.com/p/ee3b5e40f2ca</guid>
            <category><![CDATA[python]]></category>
            <category><![CDATA[programming-languages]]></category>
            <category><![CDATA[c-programming]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[software-engineering]]></category>
            <dc:creator><![CDATA[Andrei Lapets]]></dc:creator>
            <pubDate>Thu, 30 Jul 2020 22:01:00 GMT</pubDate>
            <atom:updated>2021-04-25T21:58:34.049Z</atom:updated>
            <content:encoded><![CDATA[<figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*DwZWLFtKqm_PrAGU_nB50g.png" /></figure><p>Suppose your Python library needs to load some sensitive binary data from a file into a contiguous block of memory (<em>e.g.</em>, in order to use it for some application-specific operation). Furthermore, you have some additional requirements that must be met for security and auditing purposes:</p><ul><li>you need to ensure that your code does not inadvertently cause the interpreter to copy any part of the data loaded from the file into some other region of memory,</li><li>you need to log the memory address at which the data was stored, and</li><li>you need to clear the memory region that held the data by overwriting it with random bytes.</li></ul><p>One strategy you might employ in order maintain tight control over what your code is doing is to use C functions found in a compiled shared library to read the data from disk, to load that data into a region of memory, and at the end to clear that region. What minimal collection of built-in Python features will you need to invoke functions that are found in a shared library? How can you transform Python values (such as strings representing the location of the file) into an appropriate form on which the function can operate?</p><p>Python offers a rich set of capabilities via the built-in <a href="https://docs.python.org/3/library/ctypes.html">ctypes</a> library that make it possible to invoke (or wrap in a Python function) foreign functions that have been implemented using another language (such as C/C++) and compiled into shared libraries. This article reviews the basics of employing foreign functions by demonstrating how to load and apply to the above use case the instance of the <a href="https://www.gnu.org/software/libc/">GNU C Library</a> available on most operating systems. The same techniques can be used for any shared library. An alternative approach used by some popular Python packages is briefly reviewed, as well.</p><h3>Loading a Shared Library</h3><p>To load a shared library file for which you know the relative or absolute path, you can normally use the <a href="https://docs.python.org/3/library/ctypes.html#loading-dynamic-link-libraries">LoadLibrary</a> method of either the cdll or the windll instance (depending on your operating system) of the <a href="https://docs.python.org/3/library/ctypes.html#ctypes.LibraryLoader">LibraryLoader</a> class found in <a href="https://docs.python.org/3/library/ctypes.html">ctypes</a>. For the purposes of the use case in this article, it is sufficient to load the GNU C Library. In the example below, the <a href="https://docs.python.org/3/library/platform.html#platform.system">system</a> function from the <a href="https://docs.python.org/3/library/platform.html">platform</a> library is used to distinguish between Windows and Linux/macOS environments. In the Linux/macOS case, the <a href="https://docs.python.org/3/library/ctypes.html#finding-shared-libraries">find_library</a> function is used to determine the absolute path of the shared library.</p><pre><strong>import </strong>ctypes<br><strong>import </strong>platform</pre><pre><strong>if </strong>platform.system() == &quot;Windows&quot;:<br>    libc = ctypes.windll.msvcrt<br><strong>else</strong>:<br>    libc = ctypes.cdll.LoadLibrary(ctypes.util.find_library(&quot;c&quot;))</pre><h3>Invoking Foreign Functions</h3><p>The first portion of your workflow involves loading a file into memory. The Python code below writes a file to disk that contains a sequence of 32 random bytes. The file can be used to test the workflow.</p><pre><strong>from </strong>secrets <strong>import </strong>token_bytes<br><strong>with </strong>open(&quot;data.txt&quot;, &quot;wb&quot;) <strong>as </strong>file:<br>    file.write(token_bytes(32))</pre><p>The C function <a href="https://www.gnu.org/software/libc/manual/html_node/Opening-Streams.html">fopen</a> expects two arguments: a pointer to the first character of a string that represents the path of the file, and a pointer to the first character of the string that represents the mode (<em>i.e.</em>, reading or writing) in which the file is opened. You can use the <a href="https://docs.python.org/3/library/ctypes.html#ctypes.c_char_p">c_char_p</a> function to turn Python strings into a representation in memory that can be handled by the C function. Note the use of the <a href="https://docs.python.org/3/library/stdtypes.html#str.encode">encode</a> string method to provide an explicit encoding for the string as a byte sequence.</p><pre><strong>from </strong>ctypes <strong>import </strong>c_char_p<br>file = c_char_p(&quot;data.txt&quot;.encode(&quot;ascii&quot;))<br>mode = c_char_p(&quot;rb&quot;.encode(&quot;ascii&quot;))</pre><p>Unfortunately, it is not possible within Python to examine the libc object that was created by the <a href="https://docs.python.org/3/library/ctypes.html#ctypes.LibraryLoader">LibraryLoader</a> instance in order to determine what symbols are defined within it. However, in this case we know that the functions fopen, fread, and fclose must exist. For each of these functions, an instance of the <a href="https://docs.python.org/3/library/ctypes.html#ctypes._FuncPtr">FuncPtr</a> class can be found in libc.</p><pre>fopen = libc.fopen<br>fread = libc.fread<br>fclose = libc.fclose</pre><p>Before you can safely invoke these functions, you need to <a href="https://docs.python.org/3/library/ctypes.html#specifying-the-required-argument-types-function-prototypes">specify their argument types</a> and their <a href="https://docs.python.org/3/library/ctypes.html#return-types">return types</a>. This can be accomplished by first consulting the <a href="https://www.gnu.org/software/libc/">GNU C Library</a> documentation to find the signature for each of the C functions you would like to use. Then, the appropriate <a href="https://docs.python.org/3/library/ctypes.html#fundamental-data-types">data type classes</a> can be used to assign the correct sequence of argument types and the correct return type to the argtypes and restype attributes, respectively, of each FuncPtr class instance.</p><pre><strong>from </strong>ctypes <strong>import </strong>c_int, c_size_t, c_void_p</pre><pre>fopen.argtypes = [c_char_p, c_char_p]<br>fopen.restype = c_void_p</pre><pre>fread.argtypes = [c_void_p, c_size_t, c_size_t, c_void_p]<br>fread.restype = c_size_t</pre><pre>fclose.argtypes = [c_void_p]<br>fclose.restype = c_int</pre><p>It is now possible to invoke these functions on some inputs. You can allocate a memory buffer for the 32 bytes of data that you will be loading from the file using the <a href="https://docs.python.org/3/library/ctypes.html#ctypes.create_string_buffer">create_string_buffer</a> function.</p><pre><strong>from </strong>ctypes <strong>import </strong>create_string_buffer<br>data = ctypes.create_string_buffer(32)</pre><p>You can now open the file, load the data, and close the file.</p><pre>&gt;&gt;&gt; fp = fopen(file, mode)<br>&gt;&gt;&gt; fread(data, 32, 1, fp)<br>&gt;&gt;&gt; fclose(fp)<br>&gt;&gt;&gt; bytes(data).hex()<br>&#39;6d1481fda1e3853c14d81c6f1c4f87fcf26aca8f8d628d2cc31067781f9624ce&#39;</pre><p>You can determine the memory address corresponding to the memory buffer data using the <a href="https://docs.python.org/3/library/ctypes.html#ctypes.addressof">addressof</a> function.</p><pre>&gt;&gt;&gt; <strong>from</strong> ctypes <strong>import </strong>addressof<br>&gt;&gt;&gt; hex(addressof(data))<br>&#39;0xd788e0&#39;</pre><p>You can now clear the memory region. The example below uses the memset C function for this purpose. An example that uses a random sequence generator that is appropriate for cryptographic applications appears in the next section.</p><pre>&gt;&gt;&gt; libc.memset.argtypes = [c_void_p, c_int, c_size_t]<br>&gt;&gt;&gt; libc.memset(data, 0, 32)<br>&gt;&gt;&gt; bytes(data).hex()<br>&#39;0000000000000000000000000000000000000000000000000000000000000000&#39;</pre><h3>Alternative Approaches</h3><p>The <a href="https://cffi.readthedocs.io/">C Foreign Function Interface</a> library is similar to the built-in ctypes module and is used by some popular packages, including the <a href="https://pypi.org/project/PyNaCl/">PyNaCl</a> library that acts as a Python interface for the cryptographic library <a href="https://doc.libsodium.org/">libsodium</a>. In the example below, the C implementation of the <a href="https://libsodium.gitbook.io/doc/generating_random_data">randombytes</a> function is invoked on a character buffer bs and then the contents of that buffer are displayed.</p><pre><strong>&gt;&gt;&gt; from </strong>nacl <strong>import </strong>_sodium<br><strong>&gt;&gt;&gt; </strong>lib = _sodium.lib<br><strong>&gt;&gt;&gt;<br>&gt;&gt;&gt; from </strong>cffi <strong>import </strong>FFI<br><strong>&gt;&gt;&gt; </strong>ffi = FFI()<br><strong>&gt;&gt;&gt;<br>&gt;&gt;&gt; </strong>bs = ffi.new(&quot;unsigned char[]&quot;, 8)<br><strong>&gt;&gt;&gt; </strong>lib.randombytes(bs, 8)<br><strong>&gt;&gt;&gt; </strong>bytes(bs).hex()<br>&#39;a215d495f3e4248f&#39;</pre><p>You might choose to call the C implementation directly for a variety of reasons, including to improve performance. This may be useful to do when a more high-level library method allocates new memory for a byte sequence during every invocation, while your own solution can reuse the same memory over and over to store each new batch of bytes. In the example below, the time to invoke the C function over one million iterations is measured.</p><pre><strong>&gt;&gt;&gt;</strong> <strong>import </strong>time<br><strong>&gt;&gt;&gt;</strong> start = time.perf_counter()<br><strong>&gt;&gt;&gt;</strong> <strong>for </strong>_ <strong>in </strong>range(10**6):<br><strong>...</strong>     lib.randombytes(bs, 8)<br><strong>...</strong><br><strong>&gt;&gt;&gt;</strong> time.perf_counter() - start<br>2.583150100079365</pre><p>The below example measures the amount of time it takes to invoke the Python wrapper in PyNaCl over the same number of iterations. The longer running time may be the result of a number of factors; regardless of the underlying reason that may apply for any particular function, the example demonstrates that direct access to the C method gives you more control over those factors.</p><pre><strong>&gt;&gt;&gt;</strong> <strong>from</strong> nacl.bindings <strong>import </strong>randombytes<br><strong>&gt;&gt;&gt; </strong>start = time.perf_counter()<br><strong>&gt;&gt;&gt; for </strong>_ <strong>in </strong>range(10**6):<br><strong>...</strong>     bs = randombytes(8)<br><strong>...</strong><br><strong>&gt;&gt;&gt;</strong> time.perf_counter() - start<br>5.637964699999429</pre><h3>Further Reading</h3><p>If you are interested in learning what other C functions can be used in the manner described in this article, you can review <a href="https://www.gnu.org/software/libc/manual/html_node/index.html">The GNU C Library Reference Manual</a>. In addition to <a href="https://docs.python.org/3/library/ctypes.html">ctypes</a> and <a href="https://eli/cffi.readthedocs.io/">CFFI</a>, there exist specialized variants such as the <a href="https://numpy.org/">NumPy</a>-specific <a href="https://numpy.org/doc/stable/reference/routines.ctypeslib.html">numpy.ctypeslib</a> library (which comes with features that make it easier to package and deliver NumPy data structures to C functions). It is also possible to implement extension modules for Python in C/C++. Useful <a href="https://docs.python.org/3/extending/extending.html">definitions and guidelines</a> are provided that make it possible to write C/C++ code that interacts in appropriate ways with the Python interpreter. If you would like to leverage even more interoperation between Python and C/C++ code, you can investigate the <a href="https://cython.org/">Cython</a> compiler.</p><p><em>This article is also available as a </em><a href="https://github.com/python-supply/working-with-foreign-functions/blob/master/working-with-foreign-functions.ipynb"><em>Jupyter Notebook</em></a><em> and in </em><a href="https://python.supply/working-with-foreign-functions"><em>HTML</em></a><em> form on </em><a href="https://github.com/python-supply"><em>GitHub</em></a><em>.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=ee3b5e40f2ca" width="1" height="1" alt=""><hr><p><a href="https://medium.com/python-supply/working-with-foreign-functions-ee3b5e40f2ca">Working with Foreign Functions</a> was originally published in <a href="https://medium.com/python-supply">Python Supply</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Permutation Circuit Synthesis via Embedded Languages and Recursion]]></title>
            <link>https://medium.com/reity-llc/permutation-circuit-synthesis-via-embedded-languages-and-recursion-85e357b0f327?source=rss-f6b147043b04------2</link>
            <guid isPermaLink="false">https://medium.com/p/85e357b0f327</guid>
            <category><![CDATA[logic-circuit]]></category>
            <category><![CDATA[optimization]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[programming]]></category>
            <dc:creator><![CDATA[Andrei Lapets]]></dc:creator>
            <pubDate>Mon, 29 Jun 2020 20:10:00 GMT</pubDate>
            <atom:updated>2021-06-23T22:07:12.596Z</atom:updated>
            <content:encoded><![CDATA[<p>The ability to synthesize logical circuits as data structures (without any intention of implementing such circuits as hardware) is becoming increasingly relevant as technologies such as garbled circuit protocols and quantum computing platforms begin to mature. Consequently, there is a growing population working in research, prototyping, and even in software application development settings that may find it convenient to have the ability to synthesize circuits dynamically in popular languages such as Python.</p><p>This article describes how an embedded language approach coupled with recursion can be used to create a framework that can synthesize a relatively efficient logical circuit for any chosen permutation of the set of all bit vectors of some fixed length. The described approach can actually be applied to the synthesis of a circuit for any function over bit vectors of a fixed length. This article focuses on the case of permutations because it is more challenging to know in advance whether and how circuits that represent a permutation of a space of bit vectors can be optimized, thus motivating a general approach that can produce circuits that may be used directly or as an input to a more specialized circuit optimization process.</p><h3><strong>Embedded Language for Synthesizing Circuits</strong></h3><p>This article leverages the circuit and circuitry libraries, which constitute an embedded domain-specific language (with Python acting as the host language) for representing, building, and evaluating circuits description.</p><pre><strong>from</strong> parts <strong>import</strong> parts<br><strong>from</strong> circuit <strong>import</strong> *<br><strong>from</strong> circuitry <strong>import</strong> *</pre><h3><strong>Testing a Synthesis Approach</strong></h3><p>Before exploring and comparing synthesis techniques, it is useful to establish a standard approach for testing that the synthesis technique produces a circuit that is functionally correct. The function below performs such a test given a synthesis technique.</p><pre><strong>from</strong> itertools <strong>import</strong> product<br><strong>from</strong> random <strong>import</strong> shuffle<br><strong>from</strong> tqdm <strong>import</strong> tqdm<br><br><strong>def</strong> test(synthesis):<br>    <em># Create a permutation of all 8-bit vectors.</em><br>    vs_original = list(product(*[[0,1]]*8))<br>    vs_permuted = list(product(*[[0,1]]*8))<br>    shuffle(vs_permuted, <strong>lambda</strong>: 0.5)<br><br>    <em># Execute the synthesis function that is being tested.</em><br>    <em># A synthesis function must accept as it inputs an</em><br>    <em># initial vector to evaluate while constructing the</em><br>    <em># circuit (as necessitated by the `circuitry` library),</em><br>    <em># the original list of vectors, and a permuted list of</em><br>    <em># vectors.</em><br>    bit.circuit(circuit())<br>    bs = synthesis(<br>        bits([input(i) <strong>for</strong> i <strong>in</strong> ([0]*8)]),<br>        vs_original,<br>        vs_permuted<br>    )<br><br>    <em># Display some statistics and whether the circuit</em><br>    <em># correctly implements the permutation.</em><br>    c = bit.circuit()<br>    checks = ([<br>        (vo == tuple(c.evaluate(list(vi))))<br>        <strong>for</strong> (vi, vo) <strong>in</strong> tqdm(<br>            list(zip(vs_original, vs_permuted)),<br>            position=0, leave=<strong>True</strong><br>        )<br>    ])<br>    print(all(checks))<br>    print({<br>        o: c.count(<strong>lambda</strong> g: g.operation == o)<br>        <strong>for</strong> o <strong>in</strong> [op.and_, op.or_, op.not_]<br>    })</pre><h3>Naive Synthesis Approach</h3><p>A naive synthesis approach that utilizes logical formulas can act as a starting point. First, split the permutation <em>f</em>: {0, 1}<em>ⁿ</em> → {0, 1}<em>ⁿ</em> into <em>n </em>separate component functions {<em>f </em>∈ {0, 1}<em>ⁿ</em> → {0, 1}<em>ⁿ | i </em>∈ {0, …, <em>n</em>}}<em> </em>such that each component function computes one bit of the output bit vector. For each function <em>fᵢ</em>, convert every input vector <em>v </em>∈ {0, 1}<em>ⁿ </em>that maps to 1 into a corresponding formula <em>ψᵥ</em> that is true for exactly that vector. For example, given <em>v</em> = (0, 1, 1, 0), the formula would be <em>ψᵥ</em>(<em>a</em>, <em>b</em>, <em>c</em>, <em>d</em>)<em> </em>= (¬<em>a</em>) ∧ <em>b </em>∧<em> c </em>∧ (¬<em>d</em>).</p><p>Then, it is just a matter of taking the disjunction of all such formulas to obtain the formula <em>φᵢ</em> for the component function <em>fᵢ</em>. Finally, the output <em>f</em>(<em>v</em>) ∈ {0, 1}<em>ⁿ </em>of the overall function <em>f </em>on an input vector <em>v </em>∈ {0, 1}<em>ⁿ </em>can be computed by evaluating each of the <em>n </em>formulas <em>φᵢ </em>on the same input vector <em>v</em>. This approach can be implemented in a very concise manner, as demonstrated below.</p><pre><strong>from</strong> functools <strong>import</strong> reduce<br><br><strong>def</strong> naive(xs, vs_ins, vs_outs):<br>    <em>&quot;&quot;&quot;</em><br><em>    Synthesize a circuit for the given permutation.</em><br><em>    &quot;&quot;&quot;</em><br>    <strong>def</strong> clause(xs, kcs):<br>        <strong>if</strong> len(kcs) == 1:<br>            (k, c) = kcs[0]<br>            <strong>return</strong> xs[k] <strong>if</strong> c == 1 <strong>else</strong> ~xs[k]<br>        <strong>else</strong>:<br>            (kcs0, kcs1) = parts(kcs, 2)<br>            <strong>return</strong> clause(xs, kcs0) &amp; clause(xs, kcs1)<br><br>    <em># The set of all clauses, one for each input vector.</em><br>    cs = [clause(xs, tuple(enumerate(vi))) <strong>for</strong> vi <strong>in</strong> vs_ins]<br><br>    <em># Index sets of input vectors that should be included</em><br>    <em># for each output bit.</em><br>    ps = [<br>        reduce(<br>            (<strong>lambda</strong> p, q: p | q),<br>            [<br>                clause(xs, tuple(enumerate(vs_ins[r])))<br>                <strong>for</strong> (r, vo) <strong>in</strong> enumerate(vs_outs) <strong>if</strong> vo[c] == 1<br>            ]<br>        )<br>        <strong>for</strong> c <strong>in</strong> range(8)<br>    ]<br><br>    <strong>return</strong> outputs(ps)</pre><p>The naive approach can be evaluated and tested. The circuit generated using the approach is correct, but has a relatively large number of gates.</p><pre>&gt;&gt;&gt; test(naive)<br>100%|█████████████████████████| 256/256 [00:19&lt;00:00, 13.43it/s]<br>True<br>{(0, 0, 0, 1): 7168, (0, 1, 1, 1): 1016, (1, 0): 3711}</pre><h3>Optimized Synthesis Approach</h3><p>The naive approach described and implemented above creates a circuit that performs a large amount of redundant work. For any pair of input variables <em>a</em> and <em>b</em>, the circuit may have many instances of a gate such as <em>a</em> ∧ <em>b</em>. The optimized approach below attempts to take advantage of the fact that a circuit is a directed acyclic graph, finding opportunities to reuse gates where possible.</p><p>Note that the overall goal is <em>not</em> to implement an algorithm that can take a permutation as an input and find the optimal circuit with the minimal number of gates. Instead, the goal is to demonstrate that it is possible to leverage the embedded language for circuits to implement in a <em>concise</em> way a general-purpose <em>greedy</em> circuit synthesis algorithm that is a significant improvement over the naive approach (in terms of the size of the circuits it synthesizes for a given permutation).</p><h4>Most Frequent Pairs</h4><p>The optimized synthesis approach relies on the ability to identify a <em>pair</em> of elements (<em>e.g.</em>, logical variables) that appears <em>most frequently</em> across a collection of sets. A function for identifying such a pair given a collection of sets ss is presented below. This function takes advantage of the <a href="https://docs.python.org/3/library/collections.html#collections.Counter">Counter</a> class found in the built-in <a href="https://docs.python.org/3/library/collections.html">collections</a> library. Note that in addition to identifying a pair, the functions performs a few additional operations that will be useful within the synthesis algorithm.</p><pre><strong>from</strong> collections <strong>import</strong> Counter<br><br><strong>def</strong> pair(ss, ds):<br>    <em>&quot;&quot;&quot;</em><br><em>    Add to `ds` the pair of elements that appears most</em><br><em>    frequently across all sets in `ss`.</em><br><em>    &quot;&quot;&quot;</em><br>    <em># Collect all pairs of elements found in every set in `ss`.</em><br>    ps = [<br>        p <br>        <strong>for</strong> s <strong>in</strong> ss<br>        <strong>for</strong> p <strong>in</strong> [(x, y) <strong>for</strong> x <strong>in</strong> s <strong>for</strong> y <strong>in</strong> s <strong>if</strong> x &lt; y]<br>    ]<br><br>    <strong>if</strong> len(ps) == 0:<br>        <strong>return</strong> (ss, ds, <strong>False</strong>)<br>    <strong>else</strong>:<br>        <em># Find the most common pair.</em><br>        (p, i) = (Counter(ps).most_common(1)[0][0], len(ds))<br>        ds.append(p)<br>        <br>        <em># Replace these pairs of elements with an index into</em><br>        <em># the corresponding pair in `ds`.</em><br>        ss = [<br>            ((s - set(p)) | {i}) <strong>if</strong> set(p).issubset(s) <strong>else</strong> s<br>            <strong>for</strong> s <strong>in</strong> ss<br>        ]<br><br>        <strong>return</strong> (ss, ds, <strong>True</strong>)</pre><h4>Synthesis with Reuse</h4><p>The synthesis approach below modifies the naive synthesis approach by introducing two kinds of reuse:</p><ul><li>subformulas <em>ψᵥ</em> built for individual conjunction clauses are cached and reused (across <em>all</em> conjunctions) whenever possible and</li><li>clauses <em>ψᵥ</em> and their disjunctions are reused across the formulas constructed for the component functions <em>fᵢ</em> (via the heuristic above that looks for disjunctions of pairs of subformulas that occur most frequently at any given stage in the process).</li></ul><pre><strong>def</strong> optimized(xs, vs_ins, vs_outs):<br>    <em>&quot;&quot;&quot;</em><br><em>    Synthesize a circuit for the given permutation.</em><br><em>    &quot;&quot;&quot;</em><br>    cache = {}<br>    <strong>def</strong> clause(xs, kcs):<br>        <strong>if</strong> kcs <strong>in</strong> cache:<br>            <strong>return</strong> cache[kcs]<br>        <strong>elif</strong> len(kcs) == 1:<br>            (k, c) = kcs[0]<br>            cache[(k, c)] = xs[k] <strong>if</strong> c == 1 <strong>else</strong> ~xs[k]<br>            <strong>return</strong> cache[(k, c)]<br>        <strong>else</strong>:<br>            (kcs0, kcs1) = parts(kcs, 2)<br>            cache[kcs] = clause(xs, kcs0) &amp; clause(xs, kcs1)<br>            <strong>return</strong> cache[kcs]<br><br>    <em># Construct an initial collection of sets </em><br>    ss = [<br>        set(r <strong>for</strong> (r, vo) <strong>in</strong> enumerate(vs_outs) <strong>if</strong> vo[c] == 1)<br>        <strong>for</strong> c <strong>in</strong> range(8)<br>    ]<br><br>    <em># Keep merging the most frequent pair across all sets</em><br>    <em># until there are no pairs left.</em><br>    (ds, updated) = (list(range(len(vs_ins))), <strong>True</strong>)<br>    <strong>while</strong> updated:<br>        (ss, ds, updated) = pair(ss, ds)<br><br>    <em># Take the disjunction of every formula that corresponds</em><br>    <em># to an input vector that maps to `1`.</em><br>    cs = [clause(xs, tuple(enumerate(vi))) <strong>for</strong> vi <strong>in</strong> vs_ins]<br>    <strong>for</strong> (k, (i, j)) <strong>in</strong> enumerate(ds[len(vs_ins):]):<br>        cs.append(cs[i] | cs[j])<br><br>    <strong>return</strong> outputs([cs[k] <strong>for</strong> [k] <strong>in</strong> ss])</pre><p>A test of the optimized approach demonstrates a significant reduction in the number of gates.</p><pre>&gt;&gt;&gt; test(optimized)<br>100%|█████████████████████████| 256/256 [00:01&lt;00:00, 189.33it/s]<br>True<br>{(0, 0, 0, 1): 303, (0, 1, 1, 1): 494, (1, 0): 16}</pre><p><em>This article is also available as a </em><a href="https://github.com/reity/article-permutation-circuit-synthesis"><em>Jupyter Notebook</em></a><em> on </em><a href="https://github.com/reity"><em>GitHub</em></a><em>.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=85e357b0f327" width="1" height="1" alt=""><hr><p><a href="https://medium.com/reity-llc/permutation-circuit-synthesis-via-embedded-languages-and-recursion-85e357b0f327">Permutation Circuit Synthesis via Embedded Languages and Recursion</a> was originally published in <a href="https://medium.com/reity-llc">Reity LLC</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Static Checking via Metaclasses]]></title>
            <link>https://medium.com/python-supply/static-checking-via-metaclasses-cec368765b58?source=rss-f6b147043b04------2</link>
            <guid isPermaLink="false">https://medium.com/p/cec368765b58</guid>
            <category><![CDATA[object-oriented]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[static-code-analysis]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[software-engineering]]></category>
            <dc:creator><![CDATA[Andrei Lapets]]></dc:creator>
            <pubDate>Mon, 15 Jun 2020 03:48:00 GMT</pubDate>
            <atom:updated>2020-10-28T18:49:57.618Z</atom:updated>
            <content:encoded><![CDATA[<p>Suppose you are maintaining a domain-specific machine learning library. Users of the library’s API expect that every machine learning algorithm offered by the API will have the same interface (<em>i.e.</em>, the same methods with the same signatures) regardless of its underlying implementation. You would like to allow a community of contributors to define new algorithms that can be added to the library, but you would like to reduce your own effort and that of contributors when it comes to validating that a new algorithm conforms to the API.</p><p>Python <a href="https://docs.python.org/3/reference/datamodel.html#metaclasses">metaclasses</a> are the underlying, higher-order constructs that instantiate class definitions. Understanding what metaclasses are and how they can be used gives you a significant amount of control over what happens when a new class is introduced by users. This in turn allows you to constrain users when necessary <em>and</em> to provide assistance to users that can save them time and effort.</p><h3>How Classes are Made</h3><p>In Python, functions, classes, objects, and values are all on an equal footing. One consequence of this is that it is possible to pass any of these entities as arguments to functions and to return any of these entities as the result of a function (this fact was discussed in <a href="https://medium.com/python-supply/higher-order-functions-and-decorators-d6bb31a5c78d">another article</a> that covered Python decorators). But this also means that much of the syntax you normally use is actually just syntactic sugar for function calls.</p><p>What happens when the Python interpreter executes a class definition such as the one below?</p><pre><strong>class</strong> Document():<br>    <strong>def</strong> __init__(self):<br>        self.is_document = <strong>True</strong></pre><p>The <em>class</em> (<em>not</em> an instance or object of that class, but <em>the class itself</em>) is created and assigned to a variable that is in scope. In the example above, that variable is Document.</p><pre>&gt;&gt;&gt; Document<br>__main__.Document</pre><p>Python’s built-in <a href="https://docs.python.org/3/library/functions.html#type">type</a> function actually serves a number of purposes beyond determining the type of a value. Given a few additional parameters, the type function can be used to <em>define a new class</em>. Executing the statement in the example below is equivalent to executing the class definition for Document above.</p><pre><strong>def</strong> __init__(self):<br>    self.is_document = <strong>True<br><br></strong>Document = type(&#39;Document&#39;, (), {&#39;__init__&#39;: __init__})</pre><p>Now that Document is a class, it is possible to create objects of this class.</p><pre>&gt;&gt;&gt; d = Document()<br>&gt;&gt;&gt; d.is_document<br><strong>True</strong></pre><h3>How Metaclasses are Made</h3><p>In a manner similar to that of many programmaing languages that support the <a href="https://en.wikipedia.org/wiki/Object-oriented_programming">object-oriented programming</a> paradigm, Python allows programmers to define <em>derived</em> classes that <a href="https://docs.python.org/3/tutorial/classes.html#inheritance">inherit</a> the attributes and methods of a base class. The example below illustrates this by defining a class Passport that is derived from the Document class. Notice that the base class constructor Document is specified in the class definition.</p><pre><strong>class</strong> Passport(Document):<br>    <strong>pass</strong></pre><p>The Passport class inherits the attributes of the Document class. The example below illustrates that it inherits the __init__ method of the Document class.</p><pre>&gt;&gt;&gt; p = Passport()<br>&gt;&gt;&gt; p.is_document<br><strong>True</strong></pre><p>The example in which Document was defined using the built-in type function suggests that type can be viewed (at least using a loose analogy) as a means for creating classes. In a way, it behaves like a constructor for the &quot;class of all possible classes&quot;. Thus, if type is a kind of constructor for a class, it should be possible to use it in the same context as any other class constructor. But what should this mean? What is MetaClass in the example below?</p><pre><strong>class</strong> MetaClass(type):<br>    <strong>pass</strong></pre><p>Following the analogy to its logical conclusion, this must mean that MetaClass has inherited the capabilities of type. And, indeed, it has. In the example below, MetaClass is used to define a new class — in the same way that type was used before.</p><pre>&gt;&gt;&gt; Document = MetaClass(&#39;Document&#39;, (), {&#39;__init__&#39;: __init__})<br>&gt;&gt;&gt; d = Document()<br>&gt;&gt;&gt; d.is_document<br><strong>True</strong></pre><p>The ability to use a metaclass in place of type as in the above example is also supported by the more common class syntactic construct.</p><pre><strong>class</strong> Document(metaclass=MetaClass):<br>    <strong>def</strong> __init__(self):<br>        self.is_document = <strong>True</strong></pre><h3>Using Metaclasses to Enforce an API</h3><p>eturning to the motivating example from the first paragraph, suppose you introduce a metaclass called MetaAlgorithm for machine learning algorithms that is derived from type. This metaclass definition can override the method <a href="https://docs.python.org/3/reference/datamodel.html?highlight=__new__#object.__new__">__new__</a> that is normally invoked when a new class is defined using type (or using the equivalent class syntactic construct). This alternate definition of __new__ performs some additional checks before the class is actually created. In this use case, that additional work involves validating that the class being defined (corresponding to a new machine learning algorithm) conforms to your API.</p><pre><strong>from </strong>types <strong>import</strong> FunctionType<br><br><strong>class</strong> MetaAlgorithm(type):<br>    <strong>def</strong> __new__(cls, clsname, bases, attrs):<br>        <br>        <em># The base class does not need to conform to the API.</em><br>        <em># See the paragraph below for an explanation of this check.</em><br>        <strong>if</strong> clsname != &#39;Algorithm&#39;:<br>            <br>            <em># Check that the programmer-defined</em><br>            <em># class has a contributor string.</em><br>            <strong>if</strong> &#39;contributor&#39; <strong>not</strong> <strong>in</strong> attrs <strong>or</strong>\<br>               <strong>not</strong> isinstance(attrs[&#39;contributor&#39;], str):<br>                <strong>raise</strong> RuntimeError(&#39;missing contributor&#39;)<br>            <br>            <em># Check that the programmer-defined class has the<br>            # methods required for your API.</em><br>            <strong>if</strong> &#39;train&#39; <strong>not in</strong> attrs <strong>or</strong>\<br>               <strong>not</strong> isinstance(attrs[&#39;train&#39;], FunctionType):<br>                raise RuntimeError(&#39;missing training method&#39;)<br>            <br>            <strong>if</strong> &#39;classify&#39; <strong>not in</strong> attrs <strong>or</strong>\<br>               <strong>not</strong> isinstance(attrs[&#39;classify&#39;], FunctionType):<br>                <strong>raise</strong> RuntimeError(&#39;missing training method&#39;)<br>        <br>        <strong>return</strong>\<br>            super(MetaAlgorithm, cls)\<br>            .__new__(cls, clsname, bases, attrs)</pre><p>Now that there is a way to define new classes, there are two ways to proceed. One approach is to require that all algorithm classes that contributors implement must include the metaclass=MetaAlgorithm parameter in the class definition. However, this is easy for a contributor to forget and also may require that contributors have a solid understanding of metaclasses. An alternative is to create a base class from which all contributed algorithm classes must be derived.</p><pre><strong>class</strong> Algorithm(metaclass=MetaAlgorithm):<br>    <strong>pass</strong></pre><p>Using this approach, it is sufficient to export the Algorithm base class and to inform all contributors that their classes must be derived from this base class. The example below illustrates how a contributor might do so for a very basic algorithm.</p><pre><strong>class</strong> Guess(Algorithm):<br>    contributor = &quot;Author&quot;<br>    <br>    <strong>def</strong> train(items, labels):<br>        <strong>pass<br>    <br>    def</strong> classify(item):<br>        <strong>import </strong>random<br>        <strong>return</strong> random.choice([<strong>True</strong>, <strong>False</strong>])</pre><p>As the example below illustrates, an attempt by a user to define a class that does not conform to the API results in an error.</p><pre>&gt;&gt;&gt; try:<br>...     <strong>class</strong> Guess(Algorithm):<br>...         <strong>def</strong> classify(item):<br>...             <strong>return False<br></strong>...<br>... <strong>except</strong> RuntimeError <strong>as</strong> error:<br>...     print(&quot;RuntimeError:&quot;, str(error))<br>...<br>RuntimeError: missing contributor</pre><p>To emphasize: the error above occurs when the Python interpreter tries to execute the <em>definition</em> of the class, and <em>not</em> when an object of the class is created. It would be impossible to reach the point at which the interpreter attempts to create an object of this class because the class itself can never be defined.</p><p>Despite the fact that Python does not technically support static checking beyond ensuring that the syntax of a module is correct, it is arguably justifiable to say that what MetaAlgorithm does is a form of static checking. In many routine scenarios, the checks would be performed at the time that module is imported and before any other code has had a chance to run.</p><h3>Further Reading</h3><p>This article reviewed how user-defined classes are created in Python and how the mechanism for introducing classes can itself be customized. The built-in <a href="https://docs.python.org/3/library/types.html#module-types">types</a> library provides a number of additional methods that can assist in the dynamic creation of new types and classes. The motivating example in this article illustrated how these capabilities can be used to perform a form of <a href="https://en.wikipedia.org/wiki/Static_program_analysis">static analysis</a> of user-defined classes.</p><p>It is worth noting that the approach presented in this article is compatible with the methods for checking type annotations and unit testing functions presented in the article on <a href="https://medium.com/python-supply/advantages-of-type-annotations-6c6eb70fa631">type annotations</a>. For example, it would be straightforward to require that the training and classification method definitions include annotations specifying the exact types of the data that they can handle. It would even be possible to test the methods by generating test cases having the appropriate types. Another observation is that the motivating use case in this article can also be solved by using techniques presented in other articles, such as by applying <a href="https://medium.com/python-supply/higher-order-functions-and-decorators-d6bb31a5c78d">decorators</a> to class definitions or by performing a <a href="https://medium.com/python-supply/analyzing-and-transforming-abstract-syntax-d7d3da23b14f">static analysis</a> of the class definition itself.</p><p><em>This article is also available as a </em><a href="https://github.com/python-supply/static-checking-via-metaclasses/blob/master/static-checking-via-metaclasses.ipynb"><em>Jupyter Notebook</em></a><em> and in </em><a href="https://python.supply/static-checking-via-metaclasses"><em>HTML</em></a><em> form on </em><a href="https://github.com/python-supply"><em>GitHub</em></a><em>.</em></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=cec368765b58" width="1" height="1" alt=""><hr><p><a href="https://medium.com/python-supply/static-checking-via-metaclasses-cec368765b58">Static Checking via Metaclasses</a> was originally published in <a href="https://medium.com/python-supply">Python Supply</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>