Talk Python to Me: #503: The PyArrow Revolution
Pandas is at the core of virtually all data science done in Python, which is to say virtually all data science. Since its beginning, Pandas has been based upon NumPy. But changes are afoot to update those internals, and you can now optionally use PyArrow. PyArrow comes with a ton of benefits, including its columnar format, which makes answering analytical questions faster, support for a range of high-performance file formats, inter-machine data streaming, faster file IO, and more. Reuven Lerner is here to give us the low-down on the PyArrow revolution.<br/> <br/> <strong>Episode sponsors</strong><br/> <br/> <a href='https://talkpython.fm/nordlayer'>NordLayer</a><br> <a href='https://talkpython.fm/auth0'>Auth0</a><br> <a href='https://talkpython.fm/training'>Talk Python Courses</a><br/> <br/> <h2 class="links-heading">Links from the show</h2> <div><strong>Reuven</strong>: <a href="https://github.com/reuven?featured_on=talkpython" target="_blank" >github.com/reuven</a><br/> <strong>Apache Arrow</strong>: <a href="https://github.com/apache/arrow?featured_on=talkpython" target="_blank" >github.com</a><br/> <strong>Parquet</strong>: <a href="https://parquet.apache.org/?featured_on=talkpython" target="_blank" >parquet.apache.org</a><br/> <strong>Feather format</strong>: <a href="https://arrow.apache.org/docs/python/feather.html?featured_on=talkpython" target="_blank" >arrow.apache.org</a><br/> <strong>Python Workout Book (45% off with code talkpython45)</strong>: <a href="https://mng.bz/nZNv?featured_on=talkpython" target="_blank" >manning.com</a><br/> <strong>Pandas Workout Book (45% off with code talkpython45)</strong>: <a href="https://mng.bz/Qwvm?featured_on=talkpython" target="_blank" >manning.com</a><br/> <strong>Pandas</strong>: <a href="https://pandas.pydata.org/?featured_on=talkpython" target="_blank" >pandas.pydata.org</a><br/> <strong>PyArrow CSV docs</strong>: <a href="https://arrow.apache.org/docs/python/csv.html?featured_on=talkpython" target="_blank" 
>arrow.apache.org</a><br/> <strong>Future string inference in Pandas</strong>: <a href="https://pandas.pydata.org/docs?featured_on=talkpython" target="_blank" >pandas.pydata.org</a><br/> <strong>Pandas NA/nullable dtypes</strong>: <a href="https://pandas.pydata.org/docs/user_guide/integer_na.html?featured_on=talkpython" target="_blank" >pandas.pydata.org</a><br/> <strong>Pandas `.iloc` indexing</strong>: <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iloc.html?featured_on=talkpython" target="_blank" >pandas.pydata.org</a><br/> <strong>DuckDB</strong>: <a href="https://duckdb.org?featured_on=talkpython" target="_blank" >duckdb.org</a><br/> <strong>Pandas user guide</strong>: <a href="https://pandas.pydata.org/docs/user_guide/?featured_on=talkpython" target="_blank" >pandas.pydata.org</a><br/> <strong>Pandas GitHub issues</strong>: <a href="https://github.com/pandas-dev/pandas/issues?featured_on=talkpython" target="_blank" >github.com</a><br/> <strong>Watch this episode on YouTube</strong>: <a href="https://www.youtube.com/watch?v=IHd-bgeHrv0" target="_blank" >youtube.com</a><br/> <strong>Episode transcripts</strong>: <a href="https://talkpython.fm/episodes/transcript/503/the-pyarrow-revolution" target="_blank" >talkpython.fm</a><br/> <br/> <strong>--- Stay in touch with us ---</strong><br/> <strong>Subscribe to Talk Python on YouTube</strong>: <a href="https://talkpython.fm/youtube" target="_blank" >youtube.com</a><br/> <strong>Talk Python on Bluesky</strong>: <a href="https://bsky.app/profile/talkpython.fm" target="_blank" >@talkpython.fm at bsky.app</a><br/> <strong>Talk Python on Mastodon</strong>: <a href="https://fosstodon.org/web/@talkpython" target="_blank" ><i class="fa-brands fa-mastodon"></i>talkpython</a><br/> <strong>Michael on Bluesky</strong>: <a href="https://bsky.app/profile/mkennedy.codes?featured_on=talkpython" target="_blank" >@mkennedy.codes at bsky.app</a><br/> <strong>Michael on Mastodon</strong>: <a 
href="https://fosstodon.org/web/@mkennedy" target="_blank" ><i class="fa-brands fa-mastodon"></i>mkennedy</a><br/></div>
https://talkpython.fm/episodes/show/503/the-pyarrow-revolution