ARROW-13729: [Website] setup datafusion python binding docs #10982

Closed
jimexist wants to merge 2 commits into apache:master from jimexist:setup-datafusion-docs

jimexist commented Aug 24, 2021

Member

@github-actions

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW

Opening JIRAs ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}
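
The two title formats quoted above could be checked mechanically. The following is only a minimal sketch in Python, assuming the JIRA ID is numeric and that one or more bracketed components are allowed; `TITLE_RE` and `is_valid_title` are hypothetical names and are not part of the Arrow tooling.

```python
import re

# Hypothetical pattern derived from the two formats quoted above:
#   ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
#   MINOR: [${COMPONENT}] ${SUMMARY}
# Assumptions: JIRA_ID is digits only; one or more [COMPONENT] tags may appear.
TITLE_RE = re.compile(r"^(ARROW-\d+|MINOR): (\[[^\]]+\])+ \S.*$")


def is_valid_title(title: str) -> bool:
    """Return True if the title matches either of the two formats."""
    return TITLE_RE.match(title) is not None


if __name__ == "__main__":
    for candidate in (
        "[docs] setup datafusion python binding docs",
        "ARROW-13729: [Website] setup datafusion python binding docs",
    ):
        print(candidate, "->", is_valid_title(candidate))
```

Under these assumptions, the original title of this pull request would be rejected and the renamed one accepted.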

jimexist changed the title from "[docs] setup datafusion python binding docs" to "ARROW-13729: [Website] setup datafusion python binding docs" Aug 24, 2021

houqp commented Aug 24, 2021

Member

Looks amazing, thanks @jimexist !

@alamb @jorgecarleitao and @andygrove, do we need to wait for a voted datafusion 5.1.0 release before we can merge this PR?

alamb commented Aug 25, 2021

Contributor

@houqp I do not think any official vote is required to release documentation as I wouldn't personally consider it part of the release.

houqp commented Aug 30, 2021

Member

@jimexist do we need to update https://github.com/apache/arrow/blob/master/docs/source/developers/documentation.rst as well to mention installation of the datafusion package?

I am not familiar with the automated doc build and publish pipeline, @kszucs @pitrou @lidavidm @wesm @fsaintjacques do we need to update automation to build and install the datafusion python package?

Comment thread docs/source/conf.py
'..', '../..')

])
sys.path.extend([os.path.join(os.path.dirname(__file__), "..", "../..")])

Member

Most of this file is auto-generated. By making style changes we will make it more difficult to diff the current file against a pristine file.

pitrou commented Aug 30, 2021

Member

I don't think Datafusion is a "supported environment" at the same level as C++, Python, etc. Since this documents the Python Datafusion library, it could go into the existing "Python" section.

However, since Datafusion is a distinct library from PyArrow (and has its own release schedule), it seems to me it would be better if its documentation lived in a different place altogether. That could be on https://arrow.apache.org, but it could also be hosted on https://readthedocs.org/, for example.

@amol- What do you think?

amol- commented Aug 30, 2021

Member

> I don't think Datafusion is a "supported environment" at the same level as C++, Python, etc. Since this documents the Python Datafusion library, it could go into the existing "Python" section.
>
> However, since Datafusion is a distinct library from PyArrow (and has its own release schedule), it seems to me it would be better if its documentation lived in a different place altogether. That could be on https://arrow.apache.org, but it could also be hosted on https://readthedocs.org/, for example.
>
> @amol- What do you think?

I would vote for giving datafusion its own documentation (on readthedocs or anywhere else, as you suggested). It is certainly not one of the supported environments, and it is not even part of libarrow/pyarrow itself, so I'd avoid mixing its documentation into pyarrow's.

@jorgecarleitao

Member

houqp commented Sep 2, 2021

Member

Based on the last reply from @kszucs (https://lists.apache.org/thread.html/r5c9341fe5360ae249532724da2bf92bd2ed661d1c58f599212e82107%40%3Cdev.arrow.apache.org%3E), how about we reuse the current sphinx setup/theme for datafusion, but create a new self-contained website in the /datafusion folder of https://github.com/apache/arrow-site/tree/asf-site? From there, we could host not just the python api/user doc, but also datafusion user docs in general (related to apache/datafusion#837).

houqp commented Sep 4, 2021

Member

Following up on this, @pitrou @amol- based on @wesm's suggestion in https://lists.apache.org/thread.html/r9500355019f7e438ed2417bce577e1a76dbcba742c2c7e1008dcffd5%40%3Cdev.arrow.apache.org%3E, are you all cool with us hosting datafusion-related docs on their own website under https://arrow.apache.org/datafusion?

@jimexist

Member Author

@houqp thanks for migrating this code. Maybe I can close this pull request now?

houqp commented Sep 17, 2021

Member

I think so, all of your changes in this PR should have been merged into the datafusion repo now :)

Development

Successfully merging this pull request may close these issues.

Release documentation for python binding

6 participants