
Add basic AVRO files (translated copies of the parquet testing files to avro) #62

Merged
alamb merged 4 commits into apache:master from Igosuki:avro on Sep 9, 2021

@Igosuki

@Igosuki Igosuki commented Aug 26, 2021

Contributor

N.B.: I used Spark for the translation, so there is some additional metadata in the files, but it can be removed.

@alamb alamb changed the title from "These are translated copies of the parquet testing files to avro." to "Add basic AVRO files (translated copies of the parquet testing files to avro)" Aug 29, 2021

@alamb alamb left a comment

Contributor


@pitrou / @kiszk do you have any concerns or suggestions for adding several smaller AVRO files into the testing repository? They are used for apache/datafusion#910 and we may consider adding avro support to the main apache-rs repo as well.

@pitrou

pitrou commented Aug 29, 2021

Member

This seems fine to me, but can you add a README explaining what these files are and how they were obtained?

@kiszk

kiszk commented Aug 29, 2021

Member

Looks good to me

@alamb

alamb commented Aug 30, 2021

Contributor

@Igosuki -- I added a basic README in 8d306ef -- can you provide the command you used to create these files from the original parquet?

Thanks!

@Igosuki

Igosuki commented Aug 31, 2021

Contributor Author

@Igosuki

Igosuki commented Aug 31, 2021

Contributor Author

It would be possible to use arrow-python and fastavro to achieve the same; I just have a lot of Spark experience and I prefer typed code, so I went that way.
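
For reference, a minimal sketch of that alternative route, assuming `pyarrow` and `fastavro` are installed. The helper names `avro_schema_for` and `parquet_to_avro`, the record name, and the small type mapping are hypothetical illustrations; this is not the command that produced the files in this PR, and it will not reproduce the Spark-specific metadata mentioned above.

```python
# Hypothetical sketch: translate a Parquet file to Avro with pyarrow + fastavro.
# Only a handful of primitive Arrow types are mapped; anything else raises.

ARROW_TO_AVRO = {
    "bool": "boolean",
    "int32": "int",
    "int64": "long",
    "float": "float",    # str(pyarrow.float32())
    "double": "double",  # str(pyarrow.float64())
    "string": "string",
    "binary": "bytes",
}


def avro_schema_for(record_name, arrow_fields):
    """Build an Avro record schema from (column name, Arrow type name) pairs."""
    fields = []
    for name, arrow_type in arrow_fields:
        if arrow_type not in ARROW_TO_AVRO:
            raise ValueError(f"no Avro mapping for Arrow type {arrow_type!r}")
        # Assumption: treat every column as nullable, so use a ["null", T] union.
        fields.append(
            {"name": name, "type": ["null", ARROW_TO_AVRO[arrow_type]], "default": None}
        )
    return {"type": "record", "name": record_name, "fields": fields}


def parquet_to_avro(parquet_path, avro_path, record_name="Record"):
    """Read a Parquet file and write its rows to an Avro container file."""
    # Imported lazily so the schema helper above works without these packages.
    import fastavro
    import pyarrow.parquet as pq

    table = pq.read_table(parquet_path)
    type_names = [str(t) for t in table.schema.types]
    schema = avro_schema_for(record_name, zip(table.schema.names, type_names))
    with open(avro_path, "wb") as out:
        fastavro.writer(out, fastavro.parse_schema(schema), table.to_pylist())
```

Called as `parquet_to_avro("input.parquet", "output.avro")` (placeholder paths), this would load the whole table into memory, which should be acceptable for small test files but not for large datasets.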

@alamb

alamb commented Sep 9, 2021

Contributor

Thanks @Igosuki! I am sorry for the delayed response -- I am catching up from being on vacation and hope to help push your contributions over the line real soon now.

Comment thread data/avro/README.md Outdated
@alamb alamb merged commit 1ec12d1 into apache:master Sep 9, 2021
@Igosuki

Igosuki commented Sep 10, 2021

Contributor Author

All good 👍
