Python Quickstart¶
Install¶
pip install vortex-array
Convert¶
You can either use your own Parquet file or download the example used here.
Use Arrow to read a Parquet file and then use array()
to construct an uncompressed
Vortex array:
>>> import pyarrow.parquet as pq
>>> import vortex as vx
>>> parquet = pq.read_table("_static/example.parquet")
>>> vtx = vx.array(parquet)
>>> vtx.nbytes
141024
Compress¶
Use compress()
to compress the Vortex array and check the relative size:
>>> cvtx = vx.compress(vtx)
>>> cvtx.nbytes
14215
>>> cvtx.nbytes / vtx.nbytes
0.10...
Vortex uses nearly ten times fewer bytes than Arrow. Fewer bytes means more of your data fits in cache and RAM.
Write¶
Use write_path()
to write the Vortex array to disk:
>>> vortex.io.write_path(cvtx, "example.vortex")
Small Vortex files (this one is just 71KiB) currently have substantial overhead relative to their size. This will be addressed shortly. On files with at least tens of megabytes of data, Vortex is similar to or smaller than Parquet.
>>> from os.path import getsize
>>> getsize("example.vortex") / getsize("_static/example.parquet")
2.0...
Read¶
Use read_path()
to read the Vortex array from disk:
>>> cvtx = vortex.io.read_path("example.vortex")