Benchmarking#
Vortex has two categories of benchmarks: microbenchmarks for individual operations, and SQL
benchmarks for end-to-end query performance. The bench-orchestrator tool coordinates running
SQL benchmarks across different engines without compiling them all into a single binary.
Microbenchmarks#
Microbenchmarks use the Divan framework and live in benches/ directories within individual
crates. They cover low-level operations such as encoding, decoding, compute kernels, buffer
operations, and scalar access.
Run microbenchmarks for a specific crate with:
cargo bench -p <crate-name>
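A minimal sketch of what a Divan benchmark looks like (the function and workload here are hypothetical, not taken from a Vortex crate; a real bench file lives under benches/ and opts out of the default harness in Cargo.toml):

fn main() {
    // Collect and run every #[divan::bench] function in this binary.
    divan::main();
}

// Hypothetical microbenchmark: sum a small integer range.
#[divan::bench]
fn sum_1024() -> u32 {
    // black_box prevents the compiler from constant-folding the input away.
    (0..divan::black_box(1024u32)).sum()
}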
SQL Benchmarks#
SQL benchmarks measure end-to-end query performance across different engines and file formats.
The vortex-bench crate provides a common Benchmark trait that each benchmark suite
implements, defining its queries, data generation, and expected results.
Available suites include TPC-H, TPC-DS, ClickBench, FineWeb, and others. Each suite can be run against multiple engines (DataFusion, DuckDB) and formats (Parquet, Vortex, Vortex Compact, Lance, DuckDB native).
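The exact shape of the trait isn't shown here; as a rough illustration, it might look something like the following (names and signatures are hypothetical, and the actual trait in vortex-bench may differ):

use std::path::PathBuf;

pub trait Benchmark {
    // Suite identifier, e.g. "tpch".
    fn name(&self) -> &str;
    // SQL text for each query in the suite.
    fn queries(&self) -> Vec<String>;
    // Generate the suite's base data at a given scale factor.
    fn generate_data(&self, scale_factor: f64) -> std::io::Result<PathBuf>;
    // Expected answer for a query, used to validate engine output.
    fn expected_results(&self, query_index: usize) -> Option<String>;
}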
Data Generation#
Before running SQL benchmarks, test data must be generated:
cargo run --release --bin data-gen -- <benchmark> --formats parquet,vortex
The data generator creates base Parquet data and converts it to each requested format. Scale
factors are configurable per suite (e.g. --opt scale-factor=10.0 for TPC-H SF=10).
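For example, generating TPC-H data at SF=10 in both Parquet and Vortex (assuming the suite name tpch, as used by vx-bench below):

cargo run --release --bin data-gen -- tpch --formats parquet,vortex --opt scale-factor=10.0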
Running SQL Benchmarks#
SQL benchmarks can be run directly via their per-engine binaries:
cargo run --release --bin datafusion-bench -- <benchmark>
cargo run --release --bin duckdb-bench -- <benchmark>
Orchestrator#
The bench-orchestrator is a Python CLI tool (vx-bench) that coordinates running benchmarks
across multiple engines. It builds and invokes the per-engine binaries, stores results, and
provides comparison tooling. This avoids compiling all engines into a single binary, which
would be slow and create dependency conflicts.
Install it with:
uv tool install "bench_orchestrator @ ./bench-orchestrator/"
Running Benchmarks#
# Run TPC-H on DataFusion and DuckDB, comparing Parquet and Vortex
vx-bench run tpch --engine datafusion,duckdb --format parquet,vortex
# Run a subset of queries with fewer iterations
vx-bench run tpch -q 1,6,12 -i 3
# Run with memory tracking
vx-bench run tpch --track-memory
# Run with CPU profiling
vx-bench run tpch --samply
Comparing Results#
# Compare formats/engines within the most recent run
vx-bench compare --run latest
# Compare across two labeled runs
vx-bench compare --runs baseline,feature
Comparison output is color-coded: green for improvements (>10%), yellow for neutral changes, and red for regressions.
Result Storage#
Results are stored as JSON Lines files under target/vortex-bench/runs/, with each run
containing metadata (git commit, timestamp, configuration) and per-query timing data. The
vx-bench list command shows recent runs.
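A minimal sketch of reading a run file (the file name is hypothetical, and since the record schema isn't specified here, each line is parsed as a generic serde_json::Value; requires the serde_json crate):

use std::fs::File;
use std::io::{BufRead, BufReader};

fn main() -> std::io::Result<()> {
    // Hypothetical file name; actual run files live under target/vortex-bench/runs/.
    let file = File::open("target/vortex-bench/runs/example.jsonl")?;
    for line in BufReader::new(file).lines() {
        // JSON Lines: each line is one self-contained JSON object.
        let record: serde_json::Value =
            serde_json::from_str(&line?).expect("valid JSON record");
        println!("{record}");
    }
    Ok(())
}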
CI Benchmarks#
Benchmarks run automatically on all commits to develop and can be run on-demand for PRs:
Post-commit – compression, random access, and SQL benchmarks run on every commit to develop, with results uploaded for historical tracking.
PR benchmarks – triggered by the action/benchmark label. Results are compared against the latest develop run and posted as a PR comment.
SQL benchmarks – triggered by the action/benchmark-sql label. Runs a parametric matrix of suites, engines, formats, and storage backends (NVMe, S3).
All CI benchmarks run on dedicated instances with the release_debug profile and
-C target-cpu=native to produce representative numbers.
Results can be viewed at bench.vortex.dev.