References
Vortex is inspired by and, in some cases, directly based upon the excellent existing work of many researchers and OSS developers. This page serves as a reference to the work that has influenced Vortex (and saves us from asking one another to send links to the same papers over and over again).
Compression & Encodings
Scanning & Compute
Columnar File Formats
I/O & Cloud Storage
Exploiting Cloud Object Storage for High-Performance Analytics [DLN23]
Azim Afroozeh and Peter Boncz. The FastLanes compression layout: decoding > 100 billion integers per second with scalar code. Proc. VLDB Endow., 16(9):2132–2144, May 2023. URL: https://doi.org/10.14778/3598581.3598587, doi:10.14778/3598581.3598587.
Azim Afroozeh, Lotte Felius, and Peter Boncz. Accelerating GPU data processing using FastLanes compression. In Proceedings of the 20th International Workshop on Data Management on New Hardware, DaMoN '24. New York, NY, USA, 2024. Association for Computing Machinery. URL: https://doi.org/10.1145/3662010.3663450, doi:10.1145/3662010.3663450.
Azim Afroozeh, Leonardo X. Kuffo, and Peter Boncz. ALP: adaptive lossless floating-point compression. Proc. ACM Manag. Data, December 2023. URL: https://doi.org/10.1145/3626717, doi:10.1145/3626717.
Peter Boncz, Thomas Neumann, and Viktor Leis. FSST: fast random access string compression. Proc. VLDB Endow., 13(12):2649–2661, July 2020. URL: https://doi.org/10.14778/3407790.3407851, doi:10.14778/3407790.3407851.
Biswapesh Chattopadhyay, Priyam Dutta, Weiran Liu, Ott Tinn, Andrew McCormick, Aniket Mokashi, Paul Harvey, Hector Gonzalez, David Lomax, Sagar Mittal, Roee Aharon Ebenstein, Nikita Mikhaylin, Hung-ching Lee, Xiaoyan Zhao, Guanzhong Xu, Luis Antonio Perez, Farhad Shahmohammadi, Tran Bui, Neil McKay, Vera Lychagina, and Brett Elliott. Procella: unifying serving and analytical data at YouTube. PVLDB, 12(12):2022–2034, 2019. URL: https://dl.acm.org/citation.cfm?id=3360438.
Dominik Durner, Viktor Leis, and Thomas Neumann. JSON tiles: fast analytics on semi-structured data. In Proceedings of the 2021 International Conference on Management of Data, SIGMOD '21, 445–458. New York, NY, USA, 2021. Association for Computing Machinery. URL: https://doi.org/10.1145/3448016.3452809, doi:10.1145/3448016.3452809.
Dominik Durner, Viktor Leis, and Thomas Neumann. Exploiting cloud object storage for high-performance analytics. Proc. VLDB Endow., 16(11):2769–2782, July 2023. URL: https://doi.org/10.14778/3611479.3611486, doi:10.14778/3611479.3611486.
Navid Eslami and Niv Dayan. Memento filter: a fast, dynamic, and robust range filter. Proc. ACM Manag. Data, December 2024. URL: https://doi.org/10.1145/3698820, doi:10.1145/3698820.
Maximilian Kuschewski, David Sauerwein, Adnan Alhomssi, and Viktor Leis. BtrBlocks: efficient columnar compression for data lakes. Proc. ACM Manag. Data, June 2023. URL: https://doi.org/10.1145/3589263, doi:10.1145/3589263.
Yinan Li, Jianan Lu, and Badrish Chandramouli. Selection pushdown in column stores using bit manipulation instructions. Proc. ACM Manag. Data, June 2023. URL: https://doi.org/10.1145/3589323, doi:10.1145/3589323.