Class DataSource

java.lang.Object
dev.vortex.api.DataSource

public final class DataSource extends Object
A set of Vortex files opened through a Session. Data sources are cheap to open (only the first file is read eagerly, to determine the schema) and can be scanned multiple times.

Native resources are released automatically via VortexCleaner when the data source becomes unreachable.
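
The cleanup behaviour described above can be sketched with the JDK's own java.lang.ref.Cleaner. This is a minimal illustration, assuming VortexCleaner behaves like a standard Cleaner; its internals are not documented here. CleanerSketch, NativeHandle and Resource are hypothetical names and are not part of the Vortex API.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

class CleanerSketch {
    // One shared Cleaner; it runs cleanup actions on its own daemon thread.
    private static final Cleaner CLEANER = Cleaner.create();

    // Cleanup state. It must not hold a reference to the owning object,
    // otherwise the owner could never become unreachable.
    static final class NativeHandle implements Runnable {
        final AtomicBoolean released = new AtomicBoolean(false);

        @Override
        public void run() {
            // Stand-in for freeing a native resource.
            released.set(true);
        }
    }

    // Hypothetical owner of a native resource (plays the role of a data source).
    static final class Resource {
        final NativeHandle handle = new NativeHandle();
        final Cleaner.Cleanable cleanable;

        Resource() {
            // The action runs once the Resource becomes phantom reachable,
            // or earlier if clean() is called explicitly.
            cleanable = CLEANER.register(this, handle);
        }
    }

    public static void main(String[] args) {
        Resource r = new Resource();
        r.cleanable.clean(); // explicit release; the action runs at most once
        System.out.println(r.handle.released.get());
    }
}
```

Garbage-collection timing is not deterministic, so the sketch triggers the cleanup explicitly through clean() rather than waiting for the object to become unreachable.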

  • Method Details

    • open

      public static DataSource open(Session session, String uri)
      Open a single URI.
    • open

      public static DataSource open(Session session, String uri, Map<String,String> properties)
      Open a single URI or glob with object-store properties. When a glob is used, the first match is opened eagerly; subsequent matches are opened lazily on scan.
      Parameters:
      session - open session
      uri - single URI or glob
      properties - object-store credentials / options
    • open

      public static DataSource open(Session session, List<String> uris, Map<String,String> properties)
      Open one or more URIs or globs. When a glob is used, the first match is opened eagerly; subsequent matches are opened lazily on scan.
      Parameters:
      session - open session
      uris - URIs or globs to scan
      properties - object-store credentials / options
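
The "first match eagerly, the rest lazily on scan" behaviour of the open overloads can be pictured with the following self-contained sketch. It models the idea only and is not the Vortex implementation: LazyOpenSketch, FileSet and OpenedFile are hypothetical stand-ins, and the opener function takes the place of real file I/O.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class LazyOpenSketch {
    // Hypothetical stand-in for an opened file; not part of the Vortex API.
    record OpenedFile(String uri) {}

    static final class FileSet {
        private final List<String> uris;
        private final Function<String, OpenedFile> opener;
        private final List<OpenedFile> opened = new ArrayList<>();

        FileSet(List<String> uris, Function<String, OpenedFile> opener) {
            if (uris.isEmpty()) {
                throw new IllegalArgumentException("at least one URI is required");
            }
            this.uris = List.copyOf(uris);
            this.opener = opener;
            // Only the first file is opened eagerly, e.g. to learn the schema.
            opened.add(opener.apply(this.uris.get(0)));
        }

        int openedCount() {
            return opened.size();
        }

        // A scan opens whatever has not been opened yet.
        List<OpenedFile> scan() {
            for (int i = opened.size(); i < uris.size(); i++) {
                opened.add(opener.apply(uris.get(i)));
            }
            return List.copyOf(opened);
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        FileSet files = new FileSet(
                List.of("a.vortex", "b.vortex", "c.vortex"),
                uri -> { log.add(uri); return new OpenedFile(uri); });
        System.out.println(files.openedCount()); // 1: only the first file so far
        files.scan();
        System.out.println(files.openedCount()); // 3: the rest were opened by the scan
    }
}
```

In this model a second scan reuses the files that are already open, which is one way to keep repeated scans cheap; whether the library does the same is not stated here.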
    • arrowSchema

      public org.apache.arrow.vector.types.pojo.Schema arrowSchema(org.apache.arrow.memory.BufferAllocator allocator)
      Arrow schema of the data source (and of scans produced from it).
    • rowCount

      public DataSource.RowCount rowCount()
      Row count along with the precision of that estimate. Mirrors the Rust Precision<u64> returned by DataSource::row_count: DataSource.RowCount.Unknown when no estimate is available, DataSource.RowCount.Estimate for an inexact hint, DataSource.RowCount.Exact when the count is authoritative.
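
A caller usually needs to branch on the three precision levels. The sketch below shows one way to do that, using a hypothetical stand-in for DataSource.RowCount: the nested record names follow the variants listed above, but the rows() accessor and the sealed-interface shape are assumptions, and the real type may expose its value differently.

```java
import java.util.OptionalLong;

class RowCountSketch {
    // Hypothetical mirror of DataSource.RowCount; accessor names are assumed.
    sealed interface RowCount {}
    record Unknown() implements RowCount {}
    record Estimate(long rows) implements RowCount {}
    record Exact(long rows) implements RowCount {}

    // Returns a value only when the count is authoritative.
    static OptionalLong exactRows(RowCount rc) {
        if (rc instanceof Exact e) {
            return OptionalLong.of(e.rows());
        }
        return OptionalLong.empty();
    }

    // Human-readable summary that keeps the precision visible.
    static String describe(RowCount rc) {
        if (rc instanceof Exact e) {
            return "exactly " + e.rows() + " rows";
        }
        if (rc instanceof Estimate e) {
            return "about " + e.rows() + " rows";
        }
        return "row count unknown";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Exact(1000)));   // exactly 1000 rows
        System.out.println(describe(new Estimate(950))); // about 950 rows
        System.out.println(describe(new Unknown()));     // row count unknown
    }
}
```

Keeping the estimate and exact cases apart matters when the count drives a decision, such as pre-sizing a buffer: an inexact hint can guide the initial capacity but should not be treated as a hard bound.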
    • scan

      public Scan scan(ScanOptions options)
      Submit a scan.