dev.vortex.spark.read.VortexScanBuilder

All Implemented Interfaces: org.apache.spark.sql.connector.read.ScanBuilder, org.apache.spark.sql.connector.read.SupportsPushDownRequiredColumns, org.apache.spark.sql.connector.read.SupportsPushDownV2Filters

public final class VortexScanBuilder extends Object implements org.apache.spark.sql.connector.read.ScanBuilder, org.apache.spark.sql.connector.read.SupportsPushDownRequiredColumns, org.apache.spark.sql.connector.read.SupportsPushDownV2Filters

Spark V2 ScanBuilder for table scans over Vortex files.

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| `VortexScanBuilder(Map<String,String> formatOptions)` | Creates a new VortexScanBuilder with empty paths and columns. |
| `VortexScanBuilder(Map<String,String> formatOptions, org.apache.spark.sql.connector.expressions.Transform[] partitionTransforms)` | Creates a new VortexScanBuilder with empty paths and columns and the supplied partition transforms. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| `VortexScanBuilder` | `addAllColumns(Iterable<org.apache.spark.sql.connector.catalog.Column> columns)` | Adds multiple columns to read. |
| `VortexScanBuilder` | `addAllPaths(Iterable<String> paths)` | Adds multiple file paths to scan. |
| `VortexScanBuilder` | `addColumn(org.apache.spark.sql.connector.catalog.Column column)` | Adds a column to read. |
| `VortexScanBuilder` | `addPath(String path)` | Adds a file path to scan. |
| `org.apache.spark.sql.connector.read.Scan` | `build()` | Builds a VortexScan with the configured paths and columns. |
| `void` | `pruneColumns(org.apache.spark.sql.types.StructType requiredSchema)` | Prunes the columns to only include those specified in the required schema. |
| `org.apache.spark.sql.connector.expressions.filter.Predicate[]` | `pushedPredicates()` | Returns the predicates this scan promises to apply. |
| `org.apache.spark.sql.connector.expressions.filter.Predicate[]` | `pushPredicates(org.apache.spark.sql.connector.expressions.filter.Predicate[] predicates)` | Splits the supplied predicates into pushed and not-pushed sets. |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- VortexScanBuilder
  
  public VortexScanBuilder(Map<String,String> formatOptions)
  
  Creates a new VortexScanBuilder with empty paths and columns.
- VortexScanBuilder
  
  public VortexScanBuilder(Map<String,String> formatOptions, org.apache.spark.sql.connector.expressions.Transform[] partitionTransforms)
  
  Creates a new VortexScanBuilder with empty paths and columns and the supplied partition transforms. Filters that reference partition columns are not pushed down, since the partition columns are not stored inside the Vortex files.

Method Details
- addPath
  
  public VortexScanBuilder addPath(String path)
  
  Adds a file path to scan.
  
  Parameters:
  
  path - the file path to add
  
  Returns:
  
  this builder for method chaining
- addColumn
  
  public VortexScanBuilder addColumn(org.apache.spark.sql.connector.catalog.Column column)
  
  Adds a column to read.
  
  Parameters:
  
  column - the column to add
  
  Returns:
  
  this builder for method chaining
- addAllPaths
  
  public VortexScanBuilder addAllPaths(Iterable<String> paths)
  
  Adds multiple file paths to scan.
  
  Parameters:
  
  paths - the iterable of file paths to add
  
  Returns:
  
  this builder for method chaining
- addAllColumns
  
  public VortexScanBuilder addAllColumns(Iterable<org.apache.spark.sql.connector.catalog.Column> columns)
  
  Adds multiple columns to read.
  
  Parameters:
  
  columns - the iterable of columns to add
  
  Returns:
  
  this builder for method chaining
- build
  
  public org.apache.spark.sql.connector.read.Scan build()
  
  Builds a VortexScan with the configured paths and columns.
  
  Specified by:
  
  build in interface org.apache.spark.sql.connector.read.ScanBuilder
  
  Returns:
  
  a new VortexScan instance
  
  Throws:
  
  IllegalStateException - if no paths or columns have been added
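
The chaining and validation behaviour described for addPath, addColumn, addAllPaths, and build can be illustrated with a small stand-in model. This is only a sketch: `ScanBuilderSketch` is a hypothetical class that uses plain strings in place of Spark's `Column` and returns a string in place of a `VortexScan`. It also assumes that build() rejects the builder when either the path list or the column list is empty, which is one reading of the exception text above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in model of the documented builder behaviour; not the Vortex source.
final class ScanBuilderSketch {
    private final List<String> paths = new ArrayList<>();
    private final List<String> columns = new ArrayList<>();

    ScanBuilderSketch addPath(String path) {
        paths.add(path);
        return this; // returning this is what enables method chaining
    }

    ScanBuilderSketch addAllPaths(Iterable<String> morePaths) {
        morePaths.forEach(paths::add);
        return this;
    }

    ScanBuilderSketch addColumn(String column) {
        columns.add(column);
        return this;
    }

    // Assumption: an empty path list OR an empty column list is an error.
    String build() {
        if (paths.isEmpty() || columns.isEmpty()) {
            throw new IllegalStateException("no paths or columns have been added");
        }
        return "scan(paths=" + paths + ", columns=" + columns + ")";
    }

    public static void main(String[] args) {
        String scan = new ScanBuilderSketch()
                .addAllPaths(List.of("a.vortex", "b.vortex"))
                .addColumn("id")
                .build();
        System.out.println(scan);
    }
}
```

Validating in build() rather than in each add method lets callers add paths and columns in any order.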
- pruneColumns
  
  public void pruneColumns(org.apache.spark.sql.types.StructType requiredSchema)
  
  Prunes the columns to only include those specified in the required schema.
  This method clears the current column list and replaces it with columns derived from the required schema. Only top-level pruning is currently supported; pruning of nested fields is not yet implemented.
  
  Specified by:
  
  pruneColumns in interface org.apache.spark.sql.connector.read.SupportsPushDownRequiredColumns
  
  Parameters:
  
  requiredSchema - the schema specifying which columns are required
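
The clear-and-replace behaviour of pruneColumns can be sketched with stand-in types. `PruneSketch` and its `Field` class are hypothetical and model a schema as a list of named fields with optional children. The sketch assumes that "top-level only" means a struct column is kept whole even when the required schema lists only some of its nested fields; the actual implementation may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of top-level column pruning; not the Vortex source.
final class PruneSketch {
    // Stand-in for a schema field; children model the fields of a nested struct.
    static final class Field {
        final String name;
        final List<Field> children;

        Field(String name, Field... children) {
            this.name = name;
            this.children = List.of(children);
        }
    }

    private final List<String> columns = new ArrayList<>();

    PruneSketch addColumn(String name) {
        columns.add(name);
        return this;
    }

    // Clears the current column list and replaces it with the top-level
    // field names of the required schema. Children are ignored (assumption).
    void pruneColumns(List<Field> requiredSchema) {
        columns.clear();
        for (Field field : requiredSchema) {
            columns.add(field.name);
        }
    }

    List<String> columns() {
        return List.copyOf(columns);
    }

    public static void main(String[] args) {
        PruneSketch builder = new PruneSketch()
                .addColumn("id").addColumn("name").addColumn("address");
        // Spark asks for id and address.city; the whole address column is kept.
        builder.pruneColumns(List.of(
                new Field("id"),
                new Field("address", new Field("city"))));
        System.out.println(builder.columns());
    }
}
```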
- pushPredicates
  
  public org.apache.spark.sql.connector.expressions.filter.Predicate[] pushPredicates(org.apache.spark.sql.connector.expressions.filter.Predicate[] predicates)
  
  Splits the supplied predicates into pushed and not-pushed sets.
  A predicate is pushed when it references only data columns (not partition columns) and uses operators and literal types that SparkPredicateToVortexExpression can map to Vortex expressions. Predicates that reference partition columns or use unsupported features are returned to Spark for post-scan evaluation.
  
  Specified by:
  
  pushPredicates in interface org.apache.spark.sql.connector.read.SupportsPushDownV2Filters
  
  Returns:
  
  the predicates that Spark must still evaluate
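
The split described for pushPredicates can be modelled roughly as follows. `PushdownSketch` and `Pred` are hypothetical stand-ins. A `Pred` records the columns it references and a `translatable` flag that stands for "SparkPredicateToVortexExpression can map it". The real class works on Spark `Predicate` objects, and its translation rules are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical model of the pushed / not-pushed split; not the Vortex source.
final class PushdownSketch {
    // Stand-in for a Spark predicate.
    static final class Pred {
        final String text;
        final Set<String> referencedColumns;
        final boolean translatable; // stands for "convertible to a Vortex expression"

        Pred(String text, Set<String> referencedColumns, boolean translatable) {
            this.text = text;
            this.referencedColumns = referencedColumns;
            this.translatable = translatable;
        }
    }

    private final Set<String> partitionColumns;
    private final List<Pred> pushed = new ArrayList<>();

    PushdownSketch(Set<String> partitionColumns) {
        this.partitionColumns = partitionColumns;
    }

    // Returns the predicates Spark must still evaluate and remembers the rest.
    List<Pred> pushPredicates(List<Pred> predicates) {
        pushed.clear(); // this sketch resets its state on every call (assumption)
        List<Pred> remaining = new ArrayList<>();
        for (Pred p : predicates) {
            boolean touchesPartition =
                    p.referencedColumns.stream().anyMatch(partitionColumns::contains);
            if (!touchesPartition && p.translatable) {
                pushed.add(p);    // evaluated inside the scan
            } else {
                remaining.add(p); // handed back for post-scan evaluation
            }
        }
        return remaining;
    }

    List<Pred> pushedPredicates() {
        return List.copyOf(pushed);
    }
}
```

With a partition column `dt`, a translatable predicate on `id` would be pushed, while a predicate on `dt` or one flagged as untranslatable would be returned to Spark.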
- pushedPredicates
  
  public org.apache.spark.sql.connector.expressions.filter.Predicate[] pushedPredicates()
  
  Returns the predicates this scan promises to apply.
  
  Specified by:
  
  pushedPredicates in interface org.apache.spark.sql.connector.read.SupportsPushDownV2Filters
