Package dev.vortex.spark.read
Class VortexScanBuilder
java.lang.Object
dev.vortex.spark.read.VortexScanBuilder
- All Implemented Interfaces:
org.apache.spark.sql.connector.read.ScanBuilder, org.apache.spark.sql.connector.read.SupportsPushDownRequiredColumns
public final class VortexScanBuilder
extends Object
implements org.apache.spark.sql.connector.read.ScanBuilder, org.apache.spark.sql.connector.read.SupportsPushDownRequiredColumns
Spark V2 ScanBuilder for table scans over Vortex files.
Constructor Summary
Constructors
Constructor, Description
VortexScanBuilder(Map<String, String> formatOptions)
Creates a new VortexScanBuilder with empty paths and columns.
Method Summary
Modifier and Type, Method, Description
VortexScanBuilder addAllColumns(Iterable<org.apache.spark.sql.connector.catalog.Column> columns)
Adds multiple columns to read.
VortexScanBuilder addAllPaths(Iterable<String> paths)
Adds multiple file paths to scan.
VortexScanBuilder addColumn(org.apache.spark.sql.connector.catalog.Column column)
Adds a column to read.
VortexScanBuilder addPath(String path)
Adds a file path to scan.
org.apache.spark.sql.connector.read.Scan build()
Builds a VortexScan with the configured paths and columns.
void pruneColumns(org.apache.spark.sql.types.StructType requiredSchema)
Prunes the columns to include only those specified in the required schema.
-
Constructor Details
-
VortexScanBuilder
Creates a new VortexScanBuilder with empty paths and columns.
-
-
Method Details
-
addPath
Adds a file path to scan.
- Parameters:
path - the file path to add
- Returns:
this builder for method chaining
-
addColumn
Adds a column to read.
- Parameters:
column - the column to add
- Returns:
this builder for method chaining
-
addAllPaths
Adds multiple file paths to scan.
- Parameters:
paths - the iterable of file paths to add
- Returns:
this builder for method chaining
-
addAllColumns
public VortexScanBuilder addAllColumns(Iterable<org.apache.spark.sql.connector.catalog.Column> columns)
Adds multiple columns to read.
- Parameters:
columns - the iterable of columns to add
- Returns:
this builder for method chaining
-
build
public org.apache.spark.sql.connector.read.Scan build()
Builds a VortexScan with the configured paths and columns.
- Specified by:
build in interface org.apache.spark.sql.connector.read.ScanBuilder
- Returns:
a new VortexScan instance
- Throws:
IllegalStateException - if no paths or columns have been added
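The chaining methods and the build() precondition can be illustrated with a small stand-in class. This is a sketch, not the Vortex implementation: ScanBuilderSketch, its String column names, and its String result are hypothetical, and it has no Spark dependency. It also assumes that build() rejects the builder when either the path list or the column list is empty; the description above does not say whether one or both must be missing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; not the real VortexScanBuilder.
final class ScanBuilderSketch {
    private final List<String> paths = new ArrayList<>();
    private final List<String> columns = new ArrayList<>();

    // Each add method returns this builder so calls can be chained.
    ScanBuilderSketch addPath(String path) {
        paths.add(path);
        return this;
    }

    ScanBuilderSketch addColumn(String column) {
        columns.add(column);
        return this;
    }

    ScanBuilderSketch addAllPaths(Iterable<String> morePaths) {
        morePaths.forEach(paths::add);
        return this;
    }

    // Assumption: an empty path list or an empty column list is rejected.
    String build() {
        if (paths.isEmpty() || columns.isEmpty()) {
            throw new IllegalStateException("no paths or columns have been added");
        }
        return "scan(paths=" + paths + ", columns=" + columns + ")";
    }

    public static void main(String[] args) {
        String scan = new ScanBuilderSketch()
                .addPath("/data/a.vortex")
                .addAllPaths(List.of("/data/b.vortex"))
                .addColumn("id")
                .build();
        System.out.println(scan);

        try {
            // Nothing was added, so the sketch refuses to build.
            new ScanBuilderSketch().build();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Validating in build() rather than in the add methods lets callers add paths and columns in any order.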
-
pruneColumns
public void pruneColumns(org.apache.spark.sql.types.StructType requiredSchema)
Prunes the columns to include only those specified in the required schema.
This method clears the current column list and replaces it with columns derived from the required schema. It currently supports only top-level schema pruning; pruning of deeply nested schemas is not yet implemented.
- Specified by:
pruneColumns in interface org.apache.spark.sql.connector.read.SupportsPushDownRequiredColumns
- Parameters:
requiredSchema - the schema specifying which columns are required
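The clear-and-replace behaviour described for pruneColumns can be sketched with plain Java types. This is an illustration under stated assumptions, not the Vortex implementation: ColumnPruningSketch and its Field class are hypothetical stand-ins for the builder and for Spark's schema types. The sketch models top-level pruning as keeping one column per top-level field of the required schema, without descending into child fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; not the real VortexScanBuilder.
final class ColumnPruningSketch {
    // A schema field: a name plus child fields (empty for non-struct types).
    static final class Field {
        final String name;
        final List<Field> children;

        Field(String name, Field... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    private final List<Field> columns = new ArrayList<>();

    ColumnPruningSketch addColumn(Field column) {
        columns.add(column);
        return this;
    }

    // Drops the current columns, then keeps one column per top-level field
    // of the required schema. Child fields are carried along untouched;
    // nothing here descends into them.
    void pruneColumns(List<Field> requiredSchema) {
        columns.clear();
        columns.addAll(requiredSchema);
    }

    List<String> columnNames() {
        List<String> names = new ArrayList<>();
        for (Field column : columns) {
            names.add(column.name);
        }
        return names;
    }

    public static void main(String[] args) {
        ColumnPruningSketch builder = new ColumnPruningSketch()
                .addColumn(new Field("id"))
                .addColumn(new Field("name"))
                .addColumn(new Field("address", new Field("city"), new Field("zip")));

        // The required schema asks only for "id" and "address".
        builder.pruneColumns(List.of(
                new Field("id"),
                new Field("address", new Field("city"), new Field("zip"))));

        System.out.println(builder.columnNames()); // prints [id, address]
    }
}
```

Because the column list is replaced rather than filtered, columns added earlier through addColumn do not survive a call to pruneColumns unless the required schema names them.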
-