Package dev.vortex.spark.write
Class VortexDataWriterFactory
java.lang.Object
dev.vortex.spark.write.VortexDataWriterFactory
- All Implemented Interfaces:
Serializable, org.apache.spark.sql.connector.write.DataWriterFactory
public final class VortexDataWriterFactory
extends Object
implements org.apache.spark.sql.connector.write.DataWriterFactory, Serializable
Factory for creating VortexDataWriter instances on Spark executors.
This factory is serialized and sent to executors where it creates data writers for each task.
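The serialize-then-ship behavior described above can be approximated without a Spark cluster. The following is a minimal sketch, assuming plain Java serialization; `SketchWriterFactory` and its `roundTrip` helper are hypothetical stand-ins and not part of the Vortex or Spark APIs, and the schema field is left out to keep the example free of Spark dependencies. It shows why every field a factory carries must itself be serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a writer factory; NOT the real VortexDataWriterFactory.
final class SketchWriterFactory implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String outputUri;
    // A concrete serializable map type, so the whole object graph can be serialized.
    private final HashMap<String, String> options;

    SketchWriterFactory(String outputUri, Map<String, String> options) {
        this.outputUri = outputUri;
        this.options = new HashMap<>(options); // defensive copy
    }

    String outputUri() {
        return outputUri;
    }

    Map<String, String> options() {
        return options;
    }

    // Serializes the factory to bytes and reads it back, approximating the
    // hop from the driver to an executor.
    static SketchWriterFactory roundTrip(SketchWriterFactory factory)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(factory);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (SketchWriterFactory) in.readObject();
        }
    }
}
```

Calling `SketchWriterFactory.roundTrip(factory)` returns a distinct copy with the same `outputUri` and `options`. A non-serializable field would instead fail with `java.io.NotSerializableException` when the object is written.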
Constructor Summary
Constructors
- Constructor: VortexDataWriterFactory(String outputUri, org.apache.spark.sql.types.StructType schema, Map<String, String> options)
  Description: Creates a new VortexDataWriterFactory.
Method Summary
- Modifier and Type: org.apache.spark.sql.connector.write.DataWriter<org.apache.spark.sql.catalyst.InternalRow>
  Method: createWriter(int partitionId, long taskId)
  Description: Creates a new data writer for a specific partition and task.
Constructor Details
VortexDataWriterFactory
public VortexDataWriterFactory(String outputUri, org.apache.spark.sql.types.StructType schema, Map<String, String> options)
Creates a new VortexDataWriterFactory.
- Parameters:
outputUri - the base path where Vortex files will be written
schema - the schema of the data to write
options - additional write options
Method Details
createWriter
public org.apache.spark.sql.connector.write.DataWriter<org.apache.spark.sql.catalyst.InternalRow> createWriter(int partitionId, long taskId)
Creates a new data writer for a specific partition and task.
Each task writes its data to a separate Vortex file to avoid conflicts.
- Specified by:
createWriter in interface org.apache.spark.sql.connector.write.DataWriterFactory
- Parameters:
partitionId - the partition ID
taskId - the task ID
- Returns:
a new VortexDataWriter instance
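One way to give each task its own file is to derive the file name from both identifiers passed to createWriter. The following is a minimal sketch under that assumption; the `TaskFilePaths` helper and the `part-<partitionId>-<taskId>.vortex` naming pattern are hypothetical and are not taken from the Vortex implementation, whose actual naming scheme is not documented on this page.

```java
import java.util.Locale;

// Hypothetical helper; the real file naming used by VortexDataWriterFactory
// is not documented here and may differ.
final class TaskFilePaths {
    private TaskFilePaths() {}

    // Builds a per-task file path under outputUri from the partition ID and task ID.
    static String forTask(String outputUri, int partitionId, long taskId) {
        // Avoid a doubled separator when the base path already ends with '/'.
        String base = outputUri.endsWith("/") ? outputUri : outputUri + "/";
        // Zero-pad the partition ID to five digits; append the task ID unpadded.
        return String.format(Locale.ROOT, "%spart-%05d-%d.vortex", base, partitionId, taskId);
    }
}
```

For example, `TaskFilePaths.forTask("out", 3, 42L)` yields `out/part-00003-42.vortex`. Including the task ID matters because Spark may run more than one task for the same partition (for example after a failure or under speculation), so a name built from the partition ID alone could collide.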