Package dev.vortex.spark.write
Class VortexDataWriterFactory
java.lang.Object
dev.vortex.spark.write.VortexDataWriterFactory
- All Implemented Interfaces:
Serializable, org.apache.spark.sql.connector.write.DataWriterFactory
public final class VortexDataWriterFactory
extends Object
implements org.apache.spark.sql.connector.write.DataWriterFactory, Serializable
Factory for creating VortexDataWriter instances on Spark executors.
This factory is serialized and sent to executors where it creates data writers for each task. When partition transforms are specified, it creates partitioned writers that organize output into Hive-style partition directories.
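The serialize-then-ship step described above can be illustrated with plain Java serialization. This is a minimal sketch, not the actual VortexDataWriterFactory: the class `SerializableFactorySketch`, its `outputPath` field, and the `roundTrip` helper are hypothetical names introduced only for illustration, and the in-memory round trip merely stands in for the driver-to-executor transfer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a writer factory; NOT the real VortexDataWriterFactory.
final class SerializableFactorySketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Assumed configuration carried from the driver to the executors.
    private final String outputPath;

    SerializableFactorySketch(String outputPath) {
        this.outputPath = outputPath;
    }

    String outputPath() {
        return outputPath;
    }

    // Serializes the factory to bytes and reads it back, mimicking the
    // driver -> executor hop with an in-memory buffer.
    static SerializableFactorySketch roundTrip(SerializableFactorySketch factory) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(factory);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SerializableFactorySketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("factory is not serializable", e);
        }
    }

    public static void main(String[] args) {
        SerializableFactorySketch copy = roundTrip(new SerializableFactorySketch("/data/out"));
        System.out.println(copy.outputPath());
    }
}
```

The practical consequence is that every field held by such a factory must itself be serializable (or marked transient and rebuilt on the executor); otherwise the job would fail when the factory is shipped.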
Method Summary
Modifier and Type: org.apache.spark.sql.connector.write.DataWriter<org.apache.spark.sql.catalyst.InternalRow>
Method: createWriter(int partitionId, long taskId)
Description: Creates a new data writer for a specific partition and task.
Method Details
createWriter
public org.apache.spark.sql.connector.write.DataWriter<org.apache.spark.sql.catalyst.InternalRow> createWriter(int partitionId, long taskId)
Creates a new data writer for a specific partition and task.
Each task writes its data to a separate Vortex file to avoid conflicts. When partition transforms are configured, returns a PartitionedVortexDataWriter that creates Hive-style partition directories.
- Specified by:
createWriter in interface org.apache.spark.sql.connector.write.DataWriterFactory
- Parameters:
partitionId - the partition ID
taskId - the task ID
- Returns:
a new DataWriter instance
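To make the per-task file and Hive-style directory layout concrete, here is a rough sketch of how an output path could be derived from `partitionId`, `taskId`, and a set of partition values. The class `TaskFilePathSketch`, the `part-%05d-%d.vortex` naming pattern, and the example paths are all assumptions for illustration; the source does not state the file naming scheme the library actually uses. Only the `column=value` directory convention is standard Hive-style layout.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper; the real writers may name files differently.
final class TaskFilePathSketch {

    // Builds a Hive-style directory such as "year=2024/month=01".
    // Note: this sketch does not escape special characters in values.
    static String partitionDir(Map<String, String> partitionValues) {
        StringJoiner joiner = new StringJoiner("/");
        for (Map.Entry<String, String> entry : partitionValues.entrySet()) {
            joiner.add(entry.getKey() + "=" + entry.getValue());
        }
        return joiner.toString();
    }

    // Assumed naming pattern: unique per (partitionId, taskId) pair,
    // so concurrent tasks never write to the same file.
    static String taskFileName(int partitionId, long taskId) {
        return String.format(Locale.ROOT, "part-%05d-%d.vortex", partitionId, taskId);
    }

    // Joins the root, the optional partition directory, and the file name.
    static String outputFile(String root, Map<String, String> partitionValues,
                             int partitionId, long taskId) {
        String dir = partitionDir(partitionValues);
        String name = taskFileName(partitionId, taskId);
        return dir.isEmpty() ? root + "/" + name : root + "/" + dir + "/" + name;
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("year", "2024");
        values.put("month", "01");
        // prints "/data/events/year=2024/month=01/part-00003-42.vortex"
        System.out.println(outputFile("/data/events", values, 3, 42L));
    }
}
```

A LinkedHashMap is used so that the directory nesting follows the order in which the partition columns were declared.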