Class VortexDataWriterFactory

java.lang.Object
dev.vortex.spark.write.VortexDataWriterFactory
All Implemented Interfaces:
Serializable, org.apache.spark.sql.connector.write.DataWriterFactory

public final class VortexDataWriterFactory extends Object implements org.apache.spark.sql.connector.write.DataWriterFactory, Serializable
Factory for creating VortexDataWriter instances on Spark executors.

This factory is serialized on the driver and sent to the executors, where it creates one data writer per task. When partition transforms are specified, it creates partitioned writers that organize output into Hive-style partition directories.
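The serialize-then-create lifecycle described above can be illustrated with plain Java serialization. The following is only a sketch under stated assumptions: `SketchWriterFactory`, its fields, and `describeWriter` are hypothetical stand-ins invented for illustration and are not part of the Vortex or Spark API. It shows that all state the factory needs on the executor has to survive a serialization round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a writer factory; NOT the real VortexDataWriterFactory.
final class SketchWriterFactory implements Serializable {
    private static final long serialVersionUID = 1L;

    // Assumed state: an output location and the partition column names.
    private final String outputPath;
    private final ArrayList<String> partitionColumns;

    SketchWriterFactory(String outputPath, List<String> partitionColumns) {
        this.outputPath = outputPath;
        this.partitionColumns = new ArrayList<>(partitionColumns);
    }

    // Mirrors the documented rule: partition transforms present means partitioned output.
    boolean isPartitioned() {
        return !partitionColumns.isEmpty();
    }

    // Stands in for createWriter; returns a description instead of a real writer.
    String describeWriter(int partitionId, long taskId) {
        String kind = isPartitioned() ? "partitioned" : "plain";
        return kind + " writer for partition " + partitionId
                + ", task " + taskId + " under " + outputPath;
    }

    // Simulates shipping the factory from the driver to an executor.
    static SketchWriterFactory roundTrip(SketchWriterFactory factory) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(factory);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SketchWriterFactory) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("factory is not serializable", e);
        }
    }

    public static void main(String[] args) {
        SketchWriterFactory onDriver = new SketchWriterFactory("/tmp/out", List.of("year"));
        SketchWriterFactory onExecutor = roundTrip(onDriver);
        System.out.println(onExecutor.describeWriter(0, 7L));
    }
}
```

Any field that is not itself serializable would make the round trip fail, which is why a factory of this kind typically holds only simple configuration values rather than open files or connections.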

  • Method Details

    • createWriter

      public org.apache.spark.sql.connector.write.DataWriter<org.apache.spark.sql.catalyst.InternalRow> createWriter(int partitionId, long taskId)
      Creates a new data writer for a specific partition and task.

      Each task writes its data to a separate Vortex file to avoid conflicts. When partition transforms are configured, returns a PartitionedVortexDataWriter that creates Hive-style partition directories.

      Specified by:
      createWriter in interface org.apache.spark.sql.connector.write.DataWriterFactory
      Parameters:
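      partitionId - the ID of the RDD partition that the returned writer will process
      taskId - the ID of the task attempt; Spark may run several tasks for the same partition (for example, after a task failure or under speculation)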
      Returns:
      a new DataWriter instance
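To make the one-file-per-task and Hive-style directory ideas concrete, here is a minimal sketch of how an output path could be derived from `partitionId`, `taskId`, and a set of partition values. The class `TaskOutputPaths`, the `part-…` naming scheme, and the `.vortex` extension are assumptions made for illustration; the actual naming used by VortexDataWriter is not documented here. Escaping of special characters in partition values is also left out.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper; the real writer's naming scheme may differ.
final class TaskOutputPaths {

    // Combines both IDs so that retried or speculative tasks for the
    // same partition never collide on a file name.
    static String fileName(int partitionId, long taskId) {
        return String.format(Locale.ROOT, "part-%05d-%d.vortex", partitionId, taskId);
    }

    // Builds a Hive-style directory such as "year=2024/month=01".
    // Iteration order of the map decides the nesting order.
    static String partitionDir(Map<String, String> partitionValues) {
        StringJoiner dir = new StringJoiner("/");
        for (Map.Entry<String, String> entry : partitionValues.entrySet()) {
            dir.add(entry.getKey() + "=" + entry.getValue());
        }
        return dir.toString();
    }

    // Full path: root, optional partition directory, then the task's file.
    static String filePath(String root, Map<String, String> partitionValues,
                           int partitionId, long taskId) {
        String dir = partitionDir(partitionValues);
        String name = fileName(partitionId, taskId);
        return dir.isEmpty() ? root + "/" + name : root + "/" + dir + "/" + name;
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("year", "2024");
        values.put("month", "01");
        // prints /data/events/year=2024/month=01/part-00003-42.vortex
        System.out.println(filePath("/data/events", values, 3, 42L));
    }
}
```

Including the task ID in the name, and not only the partition ID, matters because Spark may launch more than one task for the same partition; two attempts then write to different files.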