Class VortexDataWriterFactory

java.lang.Object
dev.vortex.spark.write.VortexDataWriterFactory
All Implemented Interfaces:
Serializable, org.apache.spark.sql.connector.write.DataWriterFactory

public final class VortexDataWriterFactory extends Object implements org.apache.spark.sql.connector.write.DataWriterFactory, Serializable
Factory for creating VortexDataWriter instances on Spark executors.

This factory is serialized on the driver and sent to the executors, where it creates one data writer per task.
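Because the factory travels from the driver to the executors, every field it holds has to survive Java serialization. The following is a minimal sketch of that round trip using only the JDK. `SketchWriterFactory` is a hypothetical stand-in, not the real `VortexDataWriterFactory`, and the `key`/`value` option is invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

final class SerializationRoundTrip {
    // Serializes the factory to bytes (driver side) and reads it back (executor side).
    static SketchWriterFactory roundTrip(SketchWriterFactory factory) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(factory);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SketchWriterFactory) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            // A non-serializable field would surface here as NotSerializableException.
            throw new IllegalStateException("factory is not serializable", e);
        }
    }

    public static void main(String[] args) {
        // "key" -> "value" is a made-up option, used only to show that state survives.
        SketchWriterFactory original =
                new SketchWriterFactory("file:///tmp/out", Map.of("key", "value"));
        SketchWriterFactory copy = roundTrip(original);
        System.out.println(copy.outputUri() + " " + copy.options());
    }
}

// Hypothetical stand-in for a writer factory; NOT the real VortexDataWriterFactory.
final class SketchWriterFactory implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String outputUri;
    // Copied into a HashMap so the field is serializable whatever Map the caller passes.
    private final HashMap<String, String> options;

    SketchWriterFactory(String outputUri, Map<String, String> options) {
        this.outputUri = outputUri;
        this.options = new HashMap<>(options);
    }

    String outputUri() {
        return outputUri;
    }

    Map<String, String> options() {
        return options;
    }
}
```

The copy into a `HashMap` is a design choice of this sketch: it keeps the factory serializable even if the caller supplies a map implementation that is not.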

  • Constructor Details

    • VortexDataWriterFactory

      public VortexDataWriterFactory(String outputUri, org.apache.spark.sql.types.StructType schema, Map<String,String> options)
      Creates a new VortexDataWriterFactory.
      Parameters:
      outputUri - the base path where Vortex files will be written
      schema - the schema of the data to write
      options - additional write options
  • Method Details

    • createWriter

      public org.apache.spark.sql.connector.write.DataWriter<org.apache.spark.sql.catalyst.InternalRow> createWriter(int partitionId, long taskId)
      Creates a new data writer for a specific partition and task.

      Each task writes its data to a separate Vortex file to avoid conflicts.

      Specified by:
      createWriter in interface org.apache.spark.sql.connector.write.DataWriterFactory
      Parameters:
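      partitionId - the ID of the partition that the returned writer will process
      taskId - the ID of the task attempt; Spark may run more than one task for the same partition (for example, on retry or speculation)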
      Returns:
      a new VortexDataWriter instance
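One way to give each task its own file is to derive the file name from both `partitionId` and `taskId`. The sketch below shows the idea with plain JDK code. The `part-<partition>-<task>.vortex` pattern and the `taskFilePath` helper are assumptions made for illustration; the naming scheme actually used by `VortexDataWriter` is not documented here.

```java
import java.util.Locale;

final class TaskFileNaming {
    // Hypothetical naming scheme: builds a path under outputUri that is unique
    // per (partitionId, taskId) pair. Not the library's documented format.
    static String taskFilePath(String outputUri, int partitionId, long taskId) {
        String base = outputUri.endsWith("/") ? outputUri : outputUri + "/";
        return String.format(Locale.ROOT, "%spart-%05d-%d.vortex", base, partitionId, taskId);
    }

    public static void main(String[] args) {
        // Two attempts at the same partition get different task IDs,
        // so they resolve to different files.
        System.out.println(taskFilePath("file:///tmp/out", 0, 17L));
        System.out.println(taskFilePath("file:///tmp/out", 0, 18L));
    }
}
```

Including the task ID, and not only the partition ID, matters because a retried or speculative attempt at the same partition would otherwise write to the same path as the first attempt.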