java.lang.Object

dev.vortex.spark.write.VortexBatchWrite

All Implemented Interfaces: Serializable, org.apache.spark.sql.connector.write.BatchWrite, org.apache.spark.sql.connector.write.Write

public final class VortexBatchWrite extends Object implements org.apache.spark.sql.connector.write.Write, org.apache.spark.sql.connector.write.BatchWrite, Serializable

Manages the batch write operation for creating Vortex files.

This class coordinates the distributed write operation across Spark executors, handling the creation of data writers and managing commits/aborts.
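The coordination described above follows the general shape of Spark's batch write protocol: the driver creates a writer factory once, each successful task reports a commit message, and the driver finishes with either commit or abort. The sketch below simulates only that call order. `SketchBatchWrite`, `LifecycleSketch`, `Recording`, and `runJob` are hypothetical stand-ins defined here so the example runs without Spark on the classpath; they are not the Spark or Vortex classes, and real scheduling (including when a failure triggers an abort) is decided by Spark.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that borrows the method names of Spark's BatchWrite.
// It is NOT the Spark interface; messages are plain strings for simplicity.
interface SketchBatchWrite {
    void createBatchWriterFactory();
    void onDataWriterCommit(String message);
    void commit(String[] messages);
    void abort(String[] messages);
}

final class LifecycleSketch {
    // Records the order in which the callbacks fire.
    static final class Recording implements SketchBatchWrite {
        final List<String> calls = new ArrayList<>();
        public void createBatchWriterFactory() { calls.add("createBatchWriterFactory"); }
        public void onDataWriterCommit(String m) { calls.add("onDataWriterCommit(" + m + ")"); }
        public void commit(String[] ms) { calls.add("commit[" + ms.length + "]"); }
        public void abort(String[] ms) { calls.add("abort[" + ms.length + "]"); }
    }

    // Simulates one job; taskSucceeds[i] says whether task i finishes successfully.
    static List<String> runJob(boolean[] taskSucceeds) {
        Recording write = new Recording();
        write.createBatchWriterFactory();              // once, before any task runs
        String[] messages = new String[taskSucceeds.length];
        boolean failed = false;
        for (int i = 0; i < taskSucceeds.length; i++) {
            if (taskSucceeds[i]) {
                messages[i] = "task-" + i;
                write.onDataWriterCommit(messages[i]); // once per successful task
            } else {
                failed = true;                         // this task's slot stays null
            }
        }
        if (failed) {
            write.abort(messages);                     // job-level cleanup
        } else {
            write.commit(messages);                    // job-level finalization
        }
        return write.calls;
    }

    public static void main(String[] args) {
        System.out.println(runJob(new boolean[] {true, true}));
        System.out.println(runJob(new boolean[] {true, false}));
    }
}
```

In the simulation a job whose tasks all succeed ends with a single commit call, while any failed task leads to a single abort call that receives an array with a null slot for that task.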

See Also:

Serialized Form

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| VortexBatchWrite(String outputPath, org.apache.spark.sql.types.StructType schema, Map<String,String> options, boolean overwrite) | Creates a new VortexBatchWrite. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | abort(org.apache.spark.sql.connector.write.WriterCommitMessage[] messages) | Aborts the write job due to failures. |
| void | commit(org.apache.spark.sql.connector.write.WriterCommitMessage[] messages) | Commits the entire write job after all tasks complete successfully. |
| org.apache.spark.sql.connector.write.DataWriterFactory | createBatchWriterFactory(org.apache.spark.sql.connector.write.PhysicalWriteInfo info) | Creates a DataWriterFactory for producing data writers on executors. |
| void | onDataWriterCommit(org.apache.spark.sql.connector.write.WriterCommitMessage message) | Called when a single data writer task completes successfully. |
| org.apache.spark.sql.connector.write.BatchWrite | toBatch() | Returns this object as a BatchWrite. |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.spark.sql.connector.write.BatchWrite
useCommitCoordinator

Methods inherited from interface org.apache.spark.sql.connector.write.Write
description, supportedCustomMetrics, toStreaming

Constructor Details
- VortexBatchWrite
  
  public VortexBatchWrite(String outputPath, org.apache.spark.sql.types.StructType schema, Map<String,String> options, boolean overwrite)
  
  Creates a new VortexBatchWrite.
  
  Parameters:
  
  outputPath - the base path where Vortex files will be written
  
  schema - the schema of the data to write
  
  options - additional write options
  
  overwrite - whether to overwrite existing files
Method Details
- toBatch
  
  public org.apache.spark.sql.connector.write.BatchWrite toBatch()
  
  Returns this object as a BatchWrite.
  This method is required by the Write interface to support batch writes.
  
  Specified by:
  
  toBatch in interface org.apache.spark.sql.connector.write.Write
  
  Returns:
  
  this object
- createBatchWriterFactory
  
  public org.apache.spark.sql.connector.write.DataWriterFactory createBatchWriterFactory(org.apache.spark.sql.connector.write.PhysicalWriteInfo info)
  
  Creates a DataWriterFactory for producing data writers on executors.
  This method is called once at the start of the write operation, making it the right place to handle overwrite cleanup.
  
  Specified by:
  
  createBatchWriterFactory in interface org.apache.spark.sql.connector.write.BatchWrite
  
  Returns:
  
  a new VortexDataWriterFactory
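
To illustrate what overwrite cleanup at the start of a write could involve, here is a minimal sketch that empties an output directory before any writer runs. It assumes a local filesystem and uses only `java.nio.file`; `OverwriteCleanupSketch`, `prepareOutput`, and `tempDirWith` are hypothetical names, and the actual VortexBatchWrite cleanup may work differently (for example, against a distributed filesystem).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class OverwriteCleanupSketch {
    // Ensures outputPath exists. When overwrite is true, deletes everything
    // beneath it (the directory itself is kept) and returns the number of
    // entries deleted; when overwrite is false, deletes nothing and returns 0.
    static int prepareOutput(Path outputPath, boolean overwrite) {
        try {
            Files.createDirectories(outputPath);
            if (!overwrite) {
                return 0;
            }
            List<Path> entries;
            try (Stream<Path> walk = Files.walk(outputPath)) {
                entries = walk
                        .filter(p -> !p.equals(outputPath))
                        .sorted(Comparator.reverseOrder()) // children before parents
                        .collect(Collectors.toList());
            }
            for (Path p : entries) {
                Files.delete(p);
            }
            return entries.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo helper (hypothetical): a fresh temp directory holding empty files.
    static Path tempDirWith(String... names) {
        try {
            Path dir = Files.createTempDirectory("vortex-sketch");
            for (String name : names) {
                Files.createFile(dir.resolve(name));
            }
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Doing this once on the driver, rather than in each data writer, avoids a race in which one task deletes files another task has just written.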
- onDataWriterCommit
  
  public void onDataWriterCommit(org.apache.spark.sql.connector.write.WriterCommitMessage message)
  
  Called when a single data writer task completes successfully.
  Spark calls this once for each successful task; the individual file commits are handled by the data writer itself.
  
  Specified by:
  
  onDataWriterCommit in interface org.apache.spark.sql.connector.write.BatchWrite
  
  Parameters:
  
  message - commit message from a successful data writer task
- commit
  
  public void commit(org.apache.spark.sql.connector.write.WriterCommitMessage[] messages)
  
  Commits the entire write job after all tasks complete successfully.
  This finalizes the write operation and ensures all Vortex files are properly written.
  
  Specified by:
  
  commit in interface org.apache.spark.sql.connector.write.BatchWrite
  
  Parameters:
  
  messages - commit messages from all successful write tasks
- abort
  
  public void abort(org.apache.spark.sql.connector.write.WriterCommitMessage[] messages)
  
  Aborts the write job due to failures.
  This method cleans up any partially written files.
  
  Specified by:
  
  abort in interface org.apache.spark.sql.connector.write.BatchWrite
  
  Parameters:
  
  messages - commit messages from the write tasks; entries may be missing (null) for tasks that failed or never committed
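
As a rough illustration of cleaning up partially written files, the sketch below deletes the file named by each commit message and skips empty slots. It assumes each message records the path of the file its task wrote and that the files live on a local filesystem; `AbortCleanupSketch`, `FileMessage`, and `tempFile` are hypothetical names, not part of the Vortex or Spark APIs, and the real abort logic may locate files differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class AbortCleanupSketch {
    // Hypothetical commit message that remembers which file a task wrote.
    static final class FileMessage {
        final Path file;

        FileMessage(Path file) {
            this.file = file;
        }
    }

    // Deletes the file behind every non-null message and returns how many
    // files were actually removed. Null slots (tasks that never produced a
    // message) and files that are already gone are skipped.
    static int abort(FileMessage[] messages) {
        int removed = 0;
        for (FileMessage message : messages) {
            if (message == null) {
                continue;
            }
            try {
                if (Files.deleteIfExists(message.file)) {
                    removed++;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return removed;
    }

    // Demo helper (hypothetical): creates an empty temp file to clean up.
    static Path tempFile() {
        try {
            return Files.createTempFile("vortex-sketch", ".vortex");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Using `deleteIfExists` keeps the cleanup idempotent, so calling it a second time with the same messages removes nothing and does not fail.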
