Class ArrowUtils

java.lang.Object
dev.vortex.spark.ArrowUtils

public final class ArrowUtils extends Object
Utility class for converting Arrow types to Spark SQL data types.

This class provides static methods that convert Arrow field definitions and type definitions into their corresponding Spark SQL DataType representations. It maps Arrow's type system onto Spark's type system, including nested types such as structs and arrays (Arrow lists).

  • Method Summary

    | Modifier and Type | Method | Description |
    | --- | --- | --- |
    | static org.apache.spark.sql.types.DataType | fromArrowField(dev.vortex.relocated.org.apache.arrow.vector.types.pojo.Field field) | Converts an Arrow Field to a Spark SQL DataType. |
    | static org.apache.spark.sql.types.DataType | fromArrowType(dev.vortex.relocated.org.apache.arrow.vector.types.pojo.ArrowType dt) | Converts an Arrow type to a Spark SQL DataType. |

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Method Details

    • fromArrowField

      public static org.apache.spark.sql.types.DataType fromArrowField(dev.vortex.relocated.org.apache.arrow.vector.types.pojo.Field field)
      Converts an Arrow Field to a Spark SQL DataType.

      This method handles complex types like structs and arrays by recursively converting their child fields. For primitive types, it delegates to fromArrowType(ArrowType).

      Parameters:
      field - the Arrow field to convert
      Returns:
      the corresponding Spark SQL DataType
      Throws:
      UnsupportedOperationException - if the Arrow type is not supported
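
The recursion described above can be sketched with stand-in types. This is a minimal illustration, not the ArrowUtils implementation: SketchField, its string type tags, and the string results are all hypothetical and are used only so the example runs without Arrow or Spark on the classpath. It shows struct and list fields recursing into their children while every other field falls through to a primitive mapping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the Arrow or Spark classes.
final class FieldConversionSketch {

    /** Simplified stand-in for an Arrow field: a name, a type tag, and child fields. */
    static final class SketchField {
        final String name;
        final String type;                 // e.g. "struct", "list", "int64", "utf8"
        final List<SketchField> children;

        SketchField(String name, String type, SketchField... children) {
            this.name = name;
            this.type = type;
            this.children = Arrays.asList(children);
        }
    }

    /** Complex types recurse into their children; everything else goes to fromType. */
    static String fromField(SketchField field) {
        switch (field.type) {
            case "struct": {
                List<String> members = new ArrayList<>();
                for (SketchField child : field.children) {
                    members.add(child.name + ":" + fromField(child));
                }
                return "struct<" + String.join(",", members) + ">";
            }
            case "list":
                // In this sketch a list has exactly one child describing its element type.
                return "array<" + fromField(field.children.get(0)) + ">";
            default:
                return fromType(field.type);
        }
    }

    /** Primitive mapping; tags without a mapping are rejected. */
    static String fromType(String type) {
        switch (type) {
            case "int32":   return "int";
            case "int64":   return "bigint";
            case "float64": return "double";
            case "utf8":    return "string";
            default:
                throw new UnsupportedOperationException("Unsupported type: " + type);
        }
    }

    public static void main(String[] args) {
        SketchField schema = new SketchField("row", "struct",
                new SketchField("id", "int64"),
                new SketchField("tags", "list", new SketchField("item", "utf8")));
        System.out.println(fromField(schema)); // prints struct<id:bigint,tags:array<string>>
    }
}
```

Because the struct and list branches call fromField on each child, arbitrarily deep nesting is handled by the same two cases, and an unsupported leaf type surfaces as an UnsupportedOperationException from the innermost call.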
    • fromArrowType

      public static org.apache.spark.sql.types.DataType fromArrowType(dev.vortex.relocated.org.apache.arrow.vector.types.pojo.ArrowType dt)
      Converts an Arrow type to a Spark SQL DataType.

      This method maps primitive Arrow types to their corresponding Spark SQL types. It supports the most common Arrow types: integers, floating-point numbers, strings, binary data, dates, timestamps, decimals, and nulls.

      Parameters:
      dt - the Arrow type to convert
      Returns:
      the corresponding Spark SQL DataType
      Throws:
      UnsupportedOperationException - if the Arrow type configuration is not supported
      RuntimeException - if the Arrow type is not recognized
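
The two failure modes listed above can be told apart by a caller, as the following sketch suggests. It is an illustration under stated assumptions, not the ArrowUtils implementation: SketchType and its subclasses are hypothetical stand-ins, and treating unsigned integers as the "unsupported configuration" case is an assumption made for the example, since the documentation does not say which configurations are rejected.

```java
// Hypothetical stand-ins for illustration only; these are NOT the Arrow or Spark classes.
final class TypeMappingSketch {

    interface SketchType {}

    static final class IntType implements SketchType {
        final int bitWidth;
        final boolean signed;

        IntType(int bitWidth, boolean signed) {
            this.bitWidth = bitWidth;
            this.signed = signed;
        }
    }

    static final class Utf8Type implements SketchType {}

    /** A type the mapping has no branch for. */
    static final class MysteryType implements SketchType {}

    static String fromType(SketchType dt) {
        if (dt instanceof IntType) {
            IntType i = (IntType) dt;
            if (!i.signed) {
                // Assumed example of a recognized type with an unsupported configuration.
                throw new UnsupportedOperationException("Unsigned integers are not supported");
            }
            switch (i.bitWidth) {
                case 8:  return "tinyint";
                case 16: return "smallint";
                case 32: return "int";
                case 64: return "bigint";
                default:
                    throw new UnsupportedOperationException("Unsupported bit width: " + i.bitWidth);
            }
        }
        if (dt instanceof Utf8Type) {
            return "string";
        }
        // No branch matched: the type itself is not recognized.
        throw new RuntimeException("Unrecognized type: " + dt.getClass().getSimpleName());
    }

    /** Distinguishes the two failure modes; the more specific exception is caught first. */
    static String describe(SketchType dt) {
        try {
            return fromType(dt);
        } catch (UnsupportedOperationException e) {
            return "unsupported";
        } catch (RuntimeException e) {
            return "unrecognized";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new IntType(32, true)));  // prints int
        System.out.println(describe(new IntType(32, false))); // prints unsupported
        System.out.println(describe(new MysteryType()));      // prints unrecognized
    }
}
```

UnsupportedOperationException is itself a subclass of RuntimeException, so a caller that needs to distinguish the two cases has to catch UnsupportedOperationException first; a single catch of RuntimeException handles both.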