types – PyArrow Extension Types

class pymongoarrow.types.BSONExtensionScalar

as_py(self, *, maps_as_pydicts=None)

Return this scalar as a Python object.

Parameters

maps_as_pydicts : str, optional, default None

Valid values are None, ‘lossy’, or ‘strict’. The default behavior (None) is to convert Arrow Map arrays to Python association lists (list-of-tuples) in the same order as the Arrow Map, as in [(key1, value1), (key2, value2), …].

If ‘lossy’ or ‘strict’, convert Arrow Map arrays to native Python dicts.

If ‘lossy’, a warning is printed whenever duplicate keys are detected, and the last value seen for a duplicate key is kept in the Python dictionary. If ‘strict’, duplicate keys raise an exception instead.

cast(self, target_type=None, safe=None, options=None, memory_pool=None)

Cast scalar value to another data type.

See pyarrow.compute.cast() for usage.

Parameters

target_type : DataType, default None

Type to cast scalar to.

safe : bool, default True

Whether to check for conversion errors such as overflow.

options : CastOptions, default None

Additional checks passed by CastOptions.

memory_pool : MemoryPool, optional

Memory pool to use for allocations during function execution.

Returns

scalar : A Scalar of the given target data type.

equals(self, Scalar other)

Parameters

other : pyarrow.Scalar

Returns

bool

static from_storage(BaseExtensionType typ, value)

Construct ExtensionScalar from type and storage value.

Parameters

typ : DataType

The extension type for the result scalar.

value : object

The storage value for the result scalar.

Returns

ext_scalar : ExtensionScalar

is_valid

Holds a valid (non-null) value.

type

Data type of the Scalar object.

validate(self, *, full=False)

Perform validation checks. An exception is raised if validation fails.

By default only cheap validation checks are run. Pass full=True for thorough validation checks (potentially O(n)).

Parameters

full : bool, default False

If True, run expensive checks; otherwise run cheap checks only.

Raises

ArrowInvalid

value

Return storage value as a scalar.

class pymongoarrow.types.BinaryScalar

as_py(self, *, maps_as_pydicts=None)

Return this scalar as a Python object.

Parameters

maps_as_pydicts : str, optional, default None

Valid values are None, ‘lossy’, or ‘strict’. The default behavior (None) is to convert Arrow Map arrays to Python association lists (list-of-tuples) in the same order as the Arrow Map, as in [(key1, value1), (key2, value2), …].

If ‘lossy’ or ‘strict’, convert Arrow Map arrays to native Python dicts.

If ‘lossy’, a warning is printed whenever duplicate keys are detected, and the last value seen for a duplicate key is kept in the Python dictionary. If ‘strict’, duplicate keys raise an exception instead.

cast(self, target_type=None, safe=None, options=None, memory_pool=None)

Cast scalar value to another data type.

See pyarrow.compute.cast() for usage.

Parameters

target_type : DataType, default None

Type to cast scalar to.

safe : bool, default True

Whether to check for conversion errors such as overflow.

options : CastOptions, default None

Additional checks passed by CastOptions.

memory_pool : MemoryPool, optional

Memory pool to use for allocations during function execution.

Returns

scalar : A Scalar of the given target data type.

equals(self, Scalar other)

Parameters

other : pyarrow.Scalar

Returns

bool

static from_storage(BaseExtensionType typ, value)

Construct ExtensionScalar from type and storage value.

Parameters

typ : DataType

The extension type for the result scalar.

value : object

The storage value for the result scalar.

Returns

ext_scalar : ExtensionScalar

is_valid

Holds a valid (non-null) value.

type

Data type of the Scalar object.

validate(self, *, full=False)

Perform validation checks. An exception is raised if validation fails.

By default only cheap validation checks are run. Pass full=True for thorough validation checks (potentially O(n)).

Parameters

full : bool, default False

If True, run expensive checks; otherwise run cheap checks only.

Raises

ArrowInvalid

value

Return storage value as a scalar.

class pymongoarrow.types.BinaryType(subtype)

Initialize an extension type instance.

This should be called at the end of the subclass’ __init__ method.

bit_width

The bit width of the extension type.

byte_width

The byte width of the extension type.

equals(self, other, *, check_metadata=False)

Return true if type is equivalent to passed value.

Parameters

other : DataType or string convertible to DataType

check_metadata : bool

Whether nested Field metadata equality should be checked as well.

Returns

is_equal : bool

Examples

>>> import pyarrow as pa
>>> pa.int64().equals(pa.string())
False
>>> pa.int64().equals(pa.int64())
True

extension_name

The extension type name.

field(self, i) → Field

Parameters

i : int

Returns

pyarrow.Field

has_variadic_buffers

If True, the number of expected buffers is only lower-bounded by num_buffers.

Examples

>>> import pyarrow as pa
>>> pa.int64().has_variadic_buffers
False
>>> pa.string_view().has_variadic_buffers
True

num_buffers

Number of data buffers required to construct Array type excluding children.

Examples

>>> import pyarrow as pa
>>> pa.int64().num_buffers
2
>>> pa.string().num_buffers
3

num_fields

The number of child fields.

Examples

>>> import pyarrow as pa
>>> pa.int64()
DataType(int64)
>>> pa.int64().num_fields
0
>>> pa.list_(pa.string())
ListType(list<item: string>)
>>> pa.list_(pa.string()).num_fields
1
>>> struct = pa.struct({'x': pa.int32(), 'y': pa.string()})
>>> struct.num_fields
2

storage_type

The underlying storage type.

to_pandas_dtype(self)

Return the equivalent NumPy / Pandas dtype.

Examples

>>> import pyarrow as pa
>>> pa.int64().to_pandas_dtype()
<class 'numpy.int64'>

wrap_array(self, storage)

Wrap the given storage array as an extension array.

Parameters

storage : Array or ChunkedArray

Returns

array : Array or ChunkedArray

Extension array wrapping the storage array.

class pymongoarrow.types.CodeScalar

as_py(self, *, maps_as_pydicts=None)

Return this scalar as a Python object.

Parameters

maps_as_pydicts : str, optional, default None

Valid values are None, ‘lossy’, or ‘strict’. The default behavior (None) is to convert Arrow Map arrays to Python association lists (list-of-tuples) in the same order as the Arrow Map, as in [(key1, value1), (key2, value2), …].

If ‘lossy’ or ‘strict’, convert Arrow Map arrays to native Python dicts.

If ‘lossy’, a warning is printed whenever duplicate keys are detected, and the last value seen for a duplicate key is kept in the Python dictionary. If ‘strict’, duplicate keys raise an exception instead.

cast(self, target_type=None, safe=None, options=None, memory_pool=None)

Cast scalar value to another data type.

See pyarrow.compute.cast() for usage.

Parameters

target_type : DataType, default None

Type to cast scalar to.

safe : bool, default True

Whether to check for conversion errors such as overflow.

options : CastOptions, default None

Additional checks passed by CastOptions.

memory_pool : MemoryPool, optional

Memory pool to use for allocations during function execution.

Returns

scalar : A Scalar of the given target data type.

equals(self, Scalar other)

Parameters

other : pyarrow.Scalar

Returns

bool

static from_storage(BaseExtensionType typ, value)

Construct ExtensionScalar from type and storage value.

Parameters

typ : DataType

The extension type for the result scalar.

value : object

The storage value for the result scalar.

Returns

ext_scalar : ExtensionScalar

is_valid

Holds a valid (non-null) value.

type

Data type of the Scalar object.

validate(self, *, full=False)

Perform validation checks. An exception is raised if validation fails.

By default only cheap validation checks are run. Pass full=True for thorough validation checks (potentially O(n)).

Parameters

full : bool, default False

If True, run expensive checks; otherwise run cheap checks only.

Raises

ArrowInvalid

value

Return storage value as a scalar.

class pymongoarrow.types.CodeType

Initialize an extension type instance.

This should be called at the end of the subclass’ __init__ method.

bit_width

The bit width of the extension type.

byte_width

The byte width of the extension type.

equals(self, other, *, check_metadata=False)

Return true if type is equivalent to passed value.

Parameters

other : DataType or string convertible to DataType

check_metadata : bool

Whether nested Field metadata equality should be checked as well.

Returns

is_equal : bool

Examples

>>> import pyarrow as pa
>>> pa.int64().equals(pa.string())
False
>>> pa.int64().equals(pa.int64())
True

extension_name

The extension type name.

field(self, i) → Field

Parameters

i : int

Returns

pyarrow.Field

has_variadic_buffers

If True, the number of expected buffers is only lower-bounded by num_buffers.

Examples

>>> import pyarrow as pa
>>> pa.int64().has_variadic_buffers
False
>>> pa.string_view().has_variadic_buffers
True

num_buffers

Number of data buffers required to construct Array type excluding children.

Examples

>>> import pyarrow as pa
>>> pa.int64().num_buffers
2
>>> pa.string().num_buffers
3

num_fields

The number of child fields.

Examples

>>> import pyarrow as pa
>>> pa.int64()
DataType(int64)
>>> pa.int64().num_fields
0
>>> pa.list_(pa.string())
ListType(list<item: string>)
>>> pa.list_(pa.string()).num_fields
1
>>> struct = pa.struct({'x': pa.int32(), 'y': pa.string()})
>>> struct.num_fields
2

storage_type

The underlying storage type.

to_pandas_dtype(self)

Return the equivalent NumPy / Pandas dtype.

Examples

>>> import pyarrow as pa
>>> pa.int64().to_pandas_dtype()
<class 'numpy.int64'>

wrap_array(self, storage)

Wrap the given storage array as an extension array.

Parameters

storage : Array or ChunkedArray

Returns

array : Array or ChunkedArray

Extension array wrapping the storage array.

class pymongoarrow.types.Decimal128Scalar

as_py(self, *, maps_as_pydicts=None)

Return this scalar as a Python object.

Parameters

maps_as_pydicts : str, optional, default None

Valid values are None, ‘lossy’, or ‘strict’. The default behavior (None) is to convert Arrow Map arrays to Python association lists (list-of-tuples) in the same order as the Arrow Map, as in [(key1, value1), (key2, value2), …].

If ‘lossy’ or ‘strict’, convert Arrow Map arrays to native Python dicts.

If ‘lossy’, a warning is printed whenever duplicate keys are detected, and the last value seen for a duplicate key is kept in the Python dictionary. If ‘strict’, duplicate keys raise an exception instead.

cast(self, target_type=None, safe=None, options=None, memory_pool=None)

Cast scalar value to another data type.

See pyarrow.compute.cast() for usage.

Parameters

target_type : DataType, default None

Type to cast scalar to.

safe : bool, default True

Whether to check for conversion errors such as overflow.

options : CastOptions, default None

Additional checks passed by CastOptions.

memory_pool : MemoryPool, optional

Memory pool to use for allocations during function execution.

Returns

scalar : A Scalar of the given target data type.

equals(self, Scalar other)

Parameters

other : pyarrow.Scalar

Returns

bool

static from_storage(BaseExtensionType typ, value)

Construct ExtensionScalar from type and storage value.

Parameters

typ : DataType

The extension type for the result scalar.

value : object

The storage value for the result scalar.

Returns

ext_scalar : ExtensionScalar

is_valid

Holds a valid (non-null) value.

type

Data type of the Scalar object.

validate(self, *, full=False)

Perform validation checks. An exception is raised if validation fails.

By default only cheap validation checks are run. Pass full=True for thorough validation checks (potentially O(n)).

Parameters

full : bool, default False

If True, run expensive checks; otherwise run cheap checks only.

Raises

ArrowInvalid

value

Return storage value as a scalar.

class pymongoarrow.types.Decimal128Type

Initialize an extension type instance.

This should be called at the end of the subclass’ __init__ method.

bit_width

The bit width of the extension type.

byte_width

The byte width of the extension type.

equals(self, other, *, check_metadata=False)

Return true if type is equivalent to passed value.

Parameters

other : DataType or string convertible to DataType

check_metadata : bool

Whether nested Field metadata equality should be checked as well.

Returns

is_equal : bool

Examples

>>> import pyarrow as pa
>>> pa.int64().equals(pa.string())
False
>>> pa.int64().equals(pa.int64())
True

extension_name

The extension type name.

field(self, i) → Field

Parameters

i : int

Returns

pyarrow.Field

has_variadic_buffers

If True, the number of expected buffers is only lower-bounded by num_buffers.

Examples

>>> import pyarrow as pa
>>> pa.int64().has_variadic_buffers
False
>>> pa.string_view().has_variadic_buffers
True

num_buffers

Number of data buffers required to construct Array type excluding children.

Examples

>>> import pyarrow as pa
>>> pa.int64().num_buffers
2
>>> pa.string().num_buffers
3

num_fields

The number of child fields.

Examples

>>> import pyarrow as pa
>>> pa.int64()
DataType(int64)
>>> pa.int64().num_fields
0
>>> pa.list_(pa.string())
ListType(list<item: string>)
>>> pa.list_(pa.string()).num_fields
1
>>> struct = pa.struct({'x': pa.int32(), 'y': pa.string()})
>>> struct.num_fields
2

storage_type

The underlying storage type.

to_pandas_dtype(self)

Return the equivalent NumPy / Pandas dtype.

Examples

>>> import pyarrow as pa
>>> pa.int64().to_pandas_dtype()
<class 'numpy.int64'>

wrap_array(self, storage)

Wrap the given storage array as an extension array.

Parameters

storage : Array or ChunkedArray

Returns

array : Array or ChunkedArray

Extension array wrapping the storage array.

class pymongoarrow.types.ObjectIdScalar

as_py(self, *, maps_as_pydicts=None)

Return this scalar as a Python object.

Parameters

maps_as_pydicts : str, optional, default None

Valid values are None, ‘lossy’, or ‘strict’. The default behavior (None) is to convert Arrow Map arrays to Python association lists (list-of-tuples) in the same order as the Arrow Map, as in [(key1, value1), (key2, value2), …].

If ‘lossy’ or ‘strict’, convert Arrow Map arrays to native Python dicts.

If ‘lossy’, a warning is printed whenever duplicate keys are detected, and the last value seen for a duplicate key is kept in the Python dictionary. If ‘strict’, duplicate keys raise an exception instead.

cast(self, target_type=None, safe=None, options=None, memory_pool=None)

Cast scalar value to another data type.

See pyarrow.compute.cast() for usage.

Parameters

target_type : DataType, default None

Type to cast scalar to.

safe : bool, default True

Whether to check for conversion errors such as overflow.

options : CastOptions, default None

Additional checks passed by CastOptions.

memory_pool : MemoryPool, optional

Memory pool to use for allocations during function execution.

Returns

scalar : A Scalar of the given target data type.

equals(self, Scalar other)

Parameters

other : pyarrow.Scalar

Returns

bool

static from_storage(BaseExtensionType typ, value)

Construct ExtensionScalar from type and storage value.

Parameters

typ : DataType

The extension type for the result scalar.

value : object

The storage value for the result scalar.

Returns

ext_scalar : ExtensionScalar

is_valid

Holds a valid (non-null) value.

type

Data type of the Scalar object.

validate(self, *, full=False)

Perform validation checks. An exception is raised if validation fails.

By default only cheap validation checks are run. Pass full=True for thorough validation checks (potentially O(n)).

Parameters

full : bool, default False

If True, run expensive checks; otherwise run cheap checks only.

Raises

ArrowInvalid

value

Return storage value as a scalar.

class pymongoarrow.types.ObjectIdType

Initialize an extension type instance.

This should be called at the end of the subclass’ __init__ method.

bit_width

The bit width of the extension type.

byte_width

The byte width of the extension type.

equals(self, other, *, check_metadata=False)

Return true if type is equivalent to passed value.

Parameters

other : DataType or string convertible to DataType

check_metadata : bool

Whether nested Field metadata equality should be checked as well.

Returns

is_equal : bool

Examples

>>> import pyarrow as pa
>>> pa.int64().equals(pa.string())
False
>>> pa.int64().equals(pa.int64())
True

extension_name

The extension type name.

field(self, i) → Field

Parameters

i : int

Returns

pyarrow.Field

has_variadic_buffers

If True, the number of expected buffers is only lower-bounded by num_buffers.

Examples

>>> import pyarrow as pa
>>> pa.int64().has_variadic_buffers
False
>>> pa.string_view().has_variadic_buffers
True

num_buffers

Number of data buffers required to construct Array type excluding children.

Examples

>>> import pyarrow as pa
>>> pa.int64().num_buffers
2
>>> pa.string().num_buffers
3

num_fields

The number of child fields.

Examples

>>> import pyarrow as pa
>>> pa.int64()
DataType(int64)
>>> pa.int64().num_fields
0
>>> pa.list_(pa.string())
ListType(list<item: string>)
>>> pa.list_(pa.string()).num_fields
1
>>> struct = pa.struct({'x': pa.int32(), 'y': pa.string()})
>>> struct.num_fields
2

storage_type

The underlying storage type.

to_pandas_dtype(self)

Return the equivalent NumPy / Pandas dtype.

Examples

>>> import pyarrow as pa
>>> pa.int64().to_pandas_dtype()
<class 'numpy.int64'>

wrap_array(self, storage)

Wrap the given storage array as an extension array.

Parameters

storage : Array or ChunkedArray

Returns

array : Array or ChunkedArray

Extension array wrapping the storage array.

pymongoarrow.types.dtype

alias of Decimal128Type