api – PyMongoArrow APIs

class pymongoarrow.api.Schema(schema)

Create a Schema instance from a mapping or an iterable.

Parameters:
  • schema: A mapping.

classmethod from_arrow(aschema)

Create a Schema instance from a PyArrow Schema.

Parameters:
  • aschema: PyArrow Schema

to_arrow()

Output the Schema as an instance of pyarrow.Schema.
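
The Schema round trip described above can be sketched as follows. This is only an illustration: the field names are hypothetical, and the imports sit inside the functions so that the snippet loads even where PyMongoArrow is not installed.

```python
from datetime import datetime

# Hypothetical field layout: field name -> Python type.
FIELD_TYPES = {"_id": int, "amount": float, "last_updated": datetime}


def build_schema(field_types=FIELD_TYPES):
    """Create a pymongoarrow Schema from a mapping of names to types."""
    # Imported lazily so this module loads without pymongoarrow installed.
    from pymongoarrow.api import Schema

    return Schema(field_types)


def round_trip(schema):
    """Convert a Schema to a pyarrow.Schema and back again."""
    from pymongoarrow.api import Schema

    arrow_schema = schema.to_arrow()        # a pyarrow.Schema
    return Schema.from_arrow(arrow_schema)  # a pymongoarrow Schema again
```

Calling round_trip(build_schema()) should yield a Schema describing the same three fields.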

pymongoarrow.api.aggregate_arrow_all(collection, pipeline, *, schema=None, **kwargs)

Method that returns the results of an aggregation pipeline as a pyarrow.Table instance.

Parameters:
  • collection: Instance of Collection against which to run the aggregate operation.

  • pipeline: A list of aggregation pipeline stages.

  • schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.

Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.

Returns:

An instance of pyarrow.Table.
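
A minimal sketch of a call to aggregate_arrow_all, assuming a collection whose documents carry hypothetical customer and amount fields. The allowDiskUse keyword is included only to show an extra keyword argument being forwarded to the underlying aggregate operation.

```python
def totals_pipeline(min_amount):
    """Build a pipeline that sums `amount` per customer.

    The `customer` and `amount` field names are hypothetical.
    """
    return [
        {"$match": {"amount": {"$gte": min_amount}}},
        {"$group": {"_id": "$customer", "total": {"$sum": "$amount"}}},
    ]


def aggregate_totals(collection, min_amount=0):
    """Run the pipeline and return the result as a pyarrow.Table."""
    # Imported lazily so this module loads without pymongoarrow installed.
    from pymongoarrow.api import aggregate_arrow_all

    # No schema is passed, so it is inferred from the first result document.
    return aggregate_arrow_all(
        collection, totals_pipeline(min_amount), allowDiskUse=True
    )
```

Here collection would be a pymongo Collection obtained from a connected MongoClient.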

pymongoarrow.api.aggregate_numpy_all(collection, pipeline, *, schema=None, **kwargs)

Method that returns the results of an aggregation pipeline as a dict instance whose keys are field names and values are ndarray instances bearing the appropriate dtype.

Parameters:
  • collection: Instance of Collection against which to run the aggregate operation.

  • pipeline: A list of aggregation pipeline stages.

  • schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.

Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.

This method attempts to create each NumPy array as a view on the Arrow data corresponding to each field in the result set. When this is not possible, the underlying data is copied into a new NumPy array. See pyarrow.Array.to_numpy() for more information.

NumPy arrays returned by this method that are views on Arrow data are not writable. Users seeking to modify such arrays must first create an editable copy using numpy.copy().

Returns:

An instance of dict.
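
Because arrays that are views on Arrow data are read-only, a caller who needs to modify the results might copy them first. The helper below is a hypothetical sketch of that step (writable_copies is not part of PyMongoArrow); it copies only the arrays that are not already writable.

```python
def writable_copies(arrays):
    """Return a dict in which every read-only array is replaced by a copy.

    `arrays` is a dict of field name -> numpy.ndarray, such as the one
    returned by aggregate_numpy_all or find_numpy_all.
    """
    # Imported lazily so this module loads without NumPy installed.
    import numpy as np

    return {
        name: arr if arr.flags.writeable else np.copy(arr)
        for name, arr in arrays.items()
    }
```

Arrays that were already copied out of Arrow memory are passed through unchanged, which avoids a second copy.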

pymongoarrow.api.aggregate_pandas_all(collection, pipeline, *, schema=None, **kwargs)

Method that returns the results of an aggregation pipeline as a pandas.DataFrame instance.

Parameters:
  • collection: Instance of Collection against which to run the aggregate operation.

  • pipeline: A list of aggregation pipeline stages.

  • schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.

Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.

Returns:

An instance of pandas.DataFrame.

pymongoarrow.api.aggregate_polars_all(collection, pipeline, *, schema=None, **kwargs)

Method that returns the results of an aggregation pipeline as a polars.DataFrame instance.

Parameters:
  • collection: Instance of Collection against which to run the aggregate operation.

  • pipeline: A list of aggregation pipeline stages.

  • schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.

Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.

Returns:

An instance of polars.DataFrame.

pymongoarrow.api.find_arrow_all(collection, query, *, schema=None, **kwargs)

Method that returns the results of a find query as a pyarrow.Table instance.

Parameters:
  • collection: Instance of Collection against which to run the find operation.

  • query: A mapping containing the query to use for the find operation.

  • schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.

Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.

Returns:

An instance of pyarrow.Table.
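
A minimal sketch of find_arrow_all with an explicit schema, so that no inference from the first document takes place. The field names and the integer _id are assumptions made for illustration; limit is shown as one example of a keyword argument forwarded to the underlying find operation.

```python
from datetime import datetime


def updated_since(since):
    """Build a find filter for documents updated at or after `since`."""
    return {"last_updated": {"$gte": since}}


def find_recent(collection, since, limit=0):
    """Return matching documents as a pyarrow.Table.

    A `limit` of 0 means no limit, as in pymongo's find.
    """
    # Imported lazily so this module loads without pymongoarrow installed.
    from pymongoarrow.api import Schema, find_arrow_all

    # Hypothetical field layout for the queried collection.
    schema = Schema({"_id": int, "amount": float, "last_updated": datetime})
    return find_arrow_all(
        collection, updated_since(since), schema=schema, limit=limit
    )
```

The same filter and schema could be passed unchanged to find_pandas_all, find_numpy_all, or find_polars_all, since all four share one signature.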

pymongoarrow.api.find_numpy_all(collection, query, *, schema=None, **kwargs)

Method that returns the results of a find query as a dict instance whose keys are field names and values are ndarray instances bearing the appropriate dtype.

Parameters:
  • collection: Instance of Collection against which to run the find operation.

  • query: A mapping containing the query to use for the find operation.

  • schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.

Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.

This method attempts to create each NumPy array as a view on the Arrow data corresponding to each field in the result set. When this is not possible, the underlying data is copied into a new NumPy array. See pyarrow.Array.to_numpy() for more information.

NumPy arrays returned by this method that are views on Arrow data are not writable. Users seeking to modify such arrays must first create an editable copy using numpy.copy().

Returns:

An instance of dict.

pymongoarrow.api.find_pandas_all(collection, query, *, schema=None, **kwargs)

Method that returns the results of a find query as a pandas.DataFrame instance.

Parameters:
  • collection: Instance of Collection against which to run the find operation.

  • query: A mapping containing the query to use for the find operation.

  • schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.

Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.

Returns:

An instance of pandas.DataFrame.

pymongoarrow.api.find_polars_all(collection, query, *, schema=None, **kwargs)

Method that returns the results of a find query as a polars.DataFrame instance.

Parameters:
  • collection: Instance of Collection against which to run the find operation.

  • query: A mapping containing the query to use for the find operation.

  • schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.

Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.

Returns:

An instance of polars.DataFrame.

New in version 1.3.

pymongoarrow.api.write(collection, tabular)

Write data from tabular into the given MongoDB collection.

Parameters:
  • collection: Instance of Collection against which to run the write operation.

  • tabular: A tabular data store to use for the write operation.

Returns:

An instance of result.ArrowWriteResult.
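
A minimal sketch of a write call, assuming the tabular argument is a pyarrow.Table built from plain row dictionaries. The sample rows are hypothetical, and the inserted_count attribute read from the result is an assumption about ArrowWriteResult rather than something stated above.

```python
# Hypothetical rows; every row carries the same keys.
SAMPLE_ROWS = [
    {"_id": 1, "amount": 21.0},
    {"_id": 2, "amount": 16.5},
]


def write_rows(collection, rows=SAMPLE_ROWS):
    """Write row dicts to `collection` and return the number inserted."""
    # Imported lazily so this module loads without these packages installed.
    import pyarrow as pa
    from pymongoarrow.api import write

    # Table.from_pylist needs a reasonably recent pyarrow release.
    table = pa.Table.from_pylist(rows)
    result = write(collection, table)
    # Assumption: ArrowWriteResult exposes an `inserted_count` attribute.
    return result.inserted_count
```

Here collection would again be a pymongo Collection from a connected MongoClient.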