`api` – PyMongoArrow APIs

class pymongoarrow.api.Schema(schema)

A mapping of field names to data types.

To create a schema, pass its constructor a mapping of field names to their expected types, e.g.:

schema1 = Schema({'field_1': int, 'field_2': float})

Each key in schema is a field name and its corresponding value is the expected type of the data contained in the named field.

For more examples, see Schema Examples.

Data types can be specified as pyarrow type instances (e.g. the instance returned by pyarrow.int64()), bson types (e.g. bson.Int64), or Python type identifiers (e.g. int, float). For a complete list of supported data types and their corresponding type identifiers, see Data Types.
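As a minimal sketch of the three ways of specifying types described above, the following builds three schemas for the same two fields. The field names `_id` and `amount` and the helper `build_schemas` are hypothetical, and the imports are kept inside the function only so that the snippet loads without the optional libraries installed.

```python
# Hypothetical field names; int and float are the Python type identifiers
# mentioned in the text above.
PYTHON_TYPES = {"_id": int, "amount": float}


def build_schemas():
    """Return three Schema objects, one per style of type specification."""
    # Local imports: pyarrow, bson and pymongoarrow are only needed on call.
    import pyarrow as pa
    from bson import Int64
    from pymongoarrow.api import Schema

    by_python_type = Schema(PYTHON_TYPES)
    by_arrow_type = Schema({"_id": pa.int64(), "amount": pa.float64()})
    by_bson_type = Schema({"_id": Int64, "amount": float})
    return by_python_type, by_arrow_type, by_bson_type
```

Any of the returned objects can then be passed as the `schema` argument of the functions documented below.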

pymongoarrow.api.aggregate_arrow_all(collection, pipeline, *, schema=None, **kwargs)

Method that returns the results of an aggregation pipeline as a pyarrow.Table instance.

Parameters

collection: Instance of Collection against which to run the aggregate operation.
pipeline: A list of aggregation pipeline stages.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.

Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.

Returns: An instance of pyarrow.Table.
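The call can be sketched as follows, assuming a collection of order documents with hypothetical `customer_id` and `amount` fields. The helpers `totals_pipeline` and `totals_as_table` are illustrative names, not part of PyMongoArrow.

```python
def totals_pipeline(min_amount):
    """Return aggregation stages that sum `amount` per `customer_id`."""
    return [
        {"$match": {"amount": {"$gte": min_amount}}},
        {"$group": {"_id": "$customer_id", "total": {"$sum": "$amount"}}},
    ]


def totals_as_table(collection, min_amount=0):
    """Run the pipeline and return the result as a pyarrow.Table."""
    # Local import so the pipeline builder is usable without pymongoarrow.
    from pymongoarrow.api import Schema, aggregate_arrow_all

    # The schema describes the documents produced by the final stage
    # (assumes customer_id values are integers).
    schema = Schema({"_id": int, "total": float})
    return aggregate_arrow_all(collection, totals_pipeline(min_amount), schema=schema)
```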

pymongoarrow.api.aggregate_numpy_all(collection, pipeline, *, schema=None, **kwargs)

Method that returns the results of an aggregation pipeline as a dict instance whose keys are field names and values are ndarray instances bearing the appropriate dtype.

Parameters

collection: Instance of Collection against which to run the aggregate operation.
pipeline: A list of aggregation pipeline stages.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.

Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.

This method attempts to create each NumPy array as a view on the Arrow data corresponding to each field in the result set. When this is not possible, the underlying data is copied into a new NumPy array. See pyarrow.Array.to_numpy() for more information.

NumPy arrays returned by this method that are views on Arrow data are not writable. Users seeking to modify such arrays must first create an editable copy using numpy.copy().

Returns: An instance of dict.

pymongoarrow.api.aggregate_pandas_all(collection, pipeline, *, schema=None, **kwargs)

Method that returns the results of an aggregation pipeline as a pandas.DataFrame instance.

Parameters

collection: Instance of Collection against which to run the aggregate operation.
pipeline: A list of aggregation pipeline stages.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.

Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.

Returns: An instance of pandas.DataFrame.
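To illustrate how additional keyword arguments are forwarded, here is a hedged sketch that passes PyMongo's `allowDiskUse` aggregate option through `**kwargs`. The field names and the helpers `top_customers_pipeline` and `top_customers_frame` are assumptions made for the example.

```python
def top_customers_pipeline(limit):
    """Return stages that rank customers by total order amount."""
    return [
        {"$group": {"_id": "$customer_id", "total": {"$sum": "$amount"}}},
        {"$sort": {"total": -1}},
        {"$limit": limit},
    ]


def top_customers_frame(collection, limit=10):
    """Return the top customers as a pandas.DataFrame."""
    # Local import so the pipeline builder is usable without pymongoarrow.
    from pymongoarrow.api import Schema, aggregate_pandas_all

    schema = Schema({"_id": int, "total": float})
    # allowDiskUse is not interpreted here; it is handed to the underlying
    # aggregate operation unchanged.
    return aggregate_pandas_all(
        collection,
        top_customers_pipeline(limit),
        schema=schema,
        allowDiskUse=True,
    )
```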

pymongoarrow.api.find_arrow_all(collection, query, *, schema=None, **kwargs)

Method that returns the results of a find query as a pyarrow.Table instance.

Parameters

collection: Instance of Collection against which to run the find operation.
query: A mapping containing the query to use for the find operation.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.

Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.

Returns: An instance of pyarrow.Table.
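A minimal sketch of a find call with an explicit schema follows. The `amount` field and the helpers `amount_at_least` and `largest_orders` are hypothetical. `sort` and `limit` are ordinary PyMongo find options, shown here on the assumption that they are forwarded through `**kwargs` as described above.

```python
def amount_at_least(min_amount):
    """Return a query mapping that matches documents by minimum amount."""
    return {"amount": {"$gte": min_amount}}


def largest_orders(collection, min_amount, max_rows=100):
    """Return up to `max_rows` matching orders as a pyarrow.Table."""
    # Local import so the query builder is usable without pymongoarrow.
    from pymongoarrow.api import Schema, find_arrow_all

    schema = Schema({"_id": int, "amount": float})
    return find_arrow_all(
        collection,
        amount_at_least(min_amount),
        schema=schema,
        sort=[("amount", -1)],  # largest amounts first
        limit=max_rows,
    )
```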

pymongoarrow.api.find_numpy_all(collection, query, *, schema=None, **kwargs)

Method that returns the results of a find query as a dict instance whose keys are field names and values are ndarray instances bearing the appropriate dtype.

Parameters

collection: Instance of Collection against which to run the find operation.
query: A mapping containing the query to use for the find operation.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.

Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.

This method attempts to create each NumPy array as a view on the Arrow data corresponding to each field in the result set. When this is not possible, the underlying data is copied into a new NumPy array. See pyarrow.Array.to_numpy() for more information.

NumPy arrays returned by this method that are views on Arrow data are not writable. Users seeking to modify such arrays must first create an editable copy using numpy.copy().

Returns: An instance of dict.
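Because some of the returned arrays may be read-only views, code that needs to modify them can copy only those arrays. The sketch below is one way to do that. `make_writable` and `find_numpy_writable` are hypothetical helper names, and the check relies on NumPy's standard `flags.writeable` attribute.

```python
import numpy as np


def make_writable(arrays):
    """Return a dict in which read-only arrays are replaced by writable copies."""
    return {
        name: arr if arr.flags.writeable else np.copy(arr)
        for name, arr in arrays.items()
    }


def find_numpy_writable(collection, query, schema=None):
    """Run find_numpy_all and make every returned array safe to modify."""
    # Local import so make_writable is usable without pymongoarrow.
    from pymongoarrow.api import find_numpy_all

    return make_writable(find_numpy_all(collection, query, schema=schema))
```

Arrays that are already writable are returned as-is, so no extra memory is used for them.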

pymongoarrow.api.find_pandas_all(collection, query, *, schema=None, **kwargs)

Method that returns the results of a find query as a pandas.DataFrame instance.

Parameters

collection: Instance of Collection against which to run the find operation.
query: A mapping containing the query to use for the find operation.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.

Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.

Returns: An instance of pandas.DataFrame.

pymongoarrow.api.write(collection, tabular)

Write data from tabular into the given MongoDB collection.

Parameters

collection: Instance of Collection against which to run the operation.
tabular: A tabular data store to use for the write operation.

Returns

An instance of result.ArrowWriteResult.
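A write can be sketched as follows, assuming that a pyarrow.Table is an accepted tabular data store. The sample data and the helper `write_sample` are hypothetical.

```python
# Hypothetical column-oriented sample data: each key is a field name and
# each list holds that field's values, one per document.
SAMPLE_COLUMNS = {"_id": [1, 2, 3], "amount": [9.5, 20.0, 3.25]}


def write_sample(collection, columns=SAMPLE_COLUMNS):
    """Write the sample columns to `collection` and return the write result."""
    # Local imports so the sample data is usable without the libraries.
    import pyarrow as pa
    from pymongoarrow.api import write

    table = pa.table(columns)  # build a pyarrow.Table from the mapping
    return write(collection, table)
```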
