api – PyMongoArrow APIs
- class pymongoarrow.api.Schema(schema)
A mapping of field names to data types.
To create a schema, provide its constructor a mapping of field names to their expected types, e.g.:
schema1 = Schema({'field_1': int, 'field_2': float})
Each key in schema is a field name and its corresponding value is the expected type of the data contained in the named field.
Data types can be specified as pyarrow type instances (e.g. an instance of pyarrow.int64), bson types (e.g. bson.Int64), or Python type-identifiers (e.g. int, float). To see a complete list of supported data types and their corresponding type-identifiers, see Supported Types.
- pymongoarrow.api.aggregate_arrow_all(collection, pipeline, *, schema, **kwargs)
Method that returns the results of an aggregation pipeline as a pyarrow.Table instance.
- Parameters
collection: Instance of Collection against which to run the aggregate operation.
pipeline: A list of aggregation pipeline stages.
schema: Instance of Schema.
Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.
- Returns
An instance of pyarrow.Table.
- pymongoarrow.api.aggregate_numpy_all(collection, pipeline, *, schema, **kwargs)
Method that returns the results of an aggregation pipeline as a dict instance whose keys are field names and values are ndarray instances bearing the appropriate dtype.
- Parameters
collection: Instance of Collection against which to run the aggregate operation.
pipeline: A list of aggregation pipeline stages.
schema: Instance of Schema.
Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.
This method attempts to create each NumPy array as a view on the Arrow data corresponding to each field in the result set. When this is not possible, the underlying data is copied into a new NumPy array. See pyarrow.Array.to_numpy() for more information.
NumPy arrays returned by this method that are views on Arrow data are not writable. Users seeking to modify such arrays must first create an editable copy using numpy.copy().
- Returns
An instance of dict.
- pymongoarrow.api.aggregate_pandas_all(collection, pipeline, *, schema, **kwargs)
Method that returns the results of an aggregation pipeline as a pandas.DataFrame instance.
- Parameters
collection: Instance of Collection against which to run the aggregate operation.
pipeline: A list of aggregation pipeline stages.
schema: Instance of Schema.
Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.
- Returns
An instance of pandas.DataFrame.
- pymongoarrow.api.find_arrow_all(collection, query, *, schema, **kwargs)
Method that returns the results of a find query as a pyarrow.Table instance.
- Parameters
collection: Instance of Collection against which to run the find operation.
query: A mapping containing the query to use for the find operation.
schema: Instance of Schema.
Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.
- Returns
An instance of pyarrow.Table.
- pymongoarrow.api.find_numpy_all(collection, query, *, schema, **kwargs)
Method that returns the results of a find query as a dict instance whose keys are field names and values are ndarray instances bearing the appropriate dtype.
- Parameters
collection: Instance of Collection against which to run the find operation.
query: A mapping containing the query to use for the find operation.
schema: Instance of Schema.
Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.
This method attempts to create each NumPy array as a view on the Arrow data corresponding to each field in the result set. When this is not possible, the underlying data is copied into a new NumPy array. See pyarrow.Array.to_numpy() for more information.
NumPy arrays returned by this method that are views on Arrow data are not writable. Users seeking to modify such arrays must first create an editable copy using numpy.copy().
- Returns
An instance of dict.
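Making the returned arrays editable might look like the sketch below. To stay runnable without a database, it uses a hand-built stand-in for the returned dict: a read-only NumPy array under a hypothetical field name. Real results would be backed by Arrow data instead.

```python
import numpy as np

# Stand-in for the dict returned by find_numpy_all: read-only arrays keyed
# by field name. The field name and values are made up for illustration.
amounts = np.array([10.0, 20.0, 30.0])
amounts.setflags(write=False)
result = {"amount": amounts}

try:
    result["amount"][0] = 99.0
except ValueError as exc:
    print("in-place edit rejected:", exc)

# Copy every column once; the copies own their data and are writable.
editable = {name: np.copy(column) for name, column in result.items()}
editable["amount"][0] = 99.0
```

Copying the whole dict up front keeps the original result untouched, which is convenient when the same query result feeds several computations.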
- pymongoarrow.api.find_pandas_all(collection, query, *, schema, **kwargs)
Method that returns the results of a find query as a pandas.DataFrame instance.
- Parameters
collection: Instance of Collection against which to run the find operation.
query: A mapping containing the query to use for the find operation.
schema: Instance of Schema.
Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.
- Returns
An instance of pandas.DataFrame.
- pymongoarrow.api.write(collection, tabular)
Write data from tabular into the given MongoDB collection.
- Parameters
collection: Instance of Collection against which to run the operation.
tabular: A tabular data store to use for the write operation.
- Returns
An instance of result.ArrowWriteResult.