api – PyMongoArrow APIs
- class pymongoarrow.api.Schema(schema)
A mapping of field names to data types.
To create a schema, provide its constructor a mapping of field names to their expected types, e.g.:
schema1 = Schema({'field_1': int, 'field_2': float})
Each key in schema is a field name, and its corresponding value is the expected type of the data contained in the named field.
For more examples, see Schema Examples.
Data types can be specified as pyarrow type instances (e.g. an instance of pyarrow.int64), bson types (e.g. bson.Int64), or Python type-identifiers (e.g. int, float). To see a complete list of supported data types and their corresponding type-identifiers, see Data Types.
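The three ways of specifying types described above can be sketched as follows. This is a minimal illustration, not part of the library's own examples: the field names and helper functions are hypothetical, and the third-party imports are deferred into the function body so that the sketch can be loaded where pymongoarrow is not installed.

```python
def python_type_fields():
    """Field mapping that uses plain Python type-identifiers."""
    return {"field_1": int, "field_2": float}


def build_schemas():
    """Build schemas from Python, pyarrow, and bson type specifications.

    Imports are deferred so this sketch loads even where pymongoarrow,
    pyarrow, and bson are not installed.
    """
    import bson
    import pyarrow as pa
    from pymongoarrow.api import Schema

    # Python type-identifiers.
    from_python = Schema(python_type_fields())
    # pyarrow type instances.
    from_pyarrow = Schema({"field_1": pa.int64(), "field_2": pa.float64()})
    # A bson type mixed with a Python type-identifier.
    from_bson = Schema({"field_1": bson.Int64, "field_2": float})
    return from_python, from_pyarrow, from_bson
```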
- pymongoarrow.api.aggregate_arrow_all(collection, pipeline, *, schema=None, **kwargs)
Method that returns the results of an aggregation pipeline as a pyarrow.Table instance.
- Parameters:
collection: Instance of Collection against which to run the aggregate operation.
pipeline: A list of aggregation pipeline stages.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.
Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.
- Returns:
An instance of pyarrow.Table.
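As a rough sketch of how a call might look, the helper below builds a two-stage pipeline and hands it to aggregate_arrow_all. The field names (amount, customer) and both helper functions are hypothetical. No schema is passed, so the schema is inferred as described above, and allowDiskUse is shown only as one example of a keyword argument forwarded to the underlying aggregate operation.

```python
def totals_pipeline(min_amount):
    """Pipeline: keep documents at or above a threshold, then sum per customer."""
    return [
        {"$match": {"amount": {"$gte": min_amount}}},
        {"$group": {"_id": "$customer", "total": {"$sum": "$amount"}}},
    ]


def totals_as_table(collection, min_amount=0):
    """Run the pipeline against a pymongo Collection and return a pyarrow.Table."""
    # Deferred import so the sketch loads without pymongoarrow installed.
    from pymongoarrow.api import aggregate_arrow_all

    return aggregate_arrow_all(
        collection,
        totals_pipeline(min_amount),
        allowDiskUse=True,  # forwarded to Collection.aggregate
    )
```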
- pymongoarrow.api.aggregate_numpy_all(collection, pipeline, *, schema=None, **kwargs)
Method that returns the results of an aggregation pipeline as a dict instance whose keys are field names and values are ndarray instances bearing the appropriate dtype.
- Parameters:
collection: Instance of Collection against which to run the aggregate operation.
pipeline: A list of aggregation pipeline stages.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.
Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.
This method attempts to create each NumPy array as a view on the Arrow data corresponding to each field in the result set. When this is not possible, the underlying data is copied into a new NumPy array. See pyarrow.Array.to_numpy() for more information.
NumPy arrays returned by this method that are views on Arrow data are not writable. Users seeking to modify such arrays must first create an editable copy using numpy.copy().
- Returns:
An instance of dict.
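Because arrays that are views on Arrow data are read-only, one possible pattern is to copy every array in the returned dict before modifying it. The sketch below assumes nothing beyond numpy.copy(); both helper names are hypothetical, and the copy_fn parameter exists only so the helper can be exercised without NumPy installed.

```python
def writable_copies(arrays, copy_fn=None):
    """Return a new dict holding an editable copy of every array.

    copy_fn defaults to numpy.copy, imported lazily.
    """
    if copy_fn is None:
        import numpy as np

        copy_fn = np.copy
    return {name: copy_fn(values) for name, values in arrays.items()}


def aggregate_editable(collection, pipeline):
    """Run the pipeline and return arrays that are safe to modify in place."""
    # Deferred import so the sketch loads without pymongoarrow installed.
    from pymongoarrow.api import aggregate_numpy_all

    arrays = aggregate_numpy_all(collection, pipeline)
    return writable_copies(arrays)
```

Copying unconditionally costs extra memory for arrays that were already copies; it is used here only to keep the sketch short.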
- pymongoarrow.api.aggregate_pandas_all(collection, pipeline, *, schema=None, **kwargs)
Method that returns the results of an aggregation pipeline as a pandas.DataFrame instance.
- Parameters:
collection: Instance of Collection against which to run the aggregate operation.
pipeline: A list of aggregation pipeline stages.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.
Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.
- Returns:
An instance of pandas.DataFrame.
- pymongoarrow.api.find_arrow_all(collection, query, *, schema=None, **kwargs)
Method that returns the results of a find query as a pyarrow.Table instance.
- Parameters:
collection: Instance of Collection against which to run the find operation.
query: A mapping containing the query to use for the find operation.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.
Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.
- Returns:
An instance of pyarrow.Table.
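A find query with an explicit schema might be sketched like this. The field names (status, amount) and the helper functions are hypothetical; limit is used as an example of a keyword argument forwarded to the underlying find operation.

```python
def shipped_orders_query(status):
    """Query mapping: documents with the given status and a positive amount."""
    return {"status": status, "amount": {"$gt": 0}}


def find_orders_table(collection, status="shipped", limit=100):
    """Return matching documents as a pyarrow.Table with two typed columns."""
    # Deferred import so the sketch loads without pymongoarrow installed.
    from pymongoarrow.api import Schema, find_arrow_all

    schema = Schema({"status": str, "amount": float})
    return find_arrow_all(
        collection,
        shipped_orders_query(status),
        schema=schema,
        limit=limit,  # forwarded to Collection.find
    )
```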
- pymongoarrow.api.find_numpy_all(collection, query, *, schema=None, **kwargs)
Method that returns the results of a find query as a dict instance whose keys are field names and values are ndarray instances bearing the appropriate dtype.
- Parameters:
collection: Instance of Collection against which to run the find operation.
query: A mapping containing the query to use for the find operation.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.
Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.
This method attempts to create each NumPy array as a view on the Arrow data corresponding to each field in the result set. When this is not possible, the underlying data is copied into a new NumPy array. See pyarrow.Array.to_numpy() for more information.
NumPy arrays returned by this method that are views on Arrow data are not writable. Users seeking to modify such arrays must first create an editable copy using numpy.copy().
- Returns:
An instance of dict.
- pymongoarrow.api.find_pandas_all(collection, query, *, schema=None, **kwargs)
Method that returns the results of a find query as a pandas.DataFrame instance.
- Parameters:
collection: Instance of Collection against which to run the find operation.
query: A mapping containing the query to use for the find operation.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.
Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.
- Returns:
An instance of pandas.DataFrame.
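Since the result is an ordinary pandas.DataFrame, it can be passed straight into pandas operations. The following sketch assumes hypothetical customer and amount fields and sums the amounts per customer; the helper names are illustrative only.

```python
def amount_range_query(low, high):
    """Query mapping: documents whose amount lies in [low, high)."""
    return {"amount": {"$gte": low, "$lt": high}}


def totals_per_customer(collection, low, high):
    """Load matching documents into a DataFrame and sum amounts per customer."""
    # Deferred import so the sketch loads without pymongoarrow installed.
    from pymongoarrow.api import Schema, find_pandas_all

    schema = Schema({"customer": str, "amount": float})
    frame = find_pandas_all(collection, amount_range_query(low, high), schema=schema)
    # Plain pandas from here on.
    return frame.groupby("customer")["amount"].sum()
```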
- pymongoarrow.api.write(collection, tabular)
Write data from tabular into the given MongoDB collection.
- Parameters:
collection: Instance of Collection against which to run the operation.
tabular: A tabular data store to use for the write operation.
- Returns:
An instance of result.ArrowWriteResult.
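A write call could look roughly like the sketch below, which assumes that a pyarrow.Table is an accepted tabular data store. The column names and helper functions are hypothetical, and the returned result object is passed back to the caller without inspecting it.

```python
def sample_columns():
    """Column-oriented sample data: equal-length lists keyed by field name."""
    return {"name": ["a", "b", "c"], "qty": [1, 2, 3]}


def write_sample(collection):
    """Build a pyarrow.Table from the sample columns and write it to MongoDB."""
    # Deferred imports so the sketch loads without pyarrow or pymongoarrow.
    import pyarrow as pa
    from pymongoarrow.api import write

    table = pa.table(sample_columns())
    # Returns the write result object described above.
    return write(collection, table)
```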