api – PyMongoArrow APIs
- class pymongoarrow.api.Schema(schema)
A mapping of field names to data types.
To create a schema, provide its constructor a mapping of field names to their expected types, e.g.:
schema1 = Schema({'field_1': int, 'field_2': float})
Each key in schema is a field name, and its corresponding value is the expected type of the data contained in the named field.
For more examples, see Schema Examples.
Data types can be specified as pyarrow type instances (e.g. an instance of pyarrow.int64), bson types (e.g. bson.Int64), or Python type-identifiers (e.g. int, float). For a complete list of supported data types and their corresponding type-identifiers, see Data Types.
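As a minimal sketch of the three ways of naming a type described above, the helper below builds one schema that mixes a pyarrow type instance, a Python type-identifier, and a bson type. The helper name and the field names are hypothetical, and the sketch assumes that pymongoarrow, pyarrow, and bson (distributed with PyMongo) are installed.

```python
def build_example_schema():
    """Build a Schema that mixes the three supported ways of naming a type."""
    # Local imports: the packages are only needed when the helper is called.
    import bson
    import pyarrow
    from pymongoarrow.api import Schema

    return Schema({
        "count": pyarrow.int64(),  # pyarrow type instance
        "total": float,            # Python type-identifier
        "big_id": bson.Int64,      # bson type
    })
```

Which spelling to use is largely a matter of taste; the Data Types page lists the equivalent identifiers for each supported type.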
- pymongoarrow.api.aggregate_arrow_all(collection, pipeline, *, schema=None, **kwargs)
Method that returns the results of an aggregation pipeline as a pyarrow.Table instance.
- Parameters
collection: Instance of Collection against which to run the aggregate operation.
pipeline: A list of aggregation pipeline stages.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.
Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.
- Returns
An instance of pyarrow.Table.
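A call to aggregate_arrow_all might look like the following sketch. The helper name and the document shape (integer customer_id, numeric amount) are hypothetical. The schema describes the documents the pipeline produces, not the documents stored in the collection.

```python
def total_by_customer(collection):
    """Return per-customer order totals as a pyarrow.Table."""
    # Local import: PyMongoArrow is only needed when the helper is called.
    from pymongoarrow.api import Schema, aggregate_arrow_all

    # Hypothetical input documents: {"customer_id": int, "amount": float}
    pipeline = [
        {"$match": {"amount": {"$gt": 0}}},
        {"$group": {"_id": "$customer_id", "total": {"$sum": "$amount"}}},
    ]
    # Describe the pipeline's output so that no type inference is needed.
    schema = Schema({"_id": int, "total": float})
    return aggregate_arrow_all(collection, pipeline, schema=schema)
```

Here collection is expected to be a PyMongo Collection obtained from a connected client.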
- pymongoarrow.api.aggregate_numpy_all(collection, pipeline, *, schema=None, **kwargs)
Method that returns the results of an aggregation pipeline as a dict instance whose keys are field names and values are ndarray instances bearing the appropriate dtype.
- Parameters
collection: Instance of Collection against which to run the aggregate operation.
pipeline: A list of aggregation pipeline stages.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.
Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.
This method attempts to create each NumPy array as a view on the Arrow data corresponding to each field in the result set. When this is not possible, the underlying data is copied into a new NumPy array. See pyarrow.Array.to_numpy() for more information.
NumPy arrays returned by this method that are views on Arrow data are not writable. Users seeking to modify such arrays must first create an editable copy using numpy.copy().
- Returns
An instance of dict.
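The dict returned by aggregate_numpy_all can be indexed by field name, as in this sketch. The helper name and the field names are hypothetical.

```python
def positive_amounts(collection):
    """Return the customer_id and amount columns as separate NumPy arrays."""
    # Local import: PyMongoArrow is only needed when the helper is called.
    from pymongoarrow.api import Schema, aggregate_numpy_all

    pipeline = [{"$match": {"amount": {"$gt": 0}}}]
    schema = Schema({"customer_id": int, "amount": float})
    arrays = aggregate_numpy_all(collection, pipeline, schema=schema)
    # The result is a plain dict: one ndarray per field named in the schema.
    return arrays["customer_id"], arrays["amount"]
```

The returned arrays may be read-only views on Arrow data, so callers that need to modify them should copy them first.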
- pymongoarrow.api.aggregate_pandas_all(collection, pipeline, *, schema=None, **kwargs)
Method that returns the results of an aggregation pipeline as a pandas.DataFrame instance.
- Parameters
collection: Instance of Collection against which to run the aggregate operation.
pipeline: A list of aggregation pipeline stages.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.
Additional keyword-arguments passed to this method will be passed directly to the underlying aggregate operation.
- Returns
An instance of pandas.DataFrame.
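The keyword-argument pass-through described above can be sketched as follows. The example assumes allowDiskUse, an option of PyMongo's aggregate operation, is the extra argument of interest. The helper name and field names are hypothetical.

```python
def totals_dataframe(collection):
    """Return per-customer totals as a pandas.DataFrame."""
    # Local import: PyMongoArrow is only needed when the helper is called.
    from pymongoarrow.api import Schema, aggregate_pandas_all

    pipeline = [
        {"$group": {"_id": "$customer_id", "total": {"$sum": "$amount"}}},
    ]
    schema = Schema({"_id": int, "total": float})
    # allowDiskUse is not a PyMongoArrow parameter; it is forwarded
    # unchanged to the underlying aggregate operation.
    return aggregate_pandas_all(
        collection, pipeline, schema=schema, allowDiskUse=True
    )
```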
- pymongoarrow.api.find_arrow_all(collection, query, *, schema=None, **kwargs)
Method that returns the results of a find query as a pyarrow.Table instance.
- Parameters
collection: Instance of Collection against which to run the find operation.
query: A mapping containing the query to use for the find operation.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.
Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.
- Returns
An instance of pyarrow.Table.
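A find query with an explicit schema and one forwarded keyword argument might be written as in this sketch. The example assumes limit, an option of PyMongo's find operation, is the extra argument of interest. The helper name and field names are hypothetical.

```python
def first_orders(collection, limit=100):
    """Return up to `limit` matching documents as a pyarrow.Table."""
    # Local import: PyMongoArrow is only needed when the helper is called.
    from pymongoarrow.api import Schema, find_arrow_all

    schema = Schema({"customer_id": int, "amount": float})
    query = {"amount": {"$gt": 0}}
    # limit is not a PyMongoArrow parameter; it is forwarded unchanged
    # to the underlying find operation.
    return find_arrow_all(collection, query, schema=schema, limit=limit)
```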
- pymongoarrow.api.find_numpy_all(collection, query, *, schema=None, **kwargs)
Method that returns the results of a find query as a dict instance whose keys are field names and values are ndarray instances bearing the appropriate dtype.
- Parameters
collection: Instance of Collection against which to run the find operation.
query: A mapping containing the query to use for the find operation.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.
Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.
This method attempts to create each NumPy array as a view on the Arrow data corresponding to each field in the result set. When this is not possible, the underlying data is copied into a new NumPy array. See pyarrow.Array.to_numpy() for more information.
NumPy arrays returned by this method that are views on Arrow data are not writable. Users seeking to modify such arrays must first create an editable copy using numpy.copy().
- Returns
An instance of dict.
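Because some of the returned arrays may be read-only views, code that needs to modify the results can copy every array up front. The following is a minimal sketch of that approach using numpy.copy(), as recommended above; the helper name is hypothetical.

```python
def load_editable_arrays(collection, query, schema):
    """Return find results as a dict of NumPy arrays that are safe to modify."""
    # Local imports: the packages are only needed when the helper is called.
    import numpy as np
    from pymongoarrow.api import find_numpy_all

    arrays = find_numpy_all(collection, query, schema=schema)
    # numpy.copy() returns a new, writable array whether or not the
    # original was a read-only view on Arrow data.
    return {name: np.copy(values) for name, values in arrays.items()}
```

Copying unconditionally costs extra memory. A caller that only reads the data can use the dict returned by find_numpy_all directly.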
- pymongoarrow.api.find_pandas_all(collection, query, *, schema=None, **kwargs)
Method that returns the results of a find query as a pandas.DataFrame instance.
- Parameters
collection: Instance of Collection against which to run the find operation.
query: A mapping containing the query to use for the find operation.
schema (optional): Instance of Schema. If the schema is not given, it will be inferred using the first document in the result set.
Additional keyword-arguments passed to this method will be passed directly to the underlying find operation.
- Returns
An instance of pandas.DataFrame.
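Since schema is optional, a call can omit it and rely on the inference described above. The sketch below shows that form; the helper name, the field name, and the threshold parameter are hypothetical.

```python
def large_orders_frame(collection, minimum):
    """Return orders of at least `minimum` as a pandas.DataFrame."""
    # Local import: PyMongoArrow is only needed when the helper is called.
    from pymongoarrow.api import find_pandas_all

    query = {"amount": {"$gte": minimum}}
    # No schema is passed, so the column types are inferred from the data.
    return find_pandas_all(collection, query)
```

Passing an explicit Schema instead is the more predictable choice when documents in the collection do not all share the same fields or types.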
- pymongoarrow.api.write(collection, tabular)
Write data from tabular into the given MongoDB collection.
- Parameters
collection: Instance of Collection against which to run the operation.
tabular: A tabular data store to use for the write operation.
- Returns
An instance of result.ArrowWriteResult.
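As a rough sketch of write in use, the helper below reads documents from one collection and writes them to another. It assumes that the pyarrow.Table returned by find_arrow_all is an accepted tabular input. The helper and parameter names are hypothetical.

```python
def copy_documents(source, target, schema):
    """Copy the fields named in `schema` from `source` into `target`."""
    # Local import: PyMongoArrow is only needed when the helper is called.
    from pymongoarrow.api import find_arrow_all, write

    # An empty query matches every document in the source collection.
    table = find_arrow_all(source, {}, schema=schema)
    # Assumption: a pyarrow.Table is a valid `tabular` argument.
    return write(target, table)
```

The returned ArrowWriteResult can be inspected by the caller to confirm what was written.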