pandas_types – Pandas Extension Types¶
- class pymongoarrow.pandas_types.PandasBSONDtype¶
The base class for BSON Pandas extension data types.
- classmethod construct_array_type()¶
Return the array type associated with this dtype.
Returns¶
type
- Return type:
type_t[ExtensionArray]
- classmethod construct_from_string(string)¶
Construct this type from a string.
This is useful mainly for data types that accept parameters. For example, a period dtype accepts a frequency parameter that can be set as period[h] (where h means hourly frequency).
By default, in the abstract class, just the name of the type is expected. Subclasses can override this method to accept parameters.
Parameters¶
- string : str
The name of the type, for example category.
Returns¶
- ExtensionDtype
Instance of the dtype.
Raises¶
- TypeError
If a class cannot be constructed from this ‘string’.
Examples¶
For extension dtypes with arguments the following may be an adequate implementation.
>>> import re
>>> @classmethod
... def construct_from_string(cls, string):
...     pattern = re.compile(r"^my_type\[(?P<arg_name>.+)\]$")
...     match = pattern.match(string)
...     if match:
...         return cls(**match.groupdict())
...     else:
...         raise TypeError(
...             f"Cannot construct a '{cls.__name__}' from '{string}'"
...         )
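The classmethod in the example is shown outside any class, so it cannot be run by itself. As a minimal sketch, the same pattern can be placed inside a hypothetical `MyType` class (the class name and its `arg_name` parameter are assumptions made for illustration, not part of pymongoarrow or pandas):

```python
import re


class MyType:
    """Hypothetical parametrized dtype, used only for illustration."""

    _pattern = re.compile(r"^my_type\[(?P<arg_name>.+)\]$")

    def __init__(self, arg_name):
        self.arg_name = arg_name

    @classmethod
    def construct_from_string(cls, string):
        # The documented contract is to raise TypeError on anything unparseable.
        if not isinstance(string, str):
            raise TypeError(f"Expected a string, got {type(string).__name__}")
        match = cls._pattern.match(string)
        if match is None:
            raise TypeError(f"Cannot construct a '{cls.__name__}' from '{string}'")
        # The named group becomes the keyword argument of the constructor.
        return cls(**match.groupdict())


dtype = MyType.construct_from_string("my_type[utf8]")
# dtype.arg_name is now "utf8"
```

A string without the bracketed parameter, such as "category", does not match the pattern and raises TypeError.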
- empty(shape)¶
Construct an ExtensionArray of this dtype with the given shape.
Analogous to numpy.empty.
Parameters¶
shape : int or tuple[int]
Returns¶
ExtensionArray
- Parameters:
shape (Shape)
- Return type:
ExtensionArray
- index_class¶
The Index subclass to return from Index.__new__ when this dtype is encountered.
- classmethod is_dtype(dtype)¶
Check if we match ‘dtype’.
Parameters¶
- dtype : object
The object to check.
Returns¶
bool
Notes¶
The default implementation is True if any of the following holds:
- cls.construct_from_string(dtype) is an instance of cls.
- dtype is an object and is an instance of cls.
- dtype has a dtype attribute, and any of the above conditions is true for dtype.dtype.
- Parameters:
dtype (object)
- Return type:
bool
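The three conditions in the Notes can be sketched in plain Python. `ToyDtype` and `Holder` below are hypothetical stand-ins invented for this illustration; the sketch mirrors the documented rules and is not the pandas implementation:

```python
class ToyDtype:
    """Hypothetical stand-in for an extension dtype; not a pandas class."""

    name = "toy"

    @classmethod
    def construct_from_string(cls, string):
        if isinstance(string, str) and string == cls.name:
            return cls()
        raise TypeError(f"Cannot construct a '{cls.__name__}' from '{string}'")

    @classmethod
    def is_dtype(cls, dtype):
        # Third condition: unwrap objects that carry a ``dtype`` attribute.
        dtype = getattr(dtype, "dtype", dtype)
        # Second condition: an instance of the class matches directly.
        if isinstance(dtype, cls):
            return True
        # First condition: a string that the class can parse also matches.
        if isinstance(dtype, str):
            try:
                return isinstance(cls.construct_from_string(dtype), cls)
            except TypeError:
                return False
        return False


class Holder:
    """Hypothetical container exposing a ``dtype`` attribute."""

    dtype = ToyDtype()
```

With these definitions, `ToyDtype.is_dtype("toy")`, `ToyDtype.is_dtype(ToyDtype())` and `ToyDtype.is_dtype(Holder())` all return True, while `ToyDtype.is_dtype("int64")` returns False.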
- property kind: str¶
A character code (one of ‘biufcmMOSUV’), default ‘O’
This should match the NumPy dtype used when the array is converted to an ndarray, which is probably ‘O’ for object if the extension type cannot be represented as a built-in NumPy type.
See Also¶
numpy.dtype.kind
- property name: str¶
A string identifying the data type.
Will be used for display in, e.g., Series.dtype.
- property names: list[str] | None¶
Ordered list of field names, or None if there are no fields.
This is for compatibility with NumPy arrays, and may be removed in the future.
- property type: type_t[Any]¶
The scalar type for the array, e.g. int.
It’s expected that ExtensionArray[item] returns an instance of ExtensionDtype.type for scalar item, assuming that value is valid (not NA). NA values do not need to be instances of type.
- class pymongoarrow.pandas_types.PandasBSONExtensionArray(values, dtype, copy=False)¶
The base class for Pandas BSON extension arrays.
- argmax(skipna=True)¶
Return the index of maximum value.
In case of multiple occurrences of the maximum value, the index corresponding to the first occurrence is returned.
Parameters¶
skipna : bool, default True
Returns¶
int
See Also¶
ExtensionArray.argmin : Return the index of the minimum value.
Examples¶
>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argmax()
3
- Parameters:
skipna (bool)
- Return type:
int
- argmin(skipna=True)¶
Return the index of minimum value.
In case of multiple occurrences of the minimum value, the index corresponding to the first occurrence is returned.
Parameters¶
skipna : bool, default True
Returns¶
int
See Also¶
ExtensionArray.argmax : Return the index of the maximum value.
Examples¶
>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argmin()
1
- Parameters:
skipna (bool)
- Return type:
int
- argsort(*, ascending=True, kind='quicksort', na_position='last', **kwargs)¶
Return the indices that would sort this array.
Parameters¶
- ascending : bool, default True
Whether the indices should result in an ascending or descending sort.
- kind : {‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, optional
Sorting algorithm.
- na_position : {‘first’, ‘last’}, default ‘last’
If 'first', put NaN values at the beginning. If 'last', put NaN values at the end.
- **kwargs
Passed through to numpy.argsort().
Returns¶
- np.ndarray[np.intp]
Array of indices that sort self. If NaN values are contained, they are placed according to na_position (at the end by default).
See Also¶
numpy.argsort : Sorting implementation used internally.
Examples¶
>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argsort()
array([1, 2, 0, 4, 3])
- Parameters:
ascending (bool)
kind (SortKind)
na_position (str)
- Return type:
np.ndarray
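To make the interaction of ascending and na_position concrete, here is a plain-Python sketch of the documented behaviour. The helper name `argsort_with_na` is hypothetical and `None` stands in for a missing value; the real method works on array data rather than lists:

```python
def argsort_with_na(values, ascending=True, na_position="last"):
    """Plain-Python sketch; ``None`` stands in for a missing value."""
    if na_position not in ("first", "last"):
        raise ValueError(f"invalid na_position: {na_position!r}")
    valid = [i for i, v in enumerate(values) if v is not None]
    missing = [i for i, v in enumerate(values) if v is None]
    # list.sort() is stable, so ties keep their original relative order.
    valid.sort(key=lambda i: values[i], reverse=not ascending)
    # Missing values are never compared; they are grouped at one end.
    return missing + valid if na_position == "first" else valid + missing


indices = argsort_with_na([3, None, 1, 2])
# indices == [2, 3, 0, 1]: positions of 1, 2, 3, then the missing value
```

Note that the position of missing values is decided by na_position alone; a descending sort does not move them to the front.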
- astype(dtype, copy=True)¶
Cast to a NumPy array or ExtensionArray with ‘dtype’.
Parameters¶
- dtype : str or dtype
Typecode or data-type to which the array is cast.
- copy : bool, default True
Whether to copy the data, even if not necessary. If False, a copy is made only if the old dtype does not match the new dtype.
Returns¶
- np.ndarray or pandas.api.extensions.ExtensionArray
An ExtensionArray if dtype is ExtensionDtype, otherwise a NumPy ndarray with dtype for its dtype.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
Casting to another ExtensionDtype returns an ExtensionArray:
>>> arr1 = arr.astype('Float64')
>>> arr1
<FloatingArray>
[1.0, 2.0, 3.0]
Length: 3, dtype: Float64
>>> arr1.dtype
Float64Dtype()
Otherwise, we will get a NumPy ndarray:
>>> arr2 = arr.astype('float64')
>>> arr2
array([1., 2., 3.])
>>> arr2.dtype
dtype('float64')
- Parameters:
dtype (AstypeArg)
copy (bool)
- Return type:
ArrayLike
- copy()¶
Return a copy of the array.
Returns¶
ExtensionArray
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr2 = arr.copy()
>>> arr[0] = 2
>>> arr2
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- dropna()¶
Return ExtensionArray without NA values.
Returns¶
Examples¶
>>> pd.array([1, 2, np.nan]).dropna()
<IntegerArray>
[1, 2]
Length: 2, dtype: Int64
- Return type:
Self
- duplicated(keep='first')¶
Return boolean ndarray denoting duplicate values.
Parameters¶
- keep : {‘first’, ‘last’, False}, default ‘first’
‘first’ : Mark duplicates as True except for the first occurrence.
‘last’ : Mark duplicates as True except for the last occurrence.
False : Mark all duplicates as True.
Returns¶
ndarray[bool]
Examples¶
>>> pd.array([1, 1, 2, 3, 3], dtype="Int64").duplicated()
array([False, True, False, False, True])
- Parameters:
keep (Literal['first', 'last', False])
- Return type:
npt.NDArray[np.bool_]
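The three keep modes differ only in which occurrence of a repeated value is left unmarked. A plain-Python sketch for hashable values follows; the function is written for illustration and is not the pandas implementation:

```python
def duplicated(values, keep="first"):
    """Plain-Python sketch of the three ``keep`` modes for hashable values."""
    if keep not in ("first", "last", False):
        raise ValueError(f"invalid keep: {keep!r}")
    if keep is False:
        # Every member of a repeated group is marked.
        counts = {}
        for v in values:
            counts[v] = counts.get(v, 0) + 1
        return [counts[v] > 1 for v in values]
    # Scan from the side whose occurrence should stay unmarked.
    n = len(values)
    order = range(n) if keep == "first" else range(n - 1, -1, -1)
    seen = set()
    flags = [False] * n
    for i in order:
        flags[i] = values[i] in seen
        seen.add(values[i])
    return flags


flags = duplicated([1, 1, 2, 3, 3])
# flags == [False, True, False, False, True], matching the example above
```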
- equals(other)¶
Return if another array is equivalent to this array.
Equivalent means that both arrays have the same shape and dtype, and all values compare equal. Missing values in the same location are considered equal (in contrast with normal equality).
Parameters¶
- other : ExtensionArray
Array to compare to this Array.
Returns¶
- boolean
Whether the arrays are equivalent.
Examples¶
>>> arr1 = pd.array([1, 2, np.nan])
>>> arr2 = pd.array([1, 2, np.nan])
>>> arr1.equals(arr2)
True
- Parameters:
other (object)
- Return type:
bool
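The contrast with normal equality can be seen by comparing the same arrays element-wise. This sketch uses pandas' nullable Int64 arrays as a stand-in, as the examples on this page do, and assumes pandas is installed; behaviour of the BSON array classes themselves is not shown:

```python
import pandas as pd

left = pd.array([1, 2, None], dtype="Int64")
right = pd.array([1, 2, None], dtype="Int64")

# Element-wise comparison propagates the missing value ...
elementwise = left == right
# ... whereas equals() treats NA in matching positions as equal.
same = left.equals(right)
# A different dtype makes the arrays non-equivalent even if the values match.
different_dtype = left.equals(pd.array([1, 2, None], dtype="Float64"))
```

Here `same` is True and `different_dtype` is False, while the last element of `elementwise` is missing rather than True.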
- factorize(use_na_sentinel=True)¶
Encode the extension array as an enumerated type.
Parameters¶
- use_na_sentinel : bool, default True
If True, the sentinel -1 will be used for NaN values. If False, NaN values will be encoded as non-negative integers and will not drop the NaN from the uniques of the values.
Added in version 1.5.0.
Returns¶
- codes : ndarray
An integer NumPy array that’s an indexer into the original ExtensionArray.
- uniques : ExtensionArray
An ExtensionArray containing the unique values of self.
Note
uniques will not contain an entry for the NA value of the ExtensionArray if there are any missing values present in self.
See Also¶
factorize : Top-level factorize method that dispatches here.
Notes¶
pandas.factorize() offers a sort keyword as well.
Examples¶
>>> idx1 = pd.PeriodIndex(["2014-01", "2014-01", "2014-02", "2014-02",
...                        "2014-03", "2014-03"], freq="M")
>>> arr, idx = idx1.factorize()
>>> arr
array([0, 0, 1, 1, 2, 2])
>>> idx
PeriodIndex(['2014-01', '2014-02', '2014-03'], dtype='period[M]')
- Parameters:
use_na_sentinel (bool)
- Return type:
tuple[ndarray, ExtensionArray]
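The relationship between codes, uniques and the -1 sentinel can be sketched in plain Python. This is an illustration of the documented semantics with `None` standing in for a missing value, not the pandas implementation:

```python
def factorize(values, use_na_sentinel=True):
    """Plain-Python sketch; ``None`` stands in for a missing value."""
    uniques = []
    positions = {}
    codes = []
    for v in values:
        if v is None and use_na_sentinel:
            # Missing values get the sentinel and no entry in uniques.
            codes.append(-1)
            continue
        if v not in positions:
            positions[v] = len(uniques)
            uniques.append(v)
        codes.append(positions[v])
    return codes, uniques


codes, uniques = factorize(["b", None, "a", "b"])
# codes == [0, -1, 1, 0] and uniques == ["b", "a"]
```

Every non-negative code indexes into uniques, so the non-missing input values can be recovered as `uniques[code]`.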
- fillna(value=None, method=None, limit=None, copy=True)¶
Fill NA/NaN values using the specified method.
Parameters¶
- value : scalar, array-like
If a scalar value is passed, it is used to fill all missing values. Alternatively, an array-like “value” can be given. The array-like is expected to have the same length as ‘self’.
- method : {‘backfill’, ‘bfill’, ‘pad’, ‘ffill’, None}, default None
Method to use for filling holes in reindexed Series:
pad / ffill: propagate the last valid observation forward to the next valid one.
backfill / bfill: use the next valid observation to fill the gap.
Deprecated since version 2.1.0.
- limit : int, default None
If method is specified, this is the maximum number of consecutive NaN values to forward/backward fill. In other words, if there is a gap with more than this number of consecutive NaNs, it will only be partially filled. If method is not specified, this is the maximum number of entries along the entire axis where NaNs will be filled.
Deprecated since version 2.1.0.
- copy : bool, default True
Whether to make a copy of the data before filling. If False, then the original should be modified and no new memory should be allocated. For ExtensionArray subclasses that cannot do this, it is at the author’s discretion whether to ignore “copy=False” or to raise. The base class implementation ignores the keyword in pad/backfill cases.
Returns¶
- ExtensionArray
With NA/NaN filled.
Examples¶
>>> arr = pd.array([np.nan, np.nan, 2, 3, np.nan, np.nan])
>>> arr.fillna(0)
<IntegerArray>
[0, 0, 2, 3, 0, 0]
Length: 6, dtype: Int64
- Parameters:
value (object | ArrayLike | None)
method (FillnaOptions | None)
limit (int | None)
copy (bool)
- Return type:
Self
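The phrase "only be partially filled" in the description of limit is easiest to see with a small forward-fill sketch. The helper `pad_fill` is hypothetical and uses `None` for missing values; it illustrates the documented pad/ffill semantics only and does not call the deprecated method argument:

```python
def pad_fill(values, limit=None):
    """Forward-fill ``None`` entries, filling at most ``limit`` in a row."""
    result = []
    last_valid = None
    run = 0  # consecutive missing values filled since the last valid one
    for v in values:
        if v is not None:
            last_valid, run = v, 0
            result.append(v)
        elif last_valid is not None and (limit is None or run < limit):
            run += 1
            result.append(last_valid)
        else:
            # Either nothing to propagate yet, or the limit is exhausted.
            result.append(None)
    return result


filled = pad_fill([1, None, None, None, 4, None], limit=2)
# filled == [1, 1, 1, None, 4, 4]: the three-element gap is only partly filled
```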
- insert(loc, item)¶
Insert an item at the given position.
Parameters¶
loc : int
item : scalar-like
Returns¶
same type as self
Notes¶
This method should be both type and dtype-preserving. If the item cannot be held in an array of this type/dtype, either ValueError or TypeError should be raised.
The default implementation relies on _from_sequence to raise on invalid items.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.insert(2, -1)
<IntegerArray>
[1, 2, -1, 3]
Length: 4, dtype: Int64
- Parameters:
loc (int)
- Return type:
Self
- interpolate(*, method, axis, index, limit, limit_direction, limit_area, copy, **kwargs)¶
See DataFrame.interpolate.__doc__.
Examples¶
>>> arr = pd.arrays.NumpyExtensionArray(np.array([0, 1, np.nan, 3]))
>>> arr.interpolate(method="linear",
...                 limit=3,
...                 limit_direction="forward",
...                 index=pd.Index([1, 2, 3, 4]),
...                 fill_value=1,
...                 copy=False,
...                 axis=0,
...                 limit_area="inside"
...                 )
<NumpyExtensionArray>
[0.0, 1.0, 2.0, 3.0]
Length: 4, dtype: float64
- Parameters:
method (InterpolateOptions)
axis (int)
index (Index)
copy (bool)
- Return type:
Self
- isin(values)¶
Pointwise comparison for set containment in the given values.
Roughly equivalent to np.array([x in values for x in self])
Parameters¶
values : np.ndarray or ExtensionArray
Returns¶
np.ndarray[bool]
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.isin([1])
<BooleanArray>
[True, False, False]
Length: 3, dtype: boolean
- Parameters:
values (ArrayLike)
- Return type:
npt.NDArray[np.bool_]
- isna()¶
A 1-D array indicating if each value is missing.
Returns¶
- numpy.ndarray or pandas.api.extensions.ExtensionArray
In most cases, this should return a NumPy ndarray. For exceptional cases like SparseArray, where returning an ndarray would be expensive, an ExtensionArray may be returned.
Notes¶
If returning an ExtensionArray, then
- na_values._is_boolean should be True
- na_values should implement ExtensionArray._reduce()
- na_values.any and na_values.all should be implemented
Examples¶
>>> arr = pd.array([1, 2, np.nan, np.nan])
>>> arr.isna()
array([False, False, True, True])
- map(mapper, na_action=None)¶
Map values using an input mapping or function.
Parameters¶
- mapper : function, dict, or Series
Mapping correspondence.
- na_action : {None, ‘ignore’}, default None
If ‘ignore’, propagate NA values, without passing them to the mapping correspondence. If ‘ignore’ is not supported, a NotImplementedError should be raised.
Returns¶
- Union[ndarray, Index, ExtensionArray]
The output of the mapping function applied to the array. If the function returns a tuple with more than one element a MultiIndex will be returned.
- nbytes()¶
The number of bytes needed to store this object in memory.
Examples¶
>>> pd.array([1, 2, 3]).nbytes
27
- property ndim: int¶
Extension Arrays are only allowed to be 1-dimensional.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.ndim
1
- ravel(order='C')¶
Return a flattened view on this array.
Parameters¶
order : {None, ‘C’, ‘F’, ‘A’, ‘K’}, default ‘C’
Returns¶
ExtensionArray
Notes¶
Because ExtensionArrays are 1D-only, this is a no-op.
The “order” argument is ignored; it is accepted only for compatibility with NumPy.
Examples¶
>>> pd.array([1, 2, 3]).ravel()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- Parameters:
order (Literal['C', 'F', 'A', 'K'] | None)
- Return type:
ExtensionArray
- repeat(repeats, axis=None)¶
Repeat elements of an ExtensionArray.
Returns a new ExtensionArray where each element of the current ExtensionArray is repeated consecutively a given number of times.
Parameters¶
- repeats : int or array of ints
The number of repetitions for each element. This should be a non-negative integer. Repeating 0 times will return an empty ExtensionArray.
- axis : None
Must be None. Has no effect but is accepted for compatibility with numpy.
Returns¶
- ExtensionArray
Newly created ExtensionArray with repeated elements.
See Also¶
Series.repeat : Equivalent function for Series.
Index.repeat : Equivalent function for Index.
numpy.repeat : Similar method for numpy.ndarray.
ExtensionArray.take : Take arbitrary positions.
Examples¶
>>> cat = pd.Categorical(['a', 'b', 'c'])
>>> cat
['a', 'b', 'c']
Categories (3, object): ['a', 'b', 'c']
>>> cat.repeat(2)
['a', 'a', 'b', 'b', 'c', 'c']
Categories (3, object): ['a', 'b', 'c']
>>> cat.repeat([1, 2, 3])
['a', 'b', 'b', 'c', 'c', 'c']
Categories (3, object): ['a', 'b', 'c']
- Parameters:
repeats (int | Sequence[int])
axis (AxisInt | None)
- Return type:
Self
- searchsorted(value, side='left', sorter=None)¶
Find indices where elements should be inserted to maintain order.
Find the indices into a sorted array self (a) such that, if the corresponding elements in value were inserted before the indices, the order of self would be preserved.
Assuming that self is sorted:
side    returned index i satisfies
left    self[i-1] < value <= self[i]
right   self[i-1] <= value < self[i]
Parameters¶
- value : array-like, list or scalar
Value(s) to insert into self.
- side : {‘left’, ‘right’}, optional
If ‘left’, the index of the first suitable location found is given. If ‘right’, return the last such index. If there is no suitable index, return either 0 or N (where N is the length of self).
- sorter : 1-D array-like, optional
Optional array of integer indices that sort array a into ascending order. They are typically the result of argsort.
Returns¶
- array of ints or int
If value is array-like, array of insertion points. If value is scalar, a single integer.
See Also¶
numpy.searchsorted : Similar method from NumPy.
Examples¶
>>> arr = pd.array([1, 2, 3, 5])
>>> arr.searchsorted([4])
array([3])
- Parameters:
value (NumpyValueArrayLike | ExtensionArray)
side (Literal['left', 'right'])
sorter (NumpySorter | None)
- Return type:
npt.NDArray[np.intp] | np.intp
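The difference between side='left' and side='right' only shows up when the searched value is already present. Python's standard bisect module follows the same two conventions, so it serves here as a plain-list illustration of the inequalities listed for searchsorted (it is an analogy, not what the array method calls internally):

```python
from bisect import bisect_left, bisect_right

sorted_values = [1, 2, 2, 3, 5]

# 'left': insert before any equal elements, so values[i-1] < 2 <= values[i].
left = bisect_left(sorted_values, 2)
# 'right': insert after any equal elements, so values[i-1] <= 2 < values[i].
right = bisect_right(sorted_values, 2)
# A value larger than every element maps to N, the length of the list.
past_end = bisect_left(sorted_values, 9)
```

Here `left` is 1, `right` is 3 and `past_end` is 5.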
- property shape: Shape¶
Return a tuple of the array dimensions.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.shape
(3,)
- shift(periods=1, fill_value=None)¶
Shift values by desired number.
Newly introduced missing values are filled with self.dtype.na_value.
Parameters¶
- periods : int, default 1
The number of periods to shift. Negative values are allowed for shifting backwards.
- fill_value : object, optional
The scalar value to use for newly introduced missing values. The default is self.dtype.na_value.
Returns¶
- ExtensionArray
Shifted.
Notes¶
If self is empty or periods is 0, a copy of self is returned.
If periods > len(self), then an array of size len(self) is returned, with all values filled with self.dtype.na_value.
For 2-dimensional ExtensionArrays, we are always shifting along axis=0.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.shift(2)
<IntegerArray>
[<NA>, <NA>, 1]
Length: 3, dtype: Int64
- Parameters:
periods (int)
fill_value (object | None)
- Return type:
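The edge cases listed in the Notes for shift (empty input, periods of 0, and periods larger than the length) can be sketched on a plain list. The function is an illustration of the documented behaviour, with `None` playing the role of self.dtype.na_value:

```python
def shift(values, periods=1, fill_value=None):
    """Plain-list sketch of the documented shift semantics."""
    n = len(values)
    if n == 0 or periods == 0:
        return list(values)  # a copy, as the Notes describe
    # Shifting by more than the length leaves only fill values.
    k = min(abs(periods), n)
    fill = [fill_value] * k
    if periods > 0:
        return fill + list(values[: n - k])
    return list(values[k:]) + fill


shifted = shift([1, 2, 3], 2)
# shifted == [None, None, 1], mirroring [<NA>, <NA>, 1] in the example above
```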
- property size: int¶
The number of elements in the array.
- take(indexer, allow_fill=False, fill_value=None)¶
Take elements from an array.
Parameters¶
- indices : sequence of int or one-dimensional np.ndarray of int
Indices to be taken.
- allow_fill : bool, default False
How to handle negative values in indices.
False: negative values in indices indicate positional indices from the right (the default). This is similar to numpy.take().
True: negative values in indices indicate missing values. These values are set to fill_value. Any other negative values raise a ValueError.
- fill_value : any, optional
Fill value to use for NA-indices when allow_fill is True. This may be None, in which case the default NA value for the type, self.dtype.na_value, is used.
For many ExtensionArrays, there will be two representations of fill_value: a user-facing “boxed” scalar, and a low-level physical NA value. fill_value should be the user-facing version, and the implementation should handle translating that to the physical version for processing the take if necessary.
Returns¶
ExtensionArray
Raises¶
- IndexError
When the indices are out of bounds for the array.
- ValueError
When indices contains negative values other than -1 and allow_fill is True.
See Also¶
numpy.take : Take elements from an array along an axis.
api.extensions.take : Take elements from an array.
Notes¶
ExtensionArray.take is called by Series.__getitem__, .loc, and .iloc when indices is a sequence of values. Additionally, it’s called by Series.reindex(), or any other method that causes realignment, with a fill_value.
Examples¶
Here’s an example implementation, which relies on casting the extension array to object dtype. This uses the helper method pandas.api.extensions.take().
def take(self, indices, allow_fill=False, fill_value=None):
    from pandas.core.algorithms import take

    # If the ExtensionArray is backed by an ndarray, then
    # just pass that here instead of coercing to object.
    data = self.astype(object)

    if allow_fill and fill_value is None:
        fill_value = self.dtype.na_value

    # fill value should always be translated from the scalar
    # type for the array, to the physical storage type for
    # the data, before passing to take.

    result = take(data, indices, fill_value=fill_value,
                  allow_fill=allow_fill)
    return self._from_sequence(result, dtype=self.dtype)
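The two meanings of a negative index, and the two exceptions listed under Raises, can be sketched on a plain list. This standalone function is written for illustration only; it mirrors the documented rules and does not use the pandas helper:

```python
def take(values, indices, allow_fill=False, fill_value=None):
    """Plain-list sketch of the allow_fill rules for negative indices."""
    n = len(values)
    result = []
    for i in indices:
        if allow_fill:
            if i < -1:
                raise ValueError(
                    "negative indices other than -1 are not allowed "
                    "when allow_fill=True"
                )
            if i == -1:
                # -1 marks a missing position and is replaced by fill_value.
                result.append(fill_value)
                continue
        if not -n <= i < n:
            raise IndexError(f"index {i} is out of bounds for size {n}")
        # Without allow_fill, negative indices count from the right.
        result.append(values[i])
    return result


from_right = take(["a", "b", "c"], [0, -1])
as_missing = take(["a", "b", "c"], [0, -1], allow_fill=True)
# from_right == ["a", "c"] and as_missing == ["a", None]
```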
- to_numpy(dtype=None, copy=False, na_value=<no_default>)¶
Convert to a NumPy ndarray.
This is similar to numpy.asarray(), but may provide additional control over how the conversion is done.
Parameters¶
- dtype : str or numpy.dtype, optional
The dtype to pass to numpy.asarray().
- copy : bool, default False
Whether to ensure that the returned value is not a view on another array. Note that copy=False does not ensure that to_numpy() is no-copy. Rather, copy=True ensures that a copy is made, even if not strictly necessary.
- na_value : Any, optional
The value to use for missing values. The default value depends on dtype and the type of the array.
Returns¶
numpy.ndarray
- Parameters:
dtype (npt.DTypeLike | None)
copy (bool)
na_value (object)
- Return type:
np.ndarray
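How dtype and na_value work together is easiest to see on an array that contains a missing value. The sketch below uses pandas' nullable Int64 array as a stand-in, as the other examples on this page do, and assumes pandas and NumPy are installed; the BSON array classes may choose different defaults:

```python
import numpy as np
import pandas as pd

arr = pd.array([1, None, 3], dtype="Int64")

# Request a float array and say explicitly how missing values are encoded.
as_float = arr.to_numpy(dtype="float64", na_value=np.nan)

# Default conversion; for a masked array with missing values this is
# typically an object array that keeps the pandas NA scalar.
as_default = arr.to_numpy()
```

Passing na_value explicitly avoids relying on the array type's default, which the parameter description notes can vary with dtype and array type.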
- tolist()¶
Return a list of the values.
These are each a scalar type, which is a Python scalar (for str, int, float) or a pandas scalar (for Timestamp/Timedelta/Interval/Period)
Returns¶
list
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.tolist()
[1, 2, 3]
- Return type:
list
- transpose(*axes)¶
Return a transposed view on this array.
Because ExtensionArrays are always 1D, this is a no-op. It is included for compatibility with np.ndarray.
Returns¶
ExtensionArray
Examples¶
>>> pd.array([1, 2, 3]).transpose()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- Parameters:
axes (int)
- Return type:
- unique()¶
Compute the ExtensionArray of unique values.
Returns¶
pandas.api.extensions.ExtensionArray
Examples¶
>>> arr = pd.array([1, 2, 3, 1, 2, 3])
>>> arr.unique()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- Return type:
Self
- view(dtype=None)¶
Return a view on the array.
Parameters¶
- dtype : str, np.dtype, or ExtensionDtype, optional
Default None.
Returns¶
- ExtensionArray or np.ndarray
A view on the ExtensionArray’s data.
Examples¶
This gives a view on the underlying data of an ExtensionArray and is not a copy. Modifications on either the view or the original ExtensionArray will be reflected in the underlying data:
>>> arr = pd.array([1, 2, 3])
>>> arr2 = arr.view()
>>> arr[0] = 2
>>> arr2
<IntegerArray>
[2, 2, 3]
Length: 3, dtype: Int64
- Parameters:
dtype (Dtype | None)
- Return type:
ArrayLike
- class pymongoarrow.pandas_types.PandasBinary(subtype)¶
A pandas extension type for BSON Binary data type.
- classmethod construct_array_type()¶
Return the array type associated with this dtype.
Returns¶
type
- Return type:
type[PandasBinaryArray]
- classmethod construct_from_string(string)¶
Construct this type from a string.
This is useful mainly for data types that accept parameters. For example, a period dtype accepts a frequency parameter that can be set as period[h] (where h means hourly frequency).
By default, in the abstract class, just the name of the type is expected. Subclasses can override this method to accept parameters.
Parameters¶
- string : str
The name of the type, for example category.
Returns¶
- ExtensionDtype
Instance of the dtype.
Raises¶
- TypeError
If a class cannot be constructed from this ‘string’.
Examples¶
For extension dtypes with arguments the following may be an adequate implementation.
>>> import re
>>> @classmethod
... def construct_from_string(cls, string):
...     pattern = re.compile(r"^my_type\[(?P<arg_name>.+)\]$")
...     match = pattern.match(string)
...     if match:
...         return cls(**match.groupdict())
...     else:
...         raise TypeError(
...             f"Cannot construct a '{cls.__name__}' from '{string}'"
...         )
- empty(shape)¶
Construct an ExtensionArray of this dtype with the given shape.
Analogous to numpy.empty.
Parameters¶
shape : int or tuple[int]
Returns¶
ExtensionArray
- Parameters:
shape (Shape)
- Return type:
ExtensionArray
- index_class¶
The Index subclass to return from Index.__new__ when this dtype is encountered.
- classmethod is_dtype(dtype)¶
Check if we match ‘dtype’.
Parameters¶
- dtype : object
The object to check.
Returns¶
bool
Notes¶
The default implementation is True if any of the following holds:
- cls.construct_from_string(dtype) is an instance of cls.
- dtype is an object and is an instance of cls.
- dtype has a dtype attribute, and any of the above conditions is true for dtype.dtype.
- Parameters:
dtype (object)
- Return type:
bool
- property kind: str¶
A character code (one of ‘biufcmMOSUV’), default ‘O’
This should match the NumPy dtype used when the array is converted to an ndarray, which is probably ‘O’ for object if the extension type cannot be represented as a built-in NumPy type.
See Also¶
numpy.dtype.kind
- property name: str¶
A string identifying the data type.
Will be used for display in, e.g., Series.dtype.
- property names: list[str] | None¶
Ordered list of field names, or None if there are no fields.
This is for compatibility with NumPy arrays, and may be removed in the future.
- class pymongoarrow.pandas_types.PandasBinaryArray(values, dtype, copy=False)¶
A pandas extension type for BSON Binary data arrays.
- argmax(skipna=True)¶
Return the index of maximum value.
In case of multiple occurrences of the maximum value, the index corresponding to the first occurrence is returned.
Parameters¶
skipna : bool, default True
Returns¶
int
See Also¶
ExtensionArray.argmin : Return the index of the minimum value.
Examples¶
>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argmax()
3
- Parameters:
skipna (bool)
- Return type:
int
- argmin(skipna=True)¶
Return the index of minimum value.
In case of multiple occurrences of the minimum value, the index corresponding to the first occurrence is returned.
Parameters¶
skipna : bool, default True
Returns¶
int
See Also¶
ExtensionArray.argmax : Return the index of the maximum value.
Examples¶
>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argmin()
1
- Parameters:
skipna (bool)
- Return type:
int
- argsort(*, ascending=True, kind='quicksort', na_position='last', **kwargs)¶
Return the indices that would sort this array.
Parameters¶
- ascending : bool, default True
Whether the indices should result in an ascending or descending sort.
- kind : {‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, optional
Sorting algorithm.
- na_position : {‘first’, ‘last’}, default ‘last’
If 'first', put NaN values at the beginning. If 'last', put NaN values at the end.
- **kwargs
Passed through to numpy.argsort().
Returns¶
- np.ndarray[np.intp]
Array of indices that sort self. If NaN values are contained, they are placed according to na_position (at the end by default).
See Also¶
numpy.argsort : Sorting implementation used internally.
Examples¶
>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argsort()
array([1, 2, 0, 4, 3])
- Parameters:
ascending (bool)
kind (SortKind)
na_position (str)
- Return type:
np.ndarray
- astype(dtype, copy=True)¶
Cast to a NumPy array or ExtensionArray with ‘dtype’.
Parameters¶
- dtype : str or dtype
Typecode or data-type to which the array is cast.
- copy : bool, default True
Whether to copy the data, even if not necessary. If False, a copy is made only if the old dtype does not match the new dtype.
Returns¶
- np.ndarray or pandas.api.extensions.ExtensionArray
An ExtensionArray if dtype is ExtensionDtype, otherwise a NumPy ndarray with dtype for its dtype.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
Casting to another ExtensionDtype returns an ExtensionArray:
>>> arr1 = arr.astype('Float64')
>>> arr1
<FloatingArray>
[1.0, 2.0, 3.0]
Length: 3, dtype: Float64
>>> arr1.dtype
Float64Dtype()
Otherwise, we will get a NumPy ndarray:
>>> arr2 = arr.astype('float64')
>>> arr2
array([1., 2., 3.])
>>> arr2.dtype
dtype('float64')
- Parameters:
dtype (AstypeArg)
copy (bool)
- Return type:
ArrayLike
- copy()¶
Return a copy of the array.
Returns¶
ExtensionArray
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr2 = arr.copy()
>>> arr[0] = 2
>>> arr2
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- dropna()¶
Return ExtensionArray without NA values.
Returns¶
Examples¶
>>> pd.array([1, 2, np.nan]).dropna()
<IntegerArray>
[1, 2]
Length: 2, dtype: Int64
- Return type:
Self
- duplicated(keep='first')¶
Return boolean ndarray denoting duplicate values.
Parameters¶
- keep : {‘first’, ‘last’, False}, default ‘first’
‘first’ : Mark duplicates as True except for the first occurrence.
‘last’ : Mark duplicates as True except for the last occurrence.
False : Mark all duplicates as True.
Returns¶
ndarray[bool]
Examples¶
>>> pd.array([1, 1, 2, 3, 3], dtype="Int64").duplicated()
array([False, True, False, False, True])
- Parameters:
keep (Literal['first', 'last', False])
- Return type:
npt.NDArray[np.bool_]
- equals(other)¶
Return if another array is equivalent to this array.
Equivalent means that both arrays have the same shape and dtype, and all values compare equal. Missing values in the same location are considered equal (in contrast with normal equality).
Parameters¶
- other : ExtensionArray
Array to compare to this Array.
Returns¶
- boolean
Whether the arrays are equivalent.
Examples¶
>>> arr1 = pd.array([1, 2, np.nan])
>>> arr2 = pd.array([1, 2, np.nan])
>>> arr1.equals(arr2)
True
- Parameters:
other (object)
- Return type:
bool
- factorize(use_na_sentinel=True)¶
Encode the extension array as an enumerated type.
Parameters¶
- use_na_sentinel : bool, default True
If True, the sentinel -1 will be used for NaN values. If False, NaN values will be encoded as non-negative integers and will not drop the NaN from the uniques of the values.
Added in version 1.5.0.
Returns¶
- codes : ndarray
An integer NumPy array that’s an indexer into the original ExtensionArray.
- uniques : ExtensionArray
An ExtensionArray containing the unique values of self.
Note
uniques will not contain an entry for the NA value of the ExtensionArray if there are any missing values present in self.
See Also¶
factorize : Top-level factorize method that dispatches here.
Notes¶
pandas.factorize() offers a sort keyword as well.
Examples¶
>>> idx1 = pd.PeriodIndex(["2014-01", "2014-01", "2014-02", "2014-02",
...                        "2014-03", "2014-03"], freq="M")
>>> arr, idx = idx1.factorize()
>>> arr
array([0, 0, 1, 1, 2, 2])
>>> idx
PeriodIndex(['2014-01', '2014-02', '2014-03'], dtype='period[M]')
- Parameters:
use_na_sentinel (bool)
- Return type:
tuple[ndarray, ExtensionArray]
- fillna(value=None, method=None, limit=None, copy=True)¶
Fill NA/NaN values using the specified method.
Parameters¶
- value : scalar, array-like
If a scalar value is passed, it is used to fill all missing values. Alternatively, an array-like “value” can be given. The array-like is expected to have the same length as ‘self’.
- method : {‘backfill’, ‘bfill’, ‘pad’, ‘ffill’, None}, default None
Method to use for filling holes in reindexed Series:
pad / ffill: propagate the last valid observation forward to the next valid one.
backfill / bfill: use the next valid observation to fill the gap.
Deprecated since version 2.1.0.
- limit : int, default None
If method is specified, this is the maximum number of consecutive NaN values to forward/backward fill. In other words, if there is a gap with more than this number of consecutive NaNs, it will only be partially filled. If method is not specified, this is the maximum number of entries along the entire axis where NaNs will be filled.
Deprecated since version 2.1.0.
- copy : bool, default True
Whether to make a copy of the data before filling. If False, then the original should be modified and no new memory should be allocated. For ExtensionArray subclasses that cannot do this, it is at the author’s discretion whether to ignore “copy=False” or to raise. The base class implementation ignores the keyword in pad/backfill cases.
Returns¶
- ExtensionArray
With NA/NaN filled.
Examples¶
>>> arr = pd.array([np.nan, np.nan, 2, 3, np.nan, np.nan])
>>> arr.fillna(0)
<IntegerArray>
[0, 0, 2, 3, 0, 0]
Length: 6, dtype: Int64
- Parameters:
value (object | ArrayLike | None)
method (FillnaOptions | None)
limit (int | None)
copy (bool)
- Return type:
Self
- insert(loc, item)¶
Insert an item at the given position.
Parameters¶
loc : int
item : scalar-like
Returns¶
same type as self
Notes¶
This method should be both type and dtype-preserving. If the item cannot be held in an array of this type/dtype, either ValueError or TypeError should be raised.
The default implementation relies on _from_sequence to raise on invalid items.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.insert(2, -1)
<IntegerArray>
[1, 2, -1, 3]
Length: 4, dtype: Int64
- Parameters:
loc (int)
- Return type:
Self
- interpolate(*, method, axis, index, limit, limit_direction, limit_area, copy, **kwargs)¶
See DataFrame.interpolate.__doc__.
Examples¶
>>> arr = pd.arrays.NumpyExtensionArray(np.array([0, 1, np.nan, 3]))
>>> arr.interpolate(method="linear",
...                 limit=3,
...                 limit_direction="forward",
...                 index=pd.Index([1, 2, 3, 4]),
...                 fill_value=1,
...                 copy=False,
...                 axis=0,
...                 limit_area="inside"
...                 )
<NumpyExtensionArray>
[0.0, 1.0, 2.0, 3.0]
Length: 4, dtype: float64
- Parameters:
method (InterpolateOptions)
axis (int)
index (Index)
copy (bool)
- Return type:
Self
- isin(values)¶
Pointwise comparison for set containment in the given values.
Roughly equivalent to np.array([x in values for x in self])
Parameters¶
values : np.ndarray or ExtensionArray
Returns¶
np.ndarray[bool]
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.isin([1])
<BooleanArray>
[True, False, False]
Length: 3, dtype: boolean
- Parameters:
values (ArrayLike)
- Return type:
npt.NDArray[np.bool_]
- isna()¶
A 1-D array indicating if each value is missing.
Returns¶
- numpy.ndarray or pandas.api.extensions.ExtensionArray
In most cases, this should return a NumPy ndarray. For exceptional cases like SparseArray, where returning an ndarray would be expensive, an ExtensionArray may be returned.
Notes¶
If returning an ExtensionArray, then
- na_values._is_boolean should be True
- na_values should implement ExtensionArray._reduce()
- na_values.any and na_values.all should be implemented
Examples¶
>>> arr = pd.array([1, 2, np.nan, np.nan])
>>> arr.isna()
array([False, False, True, True])
- map(mapper, na_action=None)¶
Map values using an input mapping or function.
Parameters¶
- mapper : function, dict, or Series
Mapping correspondence.
- na_action : {None, ‘ignore’}, default None
If ‘ignore’, propagate NA values, without passing them to the mapping correspondence. If ‘ignore’ is not supported, a NotImplementedError should be raised.
Returns¶
- Union[ndarray, Index, ExtensionArray]
The output of the mapping function applied to the array. If the function returns a tuple with more than one element a MultiIndex will be returned.
- nbytes()¶
The number of bytes needed to store this object in memory.
Examples¶
>>> pd.array([1, 2, 3]).nbytes
27
- property ndim: int¶
Extension Arrays are only allowed to be 1-dimensional.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.ndim
1
- ravel(order='C')¶
Return a flattened view on this array.
Parameters¶
order : {None, ‘C’, ‘F’, ‘A’, ‘K’}, default ‘C’
Returns¶
ExtensionArray
Notes¶
Because ExtensionArrays are 1D-only, this is a no-op.
The “order” argument is ignored; it is accepted only for compatibility with NumPy.
Examples¶
>>> pd.array([1, 2, 3]).ravel()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- Parameters:
order (Literal['C', 'F', 'A', 'K'] | None)
- Return type:
ExtensionArray
- repeat(repeats, axis=None)¶
Repeat elements of an ExtensionArray.
Returns a new ExtensionArray where each element of the current ExtensionArray is repeated consecutively a given number of times.
Parameters¶
- repeats : int or array of ints
The number of repetitions for each element. This should be a non-negative integer. Repeating 0 times will return an empty ExtensionArray.
- axis : None
Must be None. Has no effect but is accepted for compatibility with numpy.
Returns¶
- ExtensionArray
Newly created ExtensionArray with repeated elements.
See Also¶
Series.repeat : Equivalent function for Series.
Index.repeat : Equivalent function for Index.
numpy.repeat : Similar method for numpy.ndarray.
ExtensionArray.take : Take arbitrary positions.
Examples¶
>>> cat = pd.Categorical(['a', 'b', 'c'])
>>> cat
['a', 'b', 'c']
Categories (3, object): ['a', 'b', 'c']
>>> cat.repeat(2)
['a', 'a', 'b', 'b', 'c', 'c']
Categories (3, object): ['a', 'b', 'c']
>>> cat.repeat([1, 2, 3])
['a', 'b', 'b', 'c', 'c', 'c']
Categories (3, object): ['a', 'b', 'c']
- Parameters:
repeats (int | Sequence[int])
axis (AxisInt | None)
- Return type:
Self
- searchsorted(value, side='left', sorter=None)¶
Find indices where elements should be inserted to maintain order.
Find the indices into a sorted array self (a) such that, if the corresponding elements in value were inserted before the indices, the order of self would be preserved.
Assuming that self is sorted:
side     returned index i satisfies
left     self[i-1] < value <= self[i]
right    self[i-1] <= value < self[i]
Parameters¶
- value : array-like, list or scalar
Value(s) to insert into self.
- side : {‘left’, ‘right’}, optional
If ‘left’, the index of the first suitable location found is given. If ‘right’, return the last such index. If there is no suitable index, return either 0 or N (where N is the length of self).
- sorter : 1-D array-like, optional
Optional array of integer indices that sort array a into ascending order. They are typically the result of argsort.
Returns¶
- array of ints or int
If value is array-like, array of insertion points. If value is scalar, a single integer.
See Also¶
numpy.searchsorted : Similar method from NumPy.
Examples¶
>>> arr = pd.array([1, 2, 3, 5])
>>> arr.searchsorted([4])
array([3])
- Parameters:
value (NumpyValueArrayLike | ExtensionArray)
side (Literal['left', 'right'])
sorter (NumpySorter | None)
- Return type:
npt.NDArray[np.intp] | np.intp
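The side and sorter semantics described for searchsorted can be sketched with the standard library's bisect module. This is a pure-Python illustration on a plain list, not how pandas or pymongoarrow implement the method; searchsorted_reference is a hypothetical helper name and it handles a scalar value only.

```python
from bisect import bisect_left, bisect_right


def searchsorted_reference(values, value, side="left", sorter=None):
    """Return the insertion point for a scalar ``value`` (hypothetical helper)."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    # When `sorter` is given, it lists the positions that put `values`
    # into ascending order (typically the result of an argsort).
    ordered = [values[i] for i in sorter] if sorter is not None else list(values)
    find = bisect_left if side == "left" else bisect_right
    return find(ordered, value)


data = [1, 2, 3, 5]
left_4 = searchsorted_reference(data, 4)                 # 3
left_3 = searchsorted_reference(data, 3, side="left")    # 2, before the existing 3
right_3 = searchsorted_reference(data, 3, side="right")  # 3, after the existing 3
# Unsorted input plus the indices that would sort it.
unsorted = searchsorted_reference([5, 1, 3, 2], 4, sorter=[1, 3, 2, 0])  # 3
```

With side='left' the new value goes before any equal elements, and with side='right' it goes after them, which is the only case where the two results differ.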
- property shape: Shape¶
Return a tuple of the array dimensions.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.shape
(3,)
- shift(periods=1, fill_value=None)¶
Shift values by desired number.
Newly introduced missing values are filled with self.dtype.na_value.
Parameters¶
- periods : int, default 1
The number of periods to shift. Negative values are allowed for shifting backwards.
- fill_value : object, optional
The scalar value to use for newly introduced missing values. The default is self.dtype.na_value.
Returns¶
- ExtensionArray
Shifted.
Notes¶
If self is empty or periods is 0, a copy of self is returned.
If periods > len(self), then an array of size len(self) is returned, with all values filled with self.dtype.na_value.
For 2-dimensional ExtensionArrays, we are always shifting along axis=0.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.shift(2)
<IntegerArray>
[<NA>, <NA>, 1]
Length: 3, dtype: Int64
- Parameters:
periods (int)
fill_value (object | None)
- Return type:
ExtensionArray
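The shift rules in the notes (a copy for empty input or periods of 0, and an all-NA result when periods exceeds the length) can be sketched on a plain list. This is an illustration only: shift_reference is a hypothetical helper, and None stands in for self.dtype.na_value.

```python
def shift_reference(values, periods=1, fill_value=None):
    """Shift a list by ``periods`` positions, padding with ``fill_value``."""
    n = len(values)
    if n == 0 or periods == 0:
        return list(values)  # a copy, as described in the notes
    # Never pad with more elements than the array holds.
    pad = [fill_value] * min(abs(periods), n)
    if periods > 0:
        # Shift forwards: pad at the front, drop from the end.
        return pad + list(values[: n - len(pad)])
    # Shift backwards: drop from the front, pad at the end.
    return list(values[len(pad):]) + pad


forward = shift_reference([1, 2, 3], 2)    # [None, None, 1]
backward = shift_reference([1, 2, 3], -1)  # [2, 3, None]
too_far = shift_reference([1, 2, 3], 5)    # [None, None, None]
```

The result always has the same length as the input, so values shifted past either end are discarded.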
- property size: int¶
The number of elements in the array.
- take(indexer, allow_fill=False, fill_value=None)¶
Take elements from an array.
Parameters¶
- indices : sequence of int or one-dimensional np.ndarray of int
Indices to be taken.
- allow_fill : bool, default False
How to handle negative values in indices.
False: negative values in indices indicate positional indices from the right (the default). This is similar to numpy.take().
True: -1 in indices indicates a missing value, which is set to fill_value. Any other negative values raise a ValueError.
- fill_value : any, optional
Fill value to use for NA-indices when allow_fill is True. This may be None, in which case the default NA value for the type, self.dtype.na_value, is used.
For many ExtensionArrays, there will be two representations of fill_value: a user-facing “boxed” scalar, and a low-level physical NA value. fill_value should be the user-facing version, and the implementation should handle translating that to the physical version for processing the take if necessary.
Returns¶
ExtensionArray
Raises¶
- IndexError
When the indices are out of bounds for the array.
- ValueError
When indices contains negative values other than -1 and allow_fill is True.
See Also¶
numpy.take : Take elements from an array along an axis.
api.extensions.take : Take elements from an array.
Notes¶
ExtensionArray.take is called by Series.__getitem__, .loc, iloc, when indices is a sequence of values. Additionally, it’s called by Series.reindex(), or any other method that causes realignment, with a fill_value.
Examples¶
Here’s an example implementation, which relies on casting the extension array to object dtype. This uses the helper method pandas.api.extensions.take().
def take(self, indices, allow_fill=False, fill_value=None):
    from pandas.api.extensions import take

    # If the ExtensionArray is backed by an ndarray, then
    # just pass that here instead of coercing to object.
    data = self.astype(object)

    if allow_fill and fill_value is None:
        fill_value = self.dtype.na_value

    # fill value should always be translated from the scalar
    # type for the array, to the physical storage type for
    # the data, before passing to take.
    result = take(data, indices, fill_value=fill_value,
                  allow_fill=allow_fill)
    return self._from_sequence(result, dtype=self.dtype)
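The difference between the two allow_fill modes can be sketched in pure Python. This is an illustration on a plain list, not the pandas or pymongoarrow implementation; take_reference is a hypothetical helper and None stands in for the default NA value.

```python
def take_reference(values, indices, allow_fill=False, fill_value=None):
    """Illustrate the documented ``take`` semantics on a plain list."""
    n = len(values)
    result = []
    for i in indices:
        if allow_fill and i == -1:
            # With allow_fill=True, -1 marks a missing value.
            result.append(fill_value)
        elif allow_fill and i < -1:
            raise ValueError("negative indices other than -1 are not allowed")
        elif not -n <= i < n:
            raise IndexError("index out of bounds")
        else:
            # With allow_fill=False, negatives count from the right.
            result.append(values[i])
    return result


letters = ["a", "b", "c"]
from_right = take_reference(letters, [0, -1])                # ['a', 'c']
as_missing = take_reference(letters, [0, -1], allow_fill=True)  # ['a', None]
```

The same indices therefore give different results depending on allow_fill, which is why reindexing-style callers pass allow_fill=True together with a fill_value.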
- to_numpy(dtype=None, copy=False, na_value=<no_default>)¶
Convert to a NumPy ndarray.
This is similar to numpy.asarray(), but may provide additional control over how the conversion is done.
Parameters¶
- dtype : str or numpy.dtype, optional
The dtype to pass to numpy.asarray().
- copy : bool, default False
Whether to ensure that the returned value is not a view on another array. Note that copy=False does not ensure that to_numpy() is no-copy. Rather, copy=True ensures that a copy is made, even if not strictly necessary.
- na_value : Any, optional
The value to use for missing values. The default value depends on dtype and the type of the array.
Returns¶
numpy.ndarray
- Parameters:
dtype (npt.DTypeLike | None)
copy (bool)
na_value (object)
- Return type:
np.ndarray
- tolist()¶
Return a list of the values.
These are each a scalar type, which is a Python scalar (for str, int, float) or a pandas scalar (for Timestamp/Timedelta/Interval/Period)
Returns¶
list
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.tolist()
[1, 2, 3]
- Return type:
list
- transpose(*axes)¶
Return a transposed view on this array.
Because ExtensionArrays are always 1D, this is a no-op. It is included for compatibility with np.ndarray.
Returns¶
ExtensionArray
Examples¶
>>> pd.array([1, 2, 3]).transpose()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- Parameters:
axes (int)
- Return type:
ExtensionArray
- unique()¶
Compute the ExtensionArray of unique values.
Returns¶
pandas.api.extensions.ExtensionArray
Examples¶
>>> arr = pd.array([1, 2, 3, 1, 2, 3])
>>> arr.unique()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- Return type:
Self
- view(dtype=None)¶
Return a view on the array.
Parameters¶
- dtype : str, np.dtype, or ExtensionDtype, optional
Default None.
Returns¶
- ExtensionArray or np.ndarray
A view on the ExtensionArray’s data.
Examples¶
This gives a view on the underlying data of an ExtensionArray and is not a copy. Modifications on either the view or the original ExtensionArray will be reflected in the underlying data:
>>> arr = pd.array([1, 2, 3])
>>> arr2 = arr.view()
>>> arr[0] = 2
>>> arr2
<IntegerArray>
[2, 2, 3]
Length: 3, dtype: Int64
- Parameters:
dtype (Dtype | None)
- Return type:
ArrayLike
- class pymongoarrow.pandas_types.PandasCode¶
A pandas extension type for BSON Code data type.
- classmethod construct_array_type()¶
Return the array type associated with this dtype.
Returns¶
type
- Return type:
type[PandasCodeArray]
- classmethod construct_from_string(string)¶
Construct this type from a string.
This is useful mainly for data types that accept parameters. For example, a period dtype accepts a frequency parameter that can be set as period[h] (where h means hourly frequency).
By default, in the abstract class, just the name of the type is expected. But subclasses can override this method to accept parameters.
Parameters¶
- string : str
The name of the type, for example category.
Returns¶
- ExtensionDtype
Instance of the dtype.
Raises¶
- TypeError
If a class cannot be constructed from this ‘string’.
Examples¶
For extension dtypes with arguments the following may be an adequate implementation.
>>> import re
>>> @classmethod
... def construct_from_string(cls, string):
...     pattern = re.compile(r"^my_type\[(?P<arg_name>.+)\]$")
...     match = pattern.match(string)
...     if match:
...         return cls(**match.groupdict())
...     else:
...         raise TypeError(
...             f"Cannot construct a '{cls.__name__}' from '{string}'"
...         )
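To see the parameterized pattern run end to end, the sketch below uses a plain Python class in place of an ExtensionDtype subclass, so it has no pandas dependency. MyType, its unit parameter, and the my_type[...] string format are all hypothetical and are not part of pymongoarrow.

```python
import re


class MyType:
    """Hypothetical stand-in for a parameterized ExtensionDtype subclass."""

    _pattern = re.compile(r"^my_type\[(?P<unit>.+)\]$")

    def __init__(self, unit):
        self.unit = unit

    @property
    def name(self):
        # The name round-trips through construct_from_string.
        return f"my_type[{self.unit}]"

    @classmethod
    def construct_from_string(cls, string):
        if not isinstance(string, str):
            raise TypeError(f"Expected a string, got {type(string).__name__}")
        match = cls._pattern.match(string)
        if match is None:
            raise TypeError(f"Cannot construct a '{cls.__name__}' from '{string}'")
        # Named groups in the pattern become constructor arguments.
        return cls(**match.groupdict())


dtype = MyType.construct_from_string("my_type[ms]")
round_trip = MyType.construct_from_string(dtype.name)
```

Raising TypeError for strings that do not match matters because, per the is_dtype notes, the default is_dtype implementation relies on construct_from_string to decide whether a string names this type.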
- empty(shape)¶
Construct an ExtensionArray of this dtype with the given shape.
Analogous to numpy.empty.
Parameters¶
shape : int or tuple[int]
Returns¶
ExtensionArray
- Parameters:
shape (Shape)
- Return type:
ExtensionArray
- index_class¶
The Index subclass to return from Index.__new__ when this dtype is encountered.
- classmethod is_dtype(dtype)¶
Check if we match ‘dtype’.
Parameters¶
- dtype : object
The object to check.
Returns¶
bool
Notes¶
The default implementation is True if
1. cls.construct_from_string(dtype) is an instance of cls.
2. dtype is an object and is an instance of cls
3. dtype has a dtype attribute, and any of the above conditions is true for dtype.dtype.
- Parameters:
dtype (object)
- Return type:
bool
- property kind: str¶
A character code (one of ‘biufcmMOSUV’), default ‘O’
This should match the NumPy dtype used when the array is converted to an ndarray, which is probably ‘O’ for object if the extension type cannot be represented as a built-in NumPy type.
See Also¶
numpy.dtype.kind
- property name: str¶
A string identifying the data type.
Will be used for display in, e.g. Series.dtype
- property names: list[str] | None¶
Ordered list of field names, or None if there are no fields.
This is for compatibility with NumPy arrays, and may be removed in the future.
- class pymongoarrow.pandas_types.PandasCodeArray(values, dtype, copy=False)¶
A pandas extension type for BSON Code data arrays.
- argmax(skipna=True)¶
Return the index of maximum value.
In case of multiple occurrences of the maximum value, the index corresponding to the first occurrence is returned.
Parameters¶
skipna : bool, default True
Returns¶
int
See Also¶
ExtensionArray.argmin : Return the index of the minimum value.
Examples¶
>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argmax()
3
- Parameters:
skipna (bool)
- Return type:
int
- argmin(skipna=True)¶
Return the index of minimum value.
In case of multiple occurrences of the minimum value, the index corresponding to the first occurrence is returned.
Parameters¶
skipna : bool, default True
Returns¶
int
See Also¶
ExtensionArray.argmax : Return the index of the maximum value.
Examples¶
>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argmin()
1
- Parameters:
skipna (bool)
- Return type:
int
- argsort(*, ascending=True, kind='quicksort', na_position='last', **kwargs)¶
Return the indices that would sort this array.
Parameters¶
- ascending : bool, default True
Whether the indices should result in an ascending or descending sort.
- kind : {‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, optional
Sorting algorithm.
- na_position : {‘first’, ‘last’}, default ‘last’
If 'first', put NaN values at the beginning. If 'last', put NaN values at the end.
- **kwargs
Passed through to numpy.argsort().
Returns¶
- np.ndarray[np.intp]
Array of indices that sort self. If NaN values are contained, they are placed according to na_position (at the end by default).
See Also¶
numpy.argsort : Sorting implementation used internally.
Examples¶
>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argsort()
array([1, 2, 0, 4, 3])
- Parameters:
ascending (bool)
kind (SortKind)
na_position (str)
- Return type:
np.ndarray
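How ascending and na_position interact can be sketched on a plain list. This is an illustration only, not the sorting routine pandas uses; argsort_reference is a hypothetical helper, and None or a float NaN stands in for a missing value.

```python
import math


def argsort_reference(values, ascending=True, na_position="last"):
    """Return the indices that would sort ``values`` (hypothetical helper)."""

    def is_na(x):
        return x is None or (isinstance(x, float) and math.isnan(x))

    # Missing values are set aside and never compared with real values.
    na_idx = [i for i, v in enumerate(values) if is_na(v)]
    ok_idx = [i for i, v in enumerate(values) if not is_na(v)]
    ok_idx.sort(key=lambda i: values[i], reverse=not ascending)
    return na_idx + ok_idx if na_position == "first" else ok_idx + na_idx


plain = argsort_reference([3, 1, 2, 5, 4])                    # [1, 2, 0, 4, 3]
na_last = argsort_reference([3, None, 1])                     # [2, 0, 1]
na_first = argsort_reference([3, None, 1], na_position="first")  # [1, 2, 0]
descending = argsort_reference([3, None, 1], ascending=False)  # [0, 2, 1]
```

Note that in this sketch the position of missing values is controlled only by na_position, independently of ascending.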
- astype(dtype, copy=True)¶
Cast to a NumPy array or ExtensionArray with ‘dtype’.
Parameters¶
- dtype : str or dtype
Typecode or data-type to which the array is cast.
- copy : bool, default True
Whether to copy the data, even if not necessary. If False, a copy is made only if the old dtype does not match the new dtype.
Returns¶
- np.ndarray or pandas.api.extensions.ExtensionArray
An ExtensionArray if dtype is an ExtensionDtype, otherwise a NumPy ndarray with dtype for its dtype.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
Casting to another ExtensionDtype returns an ExtensionArray:
>>> arr1 = arr.astype('Float64')
>>> arr1
<FloatingArray>
[1.0, 2.0, 3.0]
Length: 3, dtype: Float64
>>> arr1.dtype
Float64Dtype()
Otherwise, we will get a NumPy ndarray:
>>> arr2 = arr.astype('float64')
>>> arr2
array([1., 2., 3.])
>>> arr2.dtype
dtype('float64')
- Parameters:
dtype (AstypeArg)
copy (bool)
- Return type:
ArrayLike
- copy()¶
Return a copy of the array.
Returns¶
ExtensionArray
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr2 = arr.copy()
>>> arr[0] = 2
>>> arr2
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- dropna()¶
Return ExtensionArray without NA values.
Returns¶
Examples¶
>>> pd.array([1, 2, np.nan]).dropna()
<IntegerArray>
[1, 2]
Length: 2, dtype: Int64
- Return type:
Self
- duplicated(keep='first')¶
Return boolean ndarray denoting duplicate values.
Parameters¶
- keep : {‘first’, ‘last’, False}, default ‘first’
first : Mark duplicates as True except for the first occurrence.
last : Mark duplicates as True except for the last occurrence.
False : Mark all duplicates as True.
Returns¶
ndarray[bool]
Examples¶
>>> pd.array([1, 1, 2, 3, 3], dtype="Int64").duplicated()
array([False, True, False, False, True])
- Parameters:
keep (Literal['first', 'last', False])
- Return type:
npt.NDArray[np.bool_]
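The three keep options can be compared side by side with a pure-Python sketch. This illustrates the documented behavior on a plain list of hashable values and is not the pandas implementation; duplicated_reference is a hypothetical helper.

```python
def duplicated_reference(values, keep="first"):
    """Return a list of booleans marking duplicates (hypothetical helper)."""
    if keep is False:
        # Every value that occurs more than once is marked.
        counts = {}
        for v in values:
            counts[v] = counts.get(v, 0) + 1
        return [counts[v] > 1 for v in values]
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first', 'last' or False")
    n = len(values)
    # Walk from the end that should stay unmarked.
    order = range(n) if keep == "first" else range(n - 1, -1, -1)
    seen = set()
    flags = [False] * n
    for i in order:
        if values[i] in seen:
            flags[i] = True
        seen.add(values[i])
    return flags


data = [1, 1, 2, 3, 3]
keep_first = duplicated_reference(data)               # [False, True, False, False, True]
keep_last = duplicated_reference(data, keep="last")   # [True, False, False, True, False]
keep_none = duplicated_reference(data, keep=False)    # [True, True, False, True, True]
```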
- equals(other)¶
Return if another array is equivalent to this array.
Equivalent means that both arrays have the same shape and dtype, and all values compare equal. Missing values in the same location are considered equal (in contrast with normal equality).
Parameters¶
- other : ExtensionArray
Array to compare to this Array.
Returns¶
- boolean
Whether the arrays are equivalent.
Examples¶
>>> arr1 = pd.array([1, 2, np.nan])
>>> arr2 = pd.array([1, 2, np.nan])
>>> arr1.equals(arr2)
True
- Parameters:
other (object)
- Return type:
bool
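The point that missing values in the same location count as equal, unlike ordinary equality, can be sketched in pure Python. This is an illustration on plain lists, with the container type standing in for the dtype check; equals_reference and _is_na are hypothetical helpers.

```python
import math


def _is_na(x):
    return x is None or (isinstance(x, float) and math.isnan(x))


def equals_reference(left, right):
    """NA-aware equivalence check on two sequences (hypothetical helper)."""
    if type(left) is not type(right) or len(left) != len(right):
        return False
    for a, b in zip(left, right):
        if _is_na(a) or _is_na(b):
            # A missing value only matches another missing value.
            if not (_is_na(a) and _is_na(b)):
                return False
        elif a != b:
            return False
    return True


nan_vs_nan = float("nan") == float("nan")  # False under normal equality
same = equals_reference([1.0, 2.0, float("nan")], [1.0, 2.0, float("nan")])  # True
different = equals_reference([1.0, 2.0], [1.0, 3.0])  # False
```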
- factorize(use_na_sentinel=True)¶
Encode the extension array as an enumerated type.
Parameters¶
- use_na_sentinel : bool, default True
If True, the sentinel -1 will be used for NaN values. If False, NaN values will be encoded as non-negative integers and will not drop the NaN from the uniques of the values.
Added in version 1.5.0.
Returns¶
- codes : ndarray
An integer NumPy array that’s an indexer into the original ExtensionArray.
- uniques : ExtensionArray
An ExtensionArray containing the unique values of self.
Note
uniques will not contain an entry for the NA value of the ExtensionArray if there are any missing values present in self.
See Also¶
factorize : Top-level factorize method that dispatches here.
Notes¶
pandas.factorize() offers a sort keyword as well.
Examples¶
>>> idx1 = pd.PeriodIndex(["2014-01", "2014-01", "2014-02", "2014-02",
...                        "2014-03", "2014-03"], freq="M")
>>> arr, idx = idx1.factorize()
>>> arr
array([0, 0, 1, 1, 2, 2])
>>> idx
PeriodIndex(['2014-01', '2014-02', '2014-03'], dtype='period[M]')
- Parameters:
use_na_sentinel (bool)
- Return type:
tuple[ndarray, ExtensionArray]
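The relationship between codes, uniques and use_na_sentinel can be sketched in pure Python. This is an illustration on a plain list, not the pandas implementation; factorize_reference is a hypothetical helper, and None or a float NaN stands in for a missing value.

```python
import math


def factorize_reference(values, use_na_sentinel=True):
    """Return (codes, uniques) in order of first appearance."""

    def is_na(x):
        return x is None or (isinstance(x, float) and math.isnan(x))

    na_key = object()  # one shared dictionary key for every missing value
    codes, uniques, positions = [], [], {}
    for v in values:
        if is_na(v) and use_na_sentinel:
            codes.append(-1)  # sentinel: no entry is added to uniques
            continue
        key = na_key if is_na(v) else v
        if key not in positions:
            positions[key] = len(uniques)
            uniques.append(v)
        codes.append(positions[key])
    return codes, uniques


codes, uniques = factorize_reference(["b", "a", None, "b"])
# codes == [0, 1, -1, 0], uniques == ["b", "a"]
codes_na, uniques_na = factorize_reference(["b", "a", None, "b"], use_na_sentinel=False)
# codes_na == [0, 1, 2, 0], uniques_na == ["b", "a", None]
```

For every non-negative code, uniques[code] gives back the original value, which is what makes codes an indexer into uniques.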
- fillna(value=None, method=None, limit=None, copy=True)¶
Fill NA/NaN values using the specified method.
Parameters¶
- value : scalar, array-like
If a scalar value is passed it is used to fill all missing values. Alternatively, an array-like “value” can be given. It’s expected that the array-like have the same length as ‘self’.
- method : {‘backfill’, ‘bfill’, ‘pad’, ‘ffill’, None}, default None
Method to use for filling holes in reindexed Series:
pad / ffill: propagate last valid observation forward to next valid.
backfill / bfill: use NEXT valid observation to fill gap.
Deprecated since version 2.1.0.
- limit : int, default None
If method is specified, this is the maximum number of consecutive NaN values to forward/backward fill. In other words, if there is a gap with more than this number of consecutive NaNs, it will only be partially filled. If method is not specified, this is the maximum number of entries along the entire axis where NaNs will be filled.
Deprecated since version 2.1.0.
- copy : bool, default True
Whether to make a copy of the data before filling. If False, then the original should be modified and no new memory should be allocated. For ExtensionArray subclasses that cannot do this, it is at the author’s discretion whether to ignore “copy=False” or to raise. The base class implementation ignores the keyword in pad/backfill cases.
Returns¶
- ExtensionArray
With NA/NaN filled.
Examples¶
>>> arr = pd.array([np.nan, np.nan, 2, 3, np.nan, np.nan])
>>> arr.fillna(0)
<IntegerArray>
[0, 0, 2, 3, 0, 0]
Length: 6, dtype: Int64
- Parameters:
value (object | ArrayLike | None)
method (FillnaOptions | None)
limit (int | None)
copy (bool)
- Return type:
Self
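The way limit caps a pad / ffill run, leaving longer gaps only partially filled, can be sketched in pure Python. This is an illustration on a plain list, not the pandas implementation; ffill_reference is a hypothetical helper and None stands in for NA.

```python
def ffill_reference(values, limit=None):
    """Forward-fill None entries, filling at most ``limit`` in a row."""
    result = list(values)  # work on a copy
    last_valid = None
    have_valid = False
    run = 0  # consecutive missing values seen since the last valid one
    for i, v in enumerate(result):
        if v is None:
            run += 1
            if have_valid and (limit is None or run <= limit):
                result[i] = last_valid
        else:
            last_valid, have_valid, run = v, True, 0
    return result


gappy = [None, 1, None, None, None, 4, None]
unlimited = ffill_reference(gappy)         # [None, 1, 1, 1, 1, 4, 4]
capped = ffill_reference(gappy, limit=2)   # [None, 1, 1, 1, None, 4, 4]
```

The leading None stays missing in both cases because there is no earlier valid observation to propagate.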
- insert(loc, item)¶
Insert an item at the given position.
Parameters¶
loc : int
item : scalar-like
Returns¶
same type as self
Notes¶
This method should be both type and dtype-preserving. If the item cannot be held in an array of this type/dtype, either ValueError or TypeError should be raised.
The default implementation relies on _from_sequence to raise on invalid items.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.insert(2, -1)
<IntegerArray>
[1, 2, -1, 3]
Length: 4, dtype: Int64
- Parameters:
loc (int)
- Return type:
Self
- interpolate(*, method, axis, index, limit, limit_direction, limit_area, copy, **kwargs)¶
See DataFrame.interpolate.__doc__.
Examples¶
>>> arr = pd.arrays.NumpyExtensionArray(np.array([0, 1, np.nan, 3]))
>>> arr.interpolate(method="linear",
...                 limit=3,
...                 limit_direction="forward",
...                 index=pd.Index([1, 2, 3, 4]),
...                 fill_value=1,
...                 copy=False,
...                 axis=0,
...                 limit_area="inside"
...                 )
<NumpyExtensionArray>
[0.0, 1.0, 2.0, 3.0]
Length: 4, dtype: float64
- Parameters:
method (InterpolateOptions)
axis (int)
index (Index)
copy (bool)
- Return type:
Self
- isin(values)¶
Pointwise comparison for set containment in the given values.
Roughly equivalent to np.array([x in values for x in self])
Parameters¶
values : np.ndarray or ExtensionArray
Returns¶
np.ndarray[bool]
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.isin([1])
<BooleanArray>
[True, False, False]
Length: 3, dtype: boolean
- Parameters:
values (ArrayLike)
- Return type:
npt.NDArray[np.bool_]
- isna()¶
A 1-D array indicating if each value is missing.
Returns¶
- numpy.ndarray or pandas.api.extensions.ExtensionArray
In most cases, this should return a NumPy ndarray. For exceptional cases like SparseArray, where returning an ndarray would be expensive, an ExtensionArray may be returned.
Notes¶
If returning an ExtensionArray, then
- na_values._is_boolean should be True
- na_values should implement ExtensionArray._reduce()
- na_values.any and na_values.all should be implemented
Examples¶
>>> arr = pd.array([1, 2, np.nan, np.nan])
>>> arr.isna()
array([False, False, True, True])
- map(mapper, na_action=None)¶
Map values using an input mapping or function.
Parameters¶
- mapper : function, dict, or Series
Mapping correspondence.
- na_action : {None, ‘ignore’}, default None
If ‘ignore’, propagate NA values, without passing them to the mapping correspondence. If ‘ignore’ is not supported, a NotImplementedError should be raised.
Returns¶
- Union[ndarray, Index, ExtensionArray]
The output of the mapping function applied to the array. If the function returns a tuple with more than one element a MultiIndex will be returned.
- nbytes()¶
The number of bytes needed to store this object in memory.
Examples¶
>>> pd.array([1, 2, 3]).nbytes
27
- property ndim: int¶
Extension Arrays are only allowed to be 1-dimensional.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.ndim
1
- ravel(order='C')¶
Return a flattened view on this array.
Parameters¶
order : {None, ‘C’, ‘F’, ‘A’, ‘K’}, default ‘C’
Returns¶
ExtensionArray
Notes¶
Because ExtensionArrays are 1D-only, this is a no-op.
The “order” argument is ignored; it is accepted only for compatibility with NumPy.
Examples¶
>>> pd.array([1, 2, 3]).ravel()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- Parameters:
order (Literal['C', 'F', 'A', 'K'] | None)
- Return type:
ExtensionArray
- repeat(repeats, axis=None)¶
Repeat elements of an ExtensionArray.
Returns a new ExtensionArray where each element of the current ExtensionArray is repeated consecutively a given number of times.
Parameters¶
- repeats : int or array of ints
The number of repetitions for each element. This should be a non-negative integer. Repeating 0 times will return an empty ExtensionArray.
- axis : None
Must be None. Has no effect but is accepted for compatibility with numpy.
Returns¶
- ExtensionArray
Newly created ExtensionArray with repeated elements.
See Also¶
Series.repeat : Equivalent function for Series.
Index.repeat : Equivalent function for Index.
numpy.repeat : Similar method for numpy.ndarray.
ExtensionArray.take : Take arbitrary positions.
Examples¶
>>> cat = pd.Categorical(['a', 'b', 'c'])
>>> cat
['a', 'b', 'c']
Categories (3, object): ['a', 'b', 'c']
>>> cat.repeat(2)
['a', 'a', 'b', 'b', 'c', 'c']
Categories (3, object): ['a', 'b', 'c']
>>> cat.repeat([1, 2, 3])
['a', 'b', 'b', 'c', 'c', 'c']
Categories (3, object): ['a', 'b', 'c']
- Parameters:
repeats (int | Sequence[int])
axis (AxisInt | None)
- Return type:
Self
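The scalar and per-element forms of repeats can be sketched in pure Python. This is an illustration on a plain list, not the pandas implementation; repeat_reference is a hypothetical helper.

```python
def repeat_reference(values, repeats):
    """Repeat each element consecutively (hypothetical helper)."""
    if isinstance(repeats, int):
        # A scalar applies the same count to every element.
        repeats = [repeats] * len(values)
    if len(repeats) != len(values):
        raise ValueError("repeats must match the length of values")
    if any(r < 0 for r in repeats):
        raise ValueError("repeats must be non-negative")
    return [v for v, r in zip(values, repeats) for _ in range(r)]


letters = ["a", "b", "c"]
doubled = repeat_reference(letters, 2)             # ['a', 'a', 'b', 'b', 'c', 'c']
per_item = repeat_reference(letters, [1, 2, 3])    # ['a', 'b', 'b', 'c', 'c', 'c']
emptied = repeat_reference(letters, 0)             # []
```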
- searchsorted(value, side='left', sorter=None)¶
Find indices where elements should be inserted to maintain order.
Find the indices into a sorted array self (a) such that, if the corresponding elements in value were inserted before the indices, the order of self would be preserved.
Assuming that self is sorted:
side     returned index i satisfies
left     self[i-1] < value <= self[i]
right    self[i-1] <= value < self[i]
Parameters¶
- value : array-like, list or scalar
Value(s) to insert into self.
- side : {‘left’, ‘right’}, optional
If ‘left’, the index of the first suitable location found is given. If ‘right’, return the last such index. If there is no suitable index, return either 0 or N (where N is the length of self).
- sorter : 1-D array-like, optional
Optional array of integer indices that sort array a into ascending order. They are typically the result of argsort.
Returns¶
- array of ints or int
If value is array-like, array of insertion points. If value is scalar, a single integer.
See Also¶
numpy.searchsorted : Similar method from NumPy.
Examples¶
>>> arr = pd.array([1, 2, 3, 5])
>>> arr.searchsorted([4])
array([3])
- Parameters:
value (NumpyValueArrayLike | ExtensionArray)
side (Literal['left', 'right'])
sorter (NumpySorter | None)
- Return type:
npt.NDArray[np.intp] | np.intp
- property shape: Shape¶
Return a tuple of the array dimensions.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.shape
(3,)
- shift(periods=1, fill_value=None)¶
Shift values by desired number.
Newly introduced missing values are filled with self.dtype.na_value.
Parameters¶
- periods : int, default 1
The number of periods to shift. Negative values are allowed for shifting backwards.
- fill_value : object, optional
The scalar value to use for newly introduced missing values. The default is self.dtype.na_value.
Returns¶
- ExtensionArray
Shifted.
Notes¶
If self is empty or periods is 0, a copy of self is returned.
If periods > len(self), then an array of size len(self) is returned, with all values filled with self.dtype.na_value.
For 2-dimensional ExtensionArrays, we are always shifting along axis=0.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.shift(2)
<IntegerArray>
[<NA>, <NA>, 1]
Length: 3, dtype: Int64
- Parameters:
periods (int)
fill_value (object | None)
- Return type:
ExtensionArray
- property size: int¶
The number of elements in the array.
- take(indexer, allow_fill=False, fill_value=None)¶
Take elements from an array.
Parameters¶
- indices : sequence of int or one-dimensional np.ndarray of int
Indices to be taken.
- allow_fill : bool, default False
How to handle negative values in indices.
False: negative values in indices indicate positional indices from the right (the default). This is similar to numpy.take().
True: -1 in indices indicates a missing value, which is set to fill_value. Any other negative values raise a ValueError.
- fill_value : any, optional
Fill value to use for NA-indices when allow_fill is True. This may be None, in which case the default NA value for the type, self.dtype.na_value, is used.
For many ExtensionArrays, there will be two representations of fill_value: a user-facing “boxed” scalar, and a low-level physical NA value. fill_value should be the user-facing version, and the implementation should handle translating that to the physical version for processing the take if necessary.
Returns¶
ExtensionArray
Raises¶
- IndexError
When the indices are out of bounds for the array.
- ValueError
When indices contains negative values other than -1 and allow_fill is True.
See Also¶
numpy.take : Take elements from an array along an axis.
api.extensions.take : Take elements from an array.
Notes¶
ExtensionArray.take is called by Series.__getitem__, .loc, iloc, when indices is a sequence of values. Additionally, it’s called by Series.reindex(), or any other method that causes realignment, with a fill_value.
Examples¶
Here’s an example implementation, which relies on casting the extension array to object dtype. This uses the helper method pandas.api.extensions.take().
def take(self, indices, allow_fill=False, fill_value=None):
    from pandas.api.extensions import take

    # If the ExtensionArray is backed by an ndarray, then
    # just pass that here instead of coercing to object.
    data = self.astype(object)

    if allow_fill and fill_value is None:
        fill_value = self.dtype.na_value

    # fill value should always be translated from the scalar
    # type for the array, to the physical storage type for
    # the data, before passing to take.
    result = take(data, indices, fill_value=fill_value,
                  allow_fill=allow_fill)
    return self._from_sequence(result, dtype=self.dtype)
- to_numpy(dtype=None, copy=False, na_value=<no_default>)¶
Convert to a NumPy ndarray.
This is similar to numpy.asarray(), but may provide additional control over how the conversion is done.
Parameters¶
- dtype : str or numpy.dtype, optional
The dtype to pass to numpy.asarray().
- copy : bool, default False
Whether to ensure that the returned value is not a view on another array. Note that copy=False does not ensure that to_numpy() is no-copy. Rather, copy=True ensures that a copy is made, even if not strictly necessary.
- na_value : Any, optional
The value to use for missing values. The default value depends on dtype and the type of the array.
Returns¶
numpy.ndarray
- Parameters:
dtype (npt.DTypeLike | None)
copy (bool)
na_value (object)
- Return type:
np.ndarray
- tolist()¶
Return a list of the values.
These are each a scalar type, which is a Python scalar (for str, int, float) or a pandas scalar (for Timestamp/Timedelta/Interval/Period)
Returns¶
list
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.tolist()
[1, 2, 3]
- Return type:
list
- transpose(*axes)¶
Return a transposed view on this array.
Because ExtensionArrays are always 1D, this is a no-op. It is included for compatibility with np.ndarray.
Returns¶
ExtensionArray
Examples¶
>>> pd.array([1, 2, 3]).transpose()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- Parameters:
axes (int)
- Return type:
ExtensionArray
- unique()¶
Compute the ExtensionArray of unique values.
Returns¶
pandas.api.extensions.ExtensionArray
Examples¶
>>> arr = pd.array([1, 2, 3, 1, 2, 3])
>>> arr.unique()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- Return type:
Self
- view(dtype=None)¶
Return a view on the array.
Parameters¶
- dtype : str, np.dtype, or ExtensionDtype, optional
Default None.
Returns¶
- ExtensionArray or np.ndarray
A view on the ExtensionArray’s data.
Examples¶
This gives a view on the underlying data of an ExtensionArray and is not a copy. Modifications on either the view or the original ExtensionArray will be reflected in the underlying data:
>>> arr = pd.array([1, 2, 3])
>>> arr2 = arr.view()
>>> arr[0] = 2
>>> arr2
<IntegerArray>
[2, 2, 3]
Length: 3, dtype: Int64
- Parameters:
dtype (Dtype | None)
- Return type:
ArrayLike
- class pymongoarrow.pandas_types.PandasDecimal128¶
A pandas extension type for BSON Decimal128 data type.
- classmethod construct_array_type()¶
Return the array type associated with this dtype.
Returns¶
type
- Return type:
type[PandasDecimal128Array]
- classmethod construct_from_string(string)¶
Construct this type from a string.
This is useful mainly for data types that accept parameters. For example, a period dtype accepts a frequency parameter that can be set as period[h] (where h means hourly frequency).
By default, in the abstract class, just the name of the type is expected. But subclasses can override this method to accept parameters.
Parameters¶
- string : str
The name of the type, for example category.
Returns¶
- ExtensionDtype
Instance of the dtype.
Raises¶
- TypeError
If a class cannot be constructed from this ‘string’.
Examples¶
For extension dtypes with arguments the following may be an adequate implementation.
>>> import re
>>> @classmethod
... def construct_from_string(cls, string):
...     pattern = re.compile(r"^my_type\[(?P<arg_name>.+)\]$")
...     match = pattern.match(string)
...     if match:
...         return cls(**match.groupdict())
...     else:
...         raise TypeError(
...             f"Cannot construct a '{cls.__name__}' from '{string}'"
...         )
- empty(shape)¶
Construct an ExtensionArray of this dtype with the given shape.
Analogous to numpy.empty.
Parameters¶
shape : int or tuple[int]
Returns¶
ExtensionArray
- Parameters:
shape (Shape)
- Return type:
ExtensionArray
- index_class¶
The Index subclass to return from Index.__new__ when this dtype is encountered.
- classmethod is_dtype(dtype)¶
Check if we match ‘dtype’.
Parameters¶
- dtypeobject
The object to check.
Returns¶
bool
Notes¶
The default implementation is True if
1. cls.construct_from_string(dtype) is an instance of cls.
2. dtype is an object and is an instance of cls.
3. dtype has a dtype attribute, and any of the above conditions is true for dtype.dtype.
- Parameters:
dtype (object)
- Return type:
bool
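The three rules in the Notes above can be modelled in a few lines of plain Python. This is only a sketch of the documented behaviour, not the pandas or pymongoarrow implementation; ToyDtype and ToyArray are hypothetical stand-ins introduced for illustration.

```python
class ToyDtype:
    """Hypothetical stand-in for an extension dtype (not part of pymongoarrow)."""

    name = "toy"

    @classmethod
    def construct_from_string(cls, string):
        # Only the bare type name is accepted, as in the abstract base class.
        if not isinstance(string, str) or string != cls.name:
            raise TypeError(f"Cannot construct a '{cls.__name__}' from '{string}'")
        return cls()

    @classmethod
    def is_dtype(cls, dtype):
        # Rule 3: unwrap objects that carry their own ``dtype`` attribute.
        dtype = getattr(dtype, "dtype", dtype)
        # Rule 2: an instance of the class matches directly.
        if isinstance(dtype, cls):
            return True
        # Rule 1: a string matches if the class can be constructed from it.
        if isinstance(dtype, str):
            try:
                return isinstance(cls.construct_from_string(dtype), cls)
            except TypeError:
                return False
        return False


class ToyArray:
    """Hypothetical container that exposes a ``dtype`` attribute."""

    dtype = ToyDtype()


print(ToyDtype.is_dtype("toy"))       # matched through construct_from_string
print(ToyDtype.is_dtype(ToyArray()))  # matched through the dtype attribute
print(ToyDtype.is_dtype("int64"))     # no rule applies
```

Unwrapping the dtype attribute first keeps the remaining two checks identical for dtypes and for array-like objects.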
- property kind: str¶
A character code (one of ‘biufcmMOSUV’), default ‘O’
This should match the NumPy dtype used when the array is converted to an ndarray, which is probably ‘O’ for object if the extension type cannot be represented as a built-in NumPy type.
See Also¶
numpy.dtype.kind
- property name: str¶
A string identifying the data type.
Will be used for display in, e.g.
Series.dtype
- property names: list[str] | None¶
Ordered list of field names, or None if there are no fields.
This is for compatibility with NumPy arrays, and may be removed in the future.
- type¶
alias of
Decimal128
- class pymongoarrow.pandas_types.PandasDecimal128Array(values, dtype, copy=False)¶
A pandas extension array for BSON Decimal128 data.
- argmax(skipna=True)¶
Return the index of maximum value.
In case of multiple occurrences of the maximum value, the index corresponding to the first occurrence is returned.
Parameters¶
skipna : bool, default True
Returns¶
int
See Also¶
ExtensionArray.argmin : Return the index of the minimum value.
Examples¶
>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argmax()
3
- Parameters:
skipna (bool)
- Return type:
int
- argmin(skipna=True)¶
Return the index of minimum value.
In case of multiple occurrences of the minimum value, the index corresponding to the first occurrence is returned.
Parameters¶
skipna : bool, default True
Returns¶
int
See Also¶
ExtensionArray.argmax : Return the index of the maximum value.
Examples¶
>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argmin()
1
- Parameters:
skipna (bool)
- Return type:
int
- argsort(*, ascending=True, kind='quicksort', na_position='last', **kwargs)¶
Return the indices that would sort this array.
Parameters¶
- ascendingbool, default True
Whether the indices should result in an ascending or descending sort.
- kind{‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, optional
Sorting algorithm.
- na_position{‘first’, ‘last’}, default ‘last’
If 'first', put NaN values at the beginning. If 'last', put NaN values at the end.
- *args, **kwargs:
Passed through to numpy.argsort().
Returns¶
- np.ndarray[np.intp]
Array of indices that sort
self. If NaN values are contained, NaN values are placed at the end.
See Also¶
numpy.argsort : Sorting implementation used internally.
Examples¶
>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argsort()
array([1, 2, 0, 4, 3])
- Parameters:
ascending (bool)
kind (SortKind)
na_position (str)
- Return type:
np.ndarray
- astype(dtype, copy=True)¶
Cast to a NumPy array or ExtensionArray with ‘dtype’.
Parameters¶
- dtypestr or dtype
Typecode or data-type to which the array is cast.
- copybool, default True
Whether to copy the data, even if not necessary. If False, a copy is made only if the old dtype does not match the new dtype.
Returns¶
- np.ndarray or pandas.api.extensions.ExtensionArray
An ExtensionArray if dtype is ExtensionDtype, otherwise a NumPy ndarray with dtype for its dtype.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
Casting to another ExtensionDtype returns an ExtensionArray:
>>> arr1 = arr.astype('Float64')
>>> arr1
<FloatingArray>
[1.0, 2.0, 3.0]
Length: 3, dtype: Float64
>>> arr1.dtype
Float64Dtype()
Otherwise, we will get a NumPy ndarray:
>>> arr2 = arr.astype('float64')
>>> arr2
array([1., 2., 3.])
>>> arr2.dtype
dtype('float64')
- Parameters:
dtype (AstypeArg)
copy (bool)
- Return type:
ArrayLike
- copy()¶
Return a copy of the array.
Returns¶
ExtensionArray
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr2 = arr.copy()
>>> arr[0] = 2
>>> arr2
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- dropna()¶
Return ExtensionArray without NA values.
Returns¶
Examples¶
>>> pd.array([1, 2, np.nan]).dropna()
<IntegerArray>
[1, 2]
Length: 2, dtype: Int64
- Return type:
Self
- duplicated(keep='first')¶
Return boolean ndarray denoting duplicate values.
Parameters¶
- keep{‘first’, ‘last’, False}, default ‘first’
first: Mark duplicates as True except for the first occurrence.
last: Mark duplicates as True except for the last occurrence.
False: Mark all duplicates as True.
Returns¶
ndarray[bool]
Examples¶
>>> pd.array([1, 1, 2, 3, 3], dtype="Int64").duplicated()
array([False, True, False, False, True])
- Parameters:
keep (Literal['first', 'last', False])
- Return type:
npt.NDArray[np.bool_]
- equals(other)¶
Return if another array is equivalent to this array.
Equivalent means that both arrays have the same shape and dtype, and all values compare equal. Missing values in the same location are considered equal (in contrast with normal equality).
Parameters¶
- otherExtensionArray
Array to compare to this Array.
Returns¶
- boolean
Whether the arrays are equivalent.
Examples¶
>>> arr1 = pd.array([1, 2, np.nan])
>>> arr2 = pd.array([1, 2, np.nan])
>>> arr1.equals(arr2)
True
- Parameters:
other (object)
- Return type:
bool
- factorize(use_na_sentinel=True)¶
Encode the extension array as an enumerated type.
Parameters¶
- use_na_sentinelbool, default True
If True, the sentinel -1 will be used for NaN values. If False, NaN values will be encoded as non-negative integers and will not drop the NaN from the uniques of the values.
Added in version 1.5.0.
Returns¶
- codesndarray
An integer NumPy array that’s an indexer into the original ExtensionArray.
- uniquesExtensionArray
An ExtensionArray containing the unique values of self.
Note
uniques will not contain an entry for the NA value of the ExtensionArray if there are any missing values present in self.
See Also¶
factorize : Top-level factorize method that dispatches here.
Notes¶
pandas.factorize() offers a sort keyword as well.
Examples¶
>>> idx1 = pd.PeriodIndex(["2014-01", "2014-01", "2014-02", "2014-02",
...                       "2014-03", "2014-03"], freq="M")
>>> arr, idx = idx1.factorize()
>>> arr
array([0, 0, 1, 1, 2, 2])
>>> idx
PeriodIndex(['2014-01', '2014-02', '2014-03'], dtype='period[M]')
- Parameters:
use_na_sentinel (bool)
- Return type:
tuple[ndarray, ExtensionArray]
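The effect of use_na_sentinel can be illustrated with a small pure-Python model. This is a sketch of the documented semantics only, assuming None marks a missing value and using decimal.Decimal as a stand-in for Decimal128 values; factorize_sketch is a hypothetical helper, not a pandas or pymongoarrow function.

```python
from decimal import Decimal


def factorize_sketch(values, use_na_sentinel=True):
    """Return (codes, uniques) for a list in which None marks a missing value."""
    uniques = []     # unique values in order of first appearance
    positions = {}   # value -> index into uniques
    codes = []
    for v in values:
        if v is None and use_na_sentinel:
            # Missing values get the sentinel and stay out of uniques.
            codes.append(-1)
            continue
        if v not in positions:
            positions[v] = len(uniques)
            uniques.append(v)
        codes.append(positions[v])
    return codes, uniques


data = [Decimal("1.5"), None, Decimal("2.5"), Decimal("1.5")]
print(factorize_sketch(data))                         # sentinel -1 for the missing entry
print(factorize_sketch(data, use_na_sentinel=False))  # missing entry gets its own code
```

Each non-negative code is an index into uniques, so the original values can be rebuilt from the pair.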
- fillna(value=None, method=None, limit=None, copy=True)¶
Fill NA/NaN values using the specified method.
Parameters¶
- valuescalar, array-like
If a scalar value is passed it is used to fill all missing values. Alternatively, an array-like “value” can be given. It’s expected that the array-like have the same length as ‘self’.
- method{‘backfill’, ‘bfill’, ‘pad’, ‘ffill’, None}, default None
Method to use for filling holes in reindexed Series:
pad / ffill: propagate last valid observation forward to next valid.
backfill / bfill: use NEXT valid observation to fill gap.
Deprecated since version 2.1.0.
- limitint, default None
If method is specified, this is the maximum number of consecutive NaN values to forward/backward fill. In other words, if there is a gap with more than this number of consecutive NaNs, it will only be partially filled. If method is not specified, this is the maximum number of entries along the entire axis where NaNs will be filled.
Deprecated since version 2.1.0.
- copybool, default True
Whether to make a copy of the data before filling. If False, then the original should be modified and no new memory should be allocated. For ExtensionArray subclasses that cannot do this, it is at the author’s discretion whether to ignore “copy=False” or to raise. The base class implementation ignores the keyword in pad/backfill cases.
Returns¶
- ExtensionArray
With NA/NaN filled.
Examples¶
>>> arr = pd.array([np.nan, np.nan, 2, 3, np.nan, np.nan])
>>> arr.fillna(0)
<IntegerArray>
[0, 0, 2, 3, 0, 0]
Length: 6, dtype: Int64
- Parameters:
value (object | ArrayLike | None)
method (FillnaOptions | None)
limit (int | None)
copy (bool)
- Return type:
Self
- insert(loc, item)¶
Insert an item at the given position.
Parameters¶
loc : int
item : scalar-like
Returns¶
same type as self
Notes¶
This method should be both type and dtype-preserving. If the item cannot be held in an array of this type/dtype, either ValueError or TypeError should be raised.
The default implementation relies on _from_sequence to raise on invalid items.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.insert(2, -1)
<IntegerArray>
[1, 2, -1, 3]
Length: 4, dtype: Int64
- Parameters:
loc (int)
- Return type:
Self
- interpolate(*, method, axis, index, limit, limit_direction, limit_area, copy, **kwargs)¶
See DataFrame.interpolate.__doc__.
Examples¶
>>> arr = pd.arrays.NumpyExtensionArray(np.array([0, 1, np.nan, 3]))
>>> arr.interpolate(method="linear",
...                 limit=3,
...                 limit_direction="forward",
...                 index=pd.Index([1, 2, 3, 4]),
...                 fill_value=1,
...                 copy=False,
...                 axis=0,
...                 limit_area="inside"
...                 )
<NumpyExtensionArray>
[0.0, 1.0, 2.0, 3.0]
Length: 4, dtype: float64
- Parameters:
method (InterpolateOptions)
axis (int)
index (Index)
copy (bool)
- Return type:
Self
- isin(values)¶
Pointwise comparison for set containment in the given values.
Roughly equivalent to np.array([x in values for x in self])
Parameters¶
values : np.ndarray or ExtensionArray
Returns¶
np.ndarray[bool]
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.isin([1])
<BooleanArray>
[True, False, False]
Length: 3, dtype: boolean
- Parameters:
values (ArrayLike)
- Return type:
npt.NDArray[np.bool_]
- isna()¶
A 1-D array indicating if each value is missing.
Returns¶
- numpy.ndarray or pandas.api.extensions.ExtensionArray
In most cases, this should return a NumPy ndarray. For exceptional cases like
SparseArray, where returning an ndarray would be expensive, an ExtensionArray may be returned.
Notes¶
If returning an ExtensionArray, then
na_values._is_boolean should be True
na_values should implement ExtensionArray._reduce()
na_values.any and na_values.all should be implemented
Examples¶
>>> arr = pd.array([1, 2, np.nan, np.nan])
>>> arr.isna()
array([False, False, True, True])
- map(mapper, na_action=None)¶
Map values using an input mapping or function.
Parameters¶
- mapperfunction, dict, or Series
Mapping correspondence.
- na_action{None, ‘ignore’}, default None
If ‘ignore’, propagate NA values, without passing them to the mapping correspondence. If ‘ignore’ is not supported, a NotImplementedError should be raised.
Returns¶
- Union[ndarray, Index, ExtensionArray]
The output of the mapping function applied to the array. If the function returns a tuple with more than one element a MultiIndex will be returned.
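The entry above has no example, so here is a pure-Python model of how na_action changes the result. It is a sketch under stated assumptions: None marks a missing value, only function and dict mappers are handled (Series mappers are left out), and map_sketch is a hypothetical helper rather than a pandas or pymongoarrow function.

```python
def map_sketch(values, mapper, na_action=None):
    """Apply ``mapper`` to each value; with na_action='ignore', skip missing ones."""
    if na_action not in (None, "ignore"):
        raise ValueError("na_action must be None or 'ignore'")
    # A dict mapper is looked up by key; keys that are absent map to None here.
    func = mapper.get if isinstance(mapper, dict) else mapper
    result = []
    for v in values:
        if v is None and na_action == "ignore":
            # Propagate the missing value without calling the mapper.
            result.append(None)
        else:
            result.append(func(v))
    return result


def label(v):
    # Mapper that handles missing input itself.
    return "missing" if v is None else f"value={v}"


print(map_sketch([1, None, 2], label))                      # mapper sees the missing value
print(map_sketch([1, None, 2], label, na_action="ignore"))  # missing value passes through
```

With na_action=None the mapper must be prepared to receive the missing value itself, which is why label handles None explicitly.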
- nbytes()¶
The number of bytes needed to store this object in memory.
Examples¶
>>> pd.array([1, 2, 3]).nbytes
27
- property ndim: int¶
Extension Arrays are only allowed to be 1-dimensional.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.ndim
1
- ravel(order='C')¶
Return a flattened view on this array.
Parameters¶
order : {None, ‘C’, ‘F’, ‘A’, ‘K’}, default ‘C’
Returns¶
ExtensionArray
Notes¶
Because ExtensionArrays are 1D-only, this is a no-op.
The “order” argument is ignored; it is accepted only for compatibility with NumPy.
Examples¶
>>> pd.array([1, 2, 3]).ravel()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- Parameters:
order (Literal['C', 'F', 'A', 'K'] | None)
- Return type:
ExtensionArray
- repeat(repeats, axis=None)¶
Repeat elements of an ExtensionArray.
Returns a new ExtensionArray where each element of the current ExtensionArray is repeated consecutively a given number of times.
Parameters¶
- repeatsint or array of ints
The number of repetitions for each element. This should be a non-negative integer. Repeating 0 times will return an empty ExtensionArray.
- axisNone
Must be
None. Has no effect but is accepted for compatibility with numpy.
Returns¶
- ExtensionArray
Newly created ExtensionArray with repeated elements.
See Also¶
Series.repeat : Equivalent function for Series.
Index.repeat : Equivalent function for Index.
numpy.repeat : Similar method for numpy.ndarray.
ExtensionArray.take : Take arbitrary positions.
Examples¶
>>> cat = pd.Categorical(['a', 'b', 'c'])
>>> cat
['a', 'b', 'c']
Categories (3, object): ['a', 'b', 'c']
>>> cat.repeat(2)
['a', 'a', 'b', 'b', 'c', 'c']
Categories (3, object): ['a', 'b', 'c']
>>> cat.repeat([1, 2, 3])
['a', 'b', 'b', 'c', 'c', 'c']
Categories (3, object): ['a', 'b', 'c']
- Parameters:
repeats (int | Sequence[int])
axis (AxisInt | None)
- Return type:
Self
- searchsorted(value, side='left', sorter=None)¶
Find indices where elements should be inserted to maintain order.
Find the indices into a sorted array self (a) such that, if the corresponding elements in value were inserted before the indices, the order of self would be preserved.
Assuming that self is sorted:
side     returned index i satisfies
left     self[i-1] < value <= self[i]
right    self[i-1] <= value < self[i]
Parameters¶
- valuearray-like, list or scalar
Value(s) to insert into self.
- side{‘left’, ‘right’}, optional
If ‘left’, the index of the first suitable location found is given. If ‘right’, return the last such index. If there is no suitable index, return either 0 or N (where N is the length of self).
- sorter1-D array-like, optional
Optional array of integer indices that sort array a into ascending order. They are typically the result of argsort.
Returns¶
- array of ints or int
If value is array-like, array of insertion points. If value is scalar, a single integer.
See Also¶
numpy.searchsorted : Similar method from NumPy.
Examples¶
>>> arr = pd.array([1, 2, 3, 5])
>>> arr.searchsorted([4])
array([3])
- Parameters:
value (NumpyValueArrayLike | ExtensionArray)
side (Literal['left', 'right'])
sorter (NumpySorter | None)
- Return type:
npt.NDArray[np.intp] | np.intp
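The difference between side='left' and side='right' only shows up when the searched value already occurs in the array. The standard-library bisect module follows the same two conditions, so it can serve as a small model. This is an illustration, not the pymongoarrow implementation; decimal.Decimal stands in for Decimal128 values.

```python
from bisect import bisect_left, bisect_right
from decimal import Decimal

# A sorted list containing a repeated value.
sorted_values = [Decimal("1.0"), Decimal("2.0"), Decimal("2.0"), Decimal("5.0")]
target = Decimal("2.0")

# side='left': self[i-1] < value <= self[i], i.e. before the run of equal values.
left = bisect_left(sorted_values, target)
# side='right': self[i-1] <= value < self[i], i.e. after the run of equal values.
right = bisect_right(sorted_values, target)

# No suitable index: the result is 0 or N (the length of the list).
below = bisect_left(sorted_values, Decimal("0.5"))
above = bisect_left(sorted_values, Decimal("9.0"))

print(left, right, below, above)
```

Inserting at either returned position keeps the list sorted; the two sides differ only in whether the new element lands before or after existing equal elements.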
- property shape: Shape¶
Return a tuple of the array dimensions.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.shape
(3,)
- shift(periods=1, fill_value=None)¶
Shift values by desired number.
Newly introduced missing values are filled with self.dtype.na_value.
Parameters¶
- periodsint, default 1
The number of periods to shift. Negative values are allowed for shifting backwards.
- fill_valueobject, optional
The scalar value to use for newly introduced missing values. The default is
self.dtype.na_value.
Returns¶
- ExtensionArray
Shifted.
Notes¶
If self is empty or periods is 0, a copy of self is returned.
If periods > len(self), then an array of size len(self) is returned, with all values filled with self.dtype.na_value.
For 2-dimensional ExtensionArrays, we are always shifting along axis=0.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.shift(2)
<IntegerArray>
[<NA>, <NA>, 1]
Length: 3, dtype: Int64
- Parameters:
periods (int)
fill_value (object | None)
- Return type:
- property size: int¶
The number of elements in the array.
- take(indexer, allow_fill=False, fill_value=None)¶
Take elements from an array.
Parameters¶
- indicessequence of int or one-dimensional np.ndarray of int
Indices to be taken.
- allow_fillbool, default False
How to handle negative values in indices.
False: negative values in indices indicate positional indices from the right (the default). This is similar to numpy.take().
True: negative values in indices indicate missing values. These values are set to fill_value. Any other negative values raise a ValueError.
- fill_valueany, optional
Fill value to use for NA-indices when allow_fill is True. This may be None, in which case the default NA value for the type, self.dtype.na_value, is used.
For many ExtensionArrays, there will be two representations of fill_value: a user-facing “boxed” scalar, and a low-level physical NA value. fill_value should be the user-facing version, and the implementation should handle translating that to the physical version for processing the take if necessary.
Returns¶
ExtensionArray
Raises¶
- IndexError
When the indices are out of bounds for the array.
- ValueError
When indices contains negative values other than -1 and allow_fill is True.
See Also¶
numpy.take : Take elements from an array along an axis.
api.extensions.take : Take elements from an array.
Notes¶
ExtensionArray.take is called by Series.__getitem__, .loc, iloc, when indices is a sequence of values. Additionally, it’s called by Series.reindex(), or any other method that causes realignment, with a fill_value.
Examples¶
Here’s an example implementation, which relies on casting the extension array to object dtype. This uses the helper method pandas.api.extensions.take().
def take(self, indices, allow_fill=False, fill_value=None):
    from pandas.api.extensions import take

    # If the ExtensionArray is backed by an ndarray, then
    # just pass that here instead of coercing to object.
    data = self.astype(object)

    if allow_fill and fill_value is None:
        fill_value = self.dtype.na_value

    # fill value should always be translated from the scalar
    # type for the array, to the physical storage type for
    # the data, before passing to take.
    result = take(data, indices, fill_value=fill_value,
                  allow_fill=allow_fill)
    return self._from_sequence(result, dtype=self.dtype)
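From the caller's side, the two allow_fill modes differ only in how negative indices are read. The following pure-Python model mirrors the rules listed under Parameters and Raises; it is a sketch, not the pandas helper, and take_sketch is a hypothetical name.

```python
def take_sketch(values, indices, allow_fill=False, fill_value=None):
    """Model of take(): ``values`` is a list, ``indices`` a sequence of ints."""
    n = len(values)
    result = []
    for i in indices:
        if allow_fill:
            if i == -1:
                # -1 marks a missing position and is replaced by fill_value.
                result.append(fill_value)
                continue
            if i < -1:
                raise ValueError(
                    "negative indices other than -1 are invalid when allow_fill=True"
                )
        if not -n <= i < n:
            raise IndexError("index out of bounds")
        # With allow_fill=False, negative indices count from the right.
        result.append(values[i])
    return result


letters = ["a", "b", "c"]
print(take_sketch(letters, [0, -1]))                   # -1 means the last element
print(take_sketch(letters, [0, -1], allow_fill=True))  # -1 means a missing value
```

This is why reindexing-style callers pass allow_fill=True: they need a way to say "no source element here" that cannot be confused with a position counted from the end.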
- to_numpy(dtype=None, copy=False, na_value=<no_default>)¶
Convert to a NumPy ndarray.
This is similar to numpy.asarray(), but may provide additional control over how the conversion is done.
Parameters¶
- dtypestr or numpy.dtype, optional
The dtype to pass to numpy.asarray().
- copybool, default False
Whether to ensure that the returned value is not a view on another array. Note that copy=False does not ensure that to_numpy() is no-copy. Rather, copy=True ensures that a copy is made, even if not strictly necessary.
- na_valueAny, optional
The value to use for missing values. The default value depends on dtype and the type of the array.
Returns¶
numpy.ndarray
- Parameters:
dtype (npt.DTypeLike | None)
copy (bool)
na_value (object)
- Return type:
np.ndarray
- tolist()¶
Return a list of the values.
These are each a scalar type, which is a Python scalar (for str, int, float) or a pandas scalar (for Timestamp/Timedelta/Interval/Period)
Returns¶
list
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.tolist()
[1, 2, 3]
- Return type:
list
- transpose(*axes)¶
Return a transposed view on this array.
Because ExtensionArrays are always 1D, this is a no-op. It is included for compatibility with np.ndarray.
Returns¶
ExtensionArray
Examples¶
>>> pd.array([1, 2, 3]).transpose()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- Parameters:
axes (int)
- Return type:
- unique()¶
Compute the ExtensionArray of unique values.
Returns¶
pandas.api.extensions.ExtensionArray
Examples¶
>>> arr = pd.array([1, 2, 3, 1, 2, 3])
>>> arr.unique()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- Return type:
Self
- view(dtype=None)¶
Return a view on the array.
Parameters¶
- dtypestr, np.dtype, or ExtensionDtype, optional
Default None.
Returns¶
- ExtensionArray or np.ndarray
A view on the
ExtensionArray’s data.
Examples¶
This gives a view on the underlying data of an ExtensionArray and is not a copy. Modifications on either the view or the original ExtensionArray will be reflected in the underlying data:
>>> arr = pd.array([1, 2, 3])
>>> arr2 = arr.view()
>>> arr[0] = 2
>>> arr2
<IntegerArray>
[2, 2, 3]
Length: 3, dtype: Int64
- Parameters:
dtype (Dtype | None)
- Return type:
ArrayLike
- class pymongoarrow.pandas_types.PandasObjectId¶
A pandas extension type for BSON ObjectId data type.
- classmethod construct_array_type()¶
Return the array type associated with this dtype.
Returns¶
type
- Return type:
type[PandasObjectIdArray]
- classmethod construct_from_string(string)¶
Construct this type from a string.
This is useful mainly for data types that accept parameters. For example, a period dtype accepts a frequency parameter that can be set as period[h] (where h means hourly frequency).
By default, in the abstract class, just the name of the type is expected. But subclasses can override this method to accept parameters.
Parameters¶
- stringstr
The name of the type, for example
category.
Returns¶
- ExtensionDtype
Instance of the dtype.
Raises¶
- TypeError
If a class cannot be constructed from this ‘string’.
Examples¶
For extension dtypes with arguments the following may be an adequate implementation.
>>> import re
>>> @classmethod
... def construct_from_string(cls, string):
...     pattern = re.compile(r"^my_type\[(?P<arg_name>.+)\]$")
...     match = pattern.match(string)
...     if match:
...         return cls(**match.groupdict())
...     else:
...         raise TypeError(
...             f"Cannot construct a '{cls.__name__}' from '{string}'"
...         )
- empty(shape)¶
Construct an ExtensionArray of this dtype with the given shape.
Analogous to numpy.empty.
Parameters¶
shape : int or tuple[int]
Returns¶
ExtensionArray
- Parameters:
shape (Shape)
- Return type:
ExtensionArray
- index_class¶
The Index subclass to return from Index.__new__ when this dtype is encountered.
- classmethod is_dtype(dtype)¶
Check if we match ‘dtype’.
Parameters¶
- dtypeobject
The object to check.
Returns¶
bool
Notes¶
The default implementation is True if
1. cls.construct_from_string(dtype) is an instance of cls.
2. dtype is an object and is an instance of cls.
3. dtype has a dtype attribute, and any of the above conditions is true for dtype.dtype.
- Parameters:
dtype (object)
- Return type:
bool
- property kind: str¶
A character code (one of ‘biufcmMOSUV’), default ‘O’
This should match the NumPy dtype used when the array is converted to an ndarray, which is probably ‘O’ for object if the extension type cannot be represented as a built-in NumPy type.
See Also¶
numpy.dtype.kind
- property name: str¶
A string identifying the data type.
Will be used for display in, e.g.
Series.dtype
- property names: list[str] | None¶
Ordered list of field names, or None if there are no fields.
This is for compatibility with NumPy arrays, and may be removed in the future.
- class pymongoarrow.pandas_types.PandasObjectIdArray(values, dtype, copy=False)¶
A pandas extension array for BSON ObjectId data.
- argmax(skipna=True)¶
Return the index of maximum value.
In case of multiple occurrences of the maximum value, the index corresponding to the first occurrence is returned.
Parameters¶
skipna : bool, default True
Returns¶
int
See Also¶
ExtensionArray.argmin : Return the index of the minimum value.
Examples¶
>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argmax()
3
- Parameters:
skipna (bool)
- Return type:
int
- argmin(skipna=True)¶
Return the index of minimum value.
In case of multiple occurrences of the minimum value, the index corresponding to the first occurrence is returned.
Parameters¶
skipna : bool, default True
Returns¶
int
See Also¶
ExtensionArray.argmax : Return the index of the maximum value.
Examples¶
>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argmin()
1
- Parameters:
skipna (bool)
- Return type:
int
- argsort(*, ascending=True, kind='quicksort', na_position='last', **kwargs)¶
Return the indices that would sort this array.
Parameters¶
- ascendingbool, default True
Whether the indices should result in an ascending or descending sort.
- kind{‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, optional
Sorting algorithm.
- na_position{‘first’, ‘last’}, default ‘last’
If 'first', put NaN values at the beginning. If 'last', put NaN values at the end.
- *args, **kwargs:
Passed through to numpy.argsort().
Returns¶
- np.ndarray[np.intp]
Array of indices that sort
self. If NaN values are contained, NaN values are placed at the end.
See Also¶
numpy.argsort : Sorting implementation used internally.
Examples¶
>>> arr = pd.array([3, 1, 2, 5, 4])
>>> arr.argsort()
array([1, 2, 0, 4, 3])
- Parameters:
ascending (bool)
kind (SortKind)
na_position (str)
- Return type:
np.ndarray
- astype(dtype, copy=True)¶
Cast to a NumPy array or ExtensionArray with ‘dtype’.
Parameters¶
- dtypestr or dtype
Typecode or data-type to which the array is cast.
- copybool, default True
Whether to copy the data, even if not necessary. If False, a copy is made only if the old dtype does not match the new dtype.
Returns¶
- np.ndarray or pandas.api.extensions.ExtensionArray
An ExtensionArray if dtype is ExtensionDtype, otherwise a NumPy ndarray with dtype for its dtype.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
Casting to another ExtensionDtype returns an ExtensionArray:
>>> arr1 = arr.astype('Float64')
>>> arr1
<FloatingArray>
[1.0, 2.0, 3.0]
Length: 3, dtype: Float64
>>> arr1.dtype
Float64Dtype()
Otherwise, we will get a NumPy ndarray:
>>> arr2 = arr.astype('float64')
>>> arr2
array([1., 2., 3.])
>>> arr2.dtype
dtype('float64')
- Parameters:
dtype (AstypeArg)
copy (bool)
- Return type:
ArrayLike
- copy()¶
Return a copy of the array.
Returns¶
ExtensionArray
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr2 = arr.copy()
>>> arr[0] = 2
>>> arr2
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- dropna()¶
Return ExtensionArray without NA values.
Returns¶
Examples¶
>>> pd.array([1, 2, np.nan]).dropna()
<IntegerArray>
[1, 2]
Length: 2, dtype: Int64
- Return type:
Self
- duplicated(keep='first')¶
Return boolean ndarray denoting duplicate values.
Parameters¶
- keep{‘first’, ‘last’, False}, default ‘first’
first: Mark duplicates as True except for the first occurrence.
last: Mark duplicates as True except for the last occurrence.
False: Mark all duplicates as True.
Returns¶
ndarray[bool]
Examples¶
>>> pd.array([1, 1, 2, 3, 3], dtype="Int64").duplicated()
array([False, True, False, False, True])
- Parameters:
keep (Literal['first', 'last', False])
- Return type:
npt.NDArray[np.bool_]
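The example above only covers the default keep='first'. The three keep options can be compared side by side with a small pure-Python model. This is a sketch of the documented semantics; duplicated_sketch is a hypothetical helper, not part of pandas or pymongoarrow.

```python
from collections import Counter


def duplicated_sketch(values, keep="first"):
    """Return a list of booleans marking duplicates according to ``keep``."""
    if keep is False:
        # Every value that occurs more than once is marked.
        counts = Counter(values)
        return [counts[v] > 1 for v in values]
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first', 'last' or False")
    # Walk forwards for 'first' and backwards for 'last'; the first value
    # met in that direction is the one left unmarked.
    order = range(len(values)) if keep == "first" else range(len(values) - 1, -1, -1)
    seen = set()
    flags = [False] * len(values)
    for i in order:
        if values[i] in seen:
            flags[i] = True
        else:
            seen.add(values[i])
    return flags


data = [1, 1, 2, 3, 3]
print(duplicated_sketch(data))               # same input as the example above
print(duplicated_sketch(data, keep="last"))
print(duplicated_sketch(data, keep=False))
```

The model relies on values being hashable, which holds for ObjectId-like identifiers as well as for the integers used here.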
- equals(other)¶
Return if another array is equivalent to this array.
Equivalent means that both arrays have the same shape and dtype, and all values compare equal. Missing values in the same location are considered equal (in contrast with normal equality).
Parameters¶
- otherExtensionArray
Array to compare to this Array.
Returns¶
- boolean
Whether the arrays are equivalent.
Examples¶
>>> arr1 = pd.array([1, 2, np.nan])
>>> arr2 = pd.array([1, 2, np.nan])
>>> arr1.equals(arr2)
True
- Parameters:
other (object)
- Return type:
bool
- factorize(use_na_sentinel=True)¶
Encode the extension array as an enumerated type.
Parameters¶
- use_na_sentinelbool, default True
If True, the sentinel -1 will be used for NaN values. If False, NaN values will be encoded as non-negative integers and will not drop the NaN from the uniques of the values.
Added in version 1.5.0.
Returns¶
- codesndarray
An integer NumPy array that’s an indexer into the original ExtensionArray.
- uniquesExtensionArray
An ExtensionArray containing the unique values of self.
Note
uniques will not contain an entry for the NA value of the ExtensionArray if there are any missing values present in self.
See Also¶
factorize : Top-level factorize method that dispatches here.
Notes¶
pandas.factorize() offers a sort keyword as well.
Examples¶
>>> idx1 = pd.PeriodIndex(["2014-01", "2014-01", "2014-02", "2014-02",
...                       "2014-03", "2014-03"], freq="M")
>>> arr, idx = idx1.factorize()
>>> arr
array([0, 0, 1, 1, 2, 2])
>>> idx
PeriodIndex(['2014-01', '2014-02', '2014-03'], dtype='period[M]')
- Parameters:
use_na_sentinel (bool)
- Return type:
tuple[ndarray, ExtensionArray]
- fillna(value=None, method=None, limit=None, copy=True)¶
Fill NA/NaN values using the specified method.
Parameters¶
- valuescalar, array-like
If a scalar value is passed it is used to fill all missing values. Alternatively, an array-like “value” can be given. It’s expected that the array-like have the same length as ‘self’.
- method{‘backfill’, ‘bfill’, ‘pad’, ‘ffill’, None}, default None
Method to use for filling holes in reindexed Series:
pad / ffill: propagate last valid observation forward to next valid.
backfill / bfill: use NEXT valid observation to fill gap.
Deprecated since version 2.1.0.
- limitint, default None
If method is specified, this is the maximum number of consecutive NaN values to forward/backward fill. In other words, if there is a gap with more than this number of consecutive NaNs, it will only be partially filled. If method is not specified, this is the maximum number of entries along the entire axis where NaNs will be filled.
Deprecated since version 2.1.0.
- copybool, default True
Whether to make a copy of the data before filling. If False, then the original should be modified and no new memory should be allocated. For ExtensionArray subclasses that cannot do this, it is at the author’s discretion whether to ignore “copy=False” or to raise. The base class implementation ignores the keyword in pad/backfill cases.
Returns¶
- ExtensionArray
With NA/NaN filled.
Examples¶
>>> arr = pd.array([np.nan, np.nan, 2, 3, np.nan, np.nan])
>>> arr.fillna(0)
<IntegerArray>
[0, 0, 2, 3, 0, 0]
Length: 6, dtype: Int64
- Parameters:
value (object | ArrayLike | None)
method (FillnaOptions | None)
limit (int | None)
copy (bool)
- Return type:
Self
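The interaction between method and limit is easiest to see on a gap longer than the limit. The pure-Python model below covers only forward fill (pad / ffill) and assumes None marks a missing value; ffill_sketch is a hypothetical helper written for illustration. The method and limit keywords themselves are marked as deprecated above.

```python
def ffill_sketch(values, limit=None):
    """Forward-fill None entries, filling at most ``limit`` consecutive gaps."""
    result = []
    last_valid = None   # most recent non-missing value seen so far
    gap_length = 0      # number of consecutive missing values in the current gap
    for v in values:
        if v is None:
            gap_length += 1
            can_fill = last_valid is not None and (limit is None or gap_length <= limit)
            result.append(last_valid if can_fill else None)
        else:
            last_valid = v
            gap_length = 0
            result.append(v)
    return result


data = [1, None, None, None, 2, None]
print(ffill_sketch(data))           # every gap is filled
print(ffill_sketch(data, limit=2))  # the three-element gap is only partially filled
```

A leading gap stays missing in this model because there is no earlier valid observation to propagate.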
- insert(loc, item)¶
Insert an item at the given position.
Parameters¶
loc : int
item : scalar-like
Returns¶
same type as self
Notes¶
This method should be both type and dtype-preserving. If the item cannot be held in an array of this type/dtype, either ValueError or TypeError should be raised.
The default implementation relies on _from_sequence to raise on invalid items.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.insert(2, -1)
<IntegerArray>
[1, 2, -1, 3]
Length: 4, dtype: Int64
- Parameters:
loc (int)
- Return type:
Self
- interpolate(*, method, axis, index, limit, limit_direction, limit_area, copy, **kwargs)¶
See DataFrame.interpolate.__doc__.
Examples¶
>>> arr = pd.arrays.NumpyExtensionArray(np.array([0, 1, np.nan, 3]))
>>> arr.interpolate(method="linear",
...                 limit=3,
...                 limit_direction="forward",
...                 index=pd.Index([1, 2, 3, 4]),
...                 fill_value=1,
...                 copy=False,
...                 axis=0,
...                 limit_area="inside"
...                 )
<NumpyExtensionArray>
[0.0, 1.0, 2.0, 3.0]
Length: 4, dtype: float64
- Parameters:
method (InterpolateOptions)
axis (int)
index (Index)
copy (bool)
- Return type:
Self
- isin(values)¶
Pointwise comparison for set containment in the given values.
Roughly equivalent to np.array([x in values for x in self])
Parameters¶
values : np.ndarray or ExtensionArray
Returns¶
np.ndarray[bool]
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.isin([1])
<BooleanArray>
[True, False, False]
Length: 3, dtype: boolean
- Parameters:
values (ArrayLike)
- Return type:
npt.NDArray[np.bool_]
- isna()¶
A 1-D array indicating if each value is missing.
Returns¶
- numpy.ndarray or pandas.api.extensions.ExtensionArray
In most cases, this should return a NumPy ndarray. For exceptional cases like
SparseArray, where returning an ndarray would be expensive, an ExtensionArray may be returned.
Notes¶
If returning an ExtensionArray, then
na_values._is_boolean should be True
na_values should implement ExtensionArray._reduce()
na_values.any and na_values.all should be implemented
Examples¶
>>> arr = pd.array([1, 2, np.nan, np.nan])
>>> arr.isna()
array([False, False, True, True])
- map(mapper, na_action=None)¶
Map values using an input mapping or function.
Parameters¶
- mapperfunction, dict, or Series
Mapping correspondence.
- na_action{None, ‘ignore’}, default None
If ‘ignore’, propagate NA values, without passing them to the mapping correspondence. If ‘ignore’ is not supported, a NotImplementedError should be raised.
Returns¶
- Union[ndarray, Index, ExtensionArray]
The output of the mapping function applied to the array. If the function returns a tuple with more than one element a MultiIndex will be returned.
- nbytes()¶
The number of bytes needed to store this object in memory.
Examples¶
>>> pd.array([1, 2, 3]).nbytes
27
- property ndim: int¶
Extension Arrays are only allowed to be 1-dimensional.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.ndim
1
- ravel(order='C')¶
Return a flattened view on this array.
Parameters¶
order : {None, ‘C’, ‘F’, ‘A’, ‘K’}, default ‘C’
Returns¶
ExtensionArray
Notes¶
Because ExtensionArrays are 1D-only, this is a no-op.
The “order” argument is ignored; it is accepted only for compatibility with NumPy.
Examples¶
>>> pd.array([1, 2, 3]).ravel()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- Parameters:
order (Literal['C', 'F', 'A', 'K'] | None)
- Return type:
ExtensionArray
- repeat(repeats, axis=None)¶
Repeat elements of an ExtensionArray.
Returns a new ExtensionArray where each element of the current ExtensionArray is repeated consecutively a given number of times.
Parameters¶
- repeats : int or array of ints
The number of repetitions for each element. This should be a non-negative integer. Repeating 0 times will return an empty ExtensionArray.
- axis : None
Must be None. Has no effect but is accepted for compatibility with numpy.
Returns¶
- ExtensionArray
Newly created ExtensionArray with repeated elements.
See Also¶
Series.repeat : Equivalent function for Series.
Index.repeat : Equivalent function for Index.
numpy.repeat : Similar method for numpy.ndarray.
ExtensionArray.take : Take arbitrary positions.
Examples¶
>>> cat = pd.Categorical(['a', 'b', 'c'])
>>> cat
['a', 'b', 'c']
Categories (3, object): ['a', 'b', 'c']
>>> cat.repeat(2)
['a', 'a', 'b', 'b', 'c', 'c']
Categories (3, object): ['a', 'b', 'c']
>>> cat.repeat([1, 2, 3])
['a', 'b', 'b', 'c', 'c', 'c']
Categories (3, object): ['a', 'b', 'c']
- Parameters:
repeats (int | Sequence[int])
axis (AxisInt | None)
- Return type:
Self
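The two forms of repeats (a single int, or one count per element) can be sketched without pandas. This is a plain-Python illustration of the semantics described above, with a hypothetical helper name repeat_elements. It is not how any ExtensionArray implements repeat.

```python
def repeat_elements(values, repeats):
    """Hypothetical helper mirroring the documented repeat semantics."""
    if isinstance(repeats, int):
        # A single int applies the same count to every element.
        counts = [repeats] * len(values)
    else:
        counts = list(repeats)
        if len(counts) != len(values):
            raise ValueError("repeats must have one count per element")
    if any(c < 0 for c in counts):
        raise ValueError("repeats must be non-negative")
    # Each element is emitted consecutively, count times.
    return [v for v, c in zip(values, counts) for _ in range(c)]


uniform = repeat_elements(["a", "b", "c"], 2)
per_item = repeat_elements(["a", "b", "c"], [1, 2, 3])
empty = repeat_elements(["a", "b", "c"], 0)
print(uniform)   # ['a', 'a', 'b', 'b', 'c', 'c']
print(per_item)  # ['a', 'b', 'b', 'c', 'c', 'c']
print(empty)     # []
```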
- searchsorted(value, side='left', sorter=None)¶
Find indices where elements should be inserted to maintain order.
Find the indices into a sorted array self (a) such that, if the corresponding elements in value were inserted before the indices, the order of self would be preserved.
Assuming that self is sorted:
side    returned index i satisfies
left    self[i-1] < value <= self[i]
right   self[i-1] <= value < self[i]
Parameters¶
- value : array-like, list or scalar
Value(s) to insert into self.
- side : {‘left’, ‘right’}, optional
If ‘left’, the index of the first suitable location found is given. If ‘right’, return the last such index. If there is no suitable index, return either 0 or N (where N is the length of self).
- sorter : 1-D array-like, optional
Optional array of integer indices that sort array a into ascending order. They are typically the result of argsort.
Returns¶
- array of ints or int
If value is array-like, array of insertion points. If value is scalar, a single integer.
See Also¶
numpy.searchsorted : Similar method from NumPy.
Examples¶
>>> arr = pd.array([1, 2, 3, 5])
>>> arr.searchsorted([4])
array([3])
- Parameters:
value (NumpyValueArrayLike | ExtensionArray)
side (Literal['left', 'right'])
sorter (NumpySorter | None)
- Return type:
npt.NDArray[np.intp] | np.intp
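The difference between side='left' and side='right' matches the standard library's bisect_left and bisect_right. The sketch below uses those functions on a plain list to show the behaviour for a scalar value. The wrapper name search_sorted is hypothetical and the sorter argument is not modelled.

```python
from bisect import bisect_left, bisect_right


def search_sorted(sorted_values, value, side="left"):
    """Hypothetical wrapper: insertion index that keeps the list sorted."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    find = bisect_left if side == "left" else bisect_right
    return find(sorted_values, value)


data = [1, 2, 2, 3, 5]
left = search_sorted(data, 2)                 # before the run of 2s
right = search_sorted(data, 2, side="right")  # after the run of 2s
between = search_sorted(data, 4)              # 4 would sit before the 5
past_end = search_sorted(data, 9)             # no suitable slot: N = len(data)
print(left, right, between, past_end)  # 1 3 4 5
```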
- property shape: Shape¶
Return a tuple of the array dimensions.
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.shape
(3,)
- shift(periods=1, fill_value=None)¶
Shift values by desired number.
Newly introduced missing values are filled with self.dtype.na_value.
Parameters¶
- periods : int, default 1
The number of periods to shift. Negative values are allowed for shifting backwards.
- fill_value : object, optional
The scalar value to use for newly introduced missing values. The default is self.dtype.na_value.
Returns¶
- ExtensionArray
Shifted.
Notes¶
If self is empty or periods is 0, a copy of self is returned.
If periods > len(self), then an array of size len(self) is returned, with all values filled with self.dtype.na_value.
For 2-dimensional ExtensionArrays, we are always shifting along axis=0.
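The edge cases in these notes can be sketched on a plain list. This is a minimal illustration, not the pandas implementation: the helper name shift_values is hypothetical, and None stands in for self.dtype.na_value.

```python
def shift_values(values, periods=1, fill_value=None):
    """Hypothetical helper: shift a list, filling the gap with fill_value."""
    n = len(values)
    if n == 0 or periods == 0:
        return list(values)  # a copy, as the notes describe
    # Shifting by more than the length fills every position.
    k = min(abs(periods), n)
    fill = [fill_value] * k
    if periods > 0:
        return fill + list(values[: n - k])
    return list(values[k:]) + fill


forward = shift_values([1, 2, 3], 2)
backward = shift_values([1, 2, 3], -1)
overshoot = shift_values([1, 2, 3], 5)
custom = shift_values([1, 2, 3], 1, fill_value=0)
print(forward)    # [None, None, 1]
print(backward)   # [2, 3, None]
print(overshoot)  # [None, None, None]
print(custom)     # [0, 1, 2]
```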
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.shift(2)
<IntegerArray>
[<NA>, <NA>, 1]
Length: 3, dtype: Int64
- Parameters:
periods (int)
fill_value (object | None)
- Return type:
ExtensionArray
- property size: int¶
The number of elements in the array.
- take(indexer, allow_fill=False, fill_value=None)¶
Take elements from an array.
Parameters¶
- indices : sequence of int or one-dimensional np.ndarray of int
Indices to be taken.
- allow_fill : bool, default False
How to handle negative values in indices.
False: negative values in indices indicate positional indices from the right (the default). This is similar to numpy.take().
True: negative values in indices indicate missing values. These values are set to fill_value. Any other negative values raise a ValueError.
- fill_value : any, optional
Fill value to use for NA-indices when allow_fill is True. This may be None, in which case the default NA value for the type, self.dtype.na_value, is used.
For many ExtensionArrays, there will be two representations of fill_value: a user-facing “boxed” scalar, and a low-level physical NA value. fill_value should be the user-facing version, and the implementation should handle translating that to the physical version for processing the take if necessary.
Returns¶
ExtensionArray
Raises¶
- IndexError
When the indices are out of bounds for the array.
- ValueError
When indices contains negative values other than -1 and allow_fill is True.
See Also¶
numpy.take : Take elements from an array along an axis.
api.extensions.take : Take elements from an array.
Notes¶
ExtensionArray.take is called by Series.__getitem__, .loc, iloc, when indices is a sequence of values. Additionally, it’s called by Series.reindex(), or any other method that causes realignment, with a fill_value.
Examples¶
Here’s an example implementation, which relies on casting the extension array to object dtype. This uses the helper method pandas.api.extensions.take().
def take(self, indices, allow_fill=False, fill_value=None):
    from pandas.core.algorithms import take

    # If the ExtensionArray is backed by an ndarray, then
    # just pass that here instead of coercing to object.
    data = self.astype(object)

    if allow_fill and fill_value is None:
        fill_value = self.dtype.na_value

    # fill value should always be translated from the scalar
    # type for the array, to the physical storage type for
    # the data, before passing to take.

    result = take(data, indices, fill_value=fill_value,
                  allow_fill=allow_fill)
    return self._from_sequence(result, dtype=self.dtype)
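How allow_fill changes the meaning of negative indices can be shown on a plain list. The following is only a sketch of the documented rules, assuming a hypothetical helper take_values and using None in place of self.dtype.na_value. It does not call into pandas.

```python
def take_values(values, indices, allow_fill=False, fill_value=None):
    """Hypothetical helper illustrating the allow_fill rules."""
    out = []
    for i in indices:
        if allow_fill and i < -1:
            # Only -1 is a valid "missing" marker when allow_fill is True.
            raise ValueError("negative indices other than -1 are not allowed")
        if allow_fill and i == -1:
            out.append(fill_value)
        else:
            # Without allow_fill, negative indices count from the right.
            # Out-of-bounds positions raise IndexError.
            out.append(values[i])
    return out


data = [10, 20, 30]
positional = take_values(data, [0, -1])
filled = take_values(data, [0, -1], allow_fill=True)
custom = take_values(data, [0, -1], allow_fill=True, fill_value=0)
print(positional)  # [10, 30]
print(filled)      # [10, None]
print(custom)      # [10, 0]
```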
- to_numpy(dtype=None, copy=False, na_value=<no_default>)¶
Convert to a NumPy ndarray.
This is similar to numpy.asarray(), but may provide additional control over how the conversion is done.
Parameters¶
- dtype : str or numpy.dtype, optional
The dtype to pass to numpy.asarray().
- copy : bool, default False
Whether to ensure that the returned value is not a view on another array. Note that copy=False does not ensure that to_numpy() is no-copy. Rather, copy=True ensures that a copy is made, even if not strictly necessary.
- na_value : Any, optional
The value to use for missing values. The default value depends on dtype and the type of the array.
Returns¶
numpy.ndarray
- Parameters:
dtype (npt.DTypeLike | None)
copy (bool)
na_value (object)
- Return type:
np.ndarray
- tolist()¶
Return a list of the values.
Each value is a scalar: either a Python scalar (for str, int, float) or a pandas scalar (for Timestamp/Timedelta/Interval/Period).
Returns¶
list
Examples¶
>>> arr = pd.array([1, 2, 3])
>>> arr.tolist()
[1, 2, 3]
- Return type:
list
- transpose(*axes)¶
Return a transposed view on this array.
Because ExtensionArrays are always 1D, this is a no-op. It is included for compatibility with np.ndarray.
Returns¶
ExtensionArray
Examples¶
>>> pd.array([1, 2, 3]).transpose()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- Parameters:
axes (int)
- Return type:
ExtensionArray
- unique()¶
Compute the ExtensionArray of unique values.
Returns¶
pandas.api.extensions.ExtensionArray
Examples¶
>>> arr = pd.array([1, 2, 3, 1, 2, 3])
>>> arr.unique()
<IntegerArray>
[1, 2, 3]
Length: 3, dtype: Int64
- Return type:
Self
- view(dtype=None)¶
Return a view on the array.
Parameters¶
- dtype : str, np.dtype, or ExtensionDtype, optional
Default None.
Returns¶
- ExtensionArray or np.ndarray
A view on the ExtensionArray’s data.
Examples¶
This gives a view on the underlying data of an ExtensionArray and is not a copy. Modifications on either the view or the original ExtensionArray will be reflected on the underlying data:
>>> arr = pd.array([1, 2, 3])
>>> arr2 = arr.view()
>>> arr[0] = 2
>>> arr2
<IntegerArray>
[2, 2, 3]
Length: 3, dtype: Int64
- Parameters:
dtype (Dtype | None)
- Return type:
ArrayLike