Index.get_duplicates() [source]
Extract duplicated index elements.
Returns a sorted list of index elements which appear more than once in the index.
Deprecated since version 0.23.0: Use idx[idx.duplicated()].unique() instead
| Returns: |
array-like List of duplicated indexes. |
|---|
See also
Index.duplicated
Index.drop_duplicates
Works on different Index of types.
>>> pd.Index([1, 2, 2, 3, 3, 3, 4]).get_duplicates() [2, 3] >>> pd.Index([1., 2., 2., 3., 3., 3., 4.]).get_duplicates() [2.0, 3.0] >>> pd.Index(['a', 'b', 'b', 'c', 'c', 'c', 'd']).get_duplicates() ['b', 'c']
Note that for a DatetimeIndex, it does not return a list but a new DatetimeIndex:
>>> dates = pd.to_datetime(['2018-01-01', '2018-01-02', '2018-01-03',
... '2018-01-03', '2018-01-04', '2018-01-04'],
... format='%Y-%m-%d')
>>> pd.Index(dates).get_duplicates()
DatetimeIndex(['2018-01-03', '2018-01-04'],
dtype='datetime64[ns]', freq=None)
Sorts duplicated elements even when indexes are unordered.
>>> pd.Index([1, 2, 3, 2, 3, 4, 3]).get_duplicates() [2, 3]
Return empty array-like structure when all elements are unique.
>>> pd.Index([1, 2, 3, 4]).get_duplicates() [] >>> dates = pd.to_datetime(['2018-01-01', '2018-01-02', '2018-01-03'], ... format='%Y-%m-%d') >>> pd.Index(dates).get_duplicates() DatetimeIndex([], dtype='datetime64[ns]', freq=None)
© 2008–2012, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
http://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.Index.get_duplicates.html