pandas.to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False, utc=None, box=True, format=None, exact=True, unit=None, infer_datetime_format=False, origin='unix', cache=False)
Convert argument to datetime.
Parameters:
arg : integer, float, string, datetime, list, tuple, 1-d array, Series
    New in version 0.18.1: or DataFrame/dict-like
errors : {'ignore', 'raise', 'coerce'}, default 'raise'
dayfirst : boolean, default False
    Specify a date parse order if
yearfirst : boolean, default False
    Specify a date parse order if
    Warning: yearfirst=True is not strict, but will prefer to parse with year first (this is a known bug, based on dateutil behavior).
    New in version 0.16.1.
utc : boolean, default None
    Return UTC DatetimeIndex if True (converting any tz-aware datetime.datetime objects as well).
box : boolean, default True
format : string, default None
    strftime format used to parse the time, e.g. "%d/%m/%Y". Note that "%f" will parse all the way up to nanoseconds.
exact : boolean, True by default
unit : string, default 'ns'
    Unit of the arg (D, s, ms, us, ns), where arg is an integer or float number. This will be based off the origin. For example, with unit='ms' and origin='unix' (the default), this would calculate the number of milliseconds to the unix epoch start.
infer_datetime_format : boolean, default False
    If True and no
origin : scalar, default is 'unix'
    Define the reference date. The numeric values would be parsed as number of units (defined by
    New in version 0.20.0.
cache : boolean, default False
    If True, use a cache of unique, converted dates to apply the datetime conversion. May produce significant speed-up when parsing duplicate date strings, especially ones with timezone offsets.
    New in version 0.23.0.
Returns:
ret : datetime if parsing succeeded.
    Return type depends on input:
    When the designated type cannot be returned (e.g. when any element of the input is before Timestamp.min or after Timestamp.max), the return value will have datetime.datetime type (or be the corresponding array/Series).
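To illustrate how the return type follows the input, the sketch below passes a scalar, a list, and a Series through `pd.to_datetime` with default arguments and inspects what comes back. The date strings are arbitrary placeholders, and the behaviour shown is what recent pandas releases produce rather than a complete specification.

```python
import pandas as pd

# Scalar input: a single Timestamp comes back.
scalar_result = pd.to_datetime('2015-02-04')

# List-like input: a DatetimeIndex comes back.
list_result = pd.to_datetime(['2015-02-04', '2016-03-05'])

# Series input: a Series with a datetime64 dtype comes back.
series_result = pd.to_datetime(pd.Series(['2015-02-04', '2016-03-05']))

print(type(scalar_result).__name__)
print(type(list_result).__name__)
print(series_result.dtype)
```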
See also
pandas.DataFrame.astype
pandas.to_timedelta
Examples
Assembling a datetime from multiple columns of a DataFrame. The keys can be common abbreviations like ['year', 'month', 'day', 'minute', 'second', 'ms', 'us', 'ns'] or plurals of the same.
>>> import pandas as pd
>>> df = pd.DataFrame({'year': [2015, 2016], 'month': [2, 3], 'day': [4, 5]})
>>> pd.to_datetime(df)
0   2015-02-04
1   2016-03-05
dtype: datetime64[ns]
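As a further sketch of column assembly, the example below adds time-of-day columns and mixes singular keys with the plural forms mentioned above. The column values and the variable names `parts` and `assembled` are illustrative only.

```python
import pandas as pd

# Column names are illustrative; plural keys such as 'hours' and
# 'minutes' are accepted alongside the singular forms.
parts = pd.DataFrame({
    'year': [2015, 2016],
    'month': [2, 3],
    'day': [4, 5],
    'hours': [10, 23],
    'minutes': [30, 59],
})

# Each row is combined into one datetime value.
assembled = pd.to_datetime(parts)
print(assembled)
```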
If a date does not meet the timestamp limitations, passing errors='ignore' will return the original input instead of raising any exception.
Passing errors='coerce' will force an out-of-bounds date to NaT, in addition to forcing non-dates (or non-parseable dates) to NaT.
>>> pd.to_datetime('13000101', format='%Y%m%d', errors='ignore')
datetime.datetime(1300, 1, 1, 0, 0)
>>> pd.to_datetime('13000101', format='%Y%m%d', errors='coerce')
NaT
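The same coercion applies to strings that are not dates at all. The following is a minimal sketch contrasting errors='coerce' with the default errors='raise'. The input list `raw` is made up for illustration, and catching `ValueError` assumes the parsing failure surfaces as that exception or a subclass of it.

```python
import pandas as pd

raw = ['2015-02-04', 'not a date']

# errors='coerce' turns the unparseable entry into NaT.
coerced = pd.to_datetime(raw, errors='coerce')

# errors='raise' (the default) raises instead; the exception is caught here.
try:
    pd.to_datetime(raw, errors='raise')
    raised = False
except ValueError:
    raised = True

print(coerced)
print(raised)
```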
Passing infer_datetime_format=True can often speed up parsing if the input is not exactly in ISO8601 format but is in a regular format.
>>> s = pd.Series(['3/11/2000', '3/12/2000', '3/13/2000']*1000)
>>> s.head()
0    3/11/2000
1    3/12/2000
2    3/13/2000
3    3/11/2000
4    3/12/2000
dtype: object
>>> %timeit pd.to_datetime(s, infer_datetime_format=True)
100 loops, best of 3: 10.4 ms per loop
>>> %timeit pd.to_datetime(s, infer_datetime_format=False)
1 loop, best of 3: 471 ms per loop
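When the layout of the strings is known in advance, the format parameter can be given directly instead of relying on inference. The sketch below parses the same kind of Series with an explicit format string. No timings are shown because they depend on the machine and the pandas version, and the name `parsed` is illustrative.

```python
import pandas as pd

s = pd.Series(['3/11/2000', '3/12/2000', '3/13/2000'] * 1000)

# Explicit strftime-style format: month/day/four-digit year.
parsed = pd.to_datetime(s, format='%m/%d/%Y')
print(parsed.head(3))
```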
Using a unix epoch time
>>> pd.to_datetime(1490195805, unit='s')
Timestamp('2017-03-22 15:16:45')
>>> pd.to_datetime(1490195805433502912, unit='ns')
Timestamp('2017-03-22 15:16:45.433502912')
Warning
For float arg, precision rounding might happen. To prevent unexpected behavior use a fixed-width exact type.
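One way to follow this advice is to keep sub-second epoch values as integers in a finer unit rather than as float seconds. This is a minimal sketch: the millisecond value is illustrative, and the size of any float discrepancy is printed rather than asserted exactly, since it may vary.

```python
import pandas as pd

# Epoch time held as an integer count of milliseconds (illustrative value).
epoch_ms = 1490195805433
stamp = pd.to_datetime(epoch_ms, unit='ms')

# The same instant expressed as float seconds; a float cannot always
# represent the fractional part exactly.
float_stamp = pd.to_datetime(1490195805.433, unit='s')

print(stamp)
print(abs(float_stamp - stamp))
```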
Using a non-unix epoch origin
>>> pd.to_datetime([1, 2, 3], unit='D', origin=pd.Timestamp('1960-01-01'))
DatetimeIndex(['1960-01-02', '1960-01-03', '1960-01-04'], dtype='datetime64[ns]', freq=None)
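The effect of origin can be cross-checked by adding the offsets to the reference date by hand. The sketch below does so with `pd.to_timedelta`. The variable names are illustrative, and the comparison is a sanity check rather than a description of how pandas computes the result internally.

```python
import pandas as pd

origin = pd.Timestamp('1960-01-01')
offsets = [1, 2, 3]

# Days counted from the custom origin.
from_origin = pd.to_datetime(offsets, unit='D', origin=origin)

# The same dates built by adding day-sized timedeltas to the origin.
by_addition = origin + pd.to_timedelta(offsets, unit='D')

# Element-wise comparison of the two results.
print((from_origin == by_addition).all())
```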
© 2008–2012, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
http://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.to_datetime.html