pandas 0.23

pandas.Series.str.split

Series.str.split(pat=None, n=-1, expand=False)

Split strings around given separator/delimiter.

Split each string in the caller’s values by given pattern, propagating NaN values. Equivalent to str.split().
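As a minimal sketch of the equivalence described above, the following compares the result with Python's built-in str.split() applied element by element. The sample data and the names `result` and `expected` are illustrative assumptions, not part of the pandas documentation.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two strings and one missing value.
s = pd.Series(["a b c", "d e", np.nan])

result = s.str.split()

# Build the expected values with the built-in str.split(),
# leaving non-string entries (the NaN) untouched.
expected = [x.split() if isinstance(x, str) else x for x in s]

for got, want in zip(result, expected):
    if isinstance(want, list):
        # String entries match the built-in behaviour.
        assert list(got) == want
    else:
        # The missing value is propagated rather than split.
        assert pd.isna(got)
```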

Parameters:


pat : str, optional

String or regular expression to split on. If not specified, split on whitespace.

n : int, default -1 (all)

Limit the number of splits in the output. None, 0 and -1 are all interpreted as "return all splits".

expand : bool, default False

Expand the split strings into separate columns.

If True, return DataFrame/MultiIndex expanding dimensionality.
If False, return Series/Index, containing lists of strings.
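The pat and n parameters can be illustrated with a short sketch. It assumes the default behaviour in which a multi-character pat is compiled as a regular expression; the sample strings and the names `parts`, `unlimited` and `limited` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with inconsistent spacing around commas.
s = pd.Series(["a, b ,c", "d,e"])

# Regular-expression separator: a comma with optional surrounding whitespace.
pattern = r"\s*,\s*"

parts = s.str.split(pattern)

# n=None, n=0 and n=-1 all request every possible split.
unlimited = [s.str.split(pattern, n=value).tolist() for value in (None, 0, -1)]

# n=1 stops after the first separator in each string.
limited = s.str.split(pattern, n=1).tolist()

print(parts.tolist())
print(limited)
```

With n=1 the remainder of each string, including any later separators, is kept intact in the last list element.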

Returns:

Series, Index, DataFrame or MultiIndex

Type matches caller unless expand=True (see Notes).
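A quick way to check the return types listed above is to inspect them directly. This is only a sketch with made-up data; the variable names are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical callers: one Series and one Index with the same content.
s = pd.Series(["a b", "c d"])
i = pd.Index(["a b", "c d"])

ser_lists = s.str.split()             # Series caller, expand=False
ser_frame = s.str.split(expand=True)  # Series caller, expand=True
idx_lists = i.str.split()             # Index caller, expand=False
idx_multi = i.str.split(expand=True)  # Index caller, expand=True

for name, obj in [
    ("Series, expand=False", ser_lists),
    ("Series, expand=True", ser_frame),
    ("Index, expand=False", idx_lists),
    ("Index, expand=True", idx_multi),
]:
    print(name, "->", type(obj).__name__)
```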

Notes

The handling of the n keyword depends on the number of found splits:

If found splits > n, make first n splits only
If found splits <= n, make all splits
If, for a given row, the number of found splits < n, append None for padding up to n if expand=True

If using expand=True, Series and Index callers return DataFrame and MultiIndex objects, respectively.
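The rules above can be sketched on two rows with a different number of separators. The data and the names `lists` and `df` are hypothetical and serve only to illustrate the padding behaviour.

```python
import pandas as pd

# Row 0 contains three separators, row 1 contains only one.
s = pd.Series(["a b c d", "e f"])

# With n=2, row 0 is cut after two splits; row 1 has fewer than n
# separators, so all of its splits are made.
lists = s.str.split(" ", n=2)
print(lists.tolist())

# With expand=True every row must have the same number of columns,
# so the shorter row is padded with a missing value.
df = s.str.split(" ", n=2, expand=True)
print(df.shape)
print(df.isna())
```

Checking with `isna()` rather than comparing against None directly keeps the sketch independent of how the padding value is represented.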

Examples

>>> s = pd.Series(["this is good text", "but this is even better"])

By default, split returns an object of the same size whose elements are lists of the split strings.

>>> s.str.split()
0           [this, is, good, text]
1    [but, this, is, even, better]
dtype: object
>>> s.str.split("random")
0          [this is good text]
1    [but this is even better]
dtype: object

When using expand=True, the split elements will expand out into separate columns.

For a Series caller, the return type is DataFrame.

>>> s.str.split(expand=True)
      0     1     2     3       4
0  this    is  good  text    None
1   but  this    is  even  better
>>> s.str.split(" is ", expand=True)
          0            1
0      this    good text
1  but this  even better

For an Index caller, the return type is MultiIndex.

>>> i = pd.Index(["ba 100 001", "ba 101 002", "ba 102 003"])
>>> i.str.split(expand=True)
MultiIndex(levels=[['ba'], ['100', '101', '102'], ['001', '002', '003']],
       labels=[[0, 0, 0], [0, 1, 2], [0, 1, 2]])
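Because the printed representation of a MultiIndex can be hard to read, it may help to inspect the result through its public accessors instead. The following sketch reuses the Index from the example above; the name `mi` is an assumption.

```python
import pandas as pd

i = pd.Index(["ba 100 001", "ba 101 002", "ba 102 003"])
mi = i.str.split(expand=True)

# One level per split part.
print(mi.nlevels)

# One tuple per original entry, in the original order.
print(mi.tolist())

# Values of a single level, here the middle part of each string.
print(mi.get_level_values(1).tolist())
```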

The n parameter can be used to limit the number of splits in the output.

>>> s.str.split("is", n=1)
0          [th,  is good text]
1    [but th,  is even better]
dtype: object
>>> s.str.split("is", n=1, expand=True)
        0                1
0      th     is good text
1  but th   is even better

If NaN is present, it is propagated throughout the columns during the split.

>>> s = pd.Series(["this is good text", "but this is even better", np.nan])
>>> s.str.split(n=3, expand=True)
      0     1     2            3
0  this    is  good         text
1   but  this    is  even better
2   NaN   NaN   NaN          NaN

© 2008–2012, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
http://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.Series.str.split.html