pyspark.pandas.DataFrame.explode

DataFrame.explode(column, ignore_index=False)

Transform each element of a list-like to a row, replicating index values.

Parameters
column : str or tuple

Column to explode.

ignore_index : bool, default False

If True, the resulting index will be labeled 0, 1, …, n - 1.

Returns
DataFrame

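A DataFrame in which each element of the list-likes in the given column becomes its own row. The original index label is repeated for every row produced from the same list, and an empty list yields a single row holding NaN.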

See also

DataFrame.unstack

Pivot a level of the (necessarily hierarchical) index labels.

DataFrame.melt

Unpivot a DataFrame from wide format to long format.

Examples

>>> df = ps.DataFrame({'A': [[1, 2, 3], [], [3, 4]], 'B': 1})
>>> df
           A  B
0  [1, 2, 3]  1
1         []  1
2     [3, 4]  1
>>> df.explode('A')
     A  B
0  1.0  1
0  2.0  1
0  3.0  1
1  NaN  1
2  3.0  1
2  4.0  1
>>> df.explode('A', ignore_index=True)
     A  B
0  1.0  1
1  2.0  1
2  3.0  1
3  NaN  1
4  3.0  1
5  4.0  1
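
The row-replication rule shown in the examples above can be sketched in plain Python. This is only an illustration of the semantics, not how pyspark.pandas implements explode: the helper name explode_rows and the (index label, row dict) representation are hypothetical, and passing scalar values through unchanged is an assumption of the sketch. Unlike the DataFrame output, the sketch does not cast the exploded values to float.

```python
import math


def explode_rows(rows, column, ignore_index=False):
    """Illustrative helper (hypothetical, not part of pyspark.pandas).

    rows is a list of (index_label, row_dict) pairs.
    """
    out = []
    for label, row in rows:
        values = row[column]
        if isinstance(values, (list, tuple)):
            # An empty list-like still produces one row, holding NaN.
            items = list(values) if len(values) > 0 else [math.nan]
        else:
            # Assumption of this sketch: scalars pass through unchanged.
            items = [values]
        for item in items:
            new_row = dict(row)       # copy the other columns as-is
            new_row[column] = item    # one element per output row
            out.append((label, new_row))  # index label is replicated
    if ignore_index:
        # Relabel the result 0, 1, ..., n - 1.
        out = [(i, row) for i, (_, row) in enumerate(out)]
    return out


rows = [
    (0, {"A": [1, 2, 3], "B": 1}),
    (1, {"A": [], "B": 1}),
    (2, {"A": [3, 4], "B": 1}),
]

exploded = explode_rows(rows, "A")
labels = [label for label, _ in exploded]
values = [row["A"] for _, row in exploded]

relabeled = explode_rows(rows, "A", ignore_index=True)
new_labels = [label for label, _ in relabeled]
```

Here labels repeats each original index label once per list element, while new_labels runs from 0 to n - 1, mirroring the two explode calls in the examples.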