python - Pivot each group in Pandas -


using pandas i've invoked groupby on dataframe , obtained following:

>>>grouped = df.groupby(['cid']) key, gr in grouped:         print(key)         print(gr) out: cid  price      121  12      121  10      121  9 

i want have each group pivoted like:

cid price1 price2 price3 121     12     10      9 

what correct way pandas?

assuming have frame looking like

>>> df = pd.dataframe({"cid": np.arange(64)//8, "price": np.arange(64)}) >>> df.head()    cid  price 0    0      0 1    0      1 2    0      2 3    0      3 4    0      4 

then think can want combining groupby , pivot:

df = pd.dataframe({"cid": np.arange(64)//8, "price": np.arange(64)}) df["num"] = df.groupby("cid")["price"].cumcount() + 1 pivoted = df.pivot(index="cid", columns="num", values="price") pivoted.columns = "price" + pivoted.columns.astype(str) pivoted = pivoted.reset_index() 

which gives

>>> pivoted    cid  price1  price2  price3  price4  price5  price6  price7  price8 0    0       0       1       2       3       4       5       6       7 1    1       8       9      10      11      12      13      14      15 2    2      16      17      18      19      20      21      22      23 3    3      24      25      26      27      28      29      30      31 4    4      32      33      34      35      36      37      38      39 5    5      40      41      42      43      44      45      46      47 6    6      48      49      50      51      52      53      54      55 7    7      56      57      58      59      60      61      62      63 

aside: sticking numbers after end of strings, e.g. "price5", not idea. can't work them, don't sort way you'd expect, etc.


first, create column showing index in price:

>>> df["num"] = df.groupby("cid")["price"].cumcount() + 1 >>> df.head(10)    cid  price  num 0    0      0    1 1    0      1    2 2    0      2    3 [etc.] 7    0      7    8 8    1      8    1 9    1      9    2 

then pivot:

>>> pivoted = df.pivot(index="cid", columns="num", values="price") >>> pivoted num   1   2   3   4   5   6   7   8 cid                                 0     0   1   2   3   4   5   6   7 1     8   9  10  11  12  13  14  15 2    16  17  18  19  20  21  22  23 3    24  25  26  27  28  29  30  31 4    32  33  34  35  36  37  38  39 5    40  41  42  43  44  45  46  47 6    48  49  50  51  52  53  54  55 7    56  57  58  59  60  61  62  63 

then fix columns:

>>> pivoted.columns = "price" + pivoted.columns.astype(str) >>> pivoted      price1  price2  price3  price4  price5  price6  price7  price8 cid                                                                 0         0       1       2       3       4       5       6       7 1         8       9      10      11      12      13      14      15 2        16      17      18      19      20      21      22      23 3        24      25      26      27      28      29      30      31 4        32      33      34      35      36      37      38      39 5        40      41      42      43      44      45      46      47 6        48      49      50      51      52      53      54      55 7        56      57      58      59      60      61      62      63 

and reset index:

>>> pivoted = pivoted.reset_index() >>> pivoted    cid  price1  price2  price3  price4  price5  price6  price7  price8 0    0       0       1       2       3       4       5       6       7 1    1       8       9      10      11      12      13      14      15 2    2      16      17      18      19      20      21      22      23 3    3      24      25      26      27      28      29      30      31 4    4      32      33      34      35      36      37      38      39 5    5      40      41      42      43      44      45      46      47 6    6      48      49      50      51      52      53      54      55 7    7      56      57      58      59      60      61      62      63 

Comments

Popular posts from this blog

javascript - gulp-nodemon - nodejs restart after file change - Error: listen EADDRINUSE events.js:85 -

Fatal Python error: Py_Initialize: unable to load the file system codec. ImportError: No module named 'encodings' -

oracle - Changing start date for system jobs related to automatic statistics collections in 11g -