How to change the order of DataFrame columns?
How to change the order of DataFrame columns?
Question
I have the following DataFrame
(df
):
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.rand(10, 5))
I add more column(s) by assignment:
df['mean'] = df.mean(1)
How can I move the column mean
to the front, i.e. set it as first column leaving the order of the other columns untouched?
Popular Answer
One easy way would be to reassign the dataframe with a list of the columns, rearranged as needed.
This is what you have now:
In [6]: df
Out[6]:
0 1 2 3 4 mean
0 0.445598 0.173835 0.343415 0.682252 0.582616 0.445543
1 0.881592 0.696942 0.702232 0.696724 0.373551 0.670208
2 0.662527 0.955193 0.131016 0.609548 0.804694 0.632596
3 0.260919 0.783467 0.593433 0.033426 0.512019 0.436653
4 0.131842 0.799367 0.182828 0.683330 0.019485 0.363371
5 0.498784 0.873495 0.383811 0.699289 0.480447 0.587165
6 0.388771 0.395757 0.745237 0.628406 0.784473 0.588529
7 0.147986 0.459451 0.310961 0.706435 0.100914 0.345149
8 0.394947 0.863494 0.585030 0.565944 0.356561 0.553195
9 0.689260 0.865243 0.136481 0.386582 0.730399 0.561593
In [7]: cols = df.columns.tolist()
In [8]: cols
Out[8]: [0L, 1L, 2L, 3L, 4L, 'mean']
Rearrange cols
in any way you want. This is how I moved the last element to the first position:
In [12]: cols = cols[-1:] + cols[:-1]
In [13]: cols
Out[13]: ['mean', 0L, 1L, 2L, 3L, 4L]
Then reorder the dataframe like this:
In [16]: df = df[cols] # OR df = df.ix[:, cols]
In [17]: df
Out[17]:
mean 0 1 2 3 4
0 0.445543 0.445598 0.173835 0.343415 0.682252 0.582616
1 0.670208 0.881592 0.696942 0.702232 0.696724 0.373551
2 0.632596 0.662527 0.955193 0.131016 0.609548 0.804694
3 0.436653 0.260919 0.783467 0.593433 0.033426 0.512019
4 0.363371 0.131842 0.799367 0.182828 0.683330 0.019485
5 0.587165 0.498784 0.873495 0.383811 0.699289 0.480447
6 0.588529 0.388771 0.395757 0.745237 0.628406 0.784473
7 0.345149 0.147986 0.459451 0.310961 0.706435 0.100914
8 0.553195 0.394947 0.863494 0.585030 0.565944 0.356561
9 0.561593 0.689260 0.865243 0.136481 0.386582 0.730399
Read more… Read less…
You could also do something like this:
df = df[['mean', '0', '1', '2', '3']]
You can get the list of columns with:
cols = list(df.columns.values)
The output will produce:
['0', '1', '2', '3', 'mean']
...which is then easy to rearrange manually before dropping it into the first function
Just assign the column names in the order you want them:
In [39]: df
Out[39]:
0 1 2 3 4 mean
0 0.172742 0.915661 0.043387 0.712833 0.190717 1
1 0.128186 0.424771 0.590779 0.771080 0.617472 1
2 0.125709 0.085894 0.989798 0.829491 0.155563 1
3 0.742578 0.104061 0.299708 0.616751 0.951802 1
4 0.721118 0.528156 0.421360 0.105886 0.322311 1
5 0.900878 0.082047 0.224656 0.195162 0.736652 1
6 0.897832 0.558108 0.318016 0.586563 0.507564 1
7 0.027178 0.375183 0.930248 0.921786 0.337060 1
8 0.763028 0.182905 0.931756 0.110675 0.423398 1
9 0.848996 0.310562 0.140873 0.304561 0.417808 1
In [40]: df = df[['mean', 4,3,2,1]]
Now, 'mean' column comes out in the front:
In [41]: df
Out[41]:
mean 4 3 2 1
0 1 0.190717 0.712833 0.043387 0.915661
1 1 0.617472 0.771080 0.590779 0.424771
2 1 0.155563 0.829491 0.989798 0.085894
3 1 0.951802 0.616751 0.299708 0.104061
4 1 0.322311 0.105886 0.421360 0.528156
5 1 0.736652 0.195162 0.224656 0.082047
6 1 0.507564 0.586563 0.318016 0.558108
7 1 0.337060 0.921786 0.930248 0.375183
8 1 0.423398 0.110675 0.931756 0.182905
9 1 0.417808 0.304561 0.140873 0.310562
How about:
df.insert(0, 'mean', df.mean(1))
http://pandas.pydata.org/pandas-docs/stable/dsintro.html#column-selection-addition-deletion
In your case,
df = df.reindex(columns=['mean',0,1,2,3,4])
will do exactly what you want.
In my case (general form):
df = df.reindex(columns=sorted(df.columns))
df = df.reindex(columns=(['opened'] + list([a for a in df.columns if a != 'opened']) ))
You need to create a new list of your columns in the desired order, then use df = df[cols]
to rearrange the columns in this new order.
cols = ['mean'] + [col for col in df if col != 'mean']
df = df[cols]
You can also use a more general approach. In this example, the last column (indicated by -1) is inserted as the first column.
cols = [df.columns[-1]] + [col for col in df if col != df.columns[-1]]
df = df[cols]
You can also use this approach for reordering columns in a desired order if they are present in the DataFrame.
inserted_cols = ['a', 'b', 'c']
cols = ([col for col in inserted_cols if col in df]
+ [col for col in df if col not in inserted_cols])
df = df[cols]