Advertisement
Advertisement


Renaming columns in pandas


Question

I have a DataFrame using pandas and column labels that I need to edit to replace the original column labels.

I'd like to change the column names in a DataFrame A where the original column names are:

['$a', '$b', '$c', '$d', '$e'] 

to

['a', 'b', 'c', 'd', 'e'].

I have the edited column names stored it in a list, but I don't know how to replace the column names.

2017/12/12
1
1932
12/12/2017 6:55:20 PM

Accepted Answer

Just assign it to the .columns attribute:

>>> df = pd.DataFrame({'$a':[1,2], '$b': [10,20]})
>>> df.columns = ['a', 'b']
>>> df
   a   b
0  1  10
1  2  20
2020/07/03
1951
7/3/2020 7:15:46 PM


The rename method can take a function, for example:

In [11]: df.columns
Out[11]: Index([u'$a', u'$b', u'$c', u'$d', u'$e'], dtype=object)

In [12]: df.rename(columns=lambda x: x[1:], inplace=True)

In [13]: df.columns
Out[13]: Index([u'a', u'b', u'c', u'd', u'e'], dtype=object)
2019/10/20

As documented in Working with text data:

df.columns = df.columns.str.replace('$','')
2020/04/10

Pandas 0.21+ Answer

There have been some significant updates to column renaming in version 0.21.

  • The rename method has added the axis parameter which may be set to columns or 1. This update makes this method match the rest of the pandas API. It still has the index and columns parameters but you are no longer forced to use them.
  • The set_axis method with the inplace set to False enables you to rename all the index or column labels with a list.

Examples for Pandas 0.21+

Construct sample DataFrame:

df = pd.DataFrame({'$a':[1,2], '$b': [3,4], 
                   '$c':[5,6], '$d':[7,8], 
                   '$e':[9,10]})

   $a  $b  $c  $d  $e
0   1   3   5   7   9
1   2   4   6   8  10

Using rename with axis='columns' or axis=1

df.rename({'$a':'a', '$b':'b', '$c':'c', '$d':'d', '$e':'e'}, axis='columns')

or

df.rename({'$a':'a', '$b':'b', '$c':'c', '$d':'d', '$e':'e'}, axis=1)

Both result in the following:

   a  b  c  d   e
0  1  3  5  7   9
1  2  4  6  8  10

It is still possible to use the old method signature:

df.rename(columns={'$a':'a', '$b':'b', '$c':'c', '$d':'d', '$e':'e'})

The rename function also accepts functions that will be applied to each column name.

df.rename(lambda x: x[1:], axis='columns')

or

df.rename(lambda x: x[1:], axis=1)

Using set_axis with a list and inplace=False

You can supply a list to the set_axis method that is equal in length to the number of columns (or index). Currently, inplace defaults to True, but inplace will be defaulted to False in future releases.

df.set_axis(['a', 'b', 'c', 'd', 'e'], axis='columns', inplace=False)

or

df.set_axis(['a', 'b', 'c', 'd', 'e'], axis=1, inplace=False)

Why not use df.columns = ['a', 'b', 'c', 'd', 'e']?

There is nothing wrong with assigning columns directly like this. It is a perfectly good solution.

The advantage of using set_axis is that it can be used as part of a method chain and that it returns a new copy of the DataFrame. Without it, you would have to store your intermediate steps of the chain to another variable before reassigning the columns.

# new for pandas 0.21+
df.some_method1()
  .some_method2()
  .set_axis()
  .some_method3()

# old way
df1 = df.some_method1()
        .some_method2()
df1.columns = columns
df1.some_method3()
2017/11/17

Since you only want to remove the $ sign in all column names, you could just do:

df = df.rename(columns=lambda x: x.replace('$', ''))

OR

df.rename(columns=lambda x: x.replace('$', ''), inplace=True)
2014/03/26

df.columns = ['a', 'b', 'c', 'd', 'e']

It will replace the existing names with the names you provide, in the order you provide.

2018/10/12