Advertisement
Advertisement


Set value for particular cell in pandas DataFrame using index


Question

I've created a Pandas DataFrame

df = DataFrame(index=['A','B','C'], columns=['x','y'])

and got this

    x    y
A  NaN  NaN
B  NaN  NaN
C  NaN  NaN


Then I want to assign value to particular cell, for example for row 'C' and column 'x'. I've expected to get such result:

    x    y
A  NaN  NaN
B  NaN  NaN
C  10  NaN

with this code:

df.xs('C')['x'] = 10

but contents of df haven't changed. It's again only NaNs in DataFrame.

Any suggestions?

2019/03/26
1
516
3/26/2019 5:19:58 PM

Accepted Answer

RukTech's answer, df.set_value('C', 'x', 10), is far and away faster than the options I've suggested below. However, it has been slated for deprecation.

Going forward, the recommended method is .iat/.at.


Why df.xs('C')['x']=10 does not work:

df.xs('C') by default, returns a new dataframe with a copy of the data, so

df.xs('C')['x']=10

modifies this new dataframe only.

df['x'] returns a view of the df dataframe, so

df['x']['C'] = 10

modifies df itself.

Warning: It is sometimes difficult to predict if an operation returns a copy or a view. For this reason the docs recommend avoiding assignments with "chained indexing".


So the recommended alternative is

df.at['C', 'x'] = 10

which does modify df.


In [18]: %timeit df.set_value('C', 'x', 10)
100000 loops, best of 3: 2.9 µs per loop

In [20]: %timeit df['x']['C'] = 10
100000 loops, best of 3: 6.31 µs per loop

In [81]: %timeit df.at['C', 'x'] = 10
100000 loops, best of 3: 9.2 µs per loop
2017/09/16
637
9/16/2017 2:35:13 AM

Update: The .set_value method is going to be deprecated. .iat/.at are good replacements, unfortunately pandas provides little documentation


The fastest way to do this is using set_value. This method is ~100 times faster than .ix method. For example:

df.set_value('C', 'x', 10)

2019/08/07

You can also use a conditional lookup using .loc as seen here:

df.loc[df[<some_column_name>] == <condition>, [<another_column_name>]] = <value_to_add>

where <some_column_name is the column you want to check the <condition> variable against and <another_column_name> is the column you want to add to (can be a new column or one that already exists). <value_to_add> is the value you want to add to that column/row.

This example doesn't work precisely with the question at hand, but it might be useful for someone wants to add a specific value based on a condition.

2018/12/18

The recommended way (according to the maintainers) to set a value is:

df.ix['x','C']=10

Using 'chained indexing' (df['x']['C']) may lead to problems.

See:

2017/05/23

Try using df.loc[row_index,col_indexer] = value

2015/10/16

This is the only thing that worked for me!

df.loc['C', 'x'] = 10

Learn more about .loc here.

2017/01/23