How to update values in a specific row in a Python Pandas DataFrame?

Question:

With the nice indexing methods in Pandas, I have no problem extracting data in various ways. On the other hand, I am still confused about how to change data in an existing DataFrame.

In the following code, I have two DataFrames, and my goal is to update values in a specific row of the first df with values from the second df. How can I achieve this?

import pandas as pd
df = pd.DataFrame({'filename' :  ['test0.dat', 'test2.dat'], 
                                  'm': [12, 13], 'n' : [None, None]})
df2 = pd.DataFrame({'filename' :  'test2.dat', 'n':16}, index=[0])

# this overwrites the first row but we want to update the second
# df.update(df2)

# this does not update anything
df.loc[df.filename == 'test2.dat'].update(df2)

print(df)

gives

    filename   m     n
0  test0.dat  12  None
1  test2.dat  13  None

[2 rows x 3 columns]

But how can I achieve this:

    filename   m     n
0  test0.dat  12  None
1  test2.dat  13    16

[2 rows x 3 columns]

Answer:

First of all, pandas update() aligns rows on the index. When an update call changes nothing, check the index on both the left-hand side and the right-hand side. Here the only row of df2 has index 0, so it lines up with the first row of df rather than the second. If you would rather not rework the indices to follow your identification logic, you can assign the value directly, along the lines of

>>> df.loc[df.filename == 'test2.dat', 'n'] = df2[df2.filename == 'test2.dat'].loc[0]['n']
>>> df
Out[331]: 
    filename   m     n
0  test0.dat  12  None
1  test2.dat  13    16
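
The remark about index alignment can also be sketched the other way round: instead of assigning a single cell, relabel df2 so that its row carries the same index label as the matching row of df, and then call update(). This is only a minimal sketch that assumes the df and df2 from the question; the names target and aligned are illustrative and do not appear in the original post.

```python
import pandas as pd

df = pd.DataFrame({'filename': ['test0.dat', 'test2.dat'],
                   'm': [12, 13], 'n': [None, None]})
df2 = pd.DataFrame({'filename': 'test2.dat', 'n': 16}, index=[0])

# Row label(s) in df whose filename matches; here this is the label 1.
target = df.index[df.filename == 'test2.dat']

# Relabel a copy of df2's single row so it lines up with that row of df.
aligned = df2.copy()
aligned.index = target

# update() aligns on the index, so it now writes into the second row
# instead of the first.
df.update(aligned)
```

This relies on df2 holding exactly one row for exactly one matching row in df; with several rows, indexing both frames by filename is likely the simpler route.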

If you want to do this for the whole table, I suggest a method I believe is superior to the previously mentioned ones: since your identifier is filename, set filename as the index of both frames, and then use update() as you originally intended. Both the merge and the apply() approaches carry unnecessary overhead:

>>> df.set_index('filename', inplace=True)
>>> df2.set_index('filename', inplace=True)
>>> df.update(df2)
>>> df
Out[292]: 
            m     n
filename           
test0.dat  12  None
test2.dat  13    16
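
If the original layout with filename as an ordinary column is wanted afterwards, the same idea can be written without inplace=True and finished with reset_index(). Again, this is a sketch under the assumption that df and df2 are the frames from the question; indexed and result are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'filename': ['test0.dat', 'test2.dat'],
                   'm': [12, 13], 'n': [None, None]})
df2 = pd.DataFrame({'filename': 'test2.dat', 'n': 16}, index=[0])

# Index both frames by the identifier so update() aligns rows on filename.
# set_index() returns a new frame here, so df itself is left untouched.
indexed = df.set_index('filename')
indexed.update(df2.set_index('filename'))

# Turn filename back into a regular column with the default integer index.
result = indexed.reset_index()
```

Note that update() skips NaN values in the right-hand frame by default, so missing entries in df2 do not overwrite existing values in df.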
