Pandas DataFrame:根据条件替换列中的所有值

时间:2022-05-28 22:55:22

I have a simple DataFrame like the following:

我有一个简单的DataFrame,如下所示:

Pandas DataFrame:根据条件替换列中的所有值

I want to select all values from the 'First Season' column and replace those that are over 1990 by 1. In this example, only Baltimore Ravens would have the 1996 replaced by 1 (keeping the rest of the data intact).

我想从“第一季”列中选择所有值,并将那些超过1990年的值替换为1.在此示例中,只有Baltimore Ravens将1996年替换为1(保持其余数据完好无损)。

I have used the following:

我使用了以下内容:

df.loc[(df['First Season'] > 1990)] = 1

But, it replaces all the values in that row by 1, and not just the values in the 'First Season' column.

但是,它将该行中的所有值替换为1,而不仅仅是“第一季”列中的值。

How can I replace just the values from that column?

如何只替换该列中的值?

1 个解决方案

#1


68  

You need to select that column:

您需要选择该列:

In [41]:
df.loc[df['First Season'] > 1990, 'First Season'] = 1
df

Out[41]:
                 Team  First Season  Total Games
0      Dallas Cowboys          1960          894
1       Chicago Bears          1920         1357
2   Green Bay Packers          1921         1339
3      Miami Dolphins          1966          792
4    Baltimore Ravens             1          326
5  San Franciso 49ers          1950         1003

So the syntax here is:

所以这里的语法是:

df.loc[<mask>(here mask is generating the labels to index) , <optional column(s)> ]

You can check the docs and also the 10 minutes to pandas which shows the semantics

您可以查看文档以及显示语义的熊猫10分钟

EDIT

编辑

If you want to generate a boolean indicator then you can just use the boolean condition to generate a boolean Series and cast the dtype to int this will convert True and False to 1 and 0 respectively:

如果你想生成一个布尔指标,那么你可以使用布尔条件生成一个布尔系列并将dtype转换为int,这将分别将True和False转换为1和0:

In [43]:
df['First Season'] = (df['First Season'] > 1990).astype(int)
df

Out[43]:
                 Team  First Season  Total Games
0      Dallas Cowboys             0          894
1       Chicago Bears             0         1357
2   Green Bay Packers             0         1339
3      Miami Dolphins             0          792
4    Baltimore Ravens             1          326
5  San Franciso 49ers             0         1003

#1


68  

You need to select that column:

您需要选择该列:

In [41]:
df.loc[df['First Season'] > 1990, 'First Season'] = 1
df

Out[41]:
                 Team  First Season  Total Games
0      Dallas Cowboys          1960          894
1       Chicago Bears          1920         1357
2   Green Bay Packers          1921         1339
3      Miami Dolphins          1966          792
4    Baltimore Ravens             1          326
5  San Franciso 49ers          1950         1003

So the syntax here is:

所以这里的语法是:

df.loc[<mask>(here mask is generating the labels to index) , <optional column(s)> ]

You can check the docs and also the 10 minutes to pandas which shows the semantics

您可以查看文档以及显示语义的熊猫10分钟

EDIT

编辑

If you want to generate a boolean indicator then you can just use the boolean condition to generate a boolean Series and cast the dtype to int this will convert True and False to 1 and 0 respectively:

如果你想生成一个布尔指标,那么你可以使用布尔条件生成一个布尔系列并将dtype转换为int,这将分别将True和False转换为1和0:

In [43]:
df['First Season'] = (df['First Season'] > 1990).astype(int)
df

Out[43]:
                 Team  First Season  Total Games
0      Dallas Cowboys             0          894
1       Chicago Bears             0         1357
2   Green Bay Packers             0         1339
3      Miami Dolphins             0          792
4    Baltimore Ravens             1          326
5  San Franciso 49ers             0         1003