Pandas中是否有一种方法可以在dataframe.apply中使用先前的行值,同时在apply中计算前一个值?

时间:2022-09-15 20:20:12

I have the following dataframe:

我有以下数据帧:

 Index_Date    A    B    C    D
 ===============================
 2015-01-31    10   10   Nan  10
 2015-02-01     2    3   Nan  22 
 2015-02-02    10   60   Nan  280
 2015-02-03    10   100   Nan  250

Require:

 Index_Date    A    B    C    D
 ===============================
 2015-01-31    10   10   10   10
 2015-02-01     2    3   23   22
 2015-02-02    10   60   290  280
 2015-02-03    10   100  3000 250

Column C is derived for 2015-01-31 by taking value of D.

C列是根据D的值得出的2015-01-31。

Then I need to use the value of C for 2015-01-31 and multiply by the value of A on 2015-02-01 and add B.

然后我需要使用C的值2015-01-31并乘以2015-02-01的A值并添加B.

I have attempted an apply and a shift using an if else by this gives a key error.

我已尝试使用if else进行应用和移位,这会产生一个关键错误。

3 个解决方案

#1


14  

First, create the derived value:

首先,创建派生值:

df.loc[0, 'C'] = df.loc[0, 'D']

Then iterate through the remaining rows and fill the calculated values:

然后迭代剩余的行并填充计算的值:

for i in range(1, len(df)):
    df.loc[i, 'C'] = df.loc[i-1, 'C'] * df.loc[i, 'A'] + df.loc[i, 'B']


  Index_Date   A   B    C    D
0 2015-01-31  10  10   10   10
1 2015-02-01   2   3   23   22
2 2015-02-02  10  60  290  280

#2


10  

Given a column of numbers:

给出一列数字:

lst = []
cols = ['A']
for a in range(100, 105):
    lst.append([a])
df = pd.DataFrame(lst, columns=cols, index=range(5))
df

    A
0   100
1   101
2   102
3   103
4   104

You can reference the previous row with shift:

您可以使用shift引用上一行:

df['Change'] = df.A - df.A.shift(1)
df

    A   Change
0   100 NaN
1   101 1.0
2   102 1.0
3   103 1.0
4   104 1.0

#3


6  

Applying the recursive function on numpy arrays will be faster than the current answer.

在numpy数组上应用递归函数将比当前答案更快。

df = pd.DataFrame(np.repeat(np.arange(2, 6),3).reshape(4,3), columns=['A', 'B', 'D'])
new = [df.D.values[0]]
for i in range(1, len(df.index)):
    new.append(new[i-1]*df.A.values[i]+df.B.values[i])
df['C'] = new

Output

      A  B  D    C
   0  1  1  1    1
   1  2  2  2    4
   2  3  3  3   15
   3  4  4  4   64
   4  5  5  5  325

#1


14  

First, create the derived value:

首先,创建派生值:

df.loc[0, 'C'] = df.loc[0, 'D']

Then iterate through the remaining rows and fill the calculated values:

然后迭代剩余的行并填充计算的值:

for i in range(1, len(df)):
    df.loc[i, 'C'] = df.loc[i-1, 'C'] * df.loc[i, 'A'] + df.loc[i, 'B']


  Index_Date   A   B    C    D
0 2015-01-31  10  10   10   10
1 2015-02-01   2   3   23   22
2 2015-02-02  10  60  290  280

#2


10  

Given a column of numbers:

给出一列数字:

lst = []
cols = ['A']
for a in range(100, 105):
    lst.append([a])
df = pd.DataFrame(lst, columns=cols, index=range(5))
df

    A
0   100
1   101
2   102
3   103
4   104

You can reference the previous row with shift:

您可以使用shift引用上一行:

df['Change'] = df.A - df.A.shift(1)
df

    A   Change
0   100 NaN
1   101 1.0
2   102 1.0
3   103 1.0
4   104 1.0

#3


6  

Applying the recursive function on numpy arrays will be faster than the current answer.

在numpy数组上应用递归函数将比当前答案更快。

df = pd.DataFrame(np.repeat(np.arange(2, 6),3).reshape(4,3), columns=['A', 'B', 'D'])
new = [df.D.values[0]]
for i in range(1, len(df.index)):
    new.append(new[i-1]*df.A.values[i]+df.B.values[i])
df['C'] = new

Output

      A  B  D    C
   0  1  1  1    1
   1  2  2  2    4
   2  3  3  3   15
   3  4  4  4   64
   4  5  5  5  325