Pandas DataFrame: np.where with multiple conditions

Time: 2022-07-14 23:48:28

Using pandas and numpy. How may I achieve the following:

df['thecol'] = np.where(
    (df["a"] >= df["a"].shift(1)) &
    (df["a"] >= df["a"].shift(2)) &
    (df["a"] >= df["a"].shift(3)) &
    (df["a"] >= df["a"].shift(4)) &
    (df["a"] >= df["a"].shift(5)) &
    (df["a"] >= df["a"].shift(6)) &
    (df["a"] >= df["a"].shift(7)) &
    (df["a"] >= df["a"].shift(8)) &
    (df["a"] >= df["a"].shift(9)) &
    (df["a"] >= df["a"].shift(10)),
    'istrue', 'isnottrue')

Without such ugly repetition of code, given that only the number changes? I would like the same code to work for any number I provide, without typing it all out manually.

It is meant to compare the current value in column "a" to the value in the same column one row above, two rows above, and so on, and to result in "istrue" if all of these conditions are true.

I tried shifting the dataframe in a for loop, appending the values to a list, and calculating their maximum so that I would need (df["a"] >= maxvalue) only once, but that did not work for me either. I am a novice at Python and will likely ask more silly questions in the near future.

This works, but I would like it to work without so much repetitive code so I can learn to code properly. I also tried examples using a yield generator but could not get them working either.

@Edit: Answered by Wen. I needed rolling.

In the end I came up with this terrible terrible approach:

def whereconditions(n):
    s1 = 'df["thecol"] = np.where('
    L = []
    while n > 0:
        s2 = '(df["a"] >= df["a"].shift('+str(n)+')) &'
        L.append(s2)
        n = n -1
    s3 = ",'istrue','isnottrue')"
    r = s1+str([x for x in L]).replace("'","").replace(",","").replace("&]","")+s3
    return str(r.replace("([(","(("))
call = whereconditions(10)
exec(call)
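
For comparison, the same chain of conditions can be built without string assembly or exec. The following is only a sketch: the helper name flag_not_below_previous and the sample data are made up for illustration, and it assumes n is at least 1. It creates one boolean Series per shift and ANDs them together, which mirrors the hand-written version with "&".

```python
from functools import reduce
import operator

import numpy as np
import pandas as pd


def flag_not_below_previous(df, n, col="a"):
    """Label rows whose value is >= each of the n values directly above.

    Hypothetical helper for illustration; assumes n >= 1.
    """
    # One boolean Series per shift; comparisons against NaN are False,
    # so the first n rows always end up as "isnottrue".
    conditions = [df[col] >= df[col].shift(k) for k in range(1, n + 1)]
    # AND all conditions together, the same as chaining them with "&".
    combined = reduce(operator.and_, conditions)
    return np.where(combined, "istrue", "isnottrue")


# Made-up sample data
df = pd.DataFrame({"a": [1, 3, 2, 5, 4, 6]})
df["thecol"] = flag_not_below_previous(df, 2)
```

Calling flag_not_below_previous(df, 10) would correspond to the ten conditions typed out in the question.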

1 solution

#1


Sounds like you need rolling

np.where(df['a']==df['a'].rolling(10).max(),'istrue','isnottrue')
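
One detail worth checking when adapting this: rolling(w) includes the current row, so comparing against the n rows above (shift(1) through shift(n)) corresponds to a window of n + 1 rows. By that count, the ten shifts in the question would match rolling(11) rather than rolling(10). A small sketch, with made-up data and a hypothetical parameter n, assuming column "a" has no missing values:

```python
import numpy as np
import pandas as pd

n = 3  # hypothetical: number of rows above the current one to compare against

# Made-up sample data
df = pd.DataFrame({"a": [1, 2, 3, 2, 5, 5, 4, 6]})

# The window covers the current row plus the n rows above it.
# Until n + 1 rows are available the rolling max is NaN, so those
# rows compare as not equal and become "isnottrue".
rolling_max = df["a"].rolling(n + 1).max()
df["thecol"] = np.where(df["a"] == rolling_max, "istrue", "isnottrue")
```

The current value equals the window maximum exactly when it is greater than or equal to every other value in the window, which is why a single equality test can stand in for the chain of >= comparisons.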
