Pythonic way to replace list values with an upper and lower bound (clamping, clipping, thresholding)?

Time: 2021-07-21 12:26:25

I want to replace outliers in a list. To do that, I define an upper and a lower bound: every value above the upper bound is replaced with the upper bound, and every value below the lower bound is replaced with the lower bound. My approach was to do this in two steps using a numpy array.

Now I wonder if it's possible to do this in one step, as I guess it could improve performance and readability.

Is there a shorter way to do this?

import numpy as np

lowerBound, upperBound = 3, 7

arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

arr[arr > upperBound] = upperBound
arr[arr < lowerBound] = lowerBound

# [3 3 3 3 4 5 6 7 7 7]
print(arr)

2 answers

#1


31  

You can use numpy.clip:

In [1]: import numpy as np

In [2]: arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [3]: lowerBound, upperBound = 3, 7

In [4]: np.clip(arr, lowerBound, upperBound, out=arr)
Out[4]: array([3, 3, 3, 3, 4, 5, 6, 7, 7, 7])

In [5]: arr
Out[5]: array([3, 3, 3, 3, 4, 5, 6, 7, 7, 7])
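
Note that out=arr in the session above overwrites the array in place. As a small sketch of two variations (the names clipped, capped, and floored are only illustrative): leaving out the out argument returns a new array, and passing None for one of the bounds clips on one side only.

```python
import numpy as np

arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
lowerBound, upperBound = 3, 7

# Without out=, np.clip returns a new array and leaves arr untouched.
clipped = np.clip(arr, lowerBound, upperBound)

# Passing None for one bound clips only on the other side.
capped = np.clip(arr, None, upperBound)   # upper bound only
floored = np.clip(arr, lowerBound, None)  # lower bound only

print(clipped)  # [3 3 3 3 4 5 6 7 7 7]
print(capped)   # [0 1 2 3 4 5 6 7 7 7]
print(floored)  # [3 3 3 3 4 5 6 7 8 9]
```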

#2


13  

For an alternative that doesn't rely on numpy, you could always do

arr = [max(lower_bound, min(x, upper_bound)) for x in arr]

If you just wanted to set an upper bound, you could of course write arr = [min(x, upper_bound) for x in arr]. Or similarly if you just wanted a lower bound, you'd use max instead.

Here, I've just applied both operations, written together.
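
To make the three variants concrete, here is a minimal sketch on the question's sample data. The list is named values here (an illustrative name) so the original input stays available for comparison.

```python
lower_bound, upper_bound = 3, 7
values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Upper bound only: anything above upper_bound becomes upper_bound.
capped = [min(x, upper_bound) for x in values]

# Lower bound only: anything below lower_bound becomes lower_bound.
floored = [max(lower_bound, x) for x in values]

# Both bounds at once, as in the one-liner above.
clamped = [max(lower_bound, min(x, upper_bound)) for x in values]

print(capped)   # [0, 1, 2, 3, 4, 5, 6, 7, 7, 7]
print(floored)  # [3, 3, 3, 3, 4, 5, 6, 7, 8, 9]
print(clamped)  # [3, 3, 3, 3, 4, 5, 6, 7, 7, 7]
```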

Edit: Here's a slightly more in-depth explanation:

Given an element x of the array (and assuming that your upper_bound is at least as big as your lower_bound!), you'll have one of three cases:

i) x < lower_bound

ii) x > upper_bound

iii) lower_bound <= x <= upper_bound.

In case (i), the max/min expression first evaluates to max(lower_bound, x), which then resolves to lower_bound.

In case (ii), the expression first becomes max(lower_bound, upper_bound), which then becomes upper_bound.

In case (iii), we get max(lower_bound, x) which resolves to just x.

In all three cases, the output is what we want.
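
The three cases can be checked directly. The sketch below wraps the expression in a helper called clamp (a hypothetical name, not a built-in) and makes the assumption that the upper bound is at least as large as the lower bound explicit by raising an error otherwise.

```python
def clamp(x, lower_bound, upper_bound):
    """Hypothetical helper: limit x to the range [lower_bound, upper_bound]."""
    # The case analysis only holds if the bounds are ordered correctly.
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")
    return max(lower_bound, min(x, upper_bound))

case_i = clamp(1, 3, 7)    # x < lower_bound            -> lower_bound
case_ii = clamp(9, 3, 7)   # x > upper_bound            -> upper_bound
case_iii = clamp(5, 3, 7)  # lower_bound <= x <= upper  -> x unchanged

print(case_i, case_ii, case_iii)  # 3 7 5
```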
