Why use numpy.r_ instead of concatenate in Python

Date: 2021-08-20 08:00:39

In which cases is using objects like numpy.r_ or numpy.c_ better (more efficient, more suitable) than using functions like concatenate or vstack, for example?

I am trying to understand code where the programmer wrote something like:

return np.r_[0.0, 1d_array, 0.0] == 2

where 1d_array is an array whose values can be 0, 1 or 2. Why not use np.concatenate (for example) instead? Like:

return np.concatenate([[0.0], 1d_array, [0.0]]) == 2

It is more readable and apparently it does the same thing.
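
For concreteness, here is a minimal sanity check (the sample values below are made up, standing in for the unknown 1d_array) that the two spellings return the same boolean mask:

import numpy as np

a = np.array([0, 1, 2, 1, 2])                    # hypothetical stand-in for 1d_array
mask_r = np.r_[0.0, a, 0.0] == 2                 # pad with zeros, then compare
mask_c = np.concatenate([[0.0], a, [0.0]]) == 2  # same thing, spelled with concatenate
print(mask_r)                                    # [False False False  True False  True False]
print(np.array_equal(mask_r, mask_c))            # True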

3 Solutions

#1 (score: 10)

np.r_ is implemented in the numpy/lib/index_tricks.py file. This is pure Python code, with no special compiled stuff. So it is not going to be any faster than the equivalent written with concatenate, arange and linspace. It's useful only if the notation fits your way of thinking and your needs.

In your example it just saves converting the scalars to lists or arrays:

In [452]: np.r_[0.0, np.array([1,2,3,4]), 0.0]
Out[452]: array([ 0.,  1.,  2.,  3.,  4.,  0.])

With the same arguments, concatenate raises an error:

In [453]: np.concatenate([0.0, np.array([1,2,3,4]), 0.0])
...
ValueError: zero-dimensional arrays cannot be concatenated

It works correctly with the added []:

In [454]: np.concatenate([[0.0], np.array([1,2,3,4]), [0.0]])
Out[454]: array([ 0.,  1.,  2.,  3.,  4.,  0.])

hstack takes care of that by passing all arguments through [atleast_1d(_m) for _m in tup]:

In [455]: np.hstack([0.0, np.array([1,2,3,4]), 0.0])
Out[455]: array([ 0.,  1.,  2.,  3.,  4.,  0.])

So at least in simple cases it is most similar to hstack.
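
A minimal sketch of that atleast_1d-then-concatenate pattern, which is effectively what hstack (and r_, in this simple case) does:

import numpy as np

parts = (0.0, np.array([1, 2, 3, 4]), 0.0)
np.concatenate([np.atleast_1d(p) for p in parts])  # array([ 0.,  1.,  2.,  3.,  4.,  0.])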

But the real usefulness of r_ comes when you want to use ranges:

np.r_[0.0, 1:5, 0.0]
np.hstack([0.0, np.arange(1,5), 0.0])
np.r_[0.0, slice(1,5), 0.0]
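# all three produce array([ 0.,  1.,  2.,  3.,  4.,  0.])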

r_ lets you use the : syntax that is used in indexing. That's because it is actually an instance of a class that has a __getitem__ method. index_tricks uses this programming trick several times.
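
To see how the bracket syntax can work at all, here is a toy illustration of the __getitem__ trick (not NumPy's actual implementation; it only handles the plain scalar/array/slice cases):

import numpy as np

class ToyR:
    """Concatenate whatever appears inside the brackets, expanding slices."""
    def __getitem__(self, key):
        items = key if isinstance(key, tuple) else (key,)
        parts = []
        for item in items:
            if isinstance(item, slice):                      # 1:5 arrives here as slice(1, 5)
                step = 1 if item.step is None else item.step
                parts.append(np.arange(item.start, item.stop, step))
            else:                                            # scalars and arrays
                parts.append(np.atleast_1d(item))
        return np.concatenate(parts)

toy_r = ToyR()
toy_r[0.0, 1:5, 0.0]   # array([0., 1., 2., 3., 4., 0.])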

They've thrown in other bells and whistles.

With an imaginary step, it uses np.linspace to expand the slice rather than np.arange:

np.r_[-1:1:6j, [0]*3, 5, 6]

produces:

array([-1. , -0.6, -0.2,  0.2,  0.6,  1. ,  0. ,  0. ,  0. ,  5. ,  6. ])
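
The -1:1:6j part is shorthand for a 6-point linspace; spelled out explicitly:

np.linspace(-1, 1, 6)                              # array([-1. , -0.6, -0.2,  0.2,  0.6,  1. ])
np.hstack([np.linspace(-1, 1, 6), [0]*3, 5, 6])    # same result as the r_ expression above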

There are more details in the documentation.

I did some timing tests for many slices in https://*.com/a/37625115/901925.

#2 (score: 3)

All the explanation you need:

https://sourceforge.net/p/numpy/mailman/message/13869535/

I found the most relevant part to be:

"""
For r_ and c_ I'm summarizing, but effectively they seem to be doing
something like:

r_[args]:
    concatenate( map(atleast_1d,args),axis=0 )

c_[args]:
    concatenate( map(atleast_1d,args),axis=1 )

c_ behaves almost exactly like hstack -- with the addition of range
literals being allowed.

r_ is most like vstack, but a little different since it effectively
uses atleast_1d, instead of atleast_2d.  So you have
>>> numpy.vstack((1,2,3,4))
array([[1],
       [2],
       [3],
       [4]])
but
>>> numpy.r_[1,2,3,4]
array([1, 2, 3, 4])
"""

#3 (score: 1)

I was also interested in this question and compared the speed of

numpy.c_[a, a]
numpy.stack([a, a]).T
numpy.vstack([a, a]).T
numpy.column_stack([a, a])
numpy.concatenate([a[:,None], a[:,None]], axis=1)

which all do the same thing for any input vector a (a quick equivalence check is sketched after the plotting code below). Here's what I found (using perfplot):

[perfplot timing plot: c_, stack, vstack, column_stack, concat vs. len(a)]

For smaller arrays, numpy.concatenate is the winner; for larger ones (from about 3000 elements), stack/vstack are faster.

The plot was created with:

import numpy
import perfplot

perfplot.show(
    setup=lambda n: numpy.random.rand(n),
    kernels=[
        lambda a: numpy.c_[a, a],
        lambda a: numpy.stack([a, a]).T,
        lambda a: numpy.vstack([a, a]).T,
        lambda a: numpy.column_stack([a, a]),
        lambda a: numpy.concatenate([a[:, None], a[:, None]],axis=1)
        ],
    labels=['c_', 'stack', 'vstack', 'column_stack', 'concat'],
    n_range=[2**k for k in range(19)],
    xlabel='len(a)',
    )
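
And a quick sanity check, with a small made-up vector, that the five expressions above really build the same 2-column array:

import numpy

a = numpy.array([1.0, 2.0, 3.0])
results = [
    numpy.c_[a, a],
    numpy.stack([a, a]).T,
    numpy.vstack([a, a]).T,
    numpy.column_stack([a, a]),
    numpy.concatenate([a[:, None], a[:, None]], axis=1),
]
assert all(numpy.array_equal(results[0], r) for r in results[1:])
print(results[0])
# [[1. 1.]
#  [2. 2.]
#  [3. 3.]]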
