Pythonic way to combine lists with a specific merge order?

Time: 2022-02-02 22:41:49

I would like to construct a list x from two lists, y and z. All elements of y should be placed at the (1-based) positions given by the elements of ypos; the elements of z fill the remaining positions in order. For example:

y = [11, 13, 15]
z = [12, 14]
ypos = [1, 3, 5]

So, x must be [11, 12, 13, 14, 15]

Another example:

y = [77]
z = [35, 58, 74]
ypos = [3]

So, x must be [35, 58, 77, 74]

I've written a function that does what I want, but it looks ugly:

def func(y, z, ypos):
    x = [0] * (len(y) + len(z))
    zpos = list(range(len(y) + len(z)))
    for i, j in zip(y, ypos):
        x[j-1] = i
        zpos.remove(j-1)
    for i, j in zip(z, zpos):
        x[j] = i
    return x

How can I write it in a pythonic way?

6 solutions

#1


35  

If the lists are very long, repeatedly calling insert might not be very efficient. Alternatively, you could create two iterators from the lists and construct a list by getting the next element from either of the iterators depending on whether the current index is in ypos (or a set thereof):

>>> ity = iter(y)
>>> itz = iter(z)
>>> syp = set(ypos)
>>> [next(ity if i+1 in syp else itz) for i in range(len(y)+len(z))]
[11, 12, 13, 14, 15]

Note: this will insert the elements from y in the order they appear in y itself, i.e. the first element of y is inserted at the lowest index in ypos, not necessarily at the first index in ypos. If the elements of y should be inserted at the index of the corresponding element of ypos, then either ypos has to be in ascending order (i.e. the first index of ypos is also the lowest), or the iterator of y has to be sorted by the same order as the indices in ypos (afterwards, ypos itself does not have to be sorted, as we are turning it into a set anyway).

>>> ypos = [5,3,1]   # y and z being same as above
>>> syp = set(ypos)
>>> ity = iter(e for i, e in sorted(zip(ypos, y)))
>>> itz = iter(z)    # fresh iterator; the previous one is exhausted
>>> [next(ity if i+1 in syp else itz) for i in range(len(y)+len(z))]
[15, 12, 13, 14, 11]
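
The interactive steps above can be collected into a single function. This is only a sketch under the assumptions stated in the note: positions in ypos are 1-based and unique, and y[k] is meant to land at position ypos[k]. The name merge_by_position is made up for illustration.

```python
def merge_by_position(y, z, ypos):
    """Merge y and z so that y[k] lands at the 1-based position ypos[k]."""
    # Sort the (position, value) pairs by position so that the y iterator
    # yields values in the order in which the positions are visited.
    it_y = iter(val for _, val in sorted(zip(ypos, y), key=lambda pair: pair[0]))
    it_z = iter(z)
    y_slots = set(ypos)  # set gives O(1) membership tests
    total = len(y) + len(z)
    return [next(it_y if i in y_slots else it_z) for i in range(1, total + 1)]


print(merge_by_position([11, 13, 15], [12, 14], [1, 3, 5]))
print(merge_by_position([11, 13, 15], [12, 14], [5, 3, 1]))
```

The first call prints [11, 12, 13, 14, 15] and the second prints [15, 12, 13, 14, 11], matching the two sessions above.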

#2


12  

You should use list.insert; this is what it was made for!

def func(y, z, ypos):
    x = z[:]
    for pos, val in zip(ypos, y):
        x.insert(pos-1, val)
    return x

And a test:

>>> func([11, 13, 15], [12, 14], [1,3,5])
[11, 12, 13, 14, 15]
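
One caveat worth illustrating: every insert shifts the elements after it, so the loop above only gives the intended result when ypos is in ascending order. A possible workaround, assuming y[k] should end up at the 1-based position ypos[k], is to sort the pairs first. The name insert_in_order is hypothetical.

```python
def insert_in_order(y, z, ypos):
    """list.insert variant that tolerates an unsorted ypos (1-based positions)."""
    x = list(z)  # copy, so the caller's list is left untouched
    # Insert from the lowest position upwards; later inserts then never
    # disturb the positions that have already been filled.
    for pos, val in sorted(zip(ypos, y), key=lambda pair: pair[0]):
        x.insert(pos - 1, val)
    return x


print(insert_in_order([15, 13, 11], [12, 14], [5, 3, 1]))
```

This prints [11, 12, 13, 14, 15]. Each insert is still linear in the length of the list, so the sketch does not address the cost on very long inputs.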

#3


8  

With large lists, it might be a good idea to work with numpy.

Algorithm

  • create a new array as large as y + z

  • calculate coordinates for z values

  • assign y values to x at ypos

  • assign z values to x at zpos

The complexity should be O(n), with n being the total number of values.

import numpy as np

def distribute_values(y_list, z_list, y_pos):
    y = np.array(y_list)
    z = np.array(z_list)
    n = y.size + z.size
    x = np.empty(n, dtype=int)  # np.int was removed in NumPy 1.24
    y_indices = np.array(y_pos) - 1
    z_indices = np.setdiff1d(np.arange(n), y_indices, assume_unique=True)
    x[y_indices] = y
    x[z_indices] = z
    return x

print(distribute_values([11, 13, 15], [12, 14], [1, 3, 5]))
# [11 12 13 14 15]
print(distribute_values([77], [35, 58, 74], [3]))
# [35 58 77 74]

As a bonus, it also works fine when ypos isn't sorted:

print(distribute_values([15, 13, 11], [12, 14], [5, 3, 1]))
# [11 12 13 14 15]
print(distribute_values([15, 11, 13], [12, 14], [5, 1, 3]))
# [11 12 13 14 15]

Performance

With n set to 1 million, this approach is roughly 1.7 times faster than @tobias_k's answer and about 400 times faster than @Joe_Iddon's answer.

The lists were created this way:

from random import random, randint
N = 1000000
ypos = [i+1 for i in range(N) if random()<0.4]
y = [randint(0, 10000) for _ in ypos]
z = [randint(0, 1000) for _ in range(N - len(y))]

Here are the results with %timeit and IPython:

%timeit eric(y, z, ypos)
131 ms ± 1.54 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit tobias(y, z, ypos)
224 ms ± 977 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit joe(y,z, ypos)
54 s ± 1.48 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
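
For readers without IPython, a comparison along these lines can be sketched with the standard timeit module. The functions iter_merge and insert_merge are stand-ins written for this sketch (they are not the eric, tobias and joe functions timed above), the input size is kept small so the insert variant finishes quickly, and no timings are claimed here.

```python
import random
import timeit


def iter_merge(y, z, ypos):
    # Two-iterator approach; assumes ypos is ascending and 1-based.
    it_y, it_z, slots = iter(y), iter(z), set(ypos)
    return [next(it_y if i in slots else it_z)
            for i in range(1, len(y) + len(z) + 1)]


def insert_merge(y, z, ypos):
    # Repeated list.insert; assumes ypos is ascending and 1-based.
    x = list(z)
    for pos, val in zip(ypos, y):
        x.insert(pos - 1, val)
    return x


def make_inputs(n, share=0.4, seed=0):
    # Same construction as above, with a seeded generator for repeatability.
    rng = random.Random(seed)
    ypos = [i + 1 for i in range(n) if rng.random() < share]
    y = [rng.randint(0, 10000) for _ in ypos]
    z = [rng.randint(0, 1000) for _ in range(n - len(y))]
    return y, z, ypos


def compare(n=20_000, repeats=3):
    y, z, ypos = make_inputs(n)
    assert iter_merge(y, z, ypos) == insert_merge(y, z, ypos)
    for fn in (iter_merge, insert_merge):
        best = min(timeit.repeat(lambda: fn(y, z, ypos), number=1, repeat=repeats))
        print(f"{fn.__name__}: {best * 1e3:.2f} ms")


if __name__ == "__main__":
    compare()
```

Raising n shows how the gap between the two variants grows, since each insert has to shift the tail of the list.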

#4


2  

Assuming that the ypos indices are sorted, here is another solution using iterators, though this one also supports ypos of unknown or infinite length:

import itertools

def func(y, ypos, z):
    y = iter(y)
    ypos = iter(ypos)
    z = iter(z)
    next_ypos = next(ypos, -1)
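    for i in itertools.count(start=1):
        try:
            if i == next_ypos:
                yield next(y)
                next_ypos = next(ypos, -1)
            else:
                yield next(z)
        except StopIteration:
            # Input exhausted: end the generator cleanly. Since Python 3.7
            # (PEP 479) a StopIteration leaking out of a generator body
            # is turned into a RuntimeError.
            return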
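
To illustrate the "unknown or infinite length" point, here is a self-contained usage sketch. It restates the generator under the hypothetical name merge_stream, with an explicit guard for exhausted inputs, and feeds it endless iterators; itertools.islice then takes only as many items as needed. As above, ypos is assumed to be ascending and 1-based.

```python
import itertools


def merge_stream(y, ypos, z):
    """Lazily merge y and z; ypos holds ascending 1-based positions for y."""
    y, ypos, z = iter(y), iter(ypos), iter(z)
    next_ypos = next(ypos, -1)  # -1 never matches a position, so only z is used
    for i in itertools.count(start=1):
        try:
            if i == next_ypos:
                yield next(y)
                next_ypos = next(ypos, -1)
            else:
                yield next(z)
        except StopIteration:
            return  # one of the value sources ran out


# Finite inputs behave like the list-based answers.
print(list(merge_stream([11, 13, 15], [1, 3, 5], [12, 14])))

# Infinite inputs: y values go to every odd position, z fills the rest.
endless = merge_stream(itertools.count(100), itertools.count(1, 2), itertools.count(0))
print(list(itertools.islice(endless, 6)))
```

The first call prints [11, 12, 13, 14, 15]; the second prints [100, 0, 101, 1, 102, 2].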

#5


2  

If you want each element of y to be placed at the position given by the element of ypos with the same index (that is, y[i] goes to position ypos[i]):

  1. Initialize x to the required size using all null values.

  2. Iterate through the zipped y and ypos elements to fill in each corresponding y element into x.

  3. Iterate through x and replace each remaining null value with the next value from z, taken in order.

y = [11, 13, 15]
z = [12, 14]
ypos = [1, 5, 3]

x = [None] * (len(y) + len(z))
for x_ypos, y_elem in zip(ypos, y):
    x[x_ypos - 1] = y_elem

z_iter = iter(z)
x = [next(z_iter) if i is None else i for i in x]
# x -> [11, 12, 15, 14, 13]
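
One limitation of using None as the placeholder is that a None inside y would be mistaken for an empty slot. A possible variant, sketched here under the same 1-based assumption, uses a private sentinel object instead. The names _EMPTY and place_by_index are invented for this example.

```python
_EMPTY = object()  # unique marker that cannot collide with any list value


def place_by_index(y, z, ypos):
    """Put y[k] at the 1-based position ypos[k]; fill the gaps from z in order."""
    x = [_EMPTY] * (len(y) + len(z))
    for pos, elem in zip(ypos, y):
        x[pos - 1] = elem
    z_iter = iter(z)
    # Identity check against the sentinel, so None is a legal value in y.
    return [next(z_iter) if slot is _EMPTY else slot for slot in x]


print(place_by_index([11, 13, 15], [12, 14], [1, 5, 3]))
print(place_by_index([None], [1, 2], [2]))
```

The first call prints [11, 12, 15, 14, 13], as in the snippet above; the second prints [1, None, 2].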

#6


1  

Pythonic way

y = [11, 13, 15]
z = [12, 14]
ypos = [1, 3, 5]

x = z[:]

for c, n in enumerate(ypos):
    x.insert(n - 1, y[c])

print(x)

output

[11, 12, 13, 14, 15]

In a function

def func(y, ypos, z):
    x = z[:]
    for c,n in enumerate(ypos):
        x.insert(n-1,y[c])
    return x

print(func([11, 13, 15], [1, 3, 5], [12, 14]))

output

[11, 12, 13, 14, 15]

Using zip

y, z, ypos = [11, 13, 15], [12, 14], [1, 3, 5]

for i, c in zip(ypos, y):
    z.insert(i - 1, c)

print(z)

output

[11, 12, 13, 14, 15]
