Split an array into sub-arrays based on a value

Time: 2022-02-28 08:17:34

I was looking for an Array equivalent of String#split in Ruby core, and was surprised to find that it does not exist. Is there a more elegant way than the following to split an array into sub-arrays based on a value?

class Array
  def split( split_on=nil )
    inject([[]]) do |a,v|
      a.tap{
        if block_given? ? yield(v) : v==split_on
          a << []
        else
          a.last << v
        end
      }
    end.tap{ |a| a.pop if a.last.empty? }
  end
end

p (1..9 ).to_a.split{ |i| i%3==0 },
  (1..10).to_a.split{ |i| i%3==0 }
#=> [[1, 2], [4, 5], [7, 8]]
#=> [[1, 2], [4, 5], [7, 8], [10]]
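
The calls above only exercise the block form. As a minimal sketch of the value form as well, here is the same logic written as a standalone method so that Array is not patched; the name split_array is hypothetical and not part of Ruby core:

```ruby
# Hypothetical standalone helper (not part of Ruby core): the same logic as
# the Array#split above, written as a plain method to avoid patching Array.
def split_array(array, split_on = nil)
  result = array.inject([[]]) do |acc, v|
    # Use the block as the separator test if one was given, else compare by value.
    is_separator = block_given? ? yield(v) : v == split_on
    is_separator ? acc << [] : acc.last << v
    acc
  end
  # Drop one trailing empty group, as the original does.
  result.pop if result.last.empty?
  result
end

p(split_array([1, 2, 0, 3, 4, 0, 5], 0))        #=> [[1, 2], [3, 4], [5]]
p(split_array([:a, nil, :b, :c]))               #=> [[:a], [:b, :c]]
p(split_array((1..10).to_a) { |i| i % 3 == 0 }) #=> [[1, 2], [4, 5], [7, 8], [10]]
```

With no argument and no block, the default split_on=nil means nil elements act as the separators.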

Edit: For those interested, the "real-world" problem which sparked this request can be seen in this answer, where I've used @fd's answer below for the implementation.

5 solutions

#1


11  

I tried golfing it a bit, still not a single method though:

(1..9).chunk{|i|i%3==0}.reject{|sep,ans| sep}.map{|sep,ans| ans}

Or faster:

(1..9).chunk{|i|i%3!=0 || nil}.map{|sep,ans| sep&&ans}.compact

Also, Enumerable#chunk seems to be Ruby 1.9+, but it is very close to what you want.

For example, the raw output would be:

(1..9).chunk{ |i|i%3==0 }.to_a                                       
=> [[false, [1, 2]], [true, [3]], [false, [4, 5]], [true, [6]], [false, [7, 8]], [true, [9]]]

(The to_a is to make irb print something nice, since chunk gives you an enumerator rather than an Array)
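
To make the enumerator point concrete, here is a small sketch of how chunk behaves. It also shows the documented special case that an element whose block value is nil is dropped and ends the current chunk, which is what lets a chunk-based split discard the separators. The variable names are illustrative only:

```ruby
# chunk returns an Enumerator; groups are produced as it is consumed.
enum = (1..9).chunk { |i| i % 3 == 0 }
p(enum.class) #=> Enumerator
p(enum.first) #=> [false, [1, 2]]

# A nil key drops the element and closes the current chunk, so the
# multiples of 3 never appear in the result.
without_separators = (1..9).chunk { |i| i % 3 == 0 ? nil : true }.to_a
p(without_separators)
#=> [[true, [1, 2]], [true, [4, 5]], [true, [7, 8]]]
```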


Edit: Note that the above elegant solutions are 2-3x slower than the fastest implementation:

module Enumerable
  def split_by
    result = [a=[]]
    each{ |o| yield(o) ? (result << a=[]) : (a << o) }
    result.pop if a.empty?
    result
  end
end
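
It may help to see how this each-based approach treats edge cases, since it does not match String#split exactly. The sketch below assumes a standalone variant that takes the collection as an argument (a hypothetical signature, used here only to avoid reopening Enumerable):

```ruby
# Standalone variant of split_by above (hypothetical helper taking the
# collection as an argument instead of extending Enumerable).
def split_by(enum)
  result = [current = []]
  enum.each { |o| yield(o) ? (result << (current = [])) : (current << o) }
  # Only a single trailing empty group is removed.
  result.pop if current.empty?
  result
end

sep = ->(i) { i % 3 == 0 }
p(split_by([3, 1, 2, 3, 3, 4], &sep)) #=> [[], [1, 2], [], [4]]
p(split_by([1, 3, 3], &sep))          #=> [[1], []]
p(split_by([], &sep))                 #=> []

# String#split for comparison: empty fields are kept at the start and in
# the middle, but all trailing empty fields are removed.
p(",a,,b".split(","))                 #=> ["", "a", "", "b"]
p("a,,".split(","))                   #=> ["a"]
```

Leading and consecutive separators produce empty sub-arrays, as with String#split, but only one trailing empty group is dropped rather than all of them.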

#2


14  

Sometimes partition is a good way to do things like that:

(1..6).partition { |v| v.even? } 
#=> [[2, 4, 6], [1, 3, 5]]

#3


5  

Here are benchmarks aggregating the answers (I'll not be accepting this answer):

require 'benchmark'
a = *(1..5000); N = 1000
Benchmark.bmbm do |x|
  %w[ split_with_inject split_with_inject_no_tap split_with_each
      split_with_chunk split_with_chunk2 split_with_chunk3 ].each do |method|
    x.report( method ){ N.times{ a.send(method){ |i| i%3==0 || i%5==0 } } }
  end
end
#=>                                user     system      total        real
#=> split_with_inject          1.857000   0.015000   1.872000 (  1.879188)
#=> split_with_inject_no_tap   1.357000   0.000000   1.357000 (  1.353135)
#=> split_with_each            1.123000   0.000000   1.123000 (  1.123113)
#=> split_with_chunk           3.962000   0.000000   3.962000 (  3.984398)
#=> split_with_chunk2          3.682000   0.000000   3.682000 (  3.687369)
#=> split_with_chunk3          2.278000   0.000000   2.278000 (  2.281228)

The implementations being tested (on Ruby 1.9.2):

class Array
  def split_with_inject
    inject([[]]) do |a,v|
      a.tap{ yield(v) ? (a << []) : (a.last << v) }
    end.tap{ |a| a.pop if a.last.empty? }
  end

  def split_with_inject_no_tap
    result = inject([[]]) do |a,v|
      yield(v) ? (a << []) : (a.last << v)
      a
    end
    result.pop if result.last.empty?
    result
  end

  def split_with_each
    result = [a=[]]
    each{ |o| yield(o) ? (result << a=[]) : (a << o) }
    result.pop if a.empty?
    result
  end

  def split_with_chunk
    chunk{ |o| !!yield(o) }.reject{ |b,a| b }.map{ |b,a| a }
  end

  def split_with_chunk2
    chunk{ |o| !!yield(o) }.map{ |b,a| b ? nil : a }.compact
  end

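  def split_with_chunk3
    # Caution: chunk drops elements whose key is nil, so as written this
    # returns the runs of matching (separator) elements; a key of
    # `!yield(o) || nil` would keep the elements between separators instead.
    chunk{ |o| yield(o) || nil }.map{ |b,a| b && a }.compact
  end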
end
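
Before reading too much into the timings, it is worth checking whether the variants return the same thing. The sketch below re-expresses three of them as lambdas (the names are illustrative, not from the answers) and runs them on a small input with the benchmark's predicate, which produces consecutive separators:

```ruby
pred = ->(i) { i % 3 == 0 || i % 5 == 0 }
data = (1..10).to_a # separators: 3, 5, 6, 9, 10

# Same logic as split_with_each.
with_each = lambda do |arr|
  result = [cur = []]
  arr.each { |o| pred.(o) ? (result << (cur = [])) : (cur << o) }
  result.pop if cur.empty?
  result
end

# Same logic as split_with_chunk.
with_chunk = ->(arr) { arr.chunk { |o| !!pred.(o) }.reject { |b, _| b }.map { |_, a| a } }

# Same logic as split_with_chunk3: the key is nil for non-separators.
with_nil_key = ->(arr) { arr.chunk { |o| pred.(o) || nil }.map { |b, a| b && a }.compact }

p(with_each.(data))    #=> [[1, 2], [4], [], [7, 8], []]
p(with_chunk.(data))   #=> [[1, 2], [4], [7, 8]]
p(with_nil_key.(data)) #=> [[3], [5, 6], [9, 10]]
```

On this input the each-based version keeps empty groups between adjacent separators, the chunk-based version does not, and the nil-key version returns the separator runs, so the benchmark rows are not strictly comparing equivalent work.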

#4


1  

Other Enumerable methods you might want to consider are each_slice and each_cons.

I don't know how general you want it to be, but here's one way:

>> (1..9).each_slice(3) { |a| p a.size > 1 ? a[0..-2] : a }
[1, 2]
[4, 5]
[7, 8]
=> nil
>> (1..10).each_slice(3) { |a| p a.size > 1 ? a[0..-2] : a }
[1, 2]
[4, 5]
[7, 8]
[10]
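
Note that each_slice groups by position rather than by value, so this only matches the question's output when a separator happens to sit at every third position. A small sketch with made-up input where that assumption fails, compared against a value-based split using chunk:

```ruby
data = [1, 2, 3, 4, 6, 7] # multiples of 3 are not evenly spaced here

# Positional: cut every 3 elements and drop the last element of each slice.
by_position = data.each_slice(3).map { |a| a.size > 1 ? a[0..-2] : a }

# Value-based: group by the predicate and discard the separator groups.
by_value = data.chunk { |i| i % 3 == 0 }.reject { |sep, _| sep }.map { |_, g| g }

p(by_position) #=> [[1, 2], [4, 6]]
p(by_value)    #=> [[1, 2], [4], [7]]
```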

#5


1  

Here is another one (with a benchmark comparing it to the fastest, split_with_each, from https://*.com/a/4801483/410102):

require 'benchmark'

class Array
  def split_with_each
    result = [a=[]]
    each{ |o| yield(o) ? (result << a=[]) : (a << o) }
    result.pop if a.empty?
    result
  end

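  # Note: this sorts elements into two arrays by the block's result
  # (like Enumerable#partition); it does not split on separators.
  def split_with_each_2
    u, v = [], []
    each{ |x| (yield x) ? (u << x) : (v << x) }
    [u, v]
  end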
end

a = *(1..5000); N = 1000
Benchmark.bmbm do |x|
  %w[ split_with_each split_with_each_2 ].each do |method|
    x.report( method ){ N.times{ a.send(method){ |i| i%3==0 || i%5==0 } } }
  end
end

                        user     system      total        real
split_with_each     2.730000   0.000000   2.730000 (  2.742135)
split_with_each_2   2.270000   0.040000   2.310000 (  2.309600)
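
One caveat when reading these numbers: the two methods do not compute the same thing. As a quick sketch (variable names are illustrative), the loop in split_with_each_2 gives the same result as the built-in Enumerable#partition, that is, two arrays of matching and non-matching elements, rather than runs delimited by separators:

```ruby
data = (1..10).to_a
sep = ->(i) { i % 3 == 0 || i % 5 == 0 }

# The body of split_with_each_2, inlined.
u, v = [], []
data.each { |x| sep.(x) ? (u << x) : (v << x) }

p([u, v])                         #=> [[3, 5, 6, 9, 10], [1, 2, 4, 7, 8]]
p([u, v] == data.partition(&sep)) #=> true
```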
