How do I generate a list of n unique random numbers in Ruby?

Time: 2022-11-24 22:36:54

This is what I have so far:

myArray.map!{ rand(max) }

Obviously, however, sometimes the numbers in the list are not unique. How can I make sure my list only contains unique numbers without having to create a bigger list from which I then just pick the n unique numbers?

Edit:
I'd really like to see this done without a loop, if at all possible.

14 solutions

#1


23  

This uses Set:

require 'set'

def rand_n(n, max)
    randoms = Set.new
    loop do
        randoms << rand(max)
        return randoms.to_a if randoms.size >= n
    end
end

#2


61  

(0..50).to_a.sort{ rand() - 0.5 }[0..x] 

(0..50).to_a can be replaced with any array. 0 is the minimum value, 50 is the maximum value, and x is how many values I want out.

Of course, x cannot be greater than max - min :)

To expand on how this works:

(0..5).to_a  ==> [0,1,2,3,4,5]
[0,1,2,3,4,5].sort{ -1 }  ==>  [0, 1, 2, 4, 3, 5]  # constant
[0,1,2,3,4,5].sort{  1 }  ==>  [5, 3, 0, 4, 2, 1]  # constant
[0,1,2,3,4,5].sort{ rand() - 0.5 }   ==>  [1, 5, 0, 3, 4, 2 ]  # random
[1, 5, 0, 3, 4, 2 ][ 0..2 ]   ==>  [1, 5, 0 ]

Footnotes:

It is worth mentioning that when this question was originally answered, in September 2008, Array#shuffle was either not available or not yet known to me, hence the approximation with Array#sort.

As a result, there has been a barrage of suggested edits to this answer.

So:

.sort{ rand() - 0.5 }

Can be expressed better, and more concisely, on modern Ruby implementations using

.shuffle

Additionally,

[0..x]

Can be written more clearly with Array#take (note that [0..x] returns x + 1 elements when that many are available, whereas take(x) returns at most x):

.take(x)

Thus, the easiest way to produce a sequence of unique random numbers on a modern Ruby is:

(0..50).to_a.shuffle.take(x)
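
As a minimal sketch, the one-liner above could be wrapped in a small helper that also guards against asking for more values than the range holds. The name unique_randoms and its parameters are made up for illustration and are not part of the original answer.

```ruby
# Hypothetical helper (not from the original answer): returns `count` unique
# integers from min..max in random order.
def unique_randoms(count, min, max)
  pool = (min..max).to_a
  # A pool of k values cannot supply more than k unique numbers.
  raise ArgumentError, "only #{pool.size} values available" if count > pool.size
  pool.shuffle.take(count)
end

picked = unique_randoms(5, 0, 50)
puts picked.inspect
```

Because the whole pool is materialized and shuffled, this is only reasonable when the range is small enough to hold in memory.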

#3


22  

Just to give you an idea about speed, I ran four versions of this:

  1. Using Sets, like Ryan's suggestion.
  2. Using an Array slightly larger than necessary, then doing uniq! at the end.
  3. Using a Hash, like Kyle suggested.
  4. Creating an Array of the required size, then sorting it randomly, like Kent's suggestion (but without the extraneous "- 0.5", which does nothing).

They're all fast at small scales, so I had them each create a list of 1,000,000 numbers. Here are the times, in seconds:

  1. Sets: 628
  2. Array + uniq: 629
  3. Hash: 645
  4. Fixed Array + sort: 8

And no, that last one is not a typo. So if you care about speed, and it's OK for the numbers to be integers from 0 to whatever, then my exact code was:

a = (0...1000000).sort_by{rand}

#4


21  

Ruby 1.9 offers the Array#sample method which returns an element, or elements randomly selected from an Array. The results of #sample won't include the same Array element twice.

(1..999).to_a.sample 5 # => [389, 30, 326, 946, 746]

When compared to the to_a.sort_by approach, the sample method appears to be significantly faster. In a simple scenario I compared sort_by to sample, and got the following results.

require 'benchmark'
range = 0...1000000
how_many = 5

Benchmark.realtime do
  range.to_a.sample(how_many)
end
=> 0.081083

Benchmark.realtime do
  (range).sort_by{rand}[0...how_many]
end
=> 2.907445
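
If repeatable output is needed (for tests, say), Array#sample also accepts a random: keyword. The following is only a small sketch of that option; the seed value 42 is arbitrary and not from the original answer.

```ruby
# Array#sample accepts a `random:` keyword, so passing Random objects created
# with the same seed makes the selection repeatable.
pool = (1..999).to_a

first  = pool.sample(5, random: Random.new(42))
second = pool.sample(5, random: Random.new(42))

puts first.inspect
puts first == second   # same seed, same picks
```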

#5


4  

Yes, it's possible to do this without a loop and without keeping track of which numbers have been chosen. The technique is called a Linear Feedback Shift Register; see "Create Random Number Sequence with No Repeats".
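
To make the idea concrete, here is a rough sketch of a 16-bit Fibonacci LFSR using a commonly cited maximal-length tap set (bits 16, 14, 13 and 11). The method name lfsr_sequence, the default seed and the 16-bit width are choices made for this illustration and do not come from the answer. Note that the output is a deterministic pseudo-random sequence limited to 1..65535, and that stepping the register still iterates internally.

```ruby
# 16-bit Fibonacci LFSR with taps at bits 16, 14, 13 and 11. Starting from any
# non-zero state it steps through all 65,535 non-zero 16-bit values before
# repeating, so the first n outputs (n <= 65_535) are unique.
def lfsr_sequence(n, seed = 0xACE1)
  raise ArgumentError, "seed must be a non-zero 16-bit value" unless (1..0xFFFF).cover?(seed)
  raise ArgumentError, "n must be at most 65535" if n > 0xFFFF

  state = seed
  Array.new(n) do
    # XOR the tapped bits to get the new top bit, then shift right by one.
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    state = (state >> 1) | (bit << 15)
  end
end

numbers = lfsr_sequence(10)
puts numbers.inspect
```

Passing a different seed, for example rand(1..0xFFFF), starts the same cycle at a different point.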

#6


2  

How about a play on this? Unique random numbers without needing to use Set or Hash.

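x = 0
# rand(1..100) is always at least 1, so the running total strictly increases and no value repeats
# (rand(100) could return 0 and produce a duplicate)
(1..100).map { x += rand(1..100) }.shuffle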

#7


1  

You could use a hash to track the random numbers you've used so far:

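seen = {}
max = 100
(1..10).map { |n|
  x = rand(max)
  # Redraw until we get a number that has not been used yet
  while (seen[x])
    x = rand(max)
  end
  seen[x] = true  # record the number so it is never returned again
  x
}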

#8


1  

Rather than add the items to a list/array, add them to a Set.

#9


1  

If you have a finite list of possible random numbers (e.g. 1 to 100), then Kent's solution is good.

Otherwise there is no other good way to do it without looping. The problem is that you MUST loop again whenever you get a duplicate. My solution should be efficient, and the number of iterations should not be much larger than the size of your array (e.g. if you want 20 unique random numbers, it might take 25 iterations on average), though the number of iterations gets worse the more numbers you need and the smaller max is. Here is my code from above, modified to show how many iterations are needed for the given input:

require 'set'

def rand_n(n, max)
    randoms = Set.new
    i = 0
    loop do
        i += 1                        # count every draw, including the last one
        randoms << rand(max)
        break if randoms.size >= n    # stop at exactly n unique numbers
    end
    puts "Took #{i} iterations for #{n} random numbers to a max of #{max}"
    return randoms.to_a
end
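
The iteration count that rand_n prints can be compared against the theoretical average. As a sketch (expected_draws is a name made up here, not part of the answer): once k distinct values have been collected, each draw yields a new value with probability (max - k) / max, so the expected total is the sum of max / (max - k) for k from 0 to n - 1.

```ruby
# Expected number of rand(max) calls needed to collect n distinct values:
# once k distinct values are held, a draw is new with probability
# (max - k) / max, so it takes max / (max - k) draws on average.
def expected_draws(n, max)
  (0...n).sum { |k| max.to_f / (max - k) }
end

puts expected_draws(20, 100).round(1)    # roughly 22 draws
puts expected_draws(20, 25).round(1)     # a smaller max needs noticeably more
```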

I could write this code to LOOK more like Array.map if you want :)

#10


1  

Based on Kent Fredric's solution above, this is what I ended up using:

def n_unique_rand(number_to_generate, rand_upper_limit)
  return (0..rand_upper_limit - 1).sort_by{rand}[0..number_to_generate - 1]
end

Thanks Kent.

#11


1  

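No loops with this method, though note that it does not guarantee uniqueness, since rand(max) can return the same value more than once: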

Array.new(size) { rand(max) }

require 'benchmark'
max = 1000000
size = 5
Benchmark.realtime do
  Array.new(size) { rand(max) }
end

=> 1.9114e-05 

#12


0  

Here is one solution:

Suppose you want these random numbers to be between r_min and r_max. For each element in your list, generate a random number r, and set list[i] = list[i-1] + r. This gives you random numbers that are monotonically increasing, which guarantees uniqueness provided that:

  • r + list[i-1] does not overflow
  • r > 0

For the first element, you would use r_min instead of list[i-1]. Once you are done, you can shuffle the list so the elements are not so obviously in order.

The only problem with this method is when you go over r_max and still have more elements to generate. In this case, you can reset r_min and r_max to two adjacent elements you have already computed, and simply repeat the process. This effectively runs the same algorithm over an interval where no numbers have been used yet. You can keep doing this until the list is fully populated.
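
A rough sketch of the increasing-steps idea follows. It is a simplified variant rather than the exact procedure described above: instead of resetting r_min and r_max when the sequence runs past r_max, it caps each step so that room is always left for the remaining elements. The name increasing_unique_rand is hypothetical, and the resulting distribution is not uniform.

```ruby
# Hypothetical helper: n unique integers in r_min..r_max, built by adding a
# random positive step to the previous value and shuffling at the end.
def increasing_unique_rand(n, r_min, r_max)
  raise ArgumentError, "range too small" if n > r_max - r_min + 1

  previous = r_min - 1
  values = Array.new(n) do |i|
    remaining = n - i - 1                      # numbers still to place after this one
    max_step  = r_max - remaining - previous   # leave one slot per remaining number
    previous += rand(1..max_step)              # step is at least 1, so values never repeat
  end
  values.shuffle                               # hide the increasing order
end

result = increasing_unique_rand(10, 5, 20)
puts result.inspect
```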

#13


0  

As long as you know the maximum value in advance, you can do it this way:

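class NoLoopRand
  def initialize(max)
    @deck = (0..max).to_a
  end

  def getrnd
    return nil if @deck.empty?
    # rand(@deck.length) covers every index, including the last one
    # (rand(@deck.length - 1) could never pick the final element)
    return @deck.delete_at(rand(@deck.length))
  end
end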

and you can obtain random data in this way:

aRndNum = NoLoopRand.new(10)
puts aRndNum.getrnd

You'll get nil once all the values in the deck have been exhausted.

#14


0  

Method 1

Using Kent's approach, it is possible to generate an array of arbitrary length keeping all values in a limited range:

# Generates a random array of length n.
#
# @param n     length of the desired array
# @param lower minimum number in the array
# @param upper maximum number in the array
def ary_rand(n, lower, upper)
    values_set = (lower..upper).to_a
    repetition = n/(upper-lower+1) + 1
    (values_set*repetition).sample n
end

Method 2

Another, possibly more efficient, method, modified from another answer by the same Kent:

def ary_rand2(n, lower, upper)
    v = (lower..upper).to_a
    (0...n).map{ v[rand(v.length)] }
end

Output

puts (ary_rand 5, 0, 9).to_s # [0, 8, 2, 5, 6] expected
puts (ary_rand 5, 0, 9).to_s # [7, 8, 2, 4, 3] different result for same params
puts (ary_rand 5, 0, 1).to_s # [0, 0, 1, 0, 1] repeated values from limited range
puts (ary_rand 5, 9, 0).to_s # []              no such range :)
