Generating random numbers from a skewed normal distribution

When you use the random(min, max) function in most languages, what is the distribution like?

What if I want to produce one range of numbers 20% of the time and another range of numbers 80% of the time? How can I generate a series of random numbers that follows that?

For example, I should get random values, but the frequency of "1" must be around 20% higher than the frequency of "0".

9 Solutions

#1

In most languages, random numbers come from an algorithm built into the language or its runtime, usually driven by factors such as the time, the processor, or a seed number.

The distribution is not normal. In fact, if the function can return 5 different integers, all 5 have an equal chance of appearing on the next function call. This is known as a uniform distribution.

So if you wish to produce one number (say 7) 20% of the time and another number (say 13) 80% of the time, you can use an array like this:

var arr = [7, 13, 13, 13, 13];
var picked = arr[Math.floor(Math.random() * arr.length)];
// Math.random() returns a float in [0, 1), so the index is always 0 to arr.length - 1

Thus 7 has a 20% chance of appearing, and 13 has an 80% chance.

#2

This is one possible method:

# Pick a uniformly distributed integer from the given range.
def random_within_range(range)
  size = range.last - range.begin + (range.exclude_end? ? 0 : 1)
  rand(size) + range.begin
end

ranges = [(10..15), (20..30)]
selector = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1] # 20:80 split between the two ranges

# now select a range randomly, then a number within it
random_within_range(ranges[selector[rand(selector.length)]])

#3

Most pseudo-random generators built into programming languages produce a uniform distribution, i.e. each value within the range has the same probability of being produced as any other value in the range. Indeed, in some cases this requirement is part of the language standard. Some languages, such as Python or R, support a variety of the common distributions.

If the language doesn't support it, you either have to use mathematical tricks to produce other distributions, such as a normal distribution, from a uniform one, or look for third-party libraries that perform this function.
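One well-known example of such a trick is the Box-Muller transform, which turns two uniform draws into one standard normal draw. The sketch below is only an illustration: the method name `normal_from_uniform` and the seed are assumptions, not something from the answer above.

```ruby
# Draw one standard normal value from two uniform draws (Box-Muller transform).
# `rng` is any object responding to `rand`, e.g. Random.new(seed).
def normal_from_uniform(rng = Random.new)
  u1 = 1.0 - rng.rand  # shift [0, 1) to (0, 1] so Math.log never sees 0
  u2 = rng.rand
  Math.sqrt(-2.0 * Math.log(u1)) * Math.cos(2.0 * Math::PI * u2)
end

# Usage: the sample mean should land close to 0.
rng = Random.new(42)
samples = Array.new(10_000) { normal_from_uniform(rng) }
puts "sample mean: #{(samples.sum / samples.size).round(3)}"
```

To get mean mu and standard deviation sigma, scale the result: mu + sigma * normal_from_uniform(rng).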

Your problem seems much simpler, however, since the random variable is discrete (and of the simplest type thereof, i.e. binary). The trick for these is to produce a random number from the uniform distribution in a given range, say 0 to 999, and to split this range in the proportions associated with each value. In the case at hand this would be something like:

  If (RandomNumber) < 200    // 20%
     RandomVariable = 0
  Else                       // 80%
     RandomVariable = 1

This logic can, of course, be extended to a variable with n discrete values.
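A minimal sketch of that generalization, assuming the proportions are given as positive integer weights in a Hash. The helper name `weighted_pick` is hypothetical.

```ruby
# Pick one key of `weights` with probability proportional to its integer weight.
# Example input: { 0 => 200, 1 => 800 } splits the range 0...1000 as 20% / 80%.
def weighted_pick(weights, rng = Random.new)
  total = weights.values.sum
  r = rng.rand(total)            # uniform integer in 0...total
  weights.each do |value, weight|
    return value if r < weight   # r fell inside this value's slice
    r -= weight                  # otherwise move on to the next slice
  end
end

# Usage: count how often each value is drawn.
rng = Random.new(7)
counts = Hash.new(0)
10_000.times { counts[weighted_pick({ 0 => 200, 1 => 800 }, rng)] += 1 }
puts counts.inspect
```

Adding a third or fourth value only requires another entry in the Hash; the weights do not have to sum to 1000.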

#4

Your question differs from your example quite a bit. So I'll answer both and you can figure out whichever answers what you're really looking for.

1) Your example (I don't know Ruby or Java, so bear with me)

First generate a random number from a uniform distribution from 0 to 1; we'll call it X.
You can then set up an if/else (e.g. if (X < 0.2) {1} else {0}).

2) Generating random numbers from a normal distribution with skew

You can look into skewed distributions, such as a skewed Student's t-distribution with high degrees of freedom.
You can also use the normal CDF and pick off numbers that way (inverse transform sampling).
Here's a paper which discusses how to do it with multiple random numbers from a uniform distribution.
Finally, you can use a non-parametric approach, which would involve kernel density estimation (I suspect you aren't looking for anything this sophisticated, however).

#5

As everybody says, the pseudo-random number generator in most languages implements the uniform distribution over (0,1). If you have two response categories (0,1), with probability p for 1, you have a Bernoulli distribution, which can be emulated with

# returns 1 with probability p and 0 with probability (1-p)
def bernoulli(p)
  rand < p ? 1 : 0
end

Simple as that. The skew normal distribution is an entirely different beast: its density is built from the product of the pdf and the cdf of a normal distribution, which creates the skew. You can read Azzalini's work here. Using the distribution gem, you can compute the probability density function with

require 'distribution'

# density of the standard skew normal with shape parameter alpha
def sn_pdf(x, alpha)
  2 * Distribution::Normal.pdf(x) * Distribution::Normal.cdf(x * alpha)
end

Obtaining the cdf is difficult because there is no closed-form solution, so you have to integrate numerically. To obtain random numbers from a skewed normal, you could use the acceptance-rejection algorithm.
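A rough sketch of acceptance-rejection for the standard skew normal, using only Ruby's standard library. It relies on the fact that the density 2 * pdf(x) * cdf(alpha * x) never exceeds 2 * pdf(x), so a standard normal proposal is accepted with probability cdf(alpha * x). The method names here are assumptions for illustration, and the normal draws come from a Box-Muller transform rather than from the distribution gem.

```ruby
# Standard normal CDF, written in terms of Math.erf.
def std_normal_cdf(z)
  0.5 * (1.0 + Math.erf(z / Math.sqrt(2.0)))
end

# One standard normal draw (Box-Muller transform on two uniform draws).
def std_normal_draw(rng)
  u1 = 1.0 - rng.rand  # in (0, 1], keeps Math.log finite
  Math.sqrt(-2.0 * Math.log(u1)) * Math.cos(2.0 * Math::PI * rng.rand)
end

# Acceptance-rejection: propose x from N(0, 1), accept with probability
# cdf(alpha * x) = target density / (2 * proposal density).
def skew_normal_draw(alpha, rng = Random.new)
  loop do
    x = std_normal_draw(rng)
    return x if rng.rand < std_normal_cdf(alpha * x)
  end
end

# Usage: a positive alpha skews to the right, so the mean is above 0.
rng = Random.new(2024)
xs = Array.new(10_000) { skew_normal_draw(4.0, rng) }
puts "sample mean: #{(xs.sum / xs.size).round(3)}"
```

On average half of the proposals are rejected, whatever the value of alpha, which is acceptable for a sketch like this.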

#6

In most computer languages, the (pseudo) random integer generators produce a uniform distribution, so each integer is equally likely.

For your example, suppose you want "1" 55% of the time and "0" 45% of the time.

To get these unequal frequencies, try generating a random number between 1 and 100. If the number generated is from 1 to 55, output "1"; otherwise output "0".
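The 55/45 split can be derived rather than guessed, if "higher by around 20%" is read as a relative difference (an assumption about the question): with P(1) = 1.2 * P(0) and P(0) + P(1) = 1, P(1) = 1.2 / 2.2, roughly 0.545. A small sketch with hypothetical helper names:

```ruby
# Probability of "1" when "1" should occur (1 + excess) times as often as "0".
# Solves P(1) = (1 + excess) * P(0) together with P(0) + P(1) = 1.
def probability_of_one(excess)
  (1.0 + excess) / (2.0 + excess)
end

# Returns 1 with probability p_one, otherwise 0.
def biased_bit(p_one, rng = Random.new)
  rng.rand < p_one ? 1 : 0
end

# Usage: tally the bits and compare the two frequencies.
rng = Random.new(11)
p_one = probability_of_one(0.2)
ones = Array.new(10_000) { biased_bit(p_one, rng) }.sum
puts "ones: #{ones}, zeros: #{10_000 - ones}"
```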

#7

Have a look at this lecture if you want a good mathematical understanding.

#8

How about

var oneFreq = 80.0/100.0;   // fraction of outputs that should be 1
var output = 0;
if (Math.random() < oneFreq)
   output = 1;

Or, if you want 20% of the values to be between 0 and 100, and 80% to be between 100 and 200:

var oneFreq = 80.0/100.0;   // fraction of values in the upper range
var zeroRange = 100;        // width of the lower range: 0 to 99
var oneRange  = 100;        // width of the upper range: 100 to 199
var output;
if (Math.random() < oneFreq)
   output = zeroRange + Math.floor(oneRange * Math.random());
else
   output = Math.floor(zeroRange * Math.random());

#9

In Ruby I would do it like this:

class DistributedRandom
  def initialize(left, right = nil)
    if right
      @distribution = [0] * left + [1] * right
    else
      @distribution = left
    end
  end
  def get
    @distribution[rand @distribution.length]
  end
end

Running a test with 80:20 distribution:

test = [0,0]
rnd = DistributedRandom.new 80, 20   # 80:20 distribution
10000.times { test[rnd.get] += 1 }; puts "Test 1", test

Running a test with 20% more distribution on the right side:

test = [0,0]
rnd = DistributedRandom.new 100, 120   # +20% distribution
10000.times { test[rnd.get] += 1 }; puts "Test 2", test

Running a test with a custom distribution built from a trigonometric function over 91 discrete values; the output, however, does not fit very well with the previous tests:

test = [0,0]
rnd = DistributedRandom.new((0..90).map {|x| Math.sin(Math::PI * x / 180.0)})
10000.times { test[rnd.get] += 1 }; puts "Test 3", test
