Generate random numbers summing to a predefined value


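Question:

How can I generate n pseudo-random numbers so that their sum equals a specific value?

Here's the deal: I want to generate (for example) 4 pseudo-random numbers that add up to 40.

How could this be done in Python? I could generate a random number from 1 to 40, then generate another number between 1 and the remainder, and so on, but then the first number would have a greater chance of "grabbing" more.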
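The bias the question worries about can be seen in a short simulation. This is only a sketch of one reading of the approach described above; the helper name `naive_split` and the rule of always leaving at least 1 for each remaining part are assumptions, not part of the original question.

```python
import random

def naive_split(n, total, rng=random):
    """Hypothetical helper: take each part from what is left, in order."""
    parts = []
    remaining = total
    for i in range(n - 1):
        # Leave at least 1 for each of the parts still to come.
        part = rng.randint(1, remaining - (n - 1 - i))
        parts.append(part)
        remaining -= part
    parts.append(remaining)  # the last part is whatever is left over
    return parts

rng = random.Random(0)
trials = [naive_split(4, 40, rng) for _ in range(10000)]
means = [sum(t[i] for t in trials) / len(trials) for i in range(4)]
print(means)  # the first mean is typically far larger than the later ones
```

Under these assumptions the first position averages roughly 19 while the later positions average far less, which is the unevenness the answers below try to avoid.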

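Solution 1:

This is the standard solution. It's similar to Laurence Gonsalves' answer, but has two advantages over that answer.

It's uniform: each combination of 4 positive integers adding up to 40 is equally likely to come up with this scheme.

and

It's easy to adapt to other totals (7 numbers adding up to 100, etc.)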

import random

def constrained_sum_sample_pos(n, total):
    """Return a randomly chosen list of n positive integers summing to total.
    Each such list is equally likely to occur."""

    dividers = sorted(random.sample(range(1, total), n - 1))
    return [a - b for a, b in zip(dividers + [total], [0] + dividers)]

Sample outputs:

>>> constrained_sum_sample_pos(4, 40)
[4, 4, 25, 7]
>>> constrained_sum_sample_pos(4, 40)
[9, 6, 5, 20]
>>> constrained_sum_sample_pos(4, 40)
[11, 2, 15, 12]
>>> constrained_sum_sample_pos(4, 40)
[24, 8, 3, 5]

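Explanation: there's a one-to-one correspondence between (1) 4-tuples (a, b, c, d) of positive integers such that a + b + c + d == 40, and (2) triples of integers (e, f, g) with 0 < e < f < g < 40, and it's easy to generate the latter using random.sample. The correspondence is given by (e, f, g) = (a, a + b, a + b + c) in one direction, and (a, b, c, d) = (e, f - e, g - f, 40 - g) in the reverse direction.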
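The correspondence can be checked exhaustively for the 4-numbers-summing-to-40 case. The helper names `tuple_to_dividers` and `dividers_to_tuple` below are made up for this sketch; they simply spell out the two directions of the mapping.

```python
from itertools import combinations
from math import comb

def tuple_to_dividers(parts):
    """(a, b, c, d) -> (e, f, g): running sums, excluding the final total."""
    dividers, running = [], 0
    for p in parts[:-1]:
        running += p
        dividers.append(running)
    return tuple(dividers)

def dividers_to_tuple(dividers, total):
    """(e, f, g) -> (a, b, c, d): gaps between consecutive dividers."""
    edges = (0,) + tuple(dividers) + (total,)
    return tuple(hi - lo for lo, hi in zip(edges, edges[1:]))

total = 40
all_dividers = list(combinations(range(1, total), 3))  # every 0 < e < f < g < 40
all_tuples = {dividers_to_tuple(d, total) for d in all_dividers}

# Every triple gives a distinct tuple, and the mapping round-trips.
assert len(all_tuples) == len(all_dividers) == comb(total - 1, 3)
assert all(tuple_to_dividers(dividers_to_tuple(d, total)) == d for d in all_dividers)
print(len(all_tuples))  # 9139
```

Because the two sets have the same size and each triple is drawn with equal probability by random.sample followed by sorting, each 4-tuple is equally likely as well.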

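If you want nonnegative integers (i.e., allowing 0) instead of positive ones, then there's an easy transformation: if (a, b, c, d) are nonnegative integers summing to 40, then (a+1, b+1, c+1, d+1) are positive integers summing to 44, and vice versa. Using this idea, we have: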

def constrained_sum_sample_nonneg(n, total):
    """Return a randomly chosen list of n nonnegative integers summing to total.
    Each such list is equally likely to occur."""

    return [x - 1 for x in constrained_sum_sample_pos(n, total + n)]

Graphical explanation of constrained_sum_sample_pos(4, 10), thanks to @FM. (Edited slightly.)

0 1 2 3 4 5 6 7 8 9 10  # The universe.
|                    |  # Place fixed dividers at 0, 10.
|   |     |       |  |  # Add 4 - 1 randomly chosen dividers in [1, 9]
  a    b      c    d    # Compute the 4 differences: 2 3 4 1
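The diagram can be reproduced with fixed dividers instead of random ones. The function name `differences` is just a label chosen for this sketch; the list comprehension is the same one used in constrained_sum_sample_pos.

```python
def differences(dividers, total):
    """Sizes of the gaps cut out of [0, total] by the sorted dividers."""
    return [a - b for a, b in zip(dividers + [total], [0] + dividers)]

# The diagram's dividers sit at 2, 5 and 9 on the 0..10 number line.
print(differences([2, 5, 9], 10))  # [2, 3, 4, 1]
```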

Solution 2:

Use the multinomial distribution:

from numpy.random import multinomial
multinomial(40, [1/4.] * 4)

In this example, each variable follows a binomial distribution with mean n * p, which here equals 40 * 1/4 = 10.
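For readers without NumPy, the same kind of draw can be sketched with the standard library: a multinomial sample with equal probabilities is what you get by dropping 40 balls into 4 equally likely bins and counting. The name `multinomial_sample` is an assumption of this sketch, not a library function.

```python
import random
from collections import Counter

def multinomial_sample(total, n, rng=random):
    """Drop `total` balls into n equally likely bins; return the bin counts."""
    hits = Counter(rng.choices(range(n), k=total))
    return [hits[i] for i in range(n)]

rng = random.Random(1)
samples = [multinomial_sample(40, 4, rng) for _ in range(5000)]
mean_first = sum(s[0] for s in samples) / len(samples)
print(samples[0], mean_first)  # counts always sum to 40; the mean is close to 10
```

Note that a count can be 0 with this approach, and the results cluster around 10 rather than being spread evenly over all possible splits.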

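Solution 3:

import random

def four_parts():
    b = random.randint(2, 38)      # middle cut
    a = random.randint(1, b - 1)   # cut inside the lower range
    c = random.randint(b + 1, 39)  # cut inside the upper range
    return [a, b - a, c - b, 40 - c]

(I'm assuming you want integers since you said "1-40", but this could easily be generalized to floats.)

Here's how it works:

Cut the total range in two at a random point; that's b. The odd-looking bounds are there because at least 2 must lie below the cut and at least 2 above it. (This comes from the minimum of 1 on each value.)
Cut each of those ranges in two at a random point. Again, the bounds account for the minimum of 1.
Return the size of each slice. They add up to 40.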
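The same three steps carry over to floats, as the parenthetical note suggests. This is a sketch under the assumption that float parts have no minimum size, so the cuts may fall anywhere in their range; `split_in_four` is a name invented here.

```python
import random

def split_in_four(total, rng=random):
    """Float version of the three steps: cut once, then cut each half."""
    b = rng.uniform(0, total)   # first cut splits the whole range
    a = rng.uniform(0, b)       # second cut splits the lower piece
    c = rng.uniform(b, total)   # third cut splits the upper piece
    return [a, b - a, c - b, total - c]

parts = split_in_four(40.0, random.Random(2))
print(parts, sum(parts))  # the sum is 40 up to floating-point rounding
```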

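Solution 4:

Generate 4 random numbers, compute their sum, then divide each one by the sum and multiply by 40.

If you want integers, this will require a little non-randomness.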
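A minimal sketch of both variants follows. The float version is the rescaling described above. For the integer version, the answer does not say which deterministic step it has in mind; rounding down and then giving the leftover units to the largest fractional remainders is one possible choice, assumed here for illustration.

```python
import random

def scaled_floats(n, total, rng=random):
    """n random floats rescaled so that they sum to total."""
    raw = [rng.random() for _ in range(n)]
    scale = total / sum(raw)
    return [x * scale for x in raw]

def scaled_integers(n, total, rng=random):
    """Integer variant: round down, then hand the leftover units to the
    parts with the largest fractional remainders (a deterministic step)."""
    exact = scaled_floats(n, total, rng)
    parts = [int(x) for x in exact]
    leftover = total - sum(parts)
    by_remainder = sorted(range(n), key=lambda i: exact[i] - parts[i], reverse=True)
    for i in by_remainder[:leftover]:
        parts[i] += 1
    return parts

rng = random.Random(3)
print(scaled_floats(4, 40, rng))
print(scaled_integers(4, 40, rng))
```

With this rounding rule an integer part can come out as 0, so an extra adjustment would be needed if strictly positive values are required.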

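Solution 5:

There are only 37^4 = 1,874,161 arrangements of four integers in the range [1,37] (with repeats allowed). Enumerate them, saving and counting the arrangements that add up to 40. (This will be a much smaller number, N.)

Draw a uniformly distributed random integer K in the interval [0, N-1] and return the K-th arrangement. It is easy to see that this guarantees a uniform distribution over the space of possible outcomes, with every position in the sequence identically distributed. (Many of the answers I'm seeing will have the final choice biased lower than the first three!)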
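The enumeration described above can be sketched in a few lines. The function name `all_compositions` is an assumption of this sketch; it generalizes the [1,37] bound to total - (n - 1), since the other parts are each at least 1.

```python
import random
from itertools import product

def all_compositions(n, total):
    """Every ordered n-tuple of positive integers summing to total."""
    largest = total - (n - 1)  # the other n - 1 parts are each at least 1
    return [t for t in product(range(1, largest + 1), repeat=n) if sum(t) == total]

table = all_compositions(4, 40)   # scans 37**4 candidates once
print(len(table))                 # N = 9139

k = random.randrange(len(table))  # uniform integer in [0, N - 1]
print(table[k])
```

The table only needs to be built once; after that each draw costs a single random index.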

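Solution 6:

This is what the Dirichlet-multinomial distribution is for. It is an improvement over sampling from the multinomial distribution, because we don't need to assign a probability to each item to be chosen.

For context, the multinomial distribution picks one of m items with chosen probabilities (summing to 1), repeats that n times, and then counts how many times each item was picked, which by construction sums to n. In contrast, the Dirichlet-multinomial distribution first samples the probability of each item being picked, in a way that is uniform over all possible combinations, and then runs the multinomial selection.

Unfortunately, SciPy doesn't seem to have implemented sampling from this distribution yet, so I opened an issue about it. In the meantime, the following function generates m non-negative integers summing to n, uniformly over all possible splits: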

import scipy.stats as sps

def random_split(n,m):
    """ Generate m non-negative integers summing to n. """
    p = sps.dirichlet.rvs(alpha=[1]*m, size=1)
    return sps.multinomial.rvs(n=n, p=p[0], size=1)

If we further want to force the integers to be positive, we can subtract m from the total and manually add 1 to every term:

import scipy.stats as sps

def random_split_positive(n,m):
    """ Generate m positive integers summing to n. """
    p = sps.dirichlet.rvs(alpha=[1]*m, size=1)
    return 1+sps.multinomial.rvs(n=n-m, p=p[0], size=1)

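Finally, for completeness, if neither the terms nor the total are restricted to integers, the Dirichlet distribution will happily find a split as well: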

import scipy.stats as sps

def random_split_float(s,m):
    """ Generate m non-negative numbers summing to s. """
    return s*sps.dirichlet.rvs(alpha=[1]*m, size=1)
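The two-stage process described in this answer can also be sketched with the standard library alone. This is not the SciPy code above; it assumes the well-known construction of a flat Dirichlet(1, ..., 1) sample by normalizing independent exponential draws, and the name `dirichlet_multinomial_split` is made up for the example.

```python
import random
from collections import Counter

def dirichlet_multinomial_split(n, m, rng=random):
    """Split n into m non-negative integer counts in two stages."""
    # Stage 1: flat Dirichlet(1, ..., 1) probabilities, obtained by
    # normalizing independent exponential draws.
    draws = [rng.expovariate(1.0) for _ in range(m)]
    norm = sum(draws)
    probs = [d / norm for d in draws]
    # Stage 2: n independent picks with those probabilities, then count.
    picks = Counter(rng.choices(range(m), weights=probs, k=n))
    return [picks[i] for i in range(m)]

print(dirichlet_multinomial_split(40, 4, random.Random(4)))
```

Unlike the SciPy functions, this returns a plain list rather than a 2-D array.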

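Solution 7:

This is a small modification of @Mark Dickinson's version that allows zeros in the generated integers (so that they are non-negative rather than positive):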

import random

def constrained_sum_sample_pos(n, total):
    """Return a randomly chosen list of n non-negative integers summing to total.
    Dividers are drawn with replacement, so the lists are not all equally likely."""

    # range(0, total + 1) lets a divider land on total, so the last part can be 0 too.
    dividers = sorted(random.choices(range(0, total + 1), k=n-1))
    return [a - b for a, b in zip(dividers + [total], [0] + dividers)]

The function random.choices() samples with replacement, as opposed to random.sample(), which samples without replacement. It is new in Python 3.6.
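One thing worth checking with sampling with replacement is whether every resulting list is still equally likely. The sketch below counts, for a tiny case, how many ordered divider draws lead to each list; it assumes dividers are drawn from 0 through total inclusive, and `exact_distribution` is a name invented here.

```python
from collections import Counter
from itertools import product

def exact_distribution(n, total, positions):
    """Count how often each list arises when n - 1 dividers are drawn
    with replacement from positions (every ordered draw equally likely)."""
    counts = Counter()
    for draw in product(positions, repeat=n - 1):
        dividers = sorted(draw)
        parts = tuple(a - b for a, b in zip(dividers + [total], [0] + dividers))
        counts[parts] += 1
    return counts

counts = exact_distribution(3, 2, range(0, 3))
for parts, hits in sorted(counts.items()):
    print(parts, hits)
# counts[(0, 0, 2)] is 1 but counts[(0, 1, 1)] is 2
```

In this small case, lists that need two dividers at the same position come up half as often as the others, so the scheme is not uniform. The shift-by-one transformation from Solution 1 avoids this.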

Solution 8:

If you want true randomness then use:

import numpy as np
def randofsum_unbalanced(s, n):
    # Where s = sum (e.g. 40 in your case) and n is the output array length (e.g. 4 in your case)
    r = np.random.rand(n)
    a = np.array(np.round((r/np.sum(r))*s,0),dtype=int)
    while np.sum(a) > s:
        a[np.random.choice(n)] -= 1
    while np.sum(a) < s:
        a[np.random.choice(n)] += 1
    return a

If you want a greater level of uniformity then you can make use of the multinomial distribution:

def randofsum_balanced(s, n):
    return np.random.multinomial(s,np.ones(n)/n,size=1)[0]

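Solution 9:

Building on @markdickonson's answer, this provides some control over the distribution between the dividers. I introduce a variance/jiggle as a percentage of the uniform distance between each divider.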

import random

def constrained_sum_sample(n, total, variance=50):
    """Return a random-ish list of n positive integers summing to total.

    variance: int; percentage of the gap between the uniform spacing to vary the result.
    """
    divisor = total/n
    jiggle = divisor * variance / 100 / 2
    dividers = [int((x+1)*divisor + random.random()*jiggle) for x in range(n-1)]
    result = [a - b for a, b in zip(dividers + [total], [0] + dividers)]
    return result
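A short usage sketch shows what the variance parameter does. The function body is repeated from the answer (lightly condensed) so the snippet runs on its own; the example values 4 and 40 are taken from the question.

```python
import random

def constrained_sum_sample(n, total, variance=50):
    # Same logic as the function above, repeated so this snippet runs alone.
    divisor = total / n
    jiggle = divisor * variance / 100 / 2
    dividers = [int((x + 1) * divisor + random.random() * jiggle) for x in range(n - 1)]
    return [a - b for a, b in zip(dividers + [total], [0] + dividers)]

print(constrained_sum_sample(4, 40, variance=0))   # [10, 10, 10, 10]
print(constrained_sum_sample(4, 40, variance=50))  # parts stay within a few units of 10
```

With variance=0 the split is perfectly even. Since the jiggle term is never negative, each divider can only move to the right of its even position, so in this example the last part never exceeds 10.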