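I can list every combination with list(itertools.combinations(range(n), m)), but the result is usually very large.
Given n and m, how can I pick one combination uniformly at random without first building that huge list?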
From http://docs.python.org/2/library/itertools.html#recipes:
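    import random

    # Python 2 version (uses xrange)
    def random_combination(iterable, r):
        "Random selection from itertools.combinations(iterable, r)"
        pool = tuple(iterable)
        n = len(pool)
        # Sample r distinct positions, then sort them to keep the pool's order
        indices = sorted(random.sample(xrange(n), r))
        return tuple(pool[i] for i in indices)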
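As a quick sanity check on the uniformity claim, here is a minimal sketch in Python 3 (so range replaces xrange). The helper combination_counts and its seed parameter are made up for this illustration and are not part of the itertools recipes; it simply tallies how often each combination of range(4) taken 2 at a time is drawn.

```python
import random
from collections import Counter
from itertools import combinations

def random_combination(iterable, r):
    """Random selection from itertools.combinations(iterable, r)."""
    pool = tuple(iterable)
    n = len(pool)
    indices = sorted(random.sample(range(n), r))
    return tuple(pool[i] for i in indices)

def combination_counts(n, m, trials, seed=0):
    """Hypothetical helper: tally how often each combination is drawn."""
    random.seed(seed)  # fixed seed so the tally is repeatable
    return Counter(random_combination(range(n), m) for _ in range(trials))

counts = combination_counts(4, 2, 60000)
for combo in combinations(range(4), 2):
    # Each of the 6 combinations should show up roughly 10000 times
    print(combo, counts[combo])
```

With 6 possible combinations and 60000 draws, each count should land close to 10000 if the selection is uniform.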
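The itertools documentation includes a recipe that returns a random combination from an iterable. Below are two versions of the code, one for Python 2.x and one for Python 3.x. In both cases the index range (xrange in Python 2, range in Python 3) is a lazy sequence rather than a list, so the full set of combinations is never built in memory; only the input pool itself is copied into a tuple.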
    def random_combination(iterable, r):
        "Random selection from itertools.combinations(iterable, r)"
        pool = tuple(iterable)
        n = len(pool)
        indices = sorted(random.sample(xrange(n), r))
        return tuple(pool[i] for i in indices)
In your case, using it is straightforward:
    >>> import random
    >>> def random_combination(iterable, r):
    ...     "Random selection from itertools.combinations(iterable, r)"
    ...     pool = tuple(iterable)
    ...     n = len(pool)
    ...     indices = sorted(random.sample(xrange(n), r))
    ...     return tuple(pool[i] for i in indices)
    ...
    >>> n = 10
    >>> m = 3
    >>> print(random_combination(range(n), m))
    (3, 5, 9)  # example output: a random tuple of length 3 from the iterable range(10)
For Python 3.x, replace the xrange call with range; the usage stays the same.
    def random_combination(iterable, r):
        "Random selection from itertools.combinations(iterable, r)"
        pool = tuple(iterable)
        n = len(pool)
        indices = sorted(random.sample(range(n), r))
        return tuple(pool[i] for i in indices)
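Since the question starts from n and m rather than from an arbitrary iterable, the pool tuple can be skipped entirely by sampling the indices directly. The following is a minimal Python 3 sketch under that assumption; random_index_combination is a hypothetical name, not something from itertools or the recipes.

```python
import random

def random_index_combination(n, m):
    """Hypothetical helper: one uniformly random m-subset of range(n), sorted."""
    # range(n) is a lazy sequence, so random.sample can index into it
    # without the n values ever being stored in a list.
    return tuple(sorted(random.sample(range(n), m)))

print(random_index_combination(10, 3))
# Still cheap for a huge n, because nothing of size n is built
print(random_index_combination(10**12, 3))
```

The result has the same form as one element of itertools.combinations(range(n), m): an increasing tuple of m distinct indices.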
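If you want to iterate over several random combinations, a generator that yields them one at a time is more memory-efficient than collecting them in a list: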
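    import random

    def random_combination(iterable, r):
        # Yields r random selections, each containing r elements of the pool.
        # Elements appear in the order random.sample returns them, not sorted.
        pool = tuple(iterable)
        n = len(pool)
        rng = range(n)
        for _ in range(r):
            yield [pool[j] for j in random.sample(rng, r)]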
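The generator above reuses r both as the size of each selection and as the number of selections. A variant that separates the two and sorts the indices (so each result looks like an element of itertools.combinations) might look like the sketch below. The name random_combinations and the count parameter are assumptions introduced here for illustration.

```python
import random

def random_combinations(iterable, r, count):
    """Hypothetical variant: lazily yield `count` independent random r-combinations."""
    pool = tuple(iterable)
    n = len(pool)
    for _ in range(count):
        # Sorted indices keep each result in the pool's original order
        indices = sorted(random.sample(range(n), r))
        yield tuple(pool[i] for i in indices)

for combo in random_combinations("abcde", 3, 4):
    print(combo)
```

The draws are independent, so the same combination can be yielded more than once.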