在Python中合并排序列表

Question

我有一堆排序的对象列表和一个比较函数

class Obj :
    def __init__(p) :
        self.points = p
def cmp(a, b) :
    return a.points < b.points

a = [Obj(1), Obj(3), Obj(8), ...]
b = [Obj(1), Obj(2), Obj(3), ...]
c = [Obj(100), Obj(300), Obj(800), ...]

result = magic(a, b, c)
assert result == [Obj(1), Obj(1), Obj(2), Obj(3), Obj(3), Obj(8), ...]

magic

是什么样子的？我当前的实现是

def magic(*args) :
    r = []
    for a in args : r += a
    return sorted(r, cmp)

但这效率很低。更好的答案？

Answer 1

Python 标准库提供了一个方法：

heapq.merge

。
正如文档所说，它与使用 itertools 非常相似（但有更多限制）；如果你不能忍受这些限制（或者如果你不使用 Python 2.6），你可以这样做：

sorted(itertools.chain(args), cmp)

但是，我认为它与您自己的解决方案具有相同的复杂性，尽管使用迭代器应该提供一些相当好的优化和速度提升。

Answer 2

我喜欢罗伯托·利弗雷多的回答。我不知道 heapq.merge()。嗯嗯。

这是使用 Roberto 的领导的完整解决方案：

class Obj(object):
    def __init__(self, p) :
        self.points = p
    def __cmp__(self, b) :
        return cmp(self.points, b.points)
    def __str__(self):
        return "%d" % self.points

a = [Obj(1), Obj(3), Obj(8)]
b = [Obj(1), Obj(2), Obj(3)]
c = [Obj(100), Obj(300), Obj(800)]

import heapq

sorted = [item for item in heapq.merge(a,b,c)]
for item in sorted:
    print item

或者：

for item in heapq.merge(a,b,c):
    print item

Answer 3

使用

bisect

模块。来自文档：“该模块支持按排序顺序维护列表，而无需在每次插入后对列表进行排序。”

import bisect

def magic(*args):
    r = []
    for a in args:
        for i in a:
            bisect.insort(r, i)
    return r

Answer 4

您可以使用[堆](http://en.wikipedia.org/wiki/Heap_(data_struct)而不是使用列表。

插入的时间复杂度为 O(log(n))，因此合并 a、b 和 c 的时间复杂度为 O(n log(n))

在Python中，您可以使用

heapq

模块。

Answer 5

我不知道是否会更快，但你可以简化它：

def GetObjKey(a):
    return a.points

return sorted(a + b + c, key=GetObjKey)

当然，如果您愿意，也可以使用

cmp

而不是

key

。

Answer 6

使用排序的一行解决方案：

def magic(*args):
  return sorted(sum(args,[]), key: lambda x: x.points)

IMO 这个解决方案非常可读。

使用heapq模块，可能会更高效，但我没有测试过。你不能在heapq中指定cmp/key函数，所以你必须实现Obj来隐式排序。

import heapq
def magic(*args):
  h = []
  for a in args:
    heapq.heappush(h,a)
  return [i for i in heapq.heappop(h)

Answer 7

我问了类似的问题并得到了一些很好的答案：

连接一组有序整数生成 Python 迭代器

该问题的最佳解决方案是合并算法的变体，您可以在此处阅读：

维基百科：合并算法

Answer 8

下面是一个以 O(n) 比较运行的函数示例。

您可以通过创建 a 和 b 迭代器并递增它们来加快速度。

我只是调用该函数两次来合并 3 个列表：

def zip_sorted(a, b):
    '''
    zips two iterables, assuming they are already sorted
    '''
    i = 0
    j = 0
    result = []
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            result.append(a[i])
            i += 1
        else:
            result.append(b[j])
            j += 1
    if i < len(a):
        result.extend(a[i:])
    else:
        result.extend(b[j:])
    return result

def genSortedList(num,seed):
    result = [] 
    for i in range(num):
        result.append(i*seed)
    return result

if __name__ == '__main__':
    a = genSortedList(10000,2.0)
    b = genSortedList(6666,3.0)
    c = genSortedList(5000,4.0)
    d = zip_sorted(zip_sorted(a,b),c)
    print d

但是，heapq.merge混合使用了此方法并堆放所有列表的当前元素，因此应该表现得更好

Answer 9

给你：功能齐全的列表合并排序（改编自我的排序这里）：

def merge(*args):
    import copy
    def merge_lists(left, right):
        result = []
        while left and right:
            which_list = (left if left[0] <= right[0] else right)
            result.append(which_list.pop(0))
        return result + left + right
    lists = list(args)
    while len(lists) > 1:
        left, right = copy.copy(lists.pop(0)), copy.copy(lists.pop(0))
        result = merge_lists(left, right)
        lists.append(result)
    return lists.pop(0)

这样称呼它：

merged_list = merge(a, b, c)
for item in merged_list:
    print item

为了更好地衡量，我将对您的 Obj 类进行一些更改：

class Obj(object):
    def __init__(self, p) :
        self.points = p
    def __cmp__(self, b) :
        return cmp(self.points, b.points)
    def __str__(self):
        return "%d" % self.points

从对象派生
通过
```
self
```
到
```
__init__()
```
使
```
__cmp__
```
成为成员函数
添加
```
str()
```
成员函数以将
```
Obj
```
显示为字符串

Answer 10

这是你的神奇函数，时间复杂度为 O(n0+n1+n2+...)。

它跟踪每个可迭代对象的指针，并在任何指针处生成最小的项。即与归并排序的“合并”部分相同的想法。

def merge(*iterables, key=lambda x:x):
    L = len(iterables)
    ls = [len(l) for l in iterables]
    ps = [0] * len(iterables)
    while sum([ls[i] - ps[i] for i in range(0, L)]) > 0:
        min, mink = None, None
        mini = -1
        for i in range(0, L):  # loop over pointers
            if ps[i] == ls[i]: continue
            keyed = key(iterables[i][ps[i]])
            if mink is None or keyed < mink:
                min = iterables[i][ps[i]]
                mink = keyed
                mini = i
        yield min
        ps[mini] += 1

用法是：

merge([1, 2, 3], [3, 4, 5], [2, 4, 10]) == [1, 2, 2, 3, 3, 4, 4, 5, 10]

注意：它需要一个键函数，但输入的可迭代对象也必须根据相同的键进行排序，否则输入将无法正确排序。

在Python中合并排序列表

问题描述投票：0回答：10

10个回答

最新问题

在Python中合并排序列表

问题描述 投票：0回答：10

10个回答

最新问题

问题描述投票：0回答：10