我用什么来实现 Python 中的最大堆?

问题描述 投票:0回答:19

Python 包含用于 min-heapheapq 模块,但我需要一个 max-heap。我应该使用什么来实现 Python 中的最大堆?

python data-structures heap recursive-datastructures
19个回答
442
投票

最简单的方法是反转键的值并使用heapq。例如,将 1000.0 变为 -1000.0,将 5.0 变为 -5.0.


369
投票

你可以使用

import heapq
listForTree = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]    
heapq.heapify(listForTree)             # for a min heap
heapq._heapify_max(listForTree)        # for a maxheap!!

如果你想弹出元素,使用:

heapq.heappop(minheap)      # pop from minheap
heapq._heappop_max(maxheap) # pop from maxheap

139
投票

解决方案是在将值存储在堆中时取反,或者像这样反转对象比较:

import heapq

class MaxHeapObj(object):
  def __init__(self, val): self.val = val
  def __lt__(self, other): return self.val > other.val
  def __eq__(self, other): return self.val == other.val
  def __str__(self): return str(self.val)

最大堆的例子:

maxh = []
heapq.heappush(maxh, MaxHeapObj(x))
x = maxh[0].val  # fetch max value
x = heapq.heappop(maxh).val  # pop max value

但是你必须记住包装和解包你的值,这需要知道你正在处理的是最小堆还是最大堆。

MinHeap、MaxHeap 类

MinHeap
MaxHeap
对象添加类可以简化您的代码:

class MinHeap(object):
  def __init__(self): self.h = []
  def heappush(self, x): heapq.heappush(self.h, x)
  def heappop(self): return heapq.heappop(self.h)
  def __getitem__(self, i): return self.h[i]
  def __len__(self): return len(self.h)

class MaxHeap(MinHeap):
  def heappush(self, x): heapq.heappush(self.h, MaxHeapObj(x))
  def heappop(self): return heapq.heappop(self.h).val
  def __getitem__(self, i): return self.h[i].val

用法示例:

minh = MinHeap()
maxh = MaxHeap()
# add some values
minh.heappush(12)
maxh.heappush(12)
minh.heappush(4)
maxh.heappush(4)
# fetch "top" values
print(minh[0], maxh[0])  # "4 12"
# fetch and remove "top" values
print(minh.heappop(), maxh.heappop())  # "4 12"

65
投票

最简单理想的解决方案

将值乘以 -1

给你。所有最高的数字现在都是最低的,反之亦然。

请记住,当您弹出一个元素以将其与 -1 相乘以再次获得原始值时。


21
投票

最简单的方法是将每个元素转换为负值,它将解决您的问题。

import heapq
heap = []
heapq.heappush(heap, 1*(-1))
heapq.heappush(heap, 10*(-1))
heapq.heappush(heap, 20*(-1))
print(heap)

输出将如下所示:

[-20, -1, -10]

15
投票

我还需要使用最大堆,而且我正在处理整数,所以我只是包装了我需要的两个方法

heap
如下:

import heapq


def heappush(heap, item):
    return heapq.heappush(heap, -item)


def heappop(heap):
    return -heapq.heappop(heap)

然后我分别用

heapq.heappush()
heapq.heappop()
替换了我的
heappush()
heappop()
电话。


15
投票

我实现了 heapq 的最大堆版本并将其提交给 PyPI。 (heapq 模块的 CPython 代码有非常细微的变化。)

heapq_max

Heapq_max(GitHub)

安装

pip install heapq_max

用法

tl;dr:与 heapq 模块相同,只是在所有函数中添加了‘_max’。

heap_max = []                           # Creates an empty heap
heappush_max(heap_max, item)            # Pushes a new item on the heap
item = heappop_max(heap_max)            # Pops the largest item from the heap
item = heap_max[0]                      # The largest item on the heap without popping it
heapify_max(x)                          # Transforms the list into a heap, in-place, in linear time
item = heapreplace_max(heap_max, item)  # Pops and returns the largest item, and
                                        # adds a new item; the heap size is unchanged

12
投票

这是一个基于 heapq 的简单最大堆实现。虽然它只适用于数值。

import heapq
from typing import List


class MaxHeap:
    def __init__(self):
        self.data = []

    def top(self):
        return -self.data[0]

    def push(self, val):
        heapq.heappush(self.data, -val)

    def pop(self):
        return -heapq.heappop(self.data)

用法:

max_heap = MaxHeap()
max_heap.push(3)
max_heap.push(5)
max_heap.push(1)
print(max_heap.top())  # 5

6
投票

最好的方法:

from heapq import *

h = [5, 7, 9, 1, 3]
h_neg = [-i for i in h]
heapify(h_neg)            # heapify
heappush(h_neg, -2)       # push
print(-heappop(h_neg))    # pop
# 9

4
投票

如果您要插入可比较但不像 int 的键,您可能会覆盖它们的比较运算符(即 <= become > 和 > 变成 <=). Otherwise, you can override heapq._siftup in the heapq module (it's all just Python code, in the end).


4
投票

扩展 int 类并覆盖 __lt__ 是方法之一。

import queue
class MyInt(int):
    def __lt__(self, other):
        return self > other

def main():
    q = queue.PriorityQueue()
    q.put(MyInt(10))
    q.put(MyInt(5))
    q.put(MyInt(1))
    while not q.empty():
        print (q.get())


if __name__ == "__main__":
    main()

2
投票

允许您选择任意数量的最大或最小项目

import heapq
heap = [23, 7, -4, 18, 23, 42, 37, 2, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]
heapq.heapify(heap)
print(heapq.nlargest(3, heap))  # [42, 42, 37]
print(heapq.nsmallest(3, heap)) # [-4, -4, 2]

1
投票

我创建了一个堆包装器,它反转值以创建一个最大堆,以及一个用于最小堆的包装器类,使库更像 OOP。 这里是要点。分为三类;堆(抽象类)、HeapMin 和 HeapMax。

方法:

isempty() -> bool; obvious
getroot() -> int; returns min/max
push() -> None; equivalent to heapq.heappush
pop() -> int; equivalent to heapq.heappop
view_min()/view_max() -> int; alias for getroot()
pushpop() -> int; equivalent to heapq.pushpop

1
投票

为了详细说明 Apoorv Patne 的回答,这里有一个针对一般情况的完整记录、注释和测试的 Python 3 实现。

from __future__ import annotations  # To allow "MinHeap.push -> MinHeap:"
from typing import Generic, List, Optional, TypeVar
from heapq import heapify, heappop, heappush, heapreplace


T = TypeVar('T')


class MinHeap(Generic[T]):
    '''
    MinHeap provides a nicer API around heapq's functionality.
    As it is a minimum heap, the first element of the heap is always the
    smallest.
    >>> h = MinHeap([3, 1, 4, 2])
    >>> h[0]
    1
    >>> h.peek()
    1
    >>> h.push(5)  # N.B.: the array isn't always fully sorted.
    [1, 2, 4, 3, 5]
    >>> h.pop()
    1
    >>> h.pop()
    2
    >>> h.pop()
    3
    >>> h.push(3).push(2)
    [2, 3, 4, 5]
    >>> h.replace(1)
    2
    >>> h
    [1, 3, 4, 5]
    '''
    def __init__(self, array: Optional[List[T]] = None):
        if array is None:
            array = []
        heapify(array)
        self.h = array
    def push(self, x: T) -> MinHeap:
        heappush(self.h, x)
        return self  # To allow chaining operations.
    def peek(self) -> T:
        return self.h[0]
    def pop(self) -> T:
        return heappop(self.h)
    def replace(self, x: T) -> T:
        return heapreplace(self.h, x)
    def __getitem__(self, i) -> T:
        return self.h[i]
    def __len__(self) -> int:
        return len(self.h)
    def __str__(self) -> str:
        return str(self.h)
    def __repr__(self) -> str:
        return str(self.h)


class Reverse(Generic[T]):
    '''
    Wrap around the provided object, reversing the comparison operators.
    >>> 1 < 2
    True
    >>> Reverse(1) < Reverse(2)
    False
    >>> Reverse(2) < Reverse(1)
    True
    >>> Reverse(1) <= Reverse(2)
    False
    >>> Reverse(2) <= Reverse(1)
    True
    >>> Reverse(2) <= Reverse(2)
    True
    >>> Reverse(1) == Reverse(1)
    True
    >>> Reverse(2) > Reverse(1)
    False
    >>> Reverse(1) > Reverse(2)
    True
    >>> Reverse(2) >= Reverse(1)
    False
    >>> Reverse(1) >= Reverse(2)
    True
    >>> Reverse(1)
    1
    '''
    def __init__(self, x: T) -> None:
        self.x = x
    def __lt__(self, other: Reverse) -> bool:
        return other.x.__lt__(self.x)
    def __le__(self, other: Reverse) -> bool:
        return other.x.__le__(self.x)
    def __eq__(self, other) -> bool:
        return self.x == other.x
    def __ne__(self, other: Reverse) -> bool:
        return other.x.__ne__(self.x)
    def __ge__(self, other: Reverse) -> bool:
        return other.x.__ge__(self.x)
    def __gt__(self, other: Reverse) -> bool:
        return other.x.__gt__(self.x)
    def __str__(self):
        return str(self.x)
    def __repr__(self):
        return str(self.x)


class MaxHeap(MinHeap):
    '''
    MaxHeap provides an implement of a maximum-heap, as heapq does not provide
    it. As it is a maximum heap, the first element of the heap is always the
    largest. It achieves this by wrapping around elements with Reverse,
    which reverses the comparison operations used by heapq.
    >>> h = MaxHeap([3, 1, 4, 2])
    >>> h[0]
    4
    >>> h.peek()
    4
    >>> h.push(5)  # N.B.: the array isn't always fully sorted.
    [5, 4, 3, 1, 2]
    >>> h.pop()
    5
    >>> h.pop()
    4
    >>> h.pop()
    3
    >>> h.pop()
    2
    >>> h.push(3).push(2).push(4)
    [4, 3, 2, 1]
    >>> h.replace(1)
    4
    >>> h
    [3, 1, 2, 1]
    '''
    def __init__(self, array: Optional[List[T]] = None):
        if array is not None:
            array = [Reverse(x) for x in array]  # Wrap with Reverse.
        super().__init__(array)
    def push(self, x: T) -> MaxHeap:
        super().push(Reverse(x))
        return self
    def peek(self) -> T:
        return super().peek().x
    def pop(self) -> T:
        return super().pop().x
    def replace(self, x: T) -> T:
        return super().replace(Reverse(x)).x


if __name__ == '__main__':
    import doctest
    doctest.testmod()

https://gist.github.com/marccarre/577a55850998da02af3d4b7b98152cf4


1
投票

heapq 模块包含实现最大堆所需的一切。 它仅执行最大堆的 heappush 功能。 我在下面演示了如何克服这个问题。

在heapq模块中添加这个函数:

def _heappush_max(heap, item):
    """Push item onto heap, maintaining the heap invariant."""
    heap.append(item)
    _siftdown_max(heap, 0, len(heap)-1)

最后,添加:

try:
    from _heapq import _heappush_max
except ImportError:
    pass

瞧!完成了

PS - 去heapq函数。首先在你的编辑器中输入“import heapq”,然后右键单击“heapq”并选择“go to definition”。


0
投票

如果您想使用最大堆获得最大的 K 元素,可以执行以下技巧:

nums= [3,2,1,5,6,4]
k = 2  #k being the kth largest element you want to get
heapq.heapify(nums) 
temp = heapq.nlargest(k, nums)
return temp[-1]

0
投票

我创建了一个名为 heap_class 的包,它实现了最大堆,并将各种堆函数包装到一个列表兼容的环境中。

>>> from heap_class import Heap
>>> h = Heap([3, 1, 9, 20], max=True)
>>> h.pop()
20
>>> h.peek()  # same as h[0]
9
>>> h.push(17)  # or h.append(17)
>>> h[0]  # same as h.peek()
17
>>> h[1]  # inefficient, but works
9

从最大堆中获取最小堆。

>>> y = reversed(h)
>>> y.peek()
1
>>> y  # repr is inefficient, but correct
Heap([1, 3, 9, 17], max=False)
>>> 9 in y
True
>>> y.raw()  # underlying heap structure
[1, 3, 17, 9]

正如其他人所提到的,在最大堆中处理字符串和复杂对象在 heapq 中相当困难,因为 否定形式。易于使用 heap_class 实现:

>>> h = Heap(('aa', 4), ('aa', 5), ('zz', 2), ('zz', 1), max=True)
>>> h.pop()
('zz', 2)

支持自定义键并与后续的推送/追加和弹出一起使用:

>>> vals = [('Adam', 'Smith'), ('Zeta', 'Jones')]
>>> h = Heap(vals, key=lambda name: name[1])
>>> h.peek()  # Jones comes before Smith
('Zeta', 'Jones')
>>> h.push(('Aaron', 'Allen'))
>>> h.peek()
('Aaron', 'Allen')

(实现是建立在 heapq 函数上的,所以除了 heappush 和 heapreplace on max heap 是在 Python 中之外,它都是用 C 或 C 包装器实现的)


0
投票

在 Python 中有内置堆,但如果有人想像我一样自己构建它,我只想分享这个。

我是 Python 的新手。不要判断我是否犯了错误。 该算法正在运行,但我不知道效率如何。

class Heap:

    def __init__(self):
        self.heap = []
        self.size = 0


    def add(self, heap):
        self.heap = heap
        self.size = len(self.heap)

    def heappush(self, value):
        self.heap.append(value)
        self.size += 1


    def heapify(self, heap, index=0):

        mid = int(self.size /2)
        """
            if you want to travel great value from bottom to the top you need to repeat swaping by the hight of the tree
            I  don't how how can i get the  height of the tree that's why i use sezi/2
            you can find height by this formula
            2^(x) = size+1  why 2^x because tree is growing exponentially
            xln(2) = ln(size+1)
            x = ln(size+1)/ln(2)
        """

        for i in range(mid):
            self.createTee(heap, index)

        return heap

    def createTee(self, heap, shiftindex):

        """
        """
        """

            this pos reffer to the index of the parent only parent with children
                    (1)
                (2)      (3)           here the size of list is 7/2 = 3
            (4)   (5)  (6)  (7)        the number of parent is 3 but we use {2, 1, 0} in while loop
                                       that why a put pos -1

        """
        pos = int(self.size /2) -1
        """
            this if you wanna sort this heap list we should swap max value in the root of the tree with the last
            value in the list and if you wanna repeat this until sort all list you will need to prevent the func from
            change what we already sorted I should decrease the size of the list that will heapify on it

        """

        newsize = self.size - shiftindex
        while pos >= 0:
            left_child = pos * 2 + 1
            right_child = pos * 2 + 2
            # This means that left child is exist
            if left_child < newsize:
                if right_child < newsize:

                    # If the right child exit we wanna check if left
                    # child > rightchild.
                    #
                    # If the right child doesn't exist we can check that
                    # we will get error out of range.
                    if heap[pos] < heap[left_child] and heap[left_child]  > heap[right_child]:
                        heap[left_child], heap[pos] = heap[pos], heap[left_child]
                # Here if the righ child doesn't exist
                else:
                    if heap[pos] < heap[left_child]:
                        heap[left_child], heap[pos] = heap[pos], heap[left_child]
            # If the right child exist
            if right_child < newsize:
                if heap[pos] < heap[right_child]:
                    heap[right_child], heap[pos] = heap[pos], heap[right_child]
            pos -= 1

        return heap

    def sort(self):
        k = 1
        for i in range(self.size -1, 0, -1):
            """
            because this is max heap we swap root with last element in the list

            """
            self.heap [0], self.heap[i] = self.heap[i], self.heap[0]
            self.heapify(self.heap, k)
            k += 1

        return self.heap


h = Heap()
h.add([5, 7, 0, 8, 9, 10, 20, 30, 50, -1])
h.heappush(-2)
print(" before heapify ")
print(h.heap)
print(" after heapify ")
print(h.heapify(h.heap, 0))
print(" after sort ")
print(h.sort())

输出:

堆化之前

[5, 7, 0, 8, 9, 10, 20, 30, 50, -1, -2]

堆化后

[50, 30, 20, 8, 9, 10, 0, 7, 5, -1, -2]

排序后

[-2, -1, 0, 5, 7, 8, 9, 10, 20, 30, 50]

我希望你能理解我的代码。如果有什么不明白的地方,请发表评论。我会尽力帮助。


0
投票
arr = [3, 4, 5, 1, 2, 3, 0, 7, 8, 90, 67, 31, 2, 5, 567]
# max-heap sort will lead the array to ascending order
def maxheap(arr, p):

    for i in range(len(arr)-p):
        if i > 0:
            child = i
            parent = (i + 1)//2 - 1

            while arr[child]> arr[parent] and child != 0:
                arr[child], arr[parent] = arr[parent], arr[child]
                child = parent
                parent = (parent + 1)//2 -1


def heapsort(arr):
    for i in range(len(arr)):
        maxheap(arr, i)
        arr[0], arr[len(arr)-i-1] = arr[len(arr)-i-1], arr[0]

    return arr


print(heapsort(arr))

试试这个。

© www.soinside.com 2019 - 2024. All rights reserved.