When working with the PriorityQueue<TElement, TPriority> collection, I often need to preserve the insertion order of elements that have equal priority (FIFO order), and AFAIK the only way to do this is to pair the TPriority with an extra incremented int index: PriorityQueue<TElement, (TPriority, int)>. This works well, but it makes me uneasy that if the queue ends up being used for a long time, the int index will wrap around the Int32.MaxValue limit and break the FIFO behavior. I could solve this by switching from int to long, which makes a wraparound practically impossible, but it still feels dirty. I wonder if it's possible to do better: rebuild the indices of the TElement + TPriority pairs when the index is about to wrap with the next Enqueue operation. So I wrote this extension method for reindexing a FIFO priority queue:
public static void Reindex<TElement, TPriority>(
    this PriorityQueue<TElement, (TPriority, int)> source)
{
    ArgumentNullException.ThrowIfNull(source);
    (TElement, (TPriority, int))[] entries = source.UnorderedItems
        .OrderBy(e => e.Priority, source.Comparer)
        .ToArray();
    source.Clear();
    for (int i = 0; i < entries.Length; i++)
        source.Enqueue(entries[i].Item1, (entries[i].Item2.Item1, i));
}
The problem with this implementation is that it allocates a disproportionate amount of memory during the reindexing. For example, 61,456 bytes are allocated in order to rebuild the indices of a PriorityQueue<object, (char, int)> with 1,000 elements:
PriorityQueue<object, (char, int)> queue = new(Comparer<(char, int)>.Create((x, y) =>
{
    int result = x.Item1.CompareTo(y.Item1);
    if (result == 0) result = x.Item2.CompareTo(y.Item2);
    return result;
}));
Random random = new(0);
for (int i = 0; i < 100_000; i++)
{
    char key = (char)random.Next(65, 91);
    queue.Enqueue(new object(), (key, i));
}
while (queue.Count > 1_000) queue.Dequeue();
long mem0 = GC.GetTotalAllocatedBytes(true);
queue.Reindex();
long mem1 = GC.GetTotalAllocatedBytes(true);
Console.WriteLine($"Count: {queue.Count:#,0}, Allocated: {mem1 - mem0:#,0} bytes");
Output:
Count: 1,000, Allocated: 61,456 bytes
Live demo.
My question is whether it's possible to rebuild the indices with zero allocations (in-place), or at least with no more than System.Runtime.CompilerServices.Unsafe.SizeOf<(TElement, TPriority)>() * queue.Count bytes allocated (16,000 bytes in the example above).
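For completeness, the way I intend to use such a Reindex method is from a small wrapper around the queue that triggers the rebuild just before the index would wrap. This is only a sketch; the FifoPriorityQueue name and the wrapping condition are mine, not part of any API:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper (name is made up) that reindexes the underlying
// queue before the int index can wrap around Int32.MaxValue.
public class FifoPriorityQueue<TElement, TPriority>
{
    private readonly PriorityQueue<TElement, (TPriority, int)> _queue;
    private int _index;

    public FifoPriorityQueue(IComparer<TPriority> comparer = null)
    {
        comparer ??= Comparer<TPriority>.Default;
        // Compare by priority first, then by insertion index (FIFO tiebreaker).
        _queue = new(Comparer<(TPriority, int)>.Create((x, y) =>
        {
            int result = comparer.Compare(x.Item1, y.Item1);
            return result != 0 ? result : x.Item2.CompareTo(y.Item2);
        }));
    }

    public int Count => _queue.Count;

    public void Enqueue(TElement element, TPriority priority)
    {
        if (_index == int.MaxValue)
        {
            _queue.Reindex(); // the extension method above
            _index = _queue.Count; // indices are now 0..Count-1
        }
        _queue.Enqueue(element, (priority, _index++));
    }

    public TElement Dequeue() => _queue.Dequeue();
}
```

With this in place the index stays within int range, at the cost of an O(n log n) rebuild at most once per Int32.MaxValue enqueues.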
According to my tests, the biggest memory consumer is the LINQ sorting (with or without materialization), so simply copying the values into an array and sorting them with Array.Sort greatly reduces the memory consumption:
public static void ReindexNew<TElement, TPriority>(
    this PriorityQueue<TElement, (TPriority, int)> source)
{
    ArgumentNullException.ThrowIfNull(source);
    (TElement, (TPriority, int))[] entries = new (TElement, (TPriority, int))[source.UnorderedItems.Count];
    int counter = 0;
    foreach (var item in source.UnorderedItems)
    {
        entries[counter++] = item;
    }
    // hope got comparisons right
    Array.Sort(entries, (left, right) => source.Comparer.Compare(left.Item2, right.Item2));
    source.Clear();
    for (int i = 0; i < entries.Length; i++)
    {
        entries[i].Item2.Item2 = i;
    }
    source.EnqueueRange(entries);
    // or something like this, requires time benchmarking
    // for (int i = 0; i < entries.Length; i++)
    // {
    //     source.Enqueue(entries[i].Item1, (entries[i].Item2.Item1, i));
    // }
}
And the "test":
void Test()
{
    long mem0 = GC.GetTotalAllocatedBytes(true);
    queue.ReindexNew();
    long mem1 = GC.GetTotalAllocatedBytes(true);
    Console.WriteLine($"Count: {queue.Count:#,0}, Allocated: {mem1 - mem0:#,0} bytes");
}

Test(); // Count: 1,000, Allocated: 16,136 bytes
Test(); // Count: 1,000, Allocated: 16,112 bytes
I'm just playing with this puzzle; I'm not an expert, and I've actually never used PriorityQueue. But draining the queue with TryDequeue brings the allocations down to about 16K bytes:

Count: 1,000, Allocated: 16,152 bytes

because it removes one item at a time. As for performance, I can't say anything.
public static void Reindex2<TElement, TPriority>(
    this PriorityQueue<TElement, (TPriority, int)> source)
{
    ArgumentNullException.ThrowIfNull(source);
    int count = source.Count;
    var orderedItems = new (TElement, (TPriority, int))[count];
    int i = 0;
    while (source.TryDequeue(out var item, out var priority))
    {
        orderedItems[i] = (item, (priority.Item1, priority.Item2));
        i++;
    }
    Array.Sort(orderedItems, (a, b)
        => source.Comparer.Compare(a.Item2, b.Item2));
    for (i = 0; i < count; i++)
    {
        source.Enqueue(orderedItems[i].Item1, (orderedItems[i].Item2.Item1, i));
    }
}
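And a quick sanity check that the FIFO order of equal priorities survives the rebuild (using the same kind of tuple comparer as in the question; the element values are made up):

```csharp
using System;
using System.Collections.Generic;

PriorityQueue<string, (int, int)> queue = new(Comparer<(int, int)>.Create((x, y) =>
{
    int result = x.Item1.CompareTo(y.Item1);
    if (result == 0) result = x.Item2.CompareTo(y.Item2);
    return result;
}));

// Equal priority 5, with large (nearly wrapped) indices.
queue.Enqueue("first", (5, 100));
queue.Enqueue("second", (5, 200));
queue.Enqueue("third", (5, 300));

queue.Reindex2(); // indices are rebuilt as 0, 1, 2

// Elements still come out in insertion order: first, second, third.
while (queue.TryDequeue(out string element, out (int, int) priority))
    Console.WriteLine($"{element} {priority}");
```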