How does unordered_set determine the insertion order in C++?

Question

I know that people use unordered_set when they don't care about the order of the elements in the set. However, when I run this sample program on C++ Shell

#include <iostream>
#include <string>
#include <unordered_set>

int main()
{
    std::unordered_set<std::string> inputSet;
    inputSet.insert("Hello world");
    inputSet.insert("Abcdef");
    inputSet.insert("This is the test string...");

    // Print the elements in whatever order the container iterates them.
    for (const auto& val : inputSet)
        std::cout << val << std::endl;

    return 0;
}

it gives me

This is the test string...
Abcdef
Hello world

I ran it 3 or 4 times and it still gave me the same output, which suggests that unordered_set has some way of determining the insertion order.

Can someone explain how unordered_set determines the insertion order?

Sorry if this has been asked before. I searched online for a while and could not find a specific answer to this question. Thanks in advance.

Answer 1

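There is no specific order. It uses the default std::hash to hash the string. Whatever the hash value is, it gets converted into an appropriate bucket index in the container.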

The hash values we are talking about can be obtained like this:

auto hello = std::hash<std::string>()("Hello world");
auto abcd = std::hash<std::string>()("Abcdef");
auto test = std::hash<std::string>()("This is the test string...");

For one particular STL implementation, this resolves to:

Hello maps to: 14420674105493498572
abcd maps to: 10830572898531769673
test maps to: 13068738153895491918

See it live on C++Shell.

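The value is usually converted into an appropriate bucket index by applying the modulo (%) operator. Likewise, the iterators of std::unordered_set are not required to walk through the buckets sequentially (and what about collisions?). So you should not rely on any ordering you observe from the iterators between runs of the program.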

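Since C++14, std::hash<> is explicitly allowed to produce different results between different runs of the program. To quote:

"Hash functions are only required to produce the same result for the same input within a single execution of a program; this allows salted hashes that prevent collision DoS attacks."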

Answer 2

As stated here: http://en.cppreference.com/w/cpp/container/unordered_set

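"Internally, the elements are not sorted in any particular order, but organized into buckets. Which bucket an element is placed into depends entirely on the hash of its value. This allows fast access to individual elements, since once a hash is computed, it refers to the exact bucket the element is placed into."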

So it uses either the default hash algorithm or a user-provided one to sort the elements into hash buckets.

Answer 3

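The order in std::unordered_set<T> is, well, unordered. However, assuming a deterministic hash is used and the same insert operations are done in the same order, different runs of the program will have the elements in the same order. Inserting the elements in a different order, and/or using a hash that produces different values for different runs, will yield a different order of elements.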

Answer 4

I ran into the same phenomenon. At first I thought it was a quirk of the cling compiler; however, after further investigation, I found this answer: Complexity of std::unordered_set iterator traversal

It suggests that iteration happens in "reverse bucket creation order". I tested this myself and confirmed the hypothesis.

test.cpp

#include <string>
using std::string;

#include <iostream>
using std::cout;
using std::endl;

#include <unordered_set>
using std::unordered_set;

int main() {
    unordered_set<string> stuff(10);
    stuff.insert("yep");
    stuff.insert("123");
    stuff.insert("xyz");
    stuff.insert("foo");
    stuff.insert("bar");
    stuff.insert("baz");
    stuff.insert("abc");

    for (const auto& word : stuff) {
        cout << word << endl;
    }

    cout << "bucket count: " << stuff.bucket_count() << endl;

    cout << "yep: " << stuff.bucket("yep") << endl;
    cout << "123: " << stuff.bucket("123") << endl;
    cout << "xyz: " << stuff.bucket("xyz") << endl;
    cout << "foo: " << stuff.bucket("foo") << endl;
    cout << "bar: " << stuff.bucket("bar") << endl;
    cout << "baz: " << stuff.bucket("baz") << endl;
    cout << "abc: " << stuff.bucket("abc") << endl;
}

% g++ -std=c++17 -o test test.cpp && ./test
abc
baz
foo
bar
xyz
123
yep
bucket count: 11
yep: 8
123: 10
xyz: 9
foo: 5
bar: 9
baz: 6
abc: 3

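As buckets are created, they (their indices) are (probably) pushed onto the front of a singly linked list, which therefore iterates in reverse insertion order: 3, 6, 5, 9, 10, 8. Note: the buckets are also singly linked lists and so also iterate in reverse insertion order; that is why bar comes before xyz in bucket 9 below.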

So the iteration goes:

3) abc
6) baz
5) foo
9) bar, xyz
10) 123
8) yep

So, if there are no collisions as you add items (and your table does not grow), you get reverse insertion order.

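However, if you insert more items and the table grows (and all the items are re-inserted), then your iteration order will reflect the reverse bucket creation order of the re-inserted items. My guess is that items are re-inserted in iteration order, so the more you insert and the more the table grows, the more the original insertion order gets obscured.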
