Tuple vs. string as a Dictionary key in C#

Question

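I have a cache implemented with ConcurrentDictionary, and the data I need to keep depends on 5 parameters. So the method to get it from the cache looks like this (for simplicity only 3 parameters are shown here, and I changed the data type to CarData for clarity):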

public CarData GetCarData(string carModel, string engineType, int year);

I am wondering what type of key would be better to use in my ConcurrentDictionary. I can do it like this:

var carCache = new ConcurrentDictionary<string, CarData>();
// check for car key
bool exists = carCache.ContainsKey(string.Format("{0}_{1}_{2}", carModel, engineType, year));

Or like this:

var carCache = new ConcurrentDictionary<Tuple<string, string, int>, CarData>();
// check for car key
bool exists = carCache.ContainsKey(Tuple.Create(carModel, engineType, year));

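I won't be using these parameters together anywhere else, so there is no reason to create a class just to keep them together.

I want to know which approach is better in terms of performance and maintainability.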

Answer 1

You could create a class that overrides GetHashCode and Equals (it doesn't matter that it is only used here):

Thanks to Dmi (and others) for the improvements...

public class CarKey : IEquatable<CarKey>
{
    public CarKey(string carModel, string engineType, int year)
    {
        CarModel = carModel;
        EngineType= engineType;
        Year= year;
    }

    public string CarModel {get;}
    public string EngineType {get;}
    public int Year {get;}

    public override int GetHashCode()
    {
        unchecked // Overflow is fine, just wrap
        {
            int hash = (int) 2166136261;

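            // Parenthesize the null-coalescing part: ^ binds tighter than ??,
            // so without the parentheses a null field would reset hash to 0.
            hash = (hash * 16777619) ^ (CarModel?.GetHashCode() ?? 0);
            hash = (hash * 16777619) ^ (EngineType?.GetHashCode() ?? 0);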
            hash = (hash * 16777619) ^ Year.GetHashCode();
            return hash;
        }
    }

    public override bool Equals(object other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        if (other.GetType() != GetType()) return false;
        return Equals(other as CarKey);
    }

    public bool Equals(CarKey other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        return string.Equals(CarModel, other.CarModel) && string.Equals(EngineType, other.EngineType) && Year == other.Year;
    }
}

If you don't override those, ContainsKey falls back to reference equality.
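To make the reference-equality point concrete, here is a minimal sketch. PlainKey and ReferenceEqualityDemo are hypothetical names introduced only for this illustration; they are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type that does NOT override Equals/GetHashCode.
public sealed class PlainKey
{
    public PlainKey(string carModel, int year) { CarModel = carModel; Year = year; }
    public string CarModel { get; }
    public int Year { get; }
}

public static class ReferenceEqualityDemo
{
    // Looks up the entry with a second instance that holds the same values.
    // The default Equals compares references, so the lookup misses.
    public static bool LookupWithNewInstance()
    {
        var cache = new Dictionary<PlainKey, string>();
        cache[new PlainKey("Golf", 2016)] = "data";
        return cache.ContainsKey(new PlainKey("Golf", 2016));
    }

    // Tuple compares its items, so an equal-valued tuple finds the entry.
    public static bool LookupWithTuple()
    {
        var cache = new Dictionary<Tuple<string, int>, string>();
        cache[Tuple.Create("Golf", 2016)] = "data";
        return cache.ContainsKey(Tuple.Create("Golf", 2016));
    }
}
```

LookupWithNewInstance() returns false and LookupWithTuple() returns true, which is why a custom key class needs the overrides shown above.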

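Note: the Tuple class does have its own equality functions that do basically the same as the above. Using a bespoke class makes it clear what is going on, and is therefore easier to maintain. It also has the advantage that you can name the properties, so their meaning is clear.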

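Note 2: the class is immutable, as dictionary keys need to be, to avoid potential bugs caused by the hash code changing after the object has been added to the dictionary. See here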
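The kind of bug that immutability avoids can be sketched as follows. MutableKey and MutableKeyDemo are hypothetical names used only for this illustration, and the key uses a single int field so that the hash codes are predictable.

```csharp
using System.Collections.Generic;

// Hypothetical mutable key: its hash code depends on a settable property.
public sealed class MutableKey
{
    public int Year { get; set; }
    public override bool Equals(object obj) => obj is MutableKey k && k.Year == Year;
    public override int GetHashCode() => Year;
}

public static class MutableKeyDemo
{
    // Mutates the key after insertion, then looks it up with the same instance.
    public static bool FindSameInstanceAfterMutation()
    {
        var key = new MutableKey { Year = 2016 };
        var cache = new Dictionary<MutableKey, string> { [key] = "data" };
        key.Year = 2017; // the dictionary still stores the hash computed for 2016
        return cache.ContainsKey(key);
    }

    // Looks the entry up with a fresh key that carries the original value.
    public static bool FindByOriginalValueAfterMutation()
    {
        var key = new MutableKey { Year = 2016 };
        var cache = new Dictionary<MutableKey, string> { [key] = "data" };
        key.Year = 2017;
        return cache.ContainsKey(new MutableKey { Year = 2016 });
    }
}
```

Both methods return false: the entry is still in the dictionary but can no longer be reached, either through the mutated instance (its hash no longer matches the stored one) or through the original value (Equals now fails against the mutated key).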

GetHashCode taken from here

Answer 2

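I want to know which approach is better in terms of performance and maintainability.

As always, you have the tools to figure it out. Code both possible solutions and make them race. The one that wins is the winner; you don't need anyone here to answer this particular question.

As for maintenance, the solution that documents itself better and scales better should be the winner. In this case the code is so trivial that self-documentation isn't much of an issue. From a scalability point of view, IMHO, the best solution is to use Tuple<T1, T2, ...>: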

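You get free equality semantics that you don't have to maintain.
Key collisions are not possible, which is not true if you choose the string concatenation solution:

var param1 = "Hey_I'm a weird string";
var param2 = "!";
var param3 = 1;
key = "Hey_I'm a weird string_!_1";

var param1 = "Hey";
var param2 = "I'm a weird string_!";
var param3 = 1;
key = "Hey_I'm a weird string_!_1";

Yes, far-fetched, but in theory entirely possible, and your question is precisely about unknown events in the future, so...
Last but not least, the compiler helps you maintain the code. If, for example, tomorrow you have to add param4 to your key, Tuple<T1, T2, T3, T4> will strongly type your key. Your string concatenation algorithm, on the other hand, can live on blissfully happy generating keys without param4, and you won't know what is going on until your client calls you because their software isn't working as expected.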
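The collision scenario from the list above can be checked with a small sketch. KeyCollisionDemo and FlatKey are hypothetical names for this illustration; FlatKey mirrors the "{0}_{1}_{2}" format from the question.

```csharp
using System;

public static class KeyCollisionDemo
{
    // Builds the flat string key the same way the question does.
    public static string FlatKey(string p1, string p2, int p3) => $"{p1}_{p2}_{p3}";

    // Two different parameter sets produce the same flat key.
    public static bool FlatKeysCollide() =>
        FlatKey("Hey_I'm a weird string", "!", 1) ==
        FlatKey("Hey", "I'm a weird string_!", 1);

    // The same parameter sets stay distinct as tuples,
    // because Tuple compares item by item.
    public static bool TupleKeysCollide() =>
        Tuple.Create("Hey_I'm a weird string", "!", 1)
             .Equals(Tuple.Create("Hey", "I'm a weird string_!", 1));
}
```

FlatKeysCollide() returns true and TupleKeysCollide() returns false. Choosing a separator that cannot occur in the parameters would avoid the string collision, but that is an extra invariant you would have to maintain yourself.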

Answer 3

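If performance is really important, then the answer is that you shouldn't use either option, because both unnecessarily allocate an object on every access.

Instead, you should use a struct, either a custom one or ValueTuple from the System.ValueTuple package: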

var myCache = new ConcurrentDictionary<ValueTuple<string, string, int>, CachedData>();
bool exists = myCache.ContainsKey(ValueTuple.Create(param1, param2, param3));

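C# 7.0 also includes syntactic sugar that makes this code easier to write (but you don't need to wait for C# 7.0 to start using ValueTuple without the sugar):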

var myCache = new ConcurrentDictionary<(string, string, int), CachedData>();
bool exists = myCache.ContainsKey((param1, param2, param3));
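For the "custom struct" alternative mentioned above, a minimal sketch might look like the following. CarKeyStruct and StructKeyDemo are hypothetical names, the hash combination simply reuses the multiply-by-397 pattern that appears elsewhere in this thread, and readonly struct assumes C# 7.2 or later (drop the readonly modifier on older compilers).

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical value-type key: no heap allocation for the key itself per lookup.
public readonly struct CarKeyStruct : IEquatable<CarKeyStruct>
{
    public CarKeyStruct(string carModel, string engineType, int year)
    {
        CarModel = carModel;
        EngineType = engineType;
        Year = year;
    }

    public string CarModel { get; }
    public string EngineType { get; }
    public int Year { get; }

    // Implementing IEquatable<T> lets the default comparer avoid boxing.
    public bool Equals(CarKeyStruct other) =>
        string.Equals(CarModel, other.CarModel) &&
        string.Equals(EngineType, other.EngineType) &&
        Year == other.Year;

    public override bool Equals(object obj) => obj is CarKeyStruct other && Equals(other);

    public override int GetHashCode()
    {
        unchecked // overflow simply wraps
        {
            var hash = CarModel?.GetHashCode() ?? 0;
            hash = (hash * 397) ^ (EngineType?.GetHashCode() ?? 0);
            hash = (hash * 397) ^ Year;
            return hash;
        }
    }
}

public static class StructKeyDemo
{
    // Adds an entry and finds it again with an equal-valued key.
    public static bool RoundTrip()
    {
        var cache = new ConcurrentDictionary<CarKeyStruct, string>();
        cache.TryAdd(new CarKeyStruct("Golf", "Diesel", 2016), "data");
        return cache.ContainsKey(new CarKeyStruct("Golf", "Diesel", 2016));
    }
}
```

Compared with ValueTuple, this keeps named members in the key type at the cost of writing the equality code by hand.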

Answer 4

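Implement a custom key class and make sure it is suitable for this kind of use case, i.e. implement IEquatable and make the class immutable: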

public class CacheKey : IEquatable<CacheKey>
{
    public CacheKey(string param1, string param2, int param3)
    {
        Param1 = param1;
        Param2 = param2;
        Param3 = param3;
    }

    public string Param1 { get; }

    public string Param2 { get; }

    public int Param3 { get; }

    public bool Equals(CacheKey other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        return string.Equals(Param1, other.Param1) && string.Equals(Param2, other.Param2) && Param3 == other.Param3;
    }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(null, obj)) return false;
        if (ReferenceEquals(this, obj)) return true;
        if (obj.GetType() != GetType()) return false;
        return Equals((CacheKey)obj);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            var hashCode = Param1?.GetHashCode() ?? 0;
            hashCode = (hashCode * 397) ^ (Param2?.GetHashCode() ?? 0);
            hashCode = (hashCode * 397) ^ Param3;
            return hashCode;
        }
    }
}

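This is a GetHashCode() implementation the way ReSharper generates it. It is a good general-purpose implementation. Adapt it as needed.

Alternatively, use something like Equ (I am the creator of that library), which automatically generates the Equals and GetHashCode implementations. This makes sure that these methods always include all members of the CacheKey class, so the code becomes easier to maintain. Such an implementation then simply looks like this: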

public class CacheKey : MemberwiseEquatable<CacheKey>
{
    public CacheKey(string param1, string param2, int param3)
    {
        Param1 = param1;
        Param2 = param2;
        Param3 = param3;
    }

    public string Param1 { get; }

    public string Param2 { get; }

    public int Param3 { get; }
}

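Note: you should obviously use meaningful property names, otherwise introducing a custom class doesn't provide much of an advantage over using a Tuple.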

Answer 5

I wanted to compare the Tuple, class, and "id_id_id" approaches described in the other comments. I used this simple code:

using System;
using System.Collections.Generic;
using System.Diagnostics;

public class Key : IEquatable<Key>
{
    public string Param1 { get; set; }
    public string Param2 { get; set; }
    public int Param3 { get; set; }

    public bool Equals(Key other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        return string.Equals(Param1, other.Param1) && string.Equals(Param2, other.Param2) && Param3 == other.Param3;
    }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(null, obj)) return false;
        if (ReferenceEquals(this, obj)) return true;
        if (obj.GetType() != this.GetType()) return false;
        return Equals((Key) obj);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            var hashCode = (Param1 != null ? Param1.GetHashCode() : 0);
            hashCode = (hashCode * 397) ^ (Param2 != null ? Param2.GetHashCode() : 0);
            hashCode = (hashCode * 397) ^ Param3;
            return hashCode;
        }
    }
}

static class Program
{

    static void TestClass()
    {
        var stopwatch = new Stopwatch();
        stopwatch.Start();
        var classDictionary = new Dictionary<Key, string>();

        for (var i = 0; i < 10000000; i++)
        {
            classDictionary.Add(new Key { Param1 = i.ToString(), Param2 = i.ToString(), Param3 = i }, i.ToString());
        }
        stopwatch.Stop();
        Console.WriteLine($"initialization: {stopwatch.Elapsed}");

        stopwatch.Restart();

        for (var i = 0; i < 10000000; i++)
        {
            var s = classDictionary[new Key { Param1 = i.ToString(), Param2 = i.ToString(), Param3 = i }];
        }

        stopwatch.Stop();
        Console.WriteLine($"Retrieving: {stopwatch.Elapsed}");
    }

    static void TestTuple()
    {
        var stopwatch = new Stopwatch();
        stopwatch.Start();
        var tupleDictionary = new Dictionary<Tuple<string, string, int>, string>();

        for (var i = 0; i < 10000000; i++)
        {
            tupleDictionary.Add(new Tuple<string, string, int>(i.ToString(), i.ToString(), i), i.ToString());
        }
        stopwatch.Stop();
        Console.WriteLine($"initialization: {stopwatch.Elapsed}");

        stopwatch.Restart();

        for (var i = 0; i < 10000000; i++)
        {
            var s = tupleDictionary[new Tuple<string, string, int>(i.ToString(), i.ToString(), i)];
        }

        stopwatch.Stop();
        Console.WriteLine($"Retrieving: {stopwatch.Elapsed}");
    }

    static void TestFlat()
    {
        var stopwatch = new Stopwatch();
        stopwatch.Start();
        var tupleDictionary = new Dictionary<string, string>();

        for (var i = 0; i < 10000000; i++)
        {
            tupleDictionary.Add($"{i}_{i}_{i}", i.ToString());
        }
        stopwatch.Stop();
        Console.WriteLine($"initialization: {stopwatch.Elapsed}");

        stopwatch.Restart();

        for (var i = 0; i < 10000000; i++)
        {
            var s = tupleDictionary[$"{i}_{i}_{i}"];
        }

        stopwatch.Stop();
        Console.WriteLine($"Retrieving: {stopwatch.Elapsed}");
    }

    static void Main()
    {
        TestClass();
        TestTuple();
        TestFlat();
    }
}

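Results:

I ran each method 3 times in Release mode without the debugger, commenting out the calls to the other methods on each run. I took the average of the 3 runs, but there wasn't much variance anyway.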

TestTuple：

initialization: 00:00:14.2512736
Retrieving: 00:00:08.1912167

TestClass:

initialization: 00:00:11.5091160
Retrieving: 00:00:05.5127963

TestFlat:

initialization: 00:00:16.3672901
Retrieving: 00:00:08.6512009

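I was surprised to see that the class approach was faster than both the tuple approach and the string approach. In my opinion it is also more readable and more future-proof, in the sense that more functionality can be added to the Key class (assuming it is not just a key but actually represents something).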

Answer 6

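IMHO, in cases like this I prefer to use some intermediate structure (in your case that would be Tuple). This approach creates an additional layer between the parameters and the target dictionary. Of course, it depends on the purpose. For example, it allows you to apply a non-trivial transformation to the parameters (the container may "distort" the data, for instance).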

Answer 7

I ran Tomer's test cases and added ValueTuple (the new C# value type) as an additional test case. I was impressed by how well it performed.

TestClass
initialization: 00:00:11.8787245
Retrieving: 00:00:06.3609475

TestTuple
initialization: 00:00:14.6531189
Retrieving: 00:00:08.5906265

TestValueTuple
initialization: 00:00:10.8491263
Retrieving: 00:00:06.6928401

TestFlat
initialization: 00:00:16.6559780
Retrieving: 00:00:08.5257845

The test code is as follows:

static void TestValueTuple(int n = 10000000)
{
    var stopwatch = new Stopwatch();
    stopwatch.Start();
    var tupleDictionary = new Dictionary<(string, string, int), string>();

    for (var i = 0; i < n; i++)
    {
        tupleDictionary.Add((i.ToString(), i.ToString(), i), i.ToString());
    }
    stopwatch.Stop();
    Console.WriteLine($"initialization: {stopwatch.Elapsed}");

    stopwatch.Restart();

    for (var i = 0; i < n; i++)
    {
        var s = tupleDictionary[(i.ToString(), i.ToString(), i)];
    }

    stopwatch.Stop();
    Console.WriteLine($"Retrieving: {stopwatch.Elapsed}");
}
