C# - Improving performance when searching

Question  Votes: 1  Answers: 3

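I have a list of 15,000,000 usernames in a txt file. I wrote a method that creates a brain wallet from each username and checks whether the resulting address appears in a list of 600 addresses. It looks roughly like this: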

private static List<string> userList = new List<string>(File.ReadAllLines(@"C:\Users\Erik\Desktop\InfernoUser-workspace-db.txt"));
private static List<string> enterpriseUserList = new List<string>(File.ReadAllLines(@"C:\Users\Erik\Desktop\InfernoEnterpriseUser-local-db.txt"));

foreach (var i in userList)
{
    var userid = ToAddress(i);
    if (enterpriseUserList.Contains(userid))
        Console.WriteLine("{0} {1}", i, userid);
}

private static string ToAddress(string username)
{
    // The SHA-256 hash of the username is used as the private key; the address comes from its public key.
    string bitcoinAddress = BitcoinAddress.GetBitcoinAdressEncodedStringFromPublicKey(new PrivateKey(Globals.ProdDumpKeyVersion, new SHA256Managed().ComputeHash(UTF8Encoding.UTF8.GetBytes(username), 0, UTF8Encoding.UTF8.GetBytes(username).Length), false).PublicKey);
    return bitcoinAddress;
}

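The ToAddress method hashes the username with SHA-256, uses the hash as a private key, gets its public key, and converts that into an address like this: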

15hDBtLpQfcbrrAFupWjgN5ieHeEBd8mbu

This code is clumsy and slow: it processes about 200 lines per second. So I tried to improve it with multithreading:

private static void CheckAddress(string username)
{
    var userid = ToAddress(username);
    if (enterpriseUserList.Contains(userid))
    {
        Console.WriteLine("{0} {1}", username, userid);
    }
}

private static void RunParallel()
{
    List<string> items = new List<string>(File.ReadLines(@"C:\Users\Erik\Desktop\InfernoUser-workspace-db.txt"));
    ParallelOptions check = new ParallelOptions() { MaxDegreeOfParallelism = 100 };
    Parallel.ForEach<string>(items, check, line =>
    {
        CheckAddress(line);
    });
}

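It didn't help much. Can anyone suggest how to improve it? By comparison, vanitygen running on a CPU can process 400-500k addresses per second. How can there be such a big difference?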

c# multithreading performance cryptography text-files
3 Answers
1
vote

You could try a Dictionary keyed by the address (userid), so that you don't search the list on every iteration:

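var dict = new ConcurrentDictionary<string, string>(100, userList.Count);

// Map every derived address back to the username that produced it.
userList.AsParallel().ForAll(item =>
{
    dict.AddOrUpdate(ToAddress(item), item, (key, value) => value);
});

// Look up each known address; a single TryGetValue avoids a second lookup.
enterpriseUserList.AsParallel().ForAll(x =>
{
    if (dict.TryGetValue(x, out var username))
    {
        Console.WriteLine(username);
    }
});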

0
votes

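When hunting for inefficiencies, one of the main red flags is a repeated function call. You call GetBytes twice. Storing the result in a separate variable and calling it only once should help.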

private static string ToAddress(string username)
{
    // Encode the username once and reuse the byte array.
    var userNameAsBytes = UTF8Encoding.UTF8.GetBytes(username);
    return BitcoinAddress.GetBitcoinAdressEncodedStringFromPublicKey(new PrivateKey(Globals.ProdDumpKeyVersion, new SHA256Managed().ComputeHash(userNameAsBytes, 0, userNameAsBytes.Length), false).PublicKey);
}

0
votes

There are a few things you can do here:

  1. Change the List to a HashSet. This greatly speeds up the Contains operation, which I believe is the slowest part of the code: List.Contains scans the whole list, while HashSet.Contains is a hash lookup. Change private static List<string> enterpriseUserList = new List<string>(File.ReadAllLines(@"C:\Users\Erik\Desktop\InfernoEnterpriseUser-local-db.txt")); to private static HashSet<string> enterpriseUserList = new HashSet<string>(File.ReadAllLines(@"C:\Users\Erik\Desktop\InfernoEnterpriseUser-local-db.txt"));
  2. Don't use ParallelOptions check = new ParallelOptions() { MaxDegreeOfParallelism = 100 }; This setting increases context switching and reduces performance.
  3. Use Partitioner.Create to optimize Parallel.ForEach.

That is probably all I can suggest.

private static List<string> userList = new List<string>(File.ReadAllLines(@"C:\Users\Erik\Desktop\InfernoUser-workspace-db.txt"));
private static HashSet<string> enterpriseUserList = new HashSet<string>(File.ReadAllLines(@"C:\Users\Erik\Desktop\InfernoEnterpriseUser-local-db.txt"));

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static void CheckAddress(int id, string username)
{
    var userid = ToAddress(username);
    if (enterpriseUserList.Contains(userid))
    {
        // todo
    }
}

private static void RunParallel()
{
    // Split the index range into chunks so that each worker processes a contiguous block.
    var ranges = Partitioner.Create(0, userList.Count);
    Parallel.ForEach(ranges, range =>
    {
        for (int i = range.Item1; i < range.Item2; i++)
        {
            CheckAddress(i, userList[i]);
        }
    });
}
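
For anyone who wants to try the HashSet plus Partitioner.Create idea without the Bitcoin library, here is a minimal self-contained sketch. The names AddressSearch, ToAddressStandIn and FindMatches are made up for illustration, and ToAddressStandIn is only a hex-encoded SHA-256 of the username, not a real address derivation, so it says nothing about how expensive the real ToAddress is.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;
using System.Threading.Tasks;

public static class AddressSearch
{
    // Hypothetical stand-in for ToAddress: hex-encoded SHA-256 of the username.
    // This is NOT a Bitcoin address derivation; it only keeps the sketch runnable.
    public static string ToAddressStandIn(string username)
    {
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(username));
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }

    // Returns the usernames whose derived address is in the target set.
    public static List<string> FindMatches(IReadOnlyList<string> users, HashSet<string> targets)
    {
        // Partitioner.Create(0, 0) throws, so handle an empty input up front.
        if (users.Count == 0)
            return new List<string>();

        var matches = new ConcurrentBag<string>();

        // Range partitioning hands each worker a contiguous block of indices.
        Parallel.ForEach(Partitioner.Create(0, users.Count), range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                // Assumption: read-only lookups from several threads are fine
                // as long as nothing modifies the set while the loop runs.
                if (targets.Contains(ToAddressStandIn(users[i])))
                    matches.Add(users[i]);
            }
        });

        return new List<string>(matches);
    }

    public static void Main()
    {
        var users = new List<string> { "alice", "bob", "carol" };
        var targets = new HashSet<string> { ToAddressStandIn("bob") };

        foreach (string match in FindMatches(users, targets))
            Console.WriteLine(match);
    }
}
```

Running Main prints bob. No MaxDegreeOfParallelism is set here, so the runtime picks the degree of parallelism, which is in line with point 2 above.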