How to optimize reading large ASC files in C#

Question (votes: 0, answers: 1)

I am working on a project that needs to read and process large ASC files containing elevation data. The ASC files can be very large, and I want to make the file-reading process as efficient as possible without resorting to multithreading or external libraries. I have implemented a C# method that reads the file and processes it into a data structure.

The ASC file looks like this:

ncols 3000
nrows 3000
xllcorner 374000.0
yllcorner 6684000.0
cellsize 2.0
NODATA_value -9999.0
40.767973117423324 49.96434379380793 40.36619928493712 49.23485075482036 43.4695751201485 46.58732779966212 42.405516361731415 42.30614094530102 43.33023022757775 45.533280055961164 43.10895840681881 49.993608914785426 45.936606729160516 46.79609520776553 40.33910078221831 49.3084348052535 49.71693824127023 41.367404468661 44.45054389735144 45.07209138599468 46.614278728530415 48.654384952754434 46.821256520465795 46.36333535870697 47.628871061474115 49.59762063268766 45.03513365419986 45.19707914748046 48.279679353977635 46.949860290186315 43.323259141683536 47.999728809362615 40.55398889076988 41.30879574660539 44.360995489781104 40.33021581004381 42.91178225386288 46.2458115183426 48.77776035585331 48.82510662377992 49.982963798125425 43.86178205444347 48.61824784520495
My current code:

using System;
using System.IO;
using System.Linq;

var ascStream = @"asc_example.asc";
int nCols = 0;
int nRows = 0;
float nodataValue = 0.0f;
float[,] data = null;

using (StreamReader reader = new StreamReader(ascStream))
{
    string line;
    int rowIndex = 0;
    while (!reader.EndOfStream && (line = reader.ReadLine()) != null)
    {
        string[] parts = line.Split(' ', StringSplitOptions.RemoveEmptyEntries);

        if (parts.Length == 2)
        {
            string key = parts[0].ToLower();
            if (key == "ncols")
                nCols = int.Parse(parts[1]);
            else if (key == "nrows")
                nRows = int.Parse(parts[1]);
            else if (key == "nodata_value")
            {
                float.TryParse(parts[1], out nodataValue);
                // Allocate the grid once the header (including nrows/ncols) has been read.
                data = new float[nRows, nCols];
            }
        }
        else if (parts.Length >= nCols && data != null)
        {
            for (int colIndex = 0; colIndex < nCols; colIndex++)
            {
                if (float.TryParse(parts[colIndex], out float value))
                {
                    data[rowIndex, colIndex] = value;
                    if ((rowIndex == 0 && colIndex == 1) || (rowIndex == 100 && colIndex == 100) ||
                        (rowIndex == 200 && colIndex == 500) || (rowIndex == 500 && colIndex == 200))
                    {
                        Console.WriteLine($"data[{rowIndex}, {colIndex}]: {value}");
                    }
                }
                else
                {
                    data[rowIndex, colIndex] = nodataValue;
                }
            }
            rowIndex++;
        }
    }
}

var minValue = data?.Cast<float>().Min();
var maxValue = data?.Cast<float>().Max();

// Print the values to verify they were correctly assigned
Console.WriteLine("ncols: " + ncols);
Console.WriteLine("nrows: " + nrows);
Console.WriteLine($"Max: {maxValue}");
Console.WriteLine($"min: {minValue}");

Example output:

I am concerned about the code's performance on large files, since the files may grow even larger. I would like to know whether there is a better way to parse the values than this part:

float.TryParse(parts[colIndex], out float value)
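
For reference, one direction I have been sketching (only a sketch: it assumes a recent .NET runtime where float.TryParse accepts a ReadOnlySpan&lt;char&gt;, and the helper name ParseRow is mine) is to slice each line as a span instead of calling Split, so no string[] is allocated per row:

// Sketch only: parse one data row by slicing the line as a span,
// avoiding the per-line string[] allocation from Split.
// Assumes a recent .NET (float.TryParse has ReadOnlySpan<char> overloads)
// and invariant-culture numbers, matching the sample file.
using System;
using System.Globalization;

static void ParseRow(string line, float[,] data, int rowIndex, int nCols, float nodataValue)
{
    ReadOnlySpan<char> rest = line.AsSpan();
    int colIndex = 0;

    while (colIndex < nCols && !rest.IsEmpty)
    {
        rest = rest.TrimStart();                         // skip separating whitespace
        int end = rest.IndexOf(' ');                     // find the end of the next token
        ReadOnlySpan<char> token = end < 0 ? rest : rest.Slice(0, end);
        rest = end < 0 ? ReadOnlySpan<char>.Empty : rest.Slice(end + 1);

        if (token.IsEmpty)
            continue;                                    // trailing spaces at the end of the line

        data[rowIndex, colIndex] = float.TryParse(token, NumberStyles.Float,
            CultureInfo.InvariantCulture, out float value) ? value : nodataValue;
        colIndex++;
    }
}

The idea is the same TryParse per token, just without the intermediate strings, so whether it actually pays off would have to be measured.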

Tags: c#, performance, optimization, streamreader
1 Answer

0 votes

What can really be improved is moving this

if (parts.Length == 2)

check out of the loop. That is, first loop over the header parameters, then loop over the values. This is the real optimization, but since the if statement is cheap and the number of line iterations is small (5000), it will not save much time.

Following this approach, you also avoid the

data != null

check on every loop iteration.
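
A minimal sketch of that restructuring (variable names and the fixed count of six header lines are assumptions based on the sample file shown in the question, not something the original code guarantees):

// Sketch: read the fixed header lines first, then loop over the data rows,
// so parts.Length == 2 and data != null are never checked in the hot path.
using System;
using System.IO;

int nCols = 0, nRows = 0;
float nodataValue = 0.0f;

using (StreamReader reader = new StreamReader("asc_example.asc"))
{
    // Header loop: the sample file starts with six "key value" lines.
    for (int i = 0; i < 6; i++)
    {
        string[] header = reader.ReadLine().Split(' ', StringSplitOptions.RemoveEmptyEntries);
        switch (header[0].ToLower())
        {
            case "ncols":        nCols = int.Parse(header[1]); break;
            case "nrows":        nRows = int.Parse(header[1]); break;
            case "nodata_value": float.TryParse(header[1], out nodataValue); break;
        }
    }

    float[,] data = new float[nRows, nCols];

    // Value loop: one line per row, no header checks any more.
    for (int rowIndex = 0; rowIndex < nRows; rowIndex++)
    {
        string[] parts = reader.ReadLine().Split(' ', StringSplitOptions.RemoveEmptyEntries);
        for (int colIndex = 0; colIndex < nCols; colIndex++)
        {
            data[rowIndex, colIndex] = float.TryParse(parts[colIndex], out float value)
                ? value
                : nodataValue;
        }
    }
}

With the header handled up front, the inner loop only does the Split and TryParse work.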
