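I'm reading logs that are delimited by tabs, commas, or semicolons into a DataTable, but as soon as I added more than one delimiter I started getting an error (while reading the records, not the header):
System.ArgumentException: Input array is longer than the number of columns in this table.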
static DataTable ReadDataTable(string filePath)
{
    var reader = ReadAsLines(filePath);
    var dataTable = new DataTable();
    int i = 0;
    //this assumes the first record is filled with the column names
    var headers = reader.First().Split('\t', ',', ';');
    foreach (var header in headers)
        dataTable.Columns.Add(header);
    var records = reader.Skip(1);
    foreach (var record in records)
    {
        dataTable.Rows.Add(record.Split('\t', ',', ';')); //error here
        i++;
    }
    return dataTable;
}
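The first file I read is tab-delimited (each file uses a single delimiter, not a mix), and there is no problem until line 662, which looks the same as the lines around it: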
StartTime EndTime Day ElapsedHMS Status NewVersionString VersionFile SourceControlComment Label ErrorDescription AdditionalInformation
05/28/2015 17:51:38 340.2015.0528.139 \\view\Build_NightlyDeveloper\Retail\RetailVersion.cs 34000.2015.0528.139 - Automated version number update.
... that was the first line; the following are lines 661, 662, and 663
12/20/2015 18:30:23 342.2015.122.1 \\view\Build_Developer\R\RetailVersion.cs 342.2015.1220.81 - Automated version number update.
12/20/2015 18:30:32 342.2015.122.1 \\view\Build_Developer\R\Version.cs 34200.2015.1220.81 - Automated version number update.
12/20/2015 18:30:44 342.2015.122.1 \\view\Build_Developer\R\P\E\CommonCode\RetailVersion.h 342.2015.122.1 - Automated version number update.
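I don't know why the error says this line has more columns than the others, since the tabs appear to be the same. I'm not sure whether I need to specify only one delimiter, given that each table uses a single delimiter type. I can't store the delimiters in the XML file that I also read, so I'm not sure how to get the delimiter for each table.
I looked at these links, and they are not the same problem:
Converting from string to char for Split - I also tried double quotes, but it needs to be a char[]; I tried that and it didn't fix it
Input array is longer than the number of columns in this table - different table
I added a write line, and the failing line is the one in the file with the time 12/21/2015 17:31:16
I finally figured it out.
I followed these ideas
and came up with this (detect the delimiter in the first line/header and use it, instead of listing every possible delimiter when building the DataTable):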
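One plausible cause, offered as an assumption because the failing record is not shown: a field in that row (for example SourceControlComment) contains a comma or semicolon. Splitting on all three characters then yields more fields than the header has columns. The sketch below uses a made-up three-column header and record, and `SplitDemo` and `CountFields` are hypothetical names used only for illustration.

```csharp
using System;

static class SplitDemo
{
    // Hypothetical tab-delimited header and record (not taken from the real log).
    // The last field of the record contains a comma.
    public const string Header = "StartTime\tStatus\tSourceControlComment";
    public const string Record = "12/21/2015 17:31:16\tOK\tFixed parser, updated tests";

    // Number of fields produced when a line is split on the given delimiters.
    public static int CountFields(string line, params char[] delimiters)
    {
        return line.Split(delimiters).Length;
    }
}
```

Here `SplitDemo.CountFields(SplitDemo.Header, '\t', ',', ';')` is 3, but `SplitDemo.CountFields(SplitDemo.Record, '\t', ',', ';')` is 4 because the comma inside the comment starts an extra field, while `SplitDemo.CountFields(SplitDemo.Record, '\t')` is 3 again. Passing a 4-element array to `Rows.Add` on a 3-column table is what raises the ArgumentException.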
private static readonly char[] SeparatorChars = { ';', '\t', ',' };

public static char DetectSeparator(string filePath)
{
    string[] lines = File.ReadLines(filePath).Take(1).ToArray();
    return DetectSeparator(lines);
}

public static char DetectSeparator(string[] lines)
{
    var q = SeparatorChars.Select(sep => new
            { Separator = sep, Found = lines.GroupBy(line => line.Count(ch => ch == sep)) })
        .OrderByDescending(res => res.Found.Count(grp => grp.Key > 0))
        .ThenBy(res => res.Found.Count())
        .First();
    return q.Separator;
}
static DataTable ReadDataTable(string filePath)
{
    // Detect the one delimiter this file uses by looking at its header line.
    char delimiter = DetectSeparator(filePath);
    var reader = ReadAsLines(filePath);
    var dataTable = new DataTable();
    int i = 0;
    if (filePath.Contains("Parse"))
    {
        Console.WriteLine("here"); // leftover debugging hook
    }
    //this assumes the first record is filled with the column names
    var headers = reader.First().Split(delimiter); //split on the detected delimiter only
    foreach (var header in headers)
        dataTable.Columns.Add(header);
    var records = reader.Skip(1);
    foreach (var record in records)
    {
        dataTable.Rows.Add(record.Split(delimiter)); //split on the detected delimiter only
        //old approach, split on tab, comma, and semicolon at once:
        //var tmp = record.Split('\t', ',', ';');
        //dataTable.Rows.Add(tmp);
        i++;
    }
    return dataTable;
}
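If a row can still end up with more fields than the header (say, a stray delimiter inside a value), it may help to check the field count before calling `Rows.Add` and record the line number instead of throwing. The following is only a sketch under assumptions: `DelimitedTable`, `DetectFromHeader`, and `FromLines` are hypothetical names, it works on in-memory lines rather than on `ReadAsLines`, and the detection is a simplified stand-in (most frequent candidate in the header), not the LINQ query above.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

static class DelimitedTable
{
    private static readonly char[] Candidates = { ';', '\t', ',' };

    // Simplified detection (an assumption, not the query from the answer):
    // pick the candidate that occurs most often in the header line.
    public static char DetectFromHeader(string headerLine)
    {
        return Candidates
            .OrderByDescending(c => headerLine.Count(ch => ch == c))
            .First();
    }

    // Builds a DataTable from lines; the first line supplies the column names.
    // Rows with more fields than columns are skipped and their 1-based
    // line numbers are appended to badLines.
    public static DataTable FromLines(IEnumerable<string> lines, List<int> badLines)
    {
        var table = new DataTable();
        char delimiter = '\t';
        int lineNumber = 0;

        foreach (var line in lines)
        {
            lineNumber++;
            if (lineNumber == 1)
            {
                delimiter = DetectFromHeader(line);
                foreach (var header in line.Split(delimiter))
                    table.Columns.Add(header);
                continue;
            }

            string[] fields = line.Split(delimiter);
            if (fields.Length > table.Columns.Count)
            {
                badLines.Add(lineNumber); // would have thrown ArgumentException
                continue;
            }
            table.Rows.Add(fields);
        }
        return table;
    }
}
```

Calling `DelimitedTable.FromLines(File.ReadLines(path), bad)` and then printing `bad` gives the line numbers worth inspecting by hand. Whether to skip, truncate, or fail on such rows is a design choice; skipping is used here only to keep the sketch short.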