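Hi everyone, I have just started learning how to handle CSV files in C++, and currently this code works. It can print out the "math" column.
But that is only if I assign every column with getline(ss, #any column variable#, ',') and then print the column I want. If I use this for a big list, say a CSV file with around 100 columns, how can I simplify it? Or is there any way to get only a specific column without assigning/parsing each column into its own variable? Say, out of 100 columns, I only want column 47, whatever its name may be. Or could I get the column by name?
Thanks.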
Here is a quick [working] example. The part after fin.close() lets you choose what to print (or whatever processing you want to do).
#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
#include <vector>
using namespace std;

int main(int argc, char** argv)
{
    ifstream fin("filename");
    string line;
    int rowCount=0;
    int rowIdx=0; //keep track of inserted rows

    //count the total nb of lines in your file
    while(getline(fin,line)){
        rowCount++;
    }

    //this will be your table. A row is represented by data[row_number].
    //If you want to access the name of column #47, you would
    //cout << data[0][46]. 0 being the first row (assuming headers)
    //and 46 being the 47th column.
    //But first you have to input the data. See below.
    vector<vector<string> > data(rowCount); //a vector of rows (variable-length arrays are not standard C++)

    fin.clear(); //clear eofbit/failbit (ie: continue using fin.)
    fin.seekg(0, ios::beg); //rewind stream to start

    while(getline(fin,line)) //for every line in input file
    {
        stringstream ss(line); //copy line to stringstream
        string value;
        while(getline(ss,value,',')){ //for every value in that stream (ie: every cell on that row)
            data[rowIdx].push_back(value); //add that value at the end of the current row in our table
        }
        rowIdx++; //increment row number before reading in next line
    }
    fin.close();

    //Now you can choose to access the data however you like.
    //If you want to printout only column 47...
    int colNum=46; //zero-based index of the column you want to printout (46 is the 47th column)
    for(int row=0; row<rowCount; row++)
    {
        if(colNum < static_cast<int>(data[row].size())) //skip rows that are too short
            cout << data[row][colNum] << "\t"; //print every value in column 47 only
    }
    cout << endl;
    return 0;
}
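Edit: adding this for a more complete answer.
To search for a column by name, replace everything from the last for loop to the end of main with this snippet (it also needs #include <algorithm> and #include <iterator> at the top):
    //if you want to look up a column by name, instead of by column number...
    //Use find on the header row to get its column number.
    //Then you can printout just that column.
    string colName = "computer science";
    if(rowCount == 0) //no header row to search
    {
        cerr << "empty file" << endl;
        return 1;
    }
    //1. Find the index of column name "computer science" on the first row, using iterator
    vector<string>::iterator it = find(data[0].begin(), data[0].end(), colName);
    //note: if "it == data[0].end()", it means that column name was not found
    if(it == data[0].end())
    {
        cerr << "column not found: " << colName << endl;
        return 1;
    }
    //calculate its index (ie: column number integer); colNum is already declared above
    colNum = std::distance(data[0].begin(), it);
    //2. Print the column with the header "computer science"
    for(int row=0; row<rowCount; row++)
    {
        if(colNum < static_cast<int>(data[row].size())) //skip rows that are too short
            cout << data[row][colNum] << "\t"; //print every value in that column only
    }
    cout << endl;
    return 0;
}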
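"Or is there any way to get only a specific column without assigning/parsing each column into its own variable?"
Avoiding reading every column is not really practical with the CSV format, so what you actually want to do is simply discard the columns you do not need, much like you are already doing.
To make this work with an unknown number of columns, you can read into a std::vector, which is basically a dynamically sized array and therefore very useful in cases like this.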
std::vector<std::string> read_csv_line(const std::string &line)
{
    std::vector<std::string> ret;
    std::string val;
    std::stringstream ss(line);
    while (std::getline(ss, val, ','))
        ret.push_back(std::move(val));
    return ret;
}

...
std::getline(is, line);
auto row = read_csv_line(line);
if (row.size() > 10) // Check each row is expected size!
    std::cout << row[0] << ", " << row[10] << std::endl;
else
    std::cerr << "Row too short" << std::endl;
You can then access the specific columns you want.
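As a rough sketch of the "discard what you do not need" idea, the helper below pulls a single column out of a stream and keeps only that column's values. The name extract_column and its parameters are made up for illustration, and it assumes plain comma-separated fields with no quoting.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): collect the values of a
// single zero-based column from every line of `in`.
// Assumes plain comma-separated fields with no quoted values.
std::vector<std::string> extract_column(std::istream &in, std::size_t col) {
    std::vector<std::string> out;
    std::string line;
    while (std::getline(in, line)) {
        std::stringstream ss(line);
        std::string val;
        std::size_t idx = 0;
        bool found = false;
        // Read cells left to right, discarding them until we reach `col`.
        while (std::getline(ss, val, ',')) {
            if (idx == col) {
                found = true;
                break; // the rest of the line is not needed
            }
            ++idx;
        }
        // Rows that are too short contribute an empty string.
        out.push_back(found ? val : std::string());
    }
    return out;
}
```

Every line still has to be scanned up to the wanted cell, but nothing else is stored. For the 47th column you would pass 46, for example extract_column(fin, 46) on an open std::ifstream.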
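"Or could I get the column by name?"
Assuming your CSV file has a header line, you can read it into, say, a std::unordered_map<std::string, size_t>, where the value is the column index. Alternatively, something like a std::vector together with std::find would also work.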
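One possible way to build such a map from the header line is sketched below. The function name make_header_index is an assumption made for this example, and it again assumes unquoted, comma-separated header names.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical helper: map each header name to its zero-based column index.
// Assumes unquoted, comma-separated header names; if a name repeats,
// the first occurrence wins.
std::unordered_map<std::string, std::size_t>
make_header_index(const std::string &header_line) {
    std::unordered_map<std::string, std::size_t> index;
    std::stringstream ss(header_line);
    std::string name;
    std::size_t col = 0;
    while (std::getline(ss, name, ',')) {
        index.emplace(name, col); // emplace does not overwrite existing keys
        ++col;
    }
    return index;
}
```

After auto index = make_header_index(header_line);, a call such as index.find("math") yields an iterator whose second member is the column position, or index.end() if there is no such header. That position can then be used on each row vector once the row's size has been checked.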
Note that a single std::getline call cannot handle quoted values, nor some other possible CSV features.
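For illustration only, here is a minimal sketch of how one line with double-quoted fields might be split by hand. The function name split_quoted_csv_line is hypothetical, and the quoting rules it follows (commas allowed inside quotes, a doubled quote standing for one literal quote, no line breaks inside a field) are assumptions about the input rather than a complete CSV parser.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split one CSV line, honouring double-quoted fields.
// Assumptions: fields are separated by commas, a doubled quote ("") inside a
// quoted field stands for one literal quote, and no field contains a newline.
std::vector<std::string> split_quoted_csv_line(const std::string &line) {
    std::vector<std::string> fields;
    std::string cur;
    bool in_quotes = false;
    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (in_quotes) {
            if (c == '"') {
                if (i + 1 < line.size() && line[i + 1] == '"') {
                    cur += '"'; // escaped quote inside a quoted field
                    ++i;
                } else {
                    in_quotes = false; // closing quote
                }
            } else {
                cur += c; // commas inside quotes are ordinary characters
            }
        } else if (c == '"') {
            in_quotes = true; // opening quote
        } else if (c == ',') {
            fields.push_back(cur); // end of the current field
            cur.clear();
        } else {
            cur += c;
        }
    }
    fields.push_back(cur); // last field (a trailing comma yields an empty one)
    return fields;
}
```

Unlike the std::getline loop, this version reports a trailing empty field, so an empty line produces one empty field rather than none. For anything beyond simple files, a dedicated CSV library is probably the safer choice.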