How can I read and manipulate CSV file data in C++?
More information would be useful.
But the simplest form is:
#include <iostream>
#include <sstream>
#include <fstream>
#include <string>
int main()
{
    std::ifstream data("plop.csv");
    std::string line;
    while(std::getline(data,line))
    {
        std::stringstream lineStream(line);
        std::string cell;
        while(std::getline(lineStream,cell,','))
        {
            // You have a cell!!!!
        }
    }
}
Also see this question: CSV parser in C++
If what you're really doing is manipulating a CSV file itself, Nelson's answer makes sense. However, my suspicion is that the CSV is simply an artifact of the problem you're solving. In C++, that probably means you have something like this as your data model:
struct Customer {
    int id;
    std::string first_name;
    std::string last_name;
    struct {
        std::string street;
        std::string unit;
    } address;
    char state[2];
    int zip;
};
Thus, when you're working with a collection of data, it makes sense to have std::vector<Customer> or std::set<Customer>.
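One detail worth noting: std::set<Customer> needs an ordering, because Customer has no operator< of its own. A minimal sketch, assuming customers are ordered by id; the struct is trimmed to three fields for brevity, and CustomerById and CustomerSet are hypothetical names:

```cpp
#include <set>
#include <string>

// Trimmed-down version of the Customer struct above, for illustration only.
struct Customer {
    int id;
    std::string first_name;
    std::string last_name;
};

// Hypothetical comparator: orders customers by id so std::set can store them.
// Two customers with the same id are treated as the same element.
struct CustomerById {
    bool operator()(const Customer &a, const Customer &b) const {
        return a.id < b.id;
    }
};

typedef std::set<Customer, CustomerById> CustomerSet;
```

With std::vector<Customer> no comparator is needed, which is one reason it is usually the simpler default.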
With that in mind, think of your CSV handling as two operations:
// if you wanted to go nuts, you could use a forward iterator concept for both of these
class CSVReader {
public:
    CSVReader(const std::string &inputFile);
    bool hasNextLine();
    void readNextLine(std::vector<std::string> &fields);
private:
    /* secrets */
};
class CSVWriter {
public:
    CSVWriter(const std::string &outputFile);
    void writeNextLine(const std::vector<std::string> &fields);
private:
    /* more secrets */
};
void readCustomers(CSVReader &reader, std::vector<Customer> &customers);
void writeCustomers(CSVWriter &writer, const std::vector<Customer> &customers);
Read and write a single row at a time, rather than keeping a complete in-memory representation of the file itself. There are a few obvious benefits:
- Your data is represented in a form that makes sense for your problem (customers), rather than the current solution (CSV files).
- You can trivially add adapters for other data formats, such as bulk SQL import/export, Excel/OO spreadsheet files, or even an HTML <table> rendering.
- Your memory footprint is likely to be smaller (depends on relative sizeof(Customer) vs. the number of bytes in a single row).
- CSVReader and CSVWriter can be reused as the basis for an in-memory model (such as Nelson's) without loss of performance or functionality. The converse is not true.
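One way the "secrets" might be filled in is sketched below. This is only an illustration: it assumes plain comma-separated fields with no quoting, the Customer struct is trimmed to three fields, and the column order (id, first_name, last_name) is an assumption.

```cpp
#include <cstddef>
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

class CSVReader {
public:
    explicit CSVReader(const std::string &inputFile)
        : in_(inputFile.c_str()), buffered_(false) {}

    // True if another line is available; the line is buffered for readNextLine.
    bool hasNextLine() {
        if (!buffered_) buffered_ = !std::getline(in_, line_).fail();
        return buffered_;
    }

    // Splits the next line on commas. No quote handling; a trailing empty
    // field is dropped by this simple getline-based split.
    void readNextLine(std::vector<std::string> &fields) {
        fields.clear();
        if (!hasNextLine()) return;
        buffered_ = false;
        std::stringstream lineStream(line_);
        std::string cell;
        while (std::getline(lineStream, cell, ',')) fields.push_back(cell);
    }

private:
    std::ifstream in_;
    std::string line_;
    bool buffered_;
};

class CSVWriter {
public:
    explicit CSVWriter(const std::string &outputFile) : out_(outputFile.c_str()) {}

    // Writes the fields joined by commas, followed by a newline.
    void writeNextLine(const std::vector<std::string> &fields) {
        for (std::size_t i = 0; i < fields.size(); ++i) {
            if (i) out_ << ',';
            out_ << fields[i];
        }
        out_ << '\n';
    }

private:
    std::ofstream out_;
};

// Trimmed-down Customer, for illustration only.
struct Customer {
    int id;
    std::string first_name;
    std::string last_name;
};

// Assumes columns are id, first_name, last_name; rows with fewer are skipped.
void readCustomers(CSVReader &reader, std::vector<Customer> &customers) {
    std::vector<std::string> fields;
    while (reader.hasNextLine()) {
        reader.readNextLine(fields);
        if (fields.size() < 3) continue;
        Customer c;
        c.id = std::atoi(fields[0].c_str());
        c.first_name = fields[1];
        c.last_name = fields[2];
        customers.push_back(c);
    }
}
```

Only readCustomers knows what a Customer is; the reader and writer deal purely in rows of strings, which is what lets them be reused for other record types.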
I've worked with a lot of CSV files in my time. I'd like to add the advice:
1 - Depending on the source (Excel, etc), commas or tabs may be embedded in a field. Usually, the rule is that they will be 'protected' because the field will be double-quote delimited, as in "Boston, MA 02346".
2 - Some sources will not double-quote delimit all text fields. Other sources will. Others will delimit all fields, even numerics.
3 - Fields containing double-quotes usually get the embedded double quotes doubled up (and the field itself delimited with double quotes), as in "George ""Babe"" Ruth".
4 - Some sources will embed CR/LFs (Excel is one of these!). Sometimes it'll be just a CR. The field will usually be double-quote delimited, but this situation is very difficult to handle.
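Points 1 to 3 can be handled with a small state machine over a single record. The sketch below is one possible approach, not a complete CSV parser: parseCsvLine and quoteCsvField are hypothetical names, and embedded line breaks (point 4) are not handled when reading, since the function sees one line at a time.

```cpp
#include <string>
#include <vector>

// Splits one CSV record into fields, honoring double-quoted fields and
// doubled quotes ("") inside them. Assumes no embedded line breaks.
std::vector<std::string> parseCsvLine(const std::string &line) {
    std::vector<std::string> fields;
    std::string field;
    bool inQuotes = false;
    for (std::string::size_type i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (inQuotes) {
            if (c == '"') {
                if (i + 1 < line.size() && line[i + 1] == '"') {
                    field += '"';      // doubled quote -> one literal quote
                    ++i;
                } else {
                    inQuotes = false;  // closing quote
                }
            } else {
                field += c;            // commas are ordinary text in quotes
            }
        } else if (c == '"') {
            inQuotes = true;           // opening quote
        } else if (c == ',') {
            fields.push_back(field);   // field separator
            field.clear();
        } else {
            field += c;
        }
    }
    fields.push_back(field);           // last field (may be empty)
    return fields;
}

// Inverse direction: quotes a field only if it contains a comma, a quote,
// CR or LF, doubling any embedded quotes.
std::string quoteCsvField(const std::string &field) {
    if (field.find_first_of(",\"\r\n") == std::string::npos) return field;
    std::string out = "\"";
    for (std::string::size_type i = 0; i < field.size(); ++i) {
        if (field[i] == '"') out += '"';
        out += field[i];
    }
    out += '"';
    return out;
}
```

Handling point 4 would mean reading characters from the stream rather than splitting on lines first, so that a line break inside quotes stays part of the field.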
This is a good exercise for yourself to work on :)
You should break your library into three parts:
- Loading the CSV file
- Representing the file in memory so that you can modify it and read it
- Saving the CSV file back to disk
So you are looking at writing a CSVDocument class that contains:
- Load(const char* file);
- Save(const char* file);
- GetBody
So that you may use your library like this:
CSVDocument doc;
doc.Load("file.csv");
CSVDocumentBody* body = doc.GetBody();
CSVDocumentRow* header = body->GetRow(0);
for (int i = 0; i < header->GetFieldCount(); i++)
{
    CSVDocumentField* col = header->GetField(i);
    cout << col->GetText() << "\t";
}
cout << "\n";
for (int i = 1; i < body->GetRowCount(); i++) // i = 1 so we skip the header
{
    CSVDocumentRow* row = body->GetRow(i);
    for (int p = 0; p < row->GetFieldCount(); p++)
    {
        cout << row->GetField(p)->GetText() << "\t";
    }
    cout << "\n";
}
// overwrite the first field of row 10 (assumes the file has at least 11 rows)
body->GetRow(10)->GetField(0)->SetText("hello world");
CSVDocumentRow* lastRow = body->AddRow();
lastRow->AddField()->SetText("Hey there");
lastRow->AddField()->SetText("Hey there column 2");
doc.Save("file.csv"); // doc is an object, not a pointer
Which gives us the following interfaces:
// forward declarations so the classes can refer to each other
class CSVDocumentBody;
class CSVDocumentRow;
class CSVDocumentField;
class CSVDocument
{
public:
    void Load(const char* file);
    void Save(const char* file);
    CSVDocumentBody* GetBody();
};
class CSVDocumentBody
{
public:
    int GetRowCount();
    CSVDocumentRow* GetRow(int index);
    CSVDocumentRow* AddRow();
};
class CSVDocumentRow
{
public:
    int GetFieldCount();
    CSVDocumentField* GetField(int index);
    CSVDocumentField* AddField(); // appends a field, as used in the example above
};
class CSVDocumentField
{
public:
    const char* GetText();
    void SetText(const char* text);
};
Now you just have to fill in the blanks from here :)
Believe me when I say this - investing your time into learning how to make libraries, especially those dealing with the loading, manipulation and saving of data, will not only remove your dependence on the existence of such libraries but will also make you an all-around better programmer.
:)
EDIT
I don't know how much you already know about string manipulation and parsing; so if you get stuck I would be happy to help.
Here is some code you can use. The data from the CSV is stored inside an array of rows. Each row is an array of strings. Hope this helps.
#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
#include <vector>
typedef std::string String;
typedef std::vector<String> CSVRow;
typedef CSVRow::const_iterator CSVRowCI;
typedef std::vector<CSVRow> CSVDatabase;
typedef CSVDatabase::const_iterator CSVDatabaseCI;
void readCSV(std::istream &input, CSVDatabase &db);
void display(const CSVRow&);
void display(const CSVDatabase&);
int main(){
    std::fstream file("file.csv", std::ios::in);
    if(!file.is_open()){
        std::cout << "File not found!\n";
        return 1;
    }
    CSVDatabase db;
    readCSV(file, db);
    display(db);
}
void readCSV(std::istream &input, CSVDatabase &db){
    String csvLine;
    // read every line from the stream
    while( std::getline(input, csvLine) ){
        std::istringstream csvStream(csvLine);
        CSVRow csvRow;
        String csvCol;
        // read every element from the line that is separated by commas
        // and put it into the vector of strings
        while( std::getline(csvStream, csvCol, ',') )
            csvRow.push_back(csvCol);
        db.push_back(csvRow);
    }
}
void display(const CSVRow& row){
    if(!row.size())
        return;
    CSVRowCI i=row.begin();
    std::cout<<*(i++);
    for(;i != row.end();++i)
        std::cout<<','<<*i;
}
void display(const CSVDatabase& db){
    if(!db.size())
        return;
    CSVDatabaseCI i=db.begin();
    for(; i != db.end(); ++i){
        display(*i);
        std::cout<<std::endl;
    }
}
Look at 'The Practice of Programming' (TPOP) by Kernighan & Pike. It includes an example of parsing CSV files in both C and C++. But it would be worth reading the book even if you don't use the code.
(Previous URL: http://cm.bell-labs.com/cm/cs/tpop/)
Using boost tokenizer to parse records, see here for more details.
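#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
#include <boost/tokenizer.hpp>
using namespace std;
using namespace boost;
int main()
{
    string data("data.csv"); // placeholder path to the input file
    ifstream in(data.c_str());
    if (!in.is_open()) return 1;
    // escaped_list_separator handles quoted fields and backslash escapes
    typedef tokenizer< escaped_list_separator<char> > Tokenizer;
    vector< string > vec;
    string line;
    while (getline(in,line))
    {
        Tokenizer tok(line);
        vec.assign(tok.begin(),tok.end());
        // do something with the record; here, skip records with fewer than 3 fields
        if (vec.size() < 3) continue;
        copy(vec.begin(), vec.end(),
             ostream_iterator<string>(cout, "|"));
        cout << "\n----------------------" << endl;
    }
}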
I found this interesting approach:
Quote: CSVtoC is a program that takes a CSV or comma-separated values file as input and dumps it as a C structure.
Naturally, you can't make changes to the CSV file, but if you just need in-memory read-only access to the data, it could work.