Creating a Simple Configuration File and Parser in C++

In general, it's easiest to parse such typical config files in two stages: first read the lines, and then parse those one by one.

In C++, lines can be read from a stream using std::getline(). While by default it will read up to the next '\n' (which it will consume, but not return), you can pass it some other delimiter, too, which makes it a good candidate for reading up-to-some-char, like = in your example.

For simplicity, the following presumes that the = is not surrounded by whitespace. If you want to allow whitespace at these positions, you will have to strategically place is_line >> std::ws before reading the value and remove trailing whitespace from the keys. However, IMO the little added flexibility in the syntax is not worth the hassle for a config file reader.
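If you do want to tolerate whitespace around the =, a minimal sketch could look like the following. It assumes C++17 (for std::optional), and the helper names rtrim and parse_line are made up for this illustration; they are not part of the original answer.

```cpp
#include <optional>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: remove trailing spaces, tabs and carriage returns from s.
std::string rtrim(std::string s)
{
    const auto last = s.find_last_not_of(" \t\r");
    s.erase(last == std::string::npos ? 0 : last + 1);
    return s;
}

// Hypothetical helper: parse "key = value", tolerating whitespace around '='.
// Returns std::nullopt when the line has no '=' or the key is empty.
std::optional<std::pair<std::string, std::string>> parse_line(const std::string& line)
{
    std::istringstream is_line(line);
    std::string key;
    // Skip leading whitespace, then read up to (and consume) the '='.
    // If getline hit the end of the line instead, there was no '='.
    if (!std::getline(is_line >> std::ws, key, '=') || is_line.eof())
        return std::nullopt;
    key = rtrim(key);
    if (key.empty())
        return std::nullopt;

    std::string value;
    // Skip whitespace after '='; value stays empty if nothing follows.
    std::getline(is_line >> std::ws, value);
    return std::make_pair(key, rtrim(value));
}
```

Whether an empty value (as in "key=") should be accepted is a design choice; this sketch accepts it and returns an empty string.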

#include <sstream>
#include <string>

const char config[] = "url=http://example.com\n"
                      "file=main.exe\n"
                      "true=0";

std::istringstream is_file(config);

std::string line;
while( std::getline(is_file, line) )
{
  // Split each line at the first '='
  std::istringstream is_line(line);
  std::string key;
  if( std::getline(is_line, key, '=') )
  {
    std::string value;
    if( std::getline(is_line, value) )
      store_line(key, value);  // store_line is your own function for saving the pair
  }
}

(Adding error handling is left as an exercise to the reader.)
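For what it's worth, one possible way to approach that exercise is sketched below. It keeps the same two-stage getline approach but remembers the numbers of the lines it had to reject. ParseResult and parse_config are hypothetical names chosen for this sketch, and treating an empty key or an empty value as an error is an assumption, not something the original answer prescribes.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type: parsed settings plus the numbers of rejected lines.
struct ParseResult {
    std::map<std::string, std::string> settings;
    std::vector<std::size_t> bad_lines;   // 1-based line numbers
};

// Hypothetical function: read key=value lines from in and collect errors.
ParseResult parse_config(std::istream& in)
{
    ParseResult result;
    std::string line;
    std::size_t line_no = 0;
    while (std::getline(in, line)) {
        ++line_no;
        if (line.empty())
            continue;                     // blank lines are not an error
        std::istringstream is_line(line);
        std::string key, value;
        // Accept a line only if it has a non-empty key, an '=',
        // and a non-empty value; the second getline fails otherwise.
        if (std::getline(is_line, key, '=') && !key.empty()
            && std::getline(is_line, value))
            result.settings[key] = value;
        else
            result.bad_lines.push_back(line_no);
    }
    return result;
}
```

The caller can then decide whether a non-empty bad_lines should abort the program or just produce a warning.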

What C lib to use when I need to parse a simple config file under Linux?

libconfig, but it does quite a bit more than what you're asking for.

A simple way to read TXT config files in C++

Also in C++ it is easy to split a line. I have already provided several answers here on SO on how to split a string. Anyway, I will explain it here in detail and for your special case. I also provide a full working example later.

We use the basic functionality of std::getline, which can read a complete line or the line up to a given character.

Let us take an example. If the text is stored in a std::string, we first put it into a std::istringstream. Then we can use std::getline to extract the data from the std::istringstream. That is always the standard approach: first, read the complete line from a file using std::getline; then put it into a std::istringstream to be able to extract the parts of the string, again with std::getline.

If a source line looks like this:

Time [s]:                            1

We can observe that we have several parts:

  • An identifier "Time [s]",
  • a colon, which acts as a separator,
  • one or more spaces and
  • the value "1"

So, we could write something like this:

std::string line{};  // Here we will store a complete line read from the source file
std::getline(configFileStream, line); // Read a complete line from the source file
std::istringstream iss{ line }; // Put line into an istringstream for further extraction

std::string id{}; // Here we will store the target value "id"
std::string value{}; // Here we will store the target "value"
std::getline(iss, id, ':'); // Read the ID, get rid of the colon
iss >> std::ws; // Skip all white spaces
std::getline(iss, value); // Finally read the value

So, that is a lot of text. You may have heard that you can chain IO operations, like in std::cout << a << b << c. This works because the << operation always returns a reference to the given stream. The same is true for std::getline, so we can use nested statements: we can put one std::getline call at the parameter position (the first parameter) where the other one expects a std::istream. If we follow this approach consistently, we can write the nested statement:

std::getline(std::getline(iss, id, ':') >> std::ws, value);

Oops, what's going on here? Let's analyze from the inside out. First, the operation std::getline(iss, id, ':') extracts a string from the std::istringstream and assigns it to the variable "id". OK, understood. Remember: std::getline returns a reference to the given stream. So the statement above reduces to

std::getline(iss >> std::ws, value)

Next, iss >> std::ws is evaluated and eats up all unnecessary white space. And guess what, it returns a reference to the given stream "iss".

The statement now looks like:

std::getline(iss, value)

And this will read the value. Simple.

But we are not finished yet. Of course, std::getline will again return "iss". And in the code below, you will see something like

if (std::getline(std::getline(iss, id, ':') >> std::ws, value))

which will end up as if (iss). So, we use iss as a boolean expression? Why does this work and what does it do? It works because the stream classes overload the bool conversion operator (explicit operator bool), which reports whether the stream state is OK or a failure has occurred. Always check the result of any IO operation.
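To see the boolean conversion in isolation, here is a small sketch. The function name split_line is made up for this illustration; it simply wraps the nested call from above and turns the final stream state into a return value.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: split "id: value" and report success via the stream state.
bool split_line(const std::string& line, std::string& id, std::string& value)
{
    std::istringstream iss{ line };
    // The nested call returns a reference to iss. operator bool is explicit,
    // so outside of an if condition we need a static_cast to use it.
    return static_cast<bool>(std::getline(std::getline(iss, id, ':') >> std::ws, value));
}
```

For a line without a colon, the first std::getline consumes the whole line, the stream reaches its end, and the second std::getline fails, so the function returns false.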

And last but not least, we need to explain the if statement with initializer, which is available since C++17.

I can write

if (std::string id{}, value{}; std::getline(std::getline(iss, id, ':') >> std::ws, value)) {

which is similar to

std::string id{}, value{}; 
if (std::getline(std::getline(iss, id, ':') >> std::ws, value)) {

But the first example has the advantage that the defined variables are only visible within the scope of the if statement. So, we scope the variables as narrowly as possible.

You should try to do that as often as possible. You should also always check the return state of an IO operation by applying if to a stream operation, as shown above.

The complete program for reading everything will then just be a few lines of code.

#include <iostream>
#include <sstream>
#include <fstream>
#include <string>
#include <unordered_map>
#include <iomanip>

int main() {

    // Open config file and check if it could be opened
    if (std::ifstream configFileStream{ "r:\\config.txt" }; configFileStream) {

        // Here we will store the resulting config data
        std::unordered_map<std::string, std::string> configData;

        // Read all lines of the source file
        for (std::string line{}; std::getline(configFileStream, line); )
        {
            // If the line contains a colon, we treat it as valid data
            if (line.find(':') != std::string::npos) {

                // Split data in line into an id and a value part and save it
                std::istringstream iss{ line };
                if (std::string id{}, value{}; std::getline(std::getline(iss, id, ':') >> std::ws, value)) {

                    // Add config data to our map
                    configData[id] = value;
                }
            }
        }
        // Some debug output
        for (const auto& [id, value] : configData)
            std::cout << "ID: " << std::left << std::setw(35) << id << " Value: " << value << '\n';
    }
    else std::cerr << "\n*** Error: Could not open config file for reading\n";

    return 0;
}

For this example I store the ids and values in a map, so that they can be accessed easily.
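Since everything in the map is a string, you will usually want a small accessor that converts a value and copes with missing ids. A minimal sketch follows; the alias ConfigMap and the function get_int_or are hypothetical names for this illustration, and returning a fallback on any conversion problem is just one possible policy.

```cpp
#include <exception>
#include <string>
#include <unordered_map>

using ConfigMap = std::unordered_map<std::string, std::string>;

// Hypothetical helper: return the integer stored under id, or fallback if
// the id is missing or its value does not start with a valid integer.
int get_int_or(const ConfigMap& configData, const std::string& id, int fallback)
{
    const auto it = configData.find(id);
    if (it == configData.end())
        return fallback;
    try {
        return std::stoi(it->second);
    } catch (const std::exception&) {   // std::invalid_argument or std::out_of_range
        return fallback;
    }
}
```

Using find instead of operator[] matters here: operator[] would silently insert an empty entry for every id that is looked up but not present.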

File based configuration handling in C (Unix)

Okay, so let's hit the other part. You need to think about what you'd like to have as your "language". In the UNIX world, the sort of canonical version is probably whitespace-delimited text (think /etc/hosts) or ":" delimited text (like /etc/passwd).

You have a couple of options, the simplest in some sense being to use scanf(3). Again, read the man page for details, but if a line entry is something like

port    100

then you'll be looking for something like

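char inbuf[MAXLINE];  /* MAXLINE is a buffer size you define yourself */
int val;

/* Note: a bare %s does not limit how much is written into inbuf */
scanf("%s %d\n", &inbuf[0], &val);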

You can get a bit more flexibility if you write a simple FSA parser: read characters one at a time from the line, and use a finite-state automaton to decide what to do.
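A rough sketch of such an automaton for the "port    100" style of line might look like this. It is written in C++ to match the other examples on this page, and the names Entry, State and parse_entry, as well as the rule that a third token is an error, are assumptions made for the illustration.

```cpp
#include <cctype>
#include <string>

// Hypothetical result of parsing one "key value" line.
struct Entry {
    std::string key;
    std::string value;
    bool ok = false;
};

// States of the automaton, in the order they are normally visited.
enum class State { BeforeKey, InKey, BeforeValue, InValue, AfterValue, Error };

Entry parse_entry(const std::string& line)
{
    Entry entry;
    State state = State::BeforeKey;
    for (const char ch : line) {
        const bool space = std::isspace(static_cast<unsigned char>(ch)) != 0;
        switch (state) {
        case State::BeforeKey:          // skipping leading whitespace
            if (!space) { entry.key += ch; state = State::InKey; }
            break;
        case State::InKey:              // collecting the key
            if (space) state = State::BeforeValue;
            else entry.key += ch;
            break;
        case State::BeforeValue:        // skipping the separator whitespace
            if (!space) { entry.value += ch; state = State::InValue; }
            break;
        case State::InValue:            // collecting the value
            if (space) state = State::AfterValue;
            else entry.value += ch;
            break;
        case State::AfterValue:         // only trailing whitespace is allowed
            if (!space) state = State::Error;
            break;
        case State::Error:              // stay in the error state
            break;
        }
    }
    // Accept only if the input ended inside or just after the value.
    entry.ok = (state == State::InValue || state == State::AfterValue);
    return entry;
}
```

The extra flexibility comes from the transition table: supporting comments or quoted values means adding states and transitions rather than rewriting a format string.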

Parsing a Very Simple Config File

You could use std::ifstream to read line by line, and boost::split to split each line at ','.

You could check the size of tokens as a sanity check on the loaded file.

Sample code:

#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <boost/algorithm/string/split.hpp>
#include <boost/algorithm/string/classification.hpp>

int main(int argc, char* argv[]) {
    std::ifstream ifs("e:\\save.txt");

    std::string line;
    std::vector<std::string> tokens;
    while (std::getline(ifs, line)) {
        // Skip empty lines before splitting
        if (line.empty())
            continue;

        // Split the line at each comma
        boost::split(tokens, line, boost::is_any_of(","));

        for (const auto& t : tokens) {
            std::cout << t << std::endl;
        }
    }

    return 0;
}

You could also use the String Toolkit Library if you don't want to implement this yourself.
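If neither library is an option, a standard-library-only sketch of the split plus the token-count sanity check could look like this. The function names split and has_field_count are made up for the illustration, and "exactly N fields per line" is an assumed rule, since the original does not say what the file should contain.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split line at each delim using only the standard library.
std::vector<std::string> split(const std::string& line, char delim)
{
    std::vector<std::string> tokens;
    std::istringstream iss(line);
    std::string token;
    while (std::getline(iss, token, delim))
        tokens.push_back(token);
    return tokens;
}

// Hypothetical sanity check: a line is valid if it has exactly `expected` fields.
bool has_field_count(const std::string& line, std::size_t expected)
{
    return split(line, ',').size() == expected;
}
```

Note that this getline-based split does not produce an empty token after a trailing delimiter, so "a,b," yields two tokens here; keep that in mind if empty last fields are meaningful in your file.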

Building a very simple parser in C

Put the standard headers at file scope, not at block scope:

#include <stdio.h>

int main(int argc, char *argv[])
{
...

