Creating a simple configuration file and parser in C++
In general, it's easiest to parse such typical config files in two stages: first read the lines, and then parse those one by one.
In C++, lines can be read from a stream using std::getline(). While by default it reads up to the next '\n' (which it consumes, but does not return), you can pass it some other delimiter, too, which makes it a good candidate for reading up to some character, like = in your example.
For simplicity, the following presumes that the = signs are not surrounded by whitespace. If you want to allow whitespace at these positions, you will have to strategically place is >> std::ws before reading the value and remove trailing whitespace from the keys. However, IMO the little added flexibility in the syntax is not worth the hassle for a config file reader.
#include <sstream>
#include <string>

const char config[] = "url=http://example.com\n"
                      "file=main.exe\n"
                      "true=0";

std::istringstream is_file(config);

std::string line;
while( std::getline(is_file, line) )
{
  std::istringstream is_line(line);
  std::string key;
  // read everything up to the first '=' as the key
  if( std::getline(is_line, key, '=') )
  {
    std::string value;
    // the rest of the line is the value
    if( std::getline(is_line, value) )
      store_line(key, value); // store_line() stands for whatever you do with each pair
  }
}
(Adding error handling is left as an exercise to the reader.)
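If you do decide to allow whitespace around the =, the idea described above (std::ws before each read, plus trimming the key) might look roughly like the sketch below. The names rtrim and parse_config are made up for this illustration, and the result is collected in a std::map instead of calling store_line. Lines without a = and lines with an empty key or empty value are silently skipped; trailing whitespace in the value is kept.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: strip trailing spaces and tabs (used for keys).
std::string rtrim(std::string s)
{
    const auto last = s.find_last_not_of(" \t");
    s.erase(last == std::string::npos ? 0 : last + 1);
    return s;
}

// Hypothetical helper: parse "key = value" lines into a map.
std::map<std::string, std::string> parse_config(std::istream& is_file)
{
    std::map<std::string, std::string> result;
    std::string line;
    while (std::getline(is_file, line))
    {
        std::istringstream is_line(line);
        std::string key;
        // Skip leading whitespace, then read the key up to '='.
        if (!std::getline(is_line >> std::ws, key, '='))
            continue;               // empty or whitespace-only line
        if (is_line.eof())
            continue;               // end of line reached without finding '='
        key = rtrim(key);
        if (key.empty())
            continue;               // nothing before the '='
        std::string value;
        // Skip whitespace after '=', then take the rest of the line.
        if (std::getline(is_line >> std::ws, value))
            result[key] = value;
    }
    return result;
}
```

Feeding it a std::istringstream or a std::ifstream works the same way, since both are std::istream objects.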
What c lib to use when I need to parse a simple config file under linux?
libconfig, but it does quite a bit more than what you're asking for.
A simple way to read TXT config files in C++
Also in C++ it is easy to split a line. I have already provided several answers here on SO on how to split a string. Anyway, I will explain it here in detail and for your special case. I also provide a full working example later.
We use the basic functionality of std::getline, which can read a complete line or the line up to a given character.
Let us take an example. If the text is stored in a std::string, we will first put it into a std::istringstream. Then we can use std::getline to extract the data from the std::istringstream. That is always the standard approach: first, read the complete line from a file using std::getline, then put it in a std::istringstream, to be able to extract the parts of the string again with std::getline.
If a source line looks like this:
Time [s]: 1
We can observe that we have several parts:
- An identifier "Time [s]",
- a colon, which acts as a separator,
- one or more spaces and
- the value "1"
So, we could write something like this:
std::string line{}; // Here we will store a complete line read from the source file
std::getline(configFileStream, line); // Read a complete line from the source file
std::istringstream iss{ line }; // Put line into a istringstream for further extraction
std::string id{}; // Here we will store the target value "id"
std::string value{}; // Here we will store the target "value"
std::getline(iss, id, ':'); // Read the ID, get rid of the colon
iss >> std::ws; // Skip all white spaces
std::getline(iss, value); // Finally read the value
So, that is a lot of text. You may have heard that you can chain IO operations, like in std::cout << a << b << c. This works because the << operation always returns a reference to the given stream. And the same is true for std::getline. Because of that, we can nest the calls: we can put one std::getline at the parameter position (the first parameter) where the other std::getline expects a std::istream. If we follow this approach consistently, we can write the nested statement:
std::getline(std::getline(iss, id, ':') >> std::ws, value);
Oops, what's going on here? Let's analyze from the inside out. First, the operation std::getline(iss, id, ':') extracts a string from the std::istringstream and assigns it to the variable "id". OK, understood. Remember: std::getline returns a reference to the given stream. So the above statement reduces to
std::getline(iss >> std::ws, value)
Next, iss >> std::ws will be evaluated and will eat up all unnecessary white space. And guess what, it will return a reference to the given stream "iss".
The statement now looks like:
std::getline(iss, value)
And this will read the value. Simple.
But, we are not finished yet. Of course std::getline will return again "iss". And in the below code, you will see something like
if (std::getline(std::getline(iss, id, ':') >> std::ws, value))
which will end up as if (iss). So, we use iss as a boolean expression? Why does this work and what does it do? It works because streams provide an overloaded operator bool, which returns true if the stream state is OK and false if a failure has occurred. Always check the result of any IO operation.
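To see the boolean conversion at work, here is a small sketch that wraps the nested call in a function. The name splitLine is made up for this illustration. For a line without a colon, the first std::getline consumes the whole line, the second one then fails, and the stream converts to false.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: split "id: value" and report success via the stream state.
bool splitLine(const std::string& line, std::string& id, std::string& value)
{
    std::istringstream iss{ line };
    // The nested expression yields the stream itself. The cast invokes its
    // explicit operator bool, which is true only if no read has failed.
    return static_cast<bool>(std::getline(std::getline(iss, id, ':') >> std::ws, value));
}
```

Inside an if condition the cast is not needed, because the conversion happens implicitly there.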
And last but not least, we need to explain the if statement with initializer, which is available since C++17.
I can write
if (std::string id{}, value{}; std::getline(std::getline(iss, id, ':') >> std::ws, value)) {
which is similar to
std::string id{}, value{};
if (std::getline(std::getline(iss, id, ':') >> std::ws, value)) {
But the first example has the advantage that the defined variables are only visible within the scope of the if statement. So, we "scope" the variables as narrowly as possible.
You should try to do that as often as possible. You should also always check the return state of an IO operation by applying if to a stream operation, as shown above.
The complete program for reading everything will then just be a few lines of code.
#include <iostream>
#include <sstream>
#include <fstream>
#include <string>
#include <unordered_map>
#include <iomanip>

int main() {
    // Open config file and check if it could be opened
    if (std::ifstream configFileStream{ "r:\\config.txt" }; configFileStream) {

        // Here we will store the resulting config data
        std::unordered_map<std::string, std::string> configData;

        // Read all lines of the source file
        for (std::string line{}; std::getline(configFileStream, line); )
        {
            // If the line contains a colon, we treat it as valid data
            if (line.find(':') != std::string::npos) {

                // Split data in line into an id and a value part and save it
                std::istringstream iss{ line };
                if (std::string id{}, value{}; std::getline(std::getline(iss, id, ':') >> std::ws, value)) {

                    // Add config data to our map
                    configData[id] = value;
                }
            }
        }
        // Some debug output
        for (const auto& [id, value] : configData)
            std::cout << "ID: " << std::left << std::setw(35) << id << " Value: " << value << '\n';
    }
    else std::cerr << "\n*** Error: Could not open config file for reading\n";
    return 0;
}
For this example I store the ids and values in a map, so that they can be accessed easily.
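Reading values back out of such a map could then look like the sketch below. The helpers getOr and getIntOr and the alias ConfigMap are made up for this illustration and are not part of the program above. They return a fallback when an id is missing (or, for the integer variant, when the stored text does not start with a number), so that a lookup never inserts an empty entry the way operator[] would.

```cpp
#include <exception>
#include <string>
#include <unordered_map>

using ConfigMap = std::unordered_map<std::string, std::string>;

// Hypothetical helper: return the stored value, or a fallback if the id is absent.
std::string getOr(const ConfigMap& cfg, const std::string& id, const std::string& fallback)
{
    const auto it = cfg.find(id);
    return it != cfg.end() ? it->second : fallback;
}

// Hypothetical helper: same idea for integers. Falls back if the id is
// missing or std::stoi cannot convert the stored text.
int getIntOr(const ConfigMap& cfg, const std::string& id, int fallback)
{
    const auto it = cfg.find(id);
    if (it == cfg.end())
        return fallback;
    try {
        return std::stoi(it->second);   // throws on non-numeric or out-of-range text
    } catch (const std::exception&) {
        return fallback;
    }
}
```

Note that std::stoi accepts a numeric prefix, so a value like "12abc" would be read as 12; stricter validation would need the position argument of std::stoi.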
File based configuration handling in C (Unix)
Okay, so let's hit the other part. You need to think about what you'd like to have as your "language". In the UNIX world, the sort of canonical version is probably whitespace-delimited text (think /etc/hosts) or ":" delimited text (like /etc/passwd).
You have a couple of options, the simplest in some sense being to use scanf(3). Again, read the man page for details, but if a line entry is something like
port 100
then you'll be looking for something like
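#define MAXLINE 256  /* assumed buffer size, adjust as needed */
char inbuf[MAXLINE];
int val;
/* %255s (MAXLINE - 1) keeps scanf from overflowing inbuf; check that both fields matched */
if (scanf("%255s %d", inbuf, &val) != 2) { /* handle malformed line */ }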
You can get a bit more flexibility if you write a simple FSA parser: read characters one at a time from the line, and use a finite automaton to define what to do.
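As a rough illustration of the finite-automaton idea, here is a sketch for lines of the form "port 100". It is written in C++ to match the other examples on this page rather than in C, and the state names, the Entry struct and parseEntry are all made up for this illustration. It accepts a name, some whitespace, and a non-negative integer, and rejects everything else.

```cpp
#include <cctype>
#include <climits>
#include <string>

// States of a small, hypothetical automaton for lines like "port 100".
enum class State { LeadingSpace, Name, Separator, Value, Done, Error };

struct Entry {
    std::string name;
    int value = 0;
};

// Returns true and fills `out` if the line is "<name> <non-negative integer>".
bool parseEntry(const std::string& line, Entry& out)
{
    State state = State::LeadingSpace;
    Entry entry;

    for (unsigned char c : line) {
        switch (state) {
        case State::LeadingSpace:               // skip blanks before the name
            if (!std::isspace(c)) { entry.name += static_cast<char>(c); state = State::Name; }
            break;
        case State::Name:                       // collect the name
            if (std::isspace(c)) state = State::Separator;
            else entry.name += static_cast<char>(c);
            break;
        case State::Separator:                  // skip blanks, expect a digit next
            if (std::isspace(c)) break;
            if (std::isdigit(c)) { entry.value = c - '0'; state = State::Value; }
            else state = State::Error;
            break;
        case State::Value:                      // collect digits
            if (std::isdigit(c)) {
                const int digit = c - '0';
                if (entry.value > (INT_MAX - digit) / 10) state = State::Error; // would overflow
                else entry.value = entry.value * 10 + digit;
            }
            else if (std::isspace(c)) state = State::Done;
            else state = State::Error;
            break;
        case State::Done:                       // only trailing blanks allowed
            if (!std::isspace(c)) state = State::Error;
            break;
        case State::Error:                      // stay in the error state
            break;
        }
    }
    // Accept only if the input ended inside or after the number.
    if (state != State::Value && state != State::Done) return false;
    out = entry;
    return true;
}
```

Compared with scanf, every decision is explicit, so adding things like comments or quoted values means adding states and transitions rather than fighting a format string.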
Parsing a Very Simple Config File
You could use std::ifstream to read line by line, and boost::split to split the line by a comma (,).
You could check the size of tokens as a sanity check of the loaded file.
Sample code:
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <boost/algorithm/string/split.hpp>
#include <boost/algorithm/string/classification.hpp>

int main(int argc, char* argv[]) {
    std::ifstream ifs("e:\\save.txt");
    std::string line;
    std::vector<std::string> tokens;
    while (std::getline(ifs, line)) {
        // skip empty lines before splitting
        if (line.empty())
            continue;
        boost::split(tokens, line, boost::is_any_of(","));
        for (const auto& t : tokens) {
            std::cout << t << std::endl;
        }
    }
    return 0;
}
You could also use the String Toolkit Library if you don't want to implement the splitting yourself.
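If pulling in a library is not an option, a standard-library-only split on a single delimiter is short enough to write by hand. The sketch below uses std::getline with a delimiter, as in the earlier sections; splitFields is a made-up name. One difference to be aware of: a trailing delimiter does not produce an empty last token here, and an empty line yields no tokens at all.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split a line on a single delimiter character.
std::vector<std::string> splitFields(const std::string& line, char delim = ',')
{
    std::vector<std::string> tokens;
    std::istringstream iss{ line };
    // Each std::getline call reads up to (and discards) the next delimiter.
    for (std::string token; std::getline(iss, token, delim); )
        tokens.push_back(token);
    return tokens;
}
```

Checking tokens.size() against the expected number of fields gives the same kind of sanity check as with boost::split.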
building a very simple parser in C
Put the standard headers at file scope, not at block scope:
#include <stdio.h>
int main(int argc, char *argv[])
{
...