How to write custom input stream in C++
The proper way to create a new stream in C++ is to derive from std::streambuf and to override the underflow() operation for reading and the overflow() and sync() operations for writing. For your purpose you'd create a filtering stream buffer which takes another stream buffer (and possibly a stream from which the stream buffer can be extracted using rdbuf()) as argument and implements its own operations in terms of this stream buffer.
The basic outline of a stream buffer would be something like this:
#include <cstddef>
#include <streambuf>

class compressbuf
    : public std::streambuf {
    std::streambuf* sbuf_;
    char*           buffer_;
    // context for the compression
public:
    compressbuf(std::streambuf* sbuf)
        : sbuf_(sbuf), buffer_(new char[1024]) {
        // initialize compression context
    }
    ~compressbuf() { delete[] this->buffer_; }
    int_type underflow() {
        if (this->gptr() == this->egptr()) {
            // decompress data into buffer_, obtaining its own input from
            // this->sbuf_; if necessary resize buffer
            // the next statement assumes "size" characters were produced (if
            // no more characters are available, size == 0).
            std::size_t size = 0; // placeholder: set by the decompression step above
            this->setg(this->buffer_, this->buffer_, this->buffer_ + size);
        }
        return this->gptr() == this->egptr()
            ? std::char_traits<char>::eof()
            : std::char_traits<char>::to_int_type(*this->gptr());
    }
};
How underflow() looks exactly depends on the compression library being used. Most libraries I have used keep an internal buffer which needs to be filled and which retains the bytes which are not yet consumed. Typically, it is fairly easy to hook the decompression into underflow().
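To make the outline concrete without depending on a particular compression library, here is a minimal sketch that uses a toy run-length decoder as a stand-in for the decompression step. The class name rle_decodebuf, the helper decode_all, and the "<count><char>" encoding are assumptions made for illustration only; they are not part of the original answer or of any library.

```cpp
#include <cctype>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical stand-in for a real decompressor: decodes "<count><char>"
// pairs read from another stream buffer, e.g. "3a2b" becomes "aaabb".
class rle_decodebuf : public std::streambuf {
    std::streambuf*   sbuf_;    // source of encoded bytes (not owned)
    std::vector<char> buffer_;  // decoded characters exposed as the get area

    static bool is_eof(int_type c) {
        return traits_type::eq_int_type(c, traits_type::eof());
    }

public:
    explicit rle_decodebuf(std::streambuf* sbuf) : sbuf_(sbuf) {}

protected:
    int_type underflow() override {
        if (gptr() == egptr()) {
            buffer_.clear();
            // Parse the decimal repeat count.
            std::size_t count = 0;
            int_type c = sbuf_->sgetc();
            while (!is_eof(c) &&
                   std::isdigit(static_cast<unsigned char>(traits_type::to_char_type(c)))) {
                count = count * 10 +
                        static_cast<std::size_t>(traits_type::to_char_type(c) - '0');
                c = sbuf_->snextc();
            }
            // The character after the count is the one to repeat.
            if (!is_eof(c) && count > 0) {
                sbuf_->sbumpc();
                buffer_.assign(count, traits_type::to_char_type(c));
            }
            // End of input (or a malformed/zero-count pair) ends the stream.
            if (buffer_.empty()) return traits_type::eof();
            setg(buffer_.data(), buffer_.data(), buffer_.data() + buffer_.size());
        }
        return traits_type::to_int_type(*gptr());
    }
};

// Decodes a whole string through an std::istream built on the filter.
std::string decode_all(const std::string& encoded) {
    std::istringstream source(encoded);
    rle_decodebuf sbuf(source.rdbuf());
    std::istream in(&sbuf);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

A real decompressor would replace the parsing loop with calls into its library, but the setg() bookkeeping would likely stay the same.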
Once the stream buffer is created, you can just initialize an std::istream object with the stream buffer:
std::ifstream fin("some.file");
compressbuf sbuf(fin.rdbuf());
std::istream in(&sbuf);
If you are going to use the stream buffer frequently, you might want to encapsulate the object construction into a class, e.g., icompressstream. Doing so is a bit tricky because the base class std::ios is a virtual base and is the actual location where the stream buffer is stored. To construct the stream buffer before passing a pointer to a std::ios thus requires jumping through a few hoops: it requires the use of a virtual base class. Here is how this could look roughly:
struct compressstream_base {
    compressbuf sbuf_;
    compressstream_base(std::streambuf* sbuf): sbuf_(sbuf) {}
};
class icompressstream
    : virtual compressstream_base
    , public std::istream {
public:
    icompressstream(std::streambuf* sbuf)
        : compressstream_base(sbuf)
        , std::ios(&this->sbuf_)
        , std::istream(&this->sbuf_) {
    }
};
(I just typed this code without a simple way to test that it is reasonably correct; please expect typos but the overall approach should work as described)
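The virtual-base arrangement can be tried out with a trivial filter in place of the compression buffer. The sketch below is only an illustration: passthroughbuf, passstream_base, ipassstream and first_word are hypothetical names, and the filter simply forwards one character at a time.

```cpp
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical filter used only to make the sketch complete: it forwards
// one character at a time from the wrapped stream buffer.
class passthroughbuf : public std::streambuf {
    std::streambuf* sbuf_;
    char            ch_;
public:
    explicit passthroughbuf(std::streambuf* sbuf) : sbuf_(sbuf), ch_() {}
protected:
    int_type underflow() override {
        int_type c = sbuf_->sbumpc();
        if (traits_type::eq_int_type(c, traits_type::eof())) return c;
        ch_ = traits_type::to_char_type(c);
        setg(&ch_, &ch_, &ch_ + 1);
        return c;
    }
};

// Holds the buffer so that it is constructed before std::ios sees it.
struct passstream_base {
    passthroughbuf sbuf_;
    explicit passstream_base(std::streambuf* sbuf) : sbuf_(sbuf) {}
};

// Virtual bases are initialized first, in declaration order, so
// passstream_base (and its sbuf_) exists before std::ios is initialized.
class ipassstream : virtual passstream_base, public std::istream {
public:
    explicit ipassstream(std::streambuf* sbuf)
        : passstream_base(sbuf)
        , std::ios(&this->sbuf_)
        , std::istream(&this->sbuf_) {}
};

// Usage: read the first whitespace-delimited word through the wrapper.
std::string first_word(const std::string& text) {
    std::istringstream source(text);
    ipassstream in(source.rdbuf());
    std::string word;
    in >> word;
    return word;
}
```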
Custom input stream. Stream buffer and underflow method
From my old C++ experience, a streambuf is the underlying buffer for the stream. When the stream needs more data it calls underflow(). Inside this method you are supposed to read from your source and call setg(). When the stream has data to be written back to the source it calls overflow(). Inside this method you take the pending characters from the put area, write them to your source, and call setp(). For example, if you are reading the data from a socket in your streambuf:
socketbuf::int_type socketbuf::underflow(){
    int bytesRead = 0;
    try{
        bytesRead = soc->read(inbuffer,BUFFER_SIZE-1,0);
        if( bytesRead <= 0 ){
            return traits_type::eof();
        }
    }catch(const IOException& ioe){
        cout<<"Unable to read data"<<endl;
        return traits_type::eof();
    }
    setg(inbuffer,inbuffer,inbuffer+bytesRead);
    return traits_type::to_int_type(inbuffer[0]);
}
socketbuf::int_type socketbuf::overflow(socketbuf::int_type c){
    int bytesWritten = 0;
    try{
        if(pptr() - pbase() > 0){
            bytesWritten = soc->write(pbase(),(pptr() - pbase()),0);
            // a failed write must be reported as eof, not as success
            if( bytesWritten <= 0 ) return traits_type::eof();
        }
    }catch(const IOException& ioe){
        cout<<"Unable to write data"<<endl;
        return traits_type::eof();
    }
    // reset the put area (setp takes two pointers), then store c if it is a real character
    setp(outbuffer,outbuffer+BUFFER_SIZE);
    if( !traits_type::eq_int_type(c, traits_type::eof()) ){
        outbuffer[0] = traits_type::to_char_type(c);
        pbump(1);
    }
    return traits_type::not_eof(c);
}
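Since the socket class above is not shown, the write path can be tried out with an in-memory sink instead. This is a sketch under that assumption: stringsinkbuf, kBufferSize and write_through are made-up names, and a std::string stands in for the socket.

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

constexpr std::size_t kBufferSize = 8;  // deliberately small to force overflow()

// Hypothetical sink-backed buffer: the std::string plays the role of the socket.
class stringsinkbuf : public std::streambuf {
    char         outbuffer_[kBufferSize];
    std::string& sink_;

    // Hands the pending characters to the sink and empties the put area.
    void flush_buffer() {
        sink_.append(pbase(), pptr());
        setp(outbuffer_, outbuffer_ + kBufferSize);
    }

public:
    explicit stringsinkbuf(std::string& sink) : sink_(sink) {
        setp(outbuffer_, outbuffer_ + kBufferSize);
    }

protected:
    // Called when the put area is full: flush, then store the extra character.
    int_type overflow(int_type c) override {
        flush_buffer();
        if (!traits_type::eq_int_type(c, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(c);
            pbump(1);
        }
        return traits_type::not_eof(c);
    }

    // Called on flush: push out whatever is still buffered.
    int sync() override {
        flush_buffer();
        return 0;
    }
};

// Usage: everything written to the stream ends up in the sink after a flush.
std::string write_through(const std::string& text) {
    std::string sink;
    stringsinkbuf sbuf(sink);
    std::ostream out(&sbuf);
    out << text << std::flush;
    return sink;
}
```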
Now coming to your code, you added
result = traits_type::to_int_type('+'); // <-- this was added
Line-based input is read until an LF (line feed, character code 10) is seen. When the LF character arrives you are overwriting it with a '+', so the stream will wait for an LF forever. By adding the check below, your code should do what you are expecting: output '+++' if you input 'abc'.
if (result != 10) // <-- add this in addition
    result = traits_type::to_int_type('+'); // <-- this was added
Hope it helps you.
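The question's own code is not reproduced here, so the following is only a sketch of the idea, assuming a filter whose underflow() replaces each character it reads. The names maskbuf and masked_line are hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical filter: replaces every character except the line feed with '+'.
class maskbuf : public std::streambuf {
    std::streambuf* source_;
    char            ch_;
public:
    explicit maskbuf(std::streambuf* source) : source_(source), ch_() {}
protected:
    int_type underflow() override {
        int_type result = source_->sbumpc();
        if (traits_type::eq_int_type(result, traits_type::eof())) return result;
        // Leave the line feed alone so that line-based reads still terminate.
        if (result != traits_type::to_int_type('\n'))
            result = traits_type::to_int_type('+');
        ch_ = traits_type::to_char_type(result);
        setg(&ch_, &ch_, &ch_ + 1);
        return result;
    }
};

// Usage: reads one line through the filter.
std::string masked_line(const std::string& input) {
    std::istringstream source(input);
    maskbuf sbuf(source.rdbuf());
    std::istream in(&sbuf);
    std::string line;
    std::getline(in, line);
    return line;
}
```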
How to create stream which handles both input and output in C++?
Creating a class that behaves like a stream is easy. Let's say we want to create such a class with the name MyStream; the definition of the class will be as simple as:
#include <istream> // class "basic_iostream" is defined here
class MyStream : public std::basic_iostream<char> {
private:
    // placeholder for your own streambuf-derived object
    // (std::basic_streambuf<char> itself cannot be instantiated directly)
    std::basic_streambuf<char> buffer;
public:
    MyStream() : std::basic_iostream<char>(&buffer) {} // note that ampersand
};
The constructor of your class should call the constructor of std::basic_iostream<char> with a pointer to a custom std::basic_streambuf<char> object. std::basic_streambuf is just a template class which defines the structure of a stream buffer. So you have to get your own stream buffer. You can get it in two ways:
- From another stream: Every stream has a member rdbuf which takes no arguments and returns a pointer to the stream buffer being used by it. Example:
    ...
    std::basic_streambuf<char>* buffer = std::cout.rdbuf(); // take from std::cout
    ...
- Create your own: You can always create a buffer class by deriving from std::basic_streambuf<char> and customize it as you want.
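The first option can be illustrated with a short sketch in which a second stream is built on the buffer taken from another stream via rdbuf(). The function name write_via_shared_buffer is made up for this example, and an std::ostringstream is used instead of std::cout so the result can be inspected.

```cpp
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Builds a second stream on top of the buffer owned by `target`.
// Everything written through `alias` lands in `target`.
std::string write_via_shared_buffer(const std::string& text) {
    std::ostringstream target;
    std::basic_streambuf<char>* buffer = target.rdbuf();  // take from target
    std::ostream alias(buffer);                           // shares the same buffer
    alias << text;
    return target.str();
}
```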
Now that we have defined and implemented the MyStream class, we need the stream buffer. Let's select option 2 from above, create our own stream buffer, and name it MyBuffer. We will need the following:
- Constructor to initialize the object.
- Contiguous memory block to temporarily store output from the program.
- Contiguous memory block to temporarily store input from the user (or some other source).
- Method overflow, which is called when the memory allocated for storing output is full.
- Method underflow, which is called when all input has been read by the program and more input is requested.
- Method sync, which is called when output is flushed.
As we know what things are needed to create a stream buffer class, let's declare it:
class MyBuffer : public std::basic_streambuf<char> {
private:
    char inbuf[10];
    char outbuf[10];
    int sync();
    int_type overflow(int_type ch);
    int_type underflow();
public:
    MyBuffer();
};
Here inbuf and outbuf are two arrays which will store input and output respectively. int_type is an integer type that can hold every character value plus a distinct end-of-file value; it is defined per character type so that char, wchar_t, etc. are all supported.
Before we jump into the implementation of our buffer class, we need to know how the buffer will work.
To understand how buffers work, we need to know how arrays work. An array is a contiguous block of memory, and its name converts to a pointer to its first element. When we declare a char array with two elements, 2 * sizeof(char) bytes are reserved for it. When we access an element with array[n], it is converted to *(array + n), where n is the index. Adding n to an array moves the address forward by n * sizeof(<the_type_the_array_points_to>) bytes (figure 1). If you don't know what pointer arithmetic is, I would recommend you learn that before you continue. cplusplus.com has a good article on pointers for beginners.
          array    array + 1
              \    /
-------------------------------------
|     |     | 'a' | 'b' |     |     |
-------------------------------------
  ...   105   106   107   108   ...
            |           |
            -------------
                  |
memory allocated by the operating system
figure 1: memory address of an array
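The equivalence between indexing and pointer arithmetic can be checked directly. The two helper functions below are hypothetical names introduced only for this sketch.

```cpp
#include <cstddef>

// array[n] and *(array + n) name the same element.
bool index_matches_pointer_arithmetic() {
    char array[2] = {'a', 'b'};
    return array[1] == *(array + 1) && &array[1] == array + 1;
}

// Adding 1 to a pointer advances the address by sizeof(element) bytes.
std::ptrdiff_t byte_distance_between_ints() {
    int values[2] = {1, 2};
    return reinterpret_cast<char*>(values + 1) - reinterpret_cast<char*>(values);
}
```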
Now that we know about pointers, let's see how stream buffers work. Our buffer contains two arrays, inbuf and outbuf. But how would the standard library know that input must be stored in inbuf and output in outbuf? For that, there are two areas called the get area and the put area, which are the input and output areas respectively.
Put area is specified with the following three pointers (figure 2):
- pbase() or put base: start of put area
- epptr() or end put pointer: end of put area
- pptr() or put pointer: where next character will be put
These are actually functions which return the corresponding pointer. These pointers are set by setp(pbase, epptr). After this function call, pptr() is set to pbase(). To change it we'll use pbump(n), which repositions pptr() by n characters; n can be positive or negative. Note that the stream writes up to the element just before epptr(), but never to epptr() itself.
pbase()                       pptr()                  epptr()
|                             |                       |
-------------------------------------------------------------------
| 'H' | 'e' | 'l' | 'l' | 'o' |     |     |     |     |     |     |
-------------------------------------------------------------------
|                                                     |
-------------------------------------------------------
                           |
            allocated memory for the buffer
figure 2: output buffer (put area) with sample data
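The three put pointers can be observed with a small probe class. This is a sketch only: putarea_probe and the helper functions are hypothetical, and the 10-character storage is an arbitrary choice.

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>

// Hypothetical probe that exposes the state of its put area.
class putarea_probe : public std::streambuf {
    char storage_[10];
public:
    putarea_probe() { setp(storage_, storage_ + 10); }  // pptr() starts at pbase()
    std::ptrdiff_t used() const { return pptr() - pbase(); }
    std::ptrdiff_t free_space() const { return epptr() - pptr(); }
    // Moves pptr() back to pbase() with a negative pbump().
    void rewind() { pbump(static_cast<int>(pbase() - pptr())); }
};

// Writing "Hello" advances pptr() by five characters.
std::ptrdiff_t used_after_hello() {
    putarea_probe probe;
    std::ostream out(&probe);
    out << "Hello";
    return probe.used();
}

// After rewinding, the whole put area is free again.
bool rewind_resets_pptr() {
    putarea_probe probe;
    std::ostream out(&probe);
    out << "Hello";
    probe.rewind();
    return probe.used() == 0 && probe.free_space() == 10;
}
```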
Get area is specified with the following three pointers (figure 3):
- eback() or end back: start of get area
- egptr() or end get pointer: end of get area
- gptr() or get pointer: the position which is going to be read
These pointers are set with the setg(eback, gptr, egptr) function. Note that the stream reads up to the element just before egptr(), but never egptr() itself.
eback()                       gptr()                  egptr()
|                             |                       |
-------------------------------------------------------------------
| 'H' | 'e' | 'l' | 'l' | 'o' | ' ' | 'C' | '+' | '+' |     |     |
-------------------------------------------------------------------
|                                                     |
-------------------------------------------------------
                           |
            allocated memory for the buffer
figure 3: input buffer (get area) with sample data
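A get area can be set up over existing memory with a single setg() call. In this sketch, membuf and the two helper functions are hypothetical names; the sample text matches the figure.

```cpp
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>

// Hypothetical read-only buffer over a caller-owned character array.
class membuf : public std::streambuf {
public:
    membuf(char* begin, std::size_t size) {
        setg(begin, begin, begin + size);  // eback, gptr, egptr
    }
    // Number of characters between gptr() and egptr().
    std::ptrdiff_t remaining() const { return egptr() - gptr(); }
};

// Reads the first word; the default underflow() reports EOF at egptr().
std::string read_word_from_memory() {
    char data[] = "Hello C++";
    membuf sbuf(data, sizeof(data) - 1);
    std::istream in(&sbuf);
    std::string word;
    in >> word;
    return word;
}

// After extracting "Hello", gptr() rests on the space: " C++" is left.
std::ptrdiff_t remaining_after_first_word() {
    char data[] = "Hello C++";
    membuf sbuf(data, sizeof(data) - 1);
    std::istream in(&sbuf);
    std::string word;
    in >> word;
    return sbuf.remaining();
}
```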
Now that we have discussed almost all we need to know before creating a custom stream buffer, it's time to implement it! We'll try to implement our stream buffer in such a way that it will work like std::cout!
Let's start with the constructor:
MyBuffer::MyBuffer() {
    setg(inbuf+4, inbuf+4, inbuf+4);
    setp(outbuf, outbuf+9);
}
Here we set all three get pointers to one position, which means there are no readable characters, forcing underflow() when input is wanted. Then we set the put pointers in such a way that the stream can write to the whole outbuf array except the last element. We'll reserve it for future use.
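Now, let's implement the sync() method, which is called when output is flushed:
// requires <cstdio> for std::putchar and EOF
int MyBuffer::sync() {
    int return_code = 0;
    // print every pending character in the put area
    for (int i = 0; i < (pptr() - pbase()); i++) {
        if (std::putchar(outbuf[i]) == EOF) {
            return_code = EOF;
            break;
        }
    }
    // move pptr() back to pbase(); pbump() takes an int
    pbump(static_cast<int>(pbase() - pptr()));
    return return_code;
}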
This does its work very simply. First, it determines how many characters there are to print, then prints them one by one and repositions pptr() (put pointer). It returns EOF (-1) if writing any character fails, 0 otherwise.
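But what to do if the put area is full? For that, we need the overflow() method. Let's implement it:
MyBuffer::int_type MyBuffer::overflow(int_type ch) {
    // store the extra character in the reserved last element, unless it is EOF
    if (!traits_type::eq_int_type(ch, traits_type::eof())) {
        *pptr() = traits_type::to_char_type(ch);
        pbump(1);
    }
    return (sync() == EOF ? traits_type::eof() : traits_type::not_eof(ch));
}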
Nothing very special: this just puts the extra character into the reserved last element of outbuf and repositions pptr() (put pointer), then calls sync(). It returns EOF if sync() returned EOF, otherwise the extra character.
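Everything is now complete, except input handling. Let's implement underflow(), which is called when all characters in the input buffer have been read:
// requires <algorithm>, <cstddef>, <cstdio> and <cstring>
MyBuffer::int_type MyBuffer::underflow() {
    // keep at most 4 already-read characters so they can be put back
    int keep = static_cast<int>(std::min<std::ptrdiff_t>(4, gptr() - eback()));
    std::memmove(inbuf + 4 - keep, gptr() - keep, keep);
    int ch, position = 4, read = 0;
    // check the bound first so that no character is read and then dropped
    while (position < 10 && (ch = std::getchar()) != EOF) {
        inbuf[position++] = char(ch);
        read++;
    }
    if (read == 0) return traits_type::eof();
    setg(inbuf - keep + 4, inbuf + 4, inbuf + position);
    return traits_type::to_int_type(*gptr());
}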
A little difficult to understand. Let's see what's going on here. First, it calculates how many characters it should preserve in the buffer (at most 4) and stores that in the keep variable. Then it copies the last keep characters into the putback region at the front of the buffer, just before the position where new data will be stored. This is done because characters can be put back into the buffer with the unget() method of std::basic_iostream. A program can even look at the next character without extracting it with the peek() method of std::basic_iostream. After the last few characters are preserved, it reads new characters until it reaches the end of the input buffer or gets EOF as input. It returns EOF if no characters were read; otherwise it repositions all get pointers and returns the first character read.
As our stream buffer is implemented now, we can set up our stream class MyStream so it uses our stream buffer. So we change the private buffer variable:
...
private:
MyBuffer buffer;
public:
...
You can now test your own stream; it should take input from the terminal and show output on it.
Note that this stream and buffer can only handle char based input and output. To handle other character types, your class must derive from the corresponding class (e.g. std::basic_streambuf<wchar_t> for wide characters) and implement its member functions so that they can handle that type of character.
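As a rough illustration of the wide-character case, the sketch below derives from std::basic_streambuf<wchar_t> and serves characters from an in-memory string rather than from the terminal. WideMemBuffer and read_wide_word are hypothetical names, and the buffer is read-only to keep the example short.

```cpp
#include <istream>
#include <streambuf>
#include <string>

// Hypothetical read-only wide-character buffer over an owned std::wstring.
class WideMemBuffer : public std::basic_streambuf<wchar_t> {
    std::wstring data_;
public:
    explicit WideMemBuffer(const std::wstring& data) : data_(data) {
        wchar_t* begin = &data_[0];
        setg(begin, begin, begin + data_.size());  // whole string is the get area
    }
};

// Usage: a wide input stream built on the buffer.
std::wstring read_wide_word(const std::wstring& text) {
    WideMemBuffer buffer(text);
    std::basic_istream<wchar_t> in(&buffer);
    std::wstring word;
    in >> word;
    return word;
}
```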
C++ custom stream
If you look at how streams work, it's just a case of overloading operator<< for both your stream object, and the various things you want to send to it. There's nothing special about <<, it just reads nicely, but you could use + or whatever else you want.
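A minimal sketch of that idea, with a hypothetical TaggedLog class that is not derived from any standard stream and simply overloads operator<< as a member template:

```cpp
#include <sstream>
#include <string>

// Hypothetical stream-like class: wraps every inserted value in brackets.
class TaggedLog {
    std::string text_;
public:
    template <typename T>
    TaggedLog& operator<<(const T& value) {
        std::ostringstream piece;  // reuse the standard formatting for T
        piece << value;
        text_ += "[" + piece.str() + "]";
        return *this;              // returning *this allows chaining
    }
    const std::string& str() const { return text_; }
};

// Usage: chained insertions of different types.
std::string tagged_example() {
    TaggedLog out;
    out << "x" << 42;
    return out.str();
}
```

Note that a class like this does not accept manipulators such as std::endl without extra overloads, which is one reason the streambuf-based approach is usually preferred for real streams.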
Custom buffered input stream. End of input
Pressing Enter sends a newline character to the I/O buffer. That doesn't mean 'end of input'.
In your file you can easily have something like
Dear Mr. Smith,<CR><EOL>I am writing to you this message.<CR><EOL>Kind regards,<CR><EOL>Your Name<EOF>
The standard stream gives you a lot of flexibility in how to read this input.
For example:
istream get() will return you 'D'
istream operator >> will return "Dear"
istream getline () will return "Dear Mr. Smith,"
streambuf sgetn (6) will return "Dear M"
You can also adjust their behaviour to your needs. So you can read as much or as little as you want.
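The four reads listed above can be reproduced on an in-memory copy of the letter. The helper function names and the constant kLetter are made up for this sketch, and plain '\n' line endings are assumed.

```cpp
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

const char* const kLetter =
    "Dear Mr. Smith,\nI am writing to you this message.\n";

// istream::get() extracts a single character.
char read_with_get() {
    std::istringstream in(kLetter);
    return static_cast<char>(in.get());
}

// operator>> extracts one whitespace-delimited word.
std::string read_with_extractor() {
    std::istringstream in(kLetter);
    std::string word;
    in >> word;
    return word;
}

// getline() extracts up to (and discards) the newline.
std::string read_with_getline() {
    std::istringstream in(kLetter);
    std::string line;
    std::getline(in, line);
    return line;
}

// streambuf::sgetn() copies up to the requested number of characters.
std::string read_with_sgetn() {
    std::istringstream in(kLetter);
    char chunk[6];
    std::streamsize n = in.rdbuf()->sgetn(chunk, 6);
    return std::string(chunk, static_cast<std::size_t>(n));
}
```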
In your code the reading operation is:
std::streamsize read = m_stream_buffer->sgetn(m_buffer, m_size);
which means "give me m_size characters or less if end of input occurred".
Have a look at the documentation of streambuf for a better explanation.
http://www.cplusplus.com/reference/streambuf/streambuf
std::streambuf works on per character basis. No getline() or operator>> here.
If you want to stop at a particular character (e.g. ) you will probably need a loop with sgetc().
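Such a loop might look like the sketch below. Since sgetc() only peeks at the current character, the sketch uses snextc() and sbumpc() to advance. The function names read_until and first_field, and the choice to consume the stop character, are assumptions for illustration.

```cpp
#include <sstream>
#include <streambuf>
#include <string>

// Reads characters from `sb` until `stop` or end of input.
// The stop character is consumed but not returned.
std::string read_until(std::streambuf& sb, char stop) {
    typedef std::streambuf::traits_type traits;
    std::string out;
    for (traits::int_type c = sb.sgetc();              // peek at current character
         !traits::eq_int_type(c, traits::eof());
         c = sb.snextc()) {                            // advance, then peek again
        if (traits::to_char_type(c) == stop) {
            sb.sbumpc();                               // consume the stop character
            break;
        }
        out += traits::to_char_type(c);
    }
    return out;
}

// Usage: everything before the first occurrence of `stop` in `text`.
std::string first_field(const std::string& text, char stop) {
    std::istringstream in(text);
    return read_until(*in.rdbuf(), stop);
}
```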
Writing a custom input manipulator
You did not show your input, but I don't think getline() would be appropriate to use in this situation. operator>> is meant to read a single word, not a whole line.
In any case, you are leaking both char[] arrays that you allocate. You need to delete[] them when you are done using them. For the str array (which FYI, you don't actually need, as you could just copy characters from the temp string directly into res instead), you can just delete[] it before exiting. But for res, the membuf would have to hold on to that pointer and delete[] it when the membuf itself is no longer being used.
But, more importantly, your use of membuf is simply wrong. You are creating it as a local variable of skipchar(), so it will be destroyed when skipchar() exits, leaving the stream with a dangling pointer to an invalid object. The streambuf* pointer you assign to the stream must remain valid for the entire duration that it is assigned to the istream, which means creating the membuf object with new, and then the caller will have to remember to manually delete it at a later time (which kind of defeats the purpose of using operator>>). However, a stream manipulator really should not change the rdbuf that the stream is pointing at in the first place, since there is not a good way to restore the previous streambuf after subsequent read operations are finished (unless you define another manipulator to handle that, i.e. cin >> skipchar >> str >> stopskipchar;).
In this situation, I would suggest a different approach. Don't make a stream manipulator that assigns a new streambuf to the stream, thus affecting all subsequent operator>> calls. Instead, make a manipulator that takes a reference to the output variable, and then reads from the stream and outputs only what is needed (similar to how standard manipulators like std::quoted and std::get_time work), e.g.:
#include <cstddef>
#include <iostream>
#include <string>
using namespace std;

struct skipchars
{
    string &str;
};
istream& operator>>(istream& stream, skipchars output)
{
    string temp;
    if (stream >> temp) {
        // keep the first 5 characters of every block of 10
        for (size_t i = 0; i < temp.size(); i += 10) {
            output.str += temp.substr(i, 5);
        }
    }
    return stream;
}
int main()
{
    string str;
    cout << "enter smth:\n";
    cin >> skipchars{str};
    cout << "entered string: " << str;
    return 0;
}
Alternatively:
#include <cstddef>
#include <iostream>
#include <string>
using namespace std;

struct skipcharsHelper
{
    string &str;
};
istream& operator>>(istream& stream, skipcharsHelper output)
{
    string temp;
    if (stream >> temp) {
        // keep the first 5 characters of every block of 10
        for (size_t i = 0; i < temp.size(); i += 10) {
            output.str += temp.substr(i, 5);
        }
    }
    return stream;
}
skipcharsHelper skipchars(string &str)
{
    return skipcharsHelper{str};
}
int main()
{
    string str;
    cout << "enter smth:\n";
    cin >> skipchars(str);
    cout << "entered string: " << str;
    return 0;
}
How to write custom input function for Flex in C++ mode?
The simple solution, if you just want to provide a string input, is to make the string into a std::istringstream, which is a valid std::istream. The simplicity of this solution reduces the need for an equivalent to yy_scan_string.
On the other hand, if you have a data source you want to read from which is not derived from std::istream, you can easily create a lexical scanner which does whatever is necessary. Just subclass yyFlexLexer, add whatever private data members you will need and a constructor which initialises them, and override int LexerInput(char* buffer, size_t maxsize); to read at least one and no more than maxsize bytes into buffer, returning the number of characters read. (YY_INPUT also works in the C++ interface, but subclassing is more convenient precisely because it lets you maintain your own reader state.)
Notes:
- If you decide to subclass and override LexerInput, you need to be aware that "interactive" mode is actually implemented in LexerInput. So if you want your lexer to have an interactive mode, you'll have to implement it in your override, too. In interactive mode, LexerInput always reads exactly one character (unless, of course, it's at the end of the file).
- As you can see in the Flex code repository, a future version of Flex will use refactored versions of these functions, so you might need to be prepared to modify your code in the future, although Flex generally maintains backwards compatibility for a long time.
implementing simple input stream
Have you looked at boost.iostreams? It does most of the grunt work for you (possibly not for your exact use case, but for C++ standard library streams in general).