C++ and C file I/O
Opinion
I don't know of any real project that uses C++ streams. They are too slow and difficult to use. There are several newer libraries, such as FastFormat and the Boost version, that claim to be better; there was a piece about them in the last ACCU Overload magazine. Personally, I have used the C FILE library in C++ for the last 15 years or so, and I can see no reason yet to change.
Speed
Here is a small test program (knocked together quickly) to show the basic speed problem:
#include <stdio.h>
#include <time.h>
#include <iostream>
#include <fstream>
using namespace std;

int main( int argc, const char* argv[] )
{
    const int max = 1000000;
    const char* teststr = "example";

    // Time the C FILE* version (time() only has one-second resolution).
    int start = time(0);
    FILE* file = fopen( "example1", "w" );
    if( file == NULL )
    {
        perror( "example1" );
        return 1;
    }
    for( int i = 0; i < max; i++ )
    {
        fprintf( file, "%s:%d\n", teststr, i );
    }
    fclose( file );
    int end = time(0);
    printf( "C FILE: %ds\n", end-start );

    // Time the C++ stream version. Note that endl flushes on every line.
    start = time(0);
    ofstream outdata;
    outdata.open("example2.dat");
    for( int i = 0; i < max; i++ )
    {
        outdata << teststr << ":" << i << endl;
    }
    outdata.close();
    end = time(0);
    printf( "C++ Streams: %ds\n", end-start );
    return 0;
}
And the results on my PC:
C FILE: 5s
C++ Streams: 260s
Process returned 0 (0x0) execution time : 265.282 s
Press any key to continue.
As we can see, in this simple example C++ streams are 52x slower. I hope that there are ways to make it faster!
NOTE: Changing endl to '\n' in my example improved the C++ streams result, making it only 3x slower than the FILE* version (thanks jalf). There may be ways to make it faster still.
Difficulty of use
I can't argue that printf() is not terse, but it is more flexible (IMO) and simpler to understand once you get past the initial WTF of the format codes.
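// Each printf below is meant to produce the same output as the cout line above it.
double pi = 3.14285714;
// Default notation: setprecision(5) means 5 significant digits, like %.5g.
cout << "pi = " << setprecision(5) << pi << '\n';
printf( "pi = %.5g\n", pi );
// Fixed notation with an explicit sign and 3 decimals.
cout << "pi = " << fixed << showpos << setprecision(3) << pi << '\n';
printf( "pi = %+.3f\n", pi );
// Scientific notation; the precision of 3 set above still applies.
cout << "pi = " << scientific << noshowpos << pi << '\n';
printf( "pi = %.3e\n", pi );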
The Question
Yes, maybe there is a need for a better C++ library, and maybe FastFormat is that library. Only time will tell.
dave
File I/O in C - How to read from a file and then write to it?
When a file is opened for both reading and writing, a call to fseek() (or another file-positioning function) is needed when switching between the two operations. fseek(fp, 0, SEEK_CUR); satisfies that requirement without changing the position of the file pointer in the file.
#include <stdio.h>
#include <stdlib.h>

int main ( ) {
    int read = 0;
    int write = 48; // ASCII code for '0'
    int each = 0;
    FILE *fp;

    fp = fopen("z.txt", "w"); // create a file
    if (fp == NULL)
    {
        printf("Error while opening the file.\n");
        return 0;
    }
    fprintf ( fp, "abcdefghijklmnopqrstuvwxyz");
    fclose ( fp);

    fp = fopen("z.txt", "r+"); // open the file for read and write
    if (fp == NULL)
    {
        printf("Error while opening the file.\n");
        return 0;
    }
    for ( each = 0; each < 5; each++) {
        fputc ( write, fp);
        write++;
    }
    fseek ( fp, 0, SEEK_CUR); // finished with writes. switching to read
    for ( each = 0; each < 5; each++) {
        read = fgetc ( fp);
        printf ( "%c ", read);
    }
    printf ( "\n");
    fseek ( fp, 0, SEEK_CUR); // finished with reads. switching to write
    for ( each = 0; each < 5; each++) {
        fputc ( write, fp);
        write++;
    }
    fseek ( fp, 0, SEEK_CUR); // finished with writes. switching to read
    for ( each = 0; each < 5; each++) {
        read = fgetc ( fp);
        printf ( "%c ", read);
    }
    printf ( "\n");
    fclose ( fp);
    return 0;
}
output
the file initially contained
abcdefghijklmnopqrstuvwxyz
after the read and write, it contains
01234fghij56789pqrstuvwxyz
C Program search in FILE I/O
while (fscanf(fp, "%d %s %d %f", &a.id, a.name, &a.quantity, &a.price) == 4)
{
    fscanf(fp, "%d %s %d %f", &a.id, a.name, &a.quantity, &a.price);
    printf("%d %s %d %f\n\n", a.id, a.name, a.quantity, a.price);
}
You are calling fscanf twice per iteration, so every second line is skipped.
fgets(array, 255, (FILE*)fp);
printf("%d %s %d %f\n\n", a.id, a.name, a.quantity, a.price);
This part reads the line into array as text, but nothing parses it, so the printf just prints whatever is already in a. After fgets, it should use sscanf or strtok to parse the line.
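A minimal sketch of the fgets-then-sscanf approach mentioned above. The layout of struct product, the 64-byte name buffer, and the function names parse_product_line and count_valid_products are assumptions made for illustration; the original post does not show them.

```c
#include <stdio.h>

/* Hypothetical record layout; the question's real struct product is not shown. */
struct product {
    int id;
    char name[64];
    int quantity;
    float price;
};

/* Parse one text line such as "7 widget 3 1.50".
   Returns 1 on success, 0 if the line does not contain all four fields. */
int parse_product_line(const char *line, struct product *out)
{
    /* %63s limits the name to the buffer size minus the terminating NUL. */
    return sscanf(line, "%d %63s %d %f",
                  &out->id, out->name, &out->quantity, &out->price) == 4;
}

/* Read every line of fp with fgets and count the ones that parse. */
int count_valid_products(FILE *fp)
{
    char line[256];
    struct product p;
    int count = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        if (parse_product_line(line, &p))
            count++;
    }
    return count;
}
```

Because each line is read in full before it is parsed, a malformed line is simply skipped, whereas a bare fscanf loop stops at the first line that does not match.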
You may also have to discard any leftover input on stdin (note that fflush(stdin) is undefined behavior in standard C), otherwise scanf(" %c", &checker) may not work. Things get easier if you break this up into functions. Example:
/* Returns 1 if a record with the given id exists in the file, 0 otherwise. */
int find_item_by_id(const char* fname, int find_id)
{
    int found = 0;
    struct product a;
    FILE *fp = fopen(fname, "r");
    if (fp)
    {
        while (fscanf(fp, "%d %s %d %f", &a.id, a.name, &a.quantity, &a.price) == 4)
        {
            if (a.id == find_id)
            {
                found = 1;
                break;
            }
        }
        fclose(fp);
    }
    return found;
}

/* Prompts for one product and appends it unless the id is already present. */
void add_item(const char* fname)
{
    struct product a;
    printf("Enter product ID : ");
    scanf(" %d", &a.id);
    printf("Enter product name : ");
    scanf(" %s", a.name);
    printf("Enter product quantity : ");
    scanf(" %d", &a.quantity);
    printf("Enter product price : ");
    scanf(" %f", &a.price);
    if (find_item_by_id(fname, a.id) != 0)
    {
        printf("item already exists\n");
        return;
    }
    FILE *fp = fopen(fname, "a+");
    if (fp)
    {
        fprintf(fp, "%d %s %d %.2f\n", a.id, a.name, a.quantity, a.price);
        fclose(fp);
    }
}

int main()
{
    const char* filename = "c:\\test\\_test.txt";
    printf("list:\n");
    struct product a;
    FILE *fp = fopen(filename, "r");
    if (fp)
    {
        while (fscanf(fp, "%d %s %d %f", &a.id, a.name, &a.quantity, &a.price) == 4)
            printf("%d %s %d %f\n", a.id, a.name, a.quantity, a.price);
        fclose(fp);
    }
    while(1)
    {
        add_item(filename);
        printf("Add another item? (y/n)\n");
        int result;
        while (1)
        {
            /* Skip everything except y/n; stop at EOF to avoid looping forever. */
            result = getchar();
            if (result == 'y' || result == 'n' || result == EOF)
                break;
        }
        if (result != 'y')
            break;
    }
    return 0;
}
What are some best practices for file I/O in C?
Best practices in my eyes:
- Check every call to fopen, printf, puts, fprintf, fclose etc. for errors
- use getchar if you must, fread if you can
- use putchar if you must, fwrite if you can
- avoid arbitrary limits on input line length (might require malloc/realloc)
- know when you need to open output files in binary mode
- use Standard C, forget conio.h :-)
- newlines belong at the end of a line, not at the beginning of some text, i.e. it is printf ("hello, world\n"); and not "\nHello, world", like those misled by the Mighty William H. often write to cope with the silliness of their command shell. Outputting newlines first breaks line buffered I/O.
- if you need more than 7-bit ASCII, choose Unicode (the most common encoding is UTF-8, which is ASCII compatible). It's the last encoding you'll ever need to learn. Stay away from codepages and ISO-8859-*.
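Two of the points above, checking every call and avoiding a fixed line-length limit, can be sketched roughly as follows. The function names read_line and write_text, the starting capacity of 64, and the doubling strategy are illustrative assumptions, not taken from the answer. Input containing embedded NUL bytes is not handled.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one line of any length from fp into a malloc'ed buffer.
   Returns NULL at end of file, on a read error, or if memory runs out.
   The trailing newline, if any, is dropped. The caller frees the result. */
char *read_line(FILE *fp)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';           /* complete line read */
            return buf;
        }
        /* No newline yet: grow the buffer and keep reading. */
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (len == 0 || ferror(fp)) {          /* nothing read, or a read error */
        free(buf);
        return NULL;
    }
    return buf;                            /* last line had no newline */
}

/* Write text to path, checking fopen, fputs and fclose.
   Returns 0 on success, -1 on any failure. */
int write_text(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    if (fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    /* fclose flushes the buffer, so a write failure may only show up here. */
    return fclose(fp) == 0 ? 0 : -1;
}
```

Checking the result of fclose matters for output files because buffered data is written out at that point, so it can be the first call to report that the write failed.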
Performance Difference Between C and C++ Style File IO
You're using endl to print a newline. That is the problem here: endl does more than print a newline, it also flushes the buffer, which is an expensive operation if you do it in every iteration.
Use '\n' if that is what you mean:
file << i << '\n';
Also, make sure you compile your code in release mode (i.e. with optimizations turned on) before measuring.
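For readers who think in terms of C stdio, the effect of endl can be approximated by calling fflush after every line. The sketch below is only an analogy under that assumption; write_numbers and its parameters are hypothetical names, and it makes no claim about actual timings.

```c
#include <stdio.h>

/* Write `count` numbered lines to `path`.
   With flush_each_line != 0, fflush is called after every line, which is
   roughly what `file << i << endl` does in the C++ version.
   Returns 0 on success, -1 on failure. */
int write_numbers(const char *path, int count, int flush_each_line)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    for (int i = 0; i < count; i++) {
        if (fprintf(fp, "%d\n", i) < 0) {
            fclose(fp);
            return -1;
        }
        /* Forces the buffered data out to the OS on every iteration. */
        if (flush_each_line && fflush(fp) != 0) {
            fclose(fp);
            return -1;
        }
    }
    return fclose(fp) == 0 ? 0 : -1;
}
```

Both modes produce identical files; the only difference is how often the buffer is pushed out, which is the cost the answer attributes to endl.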
C vs C++ file handling
Sometimes there's existing code expecting one or the other that you need to interact with, which can affect your choice, but in general the C++ versions wouldn't have been introduced if there weren't issues with the C versions that they could fix. Improvements include:
- RAII semantics, which means e.g. fstreams close the files they manage when they leave scope
- modal ability to throw exceptions when errors occur, which can make for cleaner code focused on the typical/successful processing (see http://en.cppreference.com/w/cpp/io/basic_ios/exceptions for API function and example)
- type safety, such that how input and output is performed is implicitly selected using the variable type involved
  - C-style I/O has potential for crashes: e.g. int my_int = 32; printf("%s", my_int);, where %s tells printf to expect a pointer to an ASCIIZ character buffer but my_int appears instead; firstly, the argument passing convention may mean ints are passed differently to const char*s, secondly sizeof int may not equal sizeof const char*, and finally, even if printf extracts 32 as a const char*, at best it will just print random garbage from memory address 32 onwards until it coincidentally hits a NUL character - far more likely the process will lack permissions to read some of that memory and the program will crash. Modern C compilers can sometimes validate the format string against the provided arguments, reducing this risk.
- extensibility for user-defined types (i.e. you can teach streams how to handle your own classes)
- support for dynamically sizing receiving strings based on the actual input, whereas the C functions tend to need hard-coded maximum buffer sizes and loops in user code to assemble arbitrary sized input
Streams are also sometimes criticised for:
- verbosity of formatting, particularly "io manipulators" setting width, precision, base, padding, compared to the printf-style format strings
- a sometimes confusing mix of manipulators that persist their settings across multiple I/O operations and others that are reset after each operation
- lack of convenience class for RAII pushing/saving and later popping/restoring the manipulator state
- being slow, as Ben Voigt has commented and documented
File I/O Extraction with structures in C
I don't know what the format is. It can't be space-separated, since some of the fields have spaces in them. It doesn't look fixed-width. Because you mentioned strtok, I'm going to assume it's tab-separated.
If you can use strsep, use that. strtok has a lot of problems that strsep solves, but strsep isn't standard C. I'm going to assume this is some assignment requiring standard C, so I'll begrudgingly use strtok.
The basic thing to do is to read each line, and then split it into columns with strtok or strsep.
char line[1024];
while (fgets(line, sizeof(line), fp) != NULL) {
    char *column;
    int col_num = 0;
    for( column = strtok(line, "\t");
         column;
         column = strtok(NULL, "\t") )
    {
        col_num++;
        printf("%d: %s\n", col_num, column);
    }
}
fclose(fp);
strtok is funny. It keeps its own internal state of where it is in the string. The first time you call it, you pass it the string you're looking at. To get the rest of the fields, you call it with NULL and it will keep reading through that string. So that's why there's that funny for loop that looks like it's repeating itself.
Global state is dangerous and very error prone. strsep and strtok_r fix this. If you're being told to use strtok, find a better resource to learn from.
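If neither strsep nor strtok_r is available, one possible alternative that stays within standard C is to split the line by hand with strcspn. The sketch below is an illustration, not part of the original answer; split_tabs and its parameters are hypothetical names. Unlike strtok, which treats a run of consecutive tabs as a single separator and so silently drops empty columns, it keeps empty fields, and it has no hidden state.

```c
#include <string.h>

/* Split `line` in place on tab characters, storing up to max_fields
   pointers in fields[]. Empty fields are kept. A trailing line ending
   is removed first. Returns the number of fields stored. */
size_t split_tabs(char *line, char **fields, size_t max_fields)
{
    size_t n = 0;
    char *p = line;

    line[strcspn(line, "\r\n")] = '\0';   /* strip the line ending */

    while (n < max_fields) {
        size_t len = strcspn(p, "\t");    /* length up to next tab or end */
        fields[n++] = p;
        if (p[len] == '\0')
            break;                        /* that was the last field */
        p[len] = '\0';                    /* terminate this field */
        p += len + 1;                     /* continue after the tab */
    }
    return n;
}
```

As with strtok, the returned pointers point into line itself, so they are only valid until that buffer is overwritten by the next read.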
Now that we have each column and its position, we can do what we like with it. I'm going to use a switch to choose only the columns we want.
for( column = strtok(line, "\t");
     column;
     column = strtok(NULL, "\t") )
{
    col_num++;
    switch( col_num ) {
        case 1:
        case 2:
        case 3:
        case 4:
        case 5:
        case 9:
        case 10:
        case 13:
            printf("%s\t", column);
            break;
        default:
            break;
    }
}
puts("");
You can do whatever you like with the columns at this point. You can print them immediately, or put them in a list, or a structure.
Just remember that column is pointing to memory in line, and line will be overwritten. If you want to store column, you'll have to copy it first. You can do that with strdup but *sigh* that isn't standard C before C23. strcpy is really easy to use wrong. If you're stuck with standard C, write your own strdup.
char *mystrdup( const char *src ) {
    /* strlen, not sizeof: sizeof(src) is only the size of the pointer. */
    char *dst = malloc( strlen(src) + 1 );
    if( dst != NULL ) {
        strcpy( dst, src );
    }
    return dst;
}