C++ and C File I/O

Opinion

I don't know of any real project that uses C++ streams. They are too slow and difficult to use. There are several newer libraries, like FastFormat and the Boost version, that claim to be better; there was a piece in the last ACCU Overload magazine about them. Personally, I have used the C FILE library for the last 15 years or so in C++, and I can see no reason yet to change.

Speed

Here is a small test program (knocked together quickly) to show the basic speed problem:

#include <stdio.h>
#include <time.h>

#include <iostream>
#include <fstream>

using namespace std;

int main( int argc, const char* argv[] )
{
    const int max = 1000000;
    const char* teststr = "example";

    int start = time(0);
    FILE* file = fopen( "example1", "w" );
    for( int i = 0; i < max; i++ )
    {
        fprintf( file, "%s:%d\n", teststr, i );
    }
    fclose( file );
    int end = time(0);

    printf( "C FILE: %ds\n", end-start );

    start = time(0);
    ofstream outdata;
    outdata.open("example2.dat");
    for( int i = 0; i < max; i++ )
    {
        outdata << teststr << ":" << i << endl;
    }
    outdata.close();
    end = time(0);

    printf( "C++ Streams: %ds\n", end-start );

    return 0;
}

And the results on my PC:

C FILE: 5s
C++ Streams: 260s

Process returned 0 (0x0) execution time : 265.282 s
Press any key to continue.

As we can see, even this simple example is 52x slower with C++ streams. I hope that there are ways to make it faster!

NOTE: changing endl to '\n' in my example improved C++ streams, making them only 3x slower than the FILE* streams (thanks jalf). There may be ways to make them faster still.

Difficulty of use

I can't argue that printf() is not terse, but it is more flexible (IMO) and simpler to understand once you get past the initial WTF of the format codes.

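double pi = 3.14285714;

// default float format, 5 significant digits: prints "pi = 3.1429"
cout << "pi = " << setprecision(5) << pi << '\n';
printf( "pi = %.5g\n", pi );

// fixed, explicit sign, 3 decimals: prints "pi = +3.143"
cout << "pi = " << fixed << showpos << setprecision(3) << pi << '\n';
printf( "pi = %+.3f\n", pi );

// scientific; the precision of 3 set above still applies: prints "pi = 3.143e+00"
cout << "pi = " << scientific << noshowpos << pi << '\n';
printf( "pi = %.3e\n", pi );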

The Question

Yes, maybe there is a need for a better C++ library, and maybe FastFormat is that library. Only time will tell.

dave

File I/O in C - How to read from a file and then write to it?

When a file is opened for both reading and writing, a file positioning call such as fseek() is needed when switching between the two operations. fseek( fp, 0, SEEK_CUR); satisfies that requirement without changing the position of the file pointer in the file.

#include <stdio.h>
#include <stdlib.h>

int main ( ) {
    int read = 0;
    int write = 48; /* character code of '0' in ASCII */
    int each = 0;
    FILE *fp;

    fp = fopen("z.txt", "w");//create a file
    if (fp == NULL)
    {
        printf("Error while opening the file.\n");
        return EXIT_FAILURE;
    }
    fprintf ( fp, "abcdefghijklmnopqrstuvwxyz");
    fclose ( fp);

    fp = fopen("z.txt", "r+");//open the file for read and write
    if (fp == NULL)
    {
        printf("Error while opening the file.\n");
        return EXIT_FAILURE;
    }

    for ( each = 0; each < 5; each++) {
        fputc ( write, fp);
        write++;
    }
    fseek ( fp, 0, SEEK_CUR);//finished with writes. switching to read

    for ( each = 0; each < 5; each++) {
        read = fgetc ( fp);
        printf ( "%c ", read);
    }
    printf ( "\n");
    fseek ( fp, 0, SEEK_CUR);//finished with reads. switching to write

    for ( each = 0; each < 5; each++) {
        fputc ( write, fp);
        write++;
    }
    fseek ( fp, 0, SEEK_CUR);//finished with writes. switching to read

    for ( each = 0; each < 5; each++) {
        read = fgetc ( fp);
        printf ( "%c ", read);
    }
    printf ( "\n");

    fclose ( fp);
    return 0;
}

output

the file initially contained

abcdefghijklmnopqrstuvwxyz

after the read and write, it contains

01234fghij56789pqrstuvwxyz

C Program search in FILE I/O

while (fscanf(fp, "%d %s %d %f", &a.id, a.name, &a.quantity, &a.price) == 4)
{
    fscanf(fp, "%d %s %d %f", &a.id, a.name, &a.quantity, &a.price);
    printf("%d %s %d %f\n\n", a.id, a.name, a.quantity, a.price);
}

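You are calling fscanf twice per iteration. The record read in the loop condition is overwritten by the second call before it is printed, so every second line is skipped.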

fgets(array, 255, (FILE*)fp);
printf("%d %s %d %f\n\n", a.id, a.name, a.quantity, a.price);

This part reads a line as text. It should then use sscanf or strtok to parse the line.
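As a rough sketch of that fgets plus sscanf approach: the struct layout below is an assumption (the question's definition is not shown), and parse_product and list_products are hypothetical helper names.

```c
#include <stdio.h>

/* Assumed layout; the struct from the question is not shown. */
struct product {
    int id;
    char name[64];
    int quantity;
    float price;
};

/* Parses one text line such as "12 widget 3 4.50" into *out.
   Returns 1 when all four fields were converted, 0 otherwise. */
int parse_product(const char *line, struct product *out)
{
    /* %63s stops before overflowing name[64] */
    return sscanf(line, "%d %63s %d %f",
                  &out->id, out->name, &out->quantity, &out->price) == 4;
}

/* Reads fp line by line, prints every record that parses and
   returns the number of such records. Malformed lines are skipped. */
int list_products(FILE *fp)
{
    char line[256];
    struct product a;
    int count = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        if (parse_product(line, &a)) {
            printf("%d %s %d %.2f\n", a.id, a.name, a.quantity, a.price);
            count++;
        }
    }
    return count;
}
```

Because each line is read exactly once and parsed separately, a malformed line only costs that one record instead of derailing the rest of the file.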

You may also have to discard leftover characters in stdin, otherwise scanf(" %c", &checker) may not work as expected. It is easier to break this up into functions. Example:

#include <stdio.h>

/* Assumed layout: the struct definition from the question is not shown. */
struct product
{
    int id;
    char name[64];
    int quantity;
    float price;
};

/* Returns 1 if the file contains a record with the given id, 0 otherwise. */
int find_item_by_id(const char* fname, int find_id)
{
    int found = 0;
    struct product a;
    FILE *fp = fopen(fname, "r");
    if (fp)
    {
        /* %63s keeps the name within name[64] */
        while (fscanf(fp, "%d %63s %d %f", &a.id, a.name, &a.quantity, &a.price) == 4)
        {
            if (a.id == find_id)
            {
                found = 1;
                break;
            }
        }
        fclose(fp);
    }

    return found;
}

/* Prompts for one product and appends it to the file unless its id exists. */
void add_item(const char* fname)
{
    struct product a;
    printf("Enter product ID : ");
    if (scanf(" %d", &a.id) != 1)
        return;

    printf("Enter product name : ");
    if (scanf(" %63s", a.name) != 1)
        return;

    printf("Enter product quantity : ");
    if (scanf(" %d", &a.quantity) != 1)
        return;

    printf("Enter product price : ");
    if (scanf(" %f", &a.price) != 1)
        return;

    if (find_item_by_id(fname, a.id) != 0)
    {
        printf("item already exists\n");
        return;
    }

    FILE *fp = fopen(fname, "a+");
    if (fp)
    {
        fprintf(fp, "%d %s %d %.2f\n", a.id, a.name, a.quantity, a.price);
        fclose(fp);
    }
}

int main()
{
    const char* filename = "c:\\test\\_test.txt";

    printf("list:\n");
    struct product a;
    FILE *fp = fopen(filename, "r");
    if (fp)
    {
        while (fscanf(fp, "%d %63s %d %f", &a.id, a.name, &a.quantity, &a.price) == 4)
            printf("%d %s %d %f\n", a.id, a.name, a.quantity, a.price);
        fclose(fp);
    }

    while(1)
    {
        add_item(filename);

        printf("Add another item? (y/n)\n");
        int result;
        while (1)
        {
            result = getchar();
            /* stop on EOF as well, otherwise this loop never ends */
            if (result == 'y' || result == 'n' || result == EOF)
                break;
        }
        if (result != 'y')
            break;
    }

    return 0;
}
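For the leftover input in stdin mentioned above, one common approach is to read and throw away the rest of the current line. Note that fflush(stdin) is not a portable way to do this, since fflush is only defined for output streams in standard C. A minimal sketch, with discard_line as a hypothetical helper name:

```c
#include <stdio.h>

/* Hypothetical helper: reads and discards the rest of the current line
   from fp, including the newline. Returns 0 once a newline has been
   consumed, or EOF if the input ended first. */
int discard_line(FILE *fp)
{
    int c;

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            return 0;
    }
    return EOF;
}
```

Calling discard_line(stdin) after a scanf that left part of a line unread means the next read starts on a fresh line.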

What are some best practices for file I/O in C?

Best practices in my eyes:

  • Check every call to fopen, printf, puts, fprintf, fclose etc. for errors
  • use getchar if you must, fread if you can
  • use putchar if you must, fwrite if you can
  • avoid arbitrary limits on input line length (might require malloc/realloc)
  • know when you need to open output files in binary mode
  • use Standard C, forget conio.h :-)
  • newlines belong at the end of a line, not at the beginning of some text, i.e. it is printf ("hello, world\n"); and not "\nHello, world" like those misled by the Mighty William H. often write to cope with the silliness of their command shell. Outputting newlines first breaks line buffered I/O.
  • if you need more than 7bit ASCII, choose Unicode (the most common encoding is UTF-8 which is ASCII compatible). It's the last encoding you'll ever need to learn. Stay away from codepages and ISO-8859-*.
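Two of the points above, checking for errors and avoiding a fixed line length limit, can be sketched together. This is only one possible shape, built on fgets and realloc; read_line is a hypothetical helper name and the starting capacity of 64 is arbitrary.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line of any length from fp into a
   malloc'ed buffer, without the trailing newline. Returns NULL at end
   of file, on a read error, or when memory runs out. The caller frees
   the result. */
char *read_line(FILE *fp)
{
    size_t cap = 64;
    size_t len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';      /* complete line: strip the newline */
            return buf;
        }
        if (cap - len < 2) {          /* buffer full: double it and keep reading */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
    }

    if (len == 0 || ferror(fp)) {     /* nothing read, or a read error */
        free(buf);
        return NULL;
    }
    return buf;                       /* last line had no newline */
}
```

A caller would loop with while ((s = read_line(fp)) != NULL) { ... free(s); } and check ferror(fp) afterwards to tell a read error apart from a normal end of file.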

Performance Difference Between C and C++ Style File IO

You're using endl to print a newline. That is the problem here, because endl does more than print a newline: it also flushes the buffer, which is an expensive operation if you do it in each iteration.

Use '\n' if a newline is all you want:

file << i << '\n';

Also, you must compile your code in release mode (i.e. with optimizations turned on).
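The cost of flushing on every line is not specific to C++ streams. With the C API, an explicit fflush after each fprintf plays roughly the same role as endl. The sketch below uses a hypothetical helper, write_lines, so the two variants can be timed against each other. No timings are given here because they depend on the machine and the file system.

```c
#include <stdio.h>

/* Hypothetical helper: writes count lines of the form "example:N" to path.
   When flush_each_line is nonzero the buffer is flushed after every line,
   which is roughly what std::endl does to a C++ stream.
   Returns 0 on success, -1 on any I/O error. */
int write_lines(const char *path, int count, int flush_each_line)
{
    FILE *fp = fopen(path, "w");
    int i;
    int failed = 0;

    if (fp == NULL)
        return -1;

    for (i = 0; i < count && !failed; i++) {
        if (fprintf(fp, "example:%d\n", i) < 0)
            failed = 1;
        else if (flush_each_line && fflush(fp) == EOF)
            failed = 1;
    }

    if (fclose(fp) == EOF)   /* fclose flushes whatever is still buffered */
        failed = 1;

    return failed ? -1 : 0;
}
```

Both variants produce the same file contents. Only the number of times the buffer is handed to the operating system differs.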

C vs C++ file handling

Sometimes there's existing code expecting one or the other that you need to interact with, which can affect your choice, but in general the C++ versions wouldn't have been introduced if there weren't issues with the C versions that they could fix. Improvements include:

  • RAII semantics, which means e.g. fstreams close the files they manage when they leave scope

  • modal ability to throw exceptions when errors occur, which can make for cleaner code focused on the typical/successful processing (see http://en.cppreference.com/w/cpp/io/basic_ios/exceptions for API function and example)

  • type safety, such that how input and output is performed is implicitly selected using the variable type involved

    • C-style I/O has potential for crashes: e.g. int my_int = 32; printf("%s", my_int);, where %s tells printf to expect a pointer to an ASCIIZ character buffer but my_int appears instead; firstly, the argument passing convention may mean ints are passed differently to const char*s, secondly sizeof int may not equal sizeof const char*, and finally, even if printf extracts 32 as a const char* at best it will just print random garbage from memory address 32 onwards until it coincidentally hits a NUL character - far more likely the process will lack permissions to read some of that memory and the program will crash. Modern C compilers can sometimes validate the format string against the provided arguments, reducing this risk.
  • extensibility for user-defined types (i.e. you can teach streams how to handle your own classes)

  • support for dynamically sizing receiving strings based on the actual input, whereas the C functions tend to need hard-coded maximum buffer sizes and loops in user code to assemble arbitrary sized input

Streams are also sometimes criticised for:

  • verbosity of formatting, particularly "io manipulators" setting width, precision, base, padding, compared to the printf-style format strings

  • a sometimes confusing mix of manipulators that persist their settings across multiple I/O operations and others that are reset after each operation

  • lack of convenience class for RAII pushing/saving and later popping/restoring the manipulator state

  • being slow, as Ben Voigt comments and documents here

File I/O Extraction with structures in C

I don't know what the format is. It can't be space-separated, since some of the fields have spaces in them. It doesn't look fixed-width. Because you mentioned strtok, I'm going to assume it's tab-separated.

You can use strsep for that. strtok has a lot of problems that strsep solves, but strsep isn't standard C. I'm going to assume this is some assignment requiring standard C, so I'll begrudgingly use strtok.

The basic thing to do is to read each line, and then split it into columns with strtok or strsep.

char line[1024];
while (fgets(line, sizeof(line), fp) != NULL) {
    char *column;
    int col_num = 0;
    for( column = strtok(line, "\t");
         column;
         column = strtok(NULL, "\t") )
    {
        col_num++;

        printf("%d: %s\n", col_num, column);
    }
}
fclose(fp);

strtok is funny. It keeps its own internal state of where it is in the string. The first time you call it, you pass it the string you're looking at. To get the rest of the fields, you call it with NULL and it will keep reading through that string. So that's why there's that funny for loop that looks like it's repeating itself.

Global state is dangerous and very error prone. strsep and strtok_r fix this. If you're being told to use strtok, find a better resource to learn from.
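If neither strsep nor strtok_r is available, a small splitter that keeps its position in a caller-owned pointer is easy to write in standard C. The sketch below is an assumption-laden stand-in, not the real strsep: next_field and print_columns are hypothetical names, and only a single delimiter character is supported. Unlike strtok, it keeps empty fields, so column numbers do not shift when a field is blank.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, similar in spirit to strsep() but limited to a
   single delimiter character. Returns the next field and advances
   *cursor; returns NULL when no fields remain. Empty fields are kept. */
char *next_field(char **cursor, char delim)
{
    char *start = *cursor;
    char *end;

    if (start == NULL)
        return NULL;

    end = strchr(start, delim);
    if (end != NULL) {
        *end = '\0';          /* terminate this field in place */
        *cursor = end + 1;    /* next call continues after the delimiter */
    } else {
        *cursor = NULL;       /* this was the last field */
    }
    return start;
}

/* Prints each tab-separated column of line with its number and
   returns the number of columns. The line is modified in place. */
int print_columns(char *line)
{
    char *cursor = line;
    char *column;
    int col_num = 0;

    line[strcspn(line, "\r\n")] = '\0';   /* drop the trailing newline */
    while ((column = next_field(&cursor, '\t')) != NULL) {
        col_num++;
        printf("%d: %s\n", col_num, column);
    }
    return col_num;
}
```

Because the position lives in cursor rather than in hidden global state, two lines can be split at the same time without interfering with each other.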

Now that we have each column and its position, we can do what we like with it. I'm going to use a switch to choose only the columns we want.

    for( column = strtok(line, "\t");
         column;
         column = strtok(NULL, "\t") )
    {
        col_num++;

        switch( col_num ) {
            case 1:
            case 2:
            case 3:
            case 4:
            case 5:
            case 9:
            case 10:
            case 13:
                printf("%s\t", column);
                break;
            default:
                break;
        }
    }

    puts("");

You can do whatever you like with the columns at this point. You can print them immediately, or put them in a list, or a structure.

Just remember that column is pointing to memory in line and line will be overwritten. If you want to store column, you'll have to copy it first. You can do that with strdup but *sigh* that isn't standard C. strcpy is really easy to use wrong. If you're stuck with standard C, write your own strdup.

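#include <stdlib.h>
#include <string.h>

char *mystrdup( const char *src ) {
    /* strlen, not sizeof: sizeof(src) is only the size of the pointer */
    char *dst = malloc( strlen(src) + 1 );
    if( dst != NULL )
        strcpy( dst, src );
    return dst;
}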

