tellg() Function Gives Wrong Size of File

tellg() function gives wrong size of file?

tellg does not report the size of the file, nor the offset
from the beginning in bytes. It reports a token value which can
later be used to seek to the same place, and nothing more.
(It's not even guaranteed that you can convert the type to an
integral type.)

At least according to the language specification: in practice,
on Unix systems, the value returned will be the offset in bytes
from the beginning of the file, and under Windows, it will be
the offset from the beginning of the file for files opened in
binary mode. For Windows (and most non-Unix systems), in text
mode, there is no direct and immediate mapping between what
tellg returns and the number of bytes you must read to get to
that position. Under Windows, all you can really count on is
that the value will be no less than the number of bytes you have
to read (and in most real cases, won't be too much greater,
although it can be up to two times more).

If it is important to know exactly how many bytes you can read,
the only way of reliably doing so is by reading. You should be
able to do this with something like:

#include <limits>

file.ignore( std::numeric_limits<std::streamsize>::max() );
std::streamsize length = file.gcount();
file.clear(); // Since ignore will have set eof.
file.seekg( 0, std::ios_base::beg );
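
Wrapped up as a function, the fragment above might look like the following. This is only a sketch: the name readable_length is made up for illustration, and it assumes the stream is already open and positioned at the start.

```cpp
#include <fstream>
#include <ios>
#include <limits>

// Hypothetical helper: counts the characters that can be read from the
// current position, then rewinds the stream to the beginning.
std::streamsize readable_length(std::ifstream& file)
{
    file.ignore(std::numeric_limits<std::streamsize>::max());
    const std::streamsize length = file.gcount();
    file.clear();                       // ignore() reached end of file and set eofbit
    file.seekg(0, std::ios_base::beg);  // go back so the caller can read the data
    return length;
}
```

The price is a full pass over the file before the real read, which is the trade-off for not relying on tellg.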

Finally, two other remarks concerning your code:

First, the line:

*buffer = new char[length];

shouldn't compile: you have declared buffer to be a char*,
so *buffer has type char, and is not a pointer. Given what
you seem to be doing, you probably want to declare buffer as
a char**. But a much better solution would be to declare it
as a std::vector<char>& or a std::string&. (That way, you
don't have to return the size as well, and you won't leak memory
if there is an exception.)
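
As a rough illustration of the std::vector<char>& suggestion, something like this should work. The function name read_file and the bool return value are assumptions made for the sketch, not part of the original code.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: reads the whole file into `out`.
// Returns false (and leaves `out` empty) if the file cannot be opened.
bool read_file(const std::string& path, std::vector<char>& out)
{
    std::ifstream file(path, std::ios::binary);
    if (!file) {
        out.clear();
        return false;
    }
    // The vector grows as bytes arrive, so no size has to be known up front,
    // and its destructor releases the memory even if an exception is thrown.
    out.assign(std::istreambuf_iterator<char>(file),
               std::istreambuf_iterator<char>());
    return true;
}
```

The caller gets the size from out.size(), and the return value tells it whether the file could be opened at all.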

Second, the loop condition at the end is wrong. If you really
want to read one character at a time,

while ( file.get( buffer[i] ) ) {
++ i;
}

should do the trick. A better solution would probably be to
read blocks of data:

while ( file.read( buffer + i, N ) || file.gcount() != 0 ) {
i += file.gcount();
}

or even:

file.read( buffer, size );
size = file.gcount();

EDIT: I just noticed a third error: if you fail to open the
file, you don't tell the caller. At the very least, you should
set the size to 0 (but some sort of more precise error
handling is probably better).

Find size of binary file, function tellg() returns -1

You can get the size of a file like this

#include <stdio.h>

long long filesize(const char *fname)
{
int ch;
FILE *fp;
long long answer = 0;
fp = fopen(fname, "rb");
if(!fp)
return -1;
while( (ch = fgetc(fp)) != EOF)
answer++;
fclose(fp);
return answer;
}

It's portable, and whilst it does a pass over the file, usually you'll have to pass through the file anyway, so you're not blowing up the big O efficiency of your function. Plus fgetc() is highly optimised for buffering.
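
If you would rather stay with C++ streams, the same counting idea can be sketched with block reads instead of one character at a time. The name count_bytes and the 64 KiB block size are arbitrary choices for this example.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper: counts bytes by reading the file in 64 KiB blocks.
// Returns -1 if the file cannot be opened.
std::int64_t count_bytes(const std::string& path)
{
    std::ifstream file(path, std::ios::binary);
    if (!file) return -1;
    char block[65536];
    std::int64_t total = 0;
    // read() fails on the final short block, but gcount() still reports
    // how many bytes that last call delivered.
    while (file.read(block, sizeof block) || file.gcount() != 0) {
        total += file.gcount();
    }
    return total;
}
```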

Fstream's tellg / seekg returning higher value than expected

At a guess, you're opening the file in translated mode, probably under Windows. When you simply seek to the end of the file, the current position doesn't take the line-end translations into account. The end of a line (in the external file) is marked with the pair "\r\n" -- but when you read it in, that's converted to just a "\n". When you use getline to read one line at a time, the \ns all get discarded as well, so even on a system (e.g. Unix/Linux) that does no translation from external to internal representation, you can still expect those to give different sizes.

Then again, you should really forget that new [] exists at all. If you want to read an entire file into a string, try something like this:

std::stringstream continut;
continut << fisier.rdbuf();

continut.str() is then an std::string containing the data from the file.
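
A self-contained version of that idea could look like this. The function name slurp is made up, and returning an empty string on failure is just one possible convention (it cannot be told apart from an empty file).

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the file's contents, or an empty string
// if the file cannot be opened.
std::string slurp(const std::string& path)
{
    // Binary mode, so no line-end translation changes the byte count.
    std::ifstream in(path, std::ios::binary);
    if (!in) return std::string();
    std::ostringstream contents;
    contents << in.rdbuf();   // copies everything up to end of file
    return contents.str();
}
```

The number of bytes actually read is then simply the size() of the returned string.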

Using C++ filestreams (fstream), how can you determine the size of a file?

You can open the file with the ios::ate flag (and the ios::binary flag), so that tellg() gives you the file size directly:

ifstream file( "example.txt", ios::binary | ios::ate);
return file.tellg();
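
A slightly more defensive sketch of the same approach follows. The name size_via_tellg is hypothetical, and note the assumption baked in: the standard does not promise that the position converts to a byte count, although on Unix, and on Windows in binary mode, it does in practice.

```cpp
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper: returns the end position as a number, or -1 if the
// file cannot be opened. Treating the result as a byte count is an
// assumption about the platform, not a guarantee of the language.
long long size_via_tellg(const std::string& path)
{
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file) return -1;
    return static_cast<long long>(std::streamoff(file.tellg()));
}
```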

How can I determine the current size of the file opened by std::ofstream?

An fstream can be both an input and an output stream. tellg() returns the input position and tellp() returns the output position. After appending to a file, tellp() tells you its size.

Consider initializing your Logger like this (edit: added example for output stream operator):

#include <iostream>
#include <fstream>
#include <string>

class Logger {
std::string m_filename;
std::ofstream m_os;
std::ofstream::pos_type m_curr_size;
std::ofstream::pos_type m_max_size;
public:
Logger(const std::string& logfile, std::ofstream::pos_type max_size) :
m_filename(logfile),
m_os(m_filename, std::ios::app),
m_curr_size(m_os.tellp()),
m_max_size(max_size)
{}

template<typename T>
friend Logger& operator<<(Logger&, const T&);
};

template<typename T>
Logger& operator<<(Logger& log, const T& msg) {
log.m_curr_size = (log.m_os << msg << std::flush).tellp();

if(log.m_curr_size>log.m_max_size) {
log.m_os.close();
//rename & compress
log.m_os = std::ofstream(log.m_filename, std::ios::app);
log.m_curr_size = log.m_os.tellp();
}
return log;
}

int main()
{
Logger test("log", 4LL*1024*1024*1024*1024);
test << "hello " << 10 << "\n";
return 0;
}

If you use C++17 or have an experimental version of <filesystem> available, you could also use that to get the absolute file size, like this:

#include <cstdint>
#include <iostream>
#include <fstream>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

class Logger {
fs::directory_entry m_logfile;
std::ofstream m_os;
std::uintmax_t m_max_size;

void rotate_if_needed() {
if(max_size_reached()) {
m_os.close();
//rename & compress
m_os = std::ofstream(m_logfile.path(), std::ios::app);
}
}
public:
Logger(const std::string& logfile, std::uintmax_t max_size) :
m_logfile(logfile),
m_os(m_logfile.path(), std::ios::app),
m_max_size(max_size)
{
// make sure the path is absolute in case the process has
// changed its current directory by the time we rotate the log
if(m_logfile.path().is_relative())
m_logfile = fs::directory_entry(fs::absolute(m_logfile.path()));
}

std::uintmax_t size() const { return m_logfile.file_size(); }
bool max_size_reached() const { return size()>m_max_size; }

template<typename T>
friend Logger& operator<<(Logger&, const T&);
};

template<typename T>
Logger& operator<<(Logger& log, const T& msg) {
log.m_os << msg << std::flush;
log.rotate_if_needed();
return log;
}

int main()
{
Logger test("log", 4LL*1024*1024*1024*1024);
std::cout << test.size() << "\n";
test << "hello " << 10 << "\n";
std::cout << test.size() << "\n";
test << "some more " << 3.14159 << "\n";
std::cout << test.size() << "\n";
return 0;
}

How can I get the real size of a file with C++?

To get a file's size and other information, such as its creation and modification times, its owner and its permissions, you can use the stat() function.

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

struct stat s;
if (stat("path/to/file.txt", &s)) {
// error
}

// st_size is an off_t, so cast it to match the format specifier
printf("filesize in bytes: %lld\n", (long long)s.st_size);

Documentation:

  • Linux version: stat(2)

  • Windows version: _stat, _stat32, _stat64, _stati64, _stat32i64, _stat64i32, _wstat, _wstat32, _wstat64, _wstati64, _wstat32i64, _wstat64i32

Tellg returning unexpected value

A couple of things:

  • First and foremost, you're not reading line by line, so there
    is no reason to assume that you advance the number of characters
    in a line each time through the loop. If you want to read line
    by line, use std::getline, and then extract the fields from
    the line, either using std::istringstream or some other
    method.

  • The result of tellg is not an integer, and when converted to
    an integral type (not necessarily possible), there is no
    guaranteed relationship with the number of bytes you have
    extracted. On Unix machines, the results will correspond, and
    under Windows if (and only if) the file has been opened in
    binary mode. On other systems, there may be no visible
    relationship whatsoever. The only valid portable use of the
    results of tellg is to pass it to a seekg later; anything
    else depends on the implementation.

  • How do you know that each line contains exactly 76 characters?
    Depending on how the file was produced, there might be a BOM at
    the start (which would count as three characters if the file is
    encoded in UTF-8 and you are in the "C" locale). And what about
    trailing whitespace? Again, if your input is line oriented, you
    should be reading lines, and then parsing them.

  • Finally, but perhaps the most important: you're using the
    results of >> without verifying that the operator worked. In
    your case, the output suggests that it did, but you can never be
    sure without verifying.

Globally, your loop should look like:

std::string line;
while ( std::getline( i, line ) ) {
std::istringstream l( line );
std::string a;
std::string b;
std::string c;
std::string d;
std::string e;
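// Check the field extraction first, then that only whitespace remains.
// (std::ws may set failbit if the last field already reached the end.)
if ( !( l >> a >> b >> c >> d >> e ) || !( l >> std::ws ).eof() ) {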
// Format error in line...
} else {
// ...
}
}

Outputting tellg still won't tell you anything, but at least
you'll read the input correctly. (Outputting the length of
line might be useful in some cases.)
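
For reference, here is a self-contained sketch of that loop. The Fields type and the parse_lines name are made up, and five whitespace-separated fields per line is assumed from the snippet above. It tests the extraction before skipping trailing whitespace, so a line that ends right after the fifth field is still accepted.

```cpp
#include <array>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type: five whitespace-separated fields per line.
using Fields = std::array<std::string, 5>;

// Reads `in` line by line; well-formed lines are appended to `rows`,
// and the return value is the number of malformed lines.
std::size_t parse_lines(std::istream& in, std::vector<Fields>& rows)
{
    std::size_t bad = 0;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream l(line);
        Fields f;
        // Reject lines with fewer than five fields or with extra text
        // (anything other than whitespace) after the fifth field.
        if (!(l >> f[0] >> f[1] >> f[2] >> f[3] >> f[4]) ||
            !(l >> std::ws).eof()) {
            ++bad;
        } else {
            rows.push_back(f);
        }
    }
    return bad;
}
```

Passing a std::istream& means the same function works on an std::ifstream or, for testing, on an std::istringstream.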

How can I get a file's size in C++?

#include <fstream>

std::ifstream::pos_type filesize(const char* filename)
{
std::ifstream in(filename, std::ifstream::ate | std::ifstream::binary);
return in.tellg();
}

See http://www.cplusplus.com/doc/tutorial/files/ for more information on files in C++.

edit: this answer is not correct since tellg() does not necessarily return the right value. See http://stackoverflow.com/a/22986486/1835769


