Stop on Newline When Using Read(...)

Does gets() stop reading when it reaches '\r', '\n', or '\r\n'?

From the C Standard (5.2.2 Character display semantics)

\n (new line) Moves the active position to the initial position of the
next line.

And (7.21.2 Streams)

2 A text stream is an ordered sequence of characters composed into
lines, each line consisting of zero or more characters plus a
terminating new-line character. Whether the last line requires a
terminating new-line character is implementation-defined. Characters
may have to be added, altered, or deleted on input and output to
conform to differing conventions for representing text in the host
environment. Thus, there need not be a one-to-one correspondence
between the characters in a stream and those in the external
representation. Data read in from a text stream will necessarily
compare equal to the data that were earlier written out to that stream
only if: the data consist only of printing characters and the control
characters horizontal tab and new-line; no new-line character is
immediately preceded by space characters; and the last character is a
new-line character. Whether space characters that are written out
immediately before a new-line character appear when read in is
implementation-defined.

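Thus the new-line character is '\n', and that is the character at which gets() stops reading. A platform that stores line endings as '\r\n' may translate them to a single '\n' when the file is read as a text stream, which is the kind of alteration the passage above allows.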

Keep in mind that gets() is unsafe, because it cannot limit how many characters it writes into the buffer, and that it was removed from the C Standard in C11.
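As a rough illustration of the usual replacement, here is a minimal sketch built on fgets(), which takes the buffer size and so cannot overrun it. The helper name read_line is hypothetical, not a standard function. Unlike gets(), fgets() keeps the '\n' in the buffer, so the sketch strips it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the C library): reads one line from `in`
 * into buf, storing at most size - 1 characters, and removes the trailing
 * '\n' if one was read. Returns buf on success, or NULL on end-of-file or
 * read error. */
char *read_line(char *buf, int size, FILE *in) {
    assert(buf != NULL && size > 0 && in != NULL);

    if (fgets(buf, size, in) == NULL) {
        return NULL;
    }

    /* strcspn() returns the index of the first '\n', or the index of the
     * terminating '\0' if there is none, so this is safe either way. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

A line longer than the buffer is returned in pieces across successive calls. A '\r' that survives in the stream is left in place, since only '\n' is treated as the line terminator.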

Java : read file and stop at new line context

If you're looking to break your input file into pieces delimited by a blank line, something like this might work:

import java.io.FileReader;
import java.util.Scanner;

FileReader fr = new FileReader(path); // path: the txt file mentioned above
Scanner s = new Scanner(fr);
while (s.hasNextLine()) {
    String paragraph = "";
    while (s.hasNextLine()) {
        // nextLine() returns whole lines; next() would return single tokens
        String line = s.nextLine();
        if (line.length() == 0)
            break; // a blank line ends the paragraph
        if (paragraph.length() != 0)
            paragraph = paragraph + "\n";
        paragraph = paragraph + line;
    }
    // Do something with paragraph here...
}

read() from stdin doesn't ignore newline

You ask:

When inputting data from standard input, generally the user presses enter when done. But read() considers '\n' as input too in which case n = 1 and the conditional doesn't evaluate to false.

The first point is certainly true. Pressing Enter makes the terminal deliver a newline character, and read() returns that character along with the rest of the line. It is crucial that it does so.

Therefore, your condition is misguided - an empty line will include the newline and therefore the byte count will be one. Indeed, there's only one way to get the read() call to return 0 when the standard input is the keyboard, and that's to type the 'EOF' character - typically control-D on Unix, control-Z on DOS. On Unix, that character is interpreted by the terminal driver as 'send the previous input data to the program even if there is no newline yet'. And if the user has typed nothing else on the line, then the return from read() will be zero. If the input is coming from a file, then after the last data is read, subsequent reads will return 0 bytes.

If the input is coming from a pipe, then after all the data in the pipe is read, the read() call will block until the last file descriptor that can write to the pipe is closed; if that file descriptor is in the current process, then the read() will hang forever, even though the hung process will never be able to write() to the file descriptor - assuming a single-threaded process, of course.
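To make the two points above concrete, here is a minimal sketch of a loop that treats only a return value of 0 as end of input, and counts newlines as ordinary data. The function name count_bytes_until_eof is hypothetical, and the buffer size is an arbitrary choice.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: reads from fd until read() reports end of file and
 * returns the total number of bytes seen, newlines included.
 * Returns -1 if read() fails for a reason other than an interrupted call. */
long count_bytes_until_eof(int fd) {
    assert(fd >= 0);

    char buf[256];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n == 0) {
            /* End of file: no more data in a file, every write end of a
             * pipe closed, or the EOF character typed on an empty line. */
            return total;
        }
        if (n < 0) {
            if (errno == EINTR) {
                continue; /* interrupted by a signal, try again */
            }
            return -1;
        }
        total += n; /* an empty line still contributes its '\n' */
    }
}
```

With a pipe, the loop returns only once the write end has been closed. If the calling process still holds that write end open, the read() blocks exactly as described above.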

Ignoring \n when reading from a file in C?

C doesn't provide much in the way of conveniences; you have to provide them all yourself or use a third-party library such as GLib. If you're new to C, get used to it. You're working very close to the bare metal.

Generally you read a file line by line with fgets(), or my preference, POSIX getline(), and strip the final newline yourself by looking at the last index and replacing it with a null character ('\0') if it's a newline.

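#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *line = NULL;
size_t line_capacity = 0; /* getline() will allocate line memory */

/* fp is a FILE * that has already been opened for reading */
while( getline( &line, &line_capacity, fp ) > 0 ) {
    /* getline() returned at least one character, so this cannot underflow */
    size_t last_idx = strlen(line) - 1;

    if( line[last_idx] == '\n' ) {
        line[last_idx] = '\0';
    }

    /* No double newline */
    puts(line);
}

free(line); /* getline() allocated the buffer, so the caller frees it */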

You can put this into a little function for convenience. In many languages it's referred to as chomp.

#include <stdbool.h>
#include <string.h>

/* Removes one trailing newline from str, if present.
   Returns true if a newline was removed. */
bool chomp( char *str ) {
    size_t len = strlen(str);

    /* Empty string */
    if( len == 0 ) {
        return false;
    }

    size_t last_idx = len - 1;
    if( str[last_idx] == '\n' ) {
        str[last_idx] = '\0';
        return true;
    }
    else {
        return false;
    }
}

It will be educational for you to implement fgets and getline yourself to understand how reading lines from a file actually works.
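As a starting point for that exercise, here is a simplified sketch of an fgets() look-alike built on fgetc(). The name my_fgets is hypothetical, and the sketch does not reproduce every detail of the real function. For instance, it does not distinguish a read error from end of file once some characters have been stored.

```c
#include <assert.h>
#include <stdio.h>

/* Simplified sketch of fgets(): stores characters from `in` into buf until
 * a '\n' has been stored, size - 1 characters have been stored, or the
 * input ends. The result is always null-terminated.
 * Returns buf, or NULL if nothing could be read. */
char *my_fgets(char *buf, int size, FILE *in) {
    assert(buf != NULL && in != NULL);

    if (size <= 0) {
        return NULL;
    }

    int i = 0;
    while (i < size - 1) {
        int c = fgetc(in); /* int, so EOF is distinct from every character */
        if (c == EOF) {
            if (i == 0) {
                return NULL; /* end of input before any character */
            }
            break;
        }
        buf[i++] = (char)c;
        if (c == '\n') {
            break; /* the newline is kept, as with the real fgets() */
        }
    }

    buf[i] = '\0';
    return buf;
}
```

The newline stays in the buffer, which is why a chomp-style helper is needed afterwards. A line longer than size - 1 characters comes back in several pieces.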


