Why and where are \n newline characters getting introduced to c()?

I doubt this is a bug. Instead, it looks like you're running into a known limitation of the console. As it says in Section 1.8 - R commands, case sensitivity, etc. of An Introduction to R:

Command lines entered at the console are limited[3] to about 4095 bytes (not characters).

[3] some of the consoles will not allow you to enter more, and amongst those which do some will silently discard the excess and some will use it as the start of the next line.

Either put the command in a file and source it, or break the code into multiple lines by inserting your own newlines at appropriate points (between commas). For example:

column_names <-
c("County Code/DFG/Aggregation Code", "District Code", "School Code",
"County Name", "District Name", "School Name", "DFG", "Special Needs",
"TOTAL POPULATION TOTAL POPULATION Number Enrolled LAL", ...)

What is the newline character in the C language: \r or \n?

It's \n. When you read or write files in text mode, or use stdin/stdout, you must use \n, and the C library will handle the translation to the platform's line ending for you. When you're dealing with binary files, no translation is done, so by definition you are on your own.

What does gets() save when it reads just a newline

This part in the description of gets might be confusing:

It takes all the characters up to (but not including) the newline

It might be better to say that it takes all the characters including the newline but stores all characters not including the newline.

So if the user enters some string, the gets function will read some string and the newline character from the user's terminal, but store only some string in the buffer - the newline character is lost. This is good, because no one wants the newline character anyway - it's a control character, not a part of the data that user wanted to enter.

Therefore, if you only press enter, gets interprets it as an empty string. Now, as noted by some people, your code has multiple bugs.


printf("This is the input as a string: %s\n", input);

No problem here, though you might want to delimit your string by some artificial characters for better debugging:

printf("This is the input as a string: '%s'\n", input);


printf("Is it the string end character? %d\n", input == '\0');

Not good: you want to check 1 byte here, not the whole buffer. In this expression input decays to a pointer to the buffer, and '\0' is an integer constant with value 0, which the compiler treats as a null pointer. So the comparison really asks "is the address of the buffer NULL?", and the answer is always false.

The right way is:

printf("Does the first byte contain the string end character? %d\n", input[0] == '\0');

This compares just 1 byte to \0.


printf("Is it a newline string? %d\n", input == "\n");

Not good: this compares the address of the buffer with the address of "\n" - the answer is always false. The right way to compare string in C is strcmp:

printf("Is it a newline string? %d\n", strcmp(input, "\n") == 0);

Note the peculiar usage: strcmp returns 0 when the strings are equal.


printf("Is it the empty string? %d\n", input == "");

The same bug here. Use strcmp here too:

printf("Is it the empty string? %d\n", strcmp(input, "") == 0);
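The corrected checks can be collected into small helpers. This is only a minimal sketch: the names is_empty_string and is_newline_string are made up for illustration and do not appear in the original code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True when s is the empty string: only the first byte needs checking. */
static bool is_empty_string(const char *s)
{
    assert(s != NULL);
    return s[0] == '\0';
}

/* True when s consists of exactly one newline character.
   strcmp compares contents, not addresses, and returns 0 on equality. */
static bool is_newline_string(const char *s)
{
    assert(s != NULL);
    return strcmp(s, "\n") == 0;
}
```

With a buffer filled by gets after pressing only Enter, is_empty_string would be true; with fgets, is_newline_string would be true instead.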


BTW as people always say, gets cannot be used in a secure way, because it doesn't support protection from buffer overflow. So you should use fgets instead, even though it's less convenient:

char input[100];
while (fgets(input, sizeof input, stdin))
{
...
}

This leads to possible confusion: fgets doesn't delete the newline byte from the input it reads. So if you replace gets in your code by fgets, you will get different results. Fortunately, your code will illustrate the difference in a clear way.
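To see what fgets actually stores, it can help to look at the length of the buffer after each call. The helper below is a hypothetical sketch (stored_length is not a standard function); it takes the stream as a parameter so it can be tried on any FILE *, including stdin.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Reads one line with fgets and reports how many characters were stored,
   or -1 on end-of-file or error. The newline, if read, is counted. */
static int stored_length(FILE *in, char *buf, int size)
{
    assert(in != NULL && buf != NULL && size > 0);
    if (fgets(buf, size, in) == NULL)
        return -1;
    return (int)strlen(buf);
}
```

If the user presses only Enter, the stored length is 1 and the buffer holds "\n", whereas gets would have stored the empty string.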

Can't figure out why getchar() is picking up newline for first occurrence in C

On the first prompt, you type something like a followed by Enter, so your input stream contains the characters 'a', '\n'. The first getchar call reads the a and leaves the newline in the input stream.

In response to the second prompt, you type bc followed by Enter, so your input stream now contains '\n', 'b', 'c', '\n'.

You can probably figure out what happens from here - the next getchar call reads that newline character from the input stream.

There are a couple of ways to deal with this. One is to test your input, and try again if it's a newline:

do
{
a = getchar();
} while ( a == '\n' ); // or while( isspace( a )), if you want to reject
// any whitespace character.
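The same loop can be wrapped in a function. This is a sketch under one assumption: it takes a FILE * argument (the name next_non_newline is hypothetical) and uses fgetc, which behaves like getchar when passed stdin.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the next character from `in` that is not a newline,
   or EOF if the stream ends first. */
static int next_non_newline(FILE *in)
{
    int ch;

    assert(in != NULL);
    do {
        ch = fgetc(in);
    } while (ch == '\n');   /* EOF is not '\n', so the loop also stops at end of input */
    return ch;
}
```

Calling next_non_newline(stdin) at each prompt would return a, then b, then c for the input described above.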

Another is to not use getchar; instead, use scanf with the %c conversion specifier and a blank space in the format string:

scanf( " %c", &c ); // you will need to change the types of your 
... // variables from int to char for this.
scanf( " %c", &a );
scanf( " %c", &b );
scanf( " %c", &c );

The leading space in the format string tells scanf to ignore any leading whitespace, so you won't pick up the newline character.
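The effect of the leading space can be checked without a terminal by using sscanf, which applies the same format rules to a string. The two helper names below are made up for this sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Reads one character with "%c": leading whitespace is NOT skipped.
   Returns '\0' if nothing could be read. */
static char read_char_raw(const char *input)
{
    char c = '\0';

    assert(input != NULL);
    if (sscanf(input, "%c", &c) != 1)
        return '\0';
    return c;
}

/* Reads one character with " %c": leading whitespace IS skipped.
   Returns '\0' if nothing could be read. */
static char read_char_skipping_ws(const char *input)
{
    char c = '\0';

    assert(input != NULL);
    if (sscanf(input, " %c", &c) != 1)
        return '\0';
    return c;
}
```

For the input "\nb", read_char_raw returns the leftover newline, while read_char_skipping_ws returns 'b'.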

Does gets() stop reading when it reaches '\r' or '\n' or '\r\n'?

From the C Standard (5.2.2 Character display semantics)

\n (new line) Moves the active position to the initial position of the
next line.

And (7.21.2 Streams)

2 A text stream is an ordered sequence of characters composed into
lines, each line consisting of zero or more characters plus a
terminating new-line character. Whether the last line requires a
terminating new-line character is implementation-defined. Characters
may have to be added, altered, or deleted on input and output to
conform to differing conventions for representing text in the host
environment. Thus, there need not be a one-to-one correspondence
between the characters in a stream and those in the external
representation. Data read in from a text stream will necessarily
compare equal to the data that were earlier written out to that stream
only if: the data consist only of printing characters and the control
characters horizontal tab and new-line; no new-line character is
immediately preceded by space characters; and the last character is a
new-line character. Whether space characters that are written out
immediately before a new-line character appear when read in is
implementation-defined.

Thus the new line character is the character '\n'.

Take into account that the function gets is unsafe and was removed from the C Standard in C11.

scanf() leaves the newline character in the buffer

The scanf() function skips leading whitespace automatically before trying to parse conversions other than characters. The character formats (primarily %c; also scan sets %[…] and %n) are the exception; they don't skip whitespace.

Use " %c" with a leading blank to skip optional white space. Do not use a trailing blank in a scanf() format string.

Note that this still doesn't consume any trailing whitespace left in the input stream, not even to the end of a line, so beware of that if also using getchar() or fgets() on the same input stream. We're just getting scanf to skip over whitespace before conversions, like it does for %d and other non-character conversions.


Note that non-whitespace "directives" (to use POSIX scanf terminology) other than conversions, like the literal text in scanf("order = %d", &order);, don't skip whitespace either. The literal order has to match the next characters to be read.

So you probably want " order = %d" there if you want to skip a newline from the previous line but still require a literal match on a fixed string, as in this question.
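A small sketch of the difference, again using sscanf so the input is a fixed string. The function names parse_order_strict and parse_order_lenient are hypothetical; each returns the number of successful conversions reported by sscanf.

```c
#include <assert.h>
#include <stdio.h>

/* Format without a leading blank: the literal "order" must match
   the very first character of the input. */
static int parse_order_strict(const char *input, int *order)
{
    assert(input != NULL && order != NULL);
    return sscanf(input, "order = %d", order);
}

/* Format with a leading blank: any leading whitespace, including a
   newline left over from the previous line, is skipped first. */
static int parse_order_lenient(const char *input, int *order)
{
    assert(input != NULL && order != NULL);
    return sscanf(input, " order = %d", order);
}
```

Given "\norder = 7", the strict version fails to match and converts nothing, while the lenient version stores 7.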

fgets() goes newline when storing string but gets() no issue about newline

"...but gets() no issue about newline"

Note that although gets() may look preferable to fgets() in this case because of how it handles the newline, the unfavorable behaviors that come with gets() make it dangerous to use, with the result that "it was officially removed by the 2011 standard." (credit) Even without the \n mitigations mentioned below, fgets() is highly preferred over gets().

"fgets() goes newline when storing string..." and "...why does fgets() automatically enter a newline for each input of string"

fgets() does not enter the newline upon reading the line; rather, if one exists, the newline is picked up as part of the line when fgets() is called. For example, in this case, when using stdin as the input method, the user presses <return> to finish inputting text. Upon hitting the <return> key, a \n is entered just like any other character, and becomes the last character entered. When the line is read using fgets(), if the \n is seen before any of its other stop-reading criteria, fgets() stops reading, stores all characters including the \n in the buffer, and terminates the line with \0. (If sizeof(buffer) - 1 characters are read or EOF is seen first, fgets() will never see the newline.)

To easily eliminate the \n (or other typical unwanted line endings), use a single-line statement like the following after each of your calls to fgets():

fgets(user.id, ID_SIZE, stdin);
user.id[strcspn(user.id, "\n")] = 0;
//fflush(stdin);//UB, should not be called
...
fgets(user.fname, MAX_FNAME_SIZE, stdin);
user.fname[strcspn(user.fname, "\n")] = 0;
...
fgets(user.lname, MAX_LNAME_SIZE, stdin);
user.lname[strcspn(user.lname, "\n")] = 0;
...

This technique works for truncating any string by searching for the unwanted char, whether it be "\n", "\n\r", "\r\n", etc. When using more than one search character, eg "\r\n", it searches until it reaches either the \r or the \n and terminates at that position.
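As a sketch of the multi-character case, the helper below (the name chomp is borrowed from other languages and is not a C library function) cuts the string at the first '\r' or '\n', whichever comes first.

```c
#include <assert.h>
#include <string.h>

/* Truncates s at the first '\r' or '\n'. If neither is present,
   strcspn returns strlen(s), so the existing terminator is simply
   overwritten with itself and the string is unchanged. */
static char *chomp(char *s)
{
    assert(s != NULL);
    s[strcspn(s, "\r\n")] = '\0';
    return s;
}
```

Because strcspn returns 0 for an empty string, this is also safe when the buffer begins with '\0'.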

"This [method] handles the rare buffer than begins with '\0', something that causes grief for the buffer[strlen(buffer) - 1] = '\0'; [method]." (@Chux - comment section of link below.)

Credit here

Behavior of fgets(), newline character and how it gets stored in the memory

fgets(buf, n, stream) reads input and saves it into buf until 1 of 4 things happens.

  • Buffer is nearly full. Once n-1 characters are read (and saved), '\0' is appended to buf. Function returns buf. There are likely remaining characters in stream to read. [1]

  • '\n' is read from stream. '\n' is appended to buf. '\0' is appended to buf. Function returns buf. The line has been completely read.

  • End-of-file occurs. Had some characters been read before, '\0' is appended to buf. Function returns buf. Else NULL is returned.

  • Input error (rare). NULL is returned. The state of buf is indeterminate.

The only thing different about reading '\n' versus other characters is that it informs fgets() to stop reading.


[1] Should a full buffer get read without '\n', the rest of the line can be read and tossed like this:

int ch;
while ((ch = fgetc(stream)) != '\n' && ch != EOF) {
;
}
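Putting the cases together, a line reader might look roughly like the sketch below. The function name read_line and its return codes are assumptions made for this example, not an established API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Reads one line into buf (at most size-1 characters), without the newline.
   Returns  1 if a newline or end-of-file ended the line before the buffer filled,
            0 if the buffer filled before a newline was seen (the rest of the
              line is read and discarded),
           -1 on end-of-file or error with nothing read. */
static int read_line(FILE *stream, char *buf, int size)
{
    size_t len;
    int ch;

    assert(stream != NULL && buf != NULL && size > 1);
    if (fgets(buf, size, stream) == NULL)
        return -1;

    len = strcspn(buf, "\n");
    if (buf[len] == '\n') {          /* the whole line fit in the buffer */
        buf[len] = '\0';
        return 1;
    }
    if (len < (size_t)size - 1)      /* short read: last line had no newline */
        return 1;

    /* Buffer filled without seeing '\n': read and toss the rest of the line. */
    while ((ch = fgetc(stream)) != '\n' && ch != EOF) {
        ;
    }
    return 0;
}
```

One caveat of this sketch: a line of exactly size-1 characters also returns 0, because the newline did not fit in the buffer even though no data was lost.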

Removing trailing newline character from fgets() input

The elegant way:

Name[strcspn(Name, "\n")] = 0;

The slightly ugly way:

char *pos;
if ((pos = strchr(Name, '\n')) != NULL) {
    *pos = '\0';
} else {
    /* input too long for buffer, flag error */
}

The slightly strange way:

strtok(Name, "\n");

Note that the strtok function doesn't work as expected if the user enters an empty string (i.e. presses only Enter). It leaves the \n character intact.
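The empty-input pitfall is easy to reproduce. In this sketch the two wrapper names are hypothetical; they apply the strtok and strcspn approaches to the same kind of buffer.

```c
#include <assert.h>
#include <string.h>

/* The "slightly strange way": strtok skips leading delimiters, so for
   the input "\n" it finds no token and leaves the buffer untouched. */
static void strip_with_strtok(char *s)
{
    assert(s != NULL);
    (void)strtok(s, "\n");
}

/* The strcspn way: the first '\n' is overwritten even at index 0. */
static void strip_with_strcspn(char *s)
{
    assert(s != NULL);
    s[strcspn(s, "\n")] = '\0';
}
```

For "Bob\n" both leave "Bob" in the buffer, but for "\n" only the strcspn version produces an empty string.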

There are others as well, of course.


