wc -m in Unix Adds One Character

wc -m in unix adds one character

As hexdump has shown you, whatever editor you are using is adding a '\n' or 0x0A (new line) character at the end of the line when you save the file, even if you aren't writing one explicitly.

See: http://www.asciitable.com/

Why does wc count one extra character in my file?

You have a trailing newline, and that is what wc reports.

See for example if we create a file with printf:

$ printf "hello" > a
$ cat a | hexdump -c
0000000 h e l l o
0000005
$ wc a
0 1 5 a

However, if we write with something like echo, a trailing new line is appended:

$ echo "hello" > a
$ cat a | hexdump -c
0000000 h e l l o \n
0000006
$ wc a
1 1 6 a

wc -m in Linux gives a value one too high

TL;DR

Use awk's length() function:

md5sum file.png | awk '{print length($1)}'
32

That's because awk's print appends a line feed character to the output. You can check:

md5sum file.png | awk '{print $1}' | xxd

You can tell awk not to do that by setting ORS, the output record separator variable:

md5sum file.png | awk '{print $1}' ORS='' | wc -m
32
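If you'd rather avoid awk entirely, the trailing newline can also be stripped with tr before counting. This is just a sketch using the same md5sum output; cut and tr are standard coreutils:

```shell
# Keep only the hash field, delete the newline, then count characters:
md5sum file.png | cut -d' ' -f1 | tr -d '\n' | wc -m
# 32
```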

Why wc adds one

try:

echo -n "stuff"|wc

echo adds a newline, so if you count bytes or characters, there is at least one.

See the following examples:

kent$  echo ""|wc -c
1

kent$ echo -n ""|wc -c
0

kent$ echo ""|wc -m
1

kent$ echo -n ""|wc -m
0

If you count words, there is no difference:

kent$  echo  -n ""|wc -w
0

kent$ echo ""|wc -w
0

wc character count seems inflated by 1

echo will add a newline character after the output by default; use echo -n to avoid this. Also, wc -c counts bytes; use wc -m for a character count.
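The byte-versus-character distinction only matters for multibyte encodings. Assuming a UTF-8 locale, a character like 'é' occupies two bytes, so the two counts diverge:

```shell
# One character, two UTF-8 bytes; printf avoids the trailing newline:
printf 'é' | wc -c    # byte count: 2
printf 'é' | wc -m    # character count: 1 (in a UTF-8 locale)
```

For plain ASCII input the two flags give identical results, which is why the difference often goes unnoticed.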

wc -c counting wrong number of characters unix

Since you're interested in counting the number of columns, you could use awk for this:

Using your input:

$ cat file
A,0,0,0,21,36,12,0,0,0,17.2,34,18,17.2,30.5,96,126,517,2399,2,111.83,38.583,111,1,0,0,0,0,0,0

gives:

$ awk -F, '{print NF}' file
30

If you're interested in the number of commas:

$ head -1 file | awk -F, '{print NF-1}'
29
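For comparison, here is a wc-only alternative (a sketch against the same sample file): tr -cd deletes every character except the comma, and since tr emits no trailing newline, wc -c then counts exactly the commas:

```shell
# Delete every character that is not a comma, then count bytes:
head -1 file | tr -cd ',' | wc -c
# 29
```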

BTW, I think you meant to call wc -m to count the characters.

WC command of mac showing one less result

The last line does not end with a newline. wc -l counts newline characters, so it reports one fewer line.

One trick to get the result you want would be:

sed -n '=' <yourfile> | wc -l

This tells sed to print just the line number of each line in your file, which wc then counts. There are probably better solutions, but this works.
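Another common workaround, shown here as a sketch with a hypothetical sample.txt: awk still increments NR for a final line that lacks a trailing newline, so it reports the count you'd expect:

```shell
# Two lines of data, but no newline after the last one:
printf 'first\nsecond' > sample.txt
wc -l < sample.txt                    # 1 -- wc counts newline characters
awk 'END { print NR }' sample.txt     # 2 -- awk counts records
```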

newline sequence counts as one character?

\n is a way of representing a newline character in various languages and programs but as the name suggests, a newline is only stored in a file as a single character.

The backslash helps both computers and humans to realise you are referring to a newline character without you having to actually type one, which would be confusing in a lot of instances.
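You can verify the single-character claim directly: the two-character source sequence \n in a printf format string produces just one byte in the output:

```shell
printf 'a\n' | wc -c    # 2: the letter plus one newline byte
printf 'a'   | wc -c    # 1: no newline at all
```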

wc -m seems to stop while loop in bash

The code shown does not do what you think (and claim in your question).

Your curl command fetches the page and writes it to stdout: you are not keeping that output for later use. Then, your wc has no arguments, so it reads from stdin. And stdin holds the list of usernames from $filename, so the number being computed is not the character count of the page but the count of the characters remaining in the file. Once those have been consumed, there is nothing left on stdin to read, so the loop ends because it reached the end of the file.

You are looking for something like:

#!/bin/bash
filename="$1"

set -o pipefail
rm -f output.txt
while read -r username; do
    x=$(curl -fs "http://example.website.domain/$username/index.html" | wc -m)
    if [ $? -eq 0 ]; then
        echo "$username $x" >> output.txt
    else
        echo "The page doesn't exist"
    fi
done < "$filename"

Here, the fetched page is fed directly to wc. If curl fails you won't see that (by default, the exit code of a series of piped commands is the exit code of the last one), so we use set -o pipefail to get the exit status of the rightmost command in the pipeline that exits with a nonzero value. Now you can check whether everything went OK, and in that case, write the result.
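A minimal demonstration of the pipefail behaviour described above (bash-specific; false | true stands in for a failing curl piped into a succeeding wc):

```shell
false | true; echo $?    # 0: only the last command's status counts
set -o pipefail
false | true; echo $?    # 1: the failing command's status wins
```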

I also added an rm of the output file to make sure we are not growing an existing one, and changed the redirection to the output file to an append, to avoid re-creating the file on each iteration and ending up with only the result of the last iteration (thanks to @tripleee for noting this).

Update (by popular request):

The pattern:

<cmd>
if [ $? -eq 0 ]...

is usually a bad idea. It is better to go for:

if <cmd>...

So it would be better if you switch to:

if x=$(curl -fs "http://example.website.domain/$username/index.html" | wc -m); then
echo...

