How to cut a string using length in unix shell
Given your new requirements, is this what you're trying to do:
$ cat tst.awk
BEGIN { FIELDWIDTHS="9 18 11 5" }
NR==FNR { f2[$1]=$2; f3[$1]=$3; next }
$1 in f2 { print $1 f2[$1] f3[$1] $4 $5 }
$ awk -f tst.awk file1 file2
000123 moorsevi har NC asee terel
000125 staevil strd NC klass aklsd
000126 carolie asdr NC skdkld kaks
The above uses GNU awk for FIELDWIDTHS.
Get the book Effective Awk Programming, 4th Edition, by Arnold Robbins.
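Where GNU awk is not available, the same fixed-width idea can be sketched with cut -c, which selects character positions. This is only an illustration, not the answer's method: the record is made up, the helper name cutField is hypothetical, and the widths 9 and 18 are borrowed from the FIELDWIDTHS line above. It assumes ASCII data, since some cut implementations count bytes rather than characters.

```shell
# Build a hypothetical fixed-width record: 9 + 18 + 11 + 5 characters.
record=$(printf '%-9s%-18s%-11s%-5s' 000123 moorsevi har NC)

# cutField RECORD START END: print characters START..END (1-based, inclusive).
cutField() {
    printf '%s\n' "$1" | cut -c"$2-$3"
}

id=$(cutField "$record" 1 9)       # first 9 characters
name=$(cutField "$record" 10 27)   # next 18 characters
printf '[%s] [%s]\n' "$id" "$name"
```

The extracted fields keep their trailing padding, just as FIELDWIDTHS fields do.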
How to cut a string after a specific character in unix
Using sed:
$ var=server@10.200.200.20:/home/some/directory/file
$ echo "$var" | sed 's/.*://'
/home/some/directory/file
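As a possible alternative that avoids starting sed, shell parameter expansion can do the same cut. The sketch below reuses the value of var from above; the variable name path is only illustrative.

```shell
var=server@10.200.200.20:/home/some/directory/file

# ${var##*:} removes the longest prefix matching '*:', i.e. everything
# up to and including the last ':' (like the greedy sed pattern).
path=${var##*:}
printf '%s\n' "$path"
```

Using a single # instead of ## would stop at the first ':' rather than the last; with only one ':' in the string, both give the same result.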
Length of string in bash
UTF-8 string length
In addition to fedorqui's correct answer, I would like to show the difference between string length and byte length:
myvar='Généralités'
chrlen=${#myvar}
oLang=$LANG oLcAll=$LC_ALL
LANG=C LC_ALL=C
bytlen=${#myvar}
LANG=$oLang LC_ALL=$oLcAll
printf "%s is %d char len, but %d bytes len.\n" "${myvar}" $chrlen $bytlen
will render:
Généralités is 11 char len, but 14 bytes len.
You can even have a look at the stored bytes:
myvar='Généralités'
chrlen=${#myvar}
oLang=$LANG oLcAll=$LC_ALL
LANG=C LC_ALL=C
bytlen=${#myvar}
printf -v myreal "%q" "$myvar"
LANG=$oLang LC_ALL=$oLcAll
printf "%s has %d chars, %d bytes: (%s).\n" "${myvar}" $chrlen $bytlen "$myreal"
will print:
Généralités has 11 chars, 14 bytes: ($'G\303\251n\303\251ralit\303\251s').
Note: Following Isabell Cowan's comment, I've added setting $LC_ALL along with $LANG.
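If switching locales feels fragile, one alternative sketch is to count bytes with wc -c, which does not depend on LANG or LC_ALL. The function name byteLen is hypothetical, and the example assumes the script itself is saved as UTF-8.

```shell
# byteLen STRING: print the number of bytes in STRING.
# wc -c counts bytes whatever the locale; the arithmetic expansion
# strips the leading blanks that some wc implementations print.
byteLen() {
    printf '%s\n' "$(( $(printf '%s' "$1" | wc -c) ))"
}

byteLen 'Généralités'
```

This costs a pipeline per call, so the locale trick above is likely the better choice inside tight loops.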
Length of an argument, working sample
Arguments work the same as regular variables:
showStrLen() {
local bytlen sreal oLang=$LANG oLcAll=$LC_ALL
LANG=C LC_ALL=C
bytlen=${#1}
printf -v sreal %q "$1"
LANG=$oLang LC_ALL=$oLcAll
printf "String '%s' is %d bytes, but %d chars len: %s.\n" "$1" $bytlen ${#1} "$sreal"
}
works like this:
showStrLen théorème
String 'théorème' is 10 bytes, but 8 chars len: $'th\303\251or\303\250me'.
Useful printf correction tool:
If you run:
for string in Généralités Language Théorème Février "Left: ←" "Yin Yang ☯";do
printf " - %-14s is %2d char length\n" "'$string'" ${#string}
done
- 'Généralités' is 11 char length
- 'Language' is 8 char length
- 'Théorème' is 8 char length
- 'Février' is 7 char length
- 'Left: ←' is 7 char length
- 'Yin Yang ☯' is 10 char length
Not really pretty output: printf pads by bytes, not characters, so the columns do not line up!
To compensate for this, here is a little function:
strU8DiffLen() {
local charlen=${#1} LANG=C LC_ALL=C
return $(( ${#1} - charlen ))
}
or written in one line:
strU8DiffLen() { local chLen=${#1} LANG=C LC_ALL=C;return $((${#1}-chLen));}
Now:
for string in Généralités Language Théorème Février "Left: ←" "Yin Yang ☯";do
strU8DiffLen "$string"
printf " - %-$((14+$?))s is %2d chars length, but uses %2d bytes\n" \
"'$string'" ${#string} $((${#string}+$?))
done
- 'Généralités' is 11 chars length, but uses 14 bytes
- 'Language' is 8 chars length, but uses 8 bytes
- 'Théorème' is 8 chars length, but uses 10 bytes
- 'Février' is 7 chars length, but uses 8 bytes
- 'Left: ←' is 7 chars length, but uses 9 bytes
- 'Yin Yang ☯' is 10 chars length, but uses 12 bytes
Unfortunately, this is not perfect!
Some strange UTF-8 behaviour remains, such as double-width chars, zero-width chars, reversed (right-to-left) displacement and other cases that are not so simple to handle...
Have a look at diffU8test.sh or diffU8test.sh.txt for more limitations.
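As a variation on the exit-status trick above, the padding can also be computed from the character count and printed separately, so printf never has to measure the multibyte string itself. This is a minimal sketch with a hypothetical helper name, padRight; it relies on ${#1} counting characters in the current locale and shares the same limitations for double-width and zero-width characters.

```shell
# padRight STRING WIDTH: print STRING followed by enough spaces to
# reach WIDTH characters, counted with ${#...} in the current locale.
padRight() {
    local pad=$(( $2 - ${#1} ))
    [ "$pad" -lt 0 ] && pad=0      # never truncate, never pad negatively
    printf '%s%*s' "$1" "$pad" ''
}

for string in Généralités Language "Left: ←"; do
    printf ' - %s is %2d chars\n' "$(padRight "'$string'" 14)" "${#string}"
done
```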
Extract substring in Bash
Use cut:
echo 'someletters_12345_moreleters.ext' | cut -d'_' -f 2
More generic:
INPUT='someletters_12345_moreleters.ext'
SUBSTRING=$(echo "$INPUT" | cut -d'_' -f 2)
echo "$SUBSTRING"
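For comparison, the same field can be pulled out with parameter expansion alone, without running cut. This is a sketch, not part of the original answer; the intermediate variable rest is just an illustrative name.

```shell
INPUT='someletters_12345_moreleters.ext'

rest=${INPUT#*_}          # drop everything up to the first '_'  -> 12345_moreleters.ext
SUBSTRING=${rest%%_*}     # drop everything from the next '_' on -> 12345
printf '%s\n' "$SUBSTRING"
```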
How to split a string in shell and get the last field
You can use string operators:
$ foo=1:2:3:4:5
$ echo ${foo##*:}
5
This trims everything from the front until a ':', greedily.
${foo <-- from variable foo
## <-- greedy front trim
* <-- matches anything
: <-- until the last ':'
}
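The other trim operators follow the same pattern, so a short side-by-side sketch may help. The variable names last, init and first are only illustrative.

```shell
foo=1:2:3:4:5

last=${foo##*:}    # longest prefix match removed  -> 5
init=${foo%:*}     # shortest suffix match removed -> 1:2:3:4
first=${foo%%:*}   # longest suffix match removed  -> 1
printf '%s | %s | %s\n' "$last" "$init" "$first"
```

Here # and ## trim from the front, % and %% trim from the back, and the doubled form is the greedy one.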
How to get the length of each word in a column without AWK, sed or a loop?
while read -r num word; do
printf '%s %s %s\n' "$num" "$word" "${#word}"
done < file
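To try the loop without creating a file, it can be wrapped in a function that reads standard input. The function name wordLengths and the two sample lines are made up for this sketch.

```shell
# wordLengths: read "number word" lines from stdin and append each word's length.
wordLengths() {
    while read -r num word; do
        printf '%s %s %s\n' "$num" "$word" "${#word}"
    done
}

# Hypothetical two-column input standing in for "file".
printf '%s\n' '1 apple' '2 kiwi' | wordLengths
```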
How to grep only the first word of the output
You can use awk to print just the first column of the output:
[ /Downloads - 11:34 AM ]$ du -s /Users/test_user
80839384 /Users/test_user
[ /Downloads - 11:34 AM ]$ du -s /Users/test_user | awk '{print $1}'
80839384
[ /Downloads - 11:34 AM ]$
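The awk filter can be checked on its own, without running du, by feeding it a line of the same shape. In this sketch the wrapper name firstWord is hypothetical and the input line simply mimics the du -s output shown above.

```shell
# firstWord: print the first whitespace-separated field of each input line.
firstWord() {
    awk '{ print $1 }'
}

# Hypothetical line in the shape du -s produces (size, tab, path).
printf '80839384\t/Users/test_user\n' | firstWord
```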