How to Specify More Spaces for the Delimiter Using Cut

How to specify more spaces for the delimiter using cut?

Actually awk is exactly the tool you should be looking into:

ps axu | grep '[j]boss' | awk '{print $5}'

or you can ditch the grep altogether since awk knows about regular expressions:

ps axu | awk '/[j]boss/ {print $5}'

But if, for some bizarre reason, you really can't use awk, there are other simpler things you can do, like collapse all whitespace to a single space first:

ps axu | grep '[j]boss' | sed 's/\s\s*/ /g' | cut -d' ' -f5

That grep trick, by the way, is a neat way to get only the jboss processes and not the grep jboss one (the same applies to the awk variant).

The grep process will have a literal grep [j]boss in its command line, so it will not be caught by the grep itself, which is looking for the character class [j] followed by boss.

This is a nifty way to avoid the | grep xyz | grep -v grep paradigm that some people use.
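A minimal sketch of why the bracket works, using an invented two-line sample in place of real ps axu output (the process lines, PIDs and the /opt/jboss path are assumptions for illustration only):

```shell
# Hypothetical stand-in for `ps axu` output: one jboss process, plus the grep
# process as it would appear when searching with the bracketed pattern.
sample='user  101  0.0  0.1  5000  900 ?  S  10:00  0:00 /opt/jboss/bin/run.sh
user  202  0.0  0.0  1000  500 ?  S  10:01  0:00 grep [j]boss'

# The regex [j]boss matches the text "jboss". The second line contains the
# literal text "[j]boss", where "]" sits between "j" and "boss", so it does
# not match.
matches=$(printf '%s\n' "$sample" | grep '[j]boss')
printf '%s\n' "$matches"
```

Only the first sample line is printed.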

cut: can we set multiple spaces as the delimiter?

No, you cannot. If you want to be able to use more than one character (or even a regex) for the delimiter then use awk instead.
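As a rough illustration of the difference, here is a sketch on an invented sample line (the variable names are only for this example):

```shell
# Sample line with runs of spaces between columns (invented for illustration).
line='alpha    beta  gamma'

# cut treats every single space as a delimiter, so field 2 is the empty
# string between the first and second space.
cut_field=$(printf '%s\n' "$line" | cut -d ' ' -f 2)

# awk's default field splitting treats any run of blanks as one separator.
awk_field=$(printf '%s\n' "$line" | awk '{print $2}')

# -F also accepts a regular expression, here "one or more spaces".
awk_regex_field=$(printf '%s\n' "$line" | awk -F ' +' '{print $3}')

printf 'cut: [%s]\nawk: [%s]\nawk -F: [%s]\n' "$cut_field" "$awk_field" "$awk_regex_field"
```

This should print an empty pair of brackets for cut, then beta and gamma for the two awk calls.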

Use space as a delimiter with cut command


cut -d ' ' -f 2

Where 2 is the field number of the space-delimited field you want.
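For instance, assuming input where fields are separated by exactly one space (the sample text is made up):

```shell
# Single-space-delimited sample (invented for illustration).
line='alpha beta gamma'

# Field 2 only.
second=$(printf '%s\n' "$line" | cut -d ' ' -f 2)

# Field 2 through the end of the line.
rest=$(printf '%s\n' "$line" | cut -d ' ' -f 2-)

printf '%s\n%s\n' "$second" "$rest"
```

The first line printed is beta and the second is beta gamma.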

Include white spaces while using CUT command

I'm assuming that this part of your script is a typo:

val=`cut -c7-18 $c_line`
echo $var >> $targetfile
# ^ should be $val

So the problem is that you're using $val, when you should be using "$val".

When you expand a variable without quotes, word-splitting occurs, so echo sees two arguments:

echo 'First' 'last'

These arguments are printed, separated by a single space.
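A small sketch of the effect, assuming the default IFS and a made-up value for val:

```shell
val='First   last'   # three spaces between the words

# Unquoted: the shell splits the value into two words, and echo joins its
# arguments with a single space.
unquoted=$(echo $val)

# Quoted: echo receives one argument, so the spaces are preserved.
quoted=$(echo "$val")

printf '[%s]\n[%s]\n' "$unquoted" "$quoted"
```

The unquoted form prints [First last], while the quoted form keeps all three spaces.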

How to make the 'cut' command treat the same sequential delimiters as one?

Try:

tr -s ' ' <text.txt | cut -d ' ' -f4

From the tr man page:


-s, --squeeze-repeats
    replace each input sequence of a repeated character that is listed in SET1 with a single occurrence of that character
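A quick sketch of the squeeze step on invented input, so the intermediate result is visible:

```shell
# Sample with a growing number of spaces between words (invented).
line='one  two   three    four     five'

# After squeezing, every run of spaces is a single space.
squeezed=$(printf '%s\n' "$line" | tr -s ' ')

# cut can now count fields reliably.
fourth=$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f 4)

# Leading spaces are squeezed to one space, not removed, so field 1 is empty
# and the first word ends up in field 2.
lead=$(printf '  a b\n' | tr -s ' ' | cut -d ' ' -f 2)

printf '%s\n%s\n%s\n' "$squeezed" "$fourth" "$lead"
```

Note the last case: if lines may start with spaces, the field numbers shift by one.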

BASH: Using cut on space delimited file: Treating two spaces as one

You can solve this in just one line of awk:

% awk '/^#/ {printf "%04d.%02d.%02d.%02d.%02d.%02d\n", $2, $3, $4, $5, $6, $7}' ~/stuff 

Yields:

2007.04.29.10.01.17
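The contents of ~/stuff are not shown here, so the following sketch assumes a comment line with six numeric fields after the #, padded with a varying number of spaces:

```shell
# Assumed input line (the real file contents are not shown in the question).
stamp=$(printf '#  2007  4 29 10  1 17\n' |
  awk '/^#/ {printf "%04d.%02d.%02d.%02d.%02d.%02d\n", $2, $3, $4, $5, $6, $7}')

printf '%s\n' "$stamp"
```

awk ignores the extra spaces when splitting fields, and printf zero-pads each number, giving 2007.04.29.10.01.17.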

How to avoid leading space using cut command?

Using just grep, you can accomplish this with the following command:

grep -oe "[^ ][^ ]*  *[^ ][^ ]*$"

grep # a tool for matching text
-o # print only the part of each line that matches
-e # the next argument is the pattern (a regex)
[^ ] # match any character that isn't a space
* # match zero or more of the previous element
$ # match the end of the line

Note: This does not account for trailing whitespace.

Demonstration:

$ echo '  3 abcd
23 xyz
1234 abc' | grep -oe "[^ ][^ ]*  *[^ ][^ ]*$"
3 abcd
23 xyz
1234 abc
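If trailing spaces are a possibility, one option (a sketch, not part of the original answer) is to strip them with sed before the grep. The input below is invented:

```shell
# First line has both leading and trailing spaces (invented sample).
cleaned=$(printf '  3 abcd   \n23 xyz\n' |
  sed 's/ *$//' |                          # drop trailing spaces
  grep -oe '[^ ][^ ]*  *[^ ][^ ]*$')       # same pattern as above

printf '%s\n' "$cleaned"
```

Without the sed step, the first line would not match at all, because the pattern requires a non-space character right before the end of the line.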

How to use cut with multiple character delimiter in Unix?

Since | is a regex metacharacter (alternation), it needs to be escaped with \\| or put in square brackets: [|].

You can do this:

awk -F' \\|\\|\\| ' '{print $1}' file

Some other variations that work as well:

awk -F' [|][|][|] ' '{print $1}' file
awk -F' [|]{3} ' '{print $1}' file
awk -F' \\|{3} ' '{print $1}' file
awk -F' \\|+ ' '{print $1}' file
awk -F' [|]+ ' '{print $1}' file
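To see the splitting in action, here is a sketch on an invented line that uses " ||| " as its delimiter:

```shell
# Sample line (invented) with " ||| " between fields.
line='abc ||| xyz ||| foo bar'

# Bracketed pipes.
first=$(printf '%s\n' "$line" | awk -F' [|][|][|] ' '{print $1}')

# Escaped pipes.
second=$(printf '%s\n' "$line" | awk -F' \\|\\|\\| ' '{print $2}')

# One or more pipes; the last field keeps its inner space.
third=$(printf '%s\n' "$line" | awk -F' [|]+ ' '{print $3}')

printf '%s\n%s\n%s\n' "$first" "$second" "$third"
```

This prints abc, xyz and foo bar on separate lines.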

Using \ as the separator is harder: square brackets do not remove the need for escaping, so either way you need many escape characters :)

cat file
abc \\\ xyz \\\ foo bar

Example: each \ in the data needs 4 \ in the expression, so 12 \ in total.

awk -F' \\\\\\\\\\\\ ' '{print $2}' file
xyz

or

awk -F' \\\\{3} ' '{print $2}' file
xyz

or with square brackets, but it's not much simpler:

awk -F' [\\\\]{3} ' '{print $2}' file
xyz

awk -F' [\\\\][\\\\][\\\\] ' '{print $2}' file
xyz

Cut command to specify the tab as the delimiter

Cut splits the input lines at the given delimiter (-d, --delimiter).

To split by tabs omit the -d option, because splitting by tabs is the default.

By using the -f (--fields) option you can specify the fields you are interested in.

echo -e "a\tb\tc" |cut -f 1 # outputs "a"
echo -e "a\tb\tc" |cut -f 2 # outputs "b"
echo -e "a\tb\tc" |cut -f 3 # outputs "c"
echo -e "a\tb\tc" |cut -f 1,3 # outputs "a\tc"
echo -e "a\tb\tc\td\te" |cut -f 2-4 # outputs "b\tc\td"

You can also specify the output delimiter (--output-delimiter) and get rid of lines not containing any delimiters (-s/--only-delimited):

echo -e "a\tb\tc\td\te" |cut -f 2-4 --output-delimiter=":" # outputs b:c:d
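The -s option is easiest to see on input where one line has no tab at all. A small sketch with invented data:

```shell
# Two lines: one tab-separated, one without any tab (invented sample).
input=$(printf 'a\tb\tc\nno tabs here\n')

# By default, cut passes lines without a delimiter through unchanged.
default_out=$(printf '%s\n' "$input" | cut -f 1)

# With -s, lines that contain no delimiter are dropped.
only_delimited=$(printf '%s\n' "$input" | cut -s -f 1)

printf 'default:\n%s\n\nwith -s:\n%s\n' "$default_out" "$only_delimited"
```

The default run prints both a and the whole "no tabs here" line; with -s only a remains.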

If you are interested in the first field of your input file simply do...

cut -f 1 file.txt

