Removing parts of a string that contain digits with sed/Perl
With sed, you need the -r switch (extended regular expressions in GNU sed; -E is the more portable spelling) and a character class.
$ echo "AB208804_1 446 576 AB208804_1orf 0" | sed -r 's/_[0-9]+//g'
AB208804 446 576 AB208804orf 0
Or, since you asked, in Perl:
$ echo "AB208804_1 446 576 AB208804_1orf 0" | perl -ne 's/_\d+//g; print $_'
AB208804 446 576 AB208804orf 0
Remove from the beginning till certain part in a string
sed -E 's/^(.*)_([^_]*)$/_\2/' < input.txt
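A small worked example may help here. It is only a sketch: the sample line is hypothetical (the question's input is not shown), and it assumes a sed that accepts -E, so that the unescaped parentheses act as capture groups. The greedy (.*) runs up to the last underscore, so everything before it is dropped:

```shell
# Hypothetical sample line; the original input is not shown in the answer.
line='chr1_geneA_exon2'

# -E turns ( ) into grouping operators; \2 is the text after the last "_".
result=$(printf '%s\n' "$line" | sed -E 's/^(.*)_([^_]*)$/_\2/')

printf '%s\n' "$result"   # prints: _exon2
```

A line without any underscore does not match and passes through unchanged.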
How to delete from a text file, all lines that contain a specific string?
To remove the line and print the output to standard out:
sed '/pattern to match/d' ./infile
To directly modify the file – does not work with BSD sed:
sed -i '/pattern to match/d' ./infile
Same, but for BSD sed (Mac OS X and FreeBSD) – does not work with GNU sed:
sed -i '' '/pattern to match/d' ./infile
To directly modify the file (and create a backup) – works with BSD and GNU sed:
sed -i.bak '/pattern to match/d' ./infile
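To see what the in-place variant with a backup does, here is a minimal sketch on a throwaway file. The directory, file contents and variable names are made up for illustration; only the sed command itself comes from the answer above:

```shell
# Work in a temporary directory so nothing real is modified (hypothetical data).
tmpdir=$(mktemp -d)
printf '%s\n' 'keep this' 'a pattern to match here' 'keep that' > "$tmpdir/infile"

# Delete matching lines in place; the untouched original is saved as infile.bak.
sed -i.bak '/pattern to match/d' "$tmpdir/infile"

edited=$(cat "$tmpdir/infile")
backup=$(cat "$tmpdir/infile.bak")
printf '%s\n' "$edited"
# keep this
# keep that

rm -rf "$tmpdir"
```

The suffix is attached directly to -i with no space, which is the form both GNU and BSD sed accept.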
How to remove part of the string if the condition exists
This works:
$ sed -E 's/^([^:]*:[^:]*):[0-9][0-9]$/\1/' file
The [^:] means 'any character other than a :', so the pattern only matches lines that contain exactly two colons and end in a colon plus two digits; the ending is deleted only in that case.
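As a quick check of that behaviour, here is a sketch with three made-up records (one colon, two colons, three colons). The sample values are assumptions, chosen only to resemble the template used by the grep pattern further down:

```shell
# Hypothetical records with two, one and three colons respectively.
input=$(printf '%s\n' 'A*02:01:05' 'B*07:02' 'C*04:01:01:06')

# Only the line with exactly two colons loses its trailing ":NN".
result=$(printf '%s\n' "$input" | sed -E 's/^([^:]*:[^:]*):[0-9][0-9]$/\1/')

printf '%s\n' "$result"
# A*02:01
# B*07:02
# C*04:01:01:06
```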
This awk works too:
$ awk 'gsub(/:/,":")==2 {sub(/:[0-9][0-9]$/,"")} 1' file
In this case, gsub replaces every colon with itself and returns the number of replacements made, which is the colon count. So if there are exactly two colons, the ending is deleted.
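The counting trick can be made visible by printing what gsub returns. This is a sketch on the same kind of made-up records; the variable names are illustrative only:

```shell
# Hypothetical records with two, one and three colons respectively.
input=$(printf '%s\n' 'A*02:01:05' 'B*07:02' 'C*04:01:01:06')

# Show the value gsub returns (the number of colons) next to each line.
counts=$(printf '%s\n' "$input" | awk '{ n = gsub(/:/, ":"); print n, $0 }')
printf '%s\n' "$counts"
# 2 A*02:01:05
# 1 B*07:02
# 3 C*04:01:01:06

# The command from the answer: trim only when the count is exactly 2.
trimmed=$(printf '%s\n' "$input" | awk 'gsub(/:/,":")==2 {sub(/:[0-9][0-9]$/,"")} 1')
printf '%s\n' "$trimmed"
# A*02:01
# B*07:02
# C*04:01:01:06
```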
You can also use GNU grep (with PCRE) to only match the template of what you are looking for:
$ grep -oP '^\w+\*\d\d:\d\d' file
Or perl, the same way:
$ perl -lnE 'say "$1" if /(^\w+\*\d\d:\d\d)/' file
Remove leading and trailing numbers from string, while leaving 2 numbers, using sed or awk
You may try this sed:
sed -E 's/^[0-9]+([0-9]{2})|([0-9]{2})[0-9]+$/\1\2/g' file
51word24
anotherword
12yetanother1
62andherese123anotherline43
23andherese123anotherline45
53andherese123anotherline41
Command Details:
^[0-9]+([0-9]{2}): Match 1+ digits at the start if they are followed by 2 digits (captured in a group) and replace with the 2 digits in group #1.
([0-9]{2})[0-9]+$: Match 1+ digits at the end if they are preceded by 2 digits (captured in a group) and replace with the 2 digits in group #2.
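Here is a short sketch of both alternatives at work. The input lines are hypothetical (the question's input is not reproduced above), picked so that one line is trimmed at both ends and two are left alone:

```shell
# Hypothetical input: long digit runs at both ends, none, and runs too short to trim.
input=$(printf '%s\n' '12351word2445' 'anotherword' '12yetanother1')

result=$(printf '%s\n' "$input" |
  sed -E 's/^[0-9]+([0-9]{2})|([0-9]{2})[0-9]+$/\1\2/g')

printf '%s\n' "$result"
# 51word24
# anotherword
# 12yetanother1
```

In each match only one of the two groups is set, so \1\2 expands to whichever pair of digits sits next to the word.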
Sed Regex to delete all numbers except ordinals
Since sed doesn't support lookarounds, you have to spell out each allowed continuation explicitly:
[0-9]+(([sS]([^Tt]|$)|[Tt]([^Hh]|$)|[RNrn]([^Dd]|$))|[^RNSTrnst0-9]|$)
For case-insensitivity, I included both upper- and lowercase letters in the bracket expressions.
GNU sed command (POSIX ERE):
sed -r 's/[0-9]+(([sS]([^Tt]|$)|[Tt]([^Hh]|$)|[RNrn]([^Dd]|$))|[^RNSTrnst0-9]|$)/\1/g' file
Regex breakdown:
[0-9]+ # Match digits
( # Start of Capturing Group #1
( # Start of Capturing Group #2
[sS] # Match S or s
( # Start of Capturing Group #3
[^Tt] # If a character exists after S it shouldn't be T
| # Or
$ # Match end of line position
) # End of Capturing Group #3
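| # Or
[Tt] # Match T or t
( # Start of Capturing Group #4
[^Hh] # If a character exists after T it shouldn't be H
| # Or
$ # Match end of line position
) # End of Capturing Group #4
| # Or
[RNrn] # Match a letter from set
( # Start of Capturing Group #5
[^Dd] # If a character exists after R or N it shouldn't be D
| # Or
$ # Match end of line position
) # End of Capturing Group #5
) # End of Capturing Group #2
| # Or
[^RNSTrnst0-9] # Match a character other than the ones in the set
| # Or
$ # Match end of line position
) # End of Capturing Group #1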
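To try the expression out, here is a sketch on a made-up sentence that mixes a plain number with the four ordinal suffixes. The sample text is an assumption, and -E is used in place of -r (GNU sed accepts both):

```shell
# Hypothetical sentence: "12" is a plain number, the rest are ordinals.
line='room12 on the 3rd floor, 45th street, 2nd door, 1st try'

result=$(printf '%s\n' "$line" |
  sed -E 's/[0-9]+(([sS]([^Tt]|$)|[Tt]([^Hh]|$)|[RNrn]([^Dd]|$))|[^RNSTrnst0-9]|$)/\1/g')

printf '%s\n' "$result"
# room on the 3rd floor, 45th street, 2nd door, 1st try
```

Because the character after the digits is consumed and put back via \1, a number followed directly by another number's separator is still handled, but the surrounding whitespace is kept as is.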
How to delete certain characters after a pattern using sed or awk?
1st solution: Try the following, written and tested with the shown samples in GNU awk (assuming ;;; occurs only once per line).
awk '
match($0,/.*;;;/){
laterPart=substr($0,RSTART+RLENGTH)
gsub(/[,.:;()~?]/,"",laterPart)
print substr($0,RSTART,RLENGTH) laterPart
}' Input_file
Explanation: A commented version of the above.
awk ' ##Starting awk program from here.
match($0,/.*;;;/){ ##Using match function to match everything till ;;; here.
laterPart=substr($0,RSTART+RLENGTH) ##Creating variable laterPart which has rest of the line apart from matched regex part above.
gsub(/[,.:;()~?]/,"",laterPart) ##Globally substituting ,.:;()~? with an empty string in laterPart variable.
print substr($0,RSTART,RLENGTH) laterPart ##Printing sub string of matched regex and laterPart var here.
}' Input_file ##Mentioning Input_file name here.
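A quick run on one made-up line shows the effect. The sample text is hypothetical and the line is piped in instead of being read from Input_file; punctuation before ;;; survives, punctuation after it is stripped:

```shell
# Hypothetical line with punctuation on both sides of ";;;".
line='a.b,c;;;d.e,(f)?'

result=$(printf '%s\n' "$line" | awk '
match($0,/.*;;;/){
  laterPart=substr($0,RSTART+RLENGTH)       # text after the matched prefix
  gsub(/[,.:;()~?]/,"",laterPart)           # strip the listed characters from it
  print substr($0,RSTART,RLENGTH) laterPart # prefix is printed untouched
}')

printf '%s\n' "$result"   # prints: a.b,c;;;def
```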
2nd solution: In case you have multiple occurrences of ;;; in a line and you want to remove the characters from all fields after the 1st occurrence of ;;;, try the following.
awk 'BEGIN{FS=OFS=";;;"} {for(i=2;i<=NF;i++){gsub(/[,.:;()~?]/,"",$i)}} 1' Input_file
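The difference between the two solutions shows up on a line with two separators. The following is a sketch with a hypothetical input line; the first solution is condensed onto one line with a shortened variable name (rest), which is not in the original:

```shell
# Hypothetical line with two ";;;" separators.
line='a.b;;;c.d;;;e,f'

# 2nd solution: clean every field after the first ";;;".
per_field=$(printf '%s\n' "$line" |
  awk 'BEGIN{FS=OFS=";;;"} {for(i=2;i<=NF;i++){gsub(/[,.:;()~?]/,"",$i)}} 1')
printf '%s\n' "$per_field"   # prints: a.b;;;cd;;;ef

# 1st solution: the greedy .*;;; runs to the LAST separator, so "c.d" is kept.
greedy=$(printf '%s\n' "$line" |
  awk 'match($0,/.*;;;/){rest=substr($0,RSTART+RLENGTH); gsub(/[,.:;()~?]/,"",rest); print substr($0,RSTART,RLENGTH) rest}')
printf '%s\n' "$greedy"      # prints: a.b;;;c.d;;;ef
```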
Removing specific character from anywhere between two specific strings?
Using a substitution and a loop (GNU sed, since it relies on \s and on \t inside the bracket expression):
sed ':l s/\(number="[^" \t]*\)\s\s*/\1/g;tl' input
This gives:
number="+123123123" text="This is some text"
number="+123456" text="This may contain numbers"
number="+123456789" text="Numbers here should keep their spaces"
number="+98765" text="example 123 123 123"
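For a sed without the GNU escapes, the same loop can be sketched with POSIX character classes. This is an adaptation, not the answer's exact command: [[:space:]] stands in for \s and \t, the script is split into -e parts so the label ends cleanly, and the input line is a guess modelled on the output above:

```shell
# Hypothetical input line modelled on the first output line.
line='number="+123 123 123" text="This is some text"'

# Each pass removes one run of whitespace inside number="..."; "tl" jumps
# back to label "l" for as long as a substitution was made.
result=$(printf '%s\n' "$line" |
  sed -e ':l' \
      -e 's/\(number="[^"[:space:]]*\)[[:space:]][[:space:]]*/\1/' \
      -e 'tl')

printf '%s\n' "$result"
# number="+123123123" text="This is some text"
```

The spaces inside text="..." are untouched because the match has to start at number=" and cannot cross the closing quote.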
Removing non-alphanumeric characters with sed
tr's -c (complement) flag may be an option:
echo "Â10.41.89.50-._ " | tr -cd '[:alnum:]._-'
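Here is the same idea on a plain ASCII string, as a sketch. The sample input is made up, and LC_ALL=C is an added assumption to keep [:alnum:] limited to ASCII letters and digits regardless of the locale:

```shell
# Hypothetical input with spaces and symbols that should be dropped.
raw='10.41.89.50 @#$%-._ '

# -c complements the set, -d deletes: everything that is NOT alphanumeric,
# ".", "_" or "-" is removed (a trailing newline would be removed as well).
clean=$(printf '%s' "$raw" | LC_ALL=C tr -cd '[:alnum:]._-')

printf '%s\n' "$clean"   # prints: 10.41.89.50-._
```

The hyphen is placed last in the set so that tr treats it as a literal character and not as a range operator.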