Remove Left and Right Square Brackets Using Sed/Bash

Remove Left and right square brackets using sed/bash

If I understood you correctly, this command should help (file.txt is a placeholder for your file; sed -i needs a file to edit):

sed -i 's/\[Contributor\]//g' file.txt
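If the goal is instead to drop only the bracket characters and keep the text between them (one possible reading of the question title), a bracket expression can do it. This is a minimal sketch; the sample line and the function name strip_brackets are made up for illustration.

```shell
# Hypothetical helper: delete every [ and ] but keep the text between them.
# Inside a bracket expression, ] must come first to be taken literally.
strip_brackets() {
  sed 's/[][]//g'
}

# Hypothetical sample input; the original question's data is not shown.
printf '%s\n' '[Contributor] Jane Doe [Editor]' | strip_brackets
# prints: Contributor Jane Doe Editor
```

It reads from standard input, so it can be tried on a pipe before switching to sed -i on a real file.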

How can I use sed to delete line with square brackets?

IMHO, you can't do it with just one sed command, but the following might be an approach:

$ cat 38680195
qqq abc cdef (1234) [5689a]
qqq abc cdef (134) [4hgh]
line with <angle brackets>
line without brackets
line with {curly braces}
line with [square brackets]
another line without brackets
$ sed -n '/\[[^]]*\]/p' 38680195 && sed -i.backup '/\[[^]]*\]/d' 38680195
qqq abc cdef (1234) [5689a]
qqq abc cdef (134) [4hgh]
line with [square brackets]
$ cat 38680195
line with <angle brackets>
line without brackets
line with {curly braces}
another line without brackets

Note
A backup of the original file is placed in 38680195.backup:

$ cat 38680195.backup
qqq abc cdef (1234) [5689a]
qqq abc cdef (134) [4hgh]
line with <angle brackets>
line without brackets
line with {curly braces}
line with [square brackets]
another line without brackets

Edit

A smarter approach may be

grep '\[[^]]*\]' 38680195 && sed -i.backup '/\[[^]]*\]/d' 38680195

In this case, if none of the lines contain square brackets, grep exits with a non-zero status and the sed part won't be executed, because the two commands are joined with && (logical AND).
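The grep-then-sed idea can be tried safely on a throwaway file. This is a sketch under the assumption that mktemp is available; the two sample lines and the variable names are illustrative only.

```shell
# Work on a temporary copy so the sketch is safe to run anywhere.
tmp=$(mktemp)
printf '%s\n' 'line with [square brackets]' 'line without brackets' > "$tmp"

# grep prints the matching lines and exits 0 only if something matched,
# so sed rewrites the file (and creates the backup) only when needed.
removed=$(grep '\[[^]]*\]' "$tmp") && sed -i.backup '/\[[^]]*\]/d' "$tmp"

remaining=$(cat "$tmp")
printf 'removed:   %s\nremaining: %s\n' "$removed" "$remaining"

# Clean up the temporary file and the backup sed created.
rm -f "$tmp" "$tmp.backup"
```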

How do I replace [] brackets using sed

Here is the final code I ended up with

`echo "$string" | sed 's/[^a-zA-Z0-9]/ /g'`

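When I added = and - to the bracket expression, I had to put them at the very end (a literal - must be the first or last character of the list, otherwise it denotes a range).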
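The command as posted does not list = or -. The sketch below assumes those two characters were also meant to be kept and shows where they would go; the function name and the sample input are hypothetical.

```shell
# Assumed variant of the command above: keep letters, digits, = and -,
# and turn everything else into a space.
# The - sits last in the list so it is not read as a range operator.
keep_alnum_eq_dash() {
  sed 's/[^a-zA-Z0-9=-]/ /g'
}

printf '%s\n' 'a=b-c[d]e' | keep_alnum_eq_dash
# prints: a=b-c d e
```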

Remove everything but brackets with sed, then indent

This will not be pretty... =) Here is my solution as a sed script. Note that its first line (the shebang) tells the system how to invoke sed to execute the script. The "-n" flag makes sed print only what we explicitly tell it to with the "p" or "P" commands. The "-f" option tells sed to read its commands from the file named right after the option. Because the script's file name is appended to the shebang line when the script is run, sed reads the commands from the script itself (i.e. if you run "./myscript.sed", what actually gets executed is "/bin/sed -nf ./myscript.sed"). The script also relies on "\s", which is a GNU sed extension.

#!/bin/sed -nf

s/[^][{}]//g

t loop
: loop

t dummy
: dummy

s/^\s*[[{]/&/
t open

s/^\s*[]}]/&\
/
t close
d

: open
s/^\(\s*\)[[]\s*[]]/\1[]\
/
s/^\(\s*\)[{]\s*[}]/\1{}\
/

t will_loop
b only_open

: will_loop
P
s/.*\n//
b loop

: only_open

s/^\s*[[{]/&\
/
P
s/.*\n//
s/[][{}]/ &/g
b loop

: close
s/ \([][{}]\)/\1/g
P
s/.*\n//
b loop

Before we start, we must first strip everything except curly brackets (the "normal" brackets below) and square brackets. That's the responsibility of the first "s" command. It tells sed to replace every character that isn't a normal or a square bracket with nothing, i.e. remove it. Notice that the square brackets in the match delimit a set of characters to match, but when the first character inside them is a "^", the set matches any character except the ones listed after the "^". Because we want the set to contain the closing square bracket, and a closing square bracket also ends the set, we make it the first character following the "^", where it is taken literally. We can then list the rest of the characters: opening square bracket, open bracket and close bracket (the full list is "][{}"), and finally close the set with a closing square bracket. I tried to go into more detail here because this can be confusing.
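The effect of that first "s" command can be seen in isolation. This is a small sketch; the function name and the sample input are made up.

```shell
# Keep only [, ], { and }; every other character is deleted.
# ] comes first after ^ so it is read as a literal member of the set.
only_brackets() {
  sed 's/[^][{}]//g'
}

printf '%s\n' 'if (x) { a[1] = [2]; }' | only_brackets
# prints: {[][]}
```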

Now for the actual logic. The algorithm is pretty simple:

while line isn't empty
    if line starts with optional spaces followed by [ or {
        if after the [ or { there are optional spaces followed by a respective ] or }
            print the respective pair, with only the indentation spaces, followed by a newline
        else
            print the opening square or normal bracket, followed by a newline
            remove what was printed from the pattern space (a.k.a. the buffer)
            add a space before every open or close bracket (normal or square)
        end-if
    else
        remove a space before every open or close bracket (normal or square)
        print the closing square or normal bracket, followed by a newline
        remove what was printed from the pattern space
    end-if
end-while

But there are a couple of quirks. First of all, sed doesn't support a "while" loop or an "if" statement directly. The closest we can get are the "b" and "t" commands. The "b" command branches (jumps) to a predefined label, similar to a C goto statement. The "t" command also branches to a predefined label, but only if a substitution has succeeded since the current input line was read or since the last "t" branch was taken. Labels are defined with the ":" command.
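To see "t" and "b" outside the full script, here are two tiny sketches: a "while" built from a label plus "t", and an "if/else" built from "t" plus "b". The function names and inputs are illustrative; each command is passed with its own -e so labels end cleanly.

```shell
# "while": keep replacing two spaces with one until no substitution happens.
squeeze_spaces() {
  sed -e ':again' -e 's/  / /' -e 't again'
}

# "if/else": t jumps to :open when the line starts with [,
# otherwise the "else" part runs and a bare b skips to the end of the script.
classify() {
  sed -n -e 's/^\[/&/' -e 't open' \
         -e 's/$/ <- other/p' -e 'b' \
         -e ':open' -e 's/$/ <- starts with [/p'
}

printf '%s\n' 'a    b  c' | squeeze_spaces   # prints: a b c
printf '%s\n' '[x' 'y' | classify
# prints: [x <- starts with [
#         y <- other
```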

Because it is very likely that the first command actually performs at least one substitution, the first "t" command that follows it would cause a branch. Because we need to test for some other substitutions, we need to make sure that the next "t" command won't automatically succeed because of that first command. That is why we start with a "t" command that targets a label on the very next line (i.e. whether it branches or not, execution continues at the same point), so we can "reset" the internal flag used by "t" commands.

Because the "loop" label will be branched to from at least one "b" command, it is possible that the same flag will be set when the "b" is executed, because only "t" commands can clear it. Therefore, we need to do the same workaround to reset the flag, this time by using a "dummy" label.
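The flag-reset trick is easier to believe when seen side by side. In this sketch (function names and input are made up), both versions run a first substitution that succeeds and a second one that fails; only the version with the do-nothing "t" reports the second one correctly.

```shell
# No reset: the flag set by s/a/A/ is still on, so "t hit" fires
# even though s/x/X/ matched nothing.
no_reset() {
  sed -n -e 's/a/A/' \
         -e 's/x/X/' -e 't hit' \
         -e 's/$/ miss/p' -e 'b' \
         -e ':hit' -e 's/$/ hit/p'
}

# With reset: "t reset" jumps to the label right after it, which only
# clears the flag, so "t hit" now reflects s/x/X/ alone.
with_reset() {
  sed -n -e 's/a/A/' -e 't reset' -e ':reset' \
         -e 's/x/X/' -e 't hit' \
         -e 's/$/ miss/p' -e 'b' \
         -e ':hit' -e 's/$/ hit/p'
}

echo abc | no_reset     # prints: Abc hit
echo abc | with_reset   # prints: Abc miss
```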

We now start the algorithm by checking for the presence of an open square bracket or an open normal bracket at the start of the line (after optional spaces). Because we only want to test for their presence, we replace the match with itself, which is what "&" represents, and sed sets the internal flag for the "t" command if the match succeeds. If it does, we use the "t" command to branch to the "open" label.

If it doesn't succeed, we need to see if we match a close square or normal bracket. The command is nearly identical, but now we append a newline after the closing bracket. We do this by adding an escaped newline (i.e. a backslash followed by an actual newline) after where we place the match (i.e. after the "&"). As above, we use the "t" command to branch to the "close" label if the match succeeds. If it doesn't succeed, we consider the line invalid, empty the pattern space (buffer) and restart the script on the next line, all with the single "d" command.

Entering the "open" label, we will first handle the case of a pair of matching open and close brackets. If we do match them, we will print them with the indentation spaces preceding them, without any spaces between them, and ending with a newline. There is one specific command for each type of bracket pair (square or normal), but they are analogous. Because we have to keep track of how many indentation spaces there are we must store them in a special "variable". We do this by using the group capture, which will store the part of the match that starts after the "(" and ends before the ")". Therefore, we use it to capture the spaces after the start of the line and before the open bracket. We then proceed to match the open bracket followed by spaces and the respective close bracket. When we write the replacement, we make sure to reinsert the spaces by using the special variable "\1", which contains the data stored by the first group capture in the match. We then write the respective pair of open and close brackets and the escaped newline.

If we managed to do any of the replacements, we must print what we have just written, remove it from the pattern space and restart the loop with the remaining characters of the line. Because of this, we first branch with the "t" command to the "will_loop" label. Otherwise, we branch to the "only_open" label, which will handle the case of only an open bracket, without the consecutive respective close bracket.

Inside the "will_loop" label, we just print everything in the pattern space up to the first newline (which we manually added) with the "P" command. We then manually remove everything up to that first newline, so we can proceed with the rest of the line. This is similar to what the "D" command does, but without restarting the execution of the script. Finally we branch to the start of the loop again.
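The "P" followed by "s/.*\n//" idiom can be tried on its own. This sketch splits a line at its first space so the pattern space holds two lines; the function name and the "rest: " marker are hypothetical and only there to show that the script keeps running after the first part is removed.

```shell
# Turn the first space into a newline, print the part before it with P,
# delete that part by hand, then keep working on what is left.
split_first_word() {
  sed -n -e 's/ /\
/' -e 'P' -e 's/.*\n//' -e 's/^/rest: /p'
}

printf '%s\n' 'first second third' | split_first_word
# prints: first
#         rest: second third
```

With "D" in place of the manual delete, sed would jump back to the top of the script instead of continuing with the next command.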

Inside the "only_open" label, we match an open bracket in a similar fashion as previously, but now we rewrite it appended with a newline. We then print that line and remove it from the pattern space. Now we replace all brackets (open or close, square or normal) with itself preceded by a single space character. This is so we can increment the indentation. Finally we branch to the beginning of the loop again.

The final label "close" will handle a closing bracket. We first remove every single space before a bracket, effectively decrementing the indentation. To do this, we need to use captures, because although we want to match the space and the bracket that follows, we only want to write back the bracket. Finally, print everything up to the newline that we manually added before entering the "close" label, remove what we printed from pattern space and restart the loop.

Some observations:

  1. This doesn't check for the syntactic correctness of the code (ie. {{[}] would be accepted)
  2. It will add and remove indentation as brackets are encountered, regardless of their type. This means that when it adds an indentation, it will remove it even if the encountered close bracket is not of the same type.

Hope this helps, and sorry for the long post =)

How to replace paired square brackets with other syntax with sed?

sed -e 's/\[\([^]]*\)\]/\\macro{\1}/g' file.txt

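This looks for an opening bracket, then any number of characters that are not a closing bracket, then a closing bracket. The text between the brackets is captured by the escaped parentheses and inserted into the replacement as \1. The doubled backslash in the replacement produces one literal backslash.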
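A quick way to check the substitution without touching file.txt is to feed it a line on standard input. The function name and the sample text below are illustrative.

```shell
# Same substitution as above, wrapped so it reads from standard input.
to_macro() {
  sed -e 's/\[\([^]]*\)\]/\\macro{\1}/g'
}

printf '%s\n' 'see [foo] and [bar baz]' | to_macro
# prints: see \macro{foo} and \macro{bar baz}
```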

sed OR regex how to remove everything between round brackets

This sed should work:

sed 's/([^)]*)/(*)/g'
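As posted, the command replaces each parenthesized group with a literal (*), since ( and ) are ordinary characters in sed's basic regular expressions. The sketch below shows that behavior next to an assumed variant that leaves empty parentheses instead; the function names and input are made up.

```shell
# As posted: every (...) group becomes a literal (*).
mask_parens() {
  sed 's/([^)]*)/(*)/g'
}

# Assumed variant: drop the contents but keep the empty parentheses.
empty_parens() {
  sed 's/([^)]*)/()/g'
}

printf '%s\n' 'f(a, b) and g(c)' | mask_parens    # prints: f(*) and g(*)
printf '%s\n' 'f(a, b) and g(c)' | empty_parens   # prints: f() and g()
```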

How to escape square closing bracket in sed

You don't even need to escape:

echo "xyx[xzx]xyx" | sed 's|[][]| |g'
xyx xzx xyx

However, keep in mind that the order matters here: ] must come before [ inside the bracket expression, because only a ] in first position is taken literally.
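To see why the order matters, compare the two orderings on the same input. This is a small sketch with made-up function names.

```shell
# ] first: the set contains ] and [, so both characters are replaced.
right_order() {
  sed 's|[][]| |g'
}

# [ first: "[[]" is a set holding only [, followed by a literal ],
# so the pattern matches the two-character sequence "[]" instead.
wrong_order() {
  sed 's|[[]]| |g'
}

printf '%s\n' 'xyx[xzx]xyx' | right_order   # prints: xyx xzx xyx
printf '%s\n' 'xyx[xzx]xyx' | wrong_order   # prints: xyx[xzx]xyx
```

The second call leaves the line untouched because the input never contains "[" immediately followed by "]".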

How to remove square brackets and any text inside?

Try this sed line:

sed 's/\[[^]]*\]//g'

example:

kent$  echo "The fish [ate] the bird.
[This is some] text.
Here is a number [1001] and another [1201]."|sed 's/\[[^]]*\]//g'
The fish  the bird.
 text.
Here is a number  and another .

explanation:

the regex is actually straightforward:

\[       # match [
[^]]*    # match any run of characters other than ]
\]       # match ]

So it matches a string that starts with [, continues with any characters other than ], and ends with ]. The spaces that surrounded the removed text are left in place.


