How to Remove Double Line Breaks with Sed

How can I replace each newline (\n) with a space using sed?

Use this solution with GNU sed:

sed ':a;N;$!ba;s/\n/ /g' file

This reads the whole file into the pattern space in a loop (:a;N;$!ba), then replaces every newline with a space (s/\n/ /g). Additional substitutions can simply be appended if needed.

Explanation:


  1. sed starts by reading the first line, excluding the newline, into the pattern space.
  2. Create a label via :a.
  3. Append a newline and the next line to the pattern space via N.
  4. If we are not yet on the last line, branch back to the label with $!ba ($! means "not on the last line"). This avoids executing N again when there is no more input, which would terminate the script.
  5. Finally, the substitution replaces every newline in the pattern space (which now holds the whole file) with a space.

Here is cross-platform compatible syntax, which also works with BSD and OS X's sed (as per @Benjie's comment):

sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g' file
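To see the loop in action, here is a minimal sketch that runs the portable form on a throwaway three-line file. The file contents and the variable names sample and joined are made up for illustration.

```shell
# Build a small three-line sample file (hypothetical content).
sample=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$sample"

# Collect every line into the pattern space, then turn each
# embedded newline into a space.
joined=$(sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g' "$sample")
printf '%s\n' "$joined"    # prints: alpha beta gamma

rm -f "$sample"
```

The newline after gamma is not touched by the substitution, because sed strips it on input and adds it back on output.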

As you can see, using sed for this otherwise simple problem is problematic. For a simpler and adequate solution see this answer.
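The linked answer is not reproduced here, so the following is only an assumption about what a simpler approach could look like: tr and paste are two standard tools that can join lines without a sed loop. The sample file and variable names are hypothetical.

```shell
sample=$(mktemp)
printf 'alpha\nbeta\ngamma\n' > "$sample"

# tr replaces every newline, including the final one, so the
# result ends with a space and has no trailing newline.
with_tr=$(tr '\n' ' ' < "$sample")

# paste -s joins all lines of the file with the given delimiter
# and keeps a single newline at the end.
with_paste=$(paste -s -d ' ' "$sample")

printf '[%s]\n' "$with_tr" "$with_paste"
rm -f "$sample"
```

The brackets in the printed output make the trailing space left by tr visible.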

Remove all occurrence of new line between two patterns (sed or awk?)

You can use a simple state machine with awk, such as with the following input file, slightly modified to also allow text outside the markers (if there is no such text, it will still work as desired, this is just to handle extra cases):

xyzzy
plugh
<INFOSTART
A=1
B=2
C=3
D=4
<INFOEND
twisty
passages
<INFOSTART
G=1
Z=3
<INFOEND
after
last

With a data file like that (or your original), the following awk command gives you what you need, combining lines between the start and end markers into a single line:

awk ' /^<INFOSTART$/ {inside=1; sep=""; next}
/^<INFOEND$/ {inside=0; print ""; next}
inside {printf "%s%s", sep, $0; sep=" "; next}
{print}' input_file

xyzzy
plugh
A=1 B=2 C=3 D=4
twisty
passages
G=1 Z=3
after
last

Examining the awk code in more detail, the following sections expand on each line.


The following segment runs whenever you find a line consisting of only the start marker. It sets the inside state to true (non-zero) to indicate that you should start combining lines, and sets the initial separator to the empty string to ensure no leading space on the combined line. The next simply goes and grabs the next input line immediately, starting a new cycle:

/^<INFOSTART$/ {inside=1; sep=""; next}

Assuming you didn't find a start marker, this segment runs for an end marker. If found, the inside state is set back to false (zero) to start printing out lines exactly as they appear in the input file. It also outputs a newline to properly finish the combined line, then restarts the cycle with the next input line:

/^<INFOEND$/   {inside=0; print ""; next}

If you've established that the line is neither a start nor end marker, your behaviour depends on the inside state. For true, you need to combine the input lines into a single output line, so you simply print, without a trailing newline, the separator followed by the line itself. Then you set the separator to a space so the next input line will be properly separated from the previous one. It then cycles back for the next input line:

inside         {printf "%s%s", sep, $0; sep=" "; next}

Finally, if you get here, you know you're outside of a start/end section so you just echo the line exactly as it exists in the input file:

               {print}'

If you don't want the nicely formatted version, you can use the following minified version, assuming you're certain the only <INFO... lines are the start and end markers:

awk '/^<INFOS/{a=1;b="";next}/^<INFOE/{a=0;print"";next}a{printf"%s%s",b,$0;b=" ";next}1' input_file

However, since this will probably be in a script rather than a one-liner command, I'd tend to stick with the readable version myself.
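As a sketch of how the readable version might sit inside a script, the state machine can be wrapped in a shell function. The function name join_info and the sample input are assumptions made for this example. The text is passed through an explicit "%s%s" format so that a % in the data is printed literally.

```shell
# join_info: hypothetical helper; reads the files given as
# arguments, or standard input when there are none.
join_info() {
    awk '
        /^<INFOSTART$/ { inside = 1; sep = ""; next }   # entering a block
        /^<INFOEND$/   { inside = 0; print ""; next }   # end the joined line
        inside         { printf "%s%s", sep, $0; sep = " "; next }
                       { print }                        # outside: echo as-is
    ' "$@"
}

result=$(printf '%s\n' xyzzy '<INFOSTART' 'A=1' 'B=50%' '<INFOEND' after | join_info)
printf '%s\n' "$result"
```

With this input the function prints xyzzy, then A=1 B=50% on one line, then after.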

Can I use the sed command to replace multiple empty lines with one empty line?

Give this a try:

sed '/^$/N;/^\n$/D' inputfile

Explanation:

  • /^$/N - match an empty line and append the next line to the pattern space, separated by a newline.
  • ; - command delimiter, allows multiple commands on one line; can be used instead of separating commands into multiple -e clauses for versions of sed that support it.
  • /^\n$/D - if the pattern space now consists of a single newline, meaning two consecutive empty lines were read, delete the first of them (more generally, D deletes the beginning of the pattern space up to and including the first newline) and restart the cycle without reading new input.
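A quick way to check the behaviour is to feed the command a run of three empty lines. This is a minimal sketch with made-up input; the variable name squeezed is only for illustration.

```shell
# Input: "a", three empty lines, then "b".
squeezed=$(printf 'a\n\n\n\nb\n' | sed '/^$/N;/^\n$/D')

# The run of three empty lines collapses to a single one.
printf '%s\n' "$squeezed"
```

The output is a, one empty line, then b: each D drops one empty line and the cycle restarts until a non-empty line follows.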

(How) can I remove all newlines (\n) using sed?

Sorry, this can't be done using sed; please see:
http://sed.sourceforge.net/sedfaq5.html#s5.10
and a discussion here: http://objectmix.com/awk/26812-sed-remove-last-new-line-2.html

It looks like sed adds the \n back if it is present as the last character of the input.
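The effect can be observed by counting output bytes. The sketch below compares the sed loop shown earlier on this page with tr -d, which is swapped in here as one tool that does drop the final newline; the variable names are hypothetical.

```shell
# Count the bytes each pipeline emits for the two-line input "a\nb\n".
sed_bytes=$(printf 'a\nb\n' | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/ /g' | wc -c | tr -d ' ')
tr_bytes=$(printf 'a\nb\n' | tr -d '\n' | wc -c | tr -d ' ')

echo "sed: $sed_bytes bytes"   # "a b" plus the final newline
echo "tr:  $tr_bytes bytes"    # "ab", no newline at all
```

The trailing tr -d ' ' only strips the padding that some wc implementations print around the count.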

sed - remove line break if line does not end on "

Give this awk one-liner a try:

awk '{printf "%s%s",$0,(/"$/?"\n":"")}' file

Test:

kent$  cat f
"foo"
"bar"
"a long
text with
many many
lines"
"lalala"

kent$ awk '{printf "%s%s",$0,(/"$/?"\n":"")}' f
"foo"
"bar"
"a longtext withmany manylines"
"lalala"
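Note that the joined lines are glued together without a separator. If a space between the pieces is wanted, one possible variation (an assumption about the desired output, not part of the original answer) is to emit a space instead of the empty string:

```shell
# Lines ending in a double quote keep their newline; all other
# lines are followed by a space so the pieces stay separated.
joined=$(printf '%s\n' '"foo"' '"a long' 'text with' 'many lines"' '"lalala"' |
    awk '{ printf "%s%s", $0, (/"$/ ? "\n" : " ") }')
printf '%s\n' "$joined"
```

Here the middle record comes out as "a long text with many lines" on one line.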

How to use sed to remove only double empty lines?

I've commented the sed command you don't understand:

sed '
## In first line: append second line with a newline character between them.
1N;
## Do the same with third line.
N;
## When three consecutive blank lines are found, delete them.
## The pattern has only two newlines because two newlines join three (empty) lines in the pattern space.
/^\n\n$/d;
## The combo "P+D+N" simulates a FIFO, "P+D" prints and deletes from one side while "N" appends
## a line from the other side.
P;
D
'

Remove 1N, because only two lines are needed in the 'stack' and the second N is enough, and change /^\n\n$/d; to /^\n$/d; to delete every pair of consecutive blank lines.

A test:

Content of infile:

1


2
3

4

5

6


7

Run the sed command:

sed '
N;
/^\n$/d;
P;
D
' infile

That yields:

1
2
3

4

5

6
7
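This command removes a pair of empty lines entirely, whereas the /^$/N;/^\n$/D command from earlier on this page squeezes a run down to one empty line. The sketch below runs both on the same made-up input; GNU sed is assumed, since it prints the pattern space when N hits the end of input.

```shell
# Input: "1", two empty lines, "2", one empty line, "3".
input=$(printf '1\n\n\n2\n\n3')

# Squeeze runs of empty lines down to a single empty line.
squeezed=$(printf '%s\n' "$input" | sed '/^$/N;/^\n$/D')

# Delete pairs of empty lines, leave single empty lines alone.
pairs_removed=$(printf '%s\n' "$input" | sed 'N;/^\n$/d;P;D')

printf 'squeezed:\n%s\n' "$squeezed"
printf 'pairs removed:\n%s\n' "$pairs_removed"
```

In the first result an empty line remains between 1 and 2; in the second, 1 and 2 are adjacent while the single empty line before 3 is kept.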

Remove newlines (\n) if they are followed by a string - using SED

This might work for you (GNU sed):

sed ':a;N;/\nfox/s/\n//;ta;P;D' file

Read two lines into the pattern space. If the second line matches the criterion (here, it starts with fox), remove the newline between them and loop back to append another line. Otherwise, print the first line and delete it; the second line stays in the pattern space, and the cycle restarts without reading new input, so N appends the following line to it.
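A small run on made-up input shows the effect. GNU sed is assumed, as in the answer, and the sample lines are hypothetical.

```shell
# Lines 2 and 4 start with "fox", so each is joined to the line before it.
joined=$(printf '%s\n' 'The quick brown' 'fox jumps over' 'the lazy dog' 'fox again' |
    sed ':a;N;/\nfox/s/\n//;ta;P;D')
printf '%s\n' "$joined"
```

The result is two lines, The quick brownfox jumps over and the lazy dogfox again. The newline is removed outright, so no space is inserted at the join; using s/\n/ / instead would keep the words apart.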


