Sed Command Working on Command Line But Not in Perl Script

sed command working on command line but not in perl script

In a Perl script you need valid Perl code, just like you need valid C text in a C program. In the terminal, sed ... is understood and run by the shell as a command, but in a Perl program it is just a bunch of words, and that sed ... line isn't valid Perl.

You would need to put this inside qx() (backticks) or system() so that it is run as an external command. Then you'd indeed need "some backslashes," which is where things get a bit picky.
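For completeness, a hedged sketch of what that would look like; the sed expression and the filename here are stand-ins, not the command from the question:

# a sketch only: 's/foo/bar/' and 'filename' are placeholders

# inside backticks Perl interpolates like a double-quoted string, so the
# backslashes meant for sed must themselves be escaped
my $out = `sed 's/\\(foo\\)/[\\1]/' filename`;

# the LIST form of system() involves no shell at all, which sidesteps most
# of the quoting trouble (GNU sed assumed for -i.bak)
system('sed', '-i.bak', 's/foo/bar/', 'filename') == 0
    or die "sed exited with status $?";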

But why run a sed command from a Perl script? Do the job with Perl

use warnings;
use strict;
use File::Copy 'move';

my $file = 'filename';
my $out_file = 'new_' . $file;

open my $fh, '<', $file or die "Can't open $file: $!";
open my $fh_out, '>', $out_file or die "Can't open $out_file: $!";

while (<$fh>)
{
    s/\$( [^{] [a-z_]* )/\${$1}/gix;
    print $fh_out $_;
}
close $fh_out;
close $fh;

move $out_file, $file or die "Can't move $out_file to $file: $!";

The regex uses a negated character class, [^...], to match any character other than { after the $, thus excluding already-braced words. Then it matches a sequence of letters or underscores, as in the question (possibly none, since the first non-{ character already provides at least one).
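To make the effect concrete, a quick demonstration on a made-up line (the input string is mine, not from the question):

my $line = 'set $Var and $other, but leave ${braced} alone';
(my $changed = $line) =~ s/\$([^{][a-z_]*)/\${$1}/gi;
print "$changed\n";   # set ${Var} and ${other}, but leave ${braced} alone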

With 5.14+ you can use the non-destructive /r modifier

print $fh_out s/\$([^{][a-z_]*)/\${$1}/gir;

with which the changed string is returned (and the original is left unchanged), which is just right for the print.

The output file, in the end moved over the original, should really be made using File::Temp. Overwriting the original this way changes $file's inode number; if that's a concern, see this post, for example, for how to update the original inode.
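A rough sketch of that, reusing the same substitution (the DIR => '.' choice and the filename are assumptions; keeping the temporary file in the same directory keeps the final rename on one filesystem):

use warnings;
use strict;
use File::Temp ();
use File::Copy 'move';

my $file = 'filename';

# UNLINK => 0 because we rename the temp file over the original ourselves
my $tmp = File::Temp->new( DIR => '.', UNLINK => 0 );

open my $fh, '<', $file or die "Can't open $file: $!";
while (<$fh>) {
    s/\$([^{][a-z_]*)/\${$1}/gi;
    print {$tmp} $_;
}
close $fh;
close $tmp;

move "$tmp", $file or die "Can't move $tmp to $file: $!";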

A one-liner (command-line) version, to readily test

perl -wpe's/\$([^{][a-z_]*)/\${$1}/gi' file

This only prints to the console. To change the original, add -i (in-place), or -i.bak to keep a backup.
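For example, to edit the file in place and keep the original as file.bak:

perl -i.bak -wpe's/\$([^{][a-z_]*)/\${$1}/gi' file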


A reasonable question came up: "Isn't there a shorter way?"

Here is one, using the handy Path::Tiny for a file that isn't huge so we can read it into a string.

use warnings;
use strict;
use Path::Tiny;

my $file = 'filename';
my $out_file = 'new_' . $file;

my $new_content = path($file)->slurp =~ s/\$([^{][a-z_]*)/\${$1}/gir;

path($file)->spew( $new_content );

The first line reads the file into a string, on which the replacement runs; the changed text is returned and assigned to a variable. Then that variable with new text is written out over the original.

The two lines can be squeezed into one by putting the expression from the first line in place of the variable in the second. But opening the same file twice in one (complex) statement isn't exactly solid practice, and I wouldn't recommend such code.

However, since the module's version 0.077 you can nicely do

path($file)->edit_lines( sub { s/\$([^{][a-z_]*)/\${$1}/gi } );

or use edit to slurp the file into a string and apply the callback to it.
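A sketch of that edit variant (the callback receives the whole file content in $_):

path($file)->edit( sub { s/\$([^{][a-z_]*)/\${$1}/gi } );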

So this cuts it to one nice line after all.

I'd like to add that shaving off lines of code is mostly not worth the effort, and it can certainly lead to trouble if it disturbs the focus on code structure and correctness even a bit. However, Path::Tiny is a good module, this use of it is legitimate, and it does shorten things quite a bit.

SED command not working in perl script

In Perl, backticks have the same escape and interpolation rules as double-quoted strings: a backslash that forms an unknown escape code is simply dropped, e.g. "\." eq ".".

Therefore, the Perl code

print `echo \"1\"`;
print `echo \\"1\\"`;

outputs

1
"1"

If you want to embed that sed command in Perl, you have to escape the backslashes so that they actually reach the shell:

$gf = `sed 's/\\s\\+//g' aaa > bbb`;

Actually, you won't get any output into $gf as you redirect the output to a file. We could just do

use autodie;

system "sed 's/\\s\\+//g' aaa > bbb";

or with single quotes:

use autodie;

system q{ sed 's/\s\+//g' aaa > bbb };

which keeps the backslashes.

Still, this is quite unnecessary, as Perl can apply the substitution itself.

use autodie; # automatic error handling

open my $out, ">", "bbb";
open my $in,  "<", "aaa";
while (<$in>) {
    s/\s+//g;   # remove all whitespace, as the sed command does
    print {$out} $_;
}

Bash working on command line but not in perl script

If the rest of your script is in Perl, I would strongly suggest replacing your calls to sed with a native implementation.

For example, the changes you have made using sed could be done with something like this:

use strict;
use warnings;

for my $file (glob '*.csv') {
    open my $in, '<', $file;
    my @lines;
    while (<$in>) {
        next if /"",""/;
        next if /___/;
        next if /---/;
        next if /===/;
        push @lines, $_;
    }
    close $in;

    # this will overwrite your files!
    # change $file to something else to test
    open my $out, '>', $file;
    print $out $_ for @lines;
}

This loops through each file ending in .csv, reading each line. It skips any lines that match one of the patterns (you could do this using a single regex with | between each pattern if you wanted but I left it the same as your calls to sed). It pushes any remaining lines to an array. It then reopens the input file for writing and prints the array.
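The single-regex version mentioned above would collapse the four checks into one alternation, something like this (a drop-in for the middle of the loop):

while (<$in>) {
    next if /"",""|___|---|===/;   # any of the four patterns
    push @lines, $_;
}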

Granted, it's slightly longer in terms of number of lines, but it saves you having to use system to call external commands when Perl is more than capable. It also means that each file is only opened once, rather than once per substitution as in your original code.

Sed command in perl script

I think you're running into a bit of quoting hell with the backticks, where you need to concatenate the ssh command and all its arguments into a string to send to sh. I'd suggest this, where each component of the ssh command is a separate word:

use autodie;

my @ssh_cmd = ('ssh', $remote_addr, 'sed', '-n', "/$last_hour/,\\\$p", "$dir/$remote_filename");
open my $pipe, '-|', @ssh_cmd;
my $outstr = join '', <$pipe>;
close $pipe;

Sed command inside perl script doesn't work

You probably want a "-e" after the "sed" at least. And a "\" before the "$".
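Without the original command it's hard to be exact, but the shape of the fix is roughly this (the /START/ address and 'logfile' are made-up placeholders):

my $out = `sed -n -e '/START/,\$p' logfile`;   # \$ so Perl doesn't interpolate $p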

Complicated sed command inside Perl script

Perl can do the transformation that you're using sed to do, almost trivially. I doubt if this is minimal Perl, but it works on the sample data, except it does not map Xris to XUrl (I assume that is a typo in the question).

#!/usr/bin/env perl

use strict;
use warnings;

while (<>)
{
    if (m/: \[.*$/)
    {
        chomp;
        my $next = <>;
        $_ =~ s/: \[.*/: /;
        $_ .= $next;
    }
    print;
}

When run on the data file from the question, the output is:

"ABC": "abcd.com"
"Xris": "xyz.com"
"users": "user.com"
"id": "96444aa4b618.com"

which is pretty much what was wanted. You can probably revise the code so that the chomp is not necessary, but then you need to remove the newline at the end of $_ in the substitution instead:

#!/usr/bin/env perl

use strict;
use warnings;

while (<>)
{
    if (m/: \[.*$/)
    {
        my $next = <>;
        $_ =~ s/: \[.*\n/: /;
        $_ .= $next;
    }
    print;
}

Same output from the same input.

Perl system() not working with sed


system( `sed -i 's/[ 0-9]+) //' fileName` );

First, backticks execute a system command and return its output. So you are calling system() on the output from sed, which is probably not what you want. You might do this instead:

system(sed => -i => 's/[ 0-9]\+) //', 'fileName' );

And shelling out to sed is rather unnecessary anyway, as perl can do anything that sed can, e.g.:

$^I = 1;   # enable in-place editing (backup files get a "1" suffix)

@ARGV = 'fileName';
while (<>) {
    s/[ \d]+\) //;
    print;
}

sed unterminated `s' command error when running from a script

Why run another tool like sed once you are inside a Perl program? If anything, you now have far more tools and power, so just do it with Perl.

One simple way to do your sed thing

use warnings;
use strict;

die "Usage: $0 file(s)\n" if not @ARGV;

while (<>) {
    s/b/batman\nRobin/;
    print;
}

Run this program by supplying the file (temp) to it on the command line. The die line is there merely to support/enforce such usage; it is inessential for the script's operation.

This program is then a simple filter:

  • The <> operator reads, line by line, all files submitted on the command line

  • Each line is assigned to the $_ variable, the default for many things in Perl

  • The s/// operator by default binds to $_, which gets changed (if the pattern matches)

  • print by default prints the $_ variable

  • You can use nearly anything you want as delimiters in a regex; see the m// and s/// operators (an example follows this list)
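For instance, the same substitution written with braces as delimiters (purely a matter of taste):

while (<>) {
    s{b}{batman\nRobin};
    print;
}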

This can also be done as

while (<>) {
    print s/b/batman\nRobin/r;
}

With the /r modifier, s/// returns the changed string (or the original string if the pattern didn't match)

Finally that's also just

print s/b/batman\nRobin/r while <>;

but I'd expect that with a script you really want to do more, and then this probably isn't it.

On the other side of things, you could write it out more explicitly

use warnings;
use strict;
use feature qw(say);

die "Usage: $0 file(s)\n" if not @ARGV;

while (my $line = <>) {
    chomp $line;

    $line =~ s/b/batman\nRobin/;

    say $line;
}

With each line in a lexical variable, nicely chomp-ed, this is ready for more work.

Command pipeline run out of Perl fails at sed prepending a path

Since this runs out of a Perl script, there is no good reason to reach for external tools for any of the processing. Perl has varied support for this kind of work and excels at it.

use warnings;
use strict;
use feature 'say';

use List::Util qw(uniq);

my $file = shift @ARGV;
die "Usage: $0 filename\n" if not $file or not -f $file;

open my $fh, '<', $file or die "Can't open $file: $!";

my $patt1 = qr/./; # match any one character; for testing
my $patt2 = qr/./; # these are "$a" and "$b"

# Only lines with both patterns
my @lines = grep { /$patt1/ and /$patt2/ } <$fh>;

my $dir = '/some/path/';
my %freq;

my @sorted =
    map  { "$dir " . join ' ', @$_ }
    grep { ++$freq{ join '', @{$_}[4,5] } == 1 }
    sort {
        $a->[1] cmp $b->[1] or
        $a->[4] cmp $b->[4] or
        $a->[5] cmp $b->[5]
    }
    map  { [ split ] }
    @lines;

say for @sorted;

I use $patt1 and $patt2 instead of $a and $b, which are special names that shouldn't be used (and are very bad variable names). I set them to match any one character, for my tests.

In the sorting statement, an arrayref of the line's words (the fields, as for the external sort) is first made for each line. These arrayrefs are then sorted by the second field, and then by the 5th and 6th. The sorted list is then filtered to keep only the first line from each group of lines with equal 5th and 6th fields (unique in those sort fields, like -u -k 5,6 with the external sort).
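That filtering step is the classic first-seen-wins idiom; here it is in isolation, on made-up rows (fields 0 and 1 here instead of 4 and 5):

use strict;
use warnings;

# made-up data: arrayrefs of fields; deduplicate on fields 0 and 1
my @rows = ( [qw(a x 1)], [qw(a x 2)], [qw(b y 3)] );

my %seen;
my @first_of_each = grep { ++$seen{ join ',', @{$_}[0,1] } == 1 } @rows;
# keeps [a x 1] and [b y 3]; the second a/x row is dropped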

Finally, the lines are reconstituted as strings, with $dir prepended to each.

This has been tested with a file I made up but as I am not certain what exactly your pipeline is meant to do it may need changes to match that purpose.

The script takes all lines from the file and keeps those that match both patterns. The memory usage is then a multiple of that, due to the sort statement, and for files of extreme size this may be too much.
In such a case we'd have an example of when an external tool is helpful, since the system sort does not load whole files into memory when they are too large.
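One hedged sketch of that hybrid: let the system sort do the heavy lifting and read its output through a pipe. The -u -k5,6 flags mirror the by-field uniqueness above and are an assumption about the real pipeline, as are the placeholder values for $file and $dir.

# $file and $dir as in the script above (placeholders here)
my $file = 'filename';
my $dir  = '/some/path/';

# stream through the system sort instead of sorting in memory;
# sort(1) spills to temporary files when the input exceeds memory
open my $sorted_fh, '-|', 'sort', '-u', '-k5,6', $file
    or die "Can't start sort: $!";
while (my $line = <$sorted_fh>) {
    print "$dir $line";
}
close $sorted_fh or die "sort reported a failure: $?";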


