Why doesn't sort file1 > file1 work?
It doesn't work because '>' redirection implies truncation. To avoid keeping the whole output of sort in memory before redirecting it to the file, bash truncates the file and sets up the redirection before running sort. Thus, the contents of file1 are truncated before sort has a chance to read them.
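A minimal sketch of the failure and one way around it. It assumes a POSIX-style sort, whose -o option is allowed to name one of the input files, and it works in a scratch directory with made-up file contents:

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$workdir/file1"

# The shell truncates file1 before sort starts, so sort reads an empty file.
sort "$workdir/file1" > "$workdir/file1"
broken_size=$(wc -c < "$workdir/file1" | tr -d ' ')

# Recreate the data, then let sort write the result itself with -o.
# sort reads all of its input before it opens the output file.
printf 'pear\napple\nfig\n' > "$workdir/file1"
sort -o "$workdir/file1" "$workdir/file1"
sorted=$(cat "$workdir/file1")

echo "size after 'sort file1 > file1': $broken_size"
echo "$sorted"
rm -rf "$workdir"
```

The -o form only helps with sort itself; other commands need a temporary file or a buffering tool.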
Join gives warning file1 is not in sorted order
The suggestion in the join man page is to use sort -k 1b,1 when you're joining on field 1. (It says "when join has no options" but as far as field selection is concerned, your join is equivalent to no options. -1 1 and -2 1 are the defaults.) You can add -t '|' to that and it will match your join perfectly.
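As a rough sketch of that advice, the following sorts two inputs with the same key and separator that join will use. The file names and their contents are hypothetical and only serve to make the example runnable:

```shell
workdir=$(mktemp -d)
# Hypothetical sample data: key|value records in arbitrary order.
printf 'ADBC|Banks\nAAB|Autos\nADB|Banks\n' > "$workdir/file1"
printf 'ADB|London\nADBC|Paris\nAAB|Rome\n' > "$workdir/file2"

# Sort both inputs on field 1 only, with the same separator join will use.
sort -t '|' -k 1b,1 "$workdir/file1" > "$workdir/file1.sorted"
sort -t '|' -k 1b,1 "$workdir/file2" > "$workdir/file2.sorted"

# -1 1 and -2 1 are the defaults, so only the separator is needed.
joined=$(join -t '|' "$workdir/file1.sorted" "$workdir/file2.sorted")
echo "$joined"
rm -rf "$workdir"
```

Running sort and join under the same locale matters, because join expects the collation order that sort produced.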
-k1 means all fields from 1 to the end. -k1,1 means just field 1. The b is necessary if you have leading whitespace and want to ignore it. sort syntax is weird. And this is after POSIX redesigned it to try to make it sensible. If you ever write a sort command that doesn't look complicated, it's probably not doing what you wanted.
Add --debug (a GNU sort option) to your sort command to see what it's using as a key. With a sample file containing these lines (the first one starts with a space):
 ADBC|Banks
ADB|Banks
ADBC|Banks
you can see the effect of various -k options:
$ sort -s -t '|' -k 1 --debug file
sort: using simple byte comparison
 ADBC|Banks
___________
ADBC|Banks
__________
ADB|Banks
_________
$ sort -s -t '|' -k 1,1 --debug file
sort: using simple byte comparison
 ADBC|Banks
_____
ADB|Banks
___
ADBC|Banks
____
$ sort -s -t '|' -k 1b,1 --debug file
sort: using simple byte comparison
ADB|Banks
___
 ADBC|Banks
 ____
ADBC|Banks
____
Now you're probably wondering about the -s I threw in there. Without it, there is a default last-resort comparison of the whole line as a string, which applies to lines with equal keys. That's not normally a problem and you probably don't need to use -s. It's just that when using --debug, the last-resort comparison clutters the list, so I like to use -s to get rid of it.
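The last-resort comparison can be seen without --debug. This is a small sketch with a made-up two-line input whose keys (field 2) are equal; LC_ALL=C is set only to keep the byte ordering predictable:

```shell
# Hypothetical two-line input whose sort keys (field 2) are equal.
input=$(printf 'b|2\na|2\n')

# Default: equal keys fall back to comparing the whole line, so a|2 comes first.
with_last_resort=$(printf '%s\n' "$input" | LC_ALL=C sort -t '|' -k 2,2)

# -s: the fallback is disabled, so equal keys keep their input order.
stable=$(printf '%s\n' "$input" | LC_ALL=C sort -s -t '|' -k 2,2)

printf 'default:\n%s\n-s:\n%s\n' "$with_last_resort" "$stable"
```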
bash redirection to files not working
The > yolo.txt shell redirect happens before any of the commands run. In particular, the shell opens yolo.txt for writing and truncates it before executing cat yolo.txt bar.txt. So by the time cat opens yolo.txt, yolo.txt is empty. Therefore the c line in bar.txt is unique, so uniq -u passes it through.
I guess you wanted to use sponge to avoid this problem, since that's what sponge is for. But you used it incorrectly. This is the correct usage:
cat yolo.txt bar.txt | sort | uniq -u | sponge yolo.txt && cat yolo.txt
Note that I just pass the output filename to sponge as a command-line argument, instead of using a shell redirect.
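sponge ships with moreutils and may not be installed everywhere. The sketch below approximates its behavior with a hypothetical helper named soak that buffers stdin in a temporary file; it is not how sponge itself is implemented, and the contents of yolo.txt and bar.txt are assumptions made for the demonstration:

```shell
# soak FILE: hypothetical stand-in for "sponge FILE".
# It buffers all of stdin in a temporary file and only then overwrites FILE.
soak() {
    soak_tmp=$(mktemp) || return 1
    cat > "$soak_tmp" && cat "$soak_tmp" > "$1"
    soak_status=$?
    rm -f "$soak_tmp"
    return "$soak_status"
}

workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/yolo.txt"
printf 'a\nb\nc\n' > "$workdir/bar.txt"

# Lines that occur exactly once across both files end up in yolo.txt.
cat "$workdir/yolo.txt" "$workdir/bar.txt" | sort | uniq -u | soak "$workdir/yolo.txt"
result=$(cat "$workdir/yolo.txt")
echo "$result"
rm -rf "$workdir"
```

Because soak opens the target only after its input reaches end-of-file, cat has already finished reading yolo.txt by the time the file is overwritten.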
Unix: cat-ing a file out to itself - why does this blank the file?
The shell starts sort after opening file1.txt for output (truncating it, i.e. discarding all the data). Then it starts cat, which opens file1.txt for reading. The semantics of the shell are such that it might be feasible for the pipeline to get a page or so of input from file1.txt, but in practice almost all shells (which is to say all of them, but perhaps there are some shells I've never used that do not behave this way) will truncate the file before cat ever reads any data.
To perform this operation, you must use a temporary file. (Well, a temporary file is not strictly mandatory. If the file is small enough, something like cat file1.txt | ( sleep 2; sort > file1.txt ) will probably work, but it is not guaranteed.)
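A minimal sketch of the temporary-file approach, assuming mktemp is available and using made-up contents for file1.txt. The temporary file is created next to the original so that mv is a rename on the same filesystem:

```shell
workdir=$(mktemp -d)
printf '3\n1\n2\n' > "$workdir/file1.txt"

# Write the sorted data to a temporary file in the same directory,
# then replace the original only if sort succeeded.
tmp=$(mktemp "$workdir/file1.txt.XXXXXX")
if sort "$workdir/file1.txt" > "$tmp"; then
    mv "$tmp" "$workdir/file1.txt"
else
    rm -f "$tmp"
fi

result=$(cat "$workdir/file1.txt")
echo "$result"
rm -rf "$workdir"
```

Note that mv replaces the original file with a new one, so the result carries the permissions mktemp gave the temporary file rather than those of the original.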
Execlp function unable to find binaries
The contents of the two write()s will not be separated; the reader just reads them as one block, with the two strings concatenated:
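#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

int main(void) {
    int out[2];
    char file1[1024], file2[1024];

    if (pipe(out) == -1) {
        perror("pipe");
        return 1;
    }
    int pid = fork();
    if (pid > 0) {
        /* Parent: read two names and write them back to back into the pipe. */
        close(out[0]);
        scanf(" %1023s", file1);
        scanf(" %1023s", file2);
        write(out[1], file1, strlen(file1));
        /* No separator is written, so the two names arrive concatenated. */
        //write(out[1],"\0",1);
        write(out[1], file2, strlen(file2));
    }
    if (pid == 0) {
        /* Child: the first read() may return both names as one block. */
        ssize_t cnt;
        close(out[1]);
        cnt = read(out[0], file1, sizeof file1 - 1);
        file1[cnt > 0 ? cnt : 0] = 0;
        cnt = read(out[0], file2, sizeof file2 - 1);
        file2[cnt > 0 ? cnt : 0] = 0;
        fprintf(stderr, "about to execlp(sort|%s|%s|NULL)\n", file1, file2);
        execlp("sort", "sort", file1, (char *)NULL);
        perror("execlp"); /* reached only if sort itself could not be run */
        return 1;
    }
    return 0;
}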
Running ./a.out with the input wtf omg gives this output:
plasser@pisbak$ ./a.out
wtf omg
about to execlp(sort|wtfomg||NULL)
plasser@pisbak$ sort: cannot read: wtfomg: No such file or directory
So the message is not about the binary not being found; sort is unable to find the file wtfomg, which was passed as an argument.
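The same effect can be reproduced in the shell, which may make the cause easier to see. This is only an analogy to the C program, not a fix for it: two writes without a separator come out of the pipe as one string, while newline-terminated writes can be split again by the reader. The variable names are made up for the sketch:

```shell
# Two writes with no separator: the reader sees one concatenated string.
joined=$( { printf '%s' wtf; printf '%s' omg; } | cat )

# Two writes, each ended by a newline: the reader can split them again.
split=$(
    { printf '%s\n' wtf; printf '%s\n' omg; } | {
        IFS= read -r first
        IFS= read -r second
        printf 'first=%s second=%s' "$first" "$second"
    }
)

echo "$joined"
echo "$split"
```

In the C program the equivalent would be to write some delimiter between the two names and have the child split on it, which is what the commented-out write of "\0" hints at.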