Why Doesn't "Sort File1 > File1" Work

Why doesn't sort file1 > file1 work?

It doesn't work because '>' redirection truncates its target. The shell sets up redirections before it starts the command (otherwise it would have to hold the whole output of sort in memory before writing it to the file), so bash opens and truncates file1 before sort runs. By the time sort gets a chance to read file1, its contents are already gone.
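A minimal sketch of the effect and of one way around it, assuming a scratch directory and made-up file contents (the names tmpdir and broken are illustrative only). It relies on sort's -o option, which lets the output file be one of the input files:

```shell
# Scratch directory so no real file is overwritten (hypothetical sample data).
tmpdir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$tmpdir/file1"
cp "$tmpdir/file1" "$tmpdir/broken"

# Unsafe: the shell truncates the file before sort ever opens it.
sort "$tmpdir/broken" > "$tmpdir/broken"
[ -s "$tmpdir/broken" ] || echo "broken is now empty"

# Safe: with -o, sort reads all of its input before writing the output file.
sort -o "$tmpdir/file1" "$tmpdir/file1"
cat "$tmpdir/file1"
```

The first sort leaves an empty file; the second leaves file1 sorted in place.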

Join gives warning file1 is not in sorted order

The suggestion in the join man page is to use sort -k 1b,1 when you're joining on field 1. (It says "when join has no options" but as far as field selection is concerned, your join is equivalent to no options. -1 1 and -2 1 are the defaults.) You can add -t '|' to that and it will match your join perfectly.

-k1 means all fields from 1 to the end. -k1,1 means just field 1. The b is necessary if you have leading whitespace and want to ignore it. sort syntax is weird. And this is after POSIX redesigned it to try to make it sensible. If you ever write a sort command that doesn't look complicated, it's probably not doing what you wanted.
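As a rough illustration of why the key matters for join, here is a sketch with two small made-up files (the contents of file2 and the scratch directory are assumptions, not taken from the question). In a plain whole-line sort, 'C' compares lower than '|', so ADBC|Banks lands before ADB|Banks, which is not the order join expects for field 1:

```shell
# Hypothetical sample data in a scratch directory.
tmpdir=$(mktemp -d)
printf 'ADBC|Banks\nADB|Banks\n' > "$tmpdir/file1"
printf 'ADB|Asia\nADBC|Africa\n' > "$tmpdir/file2"

# Whole-line sort: ADBC|Banks comes first, because 'C' sorts before '|'.
plain_first=$(LC_ALL=C sort "$tmpdir/file1" | head -n 1)
echo "$plain_first"

# Sort both inputs on field 1 only, with the same delimiter that join uses.
LC_ALL=C sort -t '|' -k 1b,1 "$tmpdir/file1" > "$tmpdir/file1.sorted"
LC_ALL=C sort -t '|' -k 1b,1 "$tmpdir/file2" > "$tmpdir/file2.sorted"

# join now sees its inputs in the order it expects.
joined=$(LC_ALL=C join -t '|' "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")
echo "$joined"
```

Setting LC_ALL=C on both sort and join keeps the two commands using the same collation order.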

Add --debug to your sort command to see what it's using as a key. With a sample file containing these lines (the first one starts with a space):

 ADBC|Banks
ADB|Banks
ADBC|Banks

you can see the effect of various -k options:

$ sort -s -t '|' -k 1 --debug file
sort: using simple byte comparison
 ADBC|Banks
___________
ADBC|Banks
__________
ADB|Banks
_________
$ sort -s -t '|' -k 1,1 --debug file
sort: using simple byte comparison
 ADBC|Banks
_____
ADB|Banks
___
ADBC|Banks
____
$ sort -s -t '|' -k 1b,1 --debug file
sort: using simple byte comparison
ADB|Banks
___
 ADBC|Banks
 ____
ADBC|Banks
____

Now you're probably wondering about the -s I threw in there. Without it, there is a default last-resort comparison of the whole line as a string, which applies to lines with equal keys. That's not normally a problem and you probably don't need to use -s. It's just that when using --debug, the last-resort comparison clutters the list so I like to use -s to get rid of it.

bash redirection to files not working

The > yolo.txt shell redirect happens before any of the commands run. In particular, the shell opens yolo.txt for writing and truncates it before executing cat yolo.txt bar.txt. So by the time cat opens yolo.txt, yolo.txt is empty. Therefore the c line in bar.txt is unique, so uniq -u passes it through.

I guess you wanted to use sponge to avoid this problem, since that's what sponge is for. But you used it incorrectly. This is the correct usage:

cat yolo.txt bar.txt | sort | uniq -u | sponge yolo.txt && cat yolo.txt

Note that I just pass the output filename to sponge as a command-line argument, instead of using a shell redirect.

Unix: cat-ing a file out to itself - why does this blank the file?

The shell starts sort after opening file1.txt for output (truncating it, i.e. discarding all the data). Then it starts cat with file1.txt open for reading. The semantics of the shell are such that it might be feasible for the pipeline to get a page or so of input from file1.txt, but in practice almost all shells (which is to say all of them, but perhaps there are some shells I've never used that do not behave this way) will truncate the file before cat ever reads any data.

To perform this operation, you must use a temporary file. (Well, it's not mandatory to use a temporary file. If the file is small enough, something like this will probably work: cat file1.txt | ( sleep 2; sort > file1.txt ), but it is not guaranteed.)
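The temporary-file approach might look like the sketch below. The scratch directory, the sample contents and the name tmpfile are assumptions made for the example; the point is only that the output goes to a different file, which replaces the original once the pipeline has succeeded:

```shell
# Hypothetical sample data in a scratch directory.
tmpdir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$tmpdir/file1.txt"

# Create the temporary file next to the target so that mv is a plain rename.
tmpfile=$(mktemp "$tmpdir/file1.XXXXXX")

# Write the sorted data elsewhere, then replace the original only on success.
cat "$tmpdir/file1.txt" | sort > "$tmpfile" && mv "$tmpfile" "$tmpdir/file1.txt"

cat "$tmpdir/file1.txt"
```

Because file1.txt is never opened for writing while it is being read, nothing is lost if sort fails partway through.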

Execlp function unable to find binaries

The contents of the two write()s will not be separated; the reader just reads them as one block, with the two strings concatenated:


#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

int main(void){
    int out[2];
    pipe(out);
    char file1[1024],file2[1024];
    int pid=fork();
    if(pid>0){
        /* Parent: read two names and write them to the pipe. */
        close(out[0]);
        scanf(" %1023s",file1);
        scanf(" %1023s",file2);
        /* No delimiter is written between the names, so they arrive merged. */
        write(out[1],file1,strlen(file1));
        //write(out[1],"\0",1);
        write(out[1],file2,strlen(file2));
    }
    if(pid==0){
        /* Child: the first read() returns everything available, i.e. both names. */
        int cnt=0;
        close(out[1]);
        cnt=read(out[0],file1,sizeof file1 - 1);   /* leave room for the terminator */
        file1[cnt]=0;
        cnt = read(out[0],file2,sizeof file2 - 1);
        file2[cnt]=0;

        fprintf(stderr, "about to execlp(sort|%s|%s|NULL)\n",file1,file2);
        execlp("sort","sort",file1,NULL);

        return 0;
    }
    return 0;
}

Running ./a.out with the input wtf omg gives this output:

plasser@pisbak$ ./a.out
wtf omg
about to execlp(sort|wtfomg||NULL)
plasser@pisbak$ sort: cannot read: wtfomg: No such file or directory

So the message is not about the binary not being found: sort itself ran, but it could not find the file wtfomg that was passed to it as an argument.
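The same pipe behaviour can be reproduced from the shell. This is only an analogue of the C program, with printf standing in for the two write() calls; the newline delimiter and the variable names are assumptions chosen for the sketch:

```shell
# Two writes with no delimiter: the reader sees one undivided byte stream.
merged=$( { printf 'wtf'; printf 'omg'; } | cat )
echo "$merged"

# Ending each name with a newline lets the reader split them apart again.
split=$( { printf 'wtf\n'; printf 'omg\n'; } | { read -r f1; read -r f2; printf '%s|%s\n' "$f1" "$f2"; } )
echo "$split"
```

In the C program, the corresponding change would be to write a delimiter between the two names (the commented-out write of "\0" points in that direction) and to split on it in the child before calling execlp.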
