Linux: Merging multiple files, each on a new line
Just use awk:
awk 'FNR==1{print ""}1' *.txt
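For example, with two small sample files (the names a.txt and b.txt are made up for illustration):

```shell
# Sample inputs (hypothetical names).
printf 'one\ntwo\n' > a.txt
printf 'three\n' > b.txt

# FNR resets to 1 at the start of each input file, so the rule
# prints a blank line before every file (including the first);
# the trailing 1 then prints each record unchanged.
awk 'FNR==1{print ""}1' a.txt b.txt
```

This prints a blank line, then one and two, then another blank line, then three.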
Concatenating Files And Insert New Line In Between Files
You can do:
for f in *.txt; do (cat "${f}"; echo) >> finalfile.txt; done
Make sure the file finalfile.txt does not exist before you run the above command.
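If you'd rather not worry about a stale finalfile.txt, one variation is to redirect the whole loop once, using an output name that doesn't match the *.txt glob (combined.out here is an arbitrary choice):

```shell
# Sample inputs for illustration.
printf 'alpha\n' > a.txt
printf 'beta\n' > b.txt

# Redirecting the entire loop truncates the output up front,
# so leftovers from a previous run cannot accumulate.
for f in *.txt; do cat "$f"; echo; done > combined.out
```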
If you are allowed to use awk, you can do:
awk 'FNR==1{print ""}1' *.txt > finalfile.txt
Bash: concatenate multiple files and add \newline between each?
If you want the literal string "\newline", try this:
for f in *.md; do cat "$f"; echo "\newline"; done > output.md
This assumes that output.md doesn't already exist. If it does (and you want to include its contents in the final output), you could do:
for f in *.md; do cat "$f"; echo "\newline"; done > out && mv out output.md
This prevents the error cat: output.md: input file is output file.
If you want to overwrite it, you should just rm it before you start.
How to merge two files line by line in Bash
You can use paste:
paste file1.txt file2.txt > fileresults.txt
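For example (file names and contents made up for illustration):

```shell
printf 'a1\na2\n' > file1.txt
printf 'b1\nb2\n' > file2.txt

# paste joins corresponding lines, separated by a tab by default;
# use -d to pick a different delimiter.
paste file1.txt file2.txt
```

The first output line is a1 and b1 joined by a tab, the second a2 and b2.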
How to append contents of multiple files into one file
You need the cat command (short for concatenate), with shell redirection (>) into your output file:
cat 1.txt 2.txt 3.txt > 0.txt
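A quick self-contained check of the same idea (the sample contents are made up):

```shell
printf 'first\n' > 1.txt
printf 'second\n' > 2.txt
printf 'third\n' > 3.txt

# > truncates/creates 0.txt; use >> instead if you want to append.
cat 1.txt 2.txt 3.txt > 0.txt
```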
Using cat command start each file with new line
Quick and dirty, add newlines before each METADATA:
cat F_Worker_TEMP_VO.dat ....dat | sed 's/METADATA/\nMETADATA/g' > Worker.dat
Dirty, add newlines before each METADATA except first:
cat F_Worker_TEMP_VO.dat ....dat | sed 's/\(.\)METADATA/\1\nMETADATA/g' > Worker.dat
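As a toy illustration of the second expression (the record contents are made up), note that GNU sed interprets \n in the replacement as a newline:

```shell
# Two records run together, plus a leading METADATA marker.
printf 'METADATAfoo barMETADATAbaz\n' > records.dat

# Break the line before every METADATA that follows another
# character; the leading METADATA is left alone.
sed 's/\(.\)METADATA/\1\nMETADATA/g' records.dat
```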
Loop:
for file in F_Worker_TEMP_VO.dat ... F_PERSON_NATIONALIDE_SSN_TEMP_VO.dat; do
    cat "${file}"
    echo
done > Worker.dat
Concatenate text files, separating them with a new line
A simple
sort -u *.db > uniquified # adjust glob as needed
should do it; sort will interpose newlines between files, should that be necessary.
cat *.db | sort -u is a classic UUoC (Useless Use of Cat), and the glitch with files lacking trailing newlines is not the only issue.
Having said that, 25GB probably won't fit in your RAM, so sort will end up creating temporary files anyway. It might turn out to be faster to sort the files in four or five groups and then merge the results. That could take better advantage of the large number of duplicates. But I'd only experiment if the simple command really takes an exorbitant amount of time.
Even so, sorting the files individually is probably even slower; usually the best bet is to max out your memory resources for each invocation of sort. You could, for example, use xargs with the -n option to split the file list into groups of a couple of dozen files each. Once you have each group sorted, you could use sort -m to merge the sorted temporaries.
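A rough sketch of that xargs + sort -m pipeline, scaled down to toy data (the file names, the group size of 2, and the chunks/ directory are all assumptions to adapt):

```shell
# Small sample .db files for illustration.
printf 'b\na\n' > part1.db
printf 'c\na\n' > part2.db
printf 'd\nb\n' > part3.db

mkdir -p chunks
# Sort the files in groups of 2 (use a couple of dozen in practice);
# $$ expands inside each sh invocation, giving every chunk a distinct name.
printf '%s\n' *.db | xargs -n 2 sh -c 'sort -u "$@" > "chunks/chunk.$$"' sh

# -m merges the already-sorted temporaries without re-sorting.
sort -m -u chunks/chunk.* > uniquified
```

This simple form assumes file names without whitespace; with GNU tools you could use printf '%s\0' and xargs -0 instead.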
A couple of notes on how to improve sorting speed:
- Use LC_COLLATE=C sort if you don't need locale-aware sorting of alphabetic data. That typically speeds sort up by a factor of three or four.
- Avoid using RAM disks for temporary space. (On many Linux distros, /tmp is a RAM disk.) Since sort uses temporary files on disk when it runs out of RAM, putting the temporaries on a RAM disk is counterproductive. For the same reason, don't put your own temporary output files in /tmp. /var/tmp should be real disk; even better, if possible, use a second disk drive (not a slow USB drive, of course).
- Avoid bogging your machine down with excessive swapping while you're doing the sort by turning swap off: sudo swapoff -a. You can turn it back on afterwards, although I personally run my machine like this all the time because it avoids diving into complete unresponsiveness under memory pressure.
- The ideal is to adjust -S so that sort uses as much memory as you can spare, and avoid the use of internal temporaries by sorting in chunks which fit into that amount of memory. (Merging the sorted chunks is a lot faster than sorting, and it reads and writes sequentially without needing extra disk space.) You'll probably need to do some experimentation to find a good chunk size.
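Putting those notes together, a tuned invocation might look like this (the buffer size and temp directory are assumptions to adjust for your machine; the sample files just make the command runnable as shown):

```shell
# Sample inputs for illustration.
printf 'b\na\n' > sample1.db
printf 'c\na\n' > sample2.db

# LC_COLLATE=C: byte-order collation; -S: in-memory buffer size;
# -T: temporary directory on real disk rather than a RAM-backed /tmp.
LC_COLLATE=C sort -u -S 64M -T /var/tmp *.db > uniquified
```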
Merging two text files into new one (back and forth every new line) using C in Linux using system-calls
You have to keep track of whether each file has been fully read or not; otherwise the read in the first while loop will just keep reading, and that's not what you want.
Code, edited after a comment:
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <stdbool.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>
bool WriteLineFromFile(int dst, int src, bool *srcTerminated)
{
    int lastChar = EOF;
    char currentChar;
    ssize_t nbCharRead;
    ssize_t nbCharWrite;

    do {
        if ((nbCharRead = read(src, &currentChar, 1)) < 0) {
            fprintf(stderr, "%s : read(src, &buf, 1) : src=%d, errno='%s'.\n", __func__, src, strerror(errno));
            return (false);
        }
        // End of file
        if (nbCharRead == 0) {
            (*srcTerminated) = true;
            // Adding '\n' if necessary
            if (lastChar != '\n' && lastChar != EOF) {
                currentChar = '\n';
                while ((nbCharWrite = write(dst, &currentChar, 1)) != 1) {
                    if (nbCharWrite < 0) {
                        fprintf(stderr, "%s : write(dst, &buf, 1) : dst=%d, errno='%s'.\n", __func__, dst, strerror(errno));
                        return (false);
                    }
                    sleep(1);
                }
            }
            return (true);
        }
        // Writing a char into the dst file
        while ((nbCharWrite = write(dst, &currentChar, 1)) != 1) {
            if (nbCharWrite < 0) {
                fprintf(stderr, "%s : write(dst, &buf, 1) : dst=%d, errno='%s'.\n", __func__, dst, strerror(errno));
                return (false);
            }
            sleep(1);
        }
        lastChar = currentChar;
    } while (currentChar != '\n');
    return (true);
}
bool FileMerging(char *inputPathFile1, char *inputPathFile2, char *outputPathFile)
{
    int inputFile1 = -1;
    bool file1Terminated = false;
    int inputFile2 = -1;
    bool file2Terminated = false;
    int outputFile = -1;
    bool returnFunction = false;

    // Opening all the file descriptors
    if ((inputFile1 = open(inputPathFile1, O_RDONLY)) == -1) {
        fprintf(stderr, "%s : open(inputPathFile1, O_RDONLY) : inputPathFile1='%s', errno='%s'.\n", __func__, inputPathFile1, strerror(errno));
        goto END_FUNCTION;
    }
    if ((inputFile2 = open(inputPathFile2, O_RDONLY)) == -1) {
        fprintf(stderr, "%s : open(inputPathFile2, O_RDONLY) : inputPathFile2='%s', errno='%s'.\n", __func__, inputPathFile2, strerror(errno));
        goto END_FUNCTION;
    }
    // O_TRUNC empties a pre-existing output file instead of leaving stale data behind
    if ((outputFile = open(outputPathFile, O_WRONLY | O_CREAT | O_TRUNC, 0644)) == -1) {
        fprintf(stderr, "%s : open(outputPathFile, O_WRONLY | O_CREAT | O_TRUNC) : outputPathFile='%s', errno='%s'.\n", __func__, outputPathFile, strerror(errno));
        goto END_FUNCTION;
    }
    // Alternately write a line from inputFile1 and inputFile2 to outputFile
    do {
        if (!file1Terminated) {
            if (!WriteLineFromFile(outputFile, inputFile1, &file1Terminated)) {
                goto END_FUNCTION;
            }
        }
        if (!file2Terminated) {
            if (!WriteLineFromFile(outputFile, inputFile2, &file2Terminated)) {
                goto END_FUNCTION;
            }
        }
    } while (!file1Terminated || !file2Terminated);
    returnFunction = true;

/* GOTO */END_FUNCTION:
    if (inputFile1 != -1) {
        close(inputFile1);
    }
    if (inputFile2 != -1) {
        close(inputFile2);
    }
    if (outputFile != -1) {
        close(outputFile);
    }
    return (returnFunction);
}
int main(int argc, char *argv[])
{
    if (argc != 4) {
        fprintf(stderr, "This program expects 3 command-line arguments: inputPathFile1 inputPathFile2 outputPathFile.\n");
        return (EXIT_FAILURE);
    }
    if (!FileMerging(argv[1], argv[2], argv[3])) {
        return (EXIT_FAILURE);
    }
    return (EXIT_SUCCESS);
}