How can I efficiently move many files to a new server?
You have several options; my favorite is rsync.
rsync -a [dir1] [dir2]
With -a (archive mode, which recurses into directories), this command compares the two directories and transfers only the differences between them. Without -r or -a, rsync skips directories.
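With this, I would most likely use the following:
rsync -az -e ssh user@example.com:/var/www/ /var/www/
-a Archive mode: recurse and preserve permissions, times, and symlinks
-z Compress data during the transfer
-e Remote shell to use (here ssh)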
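To make the "sync only the differences" idea concrete, here is a minimal Python sketch of the kind of check rsync does by default, where a file is skipped when its size and modification time match on both sides. The function name `files_needing_sync` is hypothetical, and the sketch only lists candidates; it does not transfer anything or implement rsync's delta algorithm.

```python
import os


def files_needing_sync(src, dst):
    """Return relative paths under src that are missing from dst,
    or whose size or modification time (whole seconds) differs."""
    pending = []
    for dirpath, _dirnames, filenames in os.walk(src):
        for name in filenames:
            src_path = os.path.join(dirpath, name)
            rel = os.path.relpath(src_path, src)
            dst_path = os.path.join(dst, rel)
            if not os.path.isfile(dst_path):
                pending.append(rel)  # missing on the destination
                continue
            s, d = os.stat(src_path), os.stat(dst_path)
            if s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime):
                pending.append(rel)  # present but changed
    return sorted(pending)
```

Calling `files_needing_sync("/var/www", "/mnt/backup/www")` (both paths are placeholders) would return the files a sync still has to copy.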
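You could also use SFTP, the file transfer protocol that runs over SSH.
Or even wget, provided the old server exposes the files over FTP or HTTP (wget does not support ssh:// URLs):
wget -r -c ftp://user@example.com/var/www/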
Efficient way to transfer files to 1000s of servers
Sure, you would use some sort of script, since you don't want to do that manually.
But instead of sending all the files from one server to all the others, you would start by sending the file to k servers. As soon as these k servers have received the file (say at time t), they can start distributing it too, so after approximately 2*t about k^2 servers have the file, instead of the 2*k that the original solution would have reached. After 3*t, about k^3 servers have it, and so on. You continue until every server has its copy.
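As a rough check of that growth, the sketch below counts rounds (each lasting t) under one simplifying assumption: in every round, each machine that already holds the file, including the source, sends it to k new servers. The function names `rounds_tree` and `rounds_direct` are hypothetical.

```python
import math


def rounds_tree(n_servers, k):
    """Rounds needed when every holder forwards the file to k new servers per round."""
    if k < 1:
        raise ValueError("k must be at least 1")
    holders = 1  # the source machine
    rounds = 0
    while holders - 1 < n_servers:  # holders - 1 = target servers reached so far
        holders += holders * k
        rounds += 1
    return rounds


def rounds_direct(n_servers, k):
    """Rounds needed when only the source sends, k servers per round."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return math.ceil(n_servers / k)
```

With 1000 servers and k = 3, `rounds_tree(1000, 3)` gives 5 rounds, while `rounds_direct(1000, 3)` gives 334.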
To speed the process up further, you could also split the file into chunks, so that a server can start redistributing it before it has received the whole file (you end up with something like BitTorrent).
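A minimal sketch of the chunking step, assuming fixed-size chunks: each chunk is yielded as soon as it is read, so a relay could forward it without waiting for the rest. The name `iter_chunks`, the 4 MiB default, and the SHA-256 checksum per chunk are assumptions added for illustration, not part of the answer above.

```python
import hashlib


def iter_chunks(path, chunk_size=4 * 1024 * 1024):
    """Yield (index, sha256 hex digest, data) for each fixed-size chunk of a file."""
    with open(path, "rb") as f:
        index = 0
        while True:
            data = f.read(chunk_size)
            if not data:
                break  # end of file
            yield index, hashlib.sha256(data).hexdigest(), data
            index += 1
```

The receiver can verify each chunk against its digest and reassemble the file by index, which is the same basic idea BitTorrent uses for its pieces.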
Fastest way to move files on a Windows System
Robocopy
You can restart the command and it'll resume. I use it all the time over the network. Works on large files as well.
Efficient way to create an image of millions of files for transfer
Do you really think a single 1.5TB file will be easier to copy over than the individual files? Especially considering you then have to split it up again. It also requires twice as much disk space at both ends to hold both the archive and the small files.
I recommend using a tool with backup and resume support such as robocopy to replicate the individual files to the target machine.
http://technet.microsoft.com/en-us/library/cc733145.aspx
Fastest way to copy files from one directory to another
Since your I/O subsystem is almost certainly the bottleneck here, using the Task Parallel Library is probably about as good as it gets:
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

static class Program
{
    static void Main(string[] args)
    {
        DirectoryInfo source = new DirectoryInfo(args[0]);
        DirectoryInfo destination = new DirectoryInfo(args[1]);
        // De-duplicate the names, ignoring case as Windows file systems do.
        HashSet<string> filesToBeCopied = new HashSet<string>(ReadFileNamesFromDatabase(), StringComparer.OrdinalIgnoreCase);

        // You'll probably have to play with MaxDegreeOfParallelism so as to avoid swamping the I/O system.
        ParallelOptions options = new ParallelOptions { MaxDegreeOfParallelism = 4 };

        // Each name is used as a search pattern inside the source directory.
        Parallel.ForEach(filesToBeCopied.SelectMany(fn => source.EnumerateFiles(fn)), options, fi =>
        {
            string destinationPath = Path.Combine(destination.FullName, Path.ChangeExtension(fi.Name, ".jpg"));
            // overwrite: false, so this throws if the destination file already exists.
            fi.CopyTo(destinationPath, false);
        });
    }

    public static IEnumerable<string> ReadFileNamesFromDatabase()
    {
        // The using blocks close the connection once enumeration finishes.
        using (SqlConnection connection = new SqlConnection("connection-string"))
        using (SqlCommand cmd = connection.CreateCommand())
        {
            cmd.CommandType = CommandType.Text;
            cmd.CommandText = @"
                select idPic ,
                       namePicFile
                from DocPicFiles
                ";
            connection.Open();
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    yield return reader.GetString(1); // column 1 = namePicFile
                }
            }
        }
    }
}
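For comparison, the same bounded-parallelism idea can be sketched in Python with a thread pool. This is not a port of the code above: the function name `copy_files` is hypothetical, the file names are passed in rather than read from a database, extensions are left unchanged, and existing destination files are overwritten.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def copy_files(names, source_dir, destination_dir, max_workers=4):
    """Copy the named files from source_dir to destination_dir,
    running at most max_workers copies at a time."""
    os.makedirs(destination_dir, exist_ok=True)

    def copy_one(name):
        target = os.path.join(destination_dir, name)
        shutil.copy2(os.path.join(source_dir, name), target)  # also copies timestamps
        return target

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces every copy to finish and re-raises the first error, if any.
        return list(pool.map(copy_one, names))
```

As with MaxDegreeOfParallelism, the right `max_workers` value depends on the storage; more threads do not help once the disks are saturated.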
Moving lots (1million+) files from one server to another - 'too many arguments'
tar's -T - option will read the file names from stdin. So you can do something like:
find . -name \*.jpg -print0 | tar -zcvf images.tar.gz --null -T -
However I would recommend rsync instead, as I noted in the comments.
As noted in the comments, -print0 makes find terminate each file name with a null character ('\0'), and --null tells tar to expect that, so file names containing spaces, newlines, and other special characters are handled correctly.
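If you would rather avoid the shell's argument list altogether, a rough Python equivalent of the find and tar pipeline is sketched below. The function name `archive_matching` is hypothetical. Because the file names never pass through a command line, neither the "too many arguments" limit nor special characters in names come into play.

```python
import fnmatch
import os
import tarfile


def archive_matching(root, pattern, archive_path):
    """Add every file under root whose name matches pattern to a gzip-compressed tar.
    Returns the number of files added."""
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in fnmatch.filter(filenames, pattern):
                full = os.path.join(dirpath, name)
                # Store paths relative to root, as tar does when run from that directory.
                tar.add(full, arcname=os.path.relpath(full, root))
                count += 1
    return count
```

For example, `archive_matching(".", "*.jpg", "images.tar.gz")` builds the same kind of archive as the pipeline above.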