How to Efficiently Move Many Files to a New Server

How can I efficiently move many files to a new server?

You have multiple options; my favorite is rsync:

rsync [dir1] [dir2]

This command compares the two directories and syncs only the differences between them.

With this, I would most likely use the following:

rsync -z -e ssh user@example.com:/var/www/ /var/www/

-z Compress file data during the transfer

-e Specify the remote shell to use (here, ssh)
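
If you also want to preserve permissions, ownership, and timestamps, and to be able to resume an interrupted transfer, something along these lines should work (the paths are placeholders; -a, --partial, and --progress are standard rsync options):

rsync -az --partial --progress -e ssh user@example.com:/var/www/ /var/www/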

You could also use SFTP (file transfer over SSH).
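
For example, OpenSSH's sftp client can fetch a directory tree recursively in a single non-interactive call (the -r option needs a reasonably recent OpenSSH; paths are placeholders):

sftp -r user@example.com:/var/www/ /var/www/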

Or even wget, bearing in mind that wget speaks HTTP(S) and FTP rather than SSH, so the files would have to be reachable over one of those protocols:

wget -rc ftp://user@example.com/var/www/

Here -r downloads recursively and -c resumes partially downloaded files.

Efficient way to transfer files to 1000s of servers

Sure, you would use some sort of script, since you don't want to do that manually.
But instead of sending all the files from one server to all the others, you would start by sending the file to k servers. As soon as those k servers have received the file (say at time t), they can start distributing it too, so after approximately time 2t about k^2 servers have the file, instead of 2k with the naive one-sender approach. After time 3t roughly k^3 servers have the file, and so on. You continue with that algorithm until every server has its copy.

To make the whole process a bit faster still, you could also divide the file into chunks, so that a server can start redistributing data before it has received the whole file (you will end up with something like BitTorrent).
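
Here is a minimal shell sketch of that fan-out, assuming passwordless ssh/scp between all hosts and that the file and the script sit in the current directory; the script name, the value of K, and the host handling are illustrative, not from the original answer:

#!/bin/bash
# distribute.sh -- minimal sketch of the k-ary fan-out described above.
# Usage: ./distribute.sh <file> <host1> <host2> ...
# Assumes passwordless ssh/scp between all hosts; all names here are
# illustrative assumptions.
K=4
FILE=$1; shift
HOSTS=("$@")
N=${#HOSTS[@]}
(( N == 0 )) && exit 0

for ((i = 0; i < K && i < N; i++)); do
  (
    # push the file and the script itself to the i-th first-level host
    scp -q "$FILE" distribute.sh "${HOSTS[i]}:"

    # hand that host an equal (round-robin) share of the remaining hosts
    share=()
    for ((j = K + i; j < N; j += K)); do
      share+=("${HOSTS[j]}")
    done

    # the remote host repeats the procedure for its share
    ssh "${HOSTS[i]}" "./distribute.sh $FILE ${share[*]}"
  ) &
done
wait   # each level multiplies the number of senders by roughly K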

Fastest way to move files on a Windows System

Robocopy

If the command is interrupted, you can simply rerun it and it will resume where it left off. I use it all the time over the network, and it works on large files as well.
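
As a starting point, something like the following (paths are placeholders; /E copies subdirectories including empty ones, /Z enables restartable mode, and /R and /W tune the retry behavior):

robocopy C:\source D:\destination /E /Z /R:3 /W:5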

Efficient way to create an image of millions of files for transfer

Do you really think a single 1.5 TB file will be easier to copy over than the individual files? Especially considering that you then have to split them up again. It also requires twice as much disk space at both ends, to hold both the archive and the small files.

I recommend using a tool with backup and resume support such as robocopy to replicate the individual files to the target machine.

http://technet.microsoft.com/en-us/library/cc733145.aspx
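
For millions of small files over the network, an invocation along these lines is a reasonable sketch (server and share names are placeholders; /MIR mirrors the source tree, /Z makes copies restartable, /MT:32 runs 32 copy threads on Windows 7 / Server 2008 R2 and later, and /LOG sends output to a file instead of the console):

robocopy \\oldserver\share \\newserver\share /MIR /Z /MT:32 /R:1 /W:1 /LOG:copy.log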

Fastest way to copy files from one directory to another

Since your I/O subsystem is almost certainly the bottleneck here, using the Task Parallel Library is probably about as good as it gets:

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

static void Main( string[] args )
{
    DirectoryInfo source      = new DirectoryInfo( args[0] ) ;
    DirectoryInfo destination = new DirectoryInfo( args[1] ) ;

    // case-insensitive set of the file names we actually want to copy
    HashSet<string> filesToBeCopied = new HashSet<string>( ReadFileNamesFromDatabase() , StringComparer.OrdinalIgnoreCase ) ;

    // you'll probably have to play with MaxDegreeOfParallelism so as to avoid swamping the I/O system
    ParallelOptions options = new ParallelOptions { MaxDegreeOfParallelism = 4 } ;

    // each database file name is used as a search pattern in the source directory
    Parallel.ForEach( filesToBeCopied.SelectMany( fn => source.EnumerateFiles( fn ) ) , options , fi =>
    {
        string destinationPath = Path.Combine( destination.FullName , Path.ChangeExtension( fi.Name , ".jpg" ) ) ;
        fi.CopyTo( destinationPath , false ) ; // false: throw rather than silently overwrite an existing file
    } ) ;
}

public static IEnumerable<string> ReadFileNamesFromDatabase()
{
    using ( SqlConnection connection = new SqlConnection( "connection-string" ) )
    using ( SqlCommand cmd = connection.CreateCommand() )
    {
        cmd.CommandType = CommandType.Text ;
        cmd.CommandText = @"
            select idPic ,
                   namePicFile
            from DocPicFiles
            " ;

        connection.Open() ;
        using ( SqlDataReader reader = cmd.ExecuteReader() )
        {
            while ( reader.Read() )
            {
                yield return reader.GetString( 1 ) ; // column 1: namePicFile
            }
        }
    } // the using blocks dispose (and close) the reader and connection
}

Moving lots (1million+) files from one server to another - 'too many arguments'

The -T - option tells tar to read the list of file names from standard input, so you can do something like:

find . -name \*.jpg -print0 | tar -zcvf images.tar.gz --null -T -

However, I would recommend rsync instead, as I noted in the comments.

As also noted in the comments, -print0 terminates each file name with a null byte ('\0'), and --null tells tar to expect that, so the pipeline handles file names containing spaces and other awkward characters.
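
If you do take the rsync route, a minimal sketch (host and paths are placeholders) that preserves file attributes and can simply be rerun after an interruption:

rsync -a --partial /path/to/images/ user@newserver:/path/to/images/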


