Ruby Net::FTP Timeout Threads
What worked for me was wrapping the download in Ruby's Timeout.timeout, so that a hanging FTP connection could not block the thread indefinitely.
require 'timeout'

begin
  # Abort the transfer if it takes longer than 10 seconds
  Timeout.timeout(10) do
    ftp.getbinaryfile(rmls_path, local_path)
  end
  # ...
rescue Timeout::Error
  errors << "#{thread_num}> File download timed out for: #{rmls_path}"
  puts errors.last
rescue
  errors << "unable to get file > ftp response: #{ftp.last_response}"
  # ...
end
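The same pattern as a self-contained sketch that runs without an FTP server. The helper name `download_with_timeout` and its arguments are hypothetical; the block passed to it stands in for the `ftp.getbinaryfile` call.

```ruby
require 'timeout'

# Hypothetical helper: runs the given block (a stand-in for
# ftp.getbinaryfile) and records a message in `errors` if the block
# fails or does not finish within `seconds`.
def download_with_timeout(label, seconds, errors)
  Timeout.timeout(seconds) { yield }
  true
rescue Timeout::Error
  errors << "File download timed out for: #{label}"
  false
rescue StandardError => e
  errors << "Unable to get file #{label}: #{e.message}"
  false
end

errors = []
# A fast "download" succeeds; a slow one is cut off after 0.2 seconds.
download_with_timeout('fast.dat', 0.2, errors) { sleep 0.01 }
download_with_timeout('slow.dat', 0.2, errors) { sleep 5 }
puts errors
```

Returning true or false instead of re-raising lets each thread carry on with its next file after a failure.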
Hanging FTP downloads were making my threads appear to hang. Now that the downloads time out, I can wait for the threads the proper way:
threads.each { |t| t.join }
rather than the ugly:
# Keep waiting while @last_updated was refreshed within the last 20 seconds,
# re-checking every 3 seconds
while Time.now < @last_updated + 20 do
  sleep 3
end
# The threads are hanging, so joining them does not work; kill them instead.
threads.each { |t| t.kill }
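A middle ground between the two approaches is `Thread#join` with a limit, which returns nil when the thread has not finished in time. The sketch below is only an illustration; the helper name `join_all` and the shared deadline are assumptions, not part of the original answer.

```ruby
# Hypothetical helper: waits up to `limit` seconds in total for all
# threads, then kills any that are still running. Returns the threads
# that had to be killed.
def join_all(threads, limit)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + limit
  stragglers = threads.reject do |t|
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    # Thread#join(limit) returns nil if the thread did not finish in time.
    t.join([remaining, 0].max)
  end
  stragglers.each(&:kill)
  stragglers
end

threads = [Thread.new { sleep 0.01 }, Thread.new { sleep 10 }]
stuck = join_all(threads, 0.3)
puts "killed #{stuck.size} hung thread(s)"
```

Note that `join` re-raises any exception that ended a thread, so worker code should rescue its own errors.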
How to give a timeout to an FTP connection
I'm not sure if it hangs indefinitely. If not, the best way would be to try and capture the error code when/if it eventually times out. That would give a bit more info for analysis.
Some possible workarounds below.
Timeouts using Process.fork
In the meantime, you could run the FTP task in a separate process and apply the timeout to that process instead. This prevents Ruby's global interpreter lock from suppressing the timeout event, which is what you suspect is happening now.
Something like this:
require 'timeout'

child = Process.fork do
  # Run the whole FTP task in here...
  ftp = Net::FTP.new(...)
  ...
end
# Timeout handling is done in the parent process
begin
  Timeout.timeout(...) do
    Process.wait(child)
  end
rescue Timeout::Error
  # Terminate the child if it times out, then reap it to avoid a zombie
  Process.kill("KILL", child)
  Process.wait(child)
end
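A runnable sketch of the fork approach, with `sleep` standing in for the FTP task. The helper name `run_in_child` is hypothetical, and `Process.fork` is only available on Unix-like platforms.

```ruby
require 'timeout'

# Hypothetical helper: runs the block in a child process and waits at
# most `seconds` for it. Returns the child's Process::Status, or nil if
# the child had to be killed because it timed out.
def run_in_child(seconds, &task)
  child = Process.fork(&task)
  begin
    _pid, status = Timeout.timeout(seconds) { Process.wait2(child) }
    status
  rescue Timeout::Error
    Process.kill('KILL', child)
    Process.wait(child) # reap the killed child so it does not linger as a zombie
    nil
  end
end

finished = run_in_child(2) { sleep 0.05 }
puts "finished normally: #{finished.success?}"
hung = run_in_child(0.3) { sleep 30 }
puts "hung task killed: #{hung.nil?}"
```

Because the child is a separate process, results have to come back through files, pipes, or the exit status rather than shared variables.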
Timeouts using SystemTimer
Another option, since you're running Ruby 1.8.6, would be to take a look at SystemTimer, which tries to get around the limitations of the Ruby 1.8 Timeout implementation.
How to recursively download FTP folder in parallel in Ruby?
The syncftp gem may help you:
http://rubydoc.info/gems/syncftp/0.0.3/frames
Ruby has a decent built-in FTP library in case you want to roll your own:
http://www.ruby-doc.org/stdlib-1.9.3/libdoc/net/ftp/rdoc/Net/FTP.html
To download files in parallel, you can use multiple threads with timeouts:
Ruby Net::FTP Timeout Threads
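A minimal sketch of threads with timeouts, using a fixed pool of workers that pull paths from a queue. The helper name `parallel_fetch` and its keyword arguments are assumptions; the block is a stand-in for the real download call. With Net::FTP, each worker would probably need its own connection, since a single control connection handles one transfer at a time.

```ruby
require 'timeout'

# Hypothetical helper: processes every path using `workers` threads.
# The block receives one path at a time (e.g. it could wrap
# ftp.getbinaryfile). Returns [done, failed], where failed holds
# [path, error_class_name] pairs.
def parallel_fetch(paths, workers: 4, timeout: 10, &fetch)
  queue = Queue.new
  paths.each { |path| queue << path }
  done = Queue.new
  failed = Queue.new

  threads = Array.new(workers) do
    Thread.new do
      loop do
        path = begin
          queue.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break
        end
        begin
          Timeout.timeout(timeout) { fetch.call(path) }
          done << path
        rescue StandardError => e
          failed << [path, e.class.name]
        end
      end
    end
  end
  threads.each(&:join)

  [Array.new(done.size) { done.pop }, Array.new(failed.size) { failed.pop }]
end

done, failed = parallel_fetch(%w[a.txt b.txt], workers: 2, timeout: 1) { |path| sleep 0.01 }
puts "done: #{done.size}, failed: #{failed.size}"
```

Since every task is bounded by the timeout and every error is rescued inside the worker, the final `join` cannot hang on a stuck download.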
A great way to get parallel work done is Celluloid, a concurrency framework:
https://github.com/celluloid/celluloid
All that said, if the download speed is limited by your overall network bandwidth, then none of these approaches will help much.
To speed up the transfers in this case, be sure you're only downloading the information that's changed: new files and changed sections of existing files.
Segmented downloading can give massive speedups in some cases, such as downloading big log files where only a small percentage of the file has changed and all the changes are appends at the end of the file.
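The append-only case can be sketched as follows, assuming the remote file only ever grows: fetch just the bytes beyond the current local size. The helper name `append_new_bytes` is hypothetical, and a StringIO stands in for the remote file so the example runs offline. Net::FTP also has a `resume` attribute that is meant to make `getbinaryfile` continue from the local file's size, which is worth checking in the Net::FTP documentation before relying on it.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper: copies only the bytes of `remote` that lie beyond
# the current size of the local file. `remote` is any seekable IO; here
# it stands in for a remote log that only grows by appends.
# Returns the number of bytes appended.
def append_new_bytes(remote, local_path)
  offset = File.size?(local_path) || 0
  remote.seek(offset)
  tail = remote.read || ''
  File.open(local_path, 'ab') { |f| f.write(tail) }
  tail.bytesize
end

local = Tempfile.new('log')
local.write("line 1\n")
local.flush
remote = StringIO.new("line 1\nline 2\n")
copied = append_new_bytes(remote, local.path)
puts "appended #{copied} bytes"
```

This is only safe when earlier bytes never change; a file that is rewritten in place needs a full download or a checksum comparison.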
You can also consider shelling out to the command line, where many tools can help. A good general-purpose one is "curl", which supports simple byte ranges for FTP files as well. For example, you can get the first 100 bytes of a document over FTP like this:
curl -r 0-99 ftp://www.get.this/README
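Shelling out from Ruby could look roughly like the sketch below. The helper names `curl_range_command` and `fetch_range` are hypothetical; only the `-r`, `-o`, `--silent`, and `--show-error` curl options are used, and curl is assumed to be installed and on the PATH.

```ruby
require 'open3'

# Hypothetical helper: builds the argument list for fetching an
# inclusive byte range with curl. Passing an array instead of a single
# string avoids shell quoting problems with unusual file names.
def curl_range_command(url, first_byte, last_byte, output_path = nil)
  cmd = ['curl', '--silent', '--show-error', '-r', "#{first_byte}-#{last_byte}"]
  cmd += ['-o', output_path] if output_path
  cmd << url
end

# Hypothetical helper: runs curl and returns the fetched bytes,
# raising if curl exits with a non-zero status.
def fetch_range(url, first_byte, last_byte)
  cmd = curl_range_command(url, first_byte, last_byte)
  out, err, status = Open3.capture3(*cmd, binmode: true)
  raise "curl failed: #{err.strip}" unless status.success?
  out
end

puts curl_range_command('ftp://www.get.this/README', 0, 99).join(' ')
```

Calling `fetch_range('ftp://www.get.this/README', 0, 99)` would then return the first 100 bytes, provided the server supports ranged retrieval.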
Are you open to other protocols besides FTP? Take a look at the "rsync" command, which is excellent for download synchronization. The rsync command has many optimizations to transfer just the changed data. For example rsync can sync a remote directory to a local directory like this:
rsync -auvC me@my.com:/remote/foo/ /local/foo/