How to Recursively Download a Folder via FTP on Linux

How to recursively download a folder via FTP on Linux

You can rely on wget, which usually handles recursive FTP retrieval properly (at least in my experience). For example:

wget -r ftp://user:pass@server.com/

You can also use -m, which is suitable for mirroring. It is currently equivalent to -r -N -l inf --no-remove-listing.

If your credentials contain special characters, you can pass them via the --user and --password arguments instead; put values containing $ or other shell metacharacters in single quotes. Example with a login containing special characters:

wget -r --user="user@login" --password='Pa$$wo|^D' ftp://server.com/

As pointed out by @asmaier, note that even though -r turns on recursion, it has a default maximum depth of 5:

-r
--recursive
Turn on recursive retrieving.

-l depth
--level=depth
Specify recursion maximum depth level depth. The default maximum depth is 5.

If you don't want to miss any subdirectories, it is better to use the mirroring option, -m:

-m
--mirror
Turn on options suitable for mirroring. This option turns on recursion and time-stamping, sets infinite
recursion depth and keeps FTP directory listings. It is currently equivalent to -r -N -l inf
--no-remove-listing.
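
For instance, a minimal sketch combining -m with --no-parent (-np) so the fetch never climbs above the directory you point it at; the server name, credentials and path here are placeholders:

wget -m -np ftp://user:pass@server.com/path/to/dir/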

Downloading all files from an FTP Server

Use wget in this manner (-m for mirroring):

wget -m ftp://username:password@ip.of.old.host

If your username or password contains special characters, you may need to use the format:

wget -m --user=username --password=password ftp://ip.of.old.host

Alternatively, I found this guide, which shows how to do it using ncftp on Debian. You will need root access on the new server if ncftp is not already installed.

In short:

sudo apt-get install ncftp
ncftpget -T -R -v -u "ftpuser" ftp.nixcraft.net /home/vivek/backup /www-data
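
For reference, here is a hedged, commented version of the same call with a placeholder host and paths (flag descriptions as I understand them from the ncftpget man page):

# -R recurses into subdirectories, -T skips TAR mode for the recursive fetch,
# -v gives verbose output, -u sets the FTP username (the password is prompted
# for, or can be supplied with -p)
ncftpget -T -R -v -u "ftpuser" ftp.example.com /local/backup/dir /remote/dir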

Using wget to download select directories from ftp server

Based on this doc it seems that the filtering functions of wget are very limited.

When using the --recursive option, wget will download all linked documents after applying the various filters, such as --no-parent and -I, -X, -A, -R options.

In your example:

wget -r -I /pub/special.requests/cew/2013/county/ ftp://ftp.bls.gov/pub/special.requests/cew/

This won't download anything, because the -I option tells wget to include only links matching /pub/special.requests/cew/2013/county/, and on the page /pub/special.requests/cew/ there are no such links, so the download stops there. This will work, though:

wget -r -I /pub/special.requests/cew/2013/county/ ftp://ftp.bls.gov/pub/special.requests/cew/2013/

... because in this case the /pub/special.requests/cew/2013/ page does have a link to county/
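
If you only need that one subtree anyway, a simpler sketch is to skip -I altogether, start the recursion at the directory you actually want, and add --no-parent so wget never climbs back up:

wget -r -np ftp://ftp.bls.gov/pub/special.requests/cew/2013/county/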

By the way, you can find more details in this doc than in the man page:

http://www.gnu.org/software/wget/manual/html_node/

Using wget to recursively fetch a directory with arbitrary files in it

You have to pass the -np/--no-parent option to wget (in addition to -r/--recursive, of course); otherwise it will follow the parent-directory link in the directory index and walk back up the tree. So the command would look like this:

wget --recursive --no-parent http://example.com/configs/.vim/

To avoid downloading the auto-generated index.html files, use the -R/--reject option:

wget -r -np -R "index.html*" http://example.com/configs/.vim/

How do I recursively ftp only certain file types from a linux server using the command line?

You can do it with wget

wget -r -np -A "*.htm*" ftp://site/dir

or:

wget -m -np -A "*.htm*" ftp://user:pass@host/dir

However, as per Types of Files:

Note that these two options do not affect the downloading of HTML files (as determined by a .htm or .html filename suffix). This behavior may not be desirable for all users, and may be changed for future versions of Wget.
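
Note also that -A/--accept takes a comma-separated list of filename suffixes or patterns, so a variant that accepts both suffixes in one pass (same placeholder host as above) would be:

wget -r -np -A "htm,html" ftp://site/dir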

Recursively PUT files to a remote server using FTP

Try using LFTP:
http://lftp.yar.ru/

or YAFC:
http://yafc.sourceforge.net/index.php
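
With lftp, the usual way to do a recursive upload is its mirror command in reverse mode; a minimal sketch with placeholder credentials and paths:

lftp -e "mirror -R /local/dir /remote/dir; quit" -u username,password ftp.example.com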

How to download files from my list with wget and ftp

Sorry, I missed the "single session" part when I commented. I think you need to have your script generate a second script to run a single FTP session.

So your script will not do any FTP itself; it will just write another script that does the transfers. That generated script will look something like this:

ftp -n <SOMEADDRESS> <<EOS
quote USER <USERNAME>
quote PASS <PASSWORD>
bin
get file1 localname1
get file2 localname2
...
get fileN localnameN
quit
EOS

Then it will execute that script by doing:

bash < thatScript

So your script will look like this:

#!/bin/bash
ScriptName=funkyFTPer
cat - <<END > $ScriptName
ftp -n 192.168.0.1 <<EOS
quote USER freddy
quote PASS frog
END

# Your selection code goes here ***PHNQZ***
echo get file1 localname1 >> $ScriptName
echo get file2 localname2 >> $ScriptName
echo get fileN localnameN >> $ScriptName

echo quit >> $ScriptName
echo EOS >> $ScriptName
echo "Now run bash < $ScriptName"

Then delete the script as it contains your password. Or you can put the password in your .netrc file.
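
If you go the .netrc route, a minimal sketch looks like this (same made-up host and credentials as above; the file must be readable only by you). Note that ftp -n disables auto-login, so drop the -n if you want ftp to pick the credentials up from .netrc:

cat > ~/.netrc <<'EOF'
machine 192.168.0.1
login freddy
password frog
EOF
chmod 600 ~/.netrc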

As regards creating directories locally, you can do that in the first script using mkdir -p. The -p has the advantage that it creates all directories in between in one go and doesn't get upset if they already exist.

So, just looking at the area of code where it says ***PHNQZ*** above: let's say your code decides it needs the file freddy/frog/c.txt; you could do:

remotename="freddy/frog/c.txt"  
localdir=${remotename%/*} # Get just directory part using "bash Parameter Substitution"
mkdir -p "$localdir" # make directory and all parts in between
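
Putting it together inside the selection code, the same snippet can also append the matching get line to the generated FTP script (a sketch reusing the $ScriptName variable from above and keeping the remote layout locally):

remotename="freddy/frog/c.txt"
localdir=${remotename%/*}                          # just the directory part
mkdir -p "$localdir"                               # create it, parents included
echo "get $remotename $remotename" >> $ScriptName  # fetch into the same relative path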

Downloading a full remote directory via FTP from the command line

Try lftp, or use wget with the -m flag (see https://serverfault.com/questions/25199/using-wget-to-recursively-download-whole-ftp-directories).
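
With lftp specifically, the tool for this is its mirror command; a minimal download sketch with placeholder credentials and paths (the reverse-mode variant for uploads was sketched earlier):

lftp -e "mirror --verbose /remote/dir /local/dir; quit" -u username,password ftp.example.com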

Recursively ftp download, then extract gz files

I can read the contents of the FTP page if I start R with the internet2 option, i.e.:

C:\Program Files\R\R-2.12\bin\x64\Rgui.exe --internet2

(The shortcut used to start R on Windows can be modified to add the internet2 argument via right-click / Properties / Target, or you can just run that at the command line; the equivalent on GNU/Linux is obvious.)

The text on that page can be read like this:

download.file("ftp://prism.oregonstate.edu//pub/prism/pacisl/grids", "f.txt")
txt <- readLines("f.txt")

It's a little more work to parse out the Directory listings, then read them recursively for the underlying files.

## (something like)
dirlines <- txt[grep("Directory <A HREF=", txt)]

## split and extract text after "grids/"
split1 <- sapply(strsplit(dirlines, "grids/"), function(x) rev(x)[1])

## split and extract remaining text after "/"
sapply(strsplit(split1, "/"), function(x) x[1])
[1] "dem" "ppt" "tdmean" "tmax" "tmin"

It's about here that this stops seeming very attractive and gets a bit laborious, so I would actually recommend a different option. There would no doubt be a better solution, perhaps with RCurl, but I would recommend learning to use an FTP client, for you and your users. Command-line ftp, anonymous logins, and mget all work pretty easily.
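
If a shell-based route is acceptable, a hedged sketch (assuming anonymous FTP access and gzipped files under grids/) is to mirror the tree with wget and then decompress it:

wget -r -np ftp://prism.oregonstate.edu/pub/prism/pacisl/grids/
find prism.oregonstate.edu/pub/prism/pacisl/grids -name "*.gz" -exec gunzip {} +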

The internet2 option was explained for a similar ftp site here:

https://stat.ethz.ch/pipermail/r-help/2009-January/184647.html


