Check for Existence of Wget/Curl

Check for existence of wget/curl

wget http://download/url/file 2>/dev/null || curl -O http://download/url/file
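If you'd rather check which downloader is installed before calling it, command -v works in any POSIX shell. A minimal sketch (the URL is the same placeholder as above):

#!/bin/sh
# Sketch: use whichever downloader is available, and fail loudly otherwise.
url="http://download/url/file"
if command -v wget >/dev/null 2>&1; then
    wget "$url"
elif command -v curl >/dev/null 2>&1; then
    curl -O "$url"
else
    echo "Neither wget nor curl is installed" >&2
    exit 1
fi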

How can a .bat file check if curl or wget exists?

If you know the path where you'd expect to find the EXE, it's fairly easy:

IF EXIST C:\Windows\wget.exe ( *** do something with it ***)

...of course, you could use IF NOT EXIST with a command that copies it into place, or add an ELSE branch.

Otherwise, if you don't know where you might find the file, you can search for it with something like this (original source found here):

@echo off
SETLOCAL
(set WF=)
(set TARGET=wget.exe)

:: Look for the file in the current directory
for %%a in ("" %PATHEXT:;= %) do (
    if not defined WF if exist "%TARGET%%%~a" set WF=%CD%\%TARGET%%%~a)

:: Look for the file in the PATH
for %%a in ("" %PATHEXT:;= %) do (
    if not defined WF for %%g in ("%TARGET%%%~a") do (
        if exist "%%~$PATH:g" set WF=%%~$PATH:g))

:: Results
if defined WF (
    *** do something with it here ***
) else (
    echo The file: "%TARGET%" was not found
)

You could wrap that whole block into a function and call it once for each EXE (change the %TARGET%s back into %~1, give it a :TITLE, then call :TITLE wget.exe)...

Alternatively, you could take a different approach and just try the commands and see if they fail. Since an exit code of 0 usually means success, and IF ERRORLEVEL 1 is true whenever the exit code is 1 or higher, you could do something like this:

wget -q <TARGET_URL>
IF ERRORLEVEL 1 (
    curl -O <TARGET_URL>
    IF ERRORLEVEL 1 (
        ECHO Download failed!
        EXIT /B 1
    )
)
:: now continue on with your script...

How to check if a URL exists with the shell (probably using curl)?

Using --fail makes the exit status nonzero on a failed request. Using --head avoids downloading the file contents, since we don't need them for this check. Using --silent keeps the check itself from emitting progress or error messages.

if curl --output /dev/null --silent --head --fail "$url"; then
    echo "URL exists: $url"
else
    echo "URL does not exist: $url"
fi

If your server refuses HEAD requests, an alternative is to request only the first byte of the file:

if curl --output /dev/null --silent --fail -r 0-0 "$url"; then
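Putting the two variants together, you could wrap the check in a small helper, for example (a sketch; url_exists is an illustrative name, not part of the original answer):

# Sketch: exit 0 if the URL appears to exist. Try a HEAD request first,
# then fall back to fetching only the first byte for servers that refuse HEAD.
url_exists() {
    curl --output /dev/null --silent --head --fail "$1" ||
        curl --output /dev/null --silent --fail -r 0-0 "$1"
}

url_exists "$url" && echo "URL exists: $url"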

Check if a remote file exists in bash

It is pretty hard to understand exactly what you want to accomplish, so let me try to rephrase your question.

I have urls.txt containing:

http://example.com/dira/foo.jpg
http://example.com/dira/bar.jpg
http://example.com/dirb/foo.jpg
http://example.com/dirb/baz.jpg
http://example.org/dira/foo.jpg

On example.com these URLs exist:

http://example.com/dira/foo.jpg
http://example.com/dira/foo_001.jpg
http://example.com/dira/foo_003.jpg
http://example.com/dira/foo_005.jpg
http://example.com/dira/bar_000.jpg
http://example.com/dira/bar_002.jpg
http://example.com/dira/bar_004.jpg
http://example.com/dira/fubar.jpg
http://example.com/dirb/foo.jpg
http://example.com/dirb/baz.jpg
http://example.com/dirb/baz_001.jpg
http://example.com/dirb/baz_005.jpg

On example.org these URLs exist:

http://example.org/dira/foo_001.jpg

Given urls.txt I want to generate the combinations with _001.jpg .. _005.jpg in addition to the original URL. E.g.:

http://example.com/dira/foo.jpg

becomes:

http://example.com/dira/foo.jpg
http://example.com/dira/foo_001.jpg
http://example.com/dira/foo_002.jpg
http://example.com/dira/foo_003.jpg
http://example.com/dira/foo_004.jpg
http://example.com/dira/foo_005.jpg

Then I want to test whether these URLs exist, without downloading the files. As there are many URLs, I want to do this in parallel.

If the URL exists I want an empty file created.

(Version 1): I want the empty file created in a similar directory structure under the dir images. This is needed because some of the images have the same name, but in different dirs.

So the files created should be:

images/http:/example.com/dira/foo.jpg
images/http:/example.com/dira/foo_001.jpg
images/http:/example.com/dira/foo_003.jpg
images/http:/example.com/dira/foo_005.jpg
images/http:/example.com/dira/bar_000.jpg
images/http:/example.com/dira/bar_002.jpg
images/http:/example.com/dira/bar_004.jpg
images/http:/example.com/dirb/foo.jpg
images/http:/example.com/dirb/baz.jpg
images/http:/example.com/dirb/baz_001.jpg
images/http:/example.com/dirb/baz_005.jpg
images/http:/example.org/dira/foo_001.jpg

(Version 2): I want the empty file created in the dir images. This can be done because all the images have unique names.

So the files created should be:

images/foo.jpg
images/foo_001.jpg
images/foo_003.jpg
images/foo_005.jpg
images/bar_000.jpg
images/bar_002.jpg
images/bar_004.jpg
images/baz.jpg
images/baz_001.jpg
images/baz_005.jpg

(Version 3): I want the empty file created in the dir images, named as in urls.txt. This can be done because only one of _001.jpg .. _005.jpg exists.

images/foo.jpg
images/bar.jpg
images/baz.jpg

#!/bin/bash

do_url() {
    url="$1"

    # Version 1:
    # If you want to keep the folder structure from the server (similar to wget -m):
    wget -q --method HEAD "$url" && mkdir -p images/"$2" && touch images/"$url"

    # Version 2:
    # If all the images have unique names and you want all images in a single dir:
    wget -q --method HEAD "$url" && touch images/"$3"

    # Version 3:
    # If all the images have unique names when _###.jpg is removed and you want all images in a single dir:
    wget -q --method HEAD "$url" && touch images/"$4"
}
export -f do_url
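# Argument mapping for do_url in the parallel command below (my reading of
# GNU Parallel's replacement strings):
#   $1 = {1.}{2}   the URL with its extension removed, plus .jpg / _001.jpg .. _005.jpg
#   $2 = {1//}     the dirname of the URL
#   $3 = {1/.}{2}  the basename without extension, plus the suffix
#   $4 = {1/}      the basename of the URL as it appears in urls.txt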

parallel do_url {1.}{2} {1//} {1/.}{2} {1/} :::: urls.txt ::: .jpg _{001..005}.jpg

GNU Parallel takes a few ms per job. When your jobs are this short, the overhead will affect the timing. If none of your CPU cores are running at 100% you can run more jobs in parallel:

parallel -j0 do_url {1.}{2} {1//} {1/.}{2} {1/} :::: urls.txt ::: .jpg _{001..005}.jpg

You can also "unroll" the loop. This saves five job-startup overheads per URL:

do_url() {
    url="$1"
    base=${url##*/}   # basename, since Version 2 puts everything directly in images/
    # Version 2:
    # If all the images have unique names and you want all images in a single dir:
    wget -q --method HEAD "$url".jpg     && touch images/"$base".jpg
    wget -q --method HEAD "$url"_001.jpg && touch images/"$base"_001.jpg
    wget -q --method HEAD "$url"_002.jpg && touch images/"$base"_002.jpg
    wget -q --method HEAD "$url"_003.jpg && touch images/"$base"_003.jpg
    wget -q --method HEAD "$url"_004.jpg && touch images/"$base"_004.jpg
    wget -q --method HEAD "$url"_005.jpg && touch images/"$base"_005.jpg
}
export -f do_url

parallel -j0 do_url {.} :::: urls.txt

Finally, if you need to run more than 250 jobs in parallel, see the workaround at: https://www.gnu.org/software/parallel/man.html#EXAMPLE:-Running-more-than-250-jobs-workaround

How do I determine if a web page exists with shell scripting?

Under *NIX, you can use curl to issue a simple HEAD request (HEAD asks only for the headers, not the page body):

curl --head http://myurl/

Then you can take only the first line, which contains the HTTP status code (200 OK, 404 Not Found, etc.):

curl -s --head http://myurl/ | head -n 1

And then check whether you got a decent response (a 2xx or 3xx status code):

curl -s --head http://myurl/ | head -n 1 | grep "HTTP/1.[01] [23].."

This will output the first line if the status code is okay, or nothing if it isn't. You can also pipe that to /dev/null to get no output at all, and use $? to determine whether it worked:

curl -s --head http://myurl/ | head -n 1 | grep "HTTP/1.[01] [23].." > /dev/null
# on success (page exists), $? will be 0; on failure (page does not exist or
# is unreachable), $? will be 1

EDIT: -s simply tells curl not to show a progress bar.
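For convenience, the same pipeline can be wrapped in a function (a sketch of my own; page_exists is an illustrative name, and grep -q replaces the redirect to /dev/null):

# Sketch: exit status 0 when the status line reports a 2xx or 3xx code.
page_exists() {
    curl -s --head "$1" | head -n 1 | grep -q "HTTP/1.[01] [23].."
}

page_exists "http://myurl/" && echo "page exists"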

Check existence of an FTP file and parse the exit code

How about using curl?

 curl -I --silent ftp://username:passwd@192.168.1.63/filenotexist.txt >/dev/null

$? is 0 if the file exists,
$? is nonzero if the file doesn't exist.
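
That exit status can drive a conditional directly, for example (a small usage sketch based on the command above):

if curl -I --silent ftp://username:passwd@192.168.1.63/filenotexist.txt >/dev/null; then
    echo "file exists"
else
    echo "file does not exist"
fi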


