Get final URL after curl is redirected
curl's -w option and its url_effective variable are what you are looking for.
Something like
curl -Ls -o /dev/null -w %{url_effective} http://google.com
More info
-L Follow redirects
-s Silent mode. Don't output anything
-o FILE Write output to <file> instead of stdout
-w FORMAT What to output after completion
More
You might want to add -I (that is an uppercase i) as well, which makes the command skip downloading the body. However, it also switches the request to the HEAD method, which is not what the question asked for and risks changing what the server does. Some servers don't respond well to HEAD even when they respond fine to GET.
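As a rough sketch, the command above can be wrapped in a small shell function that also reports the last status code. The function name final_url and the --max-redirs cap of 10 are assumptions made for illustration and are not part of the original answer; http_code and url_effective are both standard -w variables.

```shell
#!/bin/sh
# final_url URL
# Follows redirects with GET, discards the body, and prints
# "<last status code> <final URL>" on one line.
# Hypothetical helper: the name and the redirect cap are illustrative choices.
final_url() {
    curl -Ls --max-redirs 10 -o /dev/null \
         -w '%{http_code} %{url_effective}\n' "$1"
}

# Usage (needs network access):
#   final_url http://google.com
```

Capping the number of redirects keeps a redirect loop from stalling a script that calls this in bulk.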
Get final redirect with Curl PHP
Use curl_getinfo() with CURLINFO_REDIRECT_URL or CURLINFO_EFFECTIVE_URL depending on your use case.
CURLINFO_REDIRECT_URL - With the CURLOPT_FOLLOWLOCATION option disabled: redirect URL found in the last transaction, that should be requested manually next. With the CURLOPT_FOLLOWLOCATION option enabled: this is empty. The redirect URL in this case is available in CURLINFO_EFFECTIVE_URL
-- http://php.net/manual/en/function.curl-getinfo.php
Example:
<?php
$url = 'https://google.com/';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
$html = curl_exec($ch);
$redirectedUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
curl_close($ch);
echo "Original URL: " . $url . "\n";
echo "Redirected URL: " . $redirectedUrl . "\n";
When I run this code, the output is:
Original URL: https://google.com/
Redirected URL: https://www.google.com/
PHP - Using cURL to get the final URL status after redirect
You're looking for CURLOPT_FOLLOWLOCATION:
TRUE to follow any "Location: " header that the server sends as part of the HTTP header (note this is recursive, PHP will follow as many "Location: " headers that it is sent, unless CURLOPT_MAXREDIRS is set).
from: http://docs.php.net/manual/da/function.curl-setopt.php
If you don't plan to use the CURLOPT_FOLLOWLOCATION option, then you must make sure you're analyzing the headers correctly to get the status.
From http://php.net/manual/en/function.curl-getinfo.php you can see:
CURLINFO_HTTP_CODE - The last response code.(...)
That means there can be more than one status code. For example, with http://airbrake.io/login there are two sent:
HTTP/1.1 301 Moved Permanently
(...)
HTTP/1.1 200 OK
(...)
That means only 200 is going to be returned when redirects are followed. If you handle redirects yourself and still want a final result, your function needs to call itself on the redirect target, something like:
if ($httpStatus >= 300 && $httpStatus < 400) {
    return getUrlStatus($redirectURL);
} else {
    return $httpStatus;
}
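To see every status in the chain rather than only the last one, one option is to dump the response headers of each hop and pull out the status lines. Below is a minimal sketch using the curl command-line tool instead of PHP. It assumes the headers arrive on standard input in the form that curl -sL -D - -o /dev/null prints them, and the function name status_chain is hypothetical.

```shell
#!/bin/sh
# status_chain: read raw HTTP response headers on stdin and print the
# status code of every response in the redirect chain, one per line.
status_chain() {
    # Status lines look like "HTTP/1.1 301 Moved Permanently" or "HTTP/2 200".
    # Strip a trailing carriage return first, since headers use CRLF endings.
    awk '{ sub(/\r$/, "") }
         /^HTTP\/[0-9.]+ [0-9][0-9][0-9]/ { print $2 }'
}

# Usage (needs network access):
#   curl -sL -D - -o /dev/null http://airbrake.io/login | status_chain
```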
get the last redirected url in curl php
Thank you everyone for helping me with this.
I want to develop a scraper in PHP for the IKEA website used in Israel (in Hebrew).
After many hours I realized that the URL I was requesting does not redirect on the server side. It may be a JavaScript redirect.
I have now implemented the code below and it works for me.
<?php
$name = "19875379";
$url = "http://www.ikea.co.il/default.asp?strSearch=" . $name;
$ch = curl_init();
$timeout = 0; // 0 means wait indefinitely for the connection
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_HEADER, TRUE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
$response = curl_exec($ch); // headers followed by the page body
$redir = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL); // URL after HTTP-level redirects (not used below)
curl_close($ch);
//print_r($response);
// The page redirects with <script>location.href='/path';</script>, so capture the path.
if ($response !== false && preg_match("/<script>location\.href='(.*?)';<\/script>/s", $response, $matches)) {
    $redirect = "http://www.ikea.co.il" . $matches[1];
    echo $redirect;
}
?>
Thanks again everyone :)
PHP - Get final url from a url after all redirections (curl + php)
The code you have is working correctly, but it is only part of what you want. When you get to the final URL redirect, your return includes...
<HTML><head></head><body>
<script LANGUAGE="JavaScript1.2">
window.location.replace('https:\/\/loomyhome.com\/collections\/all-products\/products\/blue-my-mind-rug?sscid=71k5_300lf&')
</script>
</body></html>
So you then need to extract the URL from there. You can use a regex (not my best skill) which would be something like...
preg_match('#(https:.*?)\'\)#', $ret, $match);
echo stripslashes($match[1]);
(using stripslashes to unescape the string). This gives...
https://loomyhome.com/collections/all-products/products/blue-my-mind-rug?sscid=71k5_3097f&
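The same extraction can be sketched on the command line, for cases where the page was fetched with the curl tool rather than PHP. This assumes the redirect is written exactly as window.location.replace('...') with single quotes, as in the response shown above; the function name js_redirect_url is hypothetical.

```shell
#!/bin/sh
# js_redirect_url: read an HTML page on stdin and print the URL passed to
# window.location.replace('...'), with the "\/" escapes turned back into "/".
js_redirect_url() {
    sed -n "s/.*window\.location\.replace('\([^']*\)').*/\1/p" |
        sed 's#\\/#/#g' |
        head -n 1
}

# Usage (needs network access; "$url" is whatever URL you are resolving):
#   curl -Ls "$url" | js_redirect_url
```

Like the regex above, this only handles the one pattern it was written for. A page that redirects in some other way prints nothing.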
cURL taking long time to get the final URL of redirect URL
This works fine with CURLINFO_EFFECTIVE_URL, but for that the CURLOPT_FOLLOWLOCATION option must be set to TRUE. This is because CURLINFO_EFFECTIVE_URL returns precisely what it says: the effective URL that ends up getting loaded. If CURLOPT_FOLLOWLOCATION is FALSE, the effective URL is the requested URL; otherwise it is the final URL that the request was redirected to.
I did this using curl_getinfo(), which gives information about the last transfer:
<?php
echo get_rurl("xurl");
//echo get_rurl("yurl");
function get_rurl($url) {
    // initialize cURL
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url); // specify your URL
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); // do not follow redirects
    $http_data = curl_exec($ch); // hit the $url
    $curl_info = curl_getinfo($ch);
    curl_close($ch);
    // URL this response redirects to (the next hop, not necessarily the final one)
    return $curl_info['redirect_url'];
}
?>
Or you can use CURLINFO_REDIRECT_URL or CURLINFO_EFFECTIVE_URL directly, depending on your use case:
<?php
echo get_rurl("xurl");
//echo get_rurl("yurl");
function get_rurl($url) {
    // initialize cURL
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url); // specify your URL
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); // do not follow redirects
    $http_data = curl_exec($ch); // hit the $url
    $redirect_url = curl_getinfo($ch, CURLINFO_REDIRECT_URL); // next hop, if any
    curl_close($ch);
    return $redirect_url;
}
?>
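With redirects left unfollowed, the redirect URL is only the next hop, so reaching the final URL means repeating the request. A minimal sketch of that loop with the curl command-line tool follows. It uses the redirect_url write-out variable as a counterpart to CURLINFO_REDIRECT_URL; the function names next_hop and follow_manually and the default limit of 10 hops are assumptions for illustration.

```shell
#!/bin/sh
# next_hop URL: request URL without following redirects and print the
# redirect target, or an empty line if the response is not a redirect.
next_hop() {
    curl -s -o /dev/null -w '%{redirect_url}\n' "$1"
}

# follow_manually URL [MAX_HOPS]: print each URL in the chain, ending with
# the first one that does not redirect. Returns 1 if MAX_HOPS is used up
# before the chain ends. Variables are global because POSIX sh has no "local".
follow_manually() {
    url=$1
    hops=${2:-10}
    while [ "$hops" -gt 0 ]; do
        printf '%s\n' "$url"
        next=$(next_hop "$url")
        [ -z "$next" ] && return 0
        url=$next
        hops=$((hops - 1))
    done
    return 1   # gave up: too many redirects
}

# Usage (needs network access):
#   follow_manually http://google.com | tail -n 1
```

Each hop costs one full request here, so this is slower than letting curl follow redirects itself. The gain is that every intermediate URL is visible.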
Hope this helps other users too.