How to Get the Real URL After file_get_contents() If Redirection Happens

How to get the real URL after file_get_contents if redirection happens?

You might make a request with cURL instead of file_get_contents().

Something like this should work...

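$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, TRUE);          // include the response headers in the output
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, FALSE); // do not follow the redirect, we want to read it
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);  // return the response instead of printing it
$a = curl_exec($ch);
// Header names are case-insensitive, so match "Location" with the i modifier.
if (is_string($a) && preg_match('#^Location:\s*(.*)$#mi', $a, $r)) {
    $l = trim($r[1]); // the redirect target
}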

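If you would rather stay with file_get_contents(), the redirect targets can also be read from the response headers PHP collects during the request. The following is a minimal sketch, not a definitive implementation: `lastRedirectLocation` is a hypothetical helper name, and the sample header lines are made up for illustration.

```php
<?php
/**
 * Returns the value of the last "Location" header found in a list of raw
 * header lines, or null if none is present.
 * (lastRedirectLocation is a hypothetical helper name.)
 */
function lastRedirectLocation(array $headerLines): ?string
{
    $location = null;
    foreach ($headerLines as $line) {
        // Header names are case-insensitive, hence the "i" modifier.
        if (preg_match('/^Location:\s*(.+)$/i', trim($line), $m)) {
            $location = trim($m[1]); // keep overwriting, so the last one wins
        }
    }
    return $location;
}

// Illustrative header lines, as they might look after two redirects.
$sample = [
    'HTTP/1.1 301 Moved Permanently',
    'Location: https://example.com/',
    'HTTP/1.1 302 Found',
    'location: https://example.com/home',
    'HTTP/1.1 200 OK',
    'Content-Type: text/html',
];
echo lastRedirectLocation($sample), PHP_EOL; // prints https://example.com/home
```

After a call such as file_get_contents($url) over HTTP, PHP fills the local variable $http_response_header with the header lines of every response in the redirect chain, so that array can be passed to the helper. Note that a Location value may be a relative URL, in which case it still has to be resolved against the URL that was requested.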

file_get_contents request to external site

Just use the HTTPS URL instead of the HTTP URL:

https://spotifycharts.com/api/?type=regional&country=nl&recurrence=daily&date=latest&limit=200

How do I ignore a moved-header with file_get_contents in PHP?

There is no content there. The redirect happens in the HTTP response before any content would be sent.

The server decides what you get to see (or not).

file_get_contents() returns the wrong page

Let's call file_get_contents() a "dumb" function when it comes to loading URL content. It returns the response body exactly as the server sends it; it does not build a DOM or run any JavaScript.

To get the actual content of many websites, you need to follow redirects as well, which you can do with cURL (refer to: How to get the real URL after file_get_contents if redirection happens?).

If the final page uses a lot of AJAX to load data after the initial request, even cURL will not deliver the desired content, only a "naked" HTML page without the actual content.

So nowadays you need to take care of loading asynchronous content yourself: parse the content of the initial URL, parse the JS files, extract the AJAX URLs, and call them in turn while passing along any cookies the target page generated for your request.

Or use a "native client" that executes the page just like a browser and can return the final data.

Just calling file_get_contents("url"); and expecting the same source code you see when opening the URL in a browser won't work anymore for the majority of websites.

How to find out the end url upon redirection

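Use cURL, for example:

curl_setopt($init, CURLOPT_FOLLOWLOCATION, 1);

If you need to know the address of the redirect:

$init = curl_init($url);
curl_setopt($init, CURLOPT_RETURNTRANSFER, 1); // return the response as a string
curl_setopt($init, CURLOPT_HEADER, 1);         // include the headers in that string
curl_setopt($init, CURLOPT_FOLLOWLOCATION, 0); // stop at the redirect response
curl_setopt($init, CURLOPT_HTTPHEADER, array('Expect:') );
$content = curl_exec($init);

if ( is_string($content) and ( strstr($content, 'Moved Temporarily') or
     strstr($content, 'Moved Permanently') or strstr($content, '302 Found') ) ) :

    // $red[1] holds the redirect target
    if ( preg_match('@Location: (.*?)\r?\n@i', $content, $red) ) :

        print_r($red);

    endif;

endif;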
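When only the final URL matters, cURL can also follow the redirects itself and report where it ended up. The sketch below assumes the cURL extension is loaded; `resolveFinalUrl` is a hypothetical helper name, and the timeout values are arbitrary choices.

```php
<?php
/**
 * Follows redirects and returns the URL of the last response,
 * or null if the request fails.
 * (resolveFinalUrl is a hypothetical helper name.)
 */
function resolveFinalUrl(string $url, int $maxRedirects = 10): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);      // let cURL follow Location headers
    curl_setopt($ch, CURLOPT_MAXREDIRS, $maxRedirects);  // give up after this many hops
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);      // do not print the response
    curl_setopt($ch, CURLOPT_NOBODY, true);              // HEAD request, the body is not needed
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);         // arbitrary timeouts, in seconds
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);

    $ok = curl_exec($ch);
    if ($ok === false) {
        return null; // connection error, timeout, or too many redirects
    }
    return curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
}
```

A call such as resolveFinalUrl($url) then returns the last URL cURL requested. Some servers answer HEAD requests differently from GET requests; if that is a concern, drop the CURLOPT_NOBODY line and accept the cost of downloading the body.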

file_get_contents' function from multiple URLs and redirection limit reached warning

I would suggest using cURL for fetching remote data. You could do this:

$urls = [
    "https://www.url1.com",
    "https://www.url2.com",
    "https://www.url3.com",
    "https://www.url4.com",
    "https://www.url5.com"
];
$decoded = array_map("loadJSON", $urls);

if (is_array($decoded[0])) {
    foreach ($decoded[0] as $key => $value) {
        if (is_array($value) && isset($value['price'])) {
            $price = $value['price'];
            echo '<span><b>' . $price . '</b></span>';
        }
    }
}

/**
 * Downloads a JSON file from a URL and returns its decoded content
 * (null if the download or the decoding fails)
 */
function loadJSON($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0); // Skips certificate verification (insecure; remove this line if the remote certificate is valid)
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // Follow redirections
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10); // 10 max redirections
    $content = curl_exec($ch);
    curl_close($ch);
    if ($content === false) {
        return null; // The request failed
    }
    $res = json_decode($content, true);
    return $res;
}
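If switching to cURL is not an option, the redirect limit of file_get_contents() itself can be set through a stream context. This is a minimal sketch under that assumption: `buildHttpContext` is a hypothetical helper name, and the limit and timeout values are only examples.

```php
<?php
/**
 * Builds a stream context for file_get_contents() with an explicit
 * redirect limit. (buildHttpContext is a hypothetical helper name.)
 */
function buildHttpContext(int $maxRedirects = 5, float $timeout = 10.0)
{
    return stream_context_create([
        'http' => [
            'follow_location' => 1,              // follow Location headers
            'max_redirects'   => $maxRedirects,  // stop after this many redirects
            'timeout'         => $timeout,       // read timeout in seconds
            'ignore_errors'   => true,           // return the body even on 4xx/5xx
        ],
    ]);
}

$context = buildHttpContext(5);
$options = stream_context_get_options($context);
echo $options['http']['max_redirects'], PHP_EOL; // prints 5
```

The context is passed as the third argument, as in file_get_contents($url, false, $context). Raising the limit does not help when a URL redirects in a loop, so a failed request still has to be handled for each URL.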

