file_get_contents gets different file from google than shown in browser
Nothing really. The problem is that the page uses javascript and ajax to get contents. So, in order to get a "snapshot" of the page, you need to "run it". That is, you need to parse the javascript code, which php doesn't do.
Your best bet is to use an headless browser such as phantomjs. If you search, you find some tutorials explaining how to do it
NOTE
If all you're looking for is a way to retrieve raw data from the search, you might want to try to use google's search api.
file_get_contents script works with some websites but not others
$html = file_get_html('http://google.com/');
$title = $html->find('title')->innertext;
Or if you prefer with preg_match and you should be really using cURL instead of fgc...
function curl($url){
$headers[] = "User-Agent:Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13";
$headers[] = "Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$headers[] = "Accept-Language:en-us,en;q=0.5";
$headers[] = "Accept-Encoding:gzip,deflate";
$headers[] = "Accept-Charset:ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$headers[] = "Keep-Alive:115";
$headers[] = "Connection:keep-alive";
$headers[] = "Cache-Control:max-age=0";
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
curl_setopt($curl, CURLOPT_ENCODING, "gzip");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
$data = curl_exec($curl);
curl_close($curl);
return $data;
}
$data = curl('http://www.google.com');
$regex = '#<title>(.*?)</title>#mis';
preg_match($regex,$data,$match);
var_dump($match);
echo $match[1];
file_get_contents is not working for some url
URL which is not retrieved by file_get_contents, because their server checks whether the request come from browser or any script. If they found request from script they simply disable page contents.
So that I have to make a request similar as browser request. So I have used following code to get 2nd url contents. It might be different for different web server. Because they might keep different checks.
Even though why dont you try to use following code! If you are lucky this might work for you!!
function getUrlContent($url) {
fopen("cookies.txt", "w");
$parts = parse_url($url);
$host = $parts['host'];
$ch = curl_init();
$header = array('GET /1575051 HTTP/1.1',
"Host: {$host}",
'Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language:en-US,en;q=0.8',
'Cache-Control:max-age=0',
'Connection:keep-alive',
'Host:adfoc.us',
'User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36',
);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 0);
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookies.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookies.txt');
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
$result = curl_exec($ch);
curl_close($ch);
return $result;
}
$url = "http://adfoc.us/1575051";
$html = getUrlContent($url);
Thanks everyone for the guidance.
file_get_contents does not work with protocolless urls?
you can't use file_get_contents
with //google.com
because what it's actually doing is file:///google.com
when you do this in your web browser it's actually using the current protocol you're currently on. So if you had https://mywebsite.com
and you linked to something as //google.com
it would actually do is https://google.com
. That being said you need to do file_get_contents('http://google.com');
Related Topics
Why Do I Get "Resource Id #4" When I Apply Print_R() to an Array in PHP
Reverse the Order of Space-Separated Words in a String
Sort a Set of Multidimensional Arrays by Array Elements
How Safe Is PHP Pdo Function: Lastinsertid
Add a Line Break in Woocommerce Product Titles
Laravel Validation Attributes "Nice Names"
Show Progressbar While Loading Pages Using Jquery Ajax in Single Page Website
PHP Date Time Greater Than Today
How to Do a Curl Request to the Same Server
How to Increment Count in the Replacement String When Using Preg_Replace
Change Innerhtml of a PHP Domelement
Trying to Replace Parts of String Start with Same Search Chars