PHP cURL: How to Add the User-Agent Value or Overcome Servers Blocking cURL Requests

PHP cURL: how to add the User-Agent value or overcome servers blocking cURL requests?

  1. On the server side, requests can be blocked by inspecting the HTTP request header fields (Referer, Cookie, User-Agent and so on), the client IP address, and the access frequency. In most cases, machine-generated requests differ from human ones, for example by lacking Referer and Cookie headers or by arriving at a much higher frequency, so the server can apply rules that deny them.

  2. Given point 1, you can make your requests look more like real ones by filling in the header fields, using a slower and randomized request frequency, and spreading requests over more IP addresses. (This starts to sound like an attack.)

  3. In general, if you keep the request frequency low, avoid putting heavy load on the server, and follow the site's access rules, your requests will seldom be blocked.
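As a rough illustration of point 3, the sketch below spaces requests at least a fixed number of seconds apart and sends a User-Agent with each one. The helper names (seconds_to_wait, polite_get), the 2-second interval, and the 'example-fetcher/1.0' string are assumptions made up for this example, not values any particular site requires.

```php
<?php
// Hypothetical helper: seconds still to wait so that two requests are
// at least $minInterval seconds apart. Returns 0.0 if enough time has passed.
function seconds_to_wait(float $lastRequestAt, float $now, float $minInterval): float
{
    $elapsed = $now - $lastRequestAt;
    return max(0.0, $minInterval - $elapsed);
}

// Hypothetical helper: fetch $url, sleeping first if the previous request
// was too recent. $lastRequestAt is updated by reference after each call.
function polite_get(string $url, float &$lastRequestAt, float $minInterval = 2.0)
{
    $wait = seconds_to_wait($lastRequestAt, microtime(true), $minInterval);
    if ($wait > 0) {
        usleep((int) round($wait * 1000000));
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'example-fetcher/1.0'); // placeholder value
    $body = curl_exec($ch); // false on failure

    $lastRequestAt = microtime(true);
    return $body;
}
```

A caller would keep one $lastRequestAt variable (initially 0.0) and pass it to every polite_get() call, so consecutive requests to the same site never run back to back.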

Can servers block curl requests?

Servers cannot block cURL requests per se, but they can block any request that they do not like. If the server checks for some parameters that your cURL request does not satisfy, it could decide to respond differently.

In the vast majority of cases, this difference in behavior is triggered by the presence (or absence) and values of the HTTP request headers. For example, the server might check that the User-Agent header is present and has a valid value (it could also check lots of other things).

To find out what the HTTP request coming from the browser looks like, use an HTTP debugging proxy like Fiddler or your browser's developer tools.

To add your own headers to your cURL request, use

curl_setopt($ch, CURLOPT_HTTPHEADER, array('HeaderName: HeaderValue'));
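CURLOPT_HTTPHEADER takes a list of complete "Name: value" strings, so sending several headers means passing several array entries. The sketch below builds that list from an associative array; build_header_lines and fetch_with_headers are hypothetical helper names, and the header values and URL in the usage note are placeholders.

```php
<?php
// Hypothetical helper: turn ['Name' => 'value'] pairs into the
// "Name: value" strings that CURLOPT_HTTPHEADER expects.
function build_header_lines(array $headers): array
{
    $lines = [];
    foreach ($headers as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }
    return $lines;
}

// Hypothetical helper: GET $url with the given extra headers.
function fetch_with_headers(string $url, array $headers)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, build_header_lines($headers));
    return curl_exec($ch); // response body, or false on failure
}
```

For example, fetch_with_headers('https://example.com/', ['Accept' => 'text/html', 'Accept-Language' => 'en-US,en;q=0.9']) would send both headers with the request.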

Reasons why cURL would connect fine over command line, but not in PHP?

Most likely it's because curl-cli automatically adds a User-Agent header, and libcurl/PHP does not.

The question mentions: "some sort of IP or user-agent blocking involved. However, I have spun up brand new machines on both DigitalOcean and Vultr, and both experience the same issue"

Setting up VMs on DigitalOcean/Vultr will not automatically make libcurl add User-Agent headers to your HTTP requests. That can be done with:

curl_setopt($ch,CURLOPT_USERAGENT,"curl/".(curl_version()["version"])); // User-Agent: curl/7.52.1

to mimic curl-cli's user-agent string, or something like

curl_setopt($ch,CURLOPT_USERAGENT,"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36");

to pretend that you're a Google Chrome version 71, running on Windows 7 x64.

Many websites (Wikipedia, for example) block HTTP requests that lack a User-Agent header.

How do I check what user agent curl is using?

Use the --verbose option to see all the headers sent by curl, including User-Agent.

A line starting with '>' is header data sent by curl.

For example:

$ curl --verbose 'http://www.google.com/'
> GET / HTTP/1.1
> User-Agent: curl/7.37.0
> Host: www.google.com
> Accept: */*
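From PHP, a similar check can be sketched with the CURLINFO_HEADER_OUT option, which asks cURL to keep the outgoing request headers so that curl_getinfo() can return them after the transfer. The helper names find_header and sent_user_agent are assumptions for this example.

```php
<?php
// Hypothetical helper: return the value of header $name from a raw
// header block, or null if it is absent. Header names are case-insensitive.
function find_header(string $rawHeaders, string $name): ?string
{
    foreach (preg_split('/\r\n|\n/', $rawHeaders) as $line) {
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && strcasecmp(trim($parts[0]), $name) === 0) {
            return trim($parts[1]);
        }
    }
    return null;
}

// Hypothetical helper: perform a request and report which User-Agent
// (if any) PHP's cURL actually sent.
function sent_user_agent(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLINFO_HEADER_OUT, true); // record the outgoing request
    curl_exec($ch);
    $sent = curl_getinfo($ch, CURLINFO_HEADER_OUT);
    return is_string($sent) ? find_header($sent, 'User-Agent') : null;
}
```

If CURLOPT_USERAGENT was never set, sent_user_agent() would be expected to return null, which matches the point above that libcurl/PHP does not add the header by itself.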

file_get_contents, cURL not working

It looks like they're blocking the user agent, or the lack of one, considering that PHP cURL and file_get_contents don't seem to set the value in the request header.

You can fake this by setting it to something like Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:7.0.1) Gecko/20100101 Firefox/7.0.1

<?php
// Fetch a URL with cURL, sending a browser-like User-Agent header.
function get_page($url) {
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:7.0.1) Gecko/20100101 Firefox/7.0.1');
    $return = curl_exec($curl); // false on failure
    curl_close($curl);
    return $return;
}

echo get_page('http://api.promasters.net.br/cotacao/v1/valores?moedas=USD&alt=json');
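The same idea can be sketched for file_get_contents by passing a stream context whose 'http' options include user_agent. The helper names user_agent_context and get_page_fgc and the 10-second timeout are assumptions for this example.

```php
<?php
// Hypothetical helper: build a stream context that sends $userAgent
// on HTTP(S) requests made through PHP's stream wrappers.
function user_agent_context(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent,
            'timeout'    => 10, // seconds; an arbitrary choice for this sketch
        ],
    ]);
}

// Hypothetical helper: file_get_contents() with a User-Agent set.
function get_page_fgc(string $url, string $userAgent)
{
    return file_get_contents($url, false, user_agent_context($userAgent));
}
```

The 'http' context options are also used for https:// URLs. Alternatively, ini_set('user_agent', ...) sets a default User-Agent for such stream requests without building a context.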

