How can I emulate a get request exactly like a web browser?
Are you sure the curl module honors ini_set('user_agent',...)? There is an option CURLOPT_USERAGENT described at http://docs.php.net/function.curl-setopt.
Could there also be a cookie tested by the server? That you can handle by using CURLOPT_COOKIE, CURLOPT_COOKIEFILE and/or CURLOPT_COOKIEJAR.
edit: Since the request uses https, there might also be an error verifying the certificate; see CURLOPT_SSL_VERIFYPEER.
$url = "https://new.aol.com/productsweb/subflows/ScreenNameFlow/AjaxSNAction.do?s=username&f=firstname&l=lastname";
$agent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)';
$ch = curl_init();
// Disables certificate verification; use this only to diagnose SSL problems.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
// Log the request and response headers to STDERR.
curl_setopt($ch, CURLOPT_VERBOSE, true);
// Return the response body from curl_exec() instead of printing it.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_URL, $url);
$result = curl_exec($ch);
var_dump($result);
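For comparison, here is a minimal Python sketch of the same two ideas (a custom user agent plus cookie handling), using only the standard library. The helper name `build_browser_like_opener` is hypothetical, and the in-memory `CookieJar` only roughly plays the role of CURLOPT_COOKIEJAR; treat it as an illustration rather than a drop-in replacement for the PHP code.

```python
import http.cookiejar
import urllib.request

# User-agent string reused from the PHP snippet above.
AGENT = ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; "
         ".NET CLR 1.0.3705; .NET CLR 1.1.4322)")

def build_browser_like_opener(agent=AGENT):
    """Return (opener, jar): an opener that sends `agent` and stores cookies in `jar`."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Replace the default "Python-urllib/x.y" user agent.
    opener.addheaders = [("User-Agent", agent)]
    return opener, jar

opener, jar = build_browser_like_opener()
# opener.open(url) would send the request. Cookies set by the server are kept
# in `jar` and sent back on later requests made through the same opener.
```

No request is sent until `opener.open(...)` is called, so the setup can be inspected first.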
How to use curl to get a GET request exactly same as using Chrome?
If you need to set the User-Agent header in the curl request, you can use the -H option, like:
curl -H "user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36" http://stackoverflow.com/questions/28760694/how-to-use-curl-to-get-a-get-request-exactly-same-as-using-chrome
User-agent string updated from the newest Chrome as of 02-22-2021.
Using a proxy tool like Charles Proxy really helps make short work of something like what you are asking. Here is what I do, using this SO page as an example (as of July 2015 using Charles version 3.10):
- Get Charles Proxy running
- Make web request using browser
- Find desired request in Charles Proxy
- Right click on request in Charles Proxy
- Select 'Copy cURL Request'
You now have a cURL request you can run in a terminal that will mirror the request your browser made. Here is what my request to this page looked like (with the cookie header removed):
curl -H "Host: stackoverflow.com" -H "Cache-Control: max-age=0" -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" -H "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.89 Safari/537.36" -H "HTTPS: 1" -H "DNT: 1" -H "Referer: https://www.google.com/" -H "Accept-Language: en-US,en;q=0.8,en-GB;q=0.6,es;q=0.4" -H "If-Modified-Since: Thu, 23 Jul 2015 20:31:28 GMT" --compressed http://stackoverflow.com/questions/28760694/how-to-use-curl-to-get-a-get-request-exactly-same-as-using-chrome
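If you want to reuse the copied headers from a script, one possible approach is to pull the -H values out of the copied command. The sketch below assumes the command uses POSIX-shell quoting (as in the example above, not the Windows cmd variant), and `headers_from_curl` is a hypothetical helper name.

```python
import shlex

def headers_from_curl(command):
    """Extract the -H/--header values of a curl command line into a dict."""
    tokens = shlex.split(command)
    headers = {}
    i = 0
    while i < len(tokens):
        if tokens[i] in ("-H", "--header") and i + 1 < len(tokens):
            # Split "Name: value" at the first colon only.
            name, _, value = tokens[i + 1].partition(":")
            headers[name.strip()] = value.strip()
            i += 2
        else:
            i += 1
    return headers

example = 'curl -H "Host: stackoverflow.com" -H "DNT: 1" --compressed http://stackoverflow.com/'
print(headers_from_curl(example))  # {'Host': 'stackoverflow.com', 'DNT': '1'}
```

The resulting dict can then be passed as the headers of whatever HTTP client you use.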
How can I emulate a web browser http request from code?
You should use Fiddler to capture the request that you want to simulate.
You need to look at the inspectors > raw.
This is an example of a request to the Fiddler site from Chrome:
GET http://fiddler2.com/ HTTP/1.1
Host: fiddler2.com
Connection: keep-alive
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.94 Safari/537.36
Referer: https://www.google.be/
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8,nl;q=0.6
You can then set each one of these headers in your webrequest (see http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.aspx).
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://www.test.com");
request.UserAgent = "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.94 Safari/537.36";
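The same idea, copying every captured header onto the outgoing request, can be sketched in Python with the standard library. This assumes a body-less GET capture like the one above; `request_from_raw` is a hypothetical helper, and Accept-Encoding is left out of the sample because urllib does not decompress responses on its own.

```python
import urllib.request

RAW = """GET http://fiddler2.com/ HTTP/1.1
Host: fiddler2.com
Cache-Control: max-age=0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Referer: https://www.google.be/
Accept-Language: en-US,en;q=0.8,nl;q=0.6
"""

def request_from_raw(raw):
    """Build a urllib Request from a raw capture (request line plus headers, no body)."""
    lines = [line for line in raw.splitlines() if line.strip()]
    method, url, _version = lines[0].split()
    headers = {}
    for line in lines[1:]:
        # Split "Name: value" at the first colon only.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return urllib.request.Request(url, headers=headers, method=method)

req = request_from_raw(RAW)
# urllib.request.urlopen(req) would actually send it.
```

Note that urllib normalizes the capitalization of header names (for example "Accept-language"); HTTP header names are case-insensitive, so this should not matter to a compliant server.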
How to use Python requests to fake a browser visit a.k.a and generate User Agent?
Provide a User-Agent header:
import requests
url = 'http://www.ichangtou.com/#company:data_000008.html'
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
response = requests.get(url, headers=headers)
print(response.content)
FYI, here is a list of User-Agent strings for different browsers:
- List of all Browsers
As a side note, there is a pretty useful third-party package called fake-useragent that provides a nice abstraction layer over user agents:
fake-useragent
Up to date simple useragent faker with real world database
Demo:
>>> from fake_useragent import UserAgent
>>> ua = UserAgent()
>>> ua.chrome
u'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1667.0 Safari/537.36'
>>> ua.random
u'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.67 Safari/537.36'
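If you would rather not depend on a third-party package, a rough stand-in is to pick a string at random from a small list you maintain yourself. The pool below is hypothetical (it simply reuses user-agent strings quoted in this thread), and `pick_user_agent` is an illustrative name, not part of fake-useragent.

```python
import random

# Hypothetical pool; extend it with whatever browsers you want to imitate.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36",
]

def pick_user_agent(rng=random):
    """Return one user-agent string chosen at random from the pool."""
    return rng.choice(USER_AGENTS)

# Headers dict in the shape expected by requests.get(url, headers=headers).
headers = {"User-Agent": pick_user_agent()}
```

Unlike fake-useragent, this list is not updated automatically, so the strings will age.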
How to simulate a full browser's request to an HTML document?
Browsers do use multiple connections, in order to speed up the downloading (parallel downloading of resources). They limit however the number of connections to the same host, which is one of the reasons for the existence of content-delivery networks.
The order of CSS and script files in the head does matter, as scripts block parallel downloading (unless the script is deferred).
Browsers also parse HTML while they receive it (again, to speed things up). This is why a script in the head that tries to manipulate DOM elements that have not loaded yet will produce an error.
But all of these are browser implementation details that may not be important for your task.
Your best bet is to look at the source code of a headless browser to find out what's going on.
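As a rough illustration of the parallel-download behaviour described above, the sketch below collects the scripts, stylesheets and images referenced by a page and fetches them through a bounded thread pool. The limit of 6 is an assumption (a commonly cited per-host figure for HTTP/1.1; real limits vary by browser), the pool here is shared across all hosts rather than per host, and `fetch` is a caller-supplied function so the example runs without network access. It ignores script blocking, deferral and everything else a real browser does.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urljoin

MAX_PARALLEL = 6  # assumed connection limit, see the note above

class ResourceCollector(HTMLParser):
    """Collect absolute URLs of scripts, stylesheets and images in document order."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "img") and attrs.get("src"):
            self.resources.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(urljoin(self.base_url, attrs["href"]))

def fetch_page_resources(html, base_url, fetch):
    """Fetch every referenced resource with at most MAX_PARALLEL calls in flight."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
        bodies = list(pool.map(fetch, collector.resources))
    return dict(zip(collector.resources, bodies))

html = ('<html><head><link rel="stylesheet" href="/a.css">'
        '<script src="b.js"></script></head>'
        '<body><img src="img/c.png"></body></html>')
fake_fetch = lambda url: "body of " + url  # stand-in for a real HTTP GET
result = fetch_page_resources(html, "http://example.com/page/", fake_fetch)
```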
How to simulate exact browser request using PHP script?
Use Postman. It can automatically generate the PHP code from your request.
Step 1:
Copy the request as cURL (cmd) using your browser (I use Chrome).
Step 2:
Click Import in Postman. Select Raw text, paste the copied request, and import it.
Step 3:
Select "code"
Step 4:
Select your preferred language.
It will generate your code automatically.
How can I emulate a popular web browser when I visit this website?/Why am I getting a 403 error?
You can use a URLConnection and set the User-Agent:
URL server = new URL("http://www.ace-spades.com/play");
URLConnection connection = server.openConnection();
connection.setRequestProperty("User-Agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/534.57.2 (KHTML, like Gecko) Version/5.1.7 Safari/534.57.2");
Basically, you could subclass JEditorPane and override getStream(URL page) to add the User-Agent string.
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import javax.swing.JEditorPane;

public class UserAgentEditorPane extends JEditorPane {
    private static final long serialVersionUID = 1L;
    private String userAgent;

    public UserAgentEditorPane(URL url, String userAgent) throws IOException {
        // Do not call super(url): it loads the page before userAgent is assigned.
        super();
        this.userAgent = userAgent;
        setPage(url);
    }

    @Override
    protected InputStream getStream(URL page) throws IOException {
        // Open the connection ourselves so the User-Agent header can be set.
        URLConnection conn = page.openConnection();
        conn.setRequestProperty("User-Agent", userAgent);
        setContentType(conn.getContentType());
        return conn.getInputStream();
    }
}
PHP: How to get website with cURL and act like a real browser?
There is no difference between a cURL request and the request that a browser makes, apart from the HTTP headers it sends and the fact that a browser runs JavaScript on the client.
The only thing that identifies an HTTP client is its headers -- typically the user agent string -- and seeing as you have set the user agent to exactly the same as the browser, there must be other checks in place.
By default, cURL sends only a generic Accept: */* header, whereas browsers send a detailed Accept header (along with others such as Accept-Language and Accept-Encoding) to advertise their capabilities. I expect the web server will be checking on something like this.
Chrome Developer Tools allows you to copy the whole request as a cURL command (in the Network tab, right-click the request and choose Copy > Copy as cURL), including all the headers that were sent from Chrome, for testing in the terminal.
Try to match all the headers exactly from within your PHP, and I'm sure the web server will not be able to identify you as a script.
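To check how closely a script matches the browser, it can help to diff the two header sets. The following is a minimal sketch; `header_differences` is a hypothetical helper and the header values are made-up placeholders, not taken from a real capture.

```python
def header_differences(browser_headers, script_headers):
    """Return (missing, different): header names the script lacks or sends differently.

    Names are compared case-insensitively, as HTTP header names are.
    """
    browser = {k.lower(): v for k, v in browser_headers.items()}
    script = {k.lower(): v for k, v in script_headers.items()}
    missing = sorted(set(browser) - set(script))
    different = sorted(k for k in set(browser) & set(script) if browser[k] != script[k])
    return missing, different

# Placeholder values for illustration only.
browser = {
    "User-Agent": "ExampleBrowser/1.0",
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "en-US,en;q=0.8",
}
script = {"user-agent": "ExampleBrowser/1.0", "Accept": "*/*"}
print(header_differences(browser, script))  # (['accept-language'], ['accept'])
```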