Setting User Agent of a Java URLConnection

Setting user agent of a Java URLConnection

Offhand, setting the http.agent system property to "" might do the trick (I don't have the code in front of me).

You might get away with:

 System.setProperty("http.agent", "");

but that puts you in a race with initialisation of the HTTP protocol handler, which reads the value once when it is first loaded. Set the property before the first HTTP connection is opened.

The property can also be set through JNLP files (available to applets from 6u10) and on the command line:

-Dhttp.agent=

Or for wrapper commands:

-J-Dhttp.agent=
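To check what actually goes on the wire, one option is to point a connection at a throwaway local server and print the header it receives. The following is a minimal sketch under some assumptions: the class name HttpAgentPropertyDemo, the method observedUserAgent and the agent string MyApp/1.0 are made up for illustration, and it relies on the JDK's bundled com.sun.net.httpserver package being available.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URL;
import java.util.concurrent.atomic.AtomicReference;

class HttpAgentPropertyDemo {

    // Starts a local server on a free port, sends one GET to it, and
    // returns the User-Agent header that the server received.
    static String observedUserAgent() throws IOException {
        AtomicReference<String> seen = new AtomicReference<>();
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            seen.set(exchange.getRequestHeaders().getFirst("User-Agent"));
            exchange.sendResponseHeaders(204, -1); // 204 with no response body
            exchange.close();
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            URL url = URI.create("http://127.0.0.1:" + port + "/").toURL();
            // Bypass any configured proxy so the request reaches the local server.
            HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
            conn.getResponseCode(); // forces the request to be sent
            conn.disconnect();
        } finally {
            server.stop(0);
        }
        return seen.get();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical agent string; set before the first HTTP connection in this JVM.
        System.setProperty("http.agent", "MyApp/1.0");
        System.out.println("Server saw User-Agent: " + observedUserAgent());
    }
}
```

Running it once with the System.setProperty line and once without makes it easy to see what the property changes in the header.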

Set user-agent property in https connection header

I've found/verified the problem by inspecting HTTP communications using Wireshark. Is there any way around this?

This is not possible. Communication over an SSL socket is completely obscured from casual observation by the encryption protocol. Using packet capture software you will be able to view the initiation of the SSL connection and the exchange of encrypted packets, but the content of those packets can only be extracted at the other end of the connection (the server). If this were not the case then the HTTPS protocol as a whole would be broken, as the whole point of it is to secure HTTP communications from man-in-the-middle type attacks (where in this case the MITM is the packet sniffer).

Example Capture of an HTTPS request (partial):

.n....E... .........../..5..3..9..2..8..
..............@........................Ql.{...b....OsR..!.4.$.T...-.-.T....Q...M..Ql.{...LM..L...um.M...........s. ...n...p^0}..I..G4.HK.n......8Y...............E...A..>...0...0.........
).s.......0
..*.H..
.....0F1.0...U....US1.0...U.
.
Google Inc1"0 ..U....Google Internet Authority0..
130327132822Z.
131231155850Z0h1.0...U....US1.0...U...
California1.0...U...
Mountain View1.0...U.
.
Google Inc1.0...U....www.google.com0..0

Theoretically, the only way to know if your User-Agent header is actually being excluded is if you have access to the Google servers, but in actuality there is nothing in either the HTTPS specification or Java's implementation of it that excludes headers that would normally have been sent over HTTP.

Example Capture of HTTP request:

GET / HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:19.0) Gecko/20100101 Firefox/19.0
Host: www.google.com
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive

Both example captures were generated with the exact same code:

URL url = new URL(target);
URLConnection conn = url.openConnection();
conn.setRequestProperty("User-Agent",
        "Mozilla/5.0 (Windows NT 5.1; rv:19.0) Gecko/20100101 Firefox/19.0");
conn.connect();
BufferedReader serverResponse = new BufferedReader(
        new InputStreamReader(conn.getInputStream()));
System.out.println(serverResponse.readLine());
serverResponse.close();

Except that for HTTPS the target was "https://www.google.com", and for HTTP it was "http://www.google.com".


Edit 1:

Based on your updated question, using the -Dhttp.agent property does indeed append 'Java/version' to the User-Agent header, as described by the following documentation:

http.agent (default: “Java/<version>”)

Defines the string sent in the User-Agent request header in http requests. Note that the string “Java/<version>” will be appended to the one provided in the property (e.g. if -Dhttp.agent="foobar" is used, the User-Agent header will contain “foobar Java/1.5.0” if the version of the VM is 1.5.0). This property is checked only once at startup.

The 'offending' code is in a static block initializer of sun.net.www.protocol.http.HttpURLConnection:

static {
    // ...
    String agent = java.security.AccessController
            .doPrivileged(new sun.security.action.GetPropertyAction(
                    "http.agent"));
    if (agent == null) {
        agent = "Java/" + version;
    } else {
        agent = agent + " Java/" + version;
    }
    userAgent = agent;

    // ...
}
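To make the consequence of that branch concrete, here is a small sketch that reproduces only the string logic from the excerpt above. The class and method names are hypothetical, and it assumes the excerpt's version variable holds the java.version system property.

```java
class DefaultUserAgent {

    // Mirrors the if/else from the static initializer excerpt.
    // agentProperty stands for the value of http.agent (null if unset),
    // javaVersion for the JVM version string.
    static String effectiveUserAgent(String agentProperty, String javaVersion) {
        if (agentProperty == null) {
            return "Java/" + javaVersion;
        }
        return agentProperty + " Java/" + javaVersion;
    }

    public static void main(String[] args) {
        String v = System.getProperty("java.version");
        // Brackets make leading whitespace visible.
        System.out.println("[" + effectiveUserAgent(null, v) + "]");
        System.out.println("[" + effectiveUserAgent("foobar", v) + "]");
        System.out.println("[" + effectiveUserAgent("", v) + "]");
    }
}
```

Under this logic an empty http.agent does not remove the suffix: the result is still " Java/<version>", just with nothing in front of it.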

An obscene way around this 'problem' is this snippet of code, which I 1000% recommend you not use:

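import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Overwrites the cached static userAgent via reflection. Relies on JDK
// internals; the Field "modifiers" lookup fails on JDK 12 and later.
protected void forceAgentHeader(final String header) throws Exception {
    final Class<?> clazz = Class
            .forName("sun.net.www.protocol.http.HttpURLConnection");

    final Field field = clazz.getField("userAgent");
    field.setAccessible(true);
    // Clear the FINAL flag so the static final field can be written.
    Field modifiersField = Field.class.getDeclaredField("modifiers");
    modifiersField.setAccessible(true);
    modifiersField.setInt(field, field.getModifiers() & ~Modifier.FINAL);
    field.set(null, header);
}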

Using this override with https.proxyHost, https.proxyPort and http.agent set gives the desired result:

CONNECT www.google.com:443 HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:19.0) Gecko/20100101 Firefox/19.0
Host: www.google.com
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Proxy-Connection: keep-alive

But yeah, don't do that. It's much safer to just use Apache HttpComponents:

final DefaultHttpClient client = new DefaultHttpClient();
HttpHost proxy = new HttpHost("127.0.0.1", 8888, "http");
HttpHost target = new HttpHost("www.google.com", 443, "https");
client.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY, proxy);
HttpProtocolParams.setUserAgent(client.getParams(),
        "Mozilla/5.0 (Windows NT 5.1; rv:19.0) Gecko/20100101 Firefox/19.0");
final HttpGet get = new HttpGet("/");

HttpResponse response = client.execute(target, get);
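If adding a dependency is not an option and Java 11 or later is available, the JDK's own java.net.http client might be worth a try as well. This is a different technique from the answer above, shown only as a rough sketch: the class name Java11ClientSketch and the helper buildRequest are made up, and the proxy address is simply copied from the HttpComponents example.

```java
import java.net.InetSocketAddress;
import java.net.ProxySelector;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class Java11ClientSketch {

    static final String USER_AGENT =
            "Mozilla/5.0 (Windows NT 5.1; rv:19.0) Gecko/20100101 Firefox/19.0";

    // Builds (but does not send) a GET request carrying the custom User-Agent.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", USER_AGENT)
                .GET()
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Same proxy address as in the HttpComponents example above.
        HttpClient client = HttpClient.newBuilder()
                .proxy(ProxySelector.of(new InetSocketAddress("127.0.0.1", 8888)))
                .build();
        HttpResponse<String> response = client.send(
                buildRequest("https://www.google.com/"),
                HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}
```

Whether the header also shows up on the proxy CONNECT request with this client is something to confirm with a capture rather than assume.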

Android - Default user agent for URLConnection?

The default user agent is null because the header is empty by default. You will have to set it manually using:

cn.setRequestProperty("User-Agent","your user agent");
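A quick way to confirm the header is pending on the connection is to read it back before connecting. This is a small sketch run on a desktop JVM (Android's implementation may behave differently); the class name RequestHeaderCheck, the helper withUserAgent and the example.com URL are placeholders.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;

class RequestHeaderCheck {

    // Opens a connection object (no network traffic yet) and sets the User-Agent on it.
    static URLConnection withUserAgent(String url, String userAgent) throws IOException {
        URLConnection cn = URI.create(url).toURL().openConnection();
        cn.setRequestProperty("User-Agent", userAgent);
        return cn;
    }

    public static void main(String[] args) throws IOException {
        URLConnection cn = withUserAgent("http://example.com/", "your user agent");
        // Reads the pending request header; nothing has been sent at this point.
        System.out.println(cn.getRequestProperty("User-Agent"));
    }
}
```

Note that setRequestProperty has to be called before the connection is opened; afterwards it throws an IllegalStateException.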

Java URLConnection The cookie is not set

The web server sets the session ID cookie. You can find it in Chrome under F12 -> Application -> Cookies, and it should also be visible in the response headers of the home page. You can try two things:

If you want to simulate the login using core Java, you need to send, via setRequestProperty, most of the headers your browser sends when it makes the login request (in Chrome see F12 -> Network -> Headers -> Request Headers), including the initial session cookie. This approach might not work, since a large enterprise web app has multiple layers of security. With simple APIs or static web pages it would be straightforward.

What would have a higher chance of success is using a testing framework such as Selenium with ChromeDriver, or GeckoDriver for Firefox. You just instruct the driver to log in with your user, then access the user page and parse it as you wanted.

Keep in mind that neither approach may be accepted by Instagram's policies, and even if you succeed, requests from your IP may end up being redirected by the developer team.
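For the first approach (plain Java), the session cookie from the first response has to be carried into the next request. Below is a rough sketch of just that step, assuming the server sends a single Set-Cookie header; the class name SessionCookieSketch, the helper toCookieHeader and the example.com URLs are placeholders, not the real site's endpoints.

```java
import java.io.IOException;
import java.net.HttpCookie;
import java.net.URI;
import java.net.URLConnection;
import java.util.StringJoiner;

class SessionCookieSketch {

    // Turns one Set-Cookie header value into the "name=value" form
    // that goes into a Cookie request header (attributes such as Path are dropped).
    static String toCookieHeader(String setCookieValue) {
        StringJoiner joined = new StringJoiner("; ");
        for (HttpCookie c : HttpCookie.parse(setCookieValue)) {
            joined.add(c.getName() + "=" + c.getValue());
        }
        return joined.toString();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder URLs; replace with the real home and login pages.
        URLConnection home = URI.create("https://example.com/").toURL().openConnection();
        String setCookie = home.getHeaderField("Set-Cookie"); // connects; null if absent
        if (setCookie == null) {
            System.out.println("No Set-Cookie header received");
            return;
        }
        URLConnection login = URI.create("https://example.com/login").toURL().openConnection();
        login.setRequestProperty("Cookie", toCookieHeader(setCookie));
        login.setRequestProperty("User-Agent", "Mozilla/5.0");
        System.out.println("Would send Cookie: " + toCookieHeader(setCookie));
    }
}
```

A real login flow usually needs more than this (form fields, CSRF tokens, several cookies), which is part of why the browser-driver route tends to be more reliable.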

real world user agents - what are they? How to set them? - Java

What is the simplest way I can set the user agent?

URLConnection urlConnection = new URL(a).openConnection();
urlConnection.addRequestProperty("User-Agent", "Mozilla/5.0");
InputStream is = urlConnection.getInputStream();

I want to set it to Mozilla/5.0. Do I have to add any further information?

You can set it to whatever you want. It's actually a good idea to identify your application, otherwise all Java programs simply send:

User-Agent: Java/1.7.0_11
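One detail worth knowing: the question's snippet uses addRequestProperty, while setRequestProperty is the more common choice for User-Agent. The sketch below illustrates the difference on a desktop JVM, using made-up names (SetVersusAdd, userAgentCount, MyApp/1.0) and a placeholder URL; no request is actually sent.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;
import java.util.List;

class SetVersusAdd {

    // Number of User-Agent values pending on a not-yet-connected connection.
    static int userAgentCount(URLConnection cn) {
        List<String> values = cn.getRequestProperties().get("User-Agent");
        return values == null ? 0 : values.size();
    }

    // Creates a connection object without opening it.
    static URLConnection open() throws IOException {
        return URI.create("http://example.com/").toURL().openConnection();
    }

    public static void main(String[] args) throws IOException {
        URLConnection a = open();
        a.setRequestProperty("User-Agent", "MyApp/1.0");
        a.setRequestProperty("User-Agent", "MyApp/2.0"); // replaces the earlier value
        System.out.println(userAgentCount(a) + " " + a.getRequestProperty("User-Agent"));

        URLConnection b = open();
        b.addRequestProperty("User-Agent", "MyApp/1.0");
        b.addRequestProperty("User-Agent", "MyApp/2.0"); // keeps both values
        System.out.println(userAgentCount(b));
    }
}
```

For a header that should appear exactly once, setRequestProperty is the safer default; on a fresh connection with a single call, the two behave the same.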

Also, is this strictly allowed as in should I be concerned about any legal issues with regards to setting a user agent?

Nope, you are free to use any user agent you want, and it's not illegal to fake one. But if a website makes any decisions (especially regarding security) based on User-Agent, that's so bad it's almost illegal ;-) (see: Java - Not getting html code from a URL).


