Different Behaviours in Treating \ (Backslash) in the URL by Firefox and Chrome

Do browsers ignore slashes in URLs?

Path separators are defined to be a single slash according to this. (Search for Path Component)

Note that browsers don't usually modify the URL. Browsers could append a / at the end of a URL, but in your case, the URL with extra slashes is simply sent along in the request, so it is the server ignoring the slashes instead.

Also, have a look at:

  • Is a URL with // in the path-section valid?
  • URL with multiple forward slashes, does it break anything?
  • What does the double slash mean in URLs?

Even if this behavior is convenient for you, it is generally not recommended. In addition, caching may also be affected (source):

Since both your browser and the server cache individual pages (according to their caching settings), requesting same file multiple times via slightly different URIs might affect the caching (depending on server and client implementation).
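To illustrate the caching point: a cache that keys on the raw URL string sees two different entries for the two spellings, even if the server happens to return the same file for both. Below is a minimal sketch using Python's standard urllib.parse; the helper name collapse_path_slashes and the example URLs are made up for illustration, and collapsing slashes is only safe if your server really treats both forms as the same resource.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def collapse_path_slashes(url: str) -> str:
    """Collapse runs of '/' in the path only (hypothetical helper).

    The scheme, host, query and fragment are left untouched.
    """
    parts = urlsplit(url)
    path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


plain = "http://example.com/docs/page.html"
doubled = "http://example.com//docs///page.html"

# As raw strings the two URLs differ, so a naive cache stores them separately.
print(plain == doubled)
# After collapsing the path they compare equal.
print(collapse_path_slashes(doubled) == plain)
```

Only the path is rewritten on purpose: a "//" inside a query string (for example a URL passed as a parameter) may be meaningful and is left alone.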

Chrome/Edge and Firefox wrap long hyperlinks differently. Why?

What you're looking for is the line-break property. It specifies which characters may have a line break next to them and which may not, but it is not precise. In short, it gives specific rules for certain characters in certain languages, but it does not cover the situation you asked about: it says nothing about the slash character. Since the CSS standard doesn't specify what is right, neither implementation is wrong.

The default value for the line-break property is auto which does the following:

The UA determines the set of line-breaking restrictions to use, and it may vary the restrictions based on the length of the line; e.g., use a less restrictive set of line-break rules for short lines.

There is another standard, the Unicode line breaking algorithm, which is far more specific. It includes a rule saying that a slash should provide a line-breaking opportunity after it, for exactly the situation you asked about. URLs being as frequent as they are on the web, it makes sense to allow line breaks in them at slashes, and so that rule is in the standard.

As best as I could tell from the Firefox source code, Firefox says it follows that Unicode standard, but that's not what it appears to be doing in your example: the slash in Firefox seems to provide a line-breaking opportunity before it instead. Maybe someone with more in-depth knowledge of the Firefox source code could explain why it does that?

I'm not sure about the Chrome line-breaking algorithm because its source code is a lot more difficult to search, but I imagine the developers decided that the '/' character doesn't deserve a line break under those circumstances, based on the broad definition of 'auto' in the CSS spec.

GWT: Difference access file paths with / or \\ between IE,Chrome and FireFox

Probably because the backslash \ is actually not valid as a separator? The fact that IE and Chrome accept it doesn't make it valid in any way. You should always use the forward slash '/'.
http://en.wikipedia.org/wiki/Uniform_Resource_Locator
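
If the paths come from somewhere that produces backslashes, one option is to convert them before they ever reach the browser. A minimal sketch in Python (the helper name to_url_path and the sample path are hypothetical; the same one-line replacement works in most languages):

```python
def to_url_path(path: str) -> str:
    """Replace every backslash with a forward slash (hypothetical helper)."""
    return path.replace("\\", "/")


# A Windows-style relative path becomes a URL-style one.
print(to_url_path("images\\icons\\logo.png"))
```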

Why is http:///example.org (with triple slash) treated as a valid URL by Firefox and webkit?

The specification of the "http" protocol requires a hostname in the URI. See http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.2.2. So the string http:///foo is not a valid http URI, and the browser is faced with the question of what to do with the invalid URI string.

Gecko's (Firefox's) URI parser has scheme-dependent behavior: it assumes what you meant based on the URI scheme and does certain fixups. See the comments at http://mxr.mozilla.org/mozilla-central/source/netwerk/base/public/nsIStandardURL.idl?rev=f4157e8c4107&mark=20-23,28-31,36-39#20. "http" URIs are created with the URLTYPE_AUTHORITY flag, which leads to the behavior you see (per line 31 of nsIStandardURL.idl).

Note that the current attempt to standardize how URIs should be parsed in web pages and by web browsers is at http://url.spec.whatwg.org/, and it has a whitelist of schemes at http://url.spec.whatwg.org/#relative-scheme that have behavior like this. If you step through the parsing algorithm for a scheme in that whitelist, once you see the ':' you enter the state at http://url.spec.whatwg.org/#authority-first-slash-state, which basically treats 0 or more slashes as all being equivalent to "//" and goes on to parse the thing following the slashes as the "authority" section of the URL.
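
The slash handling described above can be sketched roughly as follows. This is not the spec's algorithm, just an illustration of the "0 or more slashes behave like //" idea; the function name is made up, and the scheme set is an assumed subset chosen for the example, so check the URL Standard for the authoritative list.

```python
import re

# Assumed subset of whitelisted schemes, for illustration only.
WHITELISTED_SCHEMES = {"http", "https", "ftp", "ws", "wss"}


def normalize_authority_slashes(url: str) -> str:
    """Treat any number of slashes after 'scheme:' as '//' (hypothetical helper)."""
    match = re.match(r"([A-Za-z][A-Za-z0-9+.-]*):(/*)(.*)", url, re.DOTALL)
    if not match:
        return url  # no scheme found, leave the string alone
    scheme, _slashes, rest = match.groups()
    if scheme.lower() not in WHITELISTED_SCHEMES:
        return url  # other schemes get no fixup in this sketch
    # Whatever follows the slashes is parsed as the authority.
    return f"{scheme}://{rest}"


for candidate in ("http:///example.org", "http:/example.org", "mailto:user@example.org"):
    print(candidate, "->", normalize_authority_slashes(candidate))
```

With this fixup, http:///example.org ends up with example.org as its authority, which matches the behavior asked about in the question.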

webextension: Why does the browser add a trailing slash to the requested URL?

I think I found it. The browser is just fixing an invalid URL.

To cite from Wikipedia, a URL looks like this:

scheme:[//[user[:password]@]host[:port]][/path][?query][#fragment]

The path must begin with a single slash (/) if an authority part was present, and may also if one was not, but must not begin with a double slash. The path is always defined, though the defined path may be empty (zero length), therefore no trailing slash.

http://example.com has an authority part (in this example, the hostname example.com, which follows the scheme http and the //), but that leaves the path empty. According to the specification, the path must start with a /, so the browser fixes it by replacing the empty path with /.

If you use a valid URL instead, like http://example.com/abc, it does not need to modify it.
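
The fixup can be imitated in a few lines of Python with the standard urllib.parse module. This is only a sketch of the rule described above, not the browser's actual code, and ensure_root_path is a made-up name:

```python
from urllib.parse import urlsplit, urlunsplit


def ensure_root_path(url: str) -> str:
    """If the URL has an authority but an empty path, use '/' as the path."""
    parts = urlsplit(url)
    if parts.netloc and parts.path == "":
        parts = parts._replace(path="/")
    return urlunsplit(parts)


# An empty path gets replaced by "/".
print(ensure_root_path("http://example.com"))
# A URL that already has a path is returned unchanged.
print(ensure_root_path("http://example.com/abc"))
```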

java.net.URL bug in constructing URLs?

Quoting javadoc of new URL(URL context, String spec):

Otherwise, the path is treated as a relative path and is appended to the context path, as described in RFC2396.

See section 5 "Relative URI References" of the RFC2396 spec, specifically section 5.2 "Resolving Relative References to Absolute Form", item 6a:

All but the last segment of the base URI's path component is copied to the buffer. In other words, any characters after the last (right-most) slash character, if any, are excluded.

Explanation

On a web page, the "Base URI" is the page address, e.g. http://example.com/path/to/page.html. A relative link, e.g. <a href="page2.html">, must be interpreted as a sibling to the base URI, so page.html is removed, and page2.html is added, resulting in http://example.com/path/to/page2.html, as intended.

The Java URL class implements this logic, and that is why you get what you see, and it is entirely the way it is supposed to work.

It is by design, i.e. not a bug.
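
The same resolution rule is easy to try out in Python, whose urllib.parse.urljoin follows the same "drop everything after the last slash of the base" logic. This is an illustration in a different language, not the Java code from the question, and the URLs are examples:

```python
from urllib.parse import urljoin

# The last segment of the base path (page.html) is replaced by the relative reference.
print(urljoin("http://example.com/path/to/page.html", "page2.html"))

# A base ending in "/" has an empty last segment, so nothing is dropped.
print(urljoin("http://example.com/path/to/", "page2.html"))

# Without the trailing slash, "to" counts as the last segment and is replaced.
print(urljoin("http://example.com/path/to", "page2.html"))
```

The third case is the one that usually surprises people: if the context URL is meant to be a directory, it needs a trailing slash.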


