Url-Encoded Slash in Url

Is a slash ( / ) equivalent to an encoded slash ( %2F ) in the path portion of an HTTP URL

From the data you gathered, I would tend to say that an encoded "/" in a URI is meant to be seen as "/" again at the application/CGI level.

That is to say, if you're using Apache with mod_rewrite, for instance, a pattern that expects slashes will not match a URI that contains encoded slashes.
However, once the appropriate module/CGI/... is called to handle the request, it is up to that handler to do the decoding and, for instance, retrieve a parameter that includes slashes as the first component of the URI.

If your application is then using this data to retrieve a file (whose filename contains a slash), that's probably a bad thing.

To sum up, I find it perfectly normal to see a difference in behaviour between "/" and "%2F", as they are interpreted at different levels.
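To illustrate the two levels, here is a minimal Python sketch (not how Apache or any particular framework does it). The helper name path_segments and the example.com URLs are made up for the example. It splits the raw path on literal slashes first, the way a router would, and only then decodes each segment, the way the application would.

```python
from urllib.parse import urlsplit, unquote

def path_segments(url):
    """Split the raw path on literal '/' first, then percent-decode each segment."""
    raw_path = urlsplit(url).path  # urlsplit leaves percent-escapes untouched
    return [unquote(segment) for segment in raw_path.split("/") if segment]

# The encoded slash stays inside one segment; the literal slash separates two.
print(path_segments("http://example.com/files/a%2Fb"))  # ['files', 'a/b']
print(path_segments("http://example.com/files/a/b"))    # ['files', 'a', 'b']
```

If the order were reversed (decode first, then split), both URLs would give the same three segments, which is exactly the ambiguity described above.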

urlencoded Forward slash is breaking URL

By default, Apache rejects any URL with %2F in the path part, for security reasons: scripts can't normally (i.e. without rewriting) tell the difference between %2F and / because the PATH_INFO environment variable is automatically URL-decoded (which is stupid, but it is a long-standing part of the CGI specification, so nothing can be done about it).

You can turn this feature off using the AllowEncodedSlashes directive, but note that other web servers will still disallow it (with no option to turn that off), and that other characters may also be taboo (eg. %5C), and that %00 in particular will always be blocked by both Apache and IIS. So if your application relied on being able to have %2F or other characters in a path part you'd be limiting your compatibility/deployment options.

I am using urlencode() while preparing the search URL

You should use rawurlencode(), not urlencode(), for escaping path parts. urlencode() is misnamed: it is actually for application/x-www-form-urlencoded data, such as the query string or the body of a POST request, and not for other parts of the URL.

The difference is that + doesn't mean space in path parts. rawurlencode() will correctly produce %20 instead, which will work both in form-encoded data and other parts of the URL.
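The same distinction can be sketched in Python, where (as far as I know) urllib.parse.quote with safe="" behaves like PHP's rawurlencode() and quote_plus behaves like urlencode() for typical input. The search term is a made-up example.

```python
from urllib.parse import quote, quote_plus

term = "rock & roll/blues"  # hypothetical search term

# Path-style escaping: space becomes %20, and safe="" also escapes "/".
path_part = quote(term, safe="")

# Form-style escaping: space becomes "+", intended for query strings and POST bodies.
form_part = quote_plus(term)

print(path_part)  # rock%20%26%20roll%2Fblues
print(form_part)  # rock+%26+roll%2Fblues
```

Only the first form is safe to drop into a path segment, since a "+" there would be read as a literal plus sign.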

GETting a URL with an url-encoded slash

By default, the Uri class will not allow an escaped / character (%2f) in a URI (even though this appears to be legal in my reading of RFC 3986).

Uri uri = new Uri("http://example.com/%2F");
Console.WriteLine(uri.AbsoluteUri); // prints: http://example.com//

(Note: don't use Uri.ToString to print URIs.)

According to the bug report for this issue on Microsoft Connect, this behaviour is by design, but you can work around it by adding the following to your app.config or web.config file:

<uri>
  <schemeSettings>
    <add name="http" genericUriParserOptions="DontUnescapePathDotsAndSlashes" />
  </schemeSettings>
</uri>

(Reposted from https://stackoverflow.com/a/10415482 because this is the "official" way to avoid this bug without using reflection to modify private fields.)

Edit: The Connect bug report is no longer visible, but the documentation for <schemeSettings> recommends this approach to allow escaped / characters in URIs. Note (as per that article) that there may be security implications for components that don't handle escaped slashes correctly.

Why Does url-encoding the first slash after the domain break the url?

The / is a reserved character. It’s not equivalent to %2f. If you need the slash without its defined meaning, you’d use the encoded form.

See RFC 3986: "Reserved Characters":

The purpose of reserved characters is to provide a set of delimiting
characters that are distinguishable from other data within a URI.
URIs that differ in the replacement of a reserved character with its
corresponding percent-encoded octet are not equivalent. Percent-
encoding a reserved character, or decoding a percent-encoded octet
that corresponds to a reserved character, will change how the URI is
interpreted by most applications.
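One way to see what that paragraph means in practice is the rough Python sketch below. It assumes a simplified comparison that only undoes escapes of unreserved characters (RFC 3986 section 2.3) and leaves everything else encoded. The function names are invented for the example, and this is not a full RFC 3986 normalizer.

```python
import re

# RFC 3986 section 2.3: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)

def normalize_escapes(path):
    """Decode escapes of unreserved characters; uppercase the hex of all others."""
    def fix(match):
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else "%" + match.group(1).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", fix, path)

def same_path(a, b):
    """Compare two paths after the simplified normalization above."""
    return normalize_escapes(a) == normalize_escapes(b)

print(same_path("/%7Euser/docs", "/~user/docs"))                    # True
print(same_path("/wiki%2fStack_Overflow", "/wiki/Stack_Overflow"))  # False
```

The tilde is unreserved, so %7E and ~ compare equal. The slash is reserved, so %2f is kept as data and the two wiki paths stay distinct.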

The reason the mentioned URL still works if you don't use the reserved character / for the second slash is that their CMS simply looks for the ID part in the URL. So you can add whatever you want to the URL, e.g. the following should still work:

http://dottech.org/95285/hey-this-URL-got-featured-at-stackoverflow

(However, it seems that it still has to be / or %2f in their case.)

If you try it with a Wikipedia article, it redirects to the front page:

http://en.wikipedia.org/wiki%2fStack_Overflow

Browser converts encoded slash (%2F) to literal slash (/) in path portion of URL

It's probably decoding it because it considers it part of the path.

I would suggest you explicitly treat it as a parameter. That will tell the browser not to decode it. For instance, instead of having this path:

https://localhost/#/account/AQAAANCMnd8BFdERjHoAwE%2fCl%2bsBAAAA6gbQh..........

Use this path:

https://localhost/#/account/?t=AQAAANCMnd8BFdERjHoAwE%2fCl%2bsBAAAA6gbQh......

Notice the addition of ?t= after the end of the account path.

Then consume the t parameter in your application. That will tell the browser that the value at the end is not to be decoded as part of the path but rather preserved in encoded form because it's a parameter.
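As a rough illustration of the round trip, here is a Python sketch with a made-up token (the real one in the question is truncated, so it is not reproduced here). Note that in this URL the "?" sits after the "#", so a standard URL parser would treat it as part of the fragment. That is why the sketch splits on "?" by hand, which is an assumption about how the client-side application would read it.

```python
from urllib.parse import urlencode, parse_qs

token = "AQAA/Cl+sB=="  # hypothetical token containing "/" and "+"

# Producing side: form-encode the token as the value of the t parameter.
query = urlencode({"t": token})
print(query)  # t=AQAA%2FCl%2BsB%3D%3D

url = "https://localhost/#/account/?" + query

# Consuming side: take everything after "?" and decode it as form data.
received = parse_qs(url.split("?", 1)[1])["t"][0]
print(received == token)  # True
```

The "+" has to travel as %2B, because a bare "+" in form-encoded data decodes to a space.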

This would obviously change the path you have (because of the setup part) so adjust accordingly.

Urlencode everything but slashes?


  1. Split by /
  2. urlencode() each part
  3. Join with /
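The three steps above can be sketched in Python as follows. The helper name encode_path and the sample path are invented for the example, and quote(..., safe="") is used as the per-part encoder, which matches rawurlencode() rather than urlencode(), so spaces come out as %20.

```python
from urllib.parse import quote

def encode_path(path):
    """Split on '/', percent-encode each part, and join the parts with '/' again."""
    return "/".join(quote(part, safe="") for part in path.split("/"))

print(encode_path("/my docs/a&b/report #1.pdf"))
# /my%20docs/a%26b/report%20%231.pdf
```

Empty parts (from a leading or doubled slash) pass through unchanged, so the slashes in the result line up with those in the input.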

Percent-encoded slash ( / ) is decoded before the request dispatch

I've been playing around with your code for the last few hours, and it's a doozy. The given code and its variants all pass when run in the Powershell ISE, but fail in the Powershell console.
The issue itself seems to be the one documented on Microsoft Connect here.

Interestingly, as per user Glenn Block's answer on a related issue, this bug was fixed in .NET Framework 4.5.
You can check the version of the .NET framework being used by your Powershell by running the command $PSVersionTable. As long as the CLRVersion value is of the form 4.0.30319.x, where x > 1700, then you are running v4.5 of the framework.

I'm running Powershell v4.0 on .NET framework 4.5 on my machine, so that explains why Powershell ISE shows the correct behaviour, but I was not able to figure out why Powershell console does not. I verified the .NET assemblies loaded by both, and they seem to be the same.

As things stand, we have two options.
One is to use reflection and set a private field on the .NET class to prevent this behaviour (as outlined in this answer).
The other is to use the workaround listed in the Microsoft Connect issue. This involves the following steps:

  1. Go to your Powershell install folder (this was "C:\Windows\System32\WindowsPowerShell\v1.0\" on my machine). This folder should have the file powershell.exe in it.
  2. Create a new text file in this folder, and name it powershell.exe.config
  3. Open this file in a text editor, and paste the following text into it:

    <?xml version="1.0" encoding="utf-8" ?>
    <configuration>
      <uri>
        <schemeSettings>
          <add name="http" genericUriParserOptions="DontUnescapePathDotsAndSlashes" />
          <add name="https" genericUriParserOptions="DontUnescapePathDotsAndSlashes" />
        </schemeSettings>
      </uri>
    </configuration>

  4. Save this file. Close ALL running instances of Powershell.

  5. Start a new instance of Powershell. This will cause Powershell to detect the config file you created and parse it. The config entries basically tell the .NET libraries to disable the automatic unescaping of HTTP and HTTPS URIs.
  6. Run your script. You should no longer see the issue with the Uris.

slashes in url variables

You need to escape the slashes as %2F.
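If the URL happens to be built in Python, one gotcha is that urllib.parse.quote leaves slashes alone by default, so safe="" has to be passed explicitly. A small sketch, with a made-up value and base URL:

```python
from urllib.parse import quote

value = "2024/05/report"  # hypothetical variable value containing slashes

print(quote(value))           # 2024/05/report (default safe="/" keeps slashes)
print(quote(value, safe=""))  # 2024%2F05%2Freport

# Hypothetical base URL; the variable now occupies a single path segment.
url = "http://example.com/items/" + quote(value, safe="")
print(url)  # http://example.com/items/2024%2F05%2Freport
```

Whether the server accepts %2F in the path is a separate matter, as the Apache answer above explains.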


