Is a URL Allowed to Contain a Space

Is a URL allowed to contain a space?

As per RFC 1738:

Unsafe:

Characters can be unsafe for a number of reasons. The space
character is unsafe because significant spaces may disappear and
insignificant spaces may be introduced when URLs are transcribed or
typeset or subjected to the treatment of word-processing programs.

The characters "<" and ">" are unsafe because they are used as the
delimiters around URLs in free text; the quote mark (""") is used to
delimit URLs in some systems. The character "#" is unsafe and should
always be encoded because it is used in World Wide Web and in other
systems to delimit a URL from a fragment/anchor identifier that might
follow it. The character "%" is unsafe because it is used for
encodings of other characters. Other characters are unsafe because
gateways and other transport agents are known to sometimes modify
such characters. These characters are "{", "}", "|", "\", "^", "~",
"[", "]", and "`".

All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in
systems that do not normally deal with fragment or anchor
identifiers, so that if the URL is copied into another system that
does use them, it will not be necessary to change the URL encoding.

Spaces in URLs?

A URL must not contain a literal space. The space must be encoded, either with percent-encoding (%20) or with a different encoding that uses only URL-safe characters, such as application/x-www-form-urlencoded, which uses + instead of %20 for spaces.

Whether the statement "a URL can contain a space" is right or wrong depends on the interpretation: syntactically, a URI must not contain a literal space, so the space must be encoded; semantically, %20 is not itself a space, but it represents one.

How is it possible to include a space in a URL

URLs always need to be encoded if they contain special characters, including spaces. That is usually done with percent-encoding (%20 for a space); in the query string, a space can alternatively be encoded as a plus sign '+'.
(A literal '+' is itself a reserved character, so wherever '+' stands for a space, a real plus sign has to be encoded as '%2B'.)

However, modern browsers show some intelligence when dealing with URLs and can apply that encoding transparently when necessary. To see the request actually sent to the server when you click a link containing a space, inspect it with a tool such as Live HTTP Headers or Firebug on Firefox, or with your browser's network developer tools.

URL encoding the space character: + or %20?

From Wikipedia (emphasis and link added):

When data that has been entered into HTML forms is submitted, the form field names and values are encoded and sent to the server in an HTTP request message using method GET or POST, or, historically, via email. The encoding used by default is based on a very early version of the general URI percent-encoding rules, with a number of modifications such as newline normalization and replacing spaces with "+" instead of "%20". The MIME type of data encoded this way is application/x-www-form-urlencoded, and it is currently defined (still in a very outdated manner) in the HTML and XForms specifications.

So genuine percent-encoding uses %20, while form data in URLs uses a modified form that encodes spaces as +. You are therefore most likely to see + only in the query string, after a ?.
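As a rough illustration of the form-style encoding described above, here is a minimal Java sketch. The class and method names (FormEncodingDemo, formEncode, formDecode) are made up for this example, and it assumes Java 10 or later for the Charset overloads; on older runtimes you would pass "UTF-8" as a string instead.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
public class FormEncodingDemo {
    // URLEncoder implements application/x-www-form-urlencoded: a space becomes "+".
    public static String formEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // URLDecoder accepts both "+" and "%20" as an encoded space.
    public static String formDecode(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(formEncode("query value"));   // query+value
        System.out.println(formDecode("query+value"));   // query value
        System.out.println(formDecode("query%20value")); // query value
    }
}
```

Note that URLEncoder is meant for query parameter names and values, not for whole URLs or path segments.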

Does a `+` in a URL scheme/host/path represent a space?

  • Percent-encoding in the path component of a URL is expected to be decoded, but
  • any + characters in the path component are expected to be treated literally.

To be explicit: + is only a special character in the query component.
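The difference can be checked with java.net.URI, which decodes percent escapes in the path but leaves + alone. This is only a sketch: the class name, the method names, and the sample URL are invented for illustration, and Java 10 or later is assumed for the Charset overload of URLDecoder.decode.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
public class PlusInPathDemo {
    // Path component: %20 is decoded to a space, "+" stays a literal plus.
    public static String decodedPath(String url) {
        return URI.create(url).getPath();
    }

    // Query component read as form data: "+" is decoded to a space.
    // For illustration only; real code should split on "&" and "=" before decoding.
    public static String decodedQuery(String url) {
        return URLDecoder.decode(URI.create(url).getRawQuery(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String url = "http://www.example.com/foo%20bar+baz?q=a+b";
        System.out.println(decodedPath(url));  // /foo bar+baz
        System.out.println(decodedQuery(url)); // q=a b
    }
}
```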

https://www.rfc-editor.org/rfc/rfc3986

href syntax : is it okay to have space in file name

The src attribute should contain a valid URL. Since space characters are not allowed in URLs, you have to encode them.

You can write:

<img src="buttons/bu%20hover.png" />

But not:

<img src="buttons/bu+hover.png" />

Because, as DavidRR rightfully points out in his comment, encoding space characters as + is only valid in the query string portion of a URL, not in the path itself.
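If you build such a URL in code rather than by hand, the multi-argument java.net.URI constructor will quote illegal path characters for you. A minimal sketch, where the host www.example.com and the class and method names are assumptions made for the example:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class, not part of any library.
public class PathSpaceDemo {
    // Builds an http URL from an unencoded absolute path; the constructor
    // percent-encodes characters that are illegal in a path, such as a space.
    public static String buildUrl(String host, String rawPath) {
        try {
            return new URI("http", host, rawPath, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build URL for path: " + rawPath, e);
        }
    }

    public static void main(String[] args) {
        // The space becomes %20; it is never turned into "+".
        System.out.println(buildUrl("www.example.com", "/buttons/bu hover.png"));
    }
}
```

A + already present in the path is passed through unchanged, which matches its literal meaning there.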

When should space be encoded to plus (+) or %20?

+ means a space only in application/x-www-form-urlencoded content, such as the query part of a URL:

http://www.example.com/path/foo+bar/path?query+name=query+value

In this URL, the parameter name is query name with a space and the value is query value with a space, but the folder name in the path is literally foo+bar, not foo bar.

%20 is a valid way to encode a space in either of these contexts. So if you need to URL-encode a string for inclusion in part of a URL, it is always safe to replace spaces with %20 and pluses with %2B. This is what, e.g., encodeURIComponent() does in JavaScript. Unfortunately it's not what urlencode does in PHP, which encodes spaces as + (rawurlencode is safer).
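Java has no direct equivalent of encodeURIComponent(), but the "spaces to %20, pluses to %2B" rule can be approximated by post-processing URLEncoder's output. This is a sketch under assumptions: the class and method names are invented, Java 10 or later is assumed, and the result is not character-for-character identical to encodeURIComponent() for every input (for example, URLEncoder also escapes ~, !, ' and parentheses).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
public class ComponentEncoder {
    // URLEncoder turns a space into "+" and a literal plus into "%2B",
    // so every "+" left in its output stands for a space and can be
    // rewritten as "%20" without ambiguity.
    public static String encodeComponent(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        System.out.println(encodeComponent("foo bar+baz")); // foo%20bar%2Bbaz
    }
}
```

The output is safe to drop into either a path segment or a query value.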

See Also

HTML 4.01 Specification application/x-www-form-urlencoded

Picasso not working if url contains space

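// Backslashes must be escaped inside a Java string literal; the URL itself is unchanged.
String temp = "http://www.tonightfootballreport.com/\\Filebucket\\Picture\\image\\png\\20160807025619_Serie A.png";
temp = temp.replace(" ", "%20"); // percent-encode the space; no regex needed
URL sourceUrl = new URL(temp);   // java.net.URL; may throw MalformedURLException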

how to handle spaces in url android

Pass your URL parameter values through URLEncoder.encode() so that spaces and other special characters get correctly encoded. Encode only the individual values, not the whole URL.

Example:

strUrlLoginFullpath = strUrlMain + "exl?u=" + URLEncoder.encode(strUser, "UTF-8") +
"&p=" + URLEncoder.encode(strPass, "UTF-8") + "&t=1";

