How to Urlencode a Querystring in Python

How to urlencode a querystring in Python?

You need to pass your parameters to urlencode() as either a mapping (dict) or a sequence of 2-tuples. In Python 2:

>>> import urllib
>>> f = { 'eventName' : 'myEvent', 'eventDescription' : 'cool event'}
>>> urllib.urlencode(f)
'eventName=myEvent&eventDescription=cool+event'

Python 3 or above

Use urllib.parse.urlencode:

>>> import urllib.parse
>>> urllib.parse.urlencode(f)
'eventName=myEvent&eventDescription=cool+event'
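
As a minimal, self-contained sketch of the two accepted input shapes in Python 3 (the variable names and the repeated "tag" key are made up for illustration), the mapping form and the sequence-of-pairs form give the same query string:

```python
from urllib.parse import urlencode

# The same parameters, once as a dict and once as a sequence of 2-tuples.
params_dict = {"eventName": "myEvent", "eventDescription": "cool event"}
params_pairs = [("eventName", "myEvent"), ("eventDescription", "cool event")]

from_dict = urlencode(params_dict)
from_pairs = urlencode(params_pairs)
print(from_dict)   # eventName=myEvent&eventDescription=cool+event
print(from_pairs)  # same string as above

# A sequence of pairs also lets one key appear more than once.
repeated = urlencode([("tag", "python"), ("tag", "url encoding")])
print(repeated)    # tag=python&tag=url+encoding
```

The pair form is probably the better choice when the same key has to be sent several times, since a dict can hold each key only once.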

Note that urlencode builds a whole query string from key/value pairs; it does not percent-encode a single string, as the output shows. For that, use urllib.parse.quote_plus.
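
To make the difference concrete, here is a small sketch (the sample value is invented for illustration) that runs the same string through quote_plus and through urlencode:

```python
from urllib.parse import quote_plus, urlencode

value = "cool event & more?"

# quote_plus percent-encodes one string on its own.
encoded_value = quote_plus(value)
print(encoded_value)  # cool+event+%26+more%3F

# urlencode encodes keys and values and joins them into key=value pairs.
query = urlencode({"eventDescription": value})
print(query)          # eventDescription=cool+event+%26+more%3F
```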

How to URL encode in Python 3?

You misread the documentation. You need to do two things:

  1. Quote each key and value from your dictionary, and
  2. Encode those into a URL

Luckily urllib.parse.urlencode does both those things in a single step, and that's the function you should be using.

from urllib.parse import urlencode, quote_plus

payload = {'username':'administrator', 'password':'xyz'}
result = urlencode(payload, quote_via=quote_plus)
# 'username=administrator&password=xyz' (dicts keep insertion order in Python 3.7+)
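
The quote_via argument only matters once a value contains characters that the two quoting functions treat differently. A short sketch, with a made-up "note" field added to the payload to show this:

```python
from urllib.parse import urlencode, quote, quote_plus

# "note" is a hypothetical extra field containing a space.
payload = {"username": "administrator", "note": "hello world"}

with_plus = urlencode(payload, quote_via=quote_plus)  # the default behaviour
with_quote = urlencode(payload, quote_via=quote)

print(with_plus)   # username=administrator&note=hello+world
print(with_quote)  # username=administrator&note=hello%20world
```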

How to encode a string for url param in python

Use the urllib.quote_plus function (urllib.parse.quote_plus in Python 3). Here is a Python 2 example:

#!/usr/bin/python
import urllib

print "Content-Type: text/html;charset=utf-8"
print

# avoid naming the variable str, which would shadow the built-in type
text = "Hello World?"
encoded = urllib.quote_plus(text)

print encoded

Output:
Hello+World%3F
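
For Python 3, a roughly equivalent sketch (without the CGI header lines, and with variable names chosen here for illustration) would be:

```python
from urllib.parse import quote_plus

text = "Hello World?"
encoded = quote_plus(text)  # spaces become +, ? becomes %3F
print(encoded)              # Hello+World%3F
```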

Python: How to only URL Encode a specific URL Parameter?

Split the string on the &q=/ part and encode only the last piece:

from urllib import parse

url = 'https://www.exmple.com/test?test1=abc&test2=abc&test3=abc&q=/"TEST"/"TEST"'
encoded = parse.quote_plus(url.split("&q=/")[1])
encoded_url = f"{url.split('&q=/')[0]}&q=/{encoded}"
print(encoded_url)

output

https://www.exmple.com/test?test1=abc&test2=abc&test3=abc&q=/%22TEST%22%2F%22TEST%22

Note that this differs from the requested output, which has a URL-encoded space (%20) at the end.


EDIT

The comment shows a different need for the encoding, so the code needs to change a bit. The code below encodes only the part after &q=. It first splits the URL from its parameters, then iterates through the parameters to find the q= parameter and encodes its value. Finally, an f-string and a join rebuild the URL with the q parameter encoded. Note that this may break if an & is present in the part that needs to be encoded.

from urllib import parse

url = 'https://www.exmple.com/test?test1=abc&test2=abc&test3=abc&q=/"TEST"/"TEST"&utm_source=test1&cpc=123&gclid=abc123'
# the query string starts after the first ?
baseurl, parameters = url.split("?", 1)
newparameters = []
for parameter in parameters.split("&"):
    # check if the parameter is the part that needs to be encoded
    if parameter.startswith("q="):
        # encode the parameter
        newparameters.append(f"q={parse.quote_plus(parameter[2:])}")
    else:
        # otherwise add the parameter unencoded
        newparameters.append(parameter)
# string magic to create the encoded url
encoded_url = f"{baseurl}?{'&'.join(newparameters)}"
print(encoded_url)

output

https://www.exmple.com/test?test1=abc&test2=abc&test3=abc&q=%2F%22TEST%22%2F%22TEST%22&utm_source=test1&cpc=123&gclid=abc123
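
The same idea can be wrapped in a small helper. This is only a sketch: encode_param is a hypothetical name, not a library function, and it keeps the limitation that an unencoded & inside the value breaks the split.

```python
from urllib.parse import quote_plus


def encode_param(url, name):
    """Percent-encode the value of one query parameter; leave the rest as-is."""
    base, sep, query = url.partition("?")
    if not sep:
        # no query string at all, nothing to encode
        return url
    prefix = name + "="
    parts = []
    for part in query.split("&"):
        if part.startswith(prefix):
            # encode only the value, keep the parameter name untouched
            part = prefix + quote_plus(part[len(prefix):])
        parts.append(part)
    return base + "?" + "&".join(parts)


url = 'https://www.exmple.com/test?test1=abc&q=/"TEST"/"TEST"&cpc=123'
result = encode_param(url, "q")
print(result)
# https://www.exmple.com/test?test1=abc&q=%2F%22TEST%22%2F%22TEST%22&cpc=123
```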

EDIT 2

This tries to handle the edge case where there's an & character in the string to be encoded, which breaks the string.split("&") step.

I tried using urllib.parse.parse_qs(), but it has the same issue with the & character.

This question is a nice example of how edge cases can mess up simple logic and make it overly complicated.

RFC 3986 also doesn't specify any restrictions on query parameter names; otherwise those could have been used to narrow down possible errors even more.

updated code

from urllib import parse

url = 'https://www.exmple.com/test?test1=abc&test2=abc&test3=abc&q=/"TEST"/&"TE&eeST"&utm_source=test1&cpc=123&gclid=abc123'
# the query string starts after the first ?
baseurl, parameters = url.split("?", 1)

# addition to handle & in the querystring.
# it reduces errors, but it can still mess up if there's a = in the part to be encoded.
split_parameters = []
for parameter in parameters.split("&"):
    if "=" not in parameter and split_parameters:
        # add this part to the previous entry in split_parameters
        split_parameters[-1] += f"&{parameter}"
    else:
        split_parameters.append(parameter)

newparameters = []
for parameter in split_parameters:
    # check if the parameter is the part that needs to be encoded
    if parameter.startswith("q="):
        # encode the parameter
        newparameters.append(f"q={parse.quote_plus(parameter[2:])}")
    else:
        # otherwise add the parameter unencoded
        newparameters.append(parameter)
# string magic to create the encoded url
encoded_url = f"{baseurl}?{'&'.join(newparameters)}"
print(encoded_url)

output

https://www.exmple.com/test?test1=abc&test2=abc&test3=abc&q=%2F%22TEST%22%2F%26%22TE%26eeST%22&utm_source=test1&cpc=123&gclid=abc123

How can I percent-encode URL parameters in Python?

Python 2

From the documentation:

urllib.quote(string[, safe])

Replace special characters in string using the %xx escape. Letters, digits, and the characters '_.-' are never quoted. By default, this function is intended for quoting the path section of the URL. The optional safe parameter specifies additional characters that should not be quoted; its default value is '/'.

That means passing '' for safe will solve your first issue:

>>> urllib.quote('/test')
'/test'
>>> urllib.quote('/test', safe='')
'%2Ftest'

As for the second issue, there is a bug report about it. Apparently it was fixed in Python 3. You can work around it by encoding as UTF-8 like this:

>>> query = urllib.quote(u"Müller".encode('utf8'))
>>> print urllib.unquote(query).decode('utf8')
Müller

By the way, have a look at urlencode.

Python 3

In Python 3, the function quote has been moved to urllib.parse:

>>> import urllib.parse
>>> print(urllib.parse.quote("Müller".encode('utf8')))
M%C3%BCller
>>> print(urllib.parse.unquote("M%C3%BCller"))
Müller
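
As a short sketch of both points in Python 3: the safe parameter behaves the same way as in Python 2, and quote also accepts a str directly, encoding it as UTF-8 by default, so the explicit .encode() call is optional.

```python
from urllib.parse import quote, unquote

default_safe = quote("/test")      # '/' is safe by default
no_safe = quote("/test", safe="")  # nothing extra is safe, so '/' is escaped
print(default_safe)  # /test
print(no_safe)       # %2Ftest

encoded = quote("Müller")          # str input, UTF-8 by default
decoded = unquote(encoded)
print(encoded)  # M%C3%BCller
print(decoded)  # Müller
```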

Build query string using urlencode python

You shouldn't worry about the + being encoded; the server should restore it after unescaping the URL. The order of named parameters shouldn't matter either.

As for OrderedDict, it is not a built-in. You need to import it from collections:

from urllib import urlencode, quote
# from urllib.parse import urlencode, quote # python3
from collections import OrderedDict

initial_url = "http://www.stackoverflow.com"
search = "Generate+value"
# pass a list of pairs: keyword arguments do not keep their order before Python 3.6
query_string = urlencode(OrderedDict([("data", initial_url), ("search", search)]))
url = 'www.example.com/find.php?' + query_string

If your Python is too old and does not have OrderedDict in the collections module, use:

# parameters is a dict of query parameters; sorted() gives a stable key order
encoded = "&".join("%s=%s" % (key, quote(parameters[key], safe="+"))
                   for key in sorted(parameters.keys()))

Anyway, the order of parameters should not matter.

Note the safe parameter of quote. It prevents + from being escaped, but it also means the server will interpret Generate+value as Generate value. You can manually escape + by writing %2B and marking % as a safe character.
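
A sketch of what that manual escaping could look like, written for Python 3 (in Python 2 the function is urllib.quote); the variable names are chosen here for illustration:

```python
from urllib.parse import quote

# With + marked as safe it passes through unchanged.
kept_plus = quote("Generate+value", safe="+")
# With the default safe characters, + is escaped to %2B.
escaped_plus = quote("Generate+value")
# A pre-escaped %2B survives only if % itself is marked as safe...
manual = quote("Generate%2Bvalue", safe="%")
# ...otherwise the % gets escaped again (double encoding).
double = quote("Generate%2Bvalue")

print(kept_plus)     # Generate+value
print(escaped_plus)  # Generate%2Bvalue
print(manual)        # Generate%2Bvalue
print(double)        # Generate%252Bvalue
```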

URI encoding in Python Requests package

You can use requests.utils.quote (which is just an alias for urllib.parse.quote) to convert your text to URL-encoded format.

>>> import requests
>>> requests.utils.quote('test+user@gmail.com')
'test%2Buser%40gmail.com'
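
Since the function is the standard-library one, the same result should be available without importing requests. A small sketch, where the safe="@" variant is an extra shown only for illustration:

```python
from urllib.parse import quote, unquote

address = "test+user@gmail.com"

encoded = quote(address)                # + and @ are both escaped
keep_at = quote(address, safe="@")      # keep @ readable, still escape +
round_trip = unquote(encoded)

print(encoded)     # test%2Buser%40gmail.com
print(keep_at)     # test%2Buser@gmail.com
print(round_trip)  # test+user@gmail.com
```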

How to encode a single string in python

In Python 3:

>>> import urllib.parse
>>> urllib.parse.quote_plus('D:/somedir/userdata/scripts')
'D%3A%2Fsomedir%2Fuserdata%2Fscripts'

In Python 2:

>>> import urllib
>>> urllib.quote_plus('D:/somedir/userdata/scripts')
'D%3A%2Fsomedir%2Fuserdata%2Fscripts'
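
To check the result, the string can be decoded again with unquote_plus. The sketch below (Python 3, with a made-up second path that contains a space) also shows how quote and quote_plus differ:

```python
from urllib.parse import quote, quote_plus, unquote_plus

path = "D:/somedir/userdata/scripts"
encoded = quote_plus(path)
decoded = unquote_plus(encoded)
print(encoded)  # D%3A%2Fsomedir%2Fuserdata%2Fscripts
print(decoded)  # D:/somedir/userdata/scripts

# Hypothetical path with a space, to contrast the two functions.
spaced = "my dir/file"
print(quote(spaced))       # my%20dir/file  ('/' kept, space as %20)
print(quote_plus(spaced))  # my+dir%2Ffile  ('/' escaped, space as +)
```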

How to urlencode UTC datetime strings?

%3A is the URL-encoded form of the colon (:).

Hence begin=2020-02-04T17:00:00&end=2020-02-04T20:00:00 when encoded will be begin=2020-02-04T17%3A00%3A00&end=2020-02-04T20%3A00%3A00, as stated in your question.

The format is correct and there is nothing wrong with your code.
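
This can be confirmed with a short sketch that encodes the two timestamps and decodes them again. The safe=":" variant is an optional extra, assuming the receiving server accepts unescaped colons in the query string:

```python
from urllib.parse import urlencode, parse_qs

params = {"begin": "2020-02-04T17:00:00", "end": "2020-02-04T20:00:00"}

query = urlencode(params)
print(query)
# begin=2020-02-04T17%3A00%3A00&end=2020-02-04T20%3A00%3A00

# Decoding gives back the original values, so nothing is lost.
decoded = parse_qs(query)
print(decoded["begin"][0])  # 2020-02-04T17:00:00

# Marking ':' as safe leaves the colons unescaped.
relaxed = urlencode(params, safe=":")
print(relaxed)
# begin=2020-02-04T17:00:00&end=2020-02-04T20:00:00
```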

python requests module request params encoded url is different with the intended url

You can manually encode your _params to construct your query string and then concatenate it to your _url.

You can use urllib.parse.urlencode (see the Python docs) to convert your _params dictionary to a percent-encoded ASCII text string. The resulting string is a series of key=value pairs separated by & characters, where both key and value are quoted using the quote_via function. By default, quote_plus() is used to quote the values, which means spaces are quoted as a + character and / characters are encoded as %2F, which follows the standard for GET requests (application/x-www-form-urlencoded). An alternate function that can be passed as quote_via is quote(), which will encode spaces as %20 and not encode / characters. For maximum control of what is quoted, use quote and specify a value for safe.


from urllib.parse import quote_plus, quote, urlencode
import requests

url_template = "http://something/?{}"
_headers = { ... }
_params = {"action": "log", "datetime": "0900 (대한민국 표준시)"}
_url = url_template.format(urlencode(_params, safe="()", quote_via=quote))

response = requests.get(_url, headers=_headers)
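
To inspect the query string without sending a request, a sketch like the following can help. It reuses the parameters from above under new variable names and only checks the encoding step, not how any server handles the result:

```python
from urllib.parse import urlencode, quote, parse_qs

params = {"action": "log", "datetime": "0900 (대한민국 표준시)"}

# quote_via=quote encodes spaces as %20; safe="()" leaves the parentheses alone.
query = urlencode(params, safe="()", quote_via=quote)
print(query)

# The non-ASCII text is percent-encoded as UTF-8 and decodes back unchanged.
decoded = parse_qs(query)
print(decoded["datetime"][0])  # 0900 (대한민국 표준시)
```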

