Python: Ignore 'Incorrect Padding' Error When Base64 Decoding

Base64 Incorrect padding error using Python

You have at least one string in your CSV file that is either not a Base64 string, is a corrupted (damaged) Base64 string, or is a string that is missing the required = padding. Your example value, ABHPdSaxrhjAWA=, is short one = or is missing another data character.

Base64 strings, properly padded, have a length that is a multiple of 4, so you can easily re-add the padding:

value = csvlines[0]
if len(value) % 4:
    # not a multiple of 4, add padding:
    value += '=' * (4 - len(value) % 4)
# str.decode("base64") and str.encode("hex") exist in Python 2 only
csvlines[0] = value.decode("base64").encode("hex")

If the value then still fails to decode, then your input was corrupted or not valid Base64 to begin with.
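For Python 3, where strings no longer have a "base64" codec, the same idea can be sketched with the base64 module. The helper name decode_padded is made up for this illustration and is not part of any library:

```python
import base64

def decode_padded(value):
    """Re-add any missing '=' padding, then Base64-decode to bytes."""
    # -len(value) % 4 is 0 when the length is already a multiple of 4
    value += '=' * (-len(value) % 4)
    return base64.b64decode(value)

raw = decode_padded('ABHPdSaxrhjAWA=')
print(raw.hex())  # 0011cf7526b1ae18c058
```

Input that is still invalid after padding raises binascii.Error, so wrapping the call in try/except is one way to skip or log bad rows.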

For the example error, ABHPdSaxrhjAWA=, the above adds one = to make it decodable:

>>> value = 'ABHPdSaxrhjAWA='
>>> if len(value) % 4:
...     # not a multiple of 4, add padding:
...     value += '=' * (4 - len(value) % 4)
...
>>> value
'ABHPdSaxrhjAWA=='
>>> value.decode('base64')
'\x00\x11\xcfu&\xb1\xae\x18\xc0X'
>>> value.decode('base64').encode('hex')
'0011cf7526b1ae18c058'

I need to emphasise that your data may simply be corrupted. Your console output includes one value that worked, and one that failed. The one that worked is one character longer, and that's the only difference:

ABHvPdSaxrhjAWA=
ABHPdSaxrhjAWA=

Note the v in the 4th place; this is missing from the second example. This could indicate that something happened to your CSV data that caused that character to be dropped from the second example. Adding in padding can make the second value decodable again, but the result would be wrong. We can't tell you which of those two options is the cause here.
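A quick check in Python 3 shows why re-padding alone cannot confirm that the data is intact. This is only a comparison of the two values quoted above; it cannot say which one is the correct one:

```python
import base64

# The value that decoded without help
intact = base64.b64decode('ABHvPdSaxrhjAWA=')
# The failing value, with one more '=' appended so that it decodes
repaired = base64.b64decode('ABHPdSaxrhjAWA=' + '=')

print(len(intact), len(repaired))  # 11 10
print(intact == repaired)          # False
```

Both calls succeed, yet the results differ in length and content, so a successful decode is not proof that no character was lost.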

How to avoid Incorrect padding error while Base64 Decoding this string in Python

Well, it looks like the message is split by . into three parts. The first two parts are Base64-encoded JSON. The last one is also Base64 (URL-safe alphabet, using - and _), but it decodes to binary data rather than readable text:

import base64

res = "eyJqa3UiOiJodHRwczovL2U5N2I4YTlkNjcyZTRjZTQ4NDVlYzY5NDdjZDY2ZWY2LXNiLmJhYXMubmludGVuZG8uY29tLzEuMC4wL2NlcnRpZmljYXRlcyIsImtpZCI6ImZlOWRiYmZmLTQ3MGItNDZjOC04YmFmLTFiNzY5OGRlZTViZSIsImFsZyI6IlJTMjU2In0.eyJpc3MiOiJodHRwczovL2U5N2I4YTlkNjcyZTRjZTQ4NDVlYzY5NDdjZDY2ZWY2LXNiLmJhYXMubmludGVuZG8uY29tIiwiZXhwIjoxNTQ1MTg1NDk2LCJ0eXAiOiJpZF90b2tlbiIsImF1ZCI6IjhkOTc1NTllNjNlY2NkNTYiLCJiczpkaWQiOiI2NjJhZTQwOWYwNTQyYTBjIiwic3ViIjoiOTNkYmYwNDdiYTI3NzQ5NSIsImp0aSI6IjY1NDg4ZjJmLTI1NzAtNDBkYy04ODQ3LTMzODNlZWIxMGJiYiIsIm5pbnRlbmRvIjp7ImFpIjoiMDEwMGY4MDAwMDQ5MjAwMCIsImF2IjoiMDAwMCIsImVkaSI6ImJjNTdiYmM3MTZlMDA1MGFmOWRhN2NkYTIzMWRjZDgyIiwiYXQiOjE1NDUxNzQ2OTZ9LCJpYXQiOjE1NDUxNzQ2OTZ9.ZMUIt3wYrbfhXnnDh4WraGlKrZy0YuL5prluY70sU_-0W5XvWIB-xmTrLz7LJWHEGwTskcWf81_HBq_mSb75rMfTAEBwBmOJ4ITmhdnXksz8w7EDOWuPPSEft5XLMNOMD16ztEOYe5ddU_iqNEbT56L7fcAJEXv0FWy6H_OutxOglYpDaNkcj6CWJ7dpA0JbqerR9dEszaLwyn1ZBDPVD0YeAIm5bEr61imeedzMb0amxlTl4R87mqK6epsFUnRy6p6Klr27_DlTLQ-gej09W7NeNzONCj4thHgCr9szAiaN28krfTc2fobz3qFCoC_eQghiIIZBe_-Lksng3Eg6tw"

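for i in res.split("."):
    # The segments use the URL-safe alphabet (- and _) and omit the padding,
    # so re-add the '=' characters and use urlsafe_b64decode
    print(base64.urlsafe_b64decode(i + '=' * (-len(i) % 4)))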

I guess the last one is a signature, which is used to validate the first two parts. Do you get this string from a cookie? Or from a submitted form?

Edit

For anyone who sees this answer: the given string is a JWT (JSON Web Token).
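Since a JWT is three unpadded base64url segments, the header and payload can be read as JSON once the padding is restored. The following is a minimal sketch: the token is a made-up one built inside the example, the helper names b64url_decode and b64url_encode are hypothetical, and nothing here verifies the signature.

```python
import base64
import json

def b64url_decode(segment):
    """Decode one unpadded base64url segment to bytes."""
    return base64.urlsafe_b64decode(segment + '=' * (-len(segment) % 4))

def b64url_encode(data):
    """Encode bytes as base64url without '=' padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b'=').decode('ascii')

# Hypothetical token, built here so the example is self-contained
header = {"alg": "none", "typ": "JWT"}
payload = {"sub": "example", "iat": 1545174696}
fake_signature = b'\xfb\xff\x00not-a-real-signature'
token = '.'.join([
    b64url_encode(json.dumps(header).encode('utf-8')),
    b64url_encode(json.dumps(payload).encode('utf-8')),
    b64url_encode(fake_signature),
])

h, p, sig = token.split('.')
print(json.loads(b64url_decode(h)))  # header as a dict
print(json.loads(b64url_decode(p)))  # payload as a dict
print(b64url_decode(sig))            # raw bytes, not text
```

For real tokens, a JWT library that also checks the signature is the safer choice; decoding alone says nothing about whether the token is genuine.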

binascii.Error: Incorrect padding, even when string length is multiple of 4

Checking your link, your string is indeed 200000 bytes long, but it contains a header:

strOne = b"...

This is part of a MIME message or something similar. You have to strip it first:

strOne = strOne.partition(b",")[2]  # strOne is bytes, so the separator must be bytes too

Then pad (if needed):

pad = -len(strOne) % 4  # number of '=' needed to reach a multiple of 4
strOne += b"=" * pad

Then decode using codecs (Python 3 compatible):

codecs.decode(strOne.strip(),'base64')

=> "we believe in team work" :)
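Put together, the three steps might look like the sketch below. The real header is not shown above, so the "data:text/plain;base64," prefix and the sample payload are assumptions made up for this example. It also uses base64.b64decode with validate=True instead of codecs, which is a swapped-in choice: in its default mode b64decode silently discards characters outside the Base64 alphabet, so a leftover header can change the effective length and trigger the padding error even when len() is a multiple of 4.

```python
import base64

# Hypothetical input: a data-URI-style prefix followed by unpadded Base64
original = b"example payload!"
str_one = b"data:text/plain;base64," + base64.b64encode(original).rstrip(b"=")

# Keep only what follows the first comma (bytes separator for bytes input)
encoded = str_one.partition(b",")[2].strip()

# Restore the padding; this adds nothing if the length is already fine
encoded += b"=" * (-len(encoded) % 4)

# validate=True raises binascii.Error on stray non-alphabet characters
decoded = base64.b64decode(encoded, validate=True)
print(decoded)  # b'example payload!'
```

Note that partition returns an empty result for index 2 when the input has no comma, so input without a header needs a separate branch.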

python b64decode incorrect padding

Your description of what you are doing sounds OK. Choice of the input piece size affects only the efficiency. Padding bytes are minimised if the length of each input piece (except of course the last) is a multiple of 3.
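The effect of the piece size can be checked with a small experiment. The data and the piece size of 9 below are arbitrary values chosen for illustration:

```python
import base64

data = bytes(range(50))
piece_size = 9  # a multiple of 3, so only the final piece can need '='

# Encode each input piece separately, as the question describes
pieces = [base64.b64encode(data[i:i + piece_size])
          for i in range(0, len(data), piece_size)]

# Count the padding characters in each encoded piece
print([p.count(b'=') for p in pieces])  # [0, 0, 0, 0, 0, 1]

# Each piece decodes on its own; joining the results restores the input
restored = b''.join(base64.b64decode(p) for p in pieces)
print(restored == data)  # True
```

With a piece size that is not a multiple of 3, every piece carries its own padding, which still decodes correctly piece by piece but wastes bytes.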

You need to show us both your server code and your client code. Alternatively: on the server, log the input and the pieces transmitted. On the client, log the pieces received. Compare.

Curiosity: Why don't you just b64encode the whole string, split the encoded result however you like, transmit the pieces, at the client reassemble the pieces using b''.join(pieces) and b64decode that?
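That alternative could be sketched as follows. The payload and the chunk size are arbitrary, and the transport is left out entirely: the sketch assumes every piece arrives, and in the original order, which UDP by itself does not guarantee.

```python
import base64

data = b"any binary payload \x00\x01\x02" * 10

# Sender: encode once, then cut the encoded text wherever convenient
encoded = base64.b64encode(data)
chunk = 16  # arbitrary; need not be a multiple of 3 or 4
pieces = [encoded[i:i + chunk] for i in range(0, len(encoded), chunk)]

# Receiver: reassemble first, decode once
reassembled = b''.join(pieces)
print(base64.b64decode(reassembled) == data)  # True
```

Because the padding only appears at the very end of the single encoded string, the individual pieces never need to be valid Base64 on their own.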

Further curiosity: I thought the contents of a UDP packet could be any old binary bunch of bytes; why are you doing base64 encoding at all?


