Base64 Incorrect padding error using Python
You have at least one string in your CSV file that is either not a Base64 string, is a corrupted (damaged) Base64 string, or is a string that is missing the required =
padding. Your example value, ABHPdSaxrhjAWA=
, is short one =
or is missing another data character.
Base64 strings, properly padded, have a length that is a multiple of 4, so you can easily re-add the padding:
value = csvlines[0]
if len(value) % 4:
# not a multiple of 4, add padding:
value += '=' * (4 - len(value) % 4)
csvlines[0] = value.decode("base64").encode("hex")
If the value then still fails to decode, then your input was corrupted or not valid Base64 to begin with.
For the example error, ABHPdSaxrhjAWA=
, the above adds one =
to make it decodable:
>>> value = 'ABHPdSaxrhjAWA='
>>> if len(value) % 4:
... # not a multiple of 4, add padding:
... value += '=' * (4 - len(value) % 4)
...
>>> value
'ABHPdSaxrhjAWA=='
>>> value.decode('base64')
'\x00\x11\xcfu&\xb1\xae\x18\xc0X'
>>> value.decode('base64').encode('hex')
'0011cf7526b1ae18c058'
I need to emphasise that your data may simply be corrupted. Your console output includes one value that worked, and one that failed. The one that worked is one character longer, and that's the only difference:
ABHvPdSaxrhjAWA=
ABHPdSaxrhjAWA=
Note the v
in the 4th place; this is missing from the second example. This could indicate that something happened to your CSV data that caused that character to be dropped from the second example. Adding in padding can make the second value decodable again, but the result would be wrong. We can't tell you which of those two options is the cause here.
How to avoid Incorrect padding error while Base64 Decoding this string in Python
Well, it looks like the message is split by .
into three parts. The first two parts are base64 encoded, while the last one is not:
import base64
res = "eyJqa3UiOiJodHRwczovL2U5N2I4YTlkNjcyZTRjZTQ4NDVlYzY5NDdjZDY2ZWY2LXNiLmJhYXMubmludGVuZG8uY29tLzEuMC4wL2NlcnRpZmljYXRlcyIsImtpZCI6ImZlOWRiYmZmLTQ3MGItNDZjOC04YmFmLTFiNzY5OGRlZTViZSIsImFsZyI6IlJTMjU2In0.eyJpc3MiOiJodHRwczovL2U5N2I4YTlkNjcyZTRjZTQ4NDVlYzY5NDdjZDY2ZWY2LXNiLmJhYXMubmludGVuZG8uY29tIiwiZXhwIjoxNTQ1MTg1NDk2LCJ0eXAiOiJpZF90b2tlbiIsImF1ZCI6IjhkOTc1NTllNjNlY2NkNTYiLCJiczpkaWQiOiI2NjJhZTQwOWYwNTQyYTBjIiwic3ViIjoiOTNkYmYwNDdiYTI3NzQ5NSIsImp0aSI6IjY1NDg4ZjJmLTI1NzAtNDBkYy04ODQ3LTMzODNlZWIxMGJiYiIsIm5pbnRlbmRvIjp7ImFpIjoiMDEwMGY4MDAwMDQ5MjAwMCIsImF2IjoiMDAwMCIsImVkaSI6ImJjNTdiYmM3MTZlMDA1MGFmOWRhN2NkYTIzMWRjZDgyIiwiYXQiOjE1NDUxNzQ2OTZ9LCJpYXQiOjE1NDUxNzQ2OTZ9.ZMUIt3wYrbfhXnnDh4WraGlKrZy0YuL5prluY70sU_-0W5XvWIB-xmTrLz7LJWHEGwTskcWf81_HBq_mSb75rMfTAEBwBmOJ4ITmhdnXksz8w7EDOWuPPSEft5XLMNOMD16ztEOYe5ddU_iqNEbT56L7fcAJEXv0FWy6H_OutxOglYpDaNkcj6CWJ7dpA0JbqerR9dEszaLwyn1ZBDPVD0YeAIm5bEr61imeedzMb0amxlTl4R87mqK6epsFUnRy6p6Klr27_DlTLQ-gej09W7NeNzONCj4thHgCr9szAiaN28krfTc2fobz3qFCoC_eQghiIIZBe_-Lksng3Eg6tw"
for i in res.split("."):
print(base64.b64decode(i + '=' * (-len(i) % 4)))
I guess the last one is a signature, which is used to validate the first two parts. Do you get this string from a cookie? Or from a submitted form?
Edit
So for anyone sees this answer, the given string is a JWT string.
binascii.Error: Incorrect padding, even when string length is multiple of 4
by checking your link, your string has 200000 bytes all right, but it contains the header:
strOne = b"data:image/png;base64,iVBORw0KGgoAAAANSU...
This is part of MIME message or something. You have to strip this first.
strOne = strOne.partition(",")[2]
then pad (if needed)
pad = len(strOne)%4
strOne += b"="*pad
then decode using codecs
(python 3 compliant)
codecs.decode(strOne.strip(),'base64')
=> "we believe in team work" :)
python b64decode incorrect padding
Your description of what you are doing sounds OK. Choice of the input piece size affects only the efficiency. Padding bytes are minimised if the length of each input piece (except of course the last) is a multiple of 3.
You need to show us both your server code and your client code. Alternatively: on the server, log the input and the pieces transmitted. On the client, log the pieces received. Compare.
Curiosity: Why don't you just b64encode the whole string, split the encoded result however you like, transmit the pieces, at the client reassemble the pieces using b''.join(pieces)
and b64decode that?
Further curiosity: I thought the contents of a UDP packet could be any old binary bunch of bytes; why are you doing base64 encoding at all?
Related Topics
Index N Dimensional Array with (N-1) D Array
What Can You Use Generator Functions For
How to Get List of Methods in a Python Class
Python Multiprocessing Linux Windows Difference
How to Automatically Install Required Packages from a Python Script as Necessary
Detect Socket Hangup Without Sending or Receiving
Appending the Same String to a List of Strings in Python
Windows Scipy Install: No Lapack/Blas Resources Found
Keyboard Interrupts with Python's Multiprocessing Pool
How to Replace Text in a String Column of a Pandas Dataframe
How to Specify the Function Type in My Type Hints
How to Execute a File Within the Python Interpreter
Python: Get a Frequency Count Based on Two Columns (Variables) in Pandas Dataframe Some Row Appers