How to check for a valid Base64 encoded string
Update: For newer versions of C#, there's a much better alternative, please refer to the answer by Tomas here: https://stackoverflow.com/a/54143400/125981.
It's fairly easy to recognize a Base64 string, as it will only be composed of the characters 'A'..'Z', 'a'..'z', '0'..'9', '+' and '/',
and it is often padded at the end with one or two '=' characters to make the length a multiple of 4. But instead of checking these rules yourself, you'd be better off attempting the conversion and catching the exception, if it occurs.
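As a language-neutral illustration of letting the decoder decide instead of inspecting characters by hand, here is a minimal Python sketch. The helper name `is_base64` is hypothetical and not taken from the answer; it assumes the standard Base64 alphabet (not the URL-safe variant).

```python
import base64
import binascii


def is_base64(text: str) -> bool:
    """Return True if `text` decodes as standard Base64 (hypothetical helper)."""
    try:
        # validate=True rejects characters outside the Base64 alphabet
        # instead of silently discarding them.
        base64.b64decode(text, validate=True)
        return True
    except (binascii.Error, ValueError):
        # binascii.Error covers bad padding and bad characters;
        # ValueError covers non-ASCII input strings.
        return False


if __name__ == "__main__":
    for sample in ("SGFwcHkgSG9saWRheXM=", "foo", "word", "words"):
        print(sample, is_base64(sample))
```

Note that an ordinary word such as "word" still passes, because it happens to be well-formed Base64.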
How to check whether a string is Base64 encoded or not
You can use the following regular expression to check if a string constitutes a valid base64 encoding:
^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)?$
In Base64 encoding, the character set is A-Z, a-z, 0-9, '+' and '/'. If the final group has fewer than 4 characters, it is padded with '=' characters.
^([A-Za-z0-9+/]{4})*
means the string starts with 0 or more complete Base64 groups of 4 characters.
([A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)?$
means the string may then end with one padded group, in one of two forms: [A-Za-z0-9+/]{3}= or [A-Za-z0-9+/]{2}==. An unpadded final group, [A-Za-z0-9+/]{4}, is already covered by the first part.
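To try the full regular expression out, here is a small Python sketch. The names `BASE64_RE` and `looks_like_base64` are hypothetical. The pattern is the one quoted at the top of this answer, with the groups made non-capturing.

```python
import re

# Zero or more full 4-character groups, optionally followed by one
# padded group ("xxx=" or "xx==").
BASE64_RE = re.compile(
    r"^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)?$"
)


def looks_like_base64(text: str) -> bool:
    """Return True if `text` is syntactically valid padded Base64."""
    # fullmatch avoids accepting a trailing newline, which "$" alone would allow.
    return BASE64_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("Zm9v", "Zm8=", "Zg==", "Zg=", "foo"):
        print(repr(sample), looks_like_base64(sample))
```

The check is purely syntactic: it says nothing about whether the decoded bytes are meaningful, and it accepts the empty string.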
Check whether a string is Base64-encoded in PowerShell
The following returns $true if $item contains a valid Base64-encoded string, and $false otherwise:
try { $null=[Convert]::FromBase64String($item); $true } catch { $false }
The above uses System.Convert.FromBase64String to try to convert the input string $item to the array of bytes it represents. If the call succeeds, the output byte array is ignored ($null = ...), and $true is output. Otherwise, the catch block is entered and $false is returned.
Caveat: Even regular strings can accidentally be technically valid Base64-encoded strings, namely if they happen to contain only characters from the Base64 character set and the character count is a multiple of 4.
For instance, the above test yields $true for "word" (only Base64 characters, and a length that is a multiple of 4), but not for "words" (length is not a multiple of 4).
For example, in the context of an if statement:
- Note: In order for a try/catch statement to serve as an expression in the if conditional, $(), the subexpression operator, must be used.
# Process 2 sample strings, one Base64-encoded, the other not.
foreach ($item in 'foo', 'SGFwcHkgSG9saWRheXM=') {
  if ($(try { $null=[Convert]::FromBase64String($item); $true } catch { $false })) {
    'Base64-encoded: [{0}]; decoded as UTF-8: [{1}]' -f
      $item,
      [Text.Encoding]::UTF8.GetString([Convert]::FromBase64String($item))
  }
  else {
    'NOT Base64-encoded: [{0}]' -f $item
  }
}
The above yields:
NOT Base64-encoded: [foo]
Base64-encoded: [SGFwcHkgSG9saWRheXM=]; decoded as UTF-8: [Happy Holidays]
It's easy to wrap the functionality in a custom helper function, Test-Base64:
# Define function.
# Accepts either a single string argument or multiple strings via the pipeline.
function Test-Base64 {
  param(
    [Parameter(ValueFromPipeline)]
    [string] $String
  )
  process {
    try { $null=[Convert]::FromBase64String($String); $true } catch { $false }
  }
}
# Test two sample strings.
foreach ($item in 'foo', 'SGFwcHkgSG9saWRheXM=') {
  if (Test-Base64 $item) {
    "YES: $item"
  }
  else {
    "NO: $item"
  }
}
For information on converting bytes to and from Base64-encoded strings, see this answer.
How to check whether a string is base64 encoded or not?
If you receive the exact value of the src attribute of <img src="..." />, then embedded image data should be in Data URL format.
A simple regexp can determine whether the URL is a Data URL or a regular one. In Java it can look like this:
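import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The members below belong inside a class.
// Captures the image subtype (e.g. "png") and skips any whitespace after the comma.
private static final Pattern DATA_URL_PATTERN = Pattern.compile("^data:image/(.+?);base64,\\s*", Pattern.CASE_INSENSITIVE);

static void handleImgSrc(String path) {
    if (path.startsWith("data:")) {
        final Matcher m = DATA_URL_PATTERN.matcher(path);
        if (m.find()) {
            String imageType = m.group(1);
            // Everything after the matched prefix is the Base64 payload.
            String base64 = path.substring(m.end());
            // decodeImage(imageType, base64);
        } else {
            // some logging
        }
    } else {
        // downloadImage(path);
    }
}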
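For comparison, the same idea can be sketched in Python, going one step further and decoding the payload. The function name `parse_img_src` and its return shape are assumptions made for illustration, not part of the original answer.

```python
import base64
import binascii
import re

# Same prefix pattern as the Java version: image subtype, then ";base64,".
DATA_URL_RE = re.compile(r"^data:image/(.+?);base64,\s*", re.IGNORECASE)


def parse_img_src(src: str):
    """Return (image_type, raw_bytes) for a Base64 image Data URL, else None."""
    match = DATA_URL_RE.match(src)
    if match is None:
        # Not a Base64 image Data URL; the caller would download it instead.
        return None
    try:
        payload = base64.b64decode(src[match.end():], validate=True)
    except (binascii.Error, ValueError):
        # Prefix looked right but the payload is not valid Base64.
        return None
    return match.group(1), payload


if __name__ == "__main__":
    print(parse_img_src("data:image/png;base64,aGk="))
    print(parse_img_src("https://example.com/a.png"))
```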
Valid Base64 string can't be decoded
Base64 encodes each 6 bits of input as one 8-bit character, so every 3 bytes (24 bits) become 4 characters.
The length of a Base64 string must be divisible by 4; if it is not, as many = characters as needed are added to the end of it (they are not actually part of its content) to make the length divisible by 4.
As the length of your Base64 string is already divisible by 4, there is no need for extra = characters.
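The length arithmetic can be checked with a short Python sketch. Both helper names, `encoded_length` and `add_padding`, are hypothetical and exist only for this illustration.

```python
import base64


def encoded_length(n_bytes: int) -> int:
    """Length of the padded Base64 text for `n_bytes` bytes of input."""
    # Every started group of 3 input bytes yields 4 output characters.
    return 4 * ((n_bytes + 2) // 3)


def add_padding(text: str) -> str:
    """Append '=' characters until the length is a multiple of 4."""
    # Note: a length of 1 modulo 4 can never be valid Base64,
    # so padding such input does not make it decodable.
    return text + "=" * (-len(text) % 4)


if __name__ == "__main__":
    for n in range(7):
        # The formula agrees with what the encoder actually produces.
        assert len(base64.b64encode(b"x" * n)) == encoded_length(n)
    print(add_padding("Zm8"), add_padding("Zg"), add_padding("Zm9v"))
```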
Determine if string is in base64 using JavaScript
If "valid" means "only has base64 chars in it" then check against /^[A-Za-z0-9+\/=]*$/ (anchored, so that the whole string is tested rather than any single character).
If "valid" means a "legal" base64-encoded string then you should also check for the = padding at the end.
If "valid" means it's something reasonable after decoding then it requires domain knowledge.
Is there a bulletproof way to detect base64 encoding in a string in php?
I will post Yoshi's comment as the final conclusion:
I think you're out of luck. The false positives you mention, still are valid base64 encodings. You'd need to judge whether the decoded version makes any sense, but that will probably be a never ending story, and ultimately would probably also result in false positives. – Yoshi
How to detect true base64 on PHP
Base64 is a mapping from an 8-bit to a 6-bit representation of data, so you have just the following options:
- Look for characters outside the Base64 alphabet (anything other than A-Z, a-z, 0-9, +, /) and check the = padding.
- Look at the number of characters (it must be divisible by four).
In this way, you can detect data that is definitely not Base64-encoded. But you cannot confirm that the data really is Base64, since a normal string can also satisfy the requirements of Base64 encoding.
On the other hand, if you know the structure of the data, it is possible to check that the decoding of base64 text fits the structure.
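As one illustration of checking the decoded data against a known structure, the sketch below assumes the payload is expected to be printable UTF-8 text. That expectation is an assumption for this example and will differ per application. The helper name `decodes_to_printable_text` is hypothetical, and Python is used only for brevity.

```python
import base64
import binascii


def decodes_to_printable_text(text: str) -> bool:
    """Return True if `text` is Base64 whose payload is printable UTF-8."""
    try:
        raw = base64.b64decode(text, validate=True)
        # UnicodeDecodeError is a subclass of ValueError.
        decoded = raw.decode("utf-8")
    except (binascii.Error, ValueError):
        return False
    # Reject empty payloads and payloads containing control characters.
    return decoded != "" and decoded.isprintable()


if __name__ == "__main__":
    for sample in ("SGFwcHkgSG9saWRheXM=", "word", "foo"):
        print(sample, decodes_to_printable_text(sample))
```

Here "word" is rejected even though it is syntactically valid Base64, because its decoded bytes are not valid UTF-8 text. False positives remain possible for other inputs.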