Determine If String Is in Base64 Using JavaScript

Determine if string is in base64 using JavaScript

If "valid" means "only has base64 chars in it", then check against the anchored pattern /^[A-Za-z0-9+\/=]+$/, so that every character in the string must match.

If "valid" means a "legal" base64-encoded string, then you should also check that the length is a multiple of 4 and that = appears only as padding at the end (at most two characters).

If "valid" means it's something reasonable after decoding then it requires domain knowledge.
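The first two meanings can be sketched in a few lines. This is only an illustration, assuming standard (not URL-safe) base64 with padding; the function names hasOnlyBase64Chars and isWellFormedBase64 are made up for this example.

```javascript
// Loose check: every character is in the base64 alphabet or is "=".
function hasOnlyBase64Chars(str) {
  return /^[A-Za-z0-9+\/=]+$/.test(str);
}

// Stricter check: length is a multiple of 4 and "=" appears only as
// one or two trailing padding characters.
function isWellFormedBase64(str) {
  return str.length % 4 === 0 && /^[A-Za-z0-9+\/]+={0,2}$/.test(str);
}

console.log(hasOnlyBase64Chars("ab=cd"));    // true: only allowed characters
console.log(isWellFormedBase64("ab=cd"));    // false: bad length, "=" in the middle
console.log(isWellFormedBase64("aGVsbG8=")); // true: "hello" encoded
```

Neither function says anything about the third meaning: whether the decoded bytes are sensible still depends on what you expect them to be.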

How to check if a string is plaintext or base64 format in Node.js

Encoding is byte level.
If you're dealing in strings, then all you can do is guess, or keep metadata alongside your string that identifies its encoding.

But you can check these libraries out:

  1. https://www.npmjs.com/package/detect-encoding
  2. https://github.com/mooz/node-icu-charset-detector
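If a guess is good enough, one common heuristic in Node.js is a round trip: decode the string as base64, re-encode it, and compare. The sketch below uses only the built-in Buffer; the function name looksLikeBase64 is hypothetical, and it assumes standard padded base64.

```javascript
// Round-trip check: decode as base64, re-encode, and compare.
// Node's decoder is lenient (it skips characters outside the alphabet),
// so the comparison is what rejects non-base64 input.
function looksLikeBase64(str) {
  if (typeof str !== "string" || str.length === 0) return false;
  return Buffer.from(str, "base64").toString("base64") === str;
}

console.log(looksLikeBase64("aGVsbG8gd29ybGQ=")); // true: "hello world" encoded
console.log(looksLikeBase64("hello world"));      // false: the space does not survive
console.log(looksLikeBase64("test"));             // true, although it may be plaintext
```

The last call shows why this remains a guess: a short plaintext word can also be valid base64, so only metadata can settle the question.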

Detect base64 encoding

Must it really be a jQuery plugin? Just use a simple JavaScript regex match:

var base64Matcher = new RegExp("^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{4})$");

// ...

if (base64Matcher.test(someString)) {
    // It's likely base64 encoded.
} else {
    // It's not standard (padded) base64.
}

The regex pattern is taken from this question: RegEx to parse or validate Base64 data.

How to check whether a string is Base64 encoded or not

You can use the following regular expression to check if a string constitutes a valid base64 encoding:

^([A-Za-z0-9+/]{4})*([A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)?$

In base64 encoding, the character set is A-Z, a-z, 0-9, + and /. The output is produced in groups of 4 characters; if the final group would contain fewer than 4, it is padded with '=' characters.

^([A-Za-z0-9+/]{4})* means the string starts with 0 or more complete base64 groups.

([A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==)?$ means the string may end with one padded group in one of two forms: [A-Za-z0-9+/]{3}= or [A-Za-z0-9+/]{2}==. Because every part of the pattern is optional, the empty string also matches.
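To see how the pattern above behaves, here is a small sketch that runs it against a few hand-picked samples. The variable names are made up for the example; the regex is the one quoted above, written as a JavaScript literal with the / escaped.

```javascript
const base64Pattern =
  /^([A-Za-z0-9+\/]{4})*([A-Za-z0-9+\/]{3}=|[A-Za-z0-9+\/]{2}==)?$/;

// Each sample is paired with the result the pattern should give.
const samples = [
  ["TWFu", true],   // one full group
  ["TWE=", true],   // three characters plus one pad
  ["TQ==", true],   // two characters plus two pads
  ["", true],       // zero groups and no padded group
  ["TQ=", false],   // length is not a multiple of 4
  ["TWFu=", false], // padding after a complete group
  ["TW Fu", false], // whitespace is not in the alphabet
];

for (const [text, expected] of samples) {
  console.log(JSON.stringify(text), base64Pattern.test(text), expected);
}
```

If the empty string should be rejected, add a length check before testing the pattern.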

Determine if a base64 string or a buffer contains JPEG or PNG without metadata? Possible?

The first eight bytes of a PNG file always contain the following values - see PNG Specification:

(decimal)            137   80   78   71   13   10   26   10
(hexadecimal)         89   50   4e   47   0d   0a   1a   0a
(ASCII C notation)  \211    P    N    G   \r   \n \032   \n

So, if I take 8 bytes from the start of any PNG file and base64 encode it as follows, I get:

head -c8 test.png | base64
iVBORw0KGgo=

The first 2 bytes of every JPEG file contain ff d8 in hex - see Wikipedia entry for JPEG. So if I take any JPEG file and base64 encode the first two bytes as follows, I get:

head -c2 test.jpg | base64
/9g=

So my suggestion would be to look at the first few characters of your base64-encoded file (10 for PNG and 2 for JPEG, always excluding the =) and see if they match the values above, then use that as the determinant. Be sure to output an error message if your string matches neither, in case the test is not sufficiently thorough for some reason!


Why 10 characters for PNG? Because the guaranteed signature is 8 bytes, i.e. 64 bits and base64 splits into 6 bits at a time to generate a character, so the first 10 characters are the first 60 bits. The 11th character will vary depending on what follows the signature.

Same logic for JPEG... 2 bytes is 16 bits, which means 2 characters each corresponding to 6 bits are guaranteed. The 3rd character will vary depending on what follows the 2-byte SOI marker.
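As a rough sketch of that prefix test in JavaScript: the constant and function names below are invented for illustration, and the input is assumed to be raw base64 with no "data:image/...;base64," prefix in front of it.

```javascript
// Prefixes taken from the encoded signatures shown earlier, "=" excluded.
const PNG_PREFIX = "iVBORw0KGg"; // first 10 characters, fixed by the 8-byte signature
const JPEG_PREFIX = "/9";        // first 2 characters, fixed by ff d8

// Returns "png", "jpeg", or null when neither prefix matches.
function detectImageType(base64) {
  if (base64.startsWith(PNG_PREFIX)) return "png";
  if (base64.startsWith(JPEG_PREFIX)) return "jpeg";
  return null;
}

// In Node.js, encoding the raw signature bytes reproduces the strings above.
const pngHead = Buffer.from([137, 80, 78, 71, 13, 10, 26, 10]).toString("base64");
const jpegHead = Buffer.from([0xff, 0xd8]).toString("base64");

console.log(pngHead, detectImageType(pngHead));   // iVBORw0KGgo= png
console.log(jpegHead, detectImageType(jpegHead)); // /9g= jpeg
console.log(detectImageType("aGVsbG8="));         // null
```

A null result is the "matches neither" case, which is where the caller should report an error rather than assume a type.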

How to validate Base64 string using regex in JavaScript?

See if this answer from another user fits your case:

^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$

Check if string is base64

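You can use something like this in Ruby. It is not very performant, but it only accepts strings that survive a decode and re-encode round trip unchanged, so anything that is not exact base64 is rejected: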

require 'base64'

def base64?(value)
  value.is_a?(String) && Base64.strict_encode64(Base64.decode64(value)) == value
end

The use of strict_encode64 versus encode64 prevents Ruby from inadvertently inserting newlines if you have a long string. See this post for details.


