How to Find Out If a String Has Already Been URL Encoded

How to find out if string has already been URL encoded?

Decode the string and compare the result to the original. If they differ, the original was encoded; if they are identical, it wasn't. That still tells you nothing about whether the decoded version is itself encoded again, so repeat the check on the result. A good task for recursion.

I hope one can't write a quine in urlencode, or this algorithm would get stuck.

Exception: when a string contains a "+" character, the URL decoder replaces it with a space even though the string is not URL encoded, so the comparison reports a false positive.
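A minimal Java sketch of the decode-and-compare check, written so that a literal "+" does not trigger the false positive described above. The class and method names are made up for illustration, and the sketch assumes Java 10 or later (for the URLDecoder.decode overload that takes a Charset) and UTF-8 escapes.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the names are illustrative only.
final class EncodedCheck {

    // Decodes percent-escapes once. A literal '+' is first rewritten to "%2B"
    // so that URLDecoder does not turn it into a space.
    static String decodeOnce(String s) {
        return URLDecoder.decode(s.replace("+", "%2B"), StandardCharsets.UTF_8);
    }

    // True if decoding changes the string, i.e. it contained at least one escape.
    static boolean looksEncoded(String s) {
        try {
            return !decodeOnce(s).equals(s);
        } catch (IllegalArgumentException e) {
            // Malformed escape such as "%zz" or a trailing "%": treat as not encoded.
            return false;
        }
    }
}
```

With this guard, "a+b" is reported as not encoded while "a%2Bb" is. If the input is known to be form-encoded, where "+" really does mean a space, drop the replace call.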

How to know whether a string is url encoded in iOS?

Check the accepted answer to "How to find out if string has already been URL encoded?". It suggests decoding the string and comparing the result to the original: if they differ, the original was encoded; if they are identical, it wasn't. Because the decoded version may itself still be encoded, that answer repeats the check recursively.

Since you are working with your own API response, and the URL string will be either encoded once or plain text, you can just decode once and compare the result with the original string.

Straightforward approach: decode and check whether the result matches the original string.

  1. Match: the string was not encoded, so validate the original string as a URL using a regexp.
  2. No match: the string was encoded, so validate the decoded string as a URL using a regexp.
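The two steps above can be sketched in Java as follows. This is only an illustration: the class name, the method name, and the deliberately loose URL pattern are assumptions, not part of the original answer, and Java 10 or later is assumed for the Charset overload of URLDecoder.decode.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
final class ApiUrlNormalizer {

    // Placeholder pattern: substitute whatever URL regexp your application uses.
    private static final Pattern URL_PATTERN = Pattern.compile("^https?://\\S+$");

    // Returns the plain (decoded) URL if the value is a valid URL, else empty.
    static Optional<String> plainUrl(String value) {
        String decoded;
        try {
            // Protect a literal '+' so it is not decoded to a space.
            decoded = URLDecoder.decode(value.replace("+", "%2B"), StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            decoded = value; // malformed escape: treat the value as plain text
        }
        // Match: validate the original. No match: validate the decoded string.
        String candidate = decoded.equals(value) ? value : decoded;
        return URL_PATTERN.matcher(candidate).matches()
                ? Optional.of(candidate)
                : Optional.empty();
    }
}
```

Decoding only once is safe here because of the stated assumption that the API returns either a singly encoded URL or plain text.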

Test if string is URL encoded in PHP

You'll never know for sure if a string is URL-encoded or if it was supposed to have the sequence %2B in it. Instead, it probably depends on where the string came from, i.e. if it was hand-crafted or from some application.

Is it better to search the string for characters that would have been encoded but aren't, and if any exist, conclude that it's not encoded?

I think this is a better approach, since it would take care of things that have been done programmatically (assuming the application would not have left a non-encoded character behind).

One thing that will be confusing here... Technically, the % "should be" encoded if it will be present in the final value, since it is a special character. You might have to combine your approaches to look for should-be-encoded characters as well as validating that the string decodes successfully if none are found.
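A sketch of the combined check in Java, assuming the string is a single URL component (for example a query value) and that "should be encoded" means anything outside the RFC 3986 unreserved set (letters, digits, "-", ".", "_", "~"). The class name and the three-way verdict are illustrative choices, not an established API.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
final class ComponentInspector {

    enum Verdict { NOT_ENCODED, ENCODED, UNCHANGED_BY_ENCODING }

    // Only unreserved characters and well-formed %XX escapes are allowed.
    private static final Pattern SAFE =
            Pattern.compile("(?:[A-Za-z0-9._~-]|%[0-9A-Fa-f]{2})*");

    static Verdict inspect(String component) {
        // A raw space, '&', '/', or a '%' without two hex digits means
        // some character was left unencoded.
        if (!SAFE.matcher(component).matches()) {
            return Verdict.NOT_ENCODED;
        }
        // No offending characters: it is encoded if it has at least one escape,
        // otherwise encoding would not have changed it anyway.
        return component.indexOf('%') >= 0
                ? Verdict.ENCODED
                : Verdict.UNCHANGED_BY_ENCODING;
    }
}
```

The ENCODED verdict is still a guess: a value such as "a%2Bb" passes the check even if the sender meant those five characters literally, which is the ambiguity described above.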

How to know if a URL is decoded/encoded?

Repeatedly decoding until no % signs remain will work over 99% of the time. It works even better if you keep calling the decoder only for as long as a match for /%[0-9a-f]{2}/i can be found.

However, if I were (for some bizarre reason) to name a file 100%achieved, that would cause a problem: %ac would be treated as an escape sequence (byte 0xAC, which is ¬ in Latin-1), so the name is either corrupted or the decode fails outright. Unfortunately there's no way to detect this case.

Ideally you should know if something is encoded more than once, and optimally you shouldn't let it happen in the first place.
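The repeated-decoding idea might look like this in Java. It is a sketch under several assumptions: UTF-8 escapes, Java 10 or later, a literal "+" that should be preserved, and an arbitrary cap on the number of rounds so that unexpected input cannot loop for long. All names are hypothetical.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
final class RepeatedDecoder {

    // Same test as /%[0-9a-f]{2}/i.
    private static final Pattern ESCAPE =
            Pattern.compile("%[0-9a-f]{2}", Pattern.CASE_INSENSITIVE);

    // Arbitrary safety limit, not taken from the original answer.
    private static final int MAX_ROUNDS = 5;

    static String decodeFully(String s) {
        String current = s;
        for (int i = 0; i < MAX_ROUNDS && ESCAPE.matcher(current).find(); i++) {
            String next;
            try {
                // Protect a literal '+' so it is not decoded to a space.
                next = URLDecoder.decode(current.replace("+", "%2B"), StandardCharsets.UTF_8);
            } catch (IllegalArgumentException e) {
                break; // a stray '%' elsewhere in the string: keep what we have
            }
            if (next.equals(current)) {
                break; // nothing changed, so further rounds are pointless
            }
            current = next;
        }
        return current;
    }
}
```

A doubly encoded value such as "a%2520b" comes back as "a b". A name like 100%achieved is still mangled by this loop, because nothing distinguishes its %ac from a real escape.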

Detect encoding of URL in Java

If you can assume that the original (decoded) text consists only of letters and digits, the following would work for:

  • "häßlich"
  • "h%C3%A4%C3%9Flich"
  • "h%E4%DFlich"

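import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Check this first: if the raw string is already purely alphanumeric,
// it is plain text and needs no decoding (e.g. "häßlich").
public static boolean isUtf8Encoded(String url) {
    return isAlphaNumeric(url);
}

// True if decoding the escapes as UTF-8 yields only letters and digits.
public static boolean isUrlUtf8Encoded(String url)
        throws UnsupportedEncodingException {
    return isAlphaNumeric(URLDecoder.decode(url, "UTF-8"));
}

// True if decoding the escapes as ISO-8859-1 yields only letters and digits.
public static boolean isUrlIsoEncoded(String url)
        throws UnsupportedEncodingException {
    return isAlphaNumeric(URLDecoder.decode(url, "ISO-8859-1"));
}

// A wrong charset guess produces symbols or replacement characters,
// which fail this test.
private static boolean isAlphaNumeric(String decode) {
    for (char c : decode.toCharArray()) {
        if (!Character.isLetterOrDigit(c)) {
            return false;
        }
    }
    return true;
}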

How can I know if url-encoded string is UTF-8 or Latin-1 with PHP?

mb_detect_encoding() is normally useless with the default second parameter, so pass an explicit list of candidate encodings:

<?php

$x1 = 'Cl%C3%A9ment';
$x2 = 'Cl%E9ment';

$encoding_list = array('utf-8', 'iso-8859-1');

var_dump(
    mb_detect_encoding(urldecode($x1), $encoding_list),
    mb_detect_encoding(urldecode($x2), $encoding_list)
);

... prints:

string(5) "UTF-8"
string(10) "ISO-8859-1"
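For comparison, the same "try UTF-8 first, fall back to ISO-8859-1" idea can be sketched in Java with a strict UTF-8 decoder. This is not what mb_detect_encoding() does internally; it is a simplified stand-in. The class and method names are hypothetical, and characters outside the escapes are assumed to be ASCII.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the names are illustrative only.
final class EscapeCharsetGuess {

    // Turns well-formed "%XX" escapes into raw bytes; everything else is
    // copied through and assumed to be ASCII.
    static byte[] percentDecodeToBytes(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo);
                    i += 2; // skip the two hex digits
                    continue;
                }
            }
            out.write(c);
        }
        return out.toByteArray();
    }

    // "UTF-8" if the decoded bytes are valid UTF-8, otherwise "ISO-8859-1".
    static String guessCharset(String encoded) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(percentDecodeToBytes(encoded)));
            return "UTF-8";
        } catch (CharacterCodingException e) {
            return "ISO-8859-1";
        }
    }
}
```

The fallback is only a guess: any byte sequence is valid ISO-8859-1, so the second answer really means "not valid UTF-8". Pure ASCII input is reported as UTF-8.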

How to determine if a string has been encoded programmatically in C#?

You can use HttpUtility.HtmlDecode() to decode the string, then compare the result with the original string. If they're different, the original string was probably encoded (at least, the routine found something to decode inside):

using System.Web;

// True if HTML-decoding changes the text, i.e. it contained at least one entity.
public bool IsHtmlEncoded(string text)
{
    return HttpUtility.HtmlDecode(text) != text;
}

