Base 62 Conversion

Base 62 conversion

There is no standard module for this, but I have written my own functions to achieve that.

BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode(num, alphabet=BASE62):
    """Encode a non-negative integer into Base X and return the string.

    Arguments:
    - `num`: The number to encode
    - `alphabet`: The alphabet to use for encoding
    """
    if num == 0:
        return alphabet[0]
    arr = []
    arr_append = arr.append  # Extract bound-method for faster access.
    _divmod = divmod  # Access to locals is faster.
    base = len(alphabet)
    while num:
        num, rem = _divmod(num, base)
        arr_append(alphabet[rem])
    arr.reverse()
    return ''.join(arr)

def decode(string, alphabet=BASE62):
    """Decode a Base X encoded string into the number.

    Arguments:
    - `string`: The encoded string
    - `alphabet`: The alphabet to use for decoding
    """
    base = len(alphabet)
    strlen = len(string)
    num = 0

    idx = 0
    for char in string:
        power = (strlen - (idx + 1))
        num += alphabet.index(char) * (base ** power)
        idx += 1

    return num

Note that you can pass any alphabet for encoding and decoding. If you leave the alphabet argument out, you get the 62-character alphabet defined on the first line of code, and hence encode to and decode from base 62.

Hope this helps.

PS - For URL shorteners, I have found that it's better to leave out a few confusing characters like 0Ol1oI etc. Thus I use this alphabet for my URL shortening needs - "23456789abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ"
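As a rough illustration of the alphabet argument, here is a condensed restatement of the two functions with a round trip through both alphabets. The name `URL_SAFE` is made up for this sketch; the 56-character alphabet itself is the one quoted in the PS.

```python
BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
# Reduced alphabet from the PS: no 0, 1, l, o, I or O (name is hypothetical).
URL_SAFE = "23456789abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ"


def encode(num, alphabet=BASE62):
    """Encode a non-negative integer; the base is len(alphabet)."""
    if num == 0:
        return alphabet[0]
    base = len(alphabet)
    digits = []
    while num:
        num, rem = divmod(num, base)
        digits.append(alphabet[rem])
    return "".join(reversed(digits))


def decode(string, alphabet=BASE62):
    """Decode a string that was produced with the same alphabet."""
    base = len(alphabet)
    num = 0
    for idx, char in enumerate(string):
        power = len(string) - (idx + 1)
        num += alphabet.index(char) * (base ** power)
    return num


n = 123456789
short_default = encode(n)             # base 62, default alphabet
short_url = encode(n, URL_SAFE)       # base 56, no look-alike characters
assert decode(short_default) == n
assert decode(short_url, URL_SAFE) == n
```

The same alphabet has to be passed to both calls; decoding a `URL_SAFE` string with the default alphabet gives a different number or raises `ValueError`.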

Have fun.

converting a number base 10 to base 62 (a-zA-Z0-9)

OLD: A quick and dirty solution can be to use a function like this:

function toChars($number) {
    $res = base_convert($number, 10, 26);
    $res = strtr($res, '0123456789', 'qrstuvxwyz');
    return $res;
}

base_convert translates your number to base 26, whose digits are 0-9 and a-p.
A quick character substitution then replaces the digits 0-9 with the unused letters q-z, so the result contains letters only.

As you may observe, the function is easily reversible.

function toNum($number) {
    $res = strtr($number, 'qrstuvxwyz', '0123456789');
    // Convert the substituted string, not the original input.
    $res = base_convert($res, 26, 10);
    return $res;
}
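The same letters-only trick can be sketched in Python. This mirrors the idea rather than the PHP API: the names `to_chars`, `to_num`, `BASE26_DIGITS`, `TO_LETTERS` and `TO_DIGITS` are invented for the example, and the result is base 26, not base 62.

```python
import string

# Digits of a base-26 positional notation: 0-9 followed by a-p.
BASE26_DIGITS = string.digits + string.ascii_lowercase[:16]
# Same substitution as the PHP snippet (note x comes before w there too).
TO_LETTERS = str.maketrans("0123456789", "qrstuvxwyz")
TO_DIGITS = str.maketrans("qrstuvxwyz", "0123456789")


def to_chars(number):
    """Render a non-negative integer using lowercase letters only."""
    if number == 0:
        return "0".translate(TO_LETTERS)
    out = []
    while number:
        number, rem = divmod(number, 26)
        out.append(BASE26_DIGITS[rem])
    return "".join(reversed(out)).translate(TO_LETTERS)


def to_num(text):
    """Invert to_chars(): undo the substitution, then parse as base 26."""
    return int(text.translate(TO_DIGITS), 26)
```

Because the substitution is a one-to-one mapping between two disjoint character sets, applying it in reverse before parsing recovers the original number.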

By the way, what would you use this function for?


Edit:

Based on the change to the question and on @jnpcl's answer, here is a set of functions that perform the base conversion without using pow and log (they take half the time to complete the tests).

The functions work for integer values only.

function toBase($num, $b=62) {
    $base = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $r = $num % $b;
    $res = $base[$r];
    $q = floor($num / $b);
    while ($q) {
        $r = $q % $b;
        $q = floor($q / $b);
        $res = $base[$r] . $res;
    }
    return $res;
}

function to10($num, $b=62) {
    $base = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $limit = strlen($num);
    $res = strpos($base, $num[0]);
    for ($i = 1; $i < $limit; $i++) {
        $res = $b * $res + strpos($base, $num[$i]);
    }
    return $res;
}
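The two functions avoid pow and log by using repeated division in one direction and Horner's rule (multiply the running total by the base, then add the next digit) in the other. A minimal Python sketch of the same approach, with `to_base`, `to_10` and `BASE` as names chosen for this example:

```python
BASE = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_base(num, b=62):
    """Repeated division: peel off the least significant digit each step."""
    res = BASE[num % b]
    q = num // b              # exact integer division
    while q:
        res = BASE[q % b] + res
        q //= b
    return res


def to_10(text, b=62):
    """Horner's rule: total = b * total + value of the next digit."""
    res = 0
    for ch in text:
        res = b * res + BASE.index(ch)
    return res


# Round trip, in the spirit of the test loop in this answer.
for i in range(100000):
    assert to_10(to_base(i)) == i
```

Python's `//` stays in integer arithmetic throughout, whereas `floor($num/$b)` in the PHP version goes through a float, so very large inputs may behave differently there.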

The test:

for ($i = 0; $i < 1000000; $i++) {
    $x = toBase($i);
    $y = to10($x);
    if ($i - $y)
        echo "\n$i -> $x -> $y";
}

Convert a string into BASE62

Background on BINARY to TEXT Encoding schemes:

https://en.wikipedia.org/wiki/Base62

https://en.wikipedia.org/wiki/Base64

Good explanation of the BASE62 encoding scheme:

https://www.codeproject.com/Articles/1076295/Base-Encode

Try one of the C# libraries listed below, which add extension methods for converting a byte array to and from BASE62 (a binary-to-text encoding scheme).

There are plenty of Base62 libraries on GitHub; have a look:

  • https://github.com/JoyMoe/Base62.Net
  • https://github.com/ghost1face/base62
  • https://github.com/rossdempster/base62csharp
  • https://github.com/renmengye/base62-csharp (comments below claim it doesn't work; raise any issues with the maintainers)

If your source data is contained in a "string", you first need to convert it to a suitable byte array.

Be careful to use the correct string-to-byte conversion call, since you may want the bytes to be the ASCII characters, a Unicode byte stream, etc., e.g. Encoding.UTF8.GetBytes(text) or System.Text.ASCIIEncoding.ASCII.GetBytes(text).

byte[] bytestoencode = ..... 

string encodedasBASE62 = bytestoencode.ToBase62();

.....

byte[] bytesdecoded = encodedasBASE62.FromBase62();
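To show why the choice of encoding matters, here is a small Python sketch of the string to bytes to Base62 pipeline. It is not the scheme used by any of the libraries listed above, so its output will not match theirs; `bytes_to_base62`, `base62_to_bytes` and the sentinel-byte trick are assumptions made for this example only.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def bytes_to_base62(data: bytes) -> str:
    """Treat the bytes as one big-endian integer and write it in base 62."""
    # A leading 0x01 sentinel byte keeps leading zero bytes from being lost.
    num = int.from_bytes(b"\x01" + data, "big")
    out = []
    while num:
        num, rem = divmod(num, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))


def base62_to_bytes(text: str) -> bytes:
    """Invert bytes_to_base62(), dropping the sentinel byte again."""
    num = 0
    for ch in text:
        num = num * 62 + ALPHABET.index(ch)
    raw = num.to_bytes((num.bit_length() + 7) // 8, "big")
    return raw[1:]


text = "héllo"
as_utf8 = bytes_to_base62(text.encode("utf-8"))
as_utf16 = bytes_to_base62(text.encode("utf-16-le"))
# Same string, different byte streams, therefore different Base62 output.
assert as_utf8 != as_utf16
assert base62_to_bytes(as_utf8).decode("utf-8") == text
```

Whichever encoding is used to produce the bytes must also be used when turning the decoded bytes back into a string.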

Bash decimal to base 62 conversion

I do really appreciate the solution you came up with, and I guess there's no way around it straight with bash. Here's the little point you've missed:

BASE62=($(echo {0..9} {a..z} {A..Z}))
for i in $(bc <<< "obase=62; 9207903953"); do
    echo -n "${BASE62[$(( 10#$i ))]}"
done && echo

Output:

a39qrT
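For output bases above 16, bc prints each digit as a space-separated decimal number, which is why the loop uses each value as an array index; the `10#` prefix forces base-10 parsing so that groups such as 09 are not read as octal. The same two steps can be sketched in Python, with `digit_values` as a name chosen for this example:

```python
import string

# Same ordering as the Bash array: 0-9, a-z, A-Z.
BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase


def digit_values(num, base=62):
    """Return the digits of num in the given base, most significant first."""
    if num == 0:
        return [0]
    values = []
    while num:
        num, rem = divmod(num, base)
        values.append(rem)
    return values[::-1]


values = digit_values(9207903953)
print(values)                                # [10, 3, 9, 26, 27, 55]
print("".join(BASE62[v] for v in values))    # a39qrT
```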

Base 62 conversion in Objective-C

Your code is fine. If anything, make it more generic. Here is a recursive version for any base (same code):

#import <Foundation/Foundation.h>

@interface BaseConversion : NSObject
+(NSString*) formatNumber:(NSUInteger)n toBase:(NSUInteger)base;
+(NSString*) formatNumber:(NSUInteger)n usingAlphabet:(NSString*)alphabet;
@end

@implementation BaseConversion

// Uses the alphabet length as base.
+(NSString*) formatNumber:(NSUInteger)n usingAlphabet:(NSString*)alphabet
{
    NSUInteger base = [alphabet length];
    if (n<base){
        // direct conversion
        NSRange range = NSMakeRange(n, 1);
        return [alphabet substringWithRange:range];
    } else {
        return [NSString stringWithFormat:@"%@%@",

            // Get the number minus the last digit and do a recursive call.
            // Note that integer division drops the remainder, e.g. 769/10 = 76
            [self formatNumber:n/base usingAlphabet:alphabet],

            // Get the last digit and perform direct conversion with the result.
            [alphabet substringWithRange:NSMakeRange(n%base, 1)]];
    }
}

+(NSString*) formatNumber:(NSUInteger)n toBase:(NSUInteger)base
{
    NSString *alphabet = @"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"; // 62 digits
    // %lu matches the (unsigned long) cast.
    NSAssert([alphabet length]>=base,@"Not enough characters. Use base %lu or lower.",(unsigned long)[alphabet length]);
    return [self formatNumber:n usingAlphabet:[alphabet substringWithRange:NSMakeRange (0, base)]];
}

@end

int main(int argc, char *argv[]) {
    @autoreleasepool {
        NSLog(@"%@",[BaseConversion formatNumber:3735928559 toBase:16]); // deadbeef
        return EXIT_SUCCESS;
    }
}
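The recursion above reads as: format everything except the last digit, then append the last digit. A minimal Python sketch of the same structure, with `format_number`, `format_in_base` and `DIGITS` as names chosen for this example:

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def format_number(n, alphabet=DIGITS):
    """Recursively format a non-negative integer; the base is len(alphabet)."""
    base = len(alphabet)
    if n < base:
        # Base case: a single digit converts directly.
        return alphabet[n]
    # Recursive case: all but the last digit, followed by the last digit.
    return format_number(n // base, alphabet) + alphabet[n % base]


def format_in_base(n, base):
    """Format n in any base from 2 up to len(DIGITS)."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and %d" % len(DIGITS))
    return format_number(n, DIGITS[:base])


print(format_in_base(3735928559, 16))  # deadbeef
```

The recursion depth equals the number of digits in the output, so it stays small for ordinary machine-sized integers.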

A Swift 3 version: https://gist.github.com/j4n0/056475333d0ddfe963ac5dc44fa53bf2

Please explain this base 62 PHP conversion function/algorithm

You're over-complicating:

private static $_characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

private static function _convertBase($num)
{
    $base = strlen(self::$_characters); // 62
    $string = self::$_characters[$num % $base];

    while (($num = intval($num / $base)) > 0)
    {
        $string = self::$_characters[$num % $base] . $string;
    }

    return $string;
}

