Converting from a string to a number

人走茶凉 submitted on 2019-12-06 12:07:55

Here's a program that combines my old code with some new code to perform the inverse operations.

You have a syntax error in your inv_get_digit function: you left the colon off the end of an elif line. And there's no need to do str(c), since c is already a string.

I'm afraid that your decode function doesn't make much sense. It's supposed to take a string as input and return an integer. Please see a working version below.

def get_digit(d):
    ''' Convert a base 64 digit to the desired character '''
    if 0 <= d <= 9:
        # 0 - 9
        c = 48 + d
    elif 10 <= d <= 35:
        # A - Z
        c = 55 + d
    elif 36 <= d <= 61:
        # a - z
        c = 61 + d
    elif d == 62:
        # -
        c = 45
    elif d == 63:
        # +
        c = 43
    else:
        # We should never get here
        raise ValueError('Invalid digit for base 64: ' + str(d)) 
    return chr(c)

print('Testing get_digit') 
digits = ''.join([get_digit(d) for d in range(64)])
print(digits)

def inv_get_digit(c):
    ''' Convert a base 64 character back to its digit value (0 - 63) '''
    if '0' <= c <= '9':
        d = ord(c) - 48
    elif 'A' <= c <= 'Z':
        d = ord(c) - 55
    elif 'a' <= c <= 'z':
        d = ord(c) - 61
    elif c == '-':
        d = 62
    elif c == '+':
        d = 63
    else:
        raise ValueError('Invalid input: ' + c)
    return d

print('\nTesting inv_get_digit') 
nums = [inv_get_digit(c) for c in digits]
print(nums == list(range(64)))

def encode(n):
    ''' Convert non-negative integer n to a base 64 string: least
    significant digit first, zero-padded to at least 6 characters '''
    out = []
    while n:
        n, r = n // 64, n % 64
        out.append(get_digit(r))
    while len(out) < 6:
        out.append('0')
    return ''.join(out)

print('\nTesting encode')
numdata = (0, 9876543210, 68719476735)
strdata = []
for i in numdata:
    s = encode(i)
    print(i, s)
    strdata.append(s)

def decode(s):
    ''' Convert a base 64 string (least significant digit first) to an integer '''
    n = 0
    for c in reversed(s):
        d = inv_get_digit(c)
        n = 64 * n + d
    return n

print('\nTesting decode')
for s, oldn in zip(strdata, numdata):
    n = decode(s)
    print(s, n, n == oldn)

output

Testing get_digit
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+

Testing inv_get_digit
True

Testing encode
0 000000
9876543210 gR1iC9
68719476735 ++++++

Testing decode
000000 0 True
gR1iC9 9876543210 True
++++++ 68719476735 True
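
The decode loop above accumulates the value with repeated multiply-and-add over the reversed string. As a cross-check, the same result can be written as a plain positional sum. This is only a sketch: ALPHABET and decode_positional are names made up for this illustration, and the alphabet is assumed to be the same 64 characters that get_digit prints.

```python
from string import digits, ascii_uppercase, ascii_lowercase

# Assumed to match the output of get_digit: 0-9, A-Z, a-z, '-', '+'
ALPHABET = digits + ascii_uppercase + ascii_lowercase + '-+'

def decode_positional(s):
    ''' Decode a base 64 string whose least significant digit comes
    first by summing digit * 64**position. '''
    # str.index raises ValueError for a character outside the alphabet
    return sum(ALPHABET.index(c) * 64 ** i for i, c in enumerate(s))

for s in ('000000', 'gR1iC9', '++++++'):
    print(s, decode_positional(s))
```

Because the first character is the least significant digit, position i carries weight 64**i. The multiply-and-add loop in decode avoids computing the powers explicitly, which is why it is usually preferred.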

I would like to do this WITHOUT strings.

First, you need to clarify what this means. What you offer as a working encoder uses these strings:

out.append('0')
return ''.join(out)

And the accepted solution adds these strings:

digits = ''.join([get_digit(d) for d in range(64)])
if '0' <= c <= '9':
elif 'A' <= c <= 'Z':
elif 'a' <= c <= 'z':
elif c == '-':
elif c == '+':

Do you mean that single-character strings are acceptable but multi-character strings are not? Or do you mean that you don't want to use a str as a data structure and wish to minimize string operations?
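
If the strictest reading is intended, one possibility is to stop at the digit values and never map them to characters at all. The following is only a sketch under that assumption: to_digits and from_digits are hypothetical names, and the padding width of 6 is copied from the encoder in the question.

```python
def to_digits(n, base=64, width=6):
    ''' Return the digits of non-negative int n in the given base,
    least significant first, zero-padded to at least `width` digits. '''
    out = []
    while n:
        n, r = divmod(n, base)
        out.append(r)
    # A negative repeat count yields an empty list, so long results are untouched
    out.extend([0] * (width - len(out)))
    return out

def from_digits(ds, base=64):
    ''' Inverse of to_digits: rebuild the integer from its digit list. '''
    n = 0
    for d in reversed(ds):
        n = base * n + d
    return n

print(to_digits(4096))
print(from_digits(to_digits(9876543210)))
```

This uses lists of small integers in place of str objects, at the cost of a result that is not directly printable as a compact code.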

I feel your solution, and the accepted solution which builds upon it, are doing too many operations at encode and decode time. I recommend a little bit of work up front to build data structures and less effort when working through the data:

from string import digits, ascii_lowercase, ascii_uppercase

BASE10_TO_BASE64 = list(digits + ascii_uppercase + ascii_lowercase + '-' + '+')

BASE64_TO_BASE10 = {base64: base10 for base10, base64 in enumerate(BASE10_TO_BASE64)}

ZEROS = ['0'] * 6

def encode(number):
    ''' Convert base 10 int to reversed base 64 str '''

    characters = []

    while number:
        number, remainder = divmod(number, 64)
        characters.append(BASE10_TO_BASE64[remainder])

    return ''.join(characters + ZEROS[:max(len(ZEROS) - len(characters), 0)])

def decode(string):
    ''' Convert reversed base 64 str to base 10 int '''

    number = 0

    for character in string[::-1]:
        digit = BASE64_TO_BASE10[character]
        number = 64 * number + digit

    return number

if __name__ == "__main__":

    NUMBERS = (4096, 9876543210, 68719476735)
    strings = []

    print("Encode:")
    for number in NUMBERS:
        string = encode(number)
        print(number, string)
        strings.append(string)

    print("\nDecode:")
    for string in strings:
        number = decode(string)
        print(string, number)

Reversing the digit order and zero-padding complicate the program without adding anything. We usually expect '100' to represent the square of the base, but here 64 squared (4096) encodes as '001000'.

OUTPUT

> python3 test.py
Encode:
4096 001000
9876543210 gR1iC9
68719476735 ++++++

Decode:
001000 4096
gR1iC9 9876543210
++++++ 68719476735
> 
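
For comparison with the reversed, zero-padded form used above, here is a sketch of the conventional layout: most significant digit first and no padding. The names ALPHABET, VALUES, encode_msd and decode_msd are made up for this example, and the output is not compatible with the strings produced by the programs above.

```python
from string import digits, ascii_uppercase, ascii_lowercase

ALPHABET = digits + ascii_uppercase + ascii_lowercase + '-+'
VALUES = {ch: i for i, ch in enumerate(ALPHABET)}

def encode_msd(number):
    ''' Convert a non-negative int to base 64, most significant digit first '''
    if number == 0:
        return '0'
    characters = []
    while number:
        number, remainder = divmod(number, 64)
        characters.append(ALPHABET[remainder])
    # Digits were produced least significant first, so reverse them
    return ''.join(reversed(characters))

def decode_msd(string):
    ''' Convert a most-significant-first base 64 str back to an int '''
    number = 0
    for character in string:
        number = 64 * number + VALUES[character]
    return number

print(encode_msd(4096))      # 100
print(decode_msd('100'))     # 4096
```

In this layout '100' is the square of the base, as one would expect, and decoding no longer needs to reverse the input.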