Question
So, I am trying to write a program to decode 6-character base-64 numbers.
Here is the problem statement:
Return the 36-bit number represented as a base-64 number in reverse order by the 6-character string s where the order of the 64 numerals is: 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+
For example:
decode('000000') → 0
decode('gR1iC9') → 9876543210
decode('++++++') → 68719476735
I would like to do this WITHOUT strings.
The easiest way to do this would be to create the inverse of the following function:
def get_digit(d):
    ''' Convert a base 64 digit to the desired character '''
    if 0 <= d <= 9:
        # 0 - 9
        c = 48 + d
    elif 10 <= d <= 35:
        # A - Z
        c = 55 + d
    elif 36 <= d <= 61:
        # a - z
        c = 61 + d
    elif d == 62:
        # -
        c = 45
    elif d == 63:
        # +
        c = 43
    else:
        # We should never get here
        raise ValueError('Invalid digit for base 64: ' + str(d))
    return chr(c)

# Test `digit`
print(''.join([get_digit(d) for d in range(64)]))

def encode(n):
    ''' Convert integer n to base 64 '''
    out = []
    while n:
        n, r = n // 64, n % 64
        out.append(get_digit(r))
    while len(out) < 6:
        out.append('0')
    return ''.join(out)

# Test `encode`
for i in (0, 9876543210, 68719476735):
    print(i, encode(i))
Output
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+
0 000000
9876543210 gR1iC9
68719476735 ++++++
This code is actually from PM 2Ring's answer on this page.
How do I write the inverse of this program?
A start:
My attempt at the inverse of get_digit, shown above, is below:
def inv_get_digit(c):
    if 0 <= c <= 9:
        d = ord(c) - 48
    elif 'A' <= c <= 'Z':
        d = ord(c) - 55
    elif 'a' <= c <= 'z'
        d = ord(c) - 61
    elif c == '+':
        d = 63
    elif c == '-':
        d = 62
    else:
        raise ValueError('Invalid Input' + str(c))
    return d

def decode(n):
    out = []
    while n:
        n, r= n % 10, n ** (6-len(str))
        out.append(get_digit(r))
    while len(out) < 10:
        out.append('0')
    return ''.join(out)
Answer 1:
Here's a program that combines my old code with some new code to perform the inverse operations.
You have a syntax error in your inv_get_digit function: you left the colon off the end of an elif line. And there's no need to do str(c), since c is already a string.
I'm afraid that your decode function doesn't make much sense. It's supposed to take a string as input and return an integer. Please see a working version below.
def get_digit(d):
    ''' Convert a base 64 digit to the desired character '''
    if 0 <= d <= 9:
        # 0 - 9
        c = 48 + d
    elif 10 <= d <= 35:
        # A - Z
        c = 55 + d
    elif 36 <= d <= 61:
        # a - z
        c = 61 + d
    elif d == 62:
        # -
        c = 45
    elif d == 63:
        # +
        c = 43
    else:
        # We should never get here
        raise ValueError('Invalid digit for base 64: ' + str(d))
    return chr(c)

print('Testing get_digit')
digits = ''.join([get_digit(d) for d in range(64)])
print(digits)

def inv_get_digit(c):
    if '0' <= c <= '9':
        d = ord(c) - 48
    elif 'A' <= c <= 'Z':
        d = ord(c) - 55
    elif 'a' <= c <= 'z':
        d = ord(c) - 61
    elif c == '-':
        d = 62
    elif c == '+':
        d = 63
    else:
        raise ValueError('Invalid input: ' + c)
    return d

print('\nTesting inv_get_digit')
nums = [inv_get_digit(c) for c in digits]
print(nums == list(range(64)))

def encode(n):
    ''' Convert integer n to base 64 '''
    out = []
    while n:
        n, r = n // 64, n % 64
        out.append(get_digit(r))
    while len(out) < 6:
        out.append('0')
    return ''.join(out)

print('\nTesting encode')
numdata = (0, 9876543210, 68719476735)
strdata = []
for i in numdata:
    s = encode(i)
    print(i, s)
    strdata.append(s)
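def decode(s):
    ''' Convert a reversed base 64 string to an integer '''
    n = 0
    # The last character is the most significant digit, so start from it
    for c in reversed(s):
        d = inv_get_digit(c)
        n = 64 * n + d
    return n

print('\nTesting decode')
for s, oldn in zip(strdata, numdata):
    n = decode(s)
    print(s, n, n == oldn)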
Output
Testing get_digit
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+
Testing inv_get_digit
True
Testing encode
0 000000
9876543210 gR1iC9
68719476735 ++++++
Testing decode
000000 0 True
gR1iC9 9876543210 True
++++++ 68719476735 True
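To see why looping over reversed(s) gives the right value, it may help to compare that loop with the explicit positional sum, where the character at index i contributes digit * 64**i. The sketch below is only an illustration: ALPHABET, decode_positional, and decode_horner are names made up here, and str.index stands in for inv_get_digit to keep the block self-contained.

```python
# Hypothetical helper names; the alphabet is the one given in the problem statement.
ALPHABET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+'

def decode_positional(s):
    ''' Sum digit * 64**i, where i is the index of the character in s '''
    return sum(ALPHABET.index(c) * 64 ** i for i, c in enumerate(s))

def decode_horner(s):
    ''' Same value, accumulated from the last (most significant) character '''
    n = 0
    for c in reversed(s):
        n = 64 * n + ALPHABET.index(c)
    return n

for s in ('000000', 'gR1iC9', '++++++'):
    print(s, decode_positional(s), decode_horner(s))
```

For 'gR1iC9' the digit values are 42, 27, 1, 44, 12, 9, so the sum is 42 + 27*64 + 1*64**2 + 44*64**3 + 12*64**4 + 9*64**5 = 9876543210, which matches the expected result.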
Answer 2:
"I would like to do this WITHOUT strings."
First, you need to clarify what this means. What you offer as a working encoder uses these strings:
out.append('0')
return ''.join(out)
And the accepted solution adds these strings:
digits = ''.join([get_digit(d) for d in range(64)])
if '0' <= c <= '9':
elif 'A' <= c <= 'Z':
elif 'a' <= c <= 'z':
elif c == '-':
elif c == '+':
Do you mean that single character strings are acceptable but multiple character strings are not? Or do you mean you don't want to use a str as a data structure and wish to minimize string operations?
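If the second reading is what is meant, one possible interpretation is to decode from a bytes object, so that every element handled is already an int and no character comparisons are needed. This is a sketch under that assumption only; inv_get_digit_code and decode_bytes are hypothetical names, not part of the question or the accepted answer.

```python
def inv_get_digit_code(code):
    ''' Map an ASCII code point (an int) to its base 64 digit value '''
    if 48 <= code <= 57:        # '0' - '9'
        return code - 48
    if 65 <= code <= 90:        # 'A' - 'Z'
        return code - 55
    if 97 <= code <= 122:       # 'a' - 'z'
        return code - 61
    if code == 45:              # '-'
        return 62
    if code == 43:              # '+'
        return 63
    raise ValueError('Invalid code point: %d' % code)

def decode_bytes(data):
    ''' Decode a reversed base 64 number supplied as a bytes object '''
    n = 0
    for code in reversed(data):  # iterating over bytes yields ints
        n = 64 * n + inv_get_digit_code(code)
    return n

print(decode_bytes(b'gR1iC9'))   # 9876543210
```

A str input could still be handled by calling decode_bytes(s.encode('ascii')) at the boundary.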
I feel your solution, and the accepted solution which builds upon it, are doing too many operations at encode and decode time. I recommend a little bit of work up front to build data structures and less effort when working through the data:
from string import digits, ascii_lowercase, ascii_uppercase

BASE10_TO_BASE64 = list(digits + ascii_uppercase + ascii_lowercase + '-' + '+')
BASE64_TO_BASE10 = {base64: base10 for base10, base64 in enumerate(BASE10_TO_BASE64)}
ZEROS = ['0'] * 6

def encode(number):
    ''' Convert base 10 int to reversed base 64 str '''
    characters = []
    while number:
        number, remainder = divmod(number, 64)
        characters.append(BASE10_TO_BASE64[remainder])
    return ''.join(characters + ZEROS[:max(len(ZEROS) - len(characters), 0)])

def decode(string):
    ''' Convert reversed base 64 str to base 10 int '''
    number = 0
    for character in string[::-1]:
        digit = BASE64_TO_BASE10[character]
        number = 64 * number + digit
    return number

if __name__ == "__main__":
    NUMBERS = (4096, 9876543210, 68719476735)
    strings = []

    print("Encode:")
    for number in NUMBERS:
        string = encode(number)
        print(number, string)
        strings.append(string)

    print("\nDecode:")
    for string in strings:
        number = decode(string)
        print(string, number)
Encoding to reversed base 64 numbers and zero padding complicate the program but add nothing. We usually expect '100' to represent the square of the base, but not here.
OUTPUT
> python3 test.py
Encode:
4096 001000
9876543210 gR1iC9
68719476735 ++++++
Decode:
001000 4096
gR1iC9 9876543210
++++++ 68719476735
>
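The remark about '100' can be made concrete by decoding the same strings under both digit orders. The following is a small sketch, not part of the program above; VALUE, decode_reversed, and decode_conventional are names introduced here for illustration.

```python
from string import digits, ascii_lowercase, ascii_uppercase

# Hypothetical lookup table built from the alphabet in the problem statement
ALPHABET = digits + ascii_uppercase + ascii_lowercase + '-+'
VALUE = {character: value for value, character in enumerate(ALPHABET)}

def decode_reversed(string):
    ''' Least significant digit first, as the problem statement requires '''
    number = 0
    for character in reversed(string):
        number = 64 * number + VALUE[character]
    return number

def decode_conventional(string):
    ''' Most significant digit first, the usual positional convention '''
    number = 0
    for character in string:
        number = 64 * number + VALUE[character]
    return number

for s in ('100000', '001000', '000100'):
    print(s, decode_reversed(s), decode_conventional(s))
```

In the reversed scheme '001000' is 64**2 = 4096 and '100000' is just 1, whereas the conventional reading of '000100' is the one that gives 4096.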
Source: https://stackoverflow.com/questions/46751441/converting-from-a-string-to-a-number