ord() Function or ASCII Character Code of String with Z3 Solver

让人想犯罪 __ 提交于 2021-01-29 21:32:37

问题


How can I convert a z3.String to a sequence of ASCII values?

For example, here is some code that I thought would check whether the ASCII values of all the characters in the string add up to 100:

import z3

def add_ascii_values(password):
    return sum(ord(character) for character in password)

password = z3.String("password")
solver = z3.Solver()
ascii_sum = add_ascii_values(password)
solver.add(ascii_sum == 100)

print(solver.check())
print(solver.model())

Unfortunately, I get this error:

TypeError: ord() expected string of length 1, but SeqRef found

It's apparent that ord doesn't work with z3.String. Is there something in Z3 that does?


回答1:


You're conflating Python strings and Z3 Strings; and unfortunately the two are quite different types.

In Z3py, a String is simply a sequence of 8-bit values. And what you can do with a Z3 is actually quite limited; for instance you cannot iterate over the characters like you did in your add_ascii_values function. See this page for what the allowed functions are: https://rise4fun.com/z3/tutorialcontent/sequences (This page lists the functions in SMTLib parlance; but the equivalent ones are available from the z3py interface.)

There are a few important restrictions/things that you need to keep in mind when working with Z3 sequences and strings:

  • You have to be very explicit about the lengths; In particular, you cannot sum over strings of arbitrary symbolic length. There are a few things you can do without specifying the length explicitly, but these are limited. (Like regex matches, substring extraction etc.)

  • You cannot extract a character out of a string. This is an oversight in my opinion, but SMTLib just has no way of doing so for the time being. Instead, you get a list of length 1. This causes a lot of headaches in programming, but there are workarounds. See below.

  • Anytime you loop over a string/sequence, you have to go up to a fixed bound. There are ways to program so you can cover "all strings upto length N" for some constant "N", but they do get hairy.

Keeping all this in mind, I'd go about coding your example like the following; restricting password to be precisely 10 characters long:

from z3 import *

s = Solver()

# Work around the fact that z3 has no way of giving us an element at an index. Sigh.
ordHelperCounter = 0
def OrdAt(inp, i):
    global ordHelperCounter
    v = BitVec("OrdAtHelper_%d_%d" % (i, ordHelperCounter), 8)
    ordHelperCounter += 1
    s.add(Unit(v) == SubString(inp, i, 1))
    return v

# Your original function, but note the addition of len parameter and use of Sum
def add_ascii_values(password, len):
    return Sum([OrdAt(password, i) for i in range(len)])

# We'll have to force a constant length
length = 10
password = String("password")
s.add(Length(password) == 10)
ascii_sum = add_ascii_values(password, length)
s.add(ascii_sum == 100)

# Also require characters to be printable so we can view them:
for i in range(length):
  v = OrdAt(password, i)
  s.add(v >= 0x20)
  s.add(v <= 0x7E)

print(s.check())
print(s.model()[password])

The OrdAt function works around the problem of not being able to extract characters. Also note how we use Sum instead of sum, and how all "loops" are of fixed iteration count. I also added constraints to make all the ascii codes printable for convenience.

When you run this, you get:

sat
":X|@`y}@@@"

Let's check it's indeed good:

>>> len(":X|@`y}@@@")
10
>>> sum(ord(character) for character in ":X|@`y}@@@")
868

So, we did get a length 10 string; but how come the ord's don't sum up to 100? Now, you have to remember sequences are composed of 8-bit values, and thus the arithmetic is done modulo 256. So, the sum actually is:

>>> sum(ord(character) for character in ":X|@`y}@@@") % 256
100

To avoid the overflows, you can either use larger bit-vectors, or more simply use Z3's unbounded Integer type Int. To do so, use the BV2Int function, by simply changing add_ascii_values to:

def add_ascii_values(password, len):
    return Sum([BV2Int(OrdAt(password, i)) for i in range(len)])

Now we'd get:

unsat

That's because each of our characters has at least value 0x20 and we wanted 10 characters; so there's no way to make them all sum up to 100. And z3 is precisely telling us that. If you increase your sum goal to something more reasonable, you'd start getting proper values.

Programming with z3py is different than regular programming with Python, and z3 String objects are quite different than those of Python itself. Note that the sequence/string logic isn't even standardized yet by the SMTLib folks, so things can change. (In particular, I'm hoping they'll add functionality for extracting elements at an index!).

Having said all this, going over the https://rise4fun.com/z3/tutorialcontent/sequences would be a good start to get familiar with them, and feel free to ask further questions.



来源:https://stackoverflow.com/questions/53774838/ord-function-or-ascii-character-code-of-string-with-z3-solver

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!