I'm doing static code analysis using openstack/bandit. I have a lot of repositories; some are in Python 2, others in Python 3. How can I detect if code is syntactically
Here's one thing you might want to do. I think it's the easiest way to find out whether code is compatible, at least syntactically.
Have a Python 3 program parse the Python modules (without executing them). If the code is compatible, it will parse; if it isn't, it will raise a SyntaxError.
Use the ast module.
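import ast

def test_source_code_compatible(code_data):
    """Return the parsed AST, or None if the source is not valid Python 3."""
    try:
        return ast.parse(code_data)
    except SyntaxError:
        return None

# Read as bytes so ast.parse honours the file's own encoding declaration.
with open('file.py', 'rb') as source_file:
    ast_tree = test_source_code_compatible(source_file.read())

if ast_tree is None:
    print("File couldn't get loaded")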
If the code can't be parsed, ast.parse raises a SyntaxError.
Documentation of the Ast Module
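Since you have many repositories, the same check can be applied to a whole directory tree. The following is only a rough sketch: the function name find_unparsable_files and the choice to look only at files ending in .py are my own assumptions, not something bandit provides. It has to be run under Python 3 for the result to mean anything.

```python
import ast
import os


def find_unparsable_files(root_dir):
    """Return (path, error message) pairs for .py files that fail to parse.

    Hypothetical helper: run it under Python 3, where a failure points at
    Python 2-only syntax (or at code that is simply broken).
    """
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if not name.endswith('.py'):
                continue
            path = os.path.join(dirpath, name)
            # Read bytes so the file's own encoding declaration is respected.
            with open(path, 'rb') as source_file:
                source = source_file.read()
            try:
                ast.parse(source, filename=path)
            except (SyntaxError, ValueError) as exc:
                # Some Python versions raise ValueError for null bytes.
                failures.append((path, str(exc)))
    return failures
```

Calling find_unparsable_files('path/to/repo') gives you the list of files to treat as Python 2 (or to inspect by hand); everything else at least parses under Python 3.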
Even when the abstract syntax tree loads, you may still have to check for Python 2 methods that don't exist in Python 3, or methods whose behaviour changed.
For example, division works differently in Python 2 and Python 3. In Python 2, / between two integers performs integer (floor) division, so 3 / 2 gives 1, while in Python 3 it gives 1.5. In that case, you'll have to check whether the module uses from __future__ import division to get the same behaviour in Python 2 and Python 3.
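The division check can be automated on the parsed tree. This is a minimal sketch, assuming the file already parses under Python 3; the helper names uses_future_division and division_lines are made up for illustration. It only flags lines for review, because the AST can't tell whether the operands are integers.

```python
import ast


def uses_future_division(tree):
    """Return True if the module contains `from __future__ import division`."""
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom) and node.module == '__future__':
            if any(alias.name == 'division' for alias in node.names):
                return True
    return False


def division_lines(tree):
    """Return the line numbers of every `/` or `/=` operation in the module."""
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, (ast.BinOp, ast.AugAssign))
        and isinstance(node.op, ast.Div)
    )


# Example source: it divides without the __future__ import.
source = "total = 7\nhalf = total / 2\n"
tree = ast.parse(source)
if not uses_future_division(tree) and division_lines(tree):
    print("Review '/' on lines:", division_lines(tree))
```

Floor division (//) is left alone on purpose, since it behaves the same in both versions.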
Here's an exhaustive list of things that you might want to handle:
http://python-future.org/compatible_idioms.html
Loading the AST of the module immediately shows you the things that absolutely can't work. But deciding whether code that parses will actually work in Python 3 is subject to many false positives. It's hard, even impossible, to accurately detect whether code will behave identically in Python 2 and 3 without actually running it and comparing the results.