parsing SQL with python to find specific statements

吃可爱长大的小学妹 提交于 2021-02-10 20:31:42

问题


I am using sqlparse to try and find specific statements.

code:

import sqlparse

text = 'SELECT * FROM dbo.table'
parse = sqlparse.parse(text)
pizza = [t for t in parse[0].tokens if t.ttype in (sqlparse.tokens.Keyword)]

I get

[<DML 'SELECT' at 0x9f12318>, <Keyword 'FROM' at 0x9f12548>]

but what I really want is to return the table name:

dbo.table

how can I do that? My end goal is to parse through folders of sql scripts and find references and other details.


回答1:


If you print each of the tokens

>>> pprint(parse[0].tokens)

[<DML 'SELECT' at 0x7fc278c9eb50>,
 <Whitespace ' ' at 0x7fc278c9eba8>,
 <Wildcard '*' at 0x7fc278c9ec00>,
 <Whitespace ' ' at 0x7fc278c9ec58>,
 <Keyword 'FROM' at 0x7fc278c9ecb0>,
 <Whitespace ' ' at 0x7fc278c9ed08>,
 <Identifier 'dbo.ta...' at 0x7fc278cb3628>]

you will see that dbo.table is in fact not a Keyword, but an Identifier

Logical next step: filter on Identifier

>>> pprint([t for t in parse[0].tokens if t.ttype in (sqlparse.tokens.Identifier)])

but that gives AttributeError: 'module' object has no attribute 'Identifier'. If you have a look at the source code for sqlparse, you will see that there is no Identifier in sql.tokens, you you cannot filter on that. (mistake in the code? perhaps forgot to implement that?)

However, if you print the ttype of each token

>>> pprint([t.ttype for t in parse[0].tokens])

[Token.Keyword.DML,
 Token.Text.Whitespace,
 Token.Wildcard,
 Token.Text.Whitespace,
 Token.Keyword,
 Token.Text.Whitespace,
 None]

You will see that they all have a valid Token as ttype, except for dbo.table, which is None.

Knowing that, we can filter the tokens on that and get the result we want:

>>> print([t for t in parse[0].tokens if t.ttype is None])

[<Identifier 'dbo.ta...' at 0x7f6ae0318628>]

And to get the name:

>>> print([t.to_unicode() for t in parse[0].tokens if t.ttype is None][0])
u'dbo.table'

or

>>> print([str(t) for t in parse[0].tokens if t.ttype is None][0])
'dbo.table'

note the [0] at the end, to get the first and only item in the array that the list comprehension produces



来源:https://stackoverflow.com/questions/29646908/parsing-sql-with-python-to-find-specific-statements

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!