pyparsing

Pyparsing: extract variable length, variable content, variable whitespace substring

筅森魡賤 submitted on 2019-11-27 07:20:03
Question: I need to extract Gleason scores from a flat file of prostatectomy final diagnostic write-ups. These scores always contain the word Gleason and two numbers that add up to a third number. Humans typed these in over two decades, so various conventions of whitespace and modifiers are included. Below is my Backus-Naur form so far, and two example records. Just for prostatectomies, we're looking at upwards of a thousand cases. I am using pyparsing because I'm learning Python, and have no fond memories
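The extraction described above could be sketched roughly as follows. This is only an illustration, not the asker's grammar: it uses the standard-library `re` module rather than pyparsing, it assumes the scores are written in an "a + b = c" form, and the sample records are invented because the original example records are not shown in the excerpt.

```python
import re

# Assumption: a score looks like "Gleason <free text> a + b = c".
# Up to 40 non-digit characters (modifiers, whitespace) may sit between
# the word "Gleason" and the first number; that limit is arbitrary.
GLEASON_RE = re.compile(
    r"gleason\D{0,40}?(\d{1,2})\s*\+\s*(\d{1,2})\s*=\s*(\d{1,2})",
    re.IGNORECASE,
)


def extract_gleason(text):
    """Return (primary, secondary, total) tuples whose arithmetic is consistent."""
    scores = []
    for match in GLEASON_RE.finditer(text):
        primary, secondary, total = (int(group) for group in match.groups())
        # Keep only scores where the two grades really add up to the total.
        if primary + secondary == total:
            scores.append((primary, secondary, total))
    return scores


# Hypothetical records, made up for this sketch.
samples = [
    "Adenocarcinoma, Gleason score 3 + 4 = 7, involving both lobes.",
    "GLEASON'S GRADE  4+3=7",
    "gleason   pattern 5+4 = 9 (tertiary 3)",
]
results = [extract_gleason(sample) for sample in samples]
```

Checking the sum is a cheap way to reject accidental matches, but real write-ups may use conventions this pattern does not cover.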

Split string at commas except when in bracket environment

流过昼夜 submitted on 2019-11-26 20:58:38
Question: I would like to split a Python multiline string at its commas, except when the commas are inside a bracketed expression. E.g., the string {J. Doe, R. Starr}, {Lorem {i}psum dolor }, Dol. sit., am. et. should be split into ['{J. Doe, R. Starr}', '{Lorem\n{i}psum dolor }', 'Dol. sit.', 'am. et.']. This involves bracket matching, so regexes probably won't help here. PyParsing has commaSeparatedList, which almost does what I need, except that quoted ( " ) environments are protected instead
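One way to get the split described above is a plain-Python scan that tracks brace depth. This is a minimal sketch rather than a pyparsing solution; the function name `split_top_level` and its parameters are made up for illustration, and it assumes braces are balanced.

```python
def split_top_level(text, sep=",", open_ch="{", close_ch="}"):
    """Split text at sep, ignoring separators nested inside braces."""
    parts = []
    current = []
    depth = 0
    for ch in text:
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            # Clamp at zero so a stray closing brace does not go negative.
            depth = max(depth - 1, 0)
        if ch == sep and depth == 0:
            # Top-level separator: finish the current piece.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


sample = "{J. Doe, R. Starr}, {Lorem\n{i}psum dolor }, Dol. sit., am. et."
result = split_top_level(sample)
# result == ['{J. Doe, R. Starr}', '{Lorem\n{i}psum dolor }', 'Dol. sit.', 'am. et.']
```

The depth counter is what makes nested braces such as `{Lorem {i}psum}` work, which a single regular expression cannot express in general.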

Parsing SQL with Python

做~自己de王妃 submitted on 2019-11-26 09:20:41
Question: I want to create a SQL interface on top of a non-relational data store. The store is non-relational, but it makes sense to access the data in a relational manner. I am looking into using ANTLR to produce an AST that represents the SQL as a relational algebra expression. The data would then be returned by evaluating (walking) the tree. I have never implemented a parser before, and I would therefore like some advice on how best to implement a SQL parser and evaluator. Does the approach described above sound about
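The "evaluate by walking the tree" half of that approach can be sketched without any parser. The example below builds a tiny relational algebra tree by hand and evaluates it against an in-memory dictionary standing in for the data store. The node classes (`Scan`, `Select`, `Project`), the `evaluate` function, and the sample data are all hypothetical, and nothing here uses ANTLR.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

Row = Dict[str, object]


@dataclass
class Scan:
    """Read every row of a named table."""
    table: str


@dataclass
class Select:
    """Keep rows for which the predicate is true (SQL WHERE)."""
    predicate: Callable[[Row], bool]
    child: "Node"


@dataclass
class Project:
    """Keep only the named columns (SQL SELECT list)."""
    columns: Tuple[str, ...]
    child: "Node"


Node = Union[Scan, Select, Project]


def evaluate(node: Node, store: Dict[str, List[Row]]) -> List[Row]:
    """Recursively walk the tree, evaluating children before parents."""
    if isinstance(node, Scan):
        return list(store[node.table])
    if isinstance(node, Select):
        return [row for row in evaluate(node.child, store) if node.predicate(row)]
    if isinstance(node, Project):
        return [{col: row[col] for col in node.columns}
                for row in evaluate(node.child, store)]
    raise TypeError(f"unknown node type: {type(node).__name__}")


# Stand-in for the non-relational store: table name -> list of records.
store = {
    "users": [
        {"id": 1, "name": "Ada", "age": 36},
        {"id": 2, "name": "Bob", "age": 17},
    ]
}

# Roughly: SELECT name FROM users WHERE age >= 18
plan = Project(("name",), Select(lambda row: row["age"] >= 18, Scan("users")))
rows = evaluate(plan, store)
# rows == [{'name': 'Ada'}]
```

A parser front end would only need to produce trees of this shape; the evaluator stays independent of how the SQL text was parsed.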