pyparsing | 易学教程

How do I parse indents and dedents with pyparsing?

Here is a subset of the Python grammar:

    single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE
    stmt: simple_stmt | compound_stmt
    simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
    small_stmt: pass_stmt
    pass_stmt: 'pass'
    compound_stmt: if_stmt
    if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
    suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT

(You can read the full grammar in the Python SVN repository: http://svn.python.org/.../Grammar )

I am trying to use this grammar to generate a parser for Python, in Python. What I am having trouble with is how to express the
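One common way to obtain the INDENT and DEDENT terminals used by the suite rule is to compute them in a pre-pass with an indentation stack, before any grammar is applied. The sketch below is plain Python rather than pyparsing; the function name indent_tokens and the (kind, text) token shape are made up for illustration, and it assumes spaces-only indentation.

```python
def indent_tokens(source):
    """Turn leading spaces into INDENT/DEDENT tokens using an indentation stack."""
    stack = [0]      # indentation widths of the blocks that are currently open
    tokens = []
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines never open or close a block
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            # deeper than the current block: open a new one
            stack.append(width)
            tokens.append(("INDENT", ""))
        while width < stack[-1]:
            # shallower: close blocks until the widths line up again
            stack.pop()
            tokens.append(("DEDENT", ""))
        if width != stack[-1]:
            raise IndentationError("dedent does not match any outer level")
        tokens.append(("LINE", line.strip()))
    # close any blocks still open at the end of the input
    tokens.extend(("DEDENT", "") for _ in stack[1:])
    return tokens


sample = "if x:\n    pass\n    pass\npass\n"
for token in indent_tokens(sample):
    print(token)
```

The resulting token stream can then be matched by a grammar in which INDENT and DEDENT are ordinary terminals, which keeps the whitespace bookkeeping out of the grammar rules themselves.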

Parsing nested function calls using pyparsing

I'm trying to use pyparsing to parse function calls of the form:

    f(x, y)

That's easy. But since it's a recursive-descent parser, it should also be easy to parse:

    f(g(x), y)

That's what I can't get. Here's a boiled-down example:

    from pyparsing import Forward, Word, alphas, alphanums, nums, ZeroOrMore, Literal

    lparen = Literal("(")
    rparen = Literal(")")
    identifier = Word(alphas, alphanums + "_")
    integer = Word(nums)
    functor = identifier
    # allow expression to be used recursively
    expression =
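A minimal sketch of one way the recursion could be closed, not necessarily how the original poster finished the grammar: the Forward placeholder is assigned last, and the function-call alternative is tried before the bare identifier. The names arguments and function_call are illustrative additions.

```python
from pyparsing import (
    Forward, Group, Optional, Suppress, Word, ZeroOrMore,
    alphanums, alphas, nums,
)

lparen = Suppress("(")
rparen = Suppress(")")
comma = Suppress(",")
identifier = Word(alphas, alphanums + "_")
integer = Word(nums)

# Forward lets a rule refer to itself before it is fully defined.
expression = Forward()
arguments = Optional(expression + ZeroOrMore(comma + expression))
function_call = Group(identifier + lparen + arguments + rparen)

# Try the call first: a bare identifier would otherwise match only the functor
# and leave the parenthesized arguments unparsed.
expression <<= function_call | identifier | integer

result = expression.parseString("f(g(x), y)", parseAll=True).asList()
print(result)  # [['f', ['g', 'x'], 'y']]
```

Grouping each call keeps the nesting visible in the result, so the inner call g(x) comes back as its own sub-list.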

Pyparsing: Parsing semi-JSON nested plaintext data to a list

I have a bunch of nested data in a format that loosely resembles JSON:

    company="My Company"
    phone="555-5555"
    people=
    {
        person=
        {
            name="Bob"
            location="Seattle"
            settings=
            {
                size=1
                color="red"
            }
        }
        person=
        {
            name="Joe"
            location="Seattle"
            settings=
            {
                size=2
                color="blue"
            }
        }
    }
    places=
    {
        ...
    }

There are many different parameters with varying levels of depth; this is just a very small subset. It might also be worth noting that whenever a new sub-array is created, there is always an equals sign followed by a line break followed by the open bracket (as seen above). Is there any simple looping or
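A format like this maps fairly directly onto a recursive key=value grammar. The following is a sketch under assumptions: keys are simple identifiers, values are either double-quoted strings, bare integers, or a braced block of further pairs, and all rule names (key, pair, block, document) are invented for the example.

```python
from pyparsing import (
    Forward, Group, QuotedString, Suppress, Word, ZeroOrMore,
    alphanums, alphas, nums,
)

key = Word(alphas, alphanums + "_")
string_value = QuotedString('"')      # returns the text without the quotes
number = Word(nums)

block = Forward()                      # a braced block may contain more blocks
value = string_value | number | block
pair = Group(key + Suppress("=") + value)
block <<= Group(Suppress("{") + ZeroOrMore(pair) + Suppress("}"))
document = ZeroOrMore(pair)

text = '''
company="My Company"
people=
{
    person=
    {
        name="Bob"
        settings=
        {
            size=1
            color="red"
        }
    }
}
'''

result = document.parseString(text, parseAll=True).asList()
print(result)
```

Because pyparsing skips whitespace (including newlines) between tokens by default, the line break between the equals sign and the opening brace needs no special handling. Repeated keys such as person are kept as separate pairs in a list rather than collapsed into a dictionary.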

pyparsing nestedExpr and nested parentheses

I am working on a very simple "querying syntax" usable by people with reasonable technical skills (i.e., not coders per se, but able to touch on the subject). A typical example of what they would enter on a form is:

    address like street AND vote = True AND ( ( age>=25 AND gender = M ) OR ( age between [20,30] AND gender = F ) OR ( age >= 70 AND eyes != blue ) )

With:

- no quotes required
- potentially infinite nesting of parentheses
- simple AND|OR linking

I am using pyparsing (well, trying to anyway) and reaching something:

    from pyparsing import *

    OPERATORS = [ '<', '<=', '>', '>=', '=', '!=', 'like'
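For AND/OR chains with arbitrary parenthesization, pyparsing's infixNotation helper is one option; note that this swaps in a different tool from the nestedExpr named in the title. The sketch below is only a rough illustration: it covers the operators visible in the truncated OPERATORS list, leaves out the "between [20,30]" form, and the names field, value, and condition are assumptions.

```python
from pyparsing import (
    Group, Keyword, Word, alphanums, alphas,
    infixNotation, oneOf, opAssoc,
)

AND = Keyword("AND")
OR = Keyword("OR")

# Subset of operators taken from the visible part of the OPERATORS list.
operator = oneOf("< <= > >= = != like")

# The negative lookahead keeps AND/OR from being read as a field or a value.
field = ~(AND | OR) + Word(alphas)
value = ~(AND | OR) + Word(alphanums)
condition = Group(field + operator + value)

# Earlier entries bind tighter, so AND takes precedence over OR here.
query = infixNotation(
    condition,
    [
        (AND, 2, opAssoc.LEFT),
        (OR, 2, opAssoc.LEFT),
    ],
)

text = "address like street AND ( age>=25 OR eyes != blue )"
result = query.parseString(text, parseAll=True).asList()
print(result)
```

Each condition comes back as a three-element group, and every parenthesized sub-expression becomes a nested list, so the tree can be walked recursively to build the actual filter.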

Python - pyparsing unicode characters

I tried using w = Word(printables), but it isn't working. How should I give the spec for this? 'w' is meant to process Hindi characters (UTF-8). The code specifies the grammar and parses accordingly.

    671.assess :: अहसास ::2

    x = number + "." + src + "::" + w + "::" + number + "." + number

If there are only English characters it works, so the code is correct for ASCII input, but it does not work for Unicode input. I mean that the code works when we have something of the form

    671.assess :: ahsaas ::2

i.e. it parses words in the English format, but I am not sure how to parse and
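pyparsing's printables constant contains only ASCII characters, which would explain why the Hindi word is rejected. One possible fix, sketched here for Python 3 under the assumption that the words stay within the Devanagari block (U+0900 to U+097F), is to build the character set for Word explicitly. The grammar is shortened to match the sample line shown, which ends after the final number.

```python
from pyparsing import Word, alphas, nums

# Assumption: every character of the Hindi word lies in U+0900..U+097F.
devanagari = "".join(chr(code) for code in range(0x0900, 0x0980))

number = Word(nums)
src = Word(alphas)
w = Word(devanagari)

# Shortened form of the grammar in the question, matching the sample line.
entry = number + "." + src + "::" + w + "::" + number

tokens = entry.parseString("671.assess :: अहसास ::2").asList()
print(tokens)
```

Using the whole block rather than only its letters matters because Devanagari vowel signs are combining marks, which a letters-only character set would leave out.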

How can I use pyparsing to parse nested expressions that have multiple opener/closer types?

I'd like to use pyparsing to parse an expression of the form:

    expr = '(gimme [some {nested [lists]}])'

and get back a Python list of the form:

    [[['gimme', ['some', ['nested', ['lists']]]]]]

Right now my grammar looks like this:

    nestedParens = nestedExpr('(', ')')
    nestedBrackets = nestedExpr('[', ']')
    nestedCurlies = nestedExpr('{', '}')
    enclosed = nestedParens | nestedBrackets | nestedCurlies

Presently, enclosed.searchString(expr) returns a list of the form:

    [[['gimme', ['some', '{nested', '[lists]}']]]]

This is not what I want because it's not recognizing the square or curly brackets,
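The three nestedExpr parsers in the question do not know about each other, so the content inside the parentheses is treated as plain words. One way around this, sketched here with an explicit Forward instead of nestedExpr, is to let a single recursive rule accept any of the three bracket pairs. The helper name bracketed is made up, and the content is assumed to be alphabetic words only.

```python
from pyparsing import Forward, Group, Suppress, Word, ZeroOrMore, alphas

item = Forward()


def bracketed(opener, closer):
    """Group everything between one opener/closer pair, dropping the delimiters."""
    return Group(Suppress(opener) + ZeroOrMore(item) + Suppress(closer))


# An item is a word or a group in any of the three bracket styles.
item <<= (
    Word(alphas)
    | bracketed("(", ")")
    | bracketed("[", "]")
    | bracketed("{", "}")
)

expr = "(gimme [some {nested [lists]}])"
parsed = item.parseString(expr, parseAll=True).asList()
print(parsed)  # [['gimme', ['some', ['nested', ['lists']]]]]
```

With parseString there is one level of nesting fewer than in the searchString output quoted above, because searchString wraps every match in an additional list.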

How best to parse a simple grammar?

Ok, so I've asked a bunch of smaller questions about this project, but I still don't have much confidence in the designs I'm coming up with, so I'm going to ask a question on a broader scale. I am parsing pre-requisite descriptions for a course catalog. The descriptions almost always follow a certain form, which makes me think I can parse most of them. From the text, I would like to generate a graph of course pre-requisite relationships. (That part will be easy, after I have parsed the data.) Some sample inputs and outputs:

    "CS 2110" => ("CS", 2110) # 0
    "CS 2110 and INFO 3300" => [("CS", 2110)
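For the two sample inputs that are visible, a small pyparsing grammar could look like the sketch below. It only handles courses joined by "and"; the names to_course and course_list are illustrative, and the result is always a list, so the single-course case comes back as a one-element list rather than a bare tuple.

```python
from pyparsing import Keyword, Suppress, Word, ZeroOrMore, alphas, nums


def to_course(tokens):
    # Return a one-element list so the tuple survives as a single token.
    return [(tokens[0], int(tokens[1]))]


course = (Word(alphas) + Word(nums)).setParseAction(to_course)
course_list = course + ZeroOrMore(Suppress(Keyword("and")) + course)

single = course_list.parseString("CS 2110", parseAll=True).asList()
pair = course_list.parseString("CS 2110 and INFO 3300", parseAll=True).asList()
print(single)  # [('CS', 2110)]
print(pair)    # [('CS', 2110), ('INFO', 3300)]
```

Converting to tuples inside the parse action means the graph-building step can consume the result directly, without re-inspecting strings.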

Pyparsing: extract variable length, variable content, variable whitespace substring

I need to extract Gleason scores from a flat file of prostatectomy final diagnostic write-ups. These scores always have the word Gleason and two numbers that add up to another number. Humans typed these in over two decades, so various conventions of whitespace and modifiers are included. Below is my Backus-Naur form so far, and two example records. Just for prostatectomies, we're looking at upwards of a thousand cases. I am using pyparsing because I'm learning Python, and have no fond memories of my very limited exposure to regex writing. My question: how can I pluck out these Gleason grades
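Since the Backus-Naur form and the example records are not included in this excerpt, the following is only a rough sketch of the general idea: anchor on the keyword, skip optional modifier words, and capture the three numbers. The report text is invented purely to exercise the grammar, and the result names primary, secondary, and total are assumptions.

```python
from pyparsing import CaselessKeyword, Suppress, Word, ZeroOrMore, alphas, nums

grade = Word(nums).setParseAction(lambda t: int(t[0]))

gleason = (
    Suppress(CaselessKeyword("gleason"))
    + Suppress(ZeroOrMore(Word(alphas)))   # modifiers such as "score" or "grade"
    + grade("primary") + Suppress("+")
    + grade("secondary") + Suppress("=")
    + grade("total")
)

# Hypothetical records, invented only for this illustration.
report = "Adenocarcinoma, Gleason score 3 + 4 = 7. Margins clear. GLEASON grade 4+3=7."

# searchString scans the whole text and returns every match it finds.
scores = [match.asList() for match in gleason.searchString(report)]
# Keep only entries whose two grades really add up to the stated total.
consistent = [s for s in scores if s[0] + s[1] == s[2]]
print(consistent)
```

Because pyparsing ignores whitespace between tokens by default, "3 + 4 = 7" and "4+3=7" are handled by the same rule, which addresses the varying whitespace conventions mentioned in the question.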
