I\'ve developed an equation parser using a simple stack algorithm that will handle binary (+, -, |, &, *, /, etc) operators, unary (!) operators, and parenthesis.
<
I know this is a late answer, but I've just written a tiny parser that allows all operators (prefix, postfix and infix-left, infix-right and nonassociative) to have arbitrary precedence.
I'm going to expand this for a language with arbitrary DSL support, but I just wanted to point out that one doesn't need custom parsers for operator precedence, one can use a generalized parser that doesn't need tables at all, and just looks up the precedence of each operator as it appears. People have been mentioning custom Pratt parsers or shunting yard parsers that can accept illegal inputs - this one doesn't need to be customized and (unless there's a bug) won't accept bad input. It isn't complete in a sense, it was written to test the algorithm and its input is in a form that will need some preprocessing, but there are comments that make it clear.
Note some common kinds of operators are missing for instance the sort of operator used for indexing ie table[index] or calling a function function(parameter-expression, ...) I'm going to add those, but think of both as postfix operators where what comes between the delimeters '[' and ']' or '(' and ')' is parsed with a different instance of the expression parser. Sorry to have left that out, but the postfix part is in - adding the rest will probably almost double the size of the code.
Since the parser is just 100 lines of racket code, perhaps I should just paste it here, I hope this isn't longer than stackoverflow allows.
A few details on arbitrary decisions:
If a low precedence postfix operator is competing for the same infix blocks as a low precedence prefix operator the prefix operator wins. This doesn't come up in most languages since most don't have low precedence postfix operators. - for instance: ((data a) (left 1 +) (pre 2 not)(data b)(post 3 !) (left 1 +) (data c)) is a+not b!+c where not is a prefix operator and ! is postfix operator and both have lower precedence than + so they want to group in incompatible ways either as (a+not b!)+c or as a+(not b!+c) in these cases the prefix operator always wins, so the second is the way it parses
Nonassociative infix operators are really there so that you don't have to pretend that operators that return different types than they take make sense together, but without having different expression types for each it's a kludge. As such, in this algorithm, non-associative operators refuse to associate not just with themselves but with any operator with the same precedence. That's a common case as < <= == >= etc don't associate with each other in most languages.
The question of how different kinds of operators (left, prefix etc) break ties on precedence is one that shouldn't come up, because it doesn't really make sense to give operators of different types the same precedence. This algorithm does something in those cases, but I'm not even bothering to figure out exactly what because such a grammar is a bad idea in the first place.
#lang racket
;cool the algorithm fits in 100 lines!
(define MIN-PREC -10000)
;format (pre prec name) (left prec name) (right prec name) (nonassoc prec name) (post prec name) (data name) (grouped exp)
;for example "not a*-7+5 < b*b or c >= 4"
;which groups as: not ((((a*(-7))+5) < (b*b)) or (c >= 4))"
;is represented as '((pre 0 not)(data a)(left 4 *)(pre 5 -)(data 7)(left 3 +)(data 5)(nonassoc 2 <)(data b)(left 4 *)(data b)(right 1 or)(data c)(nonassoc 2 >=)(data 4))
;higher numbers are higher precedence
;"(a+b)*c" is represented as ((grouped (data a)(left 3 +)(data b))(left 4 *)(data c))
(struct prec-parse ([data-stack #:mutable #:auto]
[op-stack #:mutable #:auto])
#:auto-value '())
(define (pop-data stacks)
(let [(data (car (prec-parse-data-stack stacks)))]
(set-prec-parse-data-stack! stacks (cdr (prec-parse-data-stack stacks)))
data))
(define (pop-op stacks)
(let [(op (car (prec-parse-op-stack stacks)))]
(set-prec-parse-op-stack! stacks (cdr (prec-parse-op-stack stacks)))
op))
(define (push-data! stacks data)
(set-prec-parse-data-stack! stacks (cons data (prec-parse-data-stack stacks))))
(define (push-op! stacks op)
(set-prec-parse-op-stack! stacks (cons op (prec-parse-op-stack stacks))))
(define (process-prec min-prec stacks)
(let [(op-stack (prec-parse-op-stack stacks))]
(cond ((not (null? op-stack))
(let [(op (car op-stack))]
(cond ((>= (cadr op) min-prec)
(apply-op op stacks)
(set-prec-parse-op-stack! stacks (cdr op-stack))
(process-prec min-prec stacks))))))))
(define (process-nonassoc min-prec stacks)
(let [(op-stack (prec-parse-op-stack stacks))]
(cond ((not (null? op-stack))
(let [(op (car op-stack))]
(cond ((> (cadr op) min-prec)
(apply-op op stacks)
(set-prec-parse-op-stack! stacks (cdr op-stack))
(process-nonassoc min-prec stacks))
((= (cadr op) min-prec) (error "multiply applied non-associative operator"))
))))))
(define (apply-op op stacks)
(let [(op-type (car op))]
(cond ((eq? op-type 'post)
(push-data! stacks `(,op ,(pop-data stacks) )))
(else ;assume infix
(let [(tos (pop-data stacks))]
(push-data! stacks `(,op ,(pop-data stacks) ,tos)))))))
(define (finish input min-prec stacks)
(process-prec min-prec stacks)
input
)
(define (post input min-prec stacks)
(if (null? input) (finish input min-prec stacks)
(let* [(cur (car input))
(input-type (car cur))]
(cond ((eq? input-type 'post)
(cond ((< (cadr cur) min-prec)
(finish input min-prec stacks))
(else
(process-prec (cadr cur)stacks)
(push-data! stacks (cons cur (list (pop-data stacks))))
(post (cdr input) min-prec stacks))))
(else (let [(handle-infix (lambda (proc-fn inc)
(cond ((< (cadr cur) min-prec)
(finish input min-prec stacks))
(else
(proc-fn (+ inc (cadr cur)) stacks)
(push-op! stacks cur)
(start (cdr input) min-prec stacks)))))]
(cond ((eq? input-type 'left) (handle-infix process-prec 0))
((eq? input-type 'right) (handle-infix process-prec 1))
((eq? input-type 'nonassoc) (handle-infix process-nonassoc 0))
(else error "post op, infix op or end of expression expected here"))))))))
;alters the stacks and returns the input
(define (start input min-prec stacks)
(if (null? input) (error "expression expected")
(let* [(cur (car input))
(input-type (car cur))]
(set! input (cdr input))
;pre could clearly work with new stacks, but could it reuse the current one?
(cond ((eq? input-type 'pre)
(let [(new-stack (prec-parse))]
(set! input (start input (cadr cur) new-stack))
(push-data! stacks
(cons cur (list (pop-data new-stack))))
;we might want to assert here that the cdr of the new stack is null
(post input min-prec stacks)))
((eq? input-type 'data)
(push-data! stacks cur)
(post input min-prec stacks))
((eq? input-type 'grouped)
(let [(new-stack (prec-parse))]
(start (cdr cur) MIN-PREC new-stack)
(push-data! stacks (pop-data new-stack)))
;we might want to assert here that the cdr of the new stack is null
(post input min-prec stacks))
(else (error "bad input"))))))
(define (op-parse input)
(let [(stacks (prec-parse))]
(start input MIN-PREC stacks)
(pop-data stacks)))
(define (main)
(op-parse (read)))
(main)