问题
I'm building this grammar for a simple programming language (already solved previous ambiguity issues: Can't figure out why Bison is throwing "Rules useless in parser due to conflicts").
This is my complete grammar: http://pastebin.com/yBHLSP0z
And this is the output file from Bison: http://pastebin.com/eAma3gWy
(sorry, they're in Spanish, but I think they're pretty self-explanatory)
The thing is, I'm still getting one shift/reduce error at state 107 (I'm translating it):
state 107
31 factor: ID .
48 concatenacion: ID . OPERADOR_SUMA ID
49 | ID . OPERADOR_SUMA literal_string
OPERADOR_SUMA shift and go to state 140
OPERADOR_SUMA [reduce using rule 31 (factor)]
$default reduce using rule 31 (factor)
Now, state 107 is called from state 70:
estado 70
45 asignacion: ID OPERADOR_ASIGNACION . concatenacion
46 | ID OPERADOR_ASIGNACION . expresion
47 | ID OPERADOR_ASIGNACION . literal_string
OPERADOR_RESTA desplazar e ir al estado 55
PARENTESIS_ABRE desplazar e ir al estado 56
COMILLA desplazar e ir al estado 67
ID desplazar e ir al estado 107
expresion ir al estado 108
termino ir al estado 61
factor ir al estado 62
concatenacion ir al estado 109
literal_string ir al estado 110
literal_real ir al estado 63
literal_entero ir al estado 64
signo ir al estado 65
What I think is happening (please correct me if I'm wrong) is that when it finds a rule for "asignacion" like this:
asignacion: ID OPERADOR_ASIGNACION concatenacion | ID OPERADOR_ASIGNACION expresion
it sees that from "expresion" it can get an ID token (expresion > termino > factor > ID), making a ID OPERADOR_ASIGNACION ID:
expresion:
expresion OPERADOR_SUMA termino
| expresion OPERADOR_RESTA termino
| termino
;
termino:
termino OPERADOR_MULTIPLICACION factor
| termino OPERADOR_DIVISION factor
| factor
;
factor:
ID
| literal_entero
| literal_real
| PARENTESIS_ABRE expresion PARENTESIS_CIERRA
;
Now, when it reaches an ID OPERADOR_ASIGNACION concatenacion and looks at the rules for "concatenacion", it gets:
concatenacion:
ID OPERADOR_SUMA ID
| ID OPERADOR_SUMA literal_string
| literal_string OPERADOR_SUMA ID
| literal_string OPERADOR_SUMA literal_string
;
Two of them begin with "ID". So if any of those two rules are selected, it gets to a state where it can obtain a ID OPERADOR_ASIGNACION ID, only that with the "concatenacion" rules, it needs to find a "OPERADOR_SUMA" token afterwards. But I believe it's choking as soon as it sees that from both "concatenacion" and "expresion" can form the ID OPERADOR_ASIGNACION ID expression.
If this is not exactly what's going on, I'd like to know what is then the problem.
And, if I'm correct as where the error happens, I really don't know how to solve it.
Please help :)
Thanks!
回答1:
The problem arises from:
asignacion
: ID OPERADOR_ASIGNACION concatenacion
| ID OPERADOR_ASIGNACION expresion
;
and the selected alternatives:
expresion
: expresion OPERADOR_SUMA termino
;
termino
: factor
;
factor
: ID
;
concatenacion
: ID OPERADOR_SUMA ID
;
Which means that when your parser encounters:
x = y + z
it cannot tell whether it is processing the first or second alternative for asignacion.
That's the easy part. How to fix? The simplest fix (if it works, which I've not tested) would be to remove the concatenacion rule I showed, and in the expresion rule, recognize when you are dealing with a concatenacion vs an expresion since they are syntactically identical:
ID OPERADOR_SIGNACION ID OPERADOR_SUM ID
You'd look at the types of the two operands of the expresion, and if they are both string types, then you'd assume it was a concatenacion, otherwise an expresion.
You might want to review the whole of the concatenacion rule, though. You'd need to let strings through the factor rule, I think, so you'd add another alternative to factor:
factor
: literal_string
;
It would mean you'd have to reject literal strings in other rules, though, so more semantic checking. An alternative is to introduce a separate operator other than + to mean 'string concatenation'. SQL uses ||; some languages have used ,; you could use another token altogether, such as @. There are many options once you break away from +. Can you even just use 'two adjacent string expressions mean the concatenation operation', with no operator between them?
If none of that works, then get back to me.
来源:https://stackoverflow.com/questions/19942710/how-can-i-force-bison-to-shift-to-resolve-a-conflict