Writing a Z80 assembler - lexing ASM and building a parse tree using composition?

Submitted by 徘徊边缘 on 2020-01-01 04:45:11

Question


I'm very new to the concept of writing an assembler and even after reading a great deal of material, I'm still having difficulties wrapping my head around a couple of concepts.

  1. What is the process to actually break up a source file into tokens? I believe this process is called lexing. I've searched high and low for real code examples that make sense, but I can't find a thing, so simple code examples are very welcome ;)

  2. When parsing, does information ever need to be passed up or down the tree? The reason I ask is as follows, take:

    LD BC, nn

It needs to be turned into the following parse tree once tokenized(???)

  ___ LD ___
  |        |
 BC        nn

Now, when this tree is traversed it needs to produce the following machine code:

01 n n

If the instruction had been:

LD DE,nn

Then the output would need to be:

11 n n

This raises the question: does the LD node return something different based on the operand, or is it the operand that returns something? And how is this achieved? More simple code examples would be excellent if time permits.

I'm most interested in learning some of the raw processes here rather than looking at advanced existing tools so please bear that in mind before sending me to Yacc or Flex.


Answer 1:


Well, the structure of the tree you really want for an instruction that operates on a register and a memory addressing mode involving an offset displacement and an index register would look like this:

    INSTRUCTION-----+
    |      |        |
  OPCODE  REG     OPERAND
                  |     |
                OFFSET  INDEXREG

And yes, you want to pass values up and down the tree. A method for formally specifying such value passing is called "attribute grammars": you decorate the grammar for your language (in your case, your assembler syntax) with the value-passing and the computations over those values. For more background, see Wikipedia on attribute grammars.
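As a rough illustration of values flowing up the tree, here is a minimal Python sketch for the `LD rp,nn` case from the question. The class names (`RegPair`, `Imm16`, `Load`) and the `RP_CODES` table are made up for this example and are not taken from any tool. Each child node returns its own small piece of information, and the LD node combines them into the opcode, so neither the LD node nor the operand decides alone.

```python
from dataclasses import dataclass

# 2-bit register-pair field used by the Z80 "LD rp,nn" encoding.
RP_CODES = {"BC": 0, "DE": 1, "HL": 2, "SP": 3}

@dataclass
class RegPair:
    name: str

    def code(self) -> int:
        # Value passed up to the parent: the register-pair field.
        return RP_CODES[self.name]

@dataclass
class Imm16:
    value: int

    def encode(self) -> bytes:
        # Value passed up to the parent: operand bytes, low byte first.
        return bytes([self.value & 0xFF, (self.value >> 8) & 0xFF])

@dataclass
class Load:
    dest: RegPair
    src: Imm16

    def encode(self) -> bytes:
        # The LD node combines what its children return:
        # opcode bits are 00pp0001, followed by the 16-bit immediate.
        opcode = (self.dest.code() << 4) | 0x01
        return bytes([opcode]) + self.src.encode()

print(Load(RegPair("BC"), Imm16(0x1234)).encode().hex(" "))  # 01 34 12
print(Load(RegPair("DE"), Imm16(0x1234)).encode().hex(" "))  # 11 34 12
```

Here all information flows upward (synthesized attributes). Passing information down, for example the current address for resolving labels, could be done by giving `encode` an extra argument.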

In a related question you asked, I discussed a tool, DMS, which handles expression grammars and building trees. As a language manipulation tool, DMS faces exactly these same issues of information flowing up and down the tree. It shouldn't surprise you that, as a high-end language manipulation tool, it can handle attribute grammar computations directly.




Answer 2:


It is not necessary to build a parse tree. Z80 instructions are very simple: each consists of a mnemonic and 0, 1 or 2 operands, separated by commas. You just need to split the instruction into its (maximum of 3) components with a very simple parser; no tree is needed.
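A minimal sketch of that approach in Python, which also shows one simple way to lex a line into tokens. The token classes, the `;` comment convention, and the function names `tokenize` and `split_instruction` are assumptions made for this example; a real assembler would also need hex literals, labels, and expressions.

```python
import re

# One alternative per token class; "error" catches any other character.
TOKEN_RE = re.compile(
    r"(?P<number>\d+)"
    r"|(?P<ident>[A-Za-z_]\w*)"
    r"|(?P<punct>[,()])"
    r"|(?P<space>\s+)"
    r"|(?P<error>.)"
)

def tokenize(line):
    """Turn one source line into a list of (kind, text) tokens."""
    code = line.split(";", 1)[0]  # assume ';' starts a comment
    tokens = []
    for m in TOKEN_RE.finditer(code):
        kind = m.lastgroup
        if kind == "error":
            raise SyntaxError(f"unexpected {m.group()!r} at column {m.start()}")
        if kind != "space":
            tokens.append((kind, m.group()))
    return tokens

def split_instruction(line):
    """Split a line into (mnemonic, [operands]); returns None for blank lines."""
    tokens = tokenize(line)
    if not tokens:
        return None
    mnemonic = tokens[0][1].upper()
    operands, current = [], []
    for _kind, text in tokens[1:]:
        if text == ",":
            operands.append("".join(current))
            current = []
        else:
            current.append(text.upper())
    if current:
        operands.append("".join(current))
    return mnemonic, operands

print(tokenize("LD BC, nn"))
print(split_instruction("  ld de, 4660 ; load DE"))
```

The resulting `(mnemonic, operands)` pair can then be looked up in a table keyed by mnemonic and operand pattern to produce the machine code.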




Answer 3:


Actually, the opcodes do not have a byte base, but an octal base. The best description I know of is DECODING Z80 OPCODES.
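To make the octal structure concrete, here is a small Python sketch. It splits an opcode byte into the 2-3-3 bit fields commonly named x, y, z (with y further split into p and q), then builds `LD rp,nn` from those fields. The helper names `fields` and `ld_rp_nn` are invented for this example, and the field layout is stated from memory of that style of decoding, so check it against the reference before relying on it.

```python
RP = ["BC", "DE", "HL", "SP"]  # register pairs indexed by the p field

def fields(opcode):
    """Split an opcode byte into (x, y, z, p, q)."""
    x = opcode >> 6          # bits 7-6
    y = (opcode >> 3) & 7    # bits 5-3
    z = opcode & 7           # bits 2-0
    return x, y, z, y >> 1, y & 1

def ld_rp_nn(pair, nn):
    """Encode LD rp,nn: x=0, z=1, q=0, and p selects the register pair."""
    p = RP.index(pair)
    opcode = (0 << 6) | (p << 4) | (0 << 3) | 1
    return bytes([opcode, nn & 0xFF, (nn >> 8) & 0xFF])

print(oct(0x01), fields(0x01))  # 0o1 (0, 0, 1, 0, 0)
print(oct(0x11), fields(0x11))  # 0o21 (0, 2, 1, 1, 0)
print(ld_rp_nn("DE", 0x1234).hex(" "))  # 11 34 12
```

Seen this way, `01` and `11` from the question differ only in the p field (0 for BC, 1 for DE), which is why the operand can supply a small number that the instruction folds into its opcode.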



Source: https://stackoverflow.com/questions/1305091/writing-a-z80-assembler-lexing-asm-and-building-a-parse-tree-using-composition
