Interpreting custom language

天大地大妈咪最大 posted on 2019-12-04 14:34:17

XML is great for storing relational data in a verbose way. I think it is a terrible candidate for writing logic such as a program, however.

Have you considered embedding an existing scripting language rather than writing your own? For example:

Lua

Python

In one of my projects I actually started with an XML-like language, since I already had an XML parser, and parsed the XML structure into an in-memory expression tree to be interpreted.
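As a rough illustration of that approach (not the code from the project mentioned above), here is a minimal Python sketch that parses a small XML document into a nested-tuple expression tree and then interprets it. The tag names num, add and mul are made up for the example, and the standard library's xml.etree.ElementTree stands in for whatever XML parser you already have.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names (num, add, mul); the original post does not name any.
SOURCE = """
<add>
  <num>2</num>
  <mul>
    <num>3</num>
    <num>4</num>
  </mul>
</add>
"""

def build(element):
    """Turn an XML element into a nested tuple: (operation, child, child, ...)."""
    if element.tag == "num":
        return ("num", float(element.text))
    return (element.tag,) + tuple(build(child) for child in element)

def evaluate(node):
    """Walk the expression tree recursively and compute its value."""
    kind = node[0]
    if kind == "num":
        return node[1]
    values = [evaluate(child) for child in node[1:]]
    if kind == "add":
        return sum(values)
    if kind == "mul":
        result = 1.0
        for value in values:
            result *= value
        return result
    raise ValueError(f"unknown operation: {kind}")

tree = build(ET.fromstring(SOURCE.strip()))
print(evaluate(tree))  # 14.0
```

The point of the split is that evaluate only ever sees the tree, so the XML front end can later be swapped for a different syntax without touching the interpreter.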

This works out very nicely: you get past the problem of tokenizing and parsing text files and can concentrate instead on your 'language' and the logic of its operations. The downside is that writing the source files is a little strange and very wordy. It's also very unnatural for a programmer used to C/C++ syntax.

Eventually you could easily replace your XML with a full-blown scanner and parser that reads a more natural, C++-like text format into the same expression tree.

As for writing a scanner and parser, I found it easier to write these by hand, using simple loops for the scanner and recursive descent for the parser.
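To make the hand-written route concrete, here is a small Python sketch of a loop-based scanner feeding a recursive descent parser. The toy grammar (integers, +, * and parentheses) and all names are assumptions chosen for illustration, not the language from the question.

```python
def scan(text):
    """Scanner: a plain loop that turns characters into (kind, value) tokens."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in "0123456789":
            start = i
            while i < len(text) and text[i] in "0123456789":
                i += 1
            tokens.append(("num", int(text[start:i])))
        elif ch in "+*()":
            tokens.append((ch, ch))
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r}")
    return tokens

class Parser:
    """Recursive descent parser with one method per grammar rule.

    expr   := term ('+' term)*
    term   := factor ('*' factor)*
    factor := NUMBER | '(' expr ')'
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # Kind of the next token, or None at end of input.
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else None

    def take(self, kind):
        # Consume the next token, insisting that it has the expected kind.
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind!r}, found {self.peek()!r}")
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() == "+":
            self.take("+")
            node = ("add", node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() == "*":
            self.take("*")
            node = ("mul", node, self.factor())
        return node

    def factor(self):
        if self.peek() == "(":
            self.take("(")
            node = self.expr()
            self.take(")")
            return node
        return ("num", self.take("num")[1])

def parse(text):
    parser = Parser(scan(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return tree

print(parse("2 + 3 * (4 + 1)"))
# ('add', ('num', 2), ('mul', ('num', 3), ('add', ('num', 4), ('num', 1))))
```

Each grammar rule maps to one method, so operator precedence falls out of which method calls which: expr calls term, and term calls factor.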

That said, ANTLR is great at letting you write out the rules for your language and generating the scanner and parser for you. This makes for a much more flexible language, which can change without you having to refactor everything when new things are added. So it might be worth learning, as it could save you a lot of time in rewrites compared with a hand-written parser.

I'd recommend writing the app in F#. It has many useful features for parsing strings and XML, such as pattern matching and active patterns.

For parsing C-like code I would recommend F# (I just wrote an interpreter in F#, and it works like a charm).

For parsing XML I would recommend C# or F# with the XmlDocument class.

You basically need to work on two files:

  • Operator dictionary
  • Code file in YourLanguage

Load and interpret the operators and then apply them recursively to your code file.
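One possible reading of that two-file setup, sketched in Python rather than F#: the operator dictionary maps names to functions, and the already-parsed code file is a nested list that gets evaluated recursively. The operator names and the nested-list representation are assumptions for illustration; in practice both would be loaded from files.

```python
import operator

# Hypothetical operator dictionary; in the setup described above it would be
# loaded from its own file rather than hard-coded.
OPERATORS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
}

def run(node):
    """Evaluate a node: numbers are themselves, lists are [name, arg, arg]."""
    if isinstance(node, (int, float)):
        return node
    name, *args = node
    func = OPERATORS[name]
    # Evaluate the arguments first, then apply the operator to the results.
    return func(*(run(arg) for arg in args))

# Stand-in for the parsed code file: 1 + 2 * (5 - 3)
program = ["add", 1, ["mul", 2, ["sub", 5, 3]]]
print(run(program))  # 5
```

Keeping the operators in a table means a new operation is a new dictionary entry, not a new branch in the interpreter.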

Aaron Altman

The best prefab answer: S-expressions

C and XML are good first steps. They have sort of opposite disadvantages. The C-like syntax won't add a ton of extra characters, but it's going to be hard to parse due to ambiguity, the variety of tokens, and probably a bunch more issues I can't think of. XML is relatively easy to parse and there's tons of example code, but it will also contain tons of extra text. It might also give you too many options for where to stick language features - for example, is the number of times to repeat a loop an attribute, element or text?

S-expressions are more terse than XML for sure, maybe even than C. At the same time, they're specific to the task of applying operations to data. They don't admit ambiguity. Parsers are simple, and example code is easy to find.
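To give a sense of how little code such a parser needs, here is a minimal Python sketch of an S-expression reader. The repeat and print forms in the sample input are invented for the example, and the reader only knows integers and symbols, with no strings or comments.

```python
def tokenize(source):
    """Split source text into parentheses and atoms."""
    return source.replace("(", " ( ").replace(")", " ) ").split()

def read(tokens):
    """Consume tokens from the front of the list and return one expression."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(read(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # discard the closing ')'
        return items
    if token == ")":
        raise SyntaxError("unexpected ')'")
    try:
        return int(token)
    except ValueError:
        return token  # anything that is not an integer is treated as a symbol

print(read(tokenize("(repeat 3 (print hello))")))
# ['repeat', 3, ['print', 'hello']]
```

Note that the loop count has exactly one place to live here, as the second item of the list, which sidesteps the attribute-versus-element question XML raises.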

This might save you from having to learn too much theory before you start experimenting. I'll emphasize MerickOWA's point that ANTLR and other parser generators are probably a bigger battle than you want to fight right now. See this discussion on programmers.stackexchange for some background on when the full generality of this type of tool could help.
