How to approach parsing through a javascript file?

Submitted by 可紊 on 2019-12-24 18:30:58

Question


I want to parse a JavaScript file and find all the variable declarations, assignments, and calls to functions from a specific library.

What would be the best approach: regular expressions, a lexer, or an existing tool that already does this (does one exist?)

What I actually want is to make sure, through static analysis, that an object's namespace and methods are not modified.


Answer 1:


You cannot do it with regexes, and you probably do not want to write your own implementation of the ECMA-262 standard either (that would be total overkill).
Personally, I like Google's V8 JavaScript engine, and more precisely its Python wrapper, PyV8. I suggest you use it.
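
To illustrate why a regular expression is not enough, here is a minimal sketch in plain Python (no PyV8 needed). The pattern and the JavaScript sample are made up for this illustration and are not taken from any library; the point is only that a pattern cannot tell real code from text inside comments and string literals.

```python
import re

# Hypothetical pattern for "var <name>" declarations (an assumption for this demo).
VAR_DECL = re.compile(r"\bvar\s+([A-Za-z_$][\w$]*)")

# Made-up JavaScript sample: two real declarations and two decoys.
JS_SOURCE = """
var real = 1;
// var inComment = 2;
var text = "var inString = 3;";
"""


def find_var_names(source):
    """Return every name that textually follows the keyword 'var'."""
    return VAR_DECL.findall(source)


names = find_var_names(JS_SOURCE)
print(names)
```

Only `real` and `text` are actually declared, yet the pattern also reports `inComment` and `inString`. Avoiding such false positives requires tokenizing the source properly, which is what a real parser or engine does.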

If you have problems installing it, here are the commands I used (the pip installation failed on my x64 system, so I built from source):

apt-get install subversion scons libboost-python-dev
svn checkout http://v8.googlecode.com/svn/trunk/ v8
svn checkout http://pyv8.googlecode.com/svn/trunk/ pyv8
cd v8
export PyV8=`pwd`
cd ../pyv8
sudo python setup.py build
sudo python setup.py install

As far as I remember, these commands ran without errors for me. (I copy-pasted them, but they worked.)

Answer to the question itself:
A more complex hello world example that lists some variables of the global object:

import PyV8

class Global(PyV8.JSClass):      # define a JavaScript-compatible class
    def hello(self):             # method callable from JavaScript
        print "Hello World"

    def alert(self, message):    # my own alert function: prints type and value
        print type(message), '  ', message

    @property
    def GObject(self):           # expose the global object to scripts by name
        return self

    def __setattr__(self, key, value):
        # log every attribute assignment made on the global object
        super(Global, self).__setattr__(key, value)
        print key, '=', value

G = Global()
ctxt = PyV8.JSContext(G)         # use G as the JavaScript global object
ctxt.enter()
ctxt.eval("var a=hello; GObject.b=1.0; a();")
# enumerate every property of the global object
list_all_cmd = '''for (myKey in GObject){
alert(GObject[myKey]);
}'''
ctxt.eval(list_all_cmd)
ctxt.leave()

(In browsers, the global object is called window.)
This code will output:

b = 1
Hello World
<class '__main__.Global'>    <__main__.Global object at 0x7f202c9159d0>
<class '_PyV8.JSFunction'>    function () { [native code] }
<type 'int'>    1
<class '_PyV8.JSFunction'>    function () { [native code] }
<class '_PyV8.JSFunction'>    function () { [native code] }
<class '_PyV8.JSFunction'>    function () { [native code] }
<class '_PyV8.JSFunction'>    function () { [native code] }
<class '_PyV8.JSFunction'>    function () { [native code] }
<class '_PyV8.JSFunction'>    function () { [native code] }
<class '_PyV8.JSFunction'>    function () { [native code] }
<class '_PyV8.JSFunction'>    function () { [native code] }
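
The `b = 1` line in the output comes from the `__setattr__` hook, so the same idea can be used to notice writes to names you care about. The following is a minimal sketch of that hook in plain Python, without PyV8; `WatchedNamespace`, `protected` and `violations` are hypothetical names introduced here. Note that this detects modifications at run time rather than by static analysis, and it assumes the engine routes writes on the global object through `__setattr__`, as the output above suggests for `GObject.b`.

```python
class WatchedNamespace(object):
    """Hypothetical stand-in for the Global class above; records blocked writes."""

    def __init__(self, protected):
        # Bypass our own hook while setting up internal state.
        object.__setattr__(self, "_protected", frozenset(protected))
        object.__setattr__(self, "violations", [])

    def __setattr__(self, key, value):
        if key in self._protected:
            # Record the attempted overwrite instead of applying it.
            self.violations.append(key)
            return
        object.__setattr__(self, key, value)


ns = WatchedNamespace(protected=["hello"])
ns.b = 1.0        # allowed: "b" is not a protected name
ns.hello = None   # blocked: recorded as a violation
print(ns.violations)
```

After the script has run, a non-empty `violations` list means something tried to overwrite a protected name.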



Answer 2:


You can use Rhino from Mozilla. It is a JavaScript implementation written in Java. Releases from 1.7R3 onwards include a new AST API; the classes are in the org.mozilla.javascript.ast package.

If you want to do this in JavaScript, see the discussion "JavaScript parser in JavaScript".

Hope it helps.



Source: https://stackoverflow.com/questions/11878691/how-to-approach-parsing-through-a-javascript-file
