Parse GATE Document to get Co-Reference Text

拈花ヽ惹草 submitted on 2020-01-13 06:04:28

Question


I'm creating a GATE app that finds co-reference text. It works fine, and I have exported it as a zip file using the export option provided in GATE.

Now I'm trying to use the same app from my Java code:

    // Initialise GATE from the unpacked export directory
    Gate.runInSandbox(true);
    Gate.setGateHome(new File(gateHome));
    Gate.setPluginsHome(new File(gateHome, "plugins"));
    Gate.init();
    URL applicationURL = new URL("file:" + new Path(gateHome, "application.xgapp").toString());

    // Load the saved application and attach a fresh corpus to it
    application = (CorpusController) PersistenceManager.loadObjectFromUrl(applicationURL);
    corpus = Factory.newCorpus("Megaki Corpus");
    application.setCorpus(corpus);

    // Wrap the input text in a GATE document
    Document document = Factory.newDocument(text);

    // Run the pipeline over the single document, then empty the corpus
    corpus.add(document);
    application.execute();
    corpus.clear();

How can I now parse this document and get the co-reference text?


Answer 1:


I don't know how your pipeline stores them, but co-references created manually with the Co-reference Editor are stored in a document feature. The feature name appears to be "MatchesAnnots", and its value has the type Map<String, List<List<Integer>>>: a map from annotation set name to a list of chains, where each chain is a list of annotation IDs.

In my case, the following code prints "as name: null" (null denotes the default annotation set), followed by every co-reference chain stored for that set.

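    import java.util.List;
    import java.util.Map;
    import java.util.Map.Entry;

    // The feature value is null if the document carries no co-reference information
    Object obj = document.getFeatures().get("MatchesAnnots");

    @SuppressWarnings("unchecked")
    Map<String, List<List<Integer>>> map = (Map<String, List<List<Integer>>>) obj;

    // One entry per annotation set; each value is the list of chains in that set
    for (Entry<String, List<List<Integer>>> e : map.entrySet()) {
        System.err.println("as name: " + e.getKey());
        for (List<Integer> chain : e.getValue()) {
            System.err.println("chain : " + chain);
        }
    }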
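The chains printed above contain only annotation IDs, so to get the actual co-reference text each ID still has to be mapped back to the span of the document it covers. The following is a minimal sketch of that step using plain Java collections rather than GATE classes, so that it runs on its own. `CorefChainText`, `Span` and `chainText` are hypothetical names introduced here for illustration. In a real GATE program the lookup would presumably be `document.getAnnotations(asName).get(id)` (or `document.getAnnotations()` when the set name is null), and the text could be read with `gate.Utils.stringFor(document, annotation)`; that mapping is an assumption and has not been checked against your pipeline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CorefChainText {

    /** Hypothetical stand-in for a GATE annotation: just its start and end offsets. */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Resolves one co-reference chain (a list of annotation IDs) to the text
     * covered by each ID. IDs without a known span are skipped.
     */
    static List<String> chainText(String text, Map<Integer, Span> spansById, List<Integer> chain) {
        List<String> mentions = new ArrayList<>();
        for (Integer id : chain) {
            Span span = spansById.get(id);
            if (span != null) {
                mentions.add(text.substring(span.start, span.end));
            }
        }
        return mentions;
    }

    public static void main(String[] args) {
        String text = "John Smith arrived. Mr. Smith left early.";

        // Made-up IDs and offsets standing in for annotations in the default set
        Map<Integer, Span> spansById = new HashMap<>();
        spansById.put(1, new Span(0, 10));   // "John Smith"
        spansById.put(2, new Span(20, 29));  // "Mr. Smith"

        // Same shape as the "MatchesAnnots" feature: set name -> chains -> IDs
        Map<String, List<List<Integer>>> matches = new HashMap<>();
        matches.put(null, Arrays.asList(Arrays.asList(1, 2)));

        for (Map.Entry<String, List<List<Integer>>> e : matches.entrySet()) {
            for (List<Integer> chain : e.getValue()) {
                System.out.println(chainText(text, spansById, chain));
            }
        }
    }
}
```

Skipping unknown IDs instead of throwing is a design choice for this sketch; in your own code you may prefer to fail loudly if a chain refers to an annotation that is no longer in the set.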


Source: https://stackoverflow.com/questions/27035648/parse-gate-document-to-get-co-reference-text
