CoreNLP Stanford Dependency Format

Submitted by 丶灬走出姿态 on 2019-12-13 16:43:26

Question


Bills on ports and immigration were submitted by Senator Brownback, Republican of Kansas

From the above sentence, I am looking to obtain the following typed dependencies:

nsubjpass(submitted, Bills)
auxpass(submitted, were)
agent(submitted, Brownback)
nn(Brownback, Senator)
appos(Brownback, Republican)
prep_of(Republican, Kansas)
prep_on(Bills, ports)
conj_and(ports, immigration)
prep_on(Bills, immigration)

This should be possible per Table 1 and Figure 1 of the Stanford Dependencies documentation.

Using the code below, I have only been able to obtain the following dependencies (this is the code's output):

root(ROOT-0, submitted-7)
nmod:on(Bills-1, ports-3)
nmod:on(Bills-1, immigration-5)
case(ports-3, on-2)
cc(ports-3, and-4)
conj:and(ports-3, immigration-5)
nsubjpass(submitted-7, Bills-1)
auxpass(submitted-7, were-6)
nmod:agent(submitted-7, Brownback-10)
case(Brownback-10, by-8)
compound(Brownback-10, Senator-9)
punct(Brownback-10, ,-11)
appos(Brownback-10, Republican-12)
nmod:of(Republican-12, Kansas-14)
case(Kansas-14, of-13)

Question - How do I achieve the desired output above?

Code

import java.util.Collection;
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.AnnotationPipeline;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations.EnhancedPlusPlusDependenciesAnnotation;
import edu.stanford.nlp.trees.TypedDependency;
import edu.stanford.nlp.util.CoreMap;
import edu.stanford.nlp.util.PropertiesUtils;

public void processTestCoreNLP() {
    String text = "Bills on ports and immigration were submitted " +
            "by Senator Brownback, Republican of Kansas";

    Annotation annotation = new Annotation(text);
    Properties properties = PropertiesUtils.asProperties(
            "annotators", "tokenize,ssplit,pos,lemma,depparse"
    );

    AnnotationPipeline pipeline = new StanfordCoreNLP(properties);

    pipeline.annotate(annotation);

    for (CoreMap sentence : annotation.get(SentencesAnnotation.class)) {
        SemanticGraph sg = sentence.get(EnhancedPlusPlusDependenciesAnnotation.class);
        Collection<TypedDependency> dependencies = sg.typedDependencies();
        for (TypedDependency td : dependencies) {
            System.out.println(td);
        }
    }
}

Answer 1:


If you want to get the CCprocessed and collapsed Stanford Dependencies (SD) for a sentence through the NN dependency parser, you'll have to set a property to circumvent a small bug in CoreNLP.

However, please note that we are no longer maintaining the Stanford Dependencies code and unless you have really good reasons to use SD, we'd recommend using Universal Dependencies for any new projects. Take a look at the Universal Dependencies (UD) documentation and Schuster and Manning (2016) for more information on the UD representation.

To obtain the CCprocessed and collapsed SD representation, set the depparse.language property as follows:

public void processTestCoreNLP() {
  String text = "Bills on ports and immigration were submitted " +
        "by Senator Brownback, Republican of Kansas";

  Annotation annotation = new Annotation(text);
  Properties properties = PropertiesUtils.asProperties(
        "annotators", "tokenize,ssplit,pos,lemma,depparse");

  properties.setProperty("depparse.language", "English");

  AnnotationPipeline pipeline = new StanfordCoreNLP(properties);

  pipeline.annotate(annotation);

  for (CoreMap sentence : annotation.get(SentencesAnnotation.class)) {
    SemanticGraph sg = sentence.get(CollapsedCCProcessedDependenciesAnnotation.class);
    Collection<TypedDependency> dependencies = sg.typedDependencies();
    for (TypedDependency td : dependencies) {
      System.out.println(td);
    }
  }
}



Answer 2:


CoreNLP recently switched from the old Stanford Dependencies format (the format in your desired output) to Universal Dependencies. My first recommendation is to use the new format if at all possible. Future development on the parsers will target Universal Dependencies, and the new format is in many ways similar to the old one, modulo cosmetic changes (e.g., prep becomes nmod).

However, if you'd like to get the old dependency format out, you can do so with the CollapsedCCProcessedDependenciesAnnotation annotation.
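Concretely, the only change relative to the question's code is the annotation class requested from the sentence (you may still need the `depparse.language` workaround from the first answer). A minimal sketch of the sentence loop, using the same pipeline as in the question:

```java
// Inside the question's sentence loop: request the collapsed,
// CC-processed Stanford Dependencies graph instead of the
// enhanced++ Universal Dependencies graph.
for (CoreMap sentence : annotation.get(SentencesAnnotation.class)) {
    SemanticGraph sg = sentence.get(
            SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation.class);
    for (TypedDependency td : sg.typedDependencies()) {
        System.out.println(td);
    }
}
```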



Source: https://stackoverflow.com/questions/45202486/corenlp-stanford-dependency-format
