define CRF++ template file

淺唱寂寞╮ submitted on 2019-12-20 06:04:10

Question


This is my issue, but it doesn't say HOW to define the template file correctly.

My training file looks like this:

上   B-NR
海   L-NR
浦   B-NR
东   L-NR
开   B-NN
发   L-NN
与   U-CC
法   B-NN
制   L-NN
建   B-NN
...

Answer 1:


CRF++ is extremely easy to use. The instructions on the website explain it clearly.

http://crfpp.googlecode.com/svn/trunk/doc/index.html

Suppose we extract features for the line "东 L-NR", so that line is the current line (row offset 0).

Unigram

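U02:%x[0,0] # column 0 of the current line (东)

U03:%x[1,0] # column 0 of the next line (开)

So for U03 the underlying feature is "column 0 of the next line = 开"; CRF++ expands the template to the string U03:开.

Bigram templates (lines starting with B) are written the same way; they additionally combine the current output label with the previous one.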
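The expansion described above can be sketched in a few lines of Python. This is only an illustration and not CRF++'s own code: `expand` is a hypothetical helper, and the `_B-1` / `_B+1` boundary placeholders are my assumption about what CRF++ substitutes when an offset runs off either end of the sentence.

```python
import re

# Matches a CRF++ template macro such as %x[-1,0] and captures (row, col).
MACRO = re.compile(r"%x\[(-?\d+),(\d+)\]")

def expand(template, rows, pos):
    """Hypothetical helper: fill in every %x[row,col] relative to token `pos`."""
    def fill(match):
        row = pos + int(match.group(1))
        col = int(match.group(2))
        if row < 0:
            return f"_B{row}"                    # assumed start marker, e.g. _B-1
        if row >= len(rows):
            return f"_B+{row - len(rows) + 1}"   # assumed end marker, e.g. _B+1
        return rows[row][col]
    return MACRO.sub(fill, template)

# The first lines of the training file from the question.
rows = [line.split() for line in """上 B-NR
海 L-NR
浦 B-NR
东 L-NR
开 B-NN
发 L-NN""".splitlines()]

pos = 3  # the line "东 L-NR"
print(expand("U02:%x[0,0]", rows, pos))           # U02:东
print(expand("U03:%x[1,0]", rows, pos))           # U03:开
print(expand("U04:%x[-1,0]/%x[0,0]", rows, pos))  # U04:浦/东
```

The last call shows how one template line can combine several macros into a single feature string.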




Answer 2:


It seems this issue comes from a misunderstanding of how CRF++ processes the training file. Your features must not include the values in the last column: those are the labels! If you included them in your features, your model would be trivially perfect. Because your training file has only two columns, and the second one is the label, your template file can only contain rules of the form %x[n,0]. It is also hardcoded into CRF++ (though not clearly documented, as far as I can tell) that -4 <= n <= 4.
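To make these constraints concrete, here is a minimal sketch of a checker for a template written against the two-column file from the question. `check_template` is a hypothetical helper, not part of CRF++; the ±4 limit is taken from the statement above rather than verified against the CRF++ source, and the sample template is just one plausible layout.

```python
import re

# Matches a CRF++ template macro such as %x[-1,0] and captures (row, col).
MACRO = re.compile(r"%x\[(-?\d+),(\d+)\]")

def check_template(template_text, n_columns, max_offset=4):
    """Hypothetical checker: list rules that reference the label column
    or a row offset outside the assumed +/-max_offset window.

    n_columns counts every column of the training file, label included.
    """
    problems = []
    n_features = n_columns - 1  # the last column holds the label
    for lineno, line in enumerate(template_text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        for row, col in MACRO.findall(line):
            row, col = int(row), int(col)
            if col >= n_features:
                problems.append(f"line {lineno}: column {col} is the label or does not exist")
            if abs(row) > max_offset:
                problems.append(f"line {lineno}: row offset {row} is outside ±{max_offset}")
    return problems

# One possible template for a file with a single feature column (column 0).
template = """# Unigram
U00:%x[-2,0]
U01:%x[-1,0]
U02:%x[0,0]
U03:%x[1,0]
U04:%x[2,0]
U05:%x[-1,0]/%x[0,0]
U06:%x[0,0]/%x[1,0]

# Bigram
B
"""

print(check_template(template, n_columns=2))                     # no problems reported
print(check_template("U00:%x[0,1]\nU01:%x[5,0]", n_columns=2))   # two problems reported
```

The second call fails on both counts: %x[0,1] points at the label column, and %x[5,0] exceeds the assumed window.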



Source: https://stackoverflow.com/questions/28792024/define-crf-template-file
