Using Conditional Random Fields for Named Entity Recognition

Submitted by 混江龙づ霸主 on 2019-12-05 22:11:54

Question


What is a Conditional Random Field (CRF)? How exactly does a Conditional Random Field identify proper names as persons, organizations, or places in structured or unstructured text?

For example: This product is ordered by StackOverFlow Inc.

What does a Conditional Random Field do to identify StackOverFlow Inc. as an organization?


Answer 1:


A CRF is a discriminative, batch, tagging model, in the same general family as a Maximum Entropy Markov model.

A full explanation is book-length.

A short explanation is as follows:

  1. Humans annotate 200-500K words of text, marking the entities.
  2. Humans select a set of features that they hope indicate entities, such as capitalization or whether the word was seen in the training set with a tag.
  3. A training procedure counts all the occurrences of the features.
  4. The meat of the CRF algorithm searches the space of all possible models that fit the counts to find a pretty good one.
  5. At runtime, a decoder (probably a Viterbi decoder) looks at a sentence and decides what tag to assign to each word.
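The decoding step can be sketched as a small Viterbi decoder in Python. This is only an illustration under loud assumptions: the two-tag set, the `emission_score` function, and every number in `TRANSITION` are made up by hand here, whereas a real CRF would learn its weights from the annotated text in steps 1-4.

```python
TAGS = ["O", "ORG"]  # assumed tag set: "O" = not an entity, "ORG" = organization

# Hypothetical transition weights (previous tag, current tag); not learned.
TRANSITION = {
    ("O", "O"): 0.5, ("O", "ORG"): 0.0,
    ("ORG", "O"): 0.0, ("ORG", "ORG"): 1.0,
}

def emission_score(word, tag):
    """Hand-set score for giving `word` the tag `tag` (illustrative only)."""
    if tag == "O":
        return 1.5  # constant bias toward "not an entity"
    score = 0.0
    if word[:1].isupper():
        score += 1.0  # capitalized first letter
    if any(c.isupper() for c in word[1:]):
        score += 2.0  # capitals inside the word, as in "StackOverFlow"
    if word in {"Inc.", "Corp.", "Ltd."}:
        score += 3.0  # typical company suffix
    return score

def viterbi(words):
    """Return the highest-scoring tag sequence for `words`."""
    if not words:
        return []
    # best[i][tag] = best score of any tag sequence ending in `tag` at position i
    best = [{t: emission_score(words[0], t) for t in TAGS}]
    back = []  # back[i][tag] = previous tag on that best sequence
    for word in words[1:]:
        scores, pointers = {}, {}
        for t in TAGS:
            prev = max(TAGS, key=lambda p: best[-1][p] + TRANSITION[(p, t)])
            scores[t] = best[-1][prev] + TRANSITION[(prev, t)] + emission_score(word, t)
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    tag = max(TAGS, key=lambda t: best[-1][t])
    path = [tag]
    for pointers in reversed(back):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]

words = "This product is ordered by StackOverFlow Inc.".split()
print(list(zip(words, viterbi(words))))
```

With these hand-set weights, the decoder tags "StackOverFlow" and "Inc." as ORG and every other word as O. The transition weight for ORG followed by ORG is what lets the two words reinforce each other as a single entity.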

The hard parts of this are feature selection and the search algorithm in step 4.
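To give a feel for what feature selection looks like in practice, here is a minimal Python sketch of a per-word feature function. The feature names and the choice of features are assumptions made for illustration; they are not taken from any particular CRF toolkit.

```python
def word_features(tokens, i):
    """Return a dict of hand-picked features for tokens[i] (names are illustrative)."""
    word = tokens[i]
    feats = {
        "lower": word.lower(),
        "is_capitalized": word[:1].isupper(),
        "is_all_caps": word.isupper(),
        "has_inner_capital": any(c.isupper() for c in word[1:]),
        "suffix3": word[-3:].lower(),
        "ends_with_period": word.endswith("."),
    }
    # Context features: neighbouring words often signal an entity boundary.
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["is_first"] = True
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["is_last"] = True
    return feats

tokens = "This product is ordered by StackOverFlow Inc.".split()
features = [word_features(tokens, i) for i in range(len(tokens))]
print(features[5])  # features for "StackOverFlow"
```

A list of such dicts per sentence, paired with the human-assigned tags, is the kind of input a CRF training procedure would typically consume.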




Answer 2:


Well, to understand that you have to study a lot of things.
For a start:

Understand the basics of Markov and Bayesian networks.
An online course by Daphne Koller is available on Coursera:
https://class.coursera.org/pgm/lecture/index

A CRF is a special type of Markov network in which we have observations and hidden states.
The objective is to find the best state assignment to the unobserved variables, also known as the MAP (maximum a posteriori) problem.
Be prepared for a lot of probability and optimization. :-)
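For reference, the commonly used linear-chain form of a CRF and the MAP objective can be written as follows. The notation is an assumption chosen here, not something from the question: x is the observed word sequence, y the hidden tag sequence, f_k the feature functions, and λ_k their learned weights.

```latex
p(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \Big),
\qquad
Z(x) = \sum_{y'} \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, x, t) \Big)

y^{*} = \arg\max_{y} \; p(y \mid x)
```

Because Z(x) does not depend on y, the MAP assignment y* can be found by maximizing the summed scores alone, which is what a Viterbi-style dynamic program does.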



Source: https://stackoverflow.com/questions/1965789/using-conditional-random-fields-for-named-entity-recognition
