machine learning to overcome typo errors [closed]

Submitted by 余生长醉 on 2019-12-21 23:13:46

Question


I have a list of medicine names (say crocin, seroflo, oxitab, etc.). The list is very long. Now suppose I need to find whether a particular medicine is present in the list, but the query may contain typos. For example, I intend to find crocin but type crosin instead. I want a machine learning algorithm to overcome this typographical error, so that for small differences like crocin vs. crosin it reports a match.


Answer 1:


I don't think you need machine learning; a simple edit distance algorithm should do that.

https://en.wikipedia.org/wiki/Edit_distance
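
As a rough illustration of the edit distance idea, here is a minimal Python sketch using the Levenshtein variant (insertions, deletions and substitutions each cost 1). The function names `edit_distance` and `find_matches` and the `max_distance=1` threshold are assumptions made for this example, not something from the question; a real list of medicine names would probably need the threshold tuned.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (unit-cost edits)."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca -> cb (or keep)
            ))
        previous = current
    return previous[-1]


def find_matches(query, names, max_distance=1):
    """Return names within max_distance edits of query, closest first.

    max_distance is a hypothetical threshold chosen for illustration.
    """
    query = query.lower()
    scored = [(edit_distance(query, name.lower()), name) for name in names]
    return [name for dist, name in sorted(scored) if dist <= max_distance]


medicines = ["crocin", "seroflo", "oxitab"]
print(find_matches("crosin", medicines))
```

This compares the query against every name, which is fine for a few thousand entries. For a much longer list, an index such as a BK-tree or a trie-based search would avoid the full scan.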




Answer 2:


I agree that the need for ML methods here is doubtful. But if you really want to use a learning-based method for "spelling correction" (I am not sure whether this works well for medicine names), you can refer to the papers below:

A winnow-based approach to context-sensitive spelling correction

An improved error model for noisy channel spelling correction

A large scale ranker-based system for search query spelling correction

A discriminative model for query spelling correction with latent structural SVM

A Graph Approach to Spelling Correction in Domain-Centric Search.

And this paper is about correction for person names:

Hashing-based approaches to spelling correction of personal names



Source: https://stackoverflow.com/questions/18329826/machine-learning-to-overcome-typo-errors
