Question
How can I tag text by word category using WordNet (with Java as the interface)?
Example
Consider the sentences:
1) Computers need keyboard, monitor, CPU to work.
2) Automobile uses gears and clutch.
My objective is to tag the example sentences as follows:
- 1st sentence
Computer/electronic
keyboard/electronic
CPU/electronic
- 2nd sentence
Automobile/mechanical
gears/mechanical
clutch/mechanical
"Clutch and gear is monitored using microchip" -> clutch/mechanical, gear/mechanical, microchip/electronic
"software used here to monitor hydrogen levels" -> software/computer, hydrogen/chemistry
I want to implement the objective above in Java, that is, to tag nouns with their related category, such as technical, mechanical, electrical, etc.
How can I do this using WordNet?
My Previous Work
To achieve my objective, I created an index of terms in a text file for each category and matched it against a title. If the title contains a word from a category's text file, the title is classified under that category.
For example:
Automobile.txt contains: car, gear, wheel, clutch
networking.txt contains: server, IP Address, TCP, RIP
This is the algorithm:
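// Returns the category of the last word list that matches the title,
// or null if no word list matches.
String classify(String title)
{
    // Must be initialized; otherwise "return area" does not compile.
    String area = null;
    if (compareWordsFrom("Automobile.txt", title)) area = "Auto";
    if (compareWordsFrom("networking.txt", title)) area = "Networking";
    if (compareWordsFrom("metels.txt", title)) area = "Metallurgy";
    return area;
}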
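The question does not show compareWordsFrom, so the following is only a minimal sketch of what such a helper might look like, assuming each category file holds comma- or newline-separated terms. The class name WordListMatcher and the methods containsAny and normalize are hypothetical names introduced for illustration; Files.readString requires Java 11 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class WordListMatcher {

    // Hypothetical helper: true if any term listed in the file occurs in the title.
    // Assumes the file contains comma- or newline-separated terms.
    static boolean compareWordsFrom(String fileName, String title) {
        try {
            String content = Files.readString(Path.of(fileName));
            return containsAny(Arrays.asList(content.split("[,\\r\\n]+")), title);
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
    }

    // Whole-word, case-insensitive match; multi-word terms such as "IP Address" also work.
    static boolean containsAny(List<String> terms, String title) {
        String paddedTitle = " " + normalize(title) + " ";
        for (String term : terms) {
            String normalized = normalize(term);
            if (!normalized.isEmpty() && paddedTitle.contains(" " + normalized + " ")) {
                return true;
            }
        }
        return false;
    }

    // Lowercases the text and collapses every run of non-alphanumeric characters to one space.
    static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]+", " ").trim();
    }
}
```

Because the match is on whole words, a term such as "gear" does not match "gears" in a title. Handling plurals would need stemming or lemmatization, which this sketch deliberately leaves out.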
It is very difficult to find related words to build the index. For example, the automobile field has thousands of related terms, which are hard to collect.
To be precise, building the index of terms manually is a heart-breaking process.
I have already tried Stanford NLP and OpenNLP, but they only tag parts of speech (POS), which does not satisfy my need.
My Need
I need an automated way to do this. Are Natural Language Processing techniques able to do it?
Some people suggest using a WordNet library, but how can I use it, since it works like a dictionary? What I want is something like:
mechanical = {gear, turbine, engine, ...}
electronic = {microchip, RAM, ROM, ...}
Is there any word database available with the structure shown above?
Or is there a ready-made library available?
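To make the desired structure concrete, here is a minimal sketch of how category sets of the form mechanical = {...} could be held in Java and used to produce word/category tags. The class CategoryTagger, its methods, and the tiny hand-written term sets are assumptions for illustration only, taken from the examples in this question. This is not WordNet and not an existing library; it only shows the lookup that such a database would have to feed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class CategoryTagger {

    // Hand-written sample data from the question's examples; not a real lexical database.
    static final Map<String, Set<String>> CATEGORIES = Map.of(
            "mechanical", Set.of("gear", "clutch", "turbine", "engine"),
            "electronic", Set.of("microchip", "ram", "rom"));

    // Inverted index (term -> category) so each token needs only one lookup.
    static final Map<String, String> TERM_TO_CATEGORY = invert(CATEGORIES);

    // Assumes no term belongs to more than one category.
    static Map<String, String> invert(Map<String, Set<String>> categories) {
        Map<String, String> index = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : categories.entrySet()) {
            for (String term : entry.getValue()) {
                index.put(term, entry.getKey());
            }
        }
        return index;
    }

    // Returns "word/category" for every token with a known category, in sentence order.
    static List<String> tag(String sentence) {
        List<String> tags = new ArrayList<>();
        for (String token : sentence.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            String category = TERM_TO_CATEGORY.get(token);
            if (category != null) {
                tags.add(token + "/" + category);
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        System.out.println(tag("Clutch and gear is monitored using microchip"));
    }
}
```

Running main prints [clutch/mechanical, gear/mechanical, microchip/electronic]. Tokens are matched exactly after lowercasing, so inflected forms such as "gears" are not tagged, and the sketch does not check whether a token is a noun.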