Is there an open source Java library or algorithm for determining whether a given piece of text is a question?
I am working on a question answering system that needs t
Many quasi-questions and requests for information are posed in the grammatical form of a statement, e.g. "I would like to know who stole my bike".
I would abandon all hope of determining from its structure whether the user's input is a question, and simply assume it is a question unless it is unambiguously not a question. You could take an iterative, interactive approach so the system could refine its "understanding" of the user's input:
User: I would like to know how many angels fit on the head of a pin.
System: Do you have a question about angels?
User: Yes.
System: Do you want to know if angels are fit?
User: No.
System: Do you want to know if angels have heads?
User: Possibly.
System: Do you want to know if angels have pins?
User: No.
System: Do you want to know if angels are numerous?
User: No.
System: Do you want to know the dimensions of an angel?
User: Yes.
System: Do you mean angels in paintings?
User: No.
System: Do you mean angels in myth and religious writing?
User: Yes.
System: Angels are metaphysical beings.

User: I hear that Pennsylvania was founded by William Penn. Is that true?
System: Do you have a question about hearing?
User: No.
System: Do you have a question about Pennsylvania?
User: Yes.
System: Pennsylvania was founded by William Penn.
User: When?
System: 1682.
User: What does the name mean?
System: What name?
User: Pennsylvania!
System: Do you want to know the meaning of Pennsylvania?
User: Yes.
System: Pennsylvania means Penn's Woods.
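
To make the idea concrete, here is a minimal Java sketch of the two pieces described above: treating input as a question unless it is unambiguously not one, and walking through candidate topics until the user confirms one. Everything in it is an assumption for illustration, not an existing library: the class name `QuestionGuesser`, the `NON_QUESTIONS` word list, and the `confirmTopic` helper are all hypothetical, and how candidate topics get extracted from the input is left out entirely.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical helper, not part of any existing library.
final class QuestionGuesser {

    // Assumed list of replies that are unambiguously NOT questions.
    // A real system would need a much richer list.
    private static final List<String> NON_QUESTIONS =
            List.of("yes", "no", "possibly", "thanks", "thank you", "ok", "hello", "bye");

    // Default to "question" unless the input is empty or a known non-question.
    static boolean isLikelyQuestion(String input) {
        if (input == null) {
            return false;
        }
        // Lowercase and strip trailing periods and exclamation marks.
        String normalized = input.trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("[.!]+$", "");
        if (normalized.isEmpty()) {
            return false;
        }
        return !NON_QUESTIONS.contains(normalized);
    }

    // Ask about each candidate topic in order and return the first one the
    // user confirms. The predicate stands in for showing the prompt to the
    // user and reading back a yes/no reply.
    static Optional<String> confirmTopic(List<String> candidates,
                                         Predicate<String> userConfirms) {
        for (String topic : candidates) {
            String prompt = "Do you have a question about " + topic + "?";
            if (userConfirms.test(prompt)) {
                return Optional.of(topic);
            }
        }
        return Optional.empty();
    }
}
```

With this sketch, `QuestionGuesser.isLikelyQuestion("I would like to know who stole my bike")` returns `true`, while `"Yes."` returns `false`. Passing the candidates `hearing` and `Pennsylvania` to `confirmTopic` mirrors the second dialogue: the system asks about each in turn and stops at the first one the user accepts.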