Is there a way, in Objective-C/Cocoa, to convert spelled-out number words to an NSNumber or equivalent, in multiple languages?
For example:
convert
NSNumberFormatter can convert from text to numbers:
NSNumberFormatter *formatter = [[NSNumberFormatter alloc] init];
formatter.numberStyle = NSNumberFormatterSpellOutStyle;
// The formatter uses the current locale by default, so these two calls assume it is English.
NSLog(@"%@", [formatter numberFromString:@"thirty-four"]);
NSLog(@"%@", [formatter numberFromString:@"three point five"]);
// Switch to Spanish before parsing Spanish words.
formatter.locale = [[NSLocale alloc] initWithLocaleIdentifier:[NSLocale localeIdentifierFromComponents:@{NSLocaleLanguageCode: @"es"}]];
NSLog(@"%@", [formatter numberFromString:@"ocho"]);
It has serious limitations: it doesn't auto-detect the language, it fails if the input deviates from the expected format (e.g. "thirty four" instead of "thirty-four"), it doesn't handle fractions, and so on. Within that narrow domain, though, it appears to do the job.
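Since the formatter won't detect the language for you, one workaround is to try a list of candidate locales in turn and take the first one that parses. The following is only a minimal sketch, assuming you know in advance which languages to expect; `ParseSpelledOutNumber` is a hypothetical helper name, not part of Foundation.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): tries each locale identifier in
// order and returns the first successful spell-out parse, or nil if none works.
static NSNumber *ParseSpelledOutNumber(NSString *text, NSArray *localeIdentifiers)
{
    NSNumberFormatter *formatter = [[NSNumberFormatter alloc] init];
    formatter.numberStyle = NSNumberFormatterSpellOutStyle;

    for (NSString *identifier in localeIdentifiers)
    {
        formatter.locale = [NSLocale localeWithLocaleIdentifier:identifier];
        NSNumber *number = [formatter numberFromString:text];
        if (number != nil)
        {
            return number;
        }
    }
    return nil;
}
```

You would call it as `ParseSpelledOutNumber(@"ocho", @[@"en", @"es"])`. The order of the identifiers matters, because the same word could be a valid number in more than one language, and this does nothing about the formatting or fraction limitations.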
NSLinguisticTagger will flag numbers for you in multiple languages.
NSArray *texts = @[@"It's 3 degrees outside", @"Ocho tacos", @"What is 3 1/2?", @"ocho"];
NSLinguisticTaggerOptions options = NSLinguisticTaggerOmitWhitespace | NSLinguisticTaggerJoinNames;

// Combine the tag schemes available for English and Spanish (built once, outside the loop).
NSArray *tagSchemes = [NSLinguisticTagger availableTagSchemesForLanguage:@"en"];
tagSchemes = [tagSchemes arrayByAddingObjectsFromArray:[NSLinguisticTagger availableTagSchemesForLanguage:@"es"]];

for (NSString *text in texts)
{
    NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc] initWithTagSchemes:tagSchemes
                                                                        options:options];
    [tagger setString:text];
    // Log each token with its tag; look for tokens tagged NSLinguisticTagNumber.
    [tagger enumerateTagsInRange:NSMakeRange(0, [text length])
                          scheme:NSLinguisticTagSchemeNameTypeOrLexicalClass
                         options:options
                      usingBlock:^(NSString *tag, NSRange tokenRange, NSRange sentenceRange, BOOL *stop)
    {
        NSString *token = [text substringWithRange:tokenRange];
        NSLog(@"%@: %@", token, tag);
    }];
}
This still leaves you with the task of deciding how and when to do things like fraction resolution.
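As a rough illustration of what fraction resolution might look like, here is a sketch that turns a digit string such as "3 1/2" into 3.5. It assumes you have already isolated the numeric part of the text (for instance from the token ranges above) and only handles non-negative values written with digits; `ResolveMixedNumber` is a hypothetical helper, not a Foundation API.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: resolves "3", "1/2" or "3 1/2" to an NSNumber.
// Returns nil for anything else (negative values, trailing text, zero denominator).
static NSNumber *ResolveMixedNumber(NSString *text)
{
    // NSScanner skips whitespace between tokens by default.
    NSScanner *scanner = [NSScanner scannerWithString:text];

    NSInteger first = 0;
    if (![scanner scanInteger:&first] || first < 0) return nil;
    if ([scanner isAtEnd]) return @((double)first);            // plain integer, e.g. "3"

    NSInteger whole = 0, numerator = first, denominator = 0;
    if (![scanner scanString:@"/" intoString:NULL])
    {
        // No slash yet, so the first value was the whole part: "3 1/2".
        whole = first;
        if (![scanner scanInteger:&numerator] || numerator < 0) return nil;
        if (![scanner scanString:@"/" intoString:NULL]) return nil;
    }

    if (![scanner scanInteger:&denominator] || denominator <= 0) return nil;
    if (![scanner isAtEnd]) return nil;                         // reject trailing text

    return @((double)whole + (double)numerator / (double)denominator);
}
```

With this, `ResolveMixedNumber(@"3 1/2")` yields 3.5 and `ResolveMixedNumber(@"1/2")` yields 0.5, while input with trailing punctuation such as "3 1/2?" returns nil. Deciding which tokens to feed it is still up to you.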