What algorithm is used for finding ngrams?
Supposing my input data is an array of words and the size of the n-grams I want to find, what algorithm should I use?
Simple, here's the Java answer. It prints the character n-grams of a string for every size from 1 up to ngrams:
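int ngrams = 9; // largest n-gram size; 9 is the length of "bonasuera"
String string = "bonasuera";
// For each size j from 1 to ngrams, slide a window of j characters across the string.
for (int j = 1; j <= ngrams; j++) {
    // The last valid start index is string.length() - j.
    for (int k = 0; k < string.length() - j + 1; k++) {
        System.out.print(string.substring(k, k + j) + " ");
    }
    System.out.println();
}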
Output:
b o n a s u e r a
bo on na as su ue er ra
bon ona nas asu sue uer era
bona onas nasu asue suer uera
bonas onasu nasue asuer suera
bonasu onasue nasuer asuera
bonasue onasuer nasuera
bonasuer onasuera
bonasuera
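
The question asks about an array of words rather than the characters of one string. The same sliding-window idea should carry over: move a window of n words across the array and join each window. Here is a minimal sketch of that, assuming the n-grams should come back as space-joined strings; the class name WordNgrams and the method name ngrams are made up for illustration and are not from the answer above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WordNgrams {
    // Returns every run of n consecutive words, each joined by single spaces.
    // Returns an empty list when n is not positive or exceeds the number of words.
    static List<String> ngrams(String[] words, int n) {
        List<String> result = new ArrayList<>();
        if (n <= 0) {
            return result;
        }
        // i is the start of the window; the window covers words[i] .. words[i + n - 1].
        for (int i = 0; i + n <= words.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return result;
    }

    public static void main(String[] args) {
        String[] words = {"the", "quick", "brown", "fox"};
        System.out.println(ngrams(words, 2)); // [the quick, quick brown, brown fox]
    }
}
```

For an array of w words this produces w - n + 1 n-grams. If you need the words kept separate instead of joined, returning a list of arrays (or sublists) from the same loop would work just as well.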