Question
I'm a beginner programmer, and I'm trying to write a program that opens a text file containing a large text and counts how many words it contains. It should then print how many different words are in the text, along with the frequency of each word. My plan was to use a String array to store the unique words and an int array to store their frequencies.
The program counts the words, but I'm unsure how to write the code that builds the list of words and records how often each one is repeated in the text.
I wrote this:
import easyIO.*;
import java.util.*;
class Oblig3A {
    public static void main(String[] args) {
        int cont = 0;
        // Two readers on the same file: one to count the words, one to store them
        In read = new In("alice.txt");
        In read2 = new In("alice.txt");
        // First pass: print every word and count the total
        while (!read.endOfFile()) {
            String info = read.inWord();
            System.out.println(info);
            cont = cont + 1;
        }
        System.out.println(cont);
        final int AN_WORDS = cont;
        String[] words = new String[AN_WORDS];
        int[] frequency = new int[AN_WORDS]; // not filled in yet
        int i = 0;
        // Second pass: copy every word into the array
        while (!read2.endOfFile()) {
            words[i] = read2.inWord();
            i = i + 1;
        }
    }
}
Answer 1:
Ok, here is what you need to do:
1. Use a BufferedReader to read the lines of text from the file, one by one.
2. Create a HashMap<String,Integer> to store the word, frequency relations.
3. When you read each line of text, use split() to get all the words in the line as a String[] array.
4. Iterate over each word. For each word, retrieve its value from the HashMap. If you get a null value, you have found the word for the first time, so put it in the HashMap with the value 1.
If you get a non-null value, increment the value and put it back in the HashMap.
5. Repeat this until you reach EOF.
Done !
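The steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the class name WordFrequency and the method name countWords are made up for illustration, words are split on whitespace only (punctuation and letter case are left untouched), and main falls back to a short built-in sample text when no file name is passed on the command line.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class name, chosen only for this example.
class WordFrequency {

    // Reads the source line by line and returns a word -> frequency map.
    static Map<String, Integer> countWords(Reader source) throws IOException {
        Map<String, Integer> frequencies = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) { // null means EOF
                // Assumption: words are separated by whitespace only.
                for (String word : line.split("\\s+")) {
                    if (word.isEmpty()) {
                        continue; // blank lines and leading whitespace produce ""
                    }
                    Integer count = frequencies.get(word);
                    if (count == null) {
                        frequencies.put(word, 1);         // first occurrence
                    } else {
                        frequencies.put(word, count + 1); // seen before
                    }
                }
            }
        }
        return frequencies;
    }

    public static void main(String[] args) throws IOException {
        // Use the file given on the command line, or a small sample text.
        Reader source = args.length > 0
                ? new FileReader(args[0])
                : new StringReader("the cat and the hat\nthe end");
        Map<String, Integer> frequencies = countWords(source);

        int total = 0;
        for (int count : frequencies.values()) {
            total += count;
        }
        System.out.println("Total words: " + total);
        System.out.println("Different words: " + frequencies.size());
        for (Map.Entry<String, Integer> entry : frequencies.entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
}
```

Running it as java WordFrequency alice.txt would process the file from the question. The number of different words is simply the size of the map, so no separate array of unique words is needed.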
Answer 2:
You can use a
Map<String, Integer> map = new HashMap<String, Integer>();
Then add each word to the map, first checking whether the key is already there. If it is not, add it to the map with a counter initialized to 1; otherwise, increment its counter.
if (!map.containsKey(word))
{
    // First occurrence of this word
    map.put(word, 1);
}
else
{
    // Word seen before: increment its counter
    map.put(word, map.get(word) + 1);
}
In the end you will have a map with all the words that the file contains, each paired with an Integer that represents how many times the word appears in the text.
Answer 3:
You basically need a hash here. In Java, you can use a HashMap<String, Integer>, which will store the words and their frequencies.
So when you read in a new word, look it up in the HashMap, say h. If it exists, increase its frequency; otherwise, add it as a new word with frequency = 1.
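The "increase it or add it with 1" step can also be written as a single call to Map.merge, which has been part of java.util.Map since Java 8. The sketch below is only an illustration: the class name MergeCount, the method name tally, and the sample word list are invented for this example, and the map is called h as in the description above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name, chosen only for this example.
class MergeCount {

    // Builds a word -> frequency map from a list of words.
    static Map<String, Integer> tally(List<String> words) {
        Map<String, Integer> h = new HashMap<>();
        for (String word : words) {
            // Stores 1 for a new word, otherwise adds 1 to the stored frequency.
            h.merge(word, 1, Integer::sum);
        }
        return h;
    }

    public static void main(String[] args) {
        // Sample input standing in for the words read from the file.
        List<String> words = Arrays.asList(
                "down", "the", "rabbit", "hole", "the", "rabbit", "the");
        Map<String, Integer> h = tally(words);
        System.out.println("Total words: " + words.size());
        System.out.println("Different words: " + h.size());
        h.forEach((word, count) -> System.out.println(word + ": " + count));
    }
}
```

A HashMap does not keep its keys in any particular order; swapping in a TreeMap would print the words alphabetically, if that matters for the output.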
Answer 4:
If you can use a library, you may want to consider a Guava Multiset, which has the counting functionality built in:
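import com.google.common.collect.HashMultiset;
import com.google.common.collect.Multiset;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
public void count() throws IOException {
    Multiset<String> countSet = HashMultiset.create();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader bufferedReader = new BufferedReader(new FileReader("alice.txt"))) {
        String line;
        while ((line = bufferedReader.readLine()) != null) {
            for (String word : line.split("\\W+")) {
                // split() yields "" for blank lines and lines starting with punctuation
                if (!word.isEmpty()) {
                    countSet.add(word);
                }
            }
        }
    }
    for (Multiset.Entry<String> entry : countSet.entrySet()) {
        System.out.println("word: " + entry.getElement() + " count: " + entry.getCount());
    }
}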
Source: https://stackoverflow.com/questions/19212179/java-program-counts-all-the-words-from-a-text-file-and-counts-frequency-of-ea