I have a large text file I am reading from and I need to find out how many times some words come up. For example, the word "the". I'm doing this line by line e
Why not run each line through Java's StringTokenizer? By default it splits only on whitespace, but if you pass it a delimiter string that includes commas and other punctuation, you get the words broken up on those as well. Then just loop over the tokens and count each occurrence of "the" or any other word you care about.
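Here is a minimal sketch of that idea. The class name, the method name, and the exact delimiter set are my own choices for illustration, and the match is case-insensitive on the assumption that "The" and "the" should count as the same word.

```java
import java.util.StringTokenizer;

class WordOccurrences {
    // Whitespace plus common punctuation. This set is an assumption; adjust it for your text.
    static final String DELIMITERS = " \t\n\r\f.,;:!?\"()[]{}";

    // Counts how many tokens in one line equal the target word, ignoring case.
    static int countInLine(String line, String target) {
        int count = 0;
        StringTokenizer tokens = new StringTokenizer(line, DELIMITERS);
        while (tokens.hasMoreTokens()) {
            if (tokens.nextToken().equalsIgnoreCase(target)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String line = "The cat saw the dog, then the bird.";
        System.out.println(countInLine(line, "the")); // prints 3 ("then" is not counted)
    }
}
```

Since you are already reading line by line, you would call countInLine on each line and add the results to a running total.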
It would be easy to expand this a bit and build a map with each word as a key and the number of times it was used as the value. You may also want to run each word through a stemming function first, so that variants such as "cat" and "cats" are counted together, which is usually more useful than counting the exact word forms.
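A rough sketch of the map version could look like the following. The normalize method here is only a hypothetical stand-in for a real stemmer: it lowercases the word and drops a single trailing "s", which will get plenty of English words wrong. The class name, method names, and delimiter set are likewise just assumptions for the example.

```java
import java.util.Locale;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

class WordFrequencies {
    // Whitespace plus common punctuation. This set is an assumption; adjust it for your text.
    static final String DELIMITERS = " \t\n\r\f.,;:!?\"()[]{}";

    // Placeholder for a stemmer (hypothetical, deliberately naive):
    // lowercases the word and strips one trailing "s" from words longer than 3 letters.
    static String normalize(String word) {
        String lower = word.toLowerCase(Locale.ROOT);
        if (lower.length() > 3 && lower.endsWith("s") && !lower.endsWith("ss")) {
            return lower.substring(0, lower.length() - 1);
        }
        return lower;
    }

    // Tokenizes one line and adds each normalized word to the running counts.
    static void addLine(String line, Map<String, Integer> counts) {
        StringTokenizer tokens = new StringTokenizer(line, DELIMITERS);
        while (tokens.hasMoreTokens()) {
            counts.merge(normalize(tokens.nextToken()), 1, Integer::sum);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new TreeMap<>();
        addLine("The cats chased the cat.", counts);
        System.out.println(counts); // prints {cat=2, chased=1, the=2}
    }
}
```

Call addLine once per line as you read the file, then look up counts.get("the") or iterate over the whole map at the end. For real stemming you would swap normalize for an actual stemming library.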