Count the number of times each word occurs in a file

挽巷 2020-12-12 02:51

Hi I am writing a program that counts the number of times each word occurs in a file. Then it prints a list of words with counts between 800 and 1000, sorted in the order of

4 Answers
  •  情深已故
    2020-12-12 03:19

    Heh. I know that bluntly showing a solution doesn't really help you. However.

    I glanced through your code and saw many unused and confused bits. Here's what I'd do:

    #include <algorithm>
    #include <fstream>
    #include <iostream>
    #include <iterator>
    #include <map>
    #include <string>
    #include <utility>
    #include <vector>
    
    using namespace std;
    
    // types
    typedef std::pair<std::string, size_t> frequency_t;
    typedef std::vector<frequency_t> words_t;
    
    // predicates
    static bool byDescendingFrequency(const frequency_t& a, const frequency_t& b)
    { return a.second > b.second; }
    
    const struct isGTE // greater than or equal
    { 
        size_t inclusive_threshold;
        bool operator()(const frequency_t& record) const 
            { return record.second >= inclusive_threshold; }
    } over1000 = { 1001 }, over800  = { 800 };
    
    int main() 
    {
        words_t words;
        {
            map<string, size_t> tally;
    
            ifstream inFile("bible.txt");
            string s;
            while (inFile >> s)
                tally[s]++;
    
            remove_copy_if(tally.begin(), tally.end(), 
                           back_inserter(words), over1000);
        }
    
        words_t::iterator begin = words.begin(),
                          end = partition(begin, words.end(), over800);
        std::sort(begin, end, &byDescendingFrequency);
    
        for (words_t::const_iterator it=begin; it!=end; it++)
            cout << it->second << "\t" << it->first << endl;
    }
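
The filter-then-sort step above can also be tried on a small in-memory tally, which makes the inclusive bounds easy to check. This is only a sketch, assuming C++11: `Frequency` and `selectByCount` are hypothetical names, not part of the answer, and it uses a single filtering loop instead of `remove_copy_if` followed by `partition`.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical alias for a (word, count) record.
typedef std::pair<std::string, std::size_t> Frequency;

// Hypothetical helper: keep the entries whose count lies in [lo, hi]
// (both bounds inclusive), ordered by descending count.
std::vector<Frequency> selectByCount(const std::map<std::string, std::size_t>& tally,
                                     std::size_t lo, std::size_t hi)
{
    std::vector<Frequency> out;
    for (const auto& entry : tally)
        if (entry.second >= lo && entry.second <= hi)
            out.push_back(Frequency(entry.first, entry.second));

    // stable_sort keeps words with equal counts in the map's alphabetical order.
    std::stable_sort(out.begin(), out.end(),
                     [](const Frequency& a, const Frequency& b)
                     { return a.second > b.second; });
    return out;
}
```

Calling `selectByCount(tally, 800, 1000)` on the tally built from the file should select the same records as the program above, although words with equal counts may come out in a different order.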
    

    Authorized Version:

    993 because
    981 men
    967 day
    954 over
    953 God,
    910 she
    895 among
    894 these
    886 did
    873 put
    868 thine
    864 hand
    853 great
    847 sons
    846 brought
    845 down
    819 you,
    811 so
    

    Vulgata:

    995 tuum
    993 filius
    993 nec
    966 suum
    949 meum
    930 sum
    919 suis
    907 contra
    902 dicens
    879 tui
    872 quid
    865 Domine
    863 Hierusalem
    859 suam
    839 suo
    835 ipse
    825 omnis
    811 erant
    802 se
    

    Performance is about 1.12s for both files, but only 0.355s after drop-in replacing map<> with boost::unordered_map<>.
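
If Boost is not available, the same swap can be sketched with C++11's `std::unordered_map`, which offers the same `operator[]` interface for tallying. The helper name `countWords` is hypothetical, and no timing is claimed for this variant. Note that a hash map iterates in unspecified order, which is fine here because the results are sorted by count afterwards.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical helper (not from the answer): tally whitespace-separated
// tokens read from any input stream, using a hash map instead of std::map.
std::unordered_map<std::string, std::size_t> countWords(std::istream& in)
{
    std::unordered_map<std::string, std::size_t> tally;
    std::string word;
    while (in >> word)
        ++tally[word];  // operator[] value-initializes a missing count to 0
    return tally;
}
```

Passing an opened `std::ifstream` for the text file gives the same counts as the `map`-based loop; as there, tokens are split on whitespace only, so punctuation stays attached to words.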
