Text File Binary Search

Posted by 最后都变了 on 2019-12-08 06:54:30

Question


I have a test file that looks like this:

Ampersand           Gregorina           5465874526370945
Anderson            Bob                 4235838387422002
Anderson            Petunia             4235473838457294
Aphid               Bumbellina          8392489357392473
Armstrong-Jones     Mike                8238742438632892

And code that looks like this:

#include <iostream>
#include <string>
#include <fstream>
#include <limits>   // std::numeric_limits, used by the ignore() calls below

class CardSearch
{
protected:
    std::ifstream cardNumbers;

public:
    CardSearch(std::string fileName)
    {
        cardNumbers.open(fileName, std::ios::in);

        if (!cardNumbers.is_open())
        {
            std::cout << "Unable to open: " << fileName;
        }
        return;
    }

    std::string Find(std::string lastName, std::string firstName)
    {
        // Creating string variables to hold first and last name
        // as well as card number. Also creating bools to decide whether
        // or not the person has been found or if the last name is the only
        // identifier for a found person
        std::string lN;
        std::string fN;
        std::string creditNumber;
        bool foundPerson = false;

        // By using the seekg and tellg functions, we can find our place
        // in the file and also calculate the amount of lines within the file
        cardNumbers.seekg(0, std::ios::beg);
        cardNumbers.clear();
        std::streamsize first = cardNumbers.tellg();
        cardNumbers.ignore(std::numeric_limits<std::streamsize>::max());
        cardNumbers.clear();
        std::streamsize last = cardNumbers.tellg();
        cardNumbers.seekg(0, std::ios::beg);
        std::streamsize lineNumbers = (last / 57);
        std::streamsize middle;

        while (first <= lineNumbers)
        {
            middle = (first + lineNumbers) / 2;
            // middle * 57 takes us to the beginning of the correct line
            cardNumbers.seekg(middle * 57, std::ios::beg);
            cardNumbers.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

            cardNumbers >> lN >> fN;

            if (lN < lastName)
            {
                first = middle + 1;
            }
            else if (lN > lastName)
            {
                lineNumbers = middle - 1;
            }
            else
            {
                if (fN < firstName)
                {
                    first = middle + 1;
                }
                else if (fN > firstName)
                {
                    lineNumbers = middle - 1;
                }
                else if (fN == firstName)
                {
                    foundPerson = true;
                    break;
                }
            }
        }

        if (foundPerson)
        {
            // When a person is found, we seek to the correct line position and 
            // offset by another 40 characters to receive the card number
            cardNumbers.seekg((middle * 57) + 40, std::ios::beg);
            std::cout << lN << ", " << fN << " ";
            cardNumbers >> creditNumber;
            return creditNumber;
        }
        return "Unable to find person.\n";
    }
};

int main()
{
    CardSearch CS("C:/Users/Rafael/Desktop/StolenNumbers.txt");
    std::string S = CS.Find("Ampersand", "Gregorina");
    std::cout << S;

    std::cin.ignore();
    std::cin.get();

    return 0;
}

I am able to retrieve every record except the first one in the list. seekg appears to seek to the correct position, but cardNumbers does not read the expected data. When 'middle' is 0, seekg should move to the start of line 0 (offset middle * 57), after which the code should read Ampersand Gregorina and make a comparison. Instead, it still reads Anderson Bob.

Any ideas as to why this may be happening?

Thanks


Answer 1:


Your loop modifies lineNumbers, taking it from 4, to 1, to -1. The -1 makes the loop terminate too early, so the first entry is never picked up.

It seems like a homework problem, so I hope you can use this to direct yourself towards an answer.
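The 4, 1, -1 trace can be reproduced without touching a file. The following is a minimal sketch, not the asker's code: traceUpperBound, Name and skipOneLine are hypothetical names, the records are held in memory, and the upper bound starts at count - 1 rather than last / 57. It assumes that the ignore(..., '\n') call issued right after seekg consumes the line at the seek position, so the record compared is the one after 'middle'.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Name = std::pair<std::string, std::string>; // (last name, first name)

// Replays the bounds of the question's search loop over in-memory records and
// returns every value the upper bound takes. When skipOneLine is true, the
// record compared is the one AFTER index `middle`, which models an
// ignore(..., '\n') issued immediately after seeking to the start of a line.
std::vector<long> traceUpperBound(const std::vector<Name>& records,
                                  const Name& target, bool skipOneLine)
{
    // Binary search is only meaningful on sorted input.
    assert(std::is_sorted(records.begin(), records.end()));

    const long count = static_cast<long>(records.size());
    long first = 0;
    long upper = count - 1;
    std::vector<long> trace{upper};

    while (first <= upper)
    {
        const long middle = (first + upper) / 2;
        const long readIndex = middle + (skipOneLine ? 1 : 0);
        if (readIndex >= count)
            break; // a real stream would hit end-of-file here

        const Name& current = records[readIndex];
        if (current == target)
            break; // found
        if (current < target)
        {
            first = middle + 1;
        }
        else
        {
            upper = middle - 1;
            trace.push_back(upper);
        }
    }
    return trace;
}
```

With the five sample records and the target Ampersand Gregorina, skipOneLine = true yields the bounds 4, 1, -1 and the loop ends without a match. With skipOneLine = false the bounds are 4, 1 and the record at index 0 is found.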




Answer 2:


When using functions such as seekg, it is always best to open the file in binary mode, not text mode as your code is doing now. In other words, you should be doing this:

cardNumbers.open(fileName, std::ios::in | std::ios::binary);

The reason is that opening a file in text mode allows end-of-line translations to be done. This makes functions such as seekg and tellg unreliable at best (they work only by luck) and, in the worst case, useless for text processing.

When a file is opened in binary mode, seekg and its related functions work as expected, since no end-of-line translations are done. You actually seek to the byte offset in the file that you specify, and are not thrown off by end-of-line translations.
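As an illustration of seeking by byte offset in binary mode, here is a minimal sketch. Record, openBinary and readRecord are hypothetical names, and the sketch assumes every record occupies exactly the same number of bytes, line terminator included.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <string>

struct Record
{
    std::string lastName;
    std::string firstName;
    std::string cardNumber;
};

// Opens the file without end-of-line translation so byte offsets are exact.
std::ifstream openBinary(const std::string& path)
{
    return std::ifstream(path, std::ios::in | std::ios::binary);
}

// Reads the record at `index`, assuming each record occupies exactly
// `recordBytes` bytes including its line terminator. Returns false if the
// three fields could not be extracted.
bool readRecord(std::istream& in, std::streamoff index,
                std::streamoff recordBytes, Record& out)
{
    assert(index >= 0 && recordBytes > 0);
    in.clear();                                   // drop earlier eof/fail state
    in.seekg(index * recordBytes, std::ios::beg); // start of the wanted line
    // No ignore() here: the stream is already at the first byte of the record.
    return static_cast<bool>(in >> out.lastName >> out.firstName >> out.cardNumber);
}
```

A call such as readRecord(in, 0, 57, rec) would then read the first line, provided the file really uses 57 bytes per record; the right stride depends on the line endings in the actual file.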

Also, once you do this, the length of each line includes not only the visible text but also the invisible characters that make up the end-of-line sequence: one byte ('\n') on Linux / Unix, two bytes ('\r\n') on Windows. So if your hand calculation of 57 counts only the visible text, it is not going to be correct in binary mode; it should then be 58 on Linux / Unix or 59 on Windows.
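Rather than hand-counting, the stride can be measured from the file itself. The following is a sketch under the assumption that all records have the same length as the first one; measureRecordBytes and countRecords are hypothetical helpers, and the stream is expected to have been opened with std::ios::binary.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <string>

// Returns the byte length of the first record including its line terminator
// ("\n" or "\r\n"), or -1 if the stream does not contain a terminated line.
std::streamoff measureRecordBytes(std::istream& in)
{
    in.clear();
    in.seekg(0, std::ios::beg);
    std::string line;
    if (!std::getline(in, line) || in.eof())
        return -1; // empty stream, or a single line with no terminator
    // getline consumed everything up to and including '\n', so the current
    // position equals the full on-disk length of the first record.
    return static_cast<std::streamoff>(in.tellg());
}

// Returns how many records of `recordBytes` bytes fit in the stream, rounding
// up so that a final record without a trailing newline is still counted.
std::streamoff countRecords(std::istream& in, std::streamoff recordBytes)
{
    assert(recordBytes > 0);
    in.clear();
    in.seekg(0, std::ios::end);
    const std::streamoff size = static_cast<std::streamoff>(in.tellg());
    return (size + recordBytes - 1) / recordBytes;
}
```

The measured value can then replace the hard-coded 57 in offset calculations such as middle * recordBytes, which keeps the search independent of the platform's line endings.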



Source: https://stackoverflow.com/questions/33926595/text-file-binary-search
