Some code of reading .csv file crashed

ⅰ亾dé卋堺 submitted on 2020-01-07 02:13:34

Question


I tried to write code that reads a CSV file and changes one value, identified by its line and column. First I read the file to count how many lines and columns it has, and then I create a dynamic 2D array in which every row holds one line of the file, so the whole file ends up in the array. After that I will change the value at the chosen line and column and write the whole array back to the file. Does anyone know why it crashes? It crashes the first time it executes this line:

bigArr[i][j]=(char)ch;

the function:

int changeValue(int line, int col, char str[],const char* path)
{
    FILE* csvFile = fopen(path, "r");
    char arr[VERY_BIG_MEMORY];
    int l = 0, c = 1;
    int i = 0,j=0;
    int ch = 0;
    if (!csvFile)
    {
        printf("Cant read the file\nPlease open a file\n");
        return -1;
    }
    do
    {
        ch = fgetc(csvFile);
        if (ch == ',')
        {
            c++;
        }
    } while (ch !='\n');
    fseek(csvFile, 0L, SEEK_SET);
    do
    {
        ch = fgetc(csvFile);
        if (ch == '\n')
        {
            l++;
        }
    } while (ch!=EOF);
    char** bigArr = (char**)calloc(l*c,sizeof(char*));
    for (i = 0; i < l*c; i++)
    {
        bigArr[i] = (char*)calloc(10,sizeof(char));
    }
    fseek(csvFile, 0L, SEEK_SET);
    do
    {

        ch = fgetc(csvFile);
        if (ch == ',')
        {
            j++;
        }
        else if (ch == '\n')
        {
            i++;
        }
        else
        {
            bigArr[i][j]=(char)ch;
        }
    } while (ch != EOF);
}

Answer 1:


The loop that's crashing should be more like:

enum { MAX_FIELD_WIDTH = 10 };  // Including null terminator

i = j = 0;
while ((ch = getc(csvFile)) != EOF)
{
    if (ch == ',' || ch == '\n')
    {
        bigArr[i++][j] = '\0';
        j = 0;
    }
    else
    {
        if (j < MAX_FIELD_WIDTH - 1)
            bigArr[i][j++] = ch;
        // else ignore excess characters
    }
}

Warning: untested code!

Your code is simply creating a linear list of l * c field values, which is fine. You can pick the fields for line n by accessing fields bigArr[n * c] through bigArr[n * c + c - 1] (counting from line 0).
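The index arithmetic for that flat layout can be wrapped in a small helper. This is only a sketch: the name `field_at` and its parameters are made up for illustration, and it assumes every line really has `cols` fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the field at (line, col) from a flat array
 * that stores `cols` fields per line, one line after another.
 * Both indices count from 0. */
static char *field_at(char **fields, size_t cols, size_t line, size_t col)
{
    assert(fields != NULL);
    assert(col < cols);   /* a column past the end would alias the next line */
    return fields[line * cols + col];
}
```

With `c` columns, `field_at(bigArr, c, n, 0)` is `bigArr[n * c]` and `field_at(bigArr, c, n, c - 1)` is `bigArr[n * c + c - 1]`.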

For important variables like l and c, I use longer names such as rows (or lines) and cols. Still not long, but more meaningful. Single character names should be used with limited scope.

Note that this code ignores subtleties of the CSV format such as fields with commas inside double quotes, let alone newlines within double quoted fields. It also ignores the possibility of varying numbers of fields in the lines. If the code kept track of line numbers, it would be possible to handle both too many fields (ignoring the extra) and too few fields (creating empty entries for missing fields). If the code that pre-scans the file was cleverer, it could keep a record of the minimum and maximum number of columns per line as well as the number of lines. Problems could then be diagnosed too.
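A pre-scan along those lines might look like the sketch below. The struct `csv_shape` and the functions `note_line` and `csv_scan` are invented names, and the code still treats every comma as a separator, so quoted fields are not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical summary of a pre-scan; the names are illustrative only. */
struct csv_shape {
    size_t lines;     /* number of lines seen */
    size_t min_cols;  /* fewest fields on any line */
    size_t max_cols;  /* most fields on any line */
};

/* Fold the field count of one finished line into the summary. */
static void note_line(struct csv_shape *shape, size_t cols)
{
    if (shape->lines == 0 || cols < shape->min_cols)
        shape->min_cols = cols;
    if (cols > shape->max_cols)
        shape->max_cols = cols;
    shape->lines++;
}

/* Scan the stream once, counting lines and the fields on each line.
 * Quoted fields are NOT handled: every ',' counts as a separator. */
static struct csv_shape csv_scan(FILE *fp)
{
    struct csv_shape shape = { 0, 0, 0 };
    size_t cols = 1;   /* a line always has at least one field */
    int in_line = 0;   /* nonzero once the current line has any character */
    int ch;

    assert(fp != NULL);
    while ((ch = getc(fp)) != EOF) {
        if (ch == '\n') {
            note_line(&shape, cols);
            cols = 1;
            in_line = 0;
        } else {
            in_line = 1;
            if (ch == ',')
                cols++;
        }
    }
    if (in_line)       /* last line had no trailing newline */
        note_line(&shape, cols);
    return shape;
}
```

If `min_cols` and `max_cols` differ after the scan, the file is ragged, and the caller can either report that or pad short lines with empty fields.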

With a more complex memory management scheme, it would also be possible to scan the file just once, which has advantages if the input is actually a terminal or pipe rather than a disk file. It could also handle arbitrarily long field values instead of restricting them to 10 bytes including the terminating null byte.
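One possible shape for such a single-pass reader is sketched below, growing both the field buffer and the field list with realloc. The names `push_char` and `read_fields` are hypothetical, and quoted fields are still not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Append one byte to a growing buffer, doubling its capacity when full.
 * Returns 0 on success, -1 if realloc fails (the old buffer stays valid). */
static int push_char(char **buf, size_t *len, size_t *cap, char c)
{
    if (*len == *cap) {
        size_t new_cap = *cap ? *cap * 2 : 16;
        char *tmp = realloc(*buf, new_cap);
        if (tmp == NULL)
            return -1;
        *buf = tmp;
        *cap = new_cap;
    }
    (*buf)[(*len)++] = c;
    return 0;
}

/* Read every field from fp in one pass, with no limit on field length.
 * On success returns 0 and stores a malloc'd array of malloc'd strings
 * in *out and its length in *count; on allocation failure returns -1. */
static int read_fields(FILE *fp, char ***out, size_t *count)
{
    char **fields = NULL, **grown;
    size_t n = 0, cap = 0;
    char *buf = NULL;
    size_t len = 0, buf_cap = 0;
    int pending = 0;   /* nonzero while a field is open but not yet stored */
    int ch;

    assert(fp != NULL && out != NULL && count != NULL);
    for (;;) {
        ch = getc(fp);
        if (ch == EOF && !pending)
            break;
        if (ch == ',' || ch == '\n' || ch == EOF) {
            /* Terminate the current field and hand it to the list. */
            if (push_char(&buf, &len, &buf_cap, '\0') != 0)
                goto fail;
            if (n == cap) {
                size_t new_cap = cap ? cap * 2 : 8;
                grown = realloc(fields, new_cap * sizeof *fields);
                if (grown == NULL)
                    goto fail;
                fields = grown;
                cap = new_cap;
            }
            fields[n++] = buf;
            buf = NULL;
            len = buf_cap = 0;
            pending = (ch == ',');   /* a comma promises one more field */
            if (ch == EOF)
                break;
        } else {
            if (push_char(&buf, &len, &buf_cap, (char)ch) != 0)
                goto fail;
            pending = 1;
        }
    }
    *out = fields;
    *count = n;
    return 0;

fail:
    free(buf);
    while (n > 0)
        free(fields[--n]);
    free(fields);
    return -1;
}
```

Because nothing is rewound, this works on a pipe as well as on a disk file. The caller owns the result and must free each string and then the array.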


The code should check that the file could be opened, and close it when it is finished. The current function interface is:

int changeValue(int line, int col, char str[], const char* path)

but the first three values are ignored by the code shown. This is probably because the final code will change one of the values read and then rewrite the file. Presumably, it would report an error if asked to change a non-existent column or line. These relatively minor infelicities are probably due to the minimization to make the code resemble an MCVE (How to create a Minimal, Complete, and Verifiable Example?).
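The missing half of the function might be sketched as follows. This is a guess at the intent rather than the asker's code: `set_field` and `write_csv` are hypothetical names, every field is assumed to be individually malloc'd (or NULL), and every line is assumed to have the same number of columns.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Replace the field at (line, col) in a flat lines * cols array.
 * Returns 0 on success, -1 if the position does not exist or the
 * allocation fails. The old field is assumed to be malloc'd or NULL. */
static int set_field(char **fields, size_t lines, size_t cols,
                     size_t line, size_t col, const char *value)
{
    char *copy;

    assert(fields != NULL && value != NULL);
    if (line >= lines || col >= cols)
        return -1;                    /* non-existent line or column */
    copy = malloc(strlen(value) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, value);
    free(fields[line * cols + col]);
    fields[line * cols + col] = copy;
    return 0;
}

/* Write the array back as CSV: commas between fields, newline per line.
 * Returns 0 on success, -1 on a write error. */
static int write_csv(FILE *out, char **fields, size_t lines, size_t cols)
{
    size_t line, col;

    assert(out != NULL && fields != NULL);
    for (line = 0; line < lines; line++) {
        for (col = 0; col < cols; col++) {
            int sep = (col + 1 < cols) ? ',' : '\n';
            if (fputs(fields[line * cols + col], out) == EOF ||
                putc(sep, out) == EOF)
                return -1;
        }
    }
    return 0;
}
```

A caller would reopen the path for writing only after the whole file has been read and closed, so that a failed read cannot truncate the original.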




Answer 2:


If your goal is to read the data and store it in a char ** array, then this is one way to do it:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
changeValue(const char *path)
{
    FILE *file;
    size_t column_count = 0; /* total number of ',' in the file */
    size_t row_count = 0;    /* total number of '\n' in the file */
    int character;
    char **result;
    char *field;
    char large_buffer[100];
    size_t length;
    size_t index;

    file = fopen(path, "r");
    if (file == NULL)
    {
        printf("Cant read the file\nPlease open a file\n");
        return -1;
    }

    /* Count Rows and Columns */
    while ((character = fgetc(file)) != EOF)
    {
        switch (character)
        {
            case ',':
                ++column_count;
                break;
            case '\n':
                ++row_count;
                break;
        }
    }
    rewind(file);

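    /* Every ',' or '\n' ends exactly one field, so the number of fields
     * stored below is the sum of the two totals, not their product. */
    result = malloc((row_count + column_count) * sizeof(char *));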
    if (result == NULL)
    {
        fclose(file);
        return -1; /* Do something to inform the caller */
    }

    length = 0;
    index = 0;
    while ((character = fgetc(file)) != EOF)
    {
        switch (character)
        {
            case '\n':
            case ',':
                field = malloc(length + 1);
                if (field != NULL)
                {
                    memcpy(field, large_buffer, length);
                    field[length] = '\0';
                }
                result[index++] = field;

                length = 0;
                break;
            default:
                if (length < sizeof(large_buffer))
                    large_buffer[length++] = character;
                break;
        }
    }
    /* USE THE DATA NOW AND FREE THE POINTERS */
    fclose(file);

    return 0;
}

Note that:

  1. You can count rows and columns in order to pre-allocate the destination, and you can do both counts in a single loop because you are reading the file one character at a time.

  2. You don't need to allocate a separate fixed-size buffer for every pointer in the char ** array, because that doesn't gain you anything. If every field really had a fixed size, you could allocate all of them in one block like this (field_count stands for the total number of fields)

    char (*bigArr)[10] = malloc(field_count * sizeof(*bigArr));
    

    Instead, in the second loop just use a buffer large enough to hold characters until a ',' or '\n' is found (ignore characters that don't fit, as your code would have to anyway), and then allocate the field and copy the data into it.

  3. The actual problem, as pointed out in the answer by @JonathanLeffler, is that the indices were not reset properly: i still holds l*c after the allocation loop and j is never set back to 0 at the end of a field, so the code writes past the bounds of the array.
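The single-block idea from note 2 can be sketched with a typedef, which keeps the pointer-to-array syntax readable. `FIELD_WIDTH`, `field_t` and `alloc_fields` are names invented for this example, and the fixed width of 10 is taken from the question.

```c
#include <assert.h>
#include <stdlib.h>

enum { FIELD_WIDTH = 10 };            /* fixed width, including the '\0' */

typedef char field_t[FIELD_WIDTH];    /* one fixed-size field */

/* Allocate n zero-filled fields in one contiguous block.
 * Returns NULL on failure; release the block with a single free(). */
static field_t *alloc_fields(size_t n)
{
    assert(n > 0);
    return calloc(n, sizeof(field_t));
}
```

With this layout `fields[i][j]` is character j of field i, and there is only one allocation to check and one pointer to free.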




Answer 3:


Get rid of the arr array as it isn't used.

You don't check whether the allocation of bigArr succeeded, nor the allocations of the smaller arrays in it.

Also, you are assigning a single character to bigArr[i][j], which is not correct: you want to store it in the array you allocated for that field, and you will have to handle appending each character within that array since you are reading character by character.
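A checked version of the two-level allocation could look like this sketch. `alloc_table` and `free_table` are hypothetical names; `count` would be l * c and `width` would be 10 in the question's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate `count` zero-filled buffers of `width` bytes each.
 * Returns NULL if any allocation fails, after releasing whatever
 * had already been allocated. */
static char **alloc_table(size_t count, size_t width)
{
    char **table;
    size_t i;

    assert(count > 0 && width > 0);
    table = calloc(count, sizeof *table);
    if (table == NULL)
        return NULL;
    for (i = 0; i < count; i++) {
        table[i] = calloc(width, 1);
        if (table[i] == NULL) {
            while (i > 0)          /* undo the rows that did succeed */
                free(table[--i]);
            free(table);
            return NULL;
        }
    }
    return table;
}

/* Release a table obtained from alloc_table. */
static void free_table(char **table, size_t count)
{
    size_t i;

    if (table == NULL)
        return;
    for (i = 0; i < count; i++)
        free(table[i]);
    free(table);
}
```

The caller then has a single NULL test to make before touching any element, and a single call to make when it is done.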



Source: https://stackoverflow.com/questions/37212191/some-code-of-reading-csv-file-crashed
