How to find out if CSV file fields are tab delimited or comma delimited

[愿得一人] 2020-12-01 09:46

How can I find out whether the fields of a CSV file are tab delimited or comma delimited? I need PHP validation for this. Can anyone please help? Thanks in advance.

15 Answers
  •  误落风尘
    2020-12-01 10:19

    I used @Jay Bhatt's solution for finding out a CSV file's delimiter, but it didn't work for me, so I applied a few fixes and added comments to make the process easier to follow.

    See my version of @Jay Bhatt's function:

    function decide_csv_delimiter($file, $checkLines = 10) {

        // use PHP's built-in SplFileObject to read the csv or txt file line by line
        $file = new SplFileObject($file);

        // array of candidate delimiters. Add any more delimiters if you wish.
        // "\t" must be in double quotes so that PHP treats it as a real tab character
        $delimiters = array(',', "\t", ';', '|', ':');

        // total number of occurrences of each delimiter over all checked lines
        $number_of_delimiter_occurrences = array_fill_keys($delimiters, 0);

        $i = 0; // counts the non-empty lines actually parsed
        while ($file->valid() && $i < $checkLines) {

            $line = $file->fgets();

            // skip blank lines (for example the empty read at the end of the file)
            if (trim($line, "\r\n") === '') {
                continue;
            }

            foreach ($delimiters as $delimiter) {
                // add to the running total instead of overwriting it on every line
                $number_of_delimiter_occurrences[$delimiter] += substr_count($line, $delimiter);
            }

            $i++;
        }

        // get the key of the largest value from the array (comparing only the array values)
        // in our case, the array keys are the delimiters
        $results = array_keys($number_of_delimiter_occurrences, max($number_of_delimiter_occurrences));

        // on a tie (including a file that contains no delimiter at all)
        // the first candidate in the list wins
        return $results[0];
    }
    

    I personally use this function to help parse files automatically with PHPExcel, and it works well and fast.

    I recommend checking at least 10 lines so that the result is more accurate. I personally use it with 100 lines, and it still runs without noticeable delay. The more lines you check, the more reliable the result gets.

    NOTE: This is just a modified version of @Jay Bhatt's solution to the question. All credit goes to @Jay Bhatt.
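
    Counting characters can be misled when a delimiter also appears inside quoted fields, for example commas inside a quoted text column of a tab-delimited file. As a rough sketch of one possible alternative (not part of @Jay Bhatt's solution), the hypothetical helper `guess_delimiter_by_columns()` below parses each sample line with PHP's `str_getcsv()` and keeps the candidate that gives the same column count, greater than one, on every line. The function name, the candidate list and the sample data are assumptions made for illustration.

    ```php
    <?php
    // Hypothetical helper (an assumption for illustration, not from the answer above):
    // picks the delimiter that splits every sample line into the same number (> 1) of columns.
    function guess_delimiter_by_columns(array $lines, array $candidates = [',', "\t", ';', '|', ':']) {
        $best = null;
        $bestColumns = 1;

        foreach ($candidates as $delimiter) {
            $counts = [];
            foreach ($lines as $line) {
                $line = rtrim($line, "\r\n");
                if ($line === '') {
                    continue; // ignore blank lines
                }
                // str_getcsv() honours quoted fields, so a delimiter inside quotes does not split
                $counts[] = count(str_getcsv($line, $delimiter, '"', '\\'));
            }

            // accept the candidate only if every line has the same column count,
            // and prefer the candidate that yields the most columns
            if ($counts && count(array_unique($counts)) === 1 && $counts[0] > $bestColumns) {
                $best = $delimiter;
                $bestColumns = $counts[0];
            }
        }

        return $best; // null when no candidate splits the lines consistently
    }

    // Made-up sample: tab-delimited rows whose last column contains commas
    $sample = [
        "id\tname\tnote",
        "1\tAlice\t\"likes a, b and c\"",
        "2\tBob\t\"x, y\"",
    ];

    $delimiter = guess_delimiter_by_columns($sample);
    var_dump($delimiter === "\t"); // bool(true): commas give 1, 2 and 2 columns, tabs give 3 on every line
    ```

    The returned value can then be passed as the delimiter argument to `fgetcsv()` or `SplFileObject::setCsvControl()`. A `null` result means the sample was not conclusive, in which case falling back to a default delimiter or to the character count above is a reasonable choice.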
