Converting HTML Table to a CSV automatically using PHP?

一整个雨季 2020-12-02 19:20

I need to convert this HTML table to CSV automatically using PHP. Can someone suggest how to do this? Thanks.

$table = \'
8 Answers
  • 2020-12-02 19:45

    To expand on the accepted answer, I wrote this version, which lets me ignore columns by class name and also handles blank rows and columns.

    You can use str_get_html from Simple HTML DOM (http://simplehtmldom.sourceforge.net/). Just include it and away you go! :)

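    include "simple_html_dom.php"; // Simple HTML DOM provides str_get_html()
    $html = str_get_html($html); // give this your HTML string
    
    // text/csv is the registered MIME type for CSV
    header('Content-Type: text/csv');
    header('Content-Disposition: attachment; filename=sample.csv');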
    
    $fp = fopen("php://output", "w");
    
    foreach($html->find('tr') as $element) {
      $td = array();
      foreach( $element->find('th') as $row) {
        if (strpos(trim($row->class), 'actions') === false && strpos(trim($row->class), 'checker') === false) {
          $td [] = $row->plaintext;
        }
      }
      if (!empty($td)) {
        fputcsv($fp, $td);
      }
    
      $td = array();
      foreach( $element->find('td') as $row) {
        if (strpos(trim($row->class), 'actions') === false && strpos(trim($row->class), 'checker') === false) {
          $td [] = $row->plaintext;
        }
      }
      if (!empty($td)) {
        fputcsv($fp, $td);
      }
    }
    
    fclose($fp);
    exit;
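
One caveat with the strpos() checks above: they match substrings, so a class such as "transactions" would also be treated as "actions". Below is a minimal sketch of a whole-token check. hasAnyClass is a hypothetical helper name, not part of Simple HTML DOM, and it only assumes the class attribute arrives as a string or is missing.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): returns true when the
// class attribute contains any of the given names as a whole token.
function hasAnyClass($classAttr, array $names)
{
    // Split on whitespace; an empty or missing attribute yields no tokens.
    $tokens = preg_split('/\s+/', trim((string) $classAttr), -1, PREG_SPLIT_NO_EMPTY);
    return count(array_intersect($tokens, $names)) > 0;
}

var_dump(hasAnyClass('col actions', array('actions', 'checker')));  // bool(true)
var_dump(hasAnyClass('transactions', array('actions', 'checker'))); // bool(false)
```

Inside the loops it could stand in for the two strpos() calls, e.g. if (!hasAnyClass($row->class, array('actions', 'checker'))).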
    
  • 2020-12-02 19:48

    You can do this with arrays and regular expressions; see below.

    // Strip tags from a cell and quote it, so embedded commas and quotes survive
    $quote = function ($cell) {
        return '"' . str_replace('"', '""', trim(strip_tags($cell))) . '"';
    };
    $csv = array();
    preg_match('/<table(>| [^>]*>)(.*?)<\/table( |>)/is',$table,$b);
    $table = $b[2];
    preg_match_all('/<tr(>| [^>]*>)(.*?)<\/tr( |>)/is',$table,$b);
    $rows = $b[2];
    foreach ($rows as $row) {
        //cycle through each row
        if(preg_match('/<th(>| [^>]*>)(.*?)<\/th( |>)/is',$row)) {
            //match for table headers
            preg_match_all('/<th(>| [^>]*>)(.*?)<\/th( |>)/is',$row,$b);
            $csv[] = implode(',', array_map($quote, $b[2]));
        } elseif(preg_match('/<td(>| [^>]*>)(.*?)<\/td( |>)/is',$row)) {
            //match for table cells
            preg_match_all('/<td(>| [^>]*>)(.*?)<\/td( |>)/is',$row,$b);
            $csv[] = implode(',', array_map($quote, $b[2]));
        }
    }
    $csv = implode("\n", $csv);
    var_dump($csv);
    

    Then you can use file_put_contents() to write the CSV string to a file.
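
If cells can contain commas or quotes (sample cells such as "row 1, cell 1" do), it may be safer to let fputcsv() handle the quoting and only then hand the string to file_put_contents(). A rough sketch, assuming the rows have already been extracted into plain arrays; rowsToCsv and the output path are hypothetical names chosen for illustration.

```php
<?php
// Hypothetical helper: turn an array of rows into one CSV string.
function rowsToCsv(array $rows)
{
    // php://temp is an in-memory stream that spills to disk if it grows large.
    $fp = fopen('php://temp', 'r+');
    foreach ($rows as $row) {
        // Separator, enclosure and escape are spelled out so the result
        // does not depend on version-specific defaults.
        fputcsv($fp, $row, ',', '"', '\\');
    }
    rewind($fp);
    $csv = stream_get_contents($fp);
    fclose($fp);
    return $csv;
}

$csv = rowsToCsv(array(
    array('Header 1', 'Header 2'),
    array('row 1, cell 1', 'row 1, cell 2'),
));

// Write the finished string in one call, as suggested above.
file_put_contents(sys_get_temp_dir() . '/sample.csv', $csv);
```

Reading the file back with fgetcsv() or str_getcsv() should return each cell intact, including the embedded commas.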

  • 2020-12-02 19:53

    Baba's answer produces extra whitespace, so I updated the code to this:

    include "simple_html_dom.php";
    $table = '<table border="1">
    <tr>
    <th>Header 1</th>
    <th>Header 2</th>
    </tr>
    <tr>
    <td>row 1, cell 1</td>
    <td>row 1, cell 2</td>
    </tr>
    <tr>
    <td>row 2, cell 1</td>
    <td>row 2, cell 2</td>
    </tr>
    </table>';
    
    $html = str_get_html($table);
    
    // text/csv is the registered MIME type for CSV
    header('Content-Type: text/csv');
    header('Content-Disposition: attachment; filename=sample.csv');
    
    $fp = fopen("php://output", "w");
    
    foreach($html->find('tr') as $element)
    {
        $td = array();
        foreach( $element->find('th') as $row)
        {
            $td [] = $row->plaintext;
        }
    
        foreach( $element->find('td') as $row)
        {
            $td [] = $row->plaintext;
        }
        fputcsv($fp, $td);
    }
    
    
    fclose($fp);
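
If pulling in Simple HTML DOM is not an option, roughly the same loop can be sketched with PHP's bundled DOM extension. This is only an illustration under a few assumptions: the DOM extension is enabled, the input is UTF-8, and tableToRows is a hypothetical function name.

```php
<?php
// Sketch using PHP's built-in DOM extension instead of Simple HTML DOM.
function tableToRows($tableHtml)
{
    $doc = new DOMDocument();
    // Collect libxml warnings about imperfect markup instead of printing them.
    libxml_use_internal_errors(true);
    // The XML prolog is a common hint that makes loadHTML() read the input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"?>' . $tableHtml);
    libxml_clear_errors();

    $rows = array();
    foreach ($doc->getElementsByTagName('tr') as $tr) {
        $cells = array();
        foreach ($tr->childNodes as $node) {
            // Keep th/td children in document order and trim stray whitespace.
            if ($node instanceof DOMElement && in_array($node->nodeName, array('th', 'td'), true)) {
                $cells[] = trim($node->textContent);
            }
        }
        $rows[] = $cells;
    }
    return $rows;
}
```

Each returned row can then be passed to fputcsv($fp, $row) exactly as in the loop above.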

  • 2020-12-02 19:56

    You can put this function in a separate JS file:

    function exportTableToCSV($table, filename) {
    
            var $rows = $table.find('tr:has(td)'),
    
                // Temporary delimiter characters unlikely to be typed by keyboard
                // This is to avoid accidentally splitting the actual contents
                tmpColDelim = String.fromCharCode(11), // vertical tab character
                tmpRowDelim = String.fromCharCode(0), // null character
    
                // actual delimiter characters for CSV format
                colDelim = '","',
                rowDelim = '"\r\n"',
    
                // Grab text from table into CSV formatted string
                csv = '"' + $rows.map(function (i, row) {
                    var $row = $(row),
                        $cols = $row.find('td');
    
                    return $cols.map(function (j, col) {
                        var $col = $(col),
                            text = $col.text();
    
                        return text.replace(/"/g, '""'); // escape every double quote, not just the first
    
                    }).get().join(tmpColDelim);
    
                }).get().join(tmpRowDelim)
                    .split(tmpRowDelim).join(rowDelim)
                    .split(tmpColDelim).join(colDelim) + '"',
    
                // Data URI
                csvData = 'data:application/csv;charset=utf-8,' + encodeURIComponent(csv);
    
            $(this).attr({
                'download': filename,
                'href': csvData,
                'target': '_blank'
            });
        }
    

    Now, to initiate this function, you can use:

    $('.getfile').click(function () {
        exportTableToCSV.apply(this, [$('#thetable'), 'filename.csv']);
    });
    

    Here 'getfile' is the class assigned to the element that triggers the download, and 'thetable' is the ID of the table you want to export. Clicking that element opens the download prompt. Because the function sets href and download on the clicked element, it should be a link (<a>).

    You can also change the download file name in the code.

  • 2020-12-02 19:58

    I've adapted a simple script, based on the code found in this thread, that now handles colspan and rowspan. It is not heavily tested, and I'm sure it could be optimized.

    Usage:

    require_once('table2csv.php');
    
    $table = '<table border="1">
        <tr>
        <th colspan=2>Header 1</th>
        </tr>
        <tr>
        <td>row 1, cell 1</td>
        <td>row 1, cell 2</td>
        </tr>
        <tr>
        <td>row 2, cell 1</td>
        <td>row 2, cell 2</td>
        </tr>
        <tr>
        <td rowspan=2>top left row</td>
        <td>top right row</td>
        </tr>
        <tr>
        <td>bottom right</td>
        </tr>
        </table>';
    
    table2csv($table,"sample.csv",true);
    

    table2csv.php

    <?php
    
        //download @ http://simplehtmldom.sourceforge.net/
        require_once('simple_html_dom.php');
        $repeatContentIntoSpannedCells = false;
    
    
        //--------------------------------------------------------------------------------------------------------------------
    
        function table2csv($rawHTML,$filename,$repeatContent) {
    
            //get rid of sups - they mess up the wmus
            for ($i=1; $i <= 20; $i++) { 
                $rawHTML = str_replace("<sup>".$i."</sup>", "", $rawHTML);
            }
    
            global $repeatContentIntoSpannedCells;
    
            $html = str_get_html(trim($rawHTML));
            $repeatContentIntoSpannedCells = $repeatContent;
    
            //we need to pre-initialize the array based on the size of the table (how many rows vs how many columns)
    
            //counting rows is easy
            $rowCount = count($html->find('tr'));
    
            //column counting is a bit trickier, we have to iterate through the rows and basically pull out the max found
            $colCount = 0;
            foreach ($html->find('tr') as $element) {
    
                $tempColCount = 0;
    
                foreach ($element->find('th') as $cell) {
                    $tempColCount++;
                }
    
                if ($tempColCount == 0) {
                    foreach ($element->find('td') as $cell) {
                        $tempColCount++;
                    }
                }
    
                if ($tempColCount > $colCount) $colCount = $tempColCount;
            }
    
            $mdTable = array();
    
            for ($i=0; $i < $rowCount; $i++) { 
                array_push($mdTable, array_fill(0, $colCount, NULL));
            }
    
            //////////done predefining array
    
            $rowPos = 0;
            $fp = fopen($filename, "w");
    
            foreach ($html->find('tr') as $element) {
    
                $colPos = 0;
    
                foreach ($element->find('th') as $cell) {
                    if (strpos(trim($cell->class), 'actions') === false && strpos(trim($cell->class), 'checker') === false) {
                        parseCell($cell,$mdTable,$rowPos,$colPos);
                    }
                    $colPos++;
                }
    
                foreach ($element->find('td') as $cell) {
                    if (strpos(trim($cell->class), 'actions') === false && strpos(trim($cell->class), 'checker') === false) {
                        parseCell($cell,$mdTable,$rowPos,$colPos);
                    }
                    $colPos++;
                }   
    
                $rowPos++;
            }
    
    
            foreach ($mdTable as $key => $row) {
    
                //clean the data
                array_walk($row, "cleanCell");
                fputcsv($fp, $row);
            }
    
            //close the file so the output is flushed to disk
            fclose($fp);
        }
    
    
        function cleanCell(&$contents,$key) {
    
            //cells that were never filled are still NULL, so cast before trimming
            $contents = trim((string) $contents);
    
            //get rid of pesky &nbsp's (aka: non-breaking spaces)
            $contents = trim($contents,chr(0xC2).chr(0xA0));
            $contents = str_replace("&nbsp;", "", $contents);
        }
    
    
        function parseCell(&$cell,&$mdTable,&$rowPos,&$colPos) {
    
            global $repeatContentIntoSpannedCells;
    
            //if data has already been set into the cell, skip it
            while (isset($mdTable[$rowPos][$colPos])) {
                $colPos++;
            }
    
            $mdTable[$rowPos][$colPos] = $cell->plaintext;
    
            if (isset($cell->rowspan)) {
    
                for ($i=1; $i <= ($cell->rowspan)-1; $i++) {
                    $mdTable[$rowPos+$i][$colPos] = ($repeatContentIntoSpannedCells ? $cell->plaintext : "");
                }
            }
    
            if (isset($cell->colspan)) {
    
                for ($i=1; $i <= ($cell->colspan)-1; $i++) {
    
                    $colPos++;
                    $mdTable[$rowPos][$colPos] = ($repeatContentIntoSpannedCells ? $cell->plaintext : "");
                }
            }
        }
    
    ?>
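
The span handling is easier to follow without the DOM library in the way. Here is a stripped-down sketch of the same grid idea on plain arrays. The input shape (each cell as an array with 'text' and optional 'rowspan' and 'colspan' keys) and the name expandSpans are assumptions made for illustration, not part of table2csv.php.

```php
<?php
// Hypothetical input: each row is a list of cells, each cell is
// array('text' => string, 'rowspan' => int (optional), 'colspan' => int (optional)).
function expandSpans(array $rows, $repeat = false)
{
    $grid = array();
    foreach ($rows as $r => $cells) {
        if (!isset($grid[$r])) {
            $grid[$r] = array();
        }
        $c = 0;
        foreach ($cells as $cell) {
            // Skip slots already claimed by a rowspan from an earlier row.
            while (array_key_exists($c, $grid[$r])) {
                $c++;
            }
            $rowspan = isset($cell['rowspan']) ? (int) $cell['rowspan'] : 1;
            $colspan = isset($cell['colspan']) ? (int) $cell['colspan'] : 1;
            // Fill the whole rectangle covered by the cell; only the top-left
            // slot keeps the text unless $repeat is true.
            for ($dr = 0; $dr < $rowspan; $dr++) {
                for ($dc = 0; $dc < $colspan; $dc++) {
                    $isOrigin = ($dr === 0 && $dc === 0);
                    $grid[$r + $dr][$c + $dc] = ($isOrigin || $repeat) ? $cell['text'] : '';
                }
            }
            $c += $colspan;
        }
    }
    // Order rows and columns by index so each row reads left to right.
    foreach ($grid as &$row) {
        ksort($row);
    }
    unset($row);
    ksort($grid);
    return $grid;
}

$grid = expandSpans(array(
    array(array('text' => 'top left', 'rowspan' => 2), array('text' => 'top right')),
    array(array('text' => 'bottom right')),
));
// $grid is now array(array('top left', 'top right'), array('', 'bottom right'))
```

Passing true as the second argument repeats the text into every spanned slot, which mirrors the $repeatContent flag above.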
    
  • 2020-12-02 20:07

    If anyone is using Baba's answer but scratching their head over extra white spaces being added, this will work:

    include "simple_html_dom.php";
    $table = '<table border="1">
    <tr>
    <th>Header 1</th>
    <th>Header 2</th>
    </tr>
    <tr>
    <td>row 1, cell 1</td>
    <td>row 1, cell 2</td>
    </tr>
    <tr>
    <td>row 2, cell 1</td>
    <td>row 2, cell 2</td>
    </tr>
    </table>';
    
    // Trim the markup first so leading/trailing whitespace is not parsed
    $html = str_get_html(trim($table));
    
    $fileName = "export.csv";
    // text/csv is the registered MIME type for CSV
    header('Content-Type: text/csv');
    header("Content-Disposition: attachment; filename=$fileName");
    
    $fp = fopen("php://output", "w");
    
    foreach ($html->find('tr') as $element)
    {
        // Collect header and data cells, trimming the stray whitespace
        $cells = array();
        foreach ($element->find('th') as $cell)
        {
            $cells[] = trim($cell->plaintext);
        }
        foreach ($element->find('td') as $cell)
        {
            $cells[] = trim($cell->plaintext);
        }
        // fputcsv quotes and escapes each field as needed
        if (!empty($cells)) {
            fputcsv($fp, $cells);
        }
    }
    fclose($fp);
    exit;
