How does Excel read an XML file?

梦毁少年i 2020-12-20 20:33

I have done a lot of research on converting an XML file to a 2D array the same way Excel does. I am trying to reproduce the algorithm Excel uses when you open an XML file in it.

3 Answers
  • 2020-12-20 21:04

    Your question is vague, but as I understand it, what you call "Excel" does the following (in my own words): it takes each /items/item element as a row. The column names are the tag names of the leaf element nodes, in document order. If a name occurs more than once, its column position is that of the first occurrence.
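
    The column-naming rule can be sketched in isolation. This is only an illustration of the rule as described here, not a statement of what Excel does internally. The XML string is a hypothetical sample modelled on the element names used in this thread, and `column_names` is a made-up helper name.

    ```php
    // Hypothetical sample; the asker's real file is not reproduced here.
    $xml = simplexml_load_string(
        '<items><item>'
        . '<sku>abc 1</sku><title>a book 1</title><price>42 1</price>'
        . '<attributes>'
        . '<attribute><name>Number of pages 1</name><value>123 1</value></attribute>'
        . '<attribute><name>Author 1</name><value>Rob dude 1</value></attribute>'
        . '</attributes>'
        . '<contributors><contributor>John 1</contributor><contributor>Ryan 1</contributor></contributors>'
        . '<isbn>12345</isbn>'
        . '</item></items>'
    );

    function column_names(SimpleXMLElement $xml)
    {
        $columns = [];
        // Leaf elements (no element children) below the first row element,
        // returned by XPath in document order.
        foreach ($xml->xpath('/*/*[1]//*[not(*)]') as $leaf) {
            // Assigning an existing key again does not move it, so a duplicate
            // tag name keeps the column position of its first occurrence.
            $columns[$leaf->getName()] = null;
        }

        return array_keys($columns);
    }

    echo implode(', ', column_names($xml)), "\n";
    ```

    Using the tag names as array keys is what collapses the two `name`/`value` pairs and the two `contributor` elements into one column each.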

    It then creates one output row per item, but only if all of the item's child elements are leaf elements. Otherwise the item serves as the base for several rows, and the children that contain further elements are interpolated into it. For example, if an item contains two groups of two leaf elements with the same names, these are interpolated into two rows. Their values are placed in the columns with the matching names, following the logic described in the first paragraph.

    How deep this logic is followed is not clear from your question, so I keep it at that level only. Otherwise the interpolation would need to recurse deeper into the tree, and the algorithm as outlined might no longer fit.
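
    The interpolation step can be illustrated on plain arrays, one level deep only. This is a sketch under assumptions: `interpolate_rows`, the array shapes, and the sample values are hypothetical and chosen purely for illustration.

    ```php
    // Hypothetical helper: yields one row per nested group, or the base row
    // itself when there is nothing to interpolate.
    function interpolate_rows(array $baseRow, array $groups)
    {
        if (!$groups) {
            // Only leaf children: the item is exactly one row.
            yield $baseRow;
            return;
        }

        foreach ($groups as $group) {
            // Copy the base row and fill in this group's cells. Keys that are
            // not columns of the base row are ignored.
            yield array_replace($baseRow, array_intersect_key($group, $baseRow));
        }
    }

    $base = ['sku' => 'abc 1', 'name' => null, 'value' => null, 'contributor' => null];
    $groups = [
        ['name' => 'Author 1', 'value' => 'Rob dude 1'],
        ['contributor' => 'John 1'],
    ];

    foreach (interpolate_rows($base, $groups) as $row) {
        echo implode(' | ', array_map('strval', $row)), "\n";
    }
    ```

    Because `array_replace` keeps the key order of the base row, every yielded row has its cells in the same column order.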

    To build that in PHP, XPath is particularly helpful, and the interpolation works very well as a Generator.

    function tree_to_rows(SimpleXMLElement $xml)
    {
        // Column names: leaf elements of the first row element, in document order.
        // Assigning an existing key again keeps its first position.
        $columns = [];
    
        foreach ($xml->xpath('/*/*[1]//*[not(*)]') as $leaf) {
            $columns[$leaf->getName()] = null;
        }
    
        // The first yielded row is the header.
        yield array_keys($columns);
    
        // Tag name of the row elements, e.g. "item".
        $name = $xml->xpath('/*/*[1]')[0]->getName();
    
        foreach ($xml->$name as $source) {
            $rowModel       = $columns; // every column present, all values null
            $interpolations = [];
    
            foreach ($source as $child) {
                if ($child->count()) {
                    // Has child elements: expanded into extra rows below.
                    $interpolations[] = $child;
                } else {
                    $rowModel[$child->getName()] = (string) $child;
                }
            }
    
            if (!$interpolations) {
                yield array_values($rowModel);
                continue;
            }
    
            foreach ($interpolations as $interpolation) {
                foreach ($interpolation as $interpolationStep) {
                    // One output row per child of the non-leaf element.
                    $row = $rowModel;
                    foreach ($interpolationStep->xpath('(.|.//*)[not(*)]') as $leaf) {
                        $row[$leaf->getName()] = (string) $leaf;
                    }
                    yield array_values($row);
                }
            }
        }
    }
    

    Using it can then be as straightforward as:

    $xml  = simplexml_load_file('items.xml');
    $rows = tree_to_rows($xml);
    echo new TextTable($rows);
    

    This gives the following example output:

    +-----+--------+-----+-----------------+----------+-----------+-----+
    |sku  |title   |price|name             |value     |contributor|isbn |
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 1|a book 1|42 1 |Number of pages 1|123 1     |           |12345|
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 1|a book 1|42 1 |Author 1         |Rob dude 1|           |12345|
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 1|a book 1|42 1 |                 |          |John 1     |12345|
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 1|a book 1|42 1 |                 |          |Ryan 1     |12345|
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 2|a book 2|42 2 |Number of pages 2|123 2     |           |6789 |
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 2|a book 2|42 2 |Author 2         |Rob dude 2|           |6789 |
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 2|a book 2|42 2 |                 |          |John 2     |6789 |
    +-----+--------+-----+-----------------+----------+-----------+-----+
    |abc 2|a book 2|42 2 |                 |          |Ryan 2     |6789 |
    +-----+--------+-----+-----------------+----------+-----------+-----+
    

    The TextTable is a slightly modified version of the one at https://gist.github.com/hakre/5734770 that can operate on Generators, in case you're looking for that code.
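
    If you only need the idea and not the gist itself, a table like the one above can be rendered from any iterable of rows along the following lines. This is a rough sketch and not the TextTable class from the gist: `render_text_table` and `demo_rows` are hypothetical names, every row is assumed to have the same number of cells, and widths are measured in bytes with `strlen`.

    ```php
    // Hypothetical renderer: accepts an array or a Generator of rows.
    function render_text_table($rows)
    {
        // Materialise the rows first, because column widths depend on all of them.
        $table = [];
        foreach ($rows as $row) {
            $table[] = array_map('strval', array_values($row));
        }
        if (!$table) {
            return '';
        }

        // Each column is as wide as its longest cell.
        $widths = array_fill(0, count($table[0]), 0);
        foreach ($table as $row) {
            foreach ($row as $i => $cell) {
                $widths[$i] = max($widths[$i], strlen($cell));
            }
        }

        $separator = '+';
        foreach ($widths as $width) {
            $separator .= str_repeat('-', $width) . '+';
        }

        $lines = [$separator];
        foreach ($table as $row) {
            $line = '|';
            foreach ($row as $i => $cell) {
                $line .= str_pad($cell, $widths[$i]) . '|';
            }
            $lines[] = $line;
            $lines[] = $separator;
        }

        return implode("\n", $lines) . "\n";
    }

    // Demo generator standing in for tree_to_rows().
    function demo_rows()
    {
        yield ['sku', 'isbn'];
        yield ['abc 1', '12345'];
    }

    echo render_text_table(demo_rows());
    ```

    The rows are collected before printing because the width of a column is only known once the last row has been seen, so the output itself cannot be streamed.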

  • 2020-12-20 21:20

    The PHP library PHPExcel solves your issue:

    https://phpexcel.codeplex.com/

    You can find some samples here too:

    https://phpexcel.codeplex.com/wikipage?title=Examples&referringTitle=Home

    https://github.com/PHPOffice/PHPExcel/wiki/User%20Documentation

    It's the most reliable Excel library for PHP and it's constantly maintained and upgraded.

    Keep in mind that you can read (from an Excel file etc.) and write (to an Excel file, PDF etc.).

  • 2020-12-20 21:28

    To get the array you want from the XML file you provided, you would have to do it this way. This was not overly fun, so I hope it is indeed what you wanted.

    Given the exact XML you posted above, it produces the output you showed as your final result.

    This was written in PHP 5.6. I believe you will have to move the function calls onto their own lines and replace [] with array() if you run into issues in your environment.

    $items = simplexml_load_file("items.xml");
    
    $items_array = [];
    
    foreach($items as $item) {
    
        foreach($item->attributes->attribute as $attribute) {
            array_push($items_array, itemsFactory($item, (array) $attribute));
        }
    
        foreach((array) $item->contributors->contributor as $contributer) {
            array_push($items_array, itemsFactory($item, $contributer));
        }
    
    }
    
    function itemsFactory($item, $vars) {
    
        $item = (array) $item;
    
        return [
            "sku" => $item['sku'],
            "title" => $item['title'],
            "price" => $item['price'],
            "name" => (is_array($vars) ? $vars['name'] : ""),
            "value" => (is_array($vars) ? $vars['value'] : ""),
            "contributer" => (is_string($vars) ? $vars : ""),
            "isbn" => $item['isbn']
        ];
    
    }
    
    var_dump($items_array);
    

    Here is the result when run on your XML file...

    array(8) {
      [0]=>
      array(7) {
        ["sku"]=>
        string(5) "abc 1"
        ["title"]=>
        string(8) "a book 1"
        ["price"]=>
        string(4) "42 1"
        ["name"]=>
        string(17) "Number of pages 1"
        ["value"]=>
        string(5) "123 1"
        ["contributer"]=>
        string(0) ""
        ["isbn"]=>
        string(5) "12345"
      }
      [1]=>
      array(7) {
        ["sku"]=>
        string(5) "abc 1"
        ["title"]=>
        string(8) "a book 1"
        ["price"]=>
        string(4) "42 1"
        ["name"]=>
        string(8) "Author 1"
        ["value"]=>
        string(10) "Rob dude 1"
        ["contributer"]=>
        string(0) ""
        ["isbn"]=>
        string(5) "12345"
      }
      [2]=>
      array(7) {
        ["sku"]=>
        string(5) "abc 1"
        ["title"]=>
        string(8) "a book 1"
        ["price"]=>
        string(4) "42 1"
        ["name"]=>
        string(0) ""
        ["value"]=>
        string(0) ""
        ["contributer"]=>
        string(6) "John 1"
        ["isbn"]=>
        string(5) "12345"
      }
      [3]=>
      array(7) {
        ["sku"]=>
        string(5) "abc 1"
        ["title"]=>
        string(8) "a book 1"
        ["price"]=>
        string(4) "42 1"
        ["name"]=>
        string(0) ""
        ["value"]=>
        string(0) ""
        ["contributer"]=>
        string(6) "Ryan 1"
        ["isbn"]=>
        string(5) "12345"
      }
      [4]=>
      array(7) {
        ["sku"]=>
        string(5) "abc 2"
        ["title"]=>
        string(8) "a book 2"
        ["price"]=>
        string(4) "42 2"
        ["name"]=>
        string(17) "Number of pages 2"
        ["value"]=>
        string(5) "123 2"
        ["contributer"]=>
        string(0) ""
        ["isbn"]=>
        string(4) "6789"
      }
      [5]=>
      array(7) {
        ["sku"]=>
        string(5) "abc 2"
        ["title"]=>
        string(8) "a book 2"
        ["price"]=>
        string(4) "42 2"
        ["name"]=>
        string(8) "Author 2"
        ["value"]=>
        string(10) "Rob dude 2"
        ["contributer"]=>
        string(0) ""
        ["isbn"]=>
        string(4) "6789"
      }
      [6]=>
      array(7) {
        ["sku"]=>
        string(5) "abc 2"
        ["title"]=>
        string(8) "a book 2"
        ["price"]=>
        string(4) "42 2"
        ["name"]=>
        string(0) ""
        ["value"]=>
        string(0) ""
        ["contributer"]=>
        string(6) "John 2"
        ["isbn"]=>
        string(4) "6789"
      }
      [7]=>
      array(7) {
        ["sku"]=>
        string(5) "abc 2"
        ["title"]=>
        string(8) "a book 2"
        ["price"]=>
        string(4) "42 2"
        ["name"]=>
        string(0) ""
        ["value"]=>
        string(0) ""
        ["contributer"]=>
        string(6) "Ryan 2"
        ["isbn"]=>
        string(4) "6789"
      }
    }
    

    If you actually have access to the Excel file and not just the XML, this could be much easier. In that case we can use PHPExcel to render the exact same thing, and it would work for any dataset, not just the one specified. If not, I can't think of any other way to transform that XML file into what you want.

    EDIT:

    This may also shed some more light on the subject. It is from the developer of PHPExcel himself: "PHPExcel factory error when reading XML from URL". As you can see, I don't think you can write something that parses any XML file you throw at it without getting hold of some of Excel's source code or spending a very long time working on it, which is far beyond the scope of this question. However, if you were to write something that parses any XML file, I have a feeling it would look like the above but with a TON of conditionals.
