XML file to a phpMyAdmin database

◇◆丶佛笑我妖孽 submitted on 2020-02-07 03:42:06

Question


I have an XML file (the header and first entry are shown below) and I want to load it into a newly created database. It looks like the XML will need at least two different tables.

I also want the import to skip the elements I have marked IGNORE below, since I won't be needing them.

Also, should I create the tables and columns first? Or can the interface create them for me?

<?xml version="1.0" encoding="UTF-8" ?>
<npidata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.npi.gov.au/includes/download/npi-data-download.xsd">
<report>
    <year>2014</year>
    <registered_business_name>W H HECK &amp; SONS PTY LTD</registered_business_name> - IGNORE
    <abn>91009661401</abn> - IGNORE
    <acn>009661401</acn> - IGNORE
    <sub_threshold>N</sub_threshold>
    <data_start_date>2013-07-01</data_start_date>
    <data_end_date>2014-06-30</data_end_date>
    <first_published_date>2015-03-31</first_published_date>
    <last_updated_date>2015-03-31</last_updated_date>
    <number_of_employees>55</number_of_employees> - IGNORE
    <facility_name>Rocky Point Sugarmill</facility_name>
    <jurisdiction_facility_id>Q014HEK001</jurisdiction_facility_id> - PRIMARY KEY
    <jurisdiction>QLD</jurisdiction> - IGNORE
    <site_address_street>MILL RD</site_address_street> - IGNORE
    <site_address_suburb>Woongoolba</site_address_suburb>
    <site_address_state>QLD</site_address_state>
    <site_address_postcode>4207</site_address_postcode>
    <main_activities>Sugarmilling</main_activities>
    <site_latitude>-27.735666</site_latitude>
    <site_longitude>153.327611</site_longitude>
    <anzsic_codes> - IGNORE
        <anzsic_code> - IGNORE
            <type>Primary</type><code>1181</code> - IGNORE
            <name>Sugar Manufacturing</name> - IGNORE
        </anzsic_code> - IGNORE
    </anzsic_codes> - IGNORE
    <public_contact> - IGNORE
        <title>Mr</title>
        <name>David</name> - IGNORE
        <surname>Heck</surname> - IGNORE
        <position>Managing Director</position> - IGNORE
        <phone>(07) 5547 4500</phone> - IGNORE
        <fax>(07) 5546 1233</fax> - IGNORE
        <email>david@heckgroup.com.au</email> - IGNORE
    </public_contact> - IGNORE
    <emissions>
        <emission>
            <substance>Ethanol</substance>
            <destination>Water</destination>
            <quantity_in_kg>19518</quantity_in_kg>
            <mass_balance_estimation>N</mass_balance_estimation>
            <engineering_calculations_estimation>Y</engineering_calculations_estimation>
            <direct_measurement_estimation>N</direct_measurement_estimation>
            <emission_factors_estimation>N</emission_factors_estimation>
            <approved_alternative_estimation>N</approved_alternative_estimation>
        </emission>
        <emission>
            <substance>Ethanol</substance>
            <destination>Air Fugitive</destination>
            <quantity_in_kg>2114.45</quantity_in_kg>
            <mass_balance_estimation>N</mass_balance_estimation>
            <engineering_calculations_estimation>Y</engineering_calculations_estimation>
            <direct_measurement_estimation>N</direct_measurement_estimation>
            <emission_factors_estimation>N</emission_factors_estimation>
            <approved_alternative_estimation>N</approved_alternative_estimation>
        </emission>
        <emission>
            <substance>Ethanol</substance>
            <destination>Air Total</destination>
            <quantity_in_kg>2114.45</quantity_in_kg>
        </emission>
        <emission>
            <substance>Hydrochloric acid</substance>
            <destination>Land</destination>
            <quantity_in_kg>2282.95</quantity_in_kg>
            <mass_balance_estimation>N</mass_balance_estimation>
            <engineering_calculations_estimation>Y</engineering_calculations_estimation>
            <direct_measurement_estimation>N</direct_measurement_estimation>
            <emission_factors_estimation>N</emission_factors_estimation>
            <approved_alternative_estimation>N</approved_alternative_estimation>
        </emission>
        <emission>
            <substance>Particulate Matter 10.0 um</substance>
            <destination>Air Fugitive</destination>
            <quantity_in_kg>398.91285</quantity_in_kg>
            <mass_balance_estimation>N</mass_balance_estimation>
            <engineering_calculations_estimation>Y</engineering_calculations_estimation>
            <direct_measurement_estimation>N</direct_measurement_estimation>
            <emission_factors_estimation>N</emission_factors_estimation>
            <approved_alternative_estimation>N</approved_alternative_estimation>
        </emission>
        <emission>
            <substance>Particulate Matter 10.0 um</substance>
            <destination>Air Total</destination>
            <quantity_in_kg>398.91285</quantity_in_kg>
        </emission>
    </emissions>
    <transfers></transfers> - IGNORE
    <pollution_control_devices></pollution_control_devices> - IGNORE
    <cleaner_production_activities></cleaner_production_activities> - IGNORE
</report>

This XML file has over 11,000 entries so manually entering them is out of the question.


Answer 1:


You're going to have to do some manipulation on the file first. It's true that phpMyAdmin can import and export XML data, but the format used by phpMyAdmin isn't the same format your data currently exists in.

The unfortunate thing about XML is that, while it makes a great wrapper for any kind of data, interoperability requires both systems to speak the same dialect, and I rarely see that work out in practice.

Is there any chance of getting the data by some other means -- exporting to a different format, for instance?

Here's an example of how phpMyAdmin is expecting the data to be formatted; you can see more detail by doing an export of an existing table since I only picked out a bit from the middle.

<database name="stackoverflowtest">
    <table name="tblUsers">
        <column name="id">1</column>
        <column name="name">Bob</column>
    </table>
    <table name="tblUsers">
        <column name="id">2</column>
        <column name="name">Jim</column>
    </table>
</database>

If you're not able to get the data in another format, I suggest using a text-processing tool to manipulate the data you do have. I think CSV might be a better format for your import, and with some work you could write an awk/sed/perl/python script to produce it (or use probably any other language, but those are common for manipulating text files).

You might even find an XML parsing library that lets you read the file as actual XML elements rather than having to parse it yourself. That's probably the ideal solution, but even without it you could look for each line containing <year> and </year> and pull out the data in between. Add a comma, then repeat for <data_start_date> and so on until you reach the end of a <report>/</report> block, and add a newline. Loop that over your whole XML file and you'll have a well-formatted CSV file that phpMyAdmin will gladly load.

MySQL itself uses a slightly different flavor of XML (still different from what you have available); that would look more like:

<row>
    <field name="id">1</field>
    <field name="name">Bob</field>
</row>
<row>
    <field name="id">2</field>
    <field name="name">Jim</field>
</row>

I think converting to CSV is going to be easier than working with XML. Unfortunately the solution will take some work, but hopefully I've given you some resources to start from. There are countless tutorials and examples of using the tools I mentioned for manipulating files, though again I suggest looking into an XML-friendly Python, Perl, or even PHP library first.
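
As a rough illustration of the XML-library route, here is a minimal Python sketch using the standard library's xml.etree.ElementTree and csv modules. It is only one possible approach, not a tested import pipeline. The function name xml_to_csv, the two-file split (one CSV for reports, one for emissions), and the choice to repeat jurisdiction_facility_id and year on every emission row as a link back to the report are all assumptions made for the example. The field lists simply leave out the elements marked IGNORE in the question.

```python
import csv
import xml.etree.ElementTree as ET

# Columns kept from each <report>; elements marked IGNORE in the question are omitted.
REPORT_FIELDS = [
    "jurisdiction_facility_id",  # marked PRIMARY KEY in the question
    "year",
    "sub_threshold",
    "data_start_date",
    "data_end_date",
    "first_published_date",
    "last_updated_date",
    "facility_name",
    "site_address_suburb",
    "site_address_state",
    "site_address_postcode",
    "main_activities",
    "site_latitude",
    "site_longitude",
]

# Columns kept from each <emission>; some entries lack the *_estimation elements.
EMISSION_FIELDS = [
    "substance",
    "destination",
    "quantity_in_kg",
    "mass_balance_estimation",
    "engineering_calculations_estimation",
    "direct_measurement_estimation",
    "emission_factors_estimation",
    "approved_alternative_estimation",
]


def xml_to_csv(xml_source, reports_out, emissions_out):
    """Write one CSV row per <report> and one CSV row per <emission>.

    xml_source is a filename or a binary file object; the two outputs are
    text file objects (hypothetical helper, named here for illustration).
    """
    reports = csv.writer(reports_out)
    emissions = csv.writer(emissions_out)
    reports.writerow(REPORT_FIELDS)
    # Assumption: facility id plus year is enough to link an emission to its report.
    emissions.writerow(["jurisdiction_facility_id", "year"] + EMISSION_FIELDS)

    root = ET.parse(xml_source).getroot()
    for report in root.iter("report"):
        # findtext returns the default when the child element is missing.
        reports.writerow([report.findtext(f, default="") for f in REPORT_FIELDS])
        facility_id = report.findtext("jurisdiction_facility_id", default="")
        year = report.findtext("year", default="")
        for emission in report.iterfind("emissions/emission"):
            values = [emission.findtext(f, default="") for f in EMISSION_FIELDS]
            emissions.writerow([facility_id, year] + values)
```

To use it, open the XML file in binary mode and the two output files in text mode with newline="" (as the csv module documentation recommends), then pass all three to xml_to_csv. Each resulting file can then be imported into its own table with phpMyAdmin's CSV import. Whether jurisdiction_facility_id alone is unique across reporting years is worth checking before making it the primary key.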



Source: https://stackoverflow.com/questions/31197287/xml-file-to-a-phpmyadmin-database
