I have a lot of XML files that I would like to import into the table xml_data:
create table xml_data(result xml);
To do this I hav
Necromancing: For those that need a working example:
DO $$
DECLARE myxml xml;
BEGIN
myxml := XMLPARSE(DOCUMENT convert_from(pg_read_binary_file('MyData.xml'), 'UTF8'));
DROP TABLE IF EXISTS mytable;
CREATE TEMP TABLE mytable AS
SELECT
(xpath('//ID/text()', x))[1]::text AS id
,(xpath('//Name/text()', x))[1]::text AS Name
,(xpath('//RFC/text()', x))[1]::text AS RFC
,(xpath('//Text/text()', x))[1]::text AS Text
,(xpath('//Desc/text()', x))[1]::text AS Desc
FROM unnest(xpath('//record', myxml)) x
;
END$$;
SELECT * FROM mytable;
Or, with less noise:
SELECT
(xpath('//ID/text()', myTempTable.myXmlColumn))[1]::text AS id
,(xpath('//Name/text()', myTempTable.myXmlColumn))[1]::text AS Name
,(xpath('//RFC/text()', myTempTable.myXmlColumn))[1]::text AS RFC
,(xpath('//Text/text()', myTempTable.myXmlColumn))[1]::text AS Text
,(xpath('//Desc/text()', myTempTable.myXmlColumn))[1]::text AS Desc
,myTempTable.myXmlColumn as myXmlElement
FROM unnest(
xpath
( '//record'
,XMLPARSE(DOCUMENT convert_from(pg_read_binary_file('MyData.xml'), 'UTF8'))
)
) AS myTempTable(myXmlColumn)
;
With this example XML file (MyData.xml):
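<data-set>
    <record>
        <ID>1</ID>
        <Name>A</Name>
        <RFC>RFC 1035[1]</RFC>
        <Text>Address record</Text>
        <Desc>Returns a 32-bit IPv4 address, most commonly used to map hostnames to an IP address of the host, but it is also used for DNSBLs, storing subnet masks in RFC 1101, etc.</Desc>
    </record>
    <record>
        <ID>2</ID>
        <Name>NS</Name>
        <RFC>RFC 1035[1]</RFC>
        <Text>Name server record</Text>
        <Desc>Delegates a DNS zone to use the given authoritative name servers</Desc>
    </record>
</data-set>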
Note:
MyData.xml needs to be in the PostgreSQL data directory (PGDATA, the parent directory of the pg_stat directory),
e.g. /var/lib/postgresql/9.3/main/MyData.xml
This requires PostgreSQL 9.1+.
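If you are not sure where the data directory is, the server can tell you. The sketch below also stores the parsed document in the xml_data table from the question. It assumes a role that is allowed to call pg_read_binary_file (a superuser by default) and a file named MyData.xml as in the example above; one INSERT per file would be needed for many files.

```sql
-- Shows the directory that relative paths given to pg_read_binary_file() are resolved against.
SHOW data_directory;

-- Table definition taken from the question.
CREATE TABLE IF NOT EXISTS xml_data(result xml);

-- Store the whole document as a single row; repeat this statement for each file.
INSERT INTO xml_data(result)
SELECT XMLPARSE(DOCUMENT convert_from(pg_read_binary_file('MyData.xml'), 'UTF8'));
```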
You can also achieve this without a file, like this:
SELECT
(xpath('//ID/text()', myTempTable.myXmlColumn))[1]::text AS id
,(xpath('//Name/text()', myTempTable.myXmlColumn))[1]::text AS Name
,(xpath('//RFC/text()', myTempTable.myXmlColumn))[1]::text AS RFC
,(xpath('//Text/text()', myTempTable.myXmlColumn))[1]::text AS Text
,(xpath('//Desc/text()', myTempTable.myXmlColumn))[1]::text AS Desc
,myTempTable.myXmlColumn as myXmlElement
-- Source: https://en.wikipedia.org/wiki/List_of_DNS_record_types
FROM unnest(xpath('//record',
CAST('
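<data-set>
    <record>
        <ID>1</ID>
        <Name>A</Name>
        <RFC>RFC 1035[1]</RFC>
        <Text>Address record</Text>
        <Desc>Returns a 32-bit IPv4 address, most commonly used to map hostnames to an IP address of the host, but it is also used for DNSBLs, storing subnet masks in RFC 1101, etc.</Desc>
    </record>
    <record>
        <ID>2</ID>
        <Name>NS</Name>
        <RFC>RFC 1035[1]</RFC>
        <Text>Name server record</Text>
        <Desc>Delegates a DNS zone to use the given authoritative name servers</Desc>
    </record>
</data-set>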
' AS xml)
)) AS myTempTable(myXmlColumn)
;
Note that unlike in MS-SQL, xpath text() returns NULL on a NULL value, and not an empty string.
If for whatever reason you need to check explicitly for NULL, you can use the predicate [not(@xsi:nil="true")]. In that case you need to pass an array of namespaces to xpath, because otherwise you get an error (you can, however, omit all namespaces except xsi).
SELECT
(xpath('//xmlEncodeTest[1]/text()', myTempTable.myXmlColumn))[1]::text AS c1
,(
xpath('//xmlEncodeTest[1][not(@xsi:nil="true")]/text()', myTempTable.myXmlColumn
,
ARRAY[
-- ARRAY['xmlns','http://www.w3.org/1999/xhtml'], -- defaultns
ARRAY['xsi','http://www.w3.org/2001/XMLSchema-instance'],
ARRAY['xsd','http://www.w3.org/2001/XMLSchema'],
ARRAY['svg','http://www.w3.org/2000/svg'],
ARRAY['xsl','http://www.w3.org/1999/XSL/Transform']
]
)
)[1]::text AS c22
,(xpath('//nixda[1]/text()', myTempTable.myXmlColumn))[1]::text AS c2
--,myTempTable.myXmlColumn as myXmlElement
,xmlexists('//xmlEncodeTest[1]' PASSING BY REF myTempTable.myXmlColumn) AS c1e
,xmlexists('//nixda[1]' PASSING BY REF myTempTable.myXmlColumn) AS c2e
,xmlexists('//xmlEncodeTestAbc[1]' PASSING BY REF myTempTable.myXmlColumn) AS c1ea
FROM unnest(xpath('//row',
CAST('
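<table xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <row>
        <xmlEncodeTest xsi:nil="true" />
        <nixda>noob</nixda>
    </row>
</table>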
' AS xml)
)
) AS myTempTable(myXmlColumn)
;
You can also check whether a field is present in an XML text by doing
,xmlexists('//xmlEncodeTest[1]' PASSING BY REF myTempTable.myXmlColumn) AS c1e
for example when you pass an XML value to a stored procedure/function for CRUD (see above).
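As a minimal sketch of why that check matters for CRUD: it lets you tell "the element was sent but is empty" apart from "the element was not sent at all", which text() alone cannot do because both yield NULL. The element name nickname and the three sample documents are made up for illustration.

```sql
SELECT
     v.doc
    -- true when the element is present, even if it is empty
    ,xmlexists('/row/nickname' PASSING BY REF v.doc) AS was_sent
    -- NULL both for an empty element and for a missing one
    ,(xpath('/row/nickname/text()', v.doc))[1]::text AS new_value
FROM (VALUES
     ('<row><nickname>bob</nickname></row>'::xml)  -- value supplied
    ,('<row><nickname /></row>'::xml)              -- element present, no text
    ,('<row />'::xml)                              -- element missing
) AS v(doc)
;
```

An update routine could then overwrite the column only when was_sent is true and leave it untouched otherwise.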
Also, note that the correct way to pass a NULL value in XML is <elementName xsi:nil="true" /> and not <elementName /> or nothing. There is no correct way to pass NULL in attributes (you can only omit the attribute, but then it gets difficult/slow to infer the number of columns and their names in a large dataset).
Storing the values in attributes (column1, column2 and column3 in the query below) is more compact, but very bad if you need to import it, especially from XML files with multiple GB of data (see a wonderful example of that in the Stack Overflow data dump).
SELECT
myTempTable.myXmlColumn
,(xpath('//@column1', myTempTable.myXmlColumn))[1]::text AS c1
,(xpath('//@column2', myTempTable.myXmlColumn))[1]::text AS c2
,(xpath('//@column3', myTempTable.myXmlColumn))[1]::text AS c3
,xmlexists('//@column3' PASSING BY REF myTempTable.myXmlColumn) AS c3e
,case when (xpath('//@column3', myTempTable.myXmlColumn))[1]::text is null then 1 else 0 end AS is_null
FROM unnest(xpath('//row', '
'
)) AS myTempTable(myXmlColumn)
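
For a self-contained run of the attribute case, here is a sketch with made-up sample data; the root element name and the attribute values are assumptions, and only the attribute names column1 to column3 come from the query above. The first row omits column3, so its c3 should come back as NULL and c3e as false.

```sql
SELECT
     (xpath('//@column1', myTempTable.myXmlColumn))[1]::text AS c1
    ,(xpath('//@column2', myTempTable.myXmlColumn))[1]::text AS c2
    ,(xpath('//@column3', myTempTable.myXmlColumn))[1]::text AS c3
    ,xmlexists('//@column3' PASSING BY REF myTempTable.myXmlColumn) AS c3e
FROM unnest(xpath('//row',
    -- hypothetical sample document, for illustration only
    CAST('
<table>
    <row column1="a" column2="1" />
    <row column1="b" column2="2" column3="c" />
</table>
' AS xml)
)) AS myTempTable(myXmlColumn)
;
```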