How to flatten an XML file into a set of XPath expressions?

暗喜 2021-01-03 08:25

Suppose I have the following example XML file:


   
2 answers
  •  攒了一身酷
    2021-01-03 09:17

    You could do this pretty easily with XSLT. Looking at your examples, it seems like you only want the XPath of elements that contain text. If that's not the case, let me know and I can update the XSLT.

    I created a new input example to show how it handles siblings with the same name. In this case, the article elements.

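    XML Input

    <create>
        <article>
            <name>foo</name>
            <description>bar</description>
            <price>
                <amount>00.00</amount>
                <currency>USD</currency>
            </price>
            <id>1</id>
        </article>
        <article>
            <name>some name</name>
            <description>some description</description>
            <price>
                <amount>00.01</amount>
                <currency>USD</currency>
            </price>
            <id>2</id>
        </article>
    </create>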

    XSLT 1.0

    
        
        
    
        
    
        
            
            
        
    
        
            
            
            
                
                    
                
            
            
                
                
    
            
        
    
    
    

    Output

    /create[1]/article[1]/name[1]
    /create[1]/article[1]/description[1]
    /create[1]/article[1]/price[1]/amount[1]
    /create[1]/article[1]/price[1]/currency[1]
    /create[1]/article[1]/id[1]
    /create[1]/article[2]/name[1]
    /create[1]/article[2]/description[1]
    /create[1]/article[2]/price[1]/amount[1]
    /create[1]/article[2]/price[1]/currency[1]
    /create[1]/article[2]/id[1]
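
As a cross-check of the logic, here is a minimal Python sketch (not the XSLT from this answer) that produces the same listing. It assumes an input document rebuilt from the output paths and text values shown above; the names `has_text` and `text_paths` are made up for illustration. It keeps only elements that directly contain non-blank text, which is roughly what a `*[text()]` match does once whitespace-only text nodes are stripped, and it numbers each element by counting its same-named preceding siblings.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed input, rebuilt from the output listed above: two article siblings.
XML = """
<create>
  <article>
    <name>foo</name>
    <description>bar</description>
    <price><amount>00.00</amount><currency>USD</currency></price>
    <id>1</id>
  </article>
  <article>
    <name>some name</name>
    <description>some description</description>
    <price><amount>00.01</amount><currency>USD</currency></price>
    <id>2</id>
  </article>
</create>
"""

def has_text(elem):
    """True if elem has a non-blank text node as a direct child."""
    # In ElementTree, an element's text children are its own .text
    # plus the .tail of each of its child elements.
    chunks = [elem.text] + [child.tail for child in elem]
    return any(chunk and chunk.strip() for chunk in chunks)

def text_paths(elem, prefix="", index=1):
    """Yield an indexed path for every element that directly contains text."""
    path = f"{prefix}/{elem.tag}[{index}]"
    if has_text(elem):
        yield path
    seen = Counter()  # number of same-named siblings visited so far
    for child in elem:
        seen[child.tag] += 1
        yield from text_paths(child, path, seen[child.tag])

paths = list(text_paths(ET.fromstring(XML)))
print("\n".join(paths))
```

The sketch ignores namespaces and attributes; with namespaced input, `elem.tag` would include the namespace URI in braces and would need to be trimmed to get a local name.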
    

    UPDATE

    For the XSLT to work for all elements, simply remove the [text()] predicate from match="*[text()]". This will output the path for every element. If you don't want the path output for elements that contain other elements (like create, article, and price), add the predicate [not(*)]. Here's an updated example:

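    New XML Input

    <create>
        <article>
            <name/>
            <description/>
            <price>
                <amount/>
                <currency/>
            </price>
            <id/>
        </article>
        <article>
            <name>some name</name>
            <description>some description</description>
            <price>
                <amount>00.01</amount>
                <currency>USD</currency>
            </price>
            <id>2</id>
        </article>
    </create>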

    XSLT 1.0

    
        
        
    
        
    
        
            
            
        
    
        
            
            
            
                
                    
                
            
            
                
                
    
            
        
    
    
    

    Output

    /create[1]/article[1]/name[1]
    /create[1]/article[1]/description[1]
    /create[1]/article[1]/price[1]/amount[1]
    /create[1]/article[1]/price[1]/currency[1]
    /create[1]/article[1]/id[1]
    /create[1]/article[2]/name[1]
    /create[1]/article[2]/description[1]
    /create[1]/article[2]/price[1]/amount[1]
    /create[1]/article[2]/price[1]/currency[1]
    /create[1]/article[2]/id[1]
    

    If you remove the [not(*)] predicate, this is what the output looks like (a path is output for every element):

    /create[1]
    /create[1]/article[1]
    /create[1]/article[1]/name[1]
    /create[1]/article[1]/description[1]
    /create[1]/article[1]/price[1]
    /create[1]/article[1]/price[1]/amount[1]
    /create[1]/article[1]/price[1]/currency[1]
    /create[1]/article[1]/id[1]
    /create[1]/article[2]
    /create[1]/article[2]/name[1]
    /create[1]/article[2]/description[1]
    /create[1]/article[2]/price[1]
    /create[1]/article[2]/price[1]/amount[1]
    /create[1]/article[2]/price[1]/currency[1]
    /create[1]/article[2]/id[1]
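
The difference between the two variants described above (leaf elements only, as with [not(*)], versus every element) can be sketched in Python by separating path generation from selection. This is an illustration under assumptions, not the answer's stylesheet: the input below assumes the first article contains only empty elements, and `walk`, `leaf_paths`, and `all_paths` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed input: the first article holds empty elements, the second holds text.
XML = """
<create>
  <article>
    <name/><description/>
    <price><amount/><currency/></price>
    <id/>
  </article>
  <article>
    <name>some name</name>
    <description>some description</description>
    <price><amount>00.01</amount><currency>USD</currency></price>
    <id>2</id>
  </article>
</create>
"""

def walk(elem, prefix="", index=1):
    """Yield (path, element) for elem and all its descendants, in document order."""
    path = f"{prefix}/{elem.tag}[{index}]"
    yield path, elem
    seen = Counter()  # number of same-named siblings visited so far
    for child in elem:
        seen[child.tag] += 1
        yield from walk(child, path, seen[child.tag])

root = ET.fromstring(XML)

# Comparable to *[not(*)]: keep only elements with no element children.
leaf_paths = [path for path, elem in walk(root) if len(elem) == 0]

# Comparable to *: keep every element.
all_paths = [path for path, _ in walk(root)]

print("\n".join(leaf_paths))
print()
print("\n".join(all_paths))
```

Because the filter is applied after the walk, switching between the two listings only changes the condition in the list comprehension, much like swapping the predicate on the match pattern.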
    

    Here's another version of the XSLT that is about 65% faster:

    
        
        
    
        
    
        
            
                
            
            
    
            
        
    
    
    
