lxml etree xmlparser remove unwanted namespace

感动是毒 2020-11-29 00:04

I have an XML document that I am trying to parse using lxml.etree.

4 answers

南方客 (original poster) 2020-11-29 00:08

import io
import lxml.etree as ET

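# Only Body and the namespace URI are confirmed by the calls further down;
# the other element names are placeholders for markup lost from this post.
content = b'''\
<Envelope xmlns="http://www.example.com/zzz/yyy">
  <Header>
    <Version>1</Version>
  </Header>
  <Body>
    some stuff
  </Body>
</Envelope>
'''
# io.BytesIO needs bytes, hence the b prefix on the literal above
dom = ET.parse(io.BytesIO(content))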


You can find namespace-aware nodes using the xpath method:

body = dom.xpath('//ns:Body', namespaces={'ns': 'http://www.example.com/zzz/yyy'})
print(body)
# [<Element {http://www.example.com/zzz/yyy}Body at 0x...>]
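
As a complement to the xpath call above, here is a minimal sketch of two other namespace-aware lookups that lxml supports through `find`: Clark notation (`{uri}name`) and a prefix mapping. The `Root` element and the variable names are assumptions made for illustration; only `Body` and the namespace URI come from the answer.

```python
import io
import lxml.etree as ET

NS = "http://www.example.com/zzz/yyy"

# Hypothetical document: only Body and the namespace URI come from the answer.
xml = b'<Root xmlns="http://www.example.com/zzz/yyy"><Body>some stuff</Body></Root>'
dom = ET.parse(io.BytesIO(xml))

# Clark notation: the namespace URI in braces, followed by the local name
body_clark = dom.find("{%s}Body" % NS)

# Prefix mapping: the same dictionary style that xpath() accepts
body_prefixed = dom.find("ns:Body", namespaces={"ns": NS})

print(body_clark.text)
# some stuff
print(body_prefixed.tag)
# {http://www.example.com/zzz/yyy}Body
```

Either form leaves the document untouched, which is usually preferable when the namespace carries meaning.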


If you really want to remove namespaces, you could use an XSL transformation:

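# http://wiki.tei-c.org/index.php/Remove-Namespaces.xsl
xslt = b'''
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="no"/>

<xsl:template match="/|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
</xsl:template>

<xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
</xsl:template>

<xsl:template match="@*">
    <xsl:attribute name="{local-name()}">
      <xsl:value-of select="."/>
    </xsl:attribute>
</xsl:template>
</xsl:stylesheet>
'''

# Compile the stylesheet, then apply it to the parsed document
xslt_doc = ET.parse(io.BytesIO(xslt))
transform = ET.XSLT(xslt_doc)
dom = transform(dom)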


Here we see the namespace has been removed:

print(ET.tostring(dom).decode())
# <Envelope>
#   <Header>
#     <Version>1</Version>
#   </Header>
#   <Body>
#     some stuff
#   </Body>
# </Envelope>


So you can now find the Body node this way:

print(dom.find("Body"))
# <Element Body at 0x...>
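
If pulling in XSLT feels heavy, a similar result can probably be had in plain Python. This is a different technique from the stylesheet above, not the answer's method: a sketch that rewrites each tag and attribute name to its local name and then drops the unused namespace declarations. The helper name `strip_namespaces` and the `Root` element are hypothetical.

```python
import io
import lxml.etree as ET


def strip_namespaces(tree):
    """Rewrite every element and attribute name to its local name, in place."""
    for el in tree.iter():
        if not isinstance(el.tag, str):
            continue  # comments and processing instructions have no tag name
        el.tag = ET.QName(el).localname
        # Copy the items first so the attribute map can be modified safely
        for name, value in list(el.attrib.items()):
            local = ET.QName(name).localname
            if local != name:
                del el.attrib[name]
                el.attrib[local] = value
    # Remove namespace declarations that no element or attribute uses anymore
    ET.cleanup_namespaces(tree)
    return tree


# Hypothetical document: only Body and the namespace URI come from the answer.
xml = b'<Root xmlns="http://www.example.com/zzz/yyy"><Body>some stuff</Body></Root>'
tree = strip_namespaces(ET.parse(io.BytesIO(xml)))
print(tree.find("Body").text)
# some stuff
```

Note that this approach, like the XSLT one, can produce name clashes if two namespaces use the same local name.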

    
             
                                                        
            
            
              
                