Extracting data between two tags in HTML file

遥遥无期 2020-12-04 00:43

I've got a HUUUGE HTML file here saved on my system, which contains data from a product catalogue. The data is structured such that for each product record the name is bet

2 Answers
  •  無奈伤痛
    2020-12-04 01:30

    There are two ways of solving this sort of problem: string manipulation with regexes (as suggested by gnovice) or parsing the file (or a mix of the two). Parsing is often best if your file is very well structured; regexes win for messy files.
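    For comparison, the regex route might look roughly like the sketch below. The tag names (<name>, <id>) and the sample string are made up for illustration, since the question doesn't show the real markup; for a real file you would load the text with fileread first.

```matlab
% Hypothetical sample data: the <name> and <id> tags are assumptions,
% not taken from the actual catalogue file.
% For a real file: html = fileread('catalogue.html');
html = ['<tr><td><name>hat</name></td><td><id>1829493</id></td></tr>' ...
        '<tr><td><name>dress</name></td><td><id>18</id></td></tr>'];

% The lazy quantifier (.*?) stops at the first closing tag, so each
% match captures the text of exactly one <name> element.
tokens = regexp(html, '<name>(.*?)</name>', 'tokens');

% regexp returns a cell array of 1x1 cells; flatten it into a plain
% cell array of character vectors.
names = cellfun(@(t) t{1}, tokens, 'UniformOutput', false);
```

    This only works as long as the tags never nest and never contain attributes; once the markup gets more regular and more deeply structured, a parser is usually the safer choice.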

    Here's the parsing solution.

    Start by downloading xml_io_tools (from the MATLAB File Exchange) and calling xml_read on your file. Your example isn't completely reproducible, so here are two different versions of the data.

    Save this to test1.xml:

    
    
    'hat'
    '1829493'
    'cyan'
    'dress'
    '18'
    'dark purple'
    
    

    Save this to test2.xml:

    
    
    
    'hat'
    '1829493'
    'cyan'
    
    
    'dress'
    '18'
    'dark purple'
    
    
    

    Now compare

    x1 = xml_read('test1.xml')
    x2 = xml_read('test2.xml')
    
