Read a line from xml file using C++

傲寒 2020-12-17 06:57

My XML File has:

<Package> xmlMetadata </Package>

I am searching for a tag in this file and the text between the startin

3 answers
  • 2020-12-17 07:39

    A single line of tags in a file can hardly be described as XML. Anyway, if you really want to parse an XML file, it is much easier to do with a parser library like RapidXML. This page is an excellent resource.

    The code below is my attempt to read the following XML (a complete XML document needs a single root element and should start with an XML declaration):

    File: demo.xml

    <?xml version="1.0" encoding="utf-8"?>
    <rootnode version="1.0" type="example">
        <Package> xmlMetadata </Package>
    </rootnode>
    

    A quick note: RapidXML is a header-only library. On my system I unzipped it to /usr/include/rapidxml-1.13, so the code below can be compiled with:

    g++ read_tag.cpp -o read_tag -I/usr/include/rapidxml-1.13/

    File: read_tag.cpp

    #include <iostream>
    #include <string>
    #include <vector>
    #include <fstream>
    #include <rapidxml.hpp>
    
    using namespace std;
    using namespace rapidxml;
    
    
    int main()
    {
        string input_xml;
        string line;
        ifstream in("demo.xml");
    
        // read file into input_xml
        while(getline(in,line))
            input_xml += line;
    
        // make a safe-to-modify, null-terminated copy of input_xml
        // (RapidXML parses in place and modifies the buffer it is given,
        // so never hand it the contents of an std::string directly)
        vector<char> xml_copy(input_xml.begin(), input_xml.end());
        xml_copy.push_back('\0');
    
        // only use xml_copy from here on!
        xml_document<> doc;
        // we are choosing to parse the XML declaration
        // parse_no_data_nodes prevents RapidXML from using the somewhat surprising
        // behavior of having both values and data nodes, and having data nodes take
        // precedence over values when printing
        // >>> note that this will skip parsing of CDATA nodes <<<
        doc.parse<parse_declaration_node | parse_no_data_nodes>(&xml_copy[0]);
    
        // alternatively, use one of the two commented lines below to parse CDATA nodes,
        // but please note the above caveat about surprising interactions between
        // values and data nodes (also read http://www.ffuts.org/blog/a-rapidxml-gotcha/)
        // if you use one of these two declarations try to use data nodes exclusively and
        // avoid using value()
        //doc.parse<parse_declaration_node>(&xml_copy[0]); // just get the XML declaration
        //doc.parse<parse_full>(&xml_copy[0]); // parses everything (slowest)
    
        // since we have parsed the XML declaration, it is the first node
        // (otherwise the first node would be our root node)
        string encoding = doc.first_node()->first_attribute("encoding")->value();
        // encoding == "utf-8"
    
        // we didn't keep track of our previous traversal, so let's start again
        // we can match nodes by name, skipping the xml declaration entirely
        xml_node<>* cur_node = doc.first_node("rootnode");
        string rootnode_type = cur_node->first_attribute("type")->value();
        // rootnode_type == "example"
    
        // go straight to the first Package node
        cur_node = cur_node->first_node("Package");
        if (!cur_node) // first_node() returns 0 when no such node exists
        {
            cerr << "no Package node found" << endl;
            return 1;
        }
        string content = cur_node->value();
    
        cout << content << endl;
    }
    

    Outputs:

    xmlMetadata

  • 2020-12-17 07:43

    getline doesn't search for a line; it simply reads each line into the variable "line". You then have to search within that line for the text you want.

       size_t found = line.find("Package");
       if (found != std::string::npos) {
           cout << line;
       }
    

    BUT this is a bad way to handle XML - nothing stops the XML writer from breaking a tag across multiple lines. Unless this is a one-off and you create the file yourself, you really should use a general XML parser to read the file and give you a list of tags.

    There are a bunch of very easy to use XML parsers, such as TinyXML.

    EDIT (different XML now posted) - that's the problem with using regex to parse XML: you don't know how the XML will break lines. You can keep adding more and more layers of complexity until you have written your own XML parser - just use one of the parsers listed in "What is the best open XML parser for C++?"

  • 2020-12-17 07:46

    This is not the way you should parse an XML file, but since you don't want to use a parser library this code might get you started.

    File: demo.xml

    <?xml version="1.0"?>
    <fileStructure>
    <Main_Package>
       File_Navigate
    </Main_Package>
    <Dependency_Details>
    
    <Dependency>
       <Package>
          xmlMetadata
       </Package>
       <Header>
          xmlMetadata.h
       </Header>
       <Header_path>
          C:\Dependency\xmlMetadata\xmlMetadata.h
       </Header_path>
       <Implementation>
          xmlMetadata.cpp
       </Implementation>
       <Implementation_path>
          C:\Dependency\xmlMetadata\xmlMetadata.cpp
       </Implementation_path>
    </Dependency>
    
    <Dependency>
       <Package>
          xmlMetadata1
       </Package>
       <Header>
          xmlMetadata1.h
       </Header>
       <Header_path>
          C:\Dependency\xmlMetadata\xmlMetadata1.h
       </Header_path>
       <Implementation>
          xmlMetadata1.cpp
       </Implementation>
       <Implementation_path>
          C:\Dependency\xmlMetadata\xmlMetadata1.cpp
       </Implementation_path>
    </Dependency>
    
    </Dependency_Details>
    </fileStructure>
    

    The basic idea of the code: while reading each line of the file, strip the leading whitespace and store the stripped string in tmp, then try to match it against the tags you are looking for. Once you find the opening tag, keep printing the following lines until the closing tag is found.

    File: parse.cpp

    #include <iostream>
    #include <string>
    #include <fstream>
    
    using namespace std;
    
    int main()
    {
        string line;
        ifstream in("demo.xml");
    
        bool begin_tag = false;
        while (getline(in,line))
        {
            // strip leading whitespace (spaces and tabs)
            const size_t first = line.find_first_not_of(" \t");
            const string tmp = (first == string::npos) ? string() : line.substr(first);
    
            //cout << "-->" << tmp << "<--" << endl;
    
            if (tmp == "<Package>")
            {
                //cout << "Found <Package>" << endl;
                begin_tag = true;
                continue;
            }
            else if (tmp == "</Package>")
            {
                begin_tag = false;
                //cout << "Found </Package>" << endl;
            }
    
            if (begin_tag)
            {
                cout << tmp << endl;
            }
        }
    }
    

    Outputs:

    xmlMetadata
    xmlMetadata1
    