How to convert HTML to a 2D array
[愿得一人] 2020-12-10 09:10


Let's say I copy a complete HTML table (where every tr and td has extra attributes) into a String. How can I take all the contents (w

4 answers
  • 2020-12-10 09:54

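    Maybe String.split("<whateverhtmltabletag>") can help you? Note that split takes a regular expression, so a plain tag like "<br>" works as-is.

    The StringTokenizer class looks similar but is less suitable here: it treats every character of the delimiter string as a separate delimiter, so "<br>" would also split on the letters b and r. Example with split:

    String data = "one<br>two<br>three";
    for (String token : data.split("<br>")) {
       System.out.println(token);  // prints one, then two, then three
    }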
    

    Another option is to scan the string with indexOf("<tag"); there is an example here: http://forums.devshed.com/java-help-9/parse-html-table-into-2d-arrays-680614.html
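
    The indexOf idea can be sketched roughly as follows. This is only a minimal illustration, not the code from the linked thread: the class name IndexOfTableParser and the helper innerBlocks are made up for this example, and it assumes lower-case tags, no nested tables, and no '>' characters inside attribute values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IndexOfTableParser {

    // Hypothetical helper: returns the inner text of every <tag ...>...</tag>
    // block found in html, in document order. Attributes are skipped by
    // jumping to the first '>' after the opening tag.
    static List<String> innerBlocks(String html, String tag) {
        List<String> blocks = new ArrayList<>();
        String open = "<" + tag;
        String close = "</" + tag + ">";
        int pos = 0;
        while (true) {
            int start = html.indexOf(open, pos);
            if (start < 0) break;
            int contentStart = html.indexOf('>', start);
            if (contentStart < 0) break;
            contentStart++;
            int end = html.indexOf(close, contentStart);
            if (end < 0) break;
            blocks.add(html.substring(contentStart, end));
            pos = end + close.length();
        }
        return blocks;
    }

    // Builds one String[] per <tr>, with one entry per <td> in that row.
    static String[][] parse(String html) {
        List<String> rows = innerBlocks(html, "tr");
        String[][] result = new String[rows.size()][];
        for (int i = 0; i < rows.size(); i++) {
            List<String> cells = innerBlocks(rows.get(i), "td");
            result[i] = new String[cells.size()];
            for (int j = 0; j < cells.size(); j++) {
                result[i][j] = cells.get(j).trim();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<table><tr class=\"a\"><td id=\"x\">Td1</td><td class=\"bold\">Td2</td></tr>"
                    + "<tr><td>Td3</td><td>Td4</td></tr></table>";
        System.out.println(Arrays.deepToString(parse(html)));  // prints [[Td1, Td2], [Td3, Td4]]
    }
}
```

    This breaks as soon as the markup gets irregular (upper-case tags, nested tables, unclosed cells), which is the main reason a real parser is usually the safer choice.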

    You can also use an HTML parser (like jsoup) and then copy the contents of the table into an array. Here's an example in JavaScript: "JavaScript to parse HTML table of numbers into an array"

  • 2020-12-10 10:06

    Never mind, I found this code on the internet: HtmlTableParser

    It actually seems that now I have another problem, but it is not exactly related to this question, so I will open another one.

  • 2020-12-10 10:09

    This is how it could be done using jsoup (seriously, don't use regexp for HTML).

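    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    // html is the String holding the copied table markup
    Document doc = Jsoup.parse(html);
    Elements tables = doc.select("table");
    for (Element table : tables) {
        Elements trs = table.select("tr");
        String[][] trtd = new String[trs.size()][];
        for (int i = 0; i < trs.size(); i++) {
            // only <td> cells are collected; header cells (<th>) are not matched by this selector
            Elements tds = trs.get(i).select("td");
            trtd[i] = new String[tds.size()];
            for (int j = 0; j < tds.size(); j++) {
                trtd[i][j] = tds.get(j).text();  // text content only, tags and attributes are dropped
            }
        }
        // trtd now contains the desired array for this table
    }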
    

    Also, the class attribute value is not closed properly here in your example:

    <td class="bold>Td2</td>
    

    it should be

    <td class="bold">Td2</td>
    
  • 2020-12-10 10:09

    This is what I have so far. It is not the best solution, but I hope it helps. It uses only simple String operations.

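    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    // Assumes each <td ...>text</td> sits on its own line and every row ends with a line containing only </tr>
    public void read_data() {
        // try-with-resources closes the reader even if an exception is thrown
        try (BufferedReader bufferedReader = new BufferedReader(new FileReader("_result.xml"))) {
            String line;
            StringBuilder output = new StringBuilder();

            while ((line = bufferedReader.readLine()) != null) {
                String trimmed = line.trim();

                if (trimmed.startsWith("<td")) {
                    int a = trimmed.indexOf('>') + 1;  // first character after the opening tag
                    int b = trimmed.lastIndexOf('<');  // start of the closing tag
                    if (b >= a) {                       // skip lines that have no closing tag
                        output.append(trimmed, a, b).append("|");
                    }
                }

                if (trimmed.equals("</tr>")) {
                    System.out.println(output);  // one row, cells separated by |
                    output.setLength(0);
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }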
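
    Since the question asks for a 2D array rather than printed rows, the same line-by-line idea can collect the cells instead. The following is only a sketch under the same assumptions (one <td ...>text</td> per line, a line containing only </tr> at the end of each row); the class name LineTableReader and the method readTable are made up for this example, and it reads from any Reader so it can be tried without a file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LineTableReader {

    // Hypothetical helper: collects the text of each <td> line into the
    // current row and closes the row when a </tr> line is seen.
    static String[][] readTable(Reader source) throws IOException {
        List<String[]> rows = new ArrayList<>();
        List<String> cells = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.startsWith("<td")) {
                    int a = trimmed.indexOf('>') + 1;  // first character after the opening tag
                    int b = trimmed.lastIndexOf('<');  // start of the closing tag
                    if (b >= a) {                       // ignore lines without a closing tag
                        cells.add(trimmed.substring(a, b));
                    }
                } else if (trimmed.equals("</tr>")) {
                    rows.add(cells.toArray(new String[0]));
                    cells.clear();
                }
            }
        }
        return rows.toArray(new String[0][]);
    }

    public static void main(String[] args) throws IOException {
        String xml = "<table>\n<tr>\n  <td class=\"a\">Td1</td>\n  <td>Td2</td>\n</tr>\n"
                   + "<tr>\n  <td>Td3</td>\n</tr>\n</table>\n";
        String[][] table = readTable(new StringReader(xml));
        System.out.println(Arrays.deepToString(table));  // prints [[Td1, Td2], [Td3]]
    }
}
```

    To read from the file used above, pass new FileReader("_result.xml") instead of the StringReader.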
    