Parsing/extracting an HTML table from a website in Java

*爱你&永不变心* submitted on 2019-12-03 16:32:08
Sachin Nambiar Nalavattanon

Here are the steps you would need to follow:

1) Use a Java library for HTML scraping, for example Jsoup (used in the example further down).

2) Use XPath Helper to try out XPath queries against the page until they select the cells you need.

Eg 1: Enter "//tr[1]//td[1]" in the query box; it returns all table cells at position (1,1), i.e. first row, first column.

Eg 2: "/html/body[@class='tt']/center/table[1]/tbody/tr[4]/td[3]/table/tbody/tr/td" returns all 15 values under Montag.

Eg 3: "/html/body[@class='tt']/center/table[1]/tbody/tr/td/table/tbody/tr/td" returns all 380 entries of the table.
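
Queries like the ones above can also be run from Java itself. The following is only a minimal sketch using the JDK's built-in javax.xml.xpath API. The class name XPathTableDemo, the helper select and the SAMPLE markup are made up for illustration and are not the real timetable page. Note the assumption that the input is well-formed XML: the JDK parser rejects most real-world HTML, so for a live page you would probably first need an HTML-tolerant parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathTableDemo {
    // Hypothetical, well-formed stand-in for a timetable page (not the real markup).
    static final String SAMPLE =
        "<html><body><table>"
        + "<tr><td>Montag</td><td>Dienstag</td></tr>"
        + "<tr><td>Mathe</td><td>Physik</td></tr>"
        + "</table></body></html>";

    /** Returns the trimmed text of every node matched by the XPath expression. */
    static List<String> select(String xml, String expression) {
        try {
            // Parse the markup into a DOM tree; this fails on input that is not well-formed XML.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Evaluate the expression and ask for a node set as the result.
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent().trim());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Cell at position (1,1), as in Eg 1.
        System.out.println(select(SAMPLE, "//tr[1]//td[1]"));
        // Every cell of the table, row by row.
        System.out.println(select(SAMPLE, "//tr/td"));
    }
}
```

The sample markup has no tbody element because it is parsed as plain XML. Browsers insert tbody when they build the DOM, which is why the longer queries above contain it, so a path copied from the browser may need adjusting for a different parser.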

OR

Example using Jsoup

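import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.IOException;

public class Main {
    public static void main(String[] args) throws IOException {
        // Fetch and parse the timetable page.
        Document doc = Jsoup.connect("http://www.kantschule-falkensee.de/uploads/dmiadgspahw/klassen/A_Klasse_11.htm").get();
        // Select every <tr> in the document, including rows of nested tables.
        Elements rows = doc.select("tr");
        for (Element row : rows) {
            // Select every <td> below this row (cells of nested tables are matched too).
            Elements columns = row.select("td");
            for (Element column : columns) {
                // Tab-separate the cell texts so adjacent values do not run together.
                System.out.print(column.text() + "\t");
            }
            // End the output line for this row.
            System.out.println();
        }
    }
}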