Reading Text from an Instagram Profile

Submitted by 社会主义新天地 on 2020-07-23 06:35:07

Question


The question is how to read text from an Instagram profile if a user inputs an Instagram URL. I tried using java.net.URL and all I get is a big load of HTML text. I know little to nothing about working with web pages, so I am seeking some help with how I would get text from the profile (bio, post captions, comments).

Thanks!


Answer 1:


You can use a scraping tool (Scrapy, ParseHub, etc.). Just a heads up though: this is against Instagram's TOS, so be careful.




Answer 2:


Hello, you could treat the HTML as a string and split it on the tag that surrounds the text you want: once on the opening tag and once on the closing tag.

Take the second string in the list from the first split (everything after the opening tag), then split that on the closing tag and take the first string in the list.

You do need some knowledge of HTML to know what a tag is and to work out which tag you need to split on.
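The two splits described here could be sketched in plain Java roughly as follows. The class name `TagSplit` and the method `textBetween` are hypothetical names chosen for illustration, and the sketch assumes the tag appears as an exact literal (no attributes, no nesting), which real pages often violate.

```java
import java.util.Optional;
import java.util.regex.Pattern;

class TagSplit {
    // Hypothetical helper: returns the text between the first openTag and the
    // next closeTag, or an empty Optional if either tag is missing.
    static Optional<String> textBetween(String html, String openTag, String closeTag) {
        // First split: index 1 holds everything after the opening tag.
        // Pattern.quote makes split treat the tag as literal text, not a regex.
        String[] afterOpen = html.split(Pattern.quote(openTag), 2);
        if (afterOpen.length < 2) {
            return Optional.empty();
        }
        // Second split: index 0 holds everything before the closing tag.
        String[] beforeClose = afterOpen[1].split(Pattern.quote(closeTag), 2);
        if (beforeClose.length < 2) {
            return Optional.empty();
        }
        return Optional.of(beforeClose[0]);
    }

    public static void main(String[] args) {
        String html = "<html><body><h1>HelloWorld</h1></body></html>";
        System.out.println(textBetween(html, "<h1>", "</h1>").orElse("(not found)"));
    }
}
```

This only matches a tag written exactly as given, so `<h1 class="x">` would not be found; an HTML parser is generally more robust for anything beyond a quick experiment.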

Have fun, I hope I could help you!




Answer 3:


You can use jsoup (https://jsoup.org/) to extract a specific tag from HTML content.

Here is an example that extracts the h1 tag content from the body of the HTML.

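        // These imports go at the top of the file
        import org.jsoup.Jsoup;
        import org.jsoup.nodes.Document;

        // Parse an HTML string using the jsoup library
        String htmlString = "<!DOCTYPE html>"
                + "<html>"
                + "<head>"
                + "<title>JSoup Example</title>"
                + "</head>"
                + "<body>"
                + "<table><tr><td>"
                + "<h1>HelloWorld</h1></td></tr>"
                + "</table>"
                + "</body>"
                + "</html>";

        Document html = Jsoup.parse(htmlString);
        // Text of the <title> element: "JSoup Example"
        String title = html.title();
        // Combined text of every <h1> inside <body>: "HelloWorld"
        String h1 = html.body().getElementsByTag("h1").text();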

You can find a few more examples in this blog post: https://javarevisited.blogspot.com/2014/09/how-to-parse-html-file-in-java-jsoup-example.html

Hope this is helpful.



Source: https://stackoverflow.com/questions/62854893/reading-text-from-an-instagram-profile
