Java web site meta data [closed]

自古美人都是妖i submitted on 2019-12-13 04:36:18

Question


Using Java, what is the best way to extract metadata from a website?

I am planning to request the entire page and then find where the metadata is located in it. This seems cumbersome. Is there a better way to achieve this?


Answer 1:


Cumbersome as it is, it's practically the only way, as far as I know.
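To make the "find the metadata in the page" step concrete, here is a minimal sketch that scans already-downloaded HTML for `<meta>` tags with regular expressions. The class name `MetaExtractor` and the method `extract` are hypothetical, and the patterns assume double-quoted attribute values with no `>` inside them. A real HTML parser would be more robust on messy pages.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects name/property -> content pairs from <meta> tags.
class MetaExtractor {
    // Assumption: a meta tag contains no '>' inside its attribute values.
    private static final Pattern META_TAG =
            Pattern.compile("<meta\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // Assumption: attribute values are wrapped in double quotes.
    private static final Pattern ATTRIBUTE =
            Pattern.compile("([A-Za-z-]+)\\s*=\\s*\"([^\"]*)\"");

    static Map<String, String> extract(String html) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher tag = META_TAG.matcher(html);
        while (tag.find()) {
            String name = null;
            String content = null;
            Matcher attr = ATTRIBUTE.matcher(tag.group());
            while (attr.find()) {
                String key = attr.group(1).toLowerCase(Locale.ROOT);
                if (key.equals("name") || key.equals("property")) {
                    name = attr.group(2);
                } else if (key.equals("content")) {
                    content = attr.group(2);
                }
            }
            // Tags without both a name and a content (e.g. charset) are skipped.
            if (name != null && content != null) {
                result.put(name, content);
            }
        }
        return result;
    }
}
```

Calling `MetaExtractor.extract(html)` on the page source returns the pairs in document order; a later tag with the same name overwrites the earlier value.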

What you can do is read only the first few bytes of the response, say 2000. This might save some time, but it doesn't guarantee that all meta tags will be read.
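A minimal sketch of the fixed-size read could look like the following. The class name `PrefixReader` and the method `readPrefix` are hypothetical, and decoding as UTF-8 is an assumption; a real page may declare a different charset.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads at most maxBytes from a stream.
class PrefixReader {
    static String readPrefix(InputStream in, int maxBytes) throws IOException {
        byte[] buffer = new byte[maxBytes];
        int total = 0;
        // A single read() may return fewer bytes than requested, so loop
        // until the buffer is full or the stream ends.
        while (total < maxBytes) {
            int n = in.read(buffer, total, maxBytes - total);
            if (n == -1) {
                break;
            }
            total += n;
        }
        // Assumption: the page is UTF-8. A multi-byte character cut off at
        // the limit decodes to a replacement character.
        return new String(buffer, 0, total, StandardCharsets.UTF_8);
    }
}
```

The stream would typically come from something like `new URL(address).openStream()`, wrapped in a try-with-resources block so the connection is closed as soon as the prefix has been read.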

Another way is to read in chunks and scan for the string </head>; if it hasn't appeared yet, continue reading. This can take longer than a fixed-size read for pages with a large <head> section, though.
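The chunked approach can be sketched as below. The class name `HeadReader`, the method `readHead`, and the chunk size parameter are hypothetical. The sketch re-checks a few characters from the previous chunk so that a `</head>` split across two chunks is still found, and it matches the tag case-insensitively.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper: reads until the closing head tag has been seen.
class HeadReader {
    private static final String END_TAG = "</head>";

    static String readHead(Reader reader, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        StringBuilder text = new StringBuilder();
        char[] chunk = new char[chunkSize];
        int n;
        while ((n = reader.read(chunk)) != -1) {
            // Start the search slightly before the new data in case the
            // end tag straddles the chunk boundary.
            int from = Math.max(0, text.length() - (END_TAG.length() - 1));
            text.append(chunk, 0, n);
            int index = indexOfIgnoreCase(text, END_TAG, from);
            if (index >= 0) {
                return text.substring(0, index + END_TAG.length());
            }
        }
        // No closing head tag found: return everything that was read.
        return text.toString();
    }

    private static int indexOfIgnoreCase(CharSequence text, String needle, int from) {
        for (int i = from; i + needle.length() <= text.length(); i++) {
            String candidate = text.subSequence(i, i + needle.length()).toString();
            if (candidate.equalsIgnoreCase(needle)) {
                return i;
            }
        }
        return -1;
    }
}
```

The reader would typically be an `InputStreamReader` over the HTTP response stream. Returning as soon as `</head>` appears means the body of the page is never read.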

Raw HTML shouldn't take long to process anyway.



Source: https://stackoverflow.com/questions/5468385/java-web-site-meta-data
