Finding Ads on a web page

Submitted by [亡魂溺海] on 2019-12-01 05:33:58

Question


I'm writing an application that tries to determine whether there are ads on a page. It currently drives a browser through Selenium WebDriver from Python.

I figured that a good number of ads live inside iframes, so I wrote a loop that looks inside each frame:

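from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
browser.get("http://cnn.com")

# Collect every iframe in the top-level document
all_iframes = browser.find_elements(By.TAG_NAME, "iframe")

for iframe in all_iframes:
    # Enter the frame, dump its HTML, then return to the top-level document
    browser.switch_to.frame(iframe)
    print(browser.page_source)
    browser.switch_to.default_content()

browser.quit()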

I'm wondering whether there are any tags or tag attributes that appear consistently across multiple pages and that I can use to determine whether a page contains ads (both inside and outside of iframes). Would I have to look for occurrences of names like doubleclick, adtech, or adblade inside each frame?

Or would I have to generate different rules for checking on a per-page basis?

Anyone in the know about how ads are displayed on pages? Thanks.


Answer 1:


You can search for the hostnames of known ad servers. A list is available here:

http://pgl.yoyo.org/as/serverlist.php?hostformat=adblockplus
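
As a rough sketch of that idea, the snippet below matches the src URLs found in a chunk of HTML against a set of ad-server hostnames. It assumes the list uses simple Adblock Plus style lines of the form ||host^ and ignores every other line; that format is an assumption, so check it against the file you actually download. The function and class names (parse_adblock_hosts, is_ad_url, SrcCollector, find_ad_urls) are made up for illustration and are not part of Selenium or Adblock Plus.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


def parse_adblock_hosts(filter_text):
    """Extract hostnames from simple '||host^' lines; skip everything else."""
    hosts = set()
    for line in filter_text.splitlines():
        line = line.strip()
        if line.startswith("||") and line.endswith("^"):
            hosts.add(line[2:-1].lower())
    return hosts


def is_ad_url(url, ad_hosts):
    """True if the URL's host, or any parent domain of it, is in ad_hosts."""
    host = (urlparse(url).hostname or "").lower()
    if not host:
        # Relative URLs have no host, so they cannot match an ad server
        return False
    parts = host.split(".")
    # Check "a.b.example.com", then "b.example.com", then "example.com", ...
    return any(".".join(parts[i:]) in ad_hosts for i in range(len(parts)))


class SrcCollector(HTMLParser):
    """Collect the src attribute of iframe, script, and img tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag in ("iframe", "script", "img"):
            src = dict(attrs).get("src")
            if src:
                self.urls.append(src)


def find_ad_urls(html, ad_hosts):
    """Return the src URLs in html whose host is a known ad server."""
    collector = SrcCollector()
    collector.feed(html)
    return [u for u in collector.urls if is_ad_url(u, ad_hosts)]
```

With Selenium, the html argument could be browser.page_source for the top-level document and again for each frame after switching into it. Matching on hostnames only catches ads served from listed domains, so it is a starting heuristic rather than a complete detector.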

It would also help to look at how other projects handle the same task, for example Adblock Plus:

http://adblockplus.org/en/source



Source: https://stackoverflow.com/questions/13423219/finding-ads-on-a-web-page
