Getting the final destination of a JavaScript redirect on a website

Submitted by 霸气de小男生 on 2019-12-18 07:10:18

Question


I am parsing a website with Python. The site uses a lot of redirects, and it performs them by calling JavaScript functions.

So when I just use urllib to fetch the pages, it doesn't help me, because the destination URL does not appear in the returned HTML.

Is there a way to access the DOM and call the correct JavaScript function from my Python code?

All I need is the URL that the redirect takes me to.


Answer 1:


I looked into Selenium. If you are not running a pure script (that is, you have a display and can start a "normal" browser), the solution is actually quite simple:

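import time

from selenium import webdriver

driver = webdriver.Firefox()
link = "http://yourlink.com"
driver.get(link)

# Wait for the JavaScript redirect: poll until the browser leaves the original URL.
# The browser may normalize the URL (for example by appending a trailing slash),
# so trailing slashes are ignored in the comparison.
while driver.current_url.rstrip("/") == link.rstrip("/"):
    time.sleep(1)

redirected_url = driver.current_url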

For my use case this is more than enough. Selenium can also interact with forms and send keystrokes to the website.
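One weakness of a bare polling loop is that it never ends if the page does not redirect at all. A minimal sketch of the same idea with a timeout is shown here. The helper name `wait_for_url_change` and its parameters are hypothetical (they are not part of Selenium); it accepts any callable that returns the current URL, so it can be tried without a browser.

```python
import time


def wait_for_url_change(get_current_url, original_url, timeout=30.0, poll_interval=0.5):
    """Poll until the URL differs from original_url.

    Returns the new URL, or None if nothing changed within `timeout` seconds.
    Trailing slashes are ignored, since a browser may add one on its own.
    """
    start = original_url.rstrip("/")
    deadline = time.monotonic() + timeout
    while True:
        current = get_current_url()
        if current.rstrip("/") != start:
            return current
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)
```

With the Selenium driver from the answer, it would presumably be called as `wait_for_url_change(lambda: driver.current_url, link)`, and a return value of `None` would mean that no redirect was observed in time.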




Answer 2:


It doesn't sound like fun to me, but every JavaScript function is also an object, so you can read the function's source rather than call it, and perhaps the URL is in it. Otherwise, that function may call another one, which you would then have to recurse into... Again, it doesn't sound like fun, but it might be doable.
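For the simplest case, where the target URL appears as a string literal in the page's script, reading the source can be approximated with a regular expression over the HTML fetched with urllib. This is only a sketch under that assumption: the function name `find_js_redirect` and the patterns are illustrative, they cover a few common idioms (`location = "..."`, `location.href = "..."`, `location.replace("...")`, `location.assign("...")`), and they will miss any URL that the script builds dynamically.

```python
import re
from urllib.parse import urljoin

# Common redirect idioms with a literal URL (an illustrative, incomplete list).
REDIRECT_PATTERNS = [
    # window.location = "...", location.href = '...', document.location = "..."
    re.compile(r"""\b(?:window\.|document\.|self\.|top\.)?location(?:\.href)?\s*=\s*["']([^"']+)["']"""),
    # location.replace("...") or location.assign("...")
    re.compile(r"""\blocation\.(?:replace|assign)\(\s*["']([^"']+)["']\s*\)"""),
]


def find_js_redirect(html, base_url):
    """Return the first literal redirect target found in html, or None.

    Relative targets are resolved against base_url.
    """
    for pattern in REDIRECT_PATTERNS:
        match = pattern.search(html)
        if match:
            return urljoin(base_url, match.group(1))
    return None
```

If the pattern finds nothing, the redirect is probably computed at run time, and driving a real browser is likely the more practical route.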



Source: https://stackoverflow.com/questions/8053295/getting-the-final-destination-of-a-javascript-redirect-on-a-website
