Getting the final destination of a JavaScript redirect on a website

悲哀的现实 2020-12-20 03:40

I parse a website with Python. The site uses a lot of redirects, and it performs them by calling JavaScript functions.

So when I just use urllib to parse the site, it doesn't

2 answers
  •  夕颜 (original poster)
    2020-12-20 04:11

    I looked into Selenium. If you are not running as a pure script (that is, without a display, where you can't start a "normal" browser), the solution is actually quite simple:

    import time

    from selenium import webdriver

    driver = webdriver.Firefox()
    link = "http://yourlink.com"
    driver.get(link)

    # Poll until the JavaScript redirect changes the browser's URL.
    # Note: this loops forever if the page never redirects.
    while link == driver.current_url:
        time.sleep(1)

    redirected_url = driver.current_url
    

    For my use case this is more than enough. Selenium can also interact with forms and send keystrokes to the website.
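The polling loop in the answer never returns if the page does not redirect. A minimal sketch of the same idea with a timeout is below. The names `wait_for_url_change` and `final_url_with_firefox` are hypothetical helpers introduced here for illustration, not part of Selenium, and the default timeout and polling interval are arbitrary assumptions.

```python
import time


def wait_for_url_change(get_url, start_url, timeout=10.0, poll=0.5):
    """Poll get_url() until it differs from start_url or the timeout elapses.

    Returns the new URL, or None if the URL did not change in time.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = get_url()
        if current != start_url:
            return current
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


def final_url_with_firefox(link, timeout=10.0):
    """Open link in Firefox and return the URL it redirects to, or None."""
    # Imported here so the helper above works without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get(link)
        return wait_for_url_change(lambda: driver.current_url, link, timeout)
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

Calling `final_url_with_firefox("http://yourlink.com")` would then return the redirected URL, or `None` if nothing changed within the timeout. One caveat: the browser may normalize the address (for example by adding a trailing slash), which this simple comparison would treat as a redirect. Selenium also provides `WebDriverWait` together with `expected_conditions.url_changes`, which may be a better fit than a hand-written loop.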
