Web Scraping Videos

Posted by 只谈情不闲聊 on 2021-01-29 02:31:45

Question


I'm attempting to do a proof of concept by downloading a TV episode of Bob's Burgers at https://www.watchcartoononline.com/bobs-burgers-season-9-episode-3-tweentrepreneurs.

I cannot figure out how to extract the video URL from this website. Using the Chrome and Firefox developer tools, I found that the video sits in an iframe, but searching for iframes with BeautifulSoup and extracting their src URLs returns links that have nothing to do with the video. Where are the references to the mp4 or flv files, which I can see in Developer Tools (even though opening them directly is forbidden)?

Any insight into how to scrape videos with BeautifulSoup and requests would be appreciated.

Here is some code if needed. A lot of tutorials say to look for 'a' tags, but I didn't get any 'a' tags back.

import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.watchcartoononline.com/bobs-burgers-season-9-episode-5-live-and-let-fly")
soup = BeautifulSoup(r.content, 'html.parser')
links = soup.find_all('iframe')
for link in links:
    # .get() returns None instead of raising KeyError when an iframe has no src attribute
    print(link.get('src'))

Answer 1:


import requests

# Direct video URL taken from the page's <source> tag
url = "https://disk19.cizgifilmlerizle.com/cizgi/bobs.burgers.s09e03.mp4?st=_EEVz36ktZOv7ZxlTaXZfg&e=1541637622"

def download_file(url, filename):
    # stream=True downloads the body in chunks instead of loading the whole video into memory
    with requests.get(url, stream=True) as r:
        r.raise_for_status()  # stop on an HTTP error instead of saving the error page as a video
        with open(filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024):
                if chunk:  # filter out keep-alive chunks
                    f.write(chunk)
    return filename

download_file(url, "bobs.burgers.s09e03.mp4")

This code downloads this particular episode to your computer. The video URL is in the <source> tag, which is nested inside the <video> tag.
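
To show where such a URL lives in the markup, here is a minimal sketch that collects the src of every <source> tag nested inside a <video> tag from an HTML string. It uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra installs. VideoSourceParser, find_video_sources, the sample markup and the media.example.com address are all made up for illustration. On the real site the <video> tag presumably sits in the iframe's document rather than in the episode page, so that document is the HTML you would need to pass in.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class VideoSourceParser(HTMLParser):
    """Collect src attributes of <source> tags nested inside <video> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.video_depth = 0  # greater than 0 while inside a <video> element
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "video":
            self.video_depth += 1
        elif tag == "source" and self.video_depth > 0:
            src = dict(attrs).get("src")
            if src:
                # Resolve relative URLs against the URL the HTML came from
                self.sources.append(urljoin(self.base_url, src))

    def handle_endtag(self, tag):
        if tag == "video" and self.video_depth > 0:
            self.video_depth -= 1


def find_video_sources(html, base_url):
    """Return absolute URLs of all <video><source src=...> entries in html."""
    parser = VideoSourceParser(base_url)
    parser.feed(html)
    return parser.sources


# Hypothetical player markup standing in for the iframe's HTML
sample_html = """
<html><body>
  <video id="player" controls>
    <source src="/cizgi/example.mp4" type="video/mp4">
  </video>
  <source src="/not-inside-video.mp4">
</body></html>
"""

print(find_video_sources(sample_html, "https://media.example.com/embed/1"))
```

Each URL in the returned list can then be passed to download_file from the answer above. If the list comes back empty for a real page, the player is probably being built by JavaScript after the page loads, in which case the static HTML fetched by requests will not contain it.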



Source: https://stackoverflow.com/questions/53196594/web-scraping-videos
