scrape YouTube video from a specific channel and search?

痞子三分冷, submitted on 2020-06-27 12:10:34

Question


I am using this code to get the video URLs from a YouTube channel, and it works fine. I would like to add an option to search the channel for a video with a specific title and get the URL of the first video that matches the search phrase.

from bs4 import BeautifulSoup
import requests

url="https://www.youtube.com/feeds/videos.xml?user=LinusTechTips"
html = requests.get(url)
soup = BeautifulSoup(html.text, "lxml")

for entry in soup.find_all("entry"):
    for link in entry.find_all("link"):
        print(link["href"])


Answer 1:


My previous answer collects all the video titles in the given YouTube channel, which is what you were looking for. But in our comment exchange you told me you want to run the script via a cron job. That takes a bit more effort, so I am adding another answer.

from bs4 import BeautifulSoup
from lxml import etree
import urllib.request  # import the submodule explicitly so urllib.request is always available
import requests
import sys

def fetch_titles(url):
    """Return a list of {"title": ..., "url": ...} dicts for the videos in the feed."""
    video_titles = []
    html = requests.get(url)
    soup = BeautifulSoup(html.text, "lxml")
    for entry in soup.find_all("entry"):
        for link in entry.find_all("link"):
            # Fetch each video page and read its title from the page markup.
            youtube = etree.HTML(urllib.request.urlopen(link["href"]).read())
            video_title = youtube.xpath("//span[@id='eow-title']/@title")
            if len(video_title) > 0:
                video_titles.append({"title": video_title[0], "url": link.attrs["href"]})
    return video_titles

def main():
    # The keyword is required as the first command-line argument.
    if len(sys.argv) == 1:
        print("Error: You should specify a keyword")
        print("eg: python3 ./main.py KEYWORD")
        return

    url="https://www.youtube.com/feeds/videos.xml?user=LinusTechTips"
    keyword = sys.argv[1]

    video_titles = fetch_titles(url)
    for video in video_titles:
        if keyword in video["title"]:
            print(video["url"])
            break # add this line, if you want to print the first match only


if __name__ == "__main__":
    main()

When you call the script from the terminal, specify the keyword like this:

$ python3 ./main.py Mac

Here, Mac is the keyword and main.py is the Python script filename.

Output:

https://www.youtube.com/watch?v=l_IHSRPVqwQ
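
As a possible simplification, the entries in an Atom feed normally carry their own title element, so the search might be done on the feed alone without opening every video page. The sketch below assumes that layout and runs against a small hypothetical sample (SAMPLE_FEED and its EXAMPLE_ID URLs are made up for illustration, and first_match is a name I chose, not part of any library). It also ignores case, which the script above does not.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom feeds; entries and titles live under it.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical, trimmed-down sample of such a feed (placeholder URLs).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>The Tiniest Gaming Laptop!</title>
    <link rel="alternate" href="https://www.youtube.com/watch?v=EXAMPLE_ID_1"/>
  </entry>
  <entry>
    <title>The $32,000 Mac Pro Killer</title>
    <link rel="alternate" href="https://www.youtube.com/watch?v=EXAMPLE_ID_2"/>
  </entry>
</feed>"""


def first_match(feed_xml, keyword):
    """Return the URL of the first entry whose title contains keyword, else None."""
    root = ET.fromstring(feed_xml)
    for entry in root.findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        link = entry.find("atom:link", ATOM)
        # Case-insensitive substring match on the entry title.
        if link is not None and keyword.lower() in title.lower():
            return link.get("href")
    return None


print(first_match(SAMPLE_FEED, "mac"))
```

To try it on the real feed, the bytes from requests.get(url).content could be passed in place of SAMPLE_FEED, assuming the live feed follows the same structure.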




Answer 2:


This is a good way to do it, but you'll have a lot more leverage using a tool like youtube-dl. Try something like youtube-dl "ytsearchall:intitle:'hello world'" --dump-json --flat-playlist. youtube-dl has a ton of functionality and will probably meet all of your video scraping needs with little or no modification.

As for implementing your own search, the basics are pretty straightforward, but they may not give you the experience you're looking for. You would collect the titles, probably into a dict whose values hold the URLs, and then iterate over the keys searching for the text. Exact keyword matching in this fashion isn't hard, but it may not behave the way you expect, because most search engines use many criteria to decide what you're looking for.
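
A minimal sketch of that dict-based approach might look like the following. The sample titles are taken from this thread, the URLs are placeholders, and build_index and search_titles are hypothetical helper names. It only does case-insensitive substring matching, so it will not rank or fuzzy-match results the way a search engine would.

```python
def build_index(videos):
    """Map each title to its URL; videos is an iterable of (title, url) pairs."""
    return {title: url for title, url in videos}


def search_titles(index, phrase):
    """Return URLs whose title contains phrase, ignoring case, in insertion order."""
    needle = phrase.casefold()
    return [url for title, url in index.items() if needle in title.casefold()]


# Hypothetical sample data; the URLs are placeholders.
videos = [
    ("The $32,000 Mac Pro Killer", "https://www.youtube.com/watch?v=EXAMPLE_ID_1"),
    ("We Edited This Video on an iPad Pro!", "https://www.youtube.com/watch?v=EXAMPLE_ID_2"),
    ("Will More RAM Make your PC Faster?? (2020)", "https://www.youtube.com/watch?v=EXAMPLE_ID_3"),
]

index = build_index(videos)
matches = search_titles(index, "pro")
print(matches[0] if matches else "no match")
```

Taking matches[0] gives the first hit only, which is what the question asks for. Note that a dict keyed by title keeps just one URL if two videos share the same title.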




Answer 3:


Do it like this, friend:

from bs4 import BeautifulSoup
from lxml import etree
import urllib.request  # import the submodule explicitly so urllib.request is always available
import requests

url="https://www.youtube.com/feeds/videos.xml?user=LinusTechTips"
html = requests.get(url)
soup = BeautifulSoup(html.text, "lxml")

video_titles = []

print("Caching Video Titles...")
for entry in soup.find_all("entry"):
    for link in entry.find_all("link"):
        # Fetch each video page and read its title from the page markup.
        youtube = etree.HTML(urllib.request.urlopen(link["href"]).read())
        video_title = youtube.xpath("//span[@id='eow-title']/@title")
        if len(video_title) > 0:
            video_titles.append({"title": video_title[0], "url": link.attrs["href"]})
            print(f"{len(video_titles)}: {video_title[0]}")

print("Caching Video Titles Done")


keyword = input("Enter the keyword you wanna search: ")
for video in video_titles:
    # Print the URL of every video whose title contains the keyword.
    if keyword in video["title"]:
        print(video["url"])

Output:

Caching Video Titles...
1: The $32,000 Mac Pro Killer
2: Sony PlayStation - by Alienware - WAN Show June 12, 2020
3: Experimental 120FPS Game Streaming!
4: We Edited This Video on an iPad Pro!
5: The Tiniest Gaming Laptop!
6: I spent two days in my attic to avoid a camera subscription!
7: Stolen iPhones Rat Out New "Owners" - WAN Show June 5, 2020
8: We got the GPU AMD wouldn’t sell…
9: Will More RAM Make your PC Faster?? (2020)
Caching Video Titles Done
Enter the keyword you wanna search: Mac
https://www.youtube.com/watch?v=l_IHSRPVqwQ


Source: https://stackoverflow.com/questions/62381342/scrape-youtube-video-from-a-specific-channel-and-search
