How to print the href attributes using BeautifulSoup while automating through Selenium?

Submitted by 一曲冷凌霜 on 2019-12-02 08:33:31

If you want to find all the links without extracting them manually in BeautifulSoup, go for requests-html.

Sample code to grab all links:

from requests_html import HTMLSession
from bs4 import BeautifulSoup

url = 'https://society6.com/discover'
session = HTMLSession(mock_browser=True)
r = session.get(url, headers={'User-Agent': 'Mozilla/5.0'})

# All links found on the page, as written in the HTML
print(r.html.links)
# The same links resolved to absolute URLs
print(r.html.absolute_links)

# Only the author links, selected with BeautifulSoup
soup = BeautifulSoup(r.html.raw_html, 'html.parser')
a_tags = soup.find_all("a", attrs={"class": "author track"})
for a_tag in a_tags:
    print(a_tag['href'])

Alternatively, using only requests and BeautifulSoup:

import requests
from bs4 import BeautifulSoup

data = requests.get('https://society6.com/discover')
soup_data = BeautifulSoup(data.content, "lxml")

# The hrefs are site-relative, so prefix the domain to get full URLs
for a in soup_data.find_all('a',{'class':'author track'}):
    print('https://society6.com'+a['href'])
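
The requests snippet builds full URLs by prepending the domain to each href. A minimal sketch of a more general approach, assuming the hrefs may be a mix of relative and absolute values, is to resolve them with `urljoin` from the standard library. The `absolutize` helper and the sample hrefs below are hypothetical, for illustration only.

```python
from urllib.parse import urljoin

# Page the hrefs were scraped from (taken from the snippets above).
BASE_URL = "https://society6.com/discover"


def absolutize(hrefs, base_url=BASE_URL):
    """Resolve each href against base_url; absolute URLs pass through unchanged."""
    return [urljoin(base_url, href) for href in hrefs]


# Hypothetical href values of the kind a['href'] might return.
sample_hrefs = ["/cafelab", "https://society6.com/beeple", "83oranges"]
for link in absolutize(sample_hrefs):
    print(link)
```

Unlike plain string concatenation, this does not double the domain when an href is already absolute.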

To print the href of the desired elements, you can also use Selenium on its own, without BeautifulSoup:

  • Code Block:

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
    options = Options()
    options.add_argument("start-maximized")
    options.add_argument("disable-infobars")
    options.add_argument("--disable-extensions")
    options.add_argument("--disable-gpu")
    options.add_argument("--no-sandbox")
    # Selenium 4 API: the driver path is passed through Service, and options= replaces chrome_options=
    service = Service(executable_path=r'C:\WebDrivers\ChromeDriver\chromedriver_win32\chromedriver.exe')
    driver = webdriver.Chrome(service=service, options=options)
    driver.get("https://society6.com/login?done=/")
    # Log in (replace the placeholders with your own credentials)
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input#email"))).send_keys("your_email@example.com")
    driver.find_element(By.CSS_SELECTOR, "input#password").send_keys("your_password")
    driver.find_element(By.CSS_SELECTOR, "button[name='login']").click()
    # Open the user menu, then go to the Discover page
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a#nav-user-my-society>span"))).click()
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.LINK_TEXT, "Discover"))).click()
    # Wait until all author links are visible, then print each href
    hrefs_elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a.author.track")))
    for element in hrefs_elements:
        print(element.get_attribute("href"))
    driver.quit()
    
  • Console Output:

    https://society6.com/pivivikstrm
    https://society6.com/cafelab
    https://society6.com/cafelab
    https://society6.com/colorandcolor
    https://society6.com/83oranges
    https://society6.com/aftrdrk
    https://society6.com/alaskanmommabear
    https://society6.com/thindesign
    https://society6.com/colorandcolor
    https://society6.com/aftrdrk
    https://society6.com/aljahorvat
    https://society6.com/bribuckley
    https://society6.com/hennkim
    https://society6.com/franciscomffonseca
    https://society6.com/83oranges
    https://society6.com/nadja1
    https://society6.com/beeple
    https://society6.com/absentisdesigns
    https://society6.com/alexandratarasoff
    https://society6.com/artdekay880
    https://society6.com/annaki
    https://society6.com/cafelab
    https://society6.com/bribuckley
    https://society6.com/bitart
    https://society6.com/draw4you
    https://society6.com/cafelab
    https://society6.com/beeple
    https://society6.com/burcukorkmazyurek
    https://society6.com/absentisdesigns
    https://society6.com/deanng
    https://society6.com/beautifulhomes
    https://society6.com/aftrdrk
    https://society6.com/printsproject
    https://society6.com/bluelela
    https://society6.com/anipani
    https://society6.com/ecmazur
    https://society6.com/batkei
    https://society6.com/menchulica
    https://society6.com/83oranges
    https://society6.com/7115
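
The console output above lists several authors more than once (for example, cafelab and 83oranges). If only distinct profile URLs are needed, a minimal sketch, assuming the hrefs have already been collected into a list, is to drop repeats while keeping the original order. The `unique_in_order` helper is a hypothetical name and not part of Selenium or BeautifulSoup.

```python
def unique_in_order(urls):
    """Drop repeated URLs, keeping the first occurrence of each."""
    # dict preserves insertion order, so the keys are the distinct URLs in order.
    return list(dict.fromkeys(urls))


# First few lines of the console output above, used as sample data.
hrefs = [
    "https://society6.com/pivivikstrm",
    "https://society6.com/cafelab",
    "https://society6.com/cafelab",
    "https://society6.com/colorandcolor",
    "https://society6.com/83oranges",
]
for url in unique_in_order(hrefs):
    print(url)
```

With the Selenium code, the input list could be built as `[element.get_attribute("href") for element in hrefs_elements]`.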
    