How do I print href attributes using BeautifulSoup while automating through Selenium?

The element highlighted in blue is the one I want to access for web scraping.

Specifically, I want to extract the href value of that highlighted element from this HTML.

I tried a few ways of printing the link, but none of them worked.

My code is below:

    discover_page = BeautifulSoup(r.text, 'html.parser')

    finding_accounts = discover_page.find_all("a", class_="author track")
    print(len(finding_accounts))

    finding_accounts = discover_page.find_all('a[class="author track"]')
    print(len(finding_accounts))

    accounts = discover_page.select('a', {'class': 'author track'})['href']
    print(len(accounts))

Output:

    0
    0
    TypeError: 'dict' object is not callable

The URL of the web page is https://society6.com/discover, but it changes to https://society6.com/society?show=2 after I log in to my account.

What am I doing wrong here?

Note: I am using the Selenium Chrome browser here. The answer given here works in my terminal but not when I run the file.

My full code:

    from selenium import webdriver
    import time
    import requests
    from bs4 import BeautifulSoup
    import lxml

    driver = webdriver.Chrome()
    driver.get("https://society6.com/login?done=/")

    # Log in (credentials replaced with placeholders)
    username = driver.find_element_by_id('email')
    username.send_keys("your-email@example.com")
    password = driver.find_element_by_id('password')
    password.send_keys("your-password")
    driver.find_element_by_name('login').click()
    time.sleep(5)

    driver.find_element_by_link_text('My Society').click()
    driver.find_element_by_link_text('Discover').click()
    time.sleep(5)

    r = requests.get(driver.current_url)
    r.raise_for_status()

    '''discover_page = BeautifulSoup(r.html.raw_html, 'html.parser')
    finding_accounts = discover_page.find_all("a", class_="author track")
    print(len(finding_accounts))
    finding_accounts = discover_page.find_all('a[class="author track"]')
    print(len(finding_accounts))
    links = []
    for a in discover_page.find_all('a', class_ = 'author track'):
        links.append(a['href'])
        #links.append(a.get('href'))
    print(links)'''

    #discover_page.find_all('a')
    links = []
    for a in discover_page.find_all("a", attrs = {"class": "author track"}):
        links.append(a['href'])
        #links.append(a.get('href'))
    print(links)
    #soup.find_all("a", attrs = {"class": "author track"})'''

    soup = BeautifulSoup(r.content, "lxml")
    a_tags = soup.find_all("a", attrs={"class": "author track"})
    for a in soup.find_all('a',{'class':'author track'}):
        print('https://society6.com'+a['href'])

The code inside the triple-quoted string is what I was experimenting with.

If you want to find all the links without extracting them by hand in BeautifulSoup, use requests-html.

Sample code to fetch all the links:

    from requests_html import HTMLSession
    from bs4 import BeautifulSoup

    url = 'https://society6.com/discover'
    session = HTMLSession(mock_browser=True)
    r = session.get(url, headers={'User-Agent': 'Mozilla/5.0'})

    # Every link found on the page, as written and as absolute URLs
    print(r.html.links)
    print(r.html.absolute_links)

    # Only the author links, via BeautifulSoup
    soup = BeautifulSoup(r.html.raw_html, 'html.parser')
    a_tags = soup.find_all("a", attrs={"class": "author track"})
    for a_tag in a_tags:
        print(a_tag['href'])

    # Alternative: plain requests with the lxml parser
    import requests
    from bs4 import BeautifulSoup

    data = requests.get('https://society6.com/discover')
    soup_data = BeautifulSoup(data.content, "lxml")
    for a in soup_data.find_all('a',{'class':'author track'}):
        print('https://society6.com'+a['href'])
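One likely reason the code in the question finds nothing: `requests.get(driver.current_url)` sends a fresh HTTP request that shares neither the Selenium browser's login cookies nor any content rendered by JavaScript. A minimal sketch, assuming the author links are present in the browser's DOM, is to hand `driver.page_source` to BeautifulSoup instead. The helper name `extract_author_links` and the sample markup are made up for illustration and are not taken from the real page.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://society6.com"


def extract_author_links(html, base_url=BASE_URL):
    """Return absolute hrefs of every <a> that has both the 'author' and 'track' classes."""
    soup = BeautifulSoup(html, "html.parser")
    # A CSS selector matches the two classes in any order,
    # whereas class_="author track" only matches that exact string.
    anchors = soup.select("a.author.track[href]")
    return [urljoin(base_url, a["href"]) for a in anchors]


# Stand-in markup (hypothetical). With Selenium you would pass
# driver.page_source here, after the page has finished loading.
sample_html = """
<div>
  <a class="author track" href="/cafelab">cafelab</a>
  <a class="track author" href="/beeple">beeple</a>
  <a class="other" href="/ignored">ignored</a>
</div>
"""

links = extract_author_links(sample_html)
print(links)
```

In the Selenium script this would be called as `extract_author_links(driver.page_source)`, which parses exactly what the logged-in browser is showing.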

As per your question, to print the href of the desired elements you can use Selenium on its own, with the following solution:

  • Code block:

     from selenium import webdriver
     from selenium.webdriver.chrome.options import Options
     from selenium.webdriver.support.ui import WebDriverWait
     from selenium.webdriver.common.by import By
     from selenium.webdriver.support import expected_conditions as EC

     options = Options()
     options.add_argument("start-maximized")
     options.add_argument("disable-infobars")
     options.add_argument("--disable-extensions")
     options.add_argument("--disable-gpu")
     options.add_argument("--no-sandbox")

     # executable_path is the Selenium 3 style; newer Selenium releases take a Service object instead
     driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\ChromeDriver\chromedriver_win32\chromedriver.exe')
     driver.get("https://society6.com/login?done=/")

     # Log in (credentials replaced with placeholders)
     WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input#email"))).send_keys("your-email@example.com")
     driver.find_element(By.CSS_SELECTOR, "input#password").send_keys("your-password")
     driver.find_element(By.CSS_SELECTOR, "button[name='login']").click()

     # Navigate to My Society, then to Discover
     WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a#nav-user-my-society>span"))).click()
     WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.LINK_TEXT, "Discover"))).click()

     # Wait until the author links are visible, then print each href
     hrefs_elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a.author.track")))
     for element in hrefs_elements:
         print(element.get_attribute("href"))
  • Console output:

     https://society6.com/pivivikstrm
     https://society6.com/cafelab
     https://society6.com/cafelab
     https://society6.com/colorandcolor
     https://society6.com/83oranges
     https://society6.com/aftrdrk
     https://society6.com/alaskanmommabear
     https://society6.com/thindesign
     https://society6.com/colorandcolor
     https://society6.com/aftrdrk
     https://society6.com/aljahorvat
     https://society6.com/bribuckley
     https://society6.com/hennkim
     https://society6.com/franciscomffonseca
     https://society6.com/83oranges
     https://society6.com/nadja1
     https://society6.com/beeple
     https://society6.com/absentisdesigns
     https://society6.com/alexandratarasoff
     https://society6.com/artdekay880
     https://society6.com/annaki
     https://society6.com/cafelab
     https://society6.com/bribuckley
     https://society6.com/bitart
     https://society6.com/draw4you
     https://society6.com/cafelab
     https://society6.com/beeple
     https://society6.com/burcukorkmazyurek
     https://society6.com/absentisdesigns
     https://society6.com/deanng
     https://society6.com/beautifulhomes
     https://society6.com/aftrdrk
     https://society6.com/printsproject
     https://society6.com/bluelela
     https://society6.com/anipani
     https://society6.com/ecmazur
     https://society6.com/batkei
     https://society6.com/menchulica
     https://society6.com/83oranges
     https://society6.com/7115
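The console output lists several profiles more than once (for example `cafelab`). If only distinct links are needed, one possible sketch is to drop the repeats while keeping first-seen order. The list below is a short excerpt of that output, and the variable names are illustrative.

```python
# Excerpt of the hrefs printed by the Selenium solution (first five lines)
hrefs = [
    "https://society6.com/pivivikstrm",
    "https://society6.com/cafelab",
    "https://society6.com/cafelab",
    "https://society6.com/colorandcolor",
    "https://society6.com/83oranges",
]

# dict keys are unique and keep insertion order, so this removes
# duplicates without reordering the remaining links
unique_hrefs = list(dict.fromkeys(hrefs))

for href in unique_hrefs:
    print(href)
```

With the Selenium code, `hrefs` would instead be built from `element.get_attribute("href")` for each located element.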