Script Works in Non-Headless Mode but Fails in Headless Mode with “Element Not Found” Error

Written by News One November 8, 2024

I’m using SeleniumBase in Python to automate interactions on a webpage. My script runs perfectly in non-headless mode, but when I switch to headless mode (headless=True or headless2=True), it fails to find certain elements, in particular #card-lib-selectCompany-change, even after multiple scrolling attempts. Here’s the relevant part of my code:

from seleniumbase import SB

def scrape_servipag_service_reading(service_type, company, identifier):
    result = None
    with SB(uc=True, headless2=True) as sb:
        try:
            sb.set_window_size(1920, 1080)
            url = f"https://portal.servipag.com/paymentexpress/category/{service_type}"
            sb.uc_open_with_reconnect(url, 4)
            sb.uc_gui_click_captcha()
            sb.sleep(10)

            # Scroll to find element
            for _ in range(30):
                sb.execute_script("window.scrollBy(0, 100);")
                sb.sleep(0.2)
                if sb.is_element_visible("#card-lib-selectCompany-change"):
                    break

            sb.wait_for_element("#card-lib-selectCompany-change", timeout=100)
            sb.select_option_by_text("#card-lib-selectCompany-change", company)
            # More interactions follow...

        except Exception as e:
            print(f"Error: {e}")
    return result

When I run this code in non-headless mode, it finds and interacts with #card-lib-selectCompany-change without any issues. However, in headless mode, I receive the error:

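Elemento #card-lib-selectCompany-change no se encontró después de varios intentos de desplazamiento.

(In English: “Element #card-lib-selectCompany-change was not found after several scroll attempts.”)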

I’ve tried the following to fix the issue in headless mode:

Setting a custom user-agent.
Explicitly setting the viewport size to 1920×1080.
Adding more scroll attempts and sleep intervals.
Switching between headless=True and headless2=True.

To debug, I printed the current URL and page source in headless mode (headless=True). The output suggests the browser never actually loaded the intended page. Here are the print statements:

 print("Current URL:", sb.get_current_url())
 print("Page Source:", sb.get_page_source())

The URL and page source show content unrelated to the target page:

 Current URL: chrome-extension://neajdppkdcdipfabeoofebfddakdcjhd/audio.html
 Page Source: <html><head></head><body> <script src="audio.js"></script><audio></audio> </body></html>
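
Since the output above shows a chrome-extension:// URL, one thing I could check is whether the driver is simply focused on an extension page rather than the tab holding the target site. The following is only a sketch under that assumption, not a confirmed fix: switch_to_content_window and its skip_prefixes parameter are hypothetical names of my own, and the helper relies only on the standard Selenium driver attributes window_handles, current_window_handle, current_url, and switch_to.window().

```python
def switch_to_content_window(driver, skip_prefixes=("chrome-extension://",)):
    """Focus the first window whose URL is not an extension page.

    Hypothetical helper: `driver` is any Selenium-style WebDriver.
    Returns the URL of the window that was selected, or None if every
    open window shows a URL starting with one of `skip_prefixes`.
    """
    original = driver.current_window_handle
    for handle in driver.window_handles:
        driver.switch_to.window(handle)
        url = driver.current_url
        # str.startswith accepts a tuple of prefixes
        if not url.startswith(skip_prefixes):
            return url
    # Nothing suitable found: restore the original focus
    driver.switch_to.window(original)
    return None
```

Inside the with SB(...) block, this could be called as switch_to_content_window(sb.driver) right after sb.uc_open_with_reconnect(url, 4). If it returns None, no regular tab is open at all, which would point to the page never loading rather than to a focus problem.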

Has anyone experienced something similar, or does anyone know how to ensure the page loads fully in headless mode?
