How to Scrape Google Search Results: Python, APIs & AI Overviews

Scraping Google search results successfully in 2026 comes down to two things: scale and browser rendering. I build and maintain web data pipelines, and the old HTTP request tutorials no longer work. Between Google's January 2025 SearchGuard rollout, the removal of the num=100 parameter, and the rise of asynchronous AI Overviews, direct HTTP scraping now fails almost immediately. If you need a few dozen results for a learning project, use Playwright to automate a headless browser. If you need reliable SERP data for rank tracking or AI agents, use a dedicated API.

To scrape Google search results effectively, match your architecture to your daily query volume. For low-volume testing, use Python with Playwright to render the page, then parse the DOM using BeautifulSoup. For production workflows, rank tracking, or AI visibility analysis, use a dedicated SERP API or managed parser. These enterprise tools automatically handle Google's JavaScript challenges, proxy rotation, and dynamic AI Overviews to return normalized JSON.

The Best Way to Get Google Search Result Data

For low-volume learning, do-it-yourself browser automation works well enough, although it will still hit occasional challenge pages. For recurring pipelines, rank tracking, and speed-first AI workflows, use a dedicated SERP API or a managed parser workflow to bypass DOM complexity and JavaScript bot challenges.

You have three main options for getting search data. The modern API market groups into full SERP APIs, fast APIs, and index-based search APIs. Here is how I map those to specific jobs:

Rank tracking and SEO monitoring: Full SERP API or managed parser workflow.
Competitor intelligence: Full SERP API with historical storage capabilities.
AI visibility and brand tracking: Full SERP API with explicit AI Overview rendering.
AI agents and RAG applications: Index-based search API combined with a scrape-and-parse pipeline.
Learning projects: Playwright and BeautifulSoup stack in Python.

If you already know you need structured JSON, skip directly to the production workflow section.

Why Scraping Google Changed in 2025 and 2026

Legacy web scraping tutorials fail because the collection layer, the layout of the SERP, and the legal climate fundamentally changed over the past 18 months. You must render JavaScript to see modern search results.

SearchGuard Ended the HTTP-First Playbook

In January 2025, Google rolled out SearchGuard. Google's December 2025 legal complaint against SerpApi explicitly names SearchGuard as a technological protection measure. It operates by serving JavaScript challenges to unrecognized or automated sources before granting access to the search results page.

Google alleges that building SearchGuard required tens of thousands of person-hours. Automated query volume from providers increased drastically prior to this crackdown. If you send a naive HTTP request today, you will receive a challenge page instead of search results.

The `num=100` Removal Multiplied Costs

Historically, developers appended &num=100 to a query URL to fetch 100 results per page. Google deprecated this parameter in September 2025. Scrapers now must make 10 separate paginated requests to get the same 100 links. This drastically increased proxy bandwidth, infrastructure load, and the overall cost of data collection.
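
The pagination overhead can be sketched in a few lines. This is a minimal illustration, assuming Google's start offset parameter and roughly 10 organic results per page; build_page_urls is a hypothetical helper name, not part of any library.

```python
from urllib.parse import urlencode

RESULTS_PER_PAGE = 10  # assumption: about 10 organic results per page


def build_page_urls(query: str, depth: int = 100) -> list[str]:
    """Return one search URL per results page needed to cover `depth` results."""
    urls = []
    for start in range(0, depth, RESULTS_PER_PAGE):
        # `start` is the zero-based offset of the first result on the page
        params = {"q": query, "start": start}
        urls.append("https://www.google.com/search?" + urlencode(params))
    return urls


urls = build_page_urls("enterprise seo tools")
print(len(urls))  # 10 requests where a single num=100 request used to suffice
```

Each of those URLs is a separate render, proxy hop, and parse, which is where the extra cost comes from.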

Search Engine Land analyzed 319 properties following this change. They found 87.7% lost impressions and 77.6% lost unique ranking terms. The entire SEO measurement ecosystem had to rebuild its infrastructure.

AI Overviews Turned the SERP Into an App

AI Overviews appear on nearly 48% of all queries and up to 80% of informational queries. These elements often load asynchronously after the initial HTML response. Fetching the page once via a standard request yields an empty container. Scraping Google results now requires rendering the page like a dynamic single-page application.

Google Search API vs. Google Custom Search API vs. SERP APIs

The phrase "Google Search API" usually refers to three different things. Confusing them leads to broken pipelines. Google's official JSON API is restricted and closed to new signups, making third-party tools mandatory for live SERP monitoring.

Does Google have an official Google Search API? Not in the way most developers mean it. No direct, public API exists for scraping full live google.com search results.

The Google Custom Search JSON API is the closest official tool. It requires an API key and a configured Programmable Search Engine ID. However, Google closed it to new customers. Existing customers must transition off the product by January 1, 2027. More importantly, it retrieves results from a defined Programmable Search Engine. It does not perfectly mirror a live google.com search. If your job involves precise rank tracking or global AI visibility, this official API will return mismatched data.

Third-party SERP APIs handle browser rendering, JavaScript challenges, proxy management, and JSON parsing. Some prioritize exact SERP layout fidelity. Others optimize for raw speed. Meanwhile, AI-native search APIs act as a discovery layer for language models, prioritizing semantic relevance over exact Google rankings.

What Data Can You Extract from a Modern Google SERP?

Do not build a pipeline expecting just a list of blue links. Build a schema that accounts for modern dynamic elements, knowledge graphs, local packs, and AI citations.

A modern Google SERP offers significantly more than organic links. Useful fields include organic results, featured snippets, People Also Ask boxes, related searches, knowledge graph data, local pack results, ads, and AI Overview citations.

Your parsing logic must isolate these specific surfaces. A robust parser handles distinct fields for searchParameters, knowledgeGraph, organic, peopleAlsoAsk, and relatedSearches. Given that AI Overviews trigger on nearly half of all queries, capturing asynchronous citations is critical for competitive intelligence.
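
One way to keep those surfaces separate is to normalize every response into a fixed set of top-level keys. The sketch below assumes the raw payload is a dictionary that already uses the field names mentioned above; aiOverview and normalize_serp are hypothetical names added for illustration.

```python
from typing import Any

# Expected top-level surfaces and the empty value to use when one is missing.
SERP_FIELDS: dict[str, Any] = {
    "searchParameters": {},
    "knowledgeGraph": {},
    "organic": [],
    "peopleAlsoAsk": [],
    "relatedSearches": [],
    "aiOverview": {},  # hypothetical key for AI Overview citations
}


def normalize_serp(raw: dict[str, Any]) -> dict[str, Any]:
    """Return a dict with every expected surface present, dropping unknown keys."""
    normalized = {}
    for field, empty in SERP_FIELDS.items():
        value = raw.get(field)
        # Use a fresh empty container when the surface is absent or has the wrong type.
        normalized[field] = value if isinstance(value, type(empty)) else type(empty)()
    return normalized


sample = {"organic": [{"position": 1, "title": "Example", "link": "https://example.com"}]}
print(normalize_serp(sample)["peopleAlsoAsk"])  # []
```

Downstream code can then read every surface without checking whether the key exists.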

How to Scrape Google Search Results with Python

Scraping Google search results with Python relies entirely on browser automation. Use Playwright as the render layer and BeautifulSoup or lxml as the parse layer after the page loads.

I recommend Python for low-volume experiments, provided your fetch layer is a headless browser. The "render first, parse second" strategy is the only reliable do-it-yourself method remaining.

BeautifulSoup is exceptional at parsing rendered HTML. On its own, though, it cannot get you Google results: it is a parser, not a fetcher, and it cannot execute JavaScript.

Minimal Python Workflow

Construct your target URL including query (q), country (gl), and language (hl) parameters.
Use Playwright to execute JavaScript and wait for the target containers to populate.
Pass the rendered HTML to BeautifulSoup and target elements using robust fallback selectors.
Map the extracted strings to a defined dictionary structure and export to JSON.

You will eventually hit JS challenge pages. You must also account for selector drift, CAPTCHA interruptions, geo-location leakage, and incomplete AI Overview extraction.

```python
# Playwright renders the page; BeautifulSoup parses the rendered HTML.
import json
from urllib.parse import urlencode

from bs4 import BeautifulSoup
from playwright.sync_api import TimeoutError as PlaywrightTimeoutError
from playwright.sync_api import sync_playwright


def scrape_google_search(query):
    # URL-encode the query together with the localization parameters
    params = urlencode({"q": query, "gl": "us", "hl": "en"})
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(f"https://www.google.com/search?{params}")
            # Wait for the main search container so JavaScript has run.
            # A timeout here often means a challenge or consent page was served.
            page.wait_for_selector("div#search", timeout=15000)
            html = page.content()
        except PlaywrightTimeoutError:
            return json.dumps([])
        finally:
            browser.close()

    soup = BeautifulSoup(html, "html.parser")
    results = []

    # Extract organic results. Google's class names drift, so treat "div.g" as a starting point.
    for g in soup.find_all("div", class_="g"):
        title = g.find("h3")
        link = g.find("a", href=True)
        if title and link:
            results.append({"title": title.get_text(strip=True), "url": link["href"]})

    return json.dumps(results, indent=2)


if __name__ == "__main__":
    print(scrape_google_search("enterprise seo tools"))
```

When to Use a SERP API Instead of Direct Scraping

Use a SERP API when repeatability, strict JSON output, geographic targeting, and maintenance automation matter more to your business than controlling a headless browser manually.

Modern providers split into full SERP APIs, fast APIs, and index-based search APIs. Full SERP APIs provide deep element coverage like PAA and AI Overviews. Fast APIs optimize for latency but sacrifice deeper features. Index-based APIs serve discovery-first AI models.

Do not choose a provider based on marketing alone. Compare providers across price per 1,000 queries, p95 response time, AI Overview state support, schema richness, geographic controls, batch capabilities, and legal indemnification posture.

If you are using Python, call the API with the query, location, and language parameters, handle authentication, and normalize the JSON into your own schema.

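```python
import requests


def fetch_serp_api_data(query, api_key):
    # api.provider.com is a placeholder; substitute your provider's endpoint and parameter names.
    response = requests.get(
        "https://api.provider.com/search",
        params={"q": query, "gl": "us", "api_key": api_key},  # requests URL-encodes these
        timeout=30,
    )
    response.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
    data = response.json()

    # Normalize into a standard schema
    return {
        "query": query,
        "timestamp": data.get("timestamp"),
        "country": "US",
        "organic": data.get("organic_results", []),
        "people_also_ask": data.get("related_questions", []),
        "ai_overview": data.get("ai_overview", {}),
    }
```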

How to Choose the Right Method by Scale, Budget, and Risk

Evaluate your method by matching daily query volume, data depth requirements, and internal engineering capacity against the total cost of ownership. Do-it-yourself scraping often hides massive maintenance costs.

Can you scrape Google search results for free? Yes, at tiny scales. At sustained scales, usually no.

Fewer than 100 queries a day: Do-it-yourself browser automation works fine.
100 to 10,000 queries a day: Managed parser workflows or lightweight SERP APIs become necessary to handle blocking.
More than 10,000 queries a day: Enterprise SERP APIs with batch processing and webhook support are mandatory.
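
The tiers above reduce to a simple lookup. This is only a sketch of the thresholds as stated; choose_method and the returned labels are hypothetical names.

```python
def choose_method(daily_queries: int) -> str:
    """Map a daily query volume to one of the three tiers listed above."""
    if daily_queries < 0:
        raise ValueError("daily_queries must be non-negative")
    if daily_queries < 100:
        return "diy-browser-automation"
    if daily_queries <= 10_000:
        return "managed-parser-or-lightweight-serp-api"
    return "enterprise-serp-api"


for volume in (50, 2_500, 50_000):
    print(volume, choose_method(volume))
```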

The true cost of building your own scraper includes proxy overhead, headless browser compute, and engineering opportunity cost. The num=100 removal alone multiplied the number of paginated requests by 10 for deep monitoring jobs. Compare that against directional market rates of $1.50 to $3.00 per 1,000 structured requests. Once you factor in retries, selector upkeep, and missed SERP features like deferred AI Overviews, building your own pipeline stops being cheaper long before it stops being technically possible.
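
To make the API side of that comparison concrete, here is a rough estimate under stated assumptions: each results page is billed as one request, a page holds about 10 results, and the workload of 500 keywords is hypothetical. daily_api_cost is an illustrative helper, not a provider formula.

```python
import math

RESULTS_PER_PAGE = 10  # assumption: one billed request returns about 10 organic results


def daily_api_cost(keywords: int, depth: int, price_per_1k: float) -> float:
    """Estimate daily spend when every results page is one billed request."""
    pages_per_keyword = math.ceil(depth / RESULTS_PER_PAGE)
    requests_per_day = keywords * pages_per_keyword
    return requests_per_day / 1000 * price_per_1k


# Hypothetical workload: 500 keywords tracked to position 100, once a day.
low = daily_api_cost(500, 100, 1.50)
high = daily_api_cost(500, 100, 3.00)
print(f"${low:.2f} to ${high:.2f} per day")  # $7.50 to $15.00 per day
```

Some providers may bill differently, so treat the result as an order-of-magnitude check rather than a quote.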

How to Scrape AI Overviews and Other Modern SERP Features

A scraper that misses AI Overviews is fundamentally broken for informational queries in 2026. Your parsing architecture must account for asynchronous rendering to accurately capture citations and summaries.

AI Overviews load in three primary states: complete on initial load, deferred via asynchronous rendering, or completely absent. The deferred state causes the highest failure rate. Raw HTTP requests hit the server, receive a 200 OK status, and parse the HTML before the AI Overview script finishes populating the container.
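
The three states can be told apart by polling the rendered page instead of reading it once. The sketch below is browser-agnostic: read_container is a hypothetical callable you would supply (with Playwright it could wrap a lookup of whatever selector you target), and the extra timed_out state is my own addition for containers that never fill in.

```python
import time
from typing import Callable, Optional


def wait_for_ai_overview(
    read_container: Callable[[], Optional[str]],
    timeout: float = 5.0,
    interval: float = 0.25,
) -> tuple[str, str]:
    """Poll a rendered page and return (render_state, text).

    `read_container` returns None when no AI Overview container exists,
    an empty string while the container is still loading, and the summary
    text once it has been filled in.
    """
    deadline = time.monotonic() + timeout
    first_read = True
    while True:
        text = read_container()
        if text and text.strip():
            # Filled on the very first read means it shipped with the initial load.
            return ("complete" if first_read else "deferred", text.strip())
        if time.monotonic() >= deadline:
            return ("absent" if text is None else "timed_out", "")
        first_read = False
        time.sleep(interval)


# Simulated page: the container is empty twice, then fills in.
reads = iter(["", "", "AI Overview summary text"])
print(wait_for_ai_overview(lambda: next(reads), timeout=2.0, interval=0.01))
# ('deferred', 'AI Overview summary text')
```

Recording the returned state alongside the text lets you separate "no AI Overview for this query" from "the scraper gave up too early".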

To measure AI visibility accurately, your schema must capture summary text, cited sources, source URLs, render state, timestamp, query, and country context. Playwright allows you to pause execution until the specific container populates. Structured APIs abstract this away entirely. Given that AI Overviews appear on nearly 48% of queries, field completeness directly affects business intelligence.

Is It Legal to Scrape Google Search Results?

Legality is not a binary yes or no. The risk profile depends on Terms of Service, public-data precedents, and DMCA anti-circumvention claims. Assess your legal tolerance before deploying automation at scale.

Layer 1. Terms of Service Risk

Google explicitly prohibits unauthorized automated access in its Terms of Service. Violating ToS can lead to account bans or IP blocking. This is a contractual risk, not a criminal statute.

Layer 2. CFAA and Public-Data Precedent

Older US court cases largely protect the scraping of publicly available data under the Computer Fraud and Abuse Act. However, these precedents do not settle all legal questions regarding Google's proprietary search delivery mechanisms.

Layer 3. DMCA Anti-Circumvention and Google v. SerpApi

On December 19, 2025, Google escalated the legal landscape by filing a complaint against SerpApi alleging violations of DMCA Section 1201. Google framed SearchGuard as a technological protection measure and targeted circumvention systems rather than scraping in the abstract. This litigation is ongoing. Do not assume older public-data protections instantly invalidate a DMCA anti-circumvention claim.

If you ingest scraped SERP data into AI training pipelines, the EU AI Act brings strict transparency and copyright obligations, with enforcement ramping up to August 2026.

Production Workflow: From Query to Structured JSON

Treat SERP collection like a data product, not a script. A reliable production system requires repeatable query generation, schema normalization, timestamped storage, change detection, and automated monitoring.

A script that prints titles to the console is not a data pipeline. You need request orchestration, an extraction layer, parsing, time-series storage, alerting, and provider failover. Store the same fields for every run so that results stay comparable over time. Include the query, run timestamp, country, device, position, title, URL, snippet, SERP feature type, AI Overview block, and source method.
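
A fixed record type is one way to enforce that field list. The sketch below uses a dataclass; SerpRow, its attribute names, and the sample values are assumptions chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class SerpRow:
    query: str
    run_timestamp: str          # ISO 8601, UTC
    country: str
    device: str
    position: int
    title: str
    url: str
    snippet: str
    feature_type: str           # e.g. "organic", "people_also_ask", "ai_overview"
    ai_overview: Optional[str]  # summary text when the row belongs to an AI Overview
    source_method: str          # e.g. "playwright" or "serp_api"


row = SerpRow(
    query="enterprise seo tools",
    run_timestamp=datetime.now(timezone.utc).isoformat(),
    country="us",
    device="desktop",
    position=1,
    title="Example result",
    url="https://example.com",
    snippet="Example snippet",
    feature_type="organic",
    ai_overview=None,
    source_method="playwright",
)
print(json.dumps(asdict(row)))  # one JSON line per result, suited to append-only storage
```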

Production pipelines break silently. Implement missing-result detection, schema validation checks, and differential alerts that flag an abnormal drop in results.
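
A differential alert can be as simple as comparing the current result count with a recent baseline. This is a minimal sketch, assuming you store the number of parsed results per query per run; flag_result_drop and the 50% threshold are illustrative choices.

```python
from statistics import median


def flag_result_drop(history: list[int], current: int, max_drop: float = 0.5) -> bool:
    """Return True when `current` falls more than `max_drop` below the recent median."""
    if not history:
        return False  # nothing to compare against yet
    baseline = median(history)
    if baseline == 0:
        return False
    return current < baseline * (1 - max_drop)


recent_counts = [98, 100, 97, 99, 100]  # results parsed per query on earlier runs
print(flag_result_drop(recent_counts, 96))  # False
print(flag_result_drop(recent_counts, 10))  # True
```

Using the median rather than the last run keeps one bad run from resetting the baseline.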

Example Managed Workflow with Olostep

I recommend using a managed workflow that separates URL generation, extraction, parsing, and retrieval.

Scrapes: Use POST /v1/scrapes to abstract the extraction layer for a single known Google search URL.
Parser: Use the pre-built @olostep/google-search parser to natively turn complex pages into backend-compatible JSON, exposing nested fields for searchParameters, knowledgeGraph, organic, and peopleAlsoAsk.
Batches: For recurring rank tracking, use /v1/batches. It processes up to 10,000 URLs per batch, supporting item retrieval and automated webhooks.
Search Endpoint: If you are building an AI agent, use the Search endpoint first. It retrieves deduplicated links and acts as a discovery layer before handing off to Scrapes or Batches.
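
For the Batches step, a large URL list has to be split so that no batch exceeds the stated 10,000-URL limit. The sketch below only does the splitting; it deliberately does not show a request body, because the payload format is not described here. chunk_urls is a hypothetical helper.

```python
from typing import Iterator

MAX_URLS_PER_BATCH = 10_000  # limit stated in the Batches description above


def chunk_urls(urls: list[str], size: int = MAX_URLS_PER_BATCH) -> Iterator[list[str]]:
    """Yield consecutive slices of `urls`, each at most `size` items long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for i in range(0, len(urls), size):
        yield urls[i:i + size]


urls = [f"https://www.google.com/search?q=keyword+{i}" for i in range(25_000)]
print([len(batch) for batch in chunk_urls(urls)])  # [10000, 10000, 5000]
```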

FAQ

Can you scrape Google search results for free?

You can do low-volume experiments with your own code for free. However, free usually excludes the real costs of time, maintenance, rendering, retries, and instability. Third-party SERP tools charge because they absorb the hard parts of extraction, parsing, and reliability.

What happened to the `num=100` parameter?

Google quietly removed support for it in September 2025. The old shortcut stopped being a stable way to pull deeper Google result pages, which increased collection overhead. Scrapers must now request each results page in turn to retrieve large result blocks.

How do I get location-specific Google results?

Use the gl parameter for the country code and hl for the language code in your query string. True location fidelity requires a proxy or a SERP API that routes requests through localized residential IP addresses to prevent geo-leakage.
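
As a small illustration of the query-string part, the helper below builds a localized URL with gl and hl. localized_search_url is a hypothetical name, and the URL alone does not provide true location fidelity; the proxy or API routing described above is still needed.

```python
from urllib.parse import urlencode


def localized_search_url(query: str, country: str, language: str) -> str:
    """Build a Google search URL with country (gl) and language (hl) codes."""
    params = {"q": query, "gl": country.lower(), "hl": language.lower()}
    return "https://www.google.com/search?" + urlencode(params)


print(localized_search_url("coffee shops near me", "DE", "de"))
# https://www.google.com/search?q=coffee+shops+near+me&gl=de&hl=de
```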

What can a SERP API return that Google Custom Search API cannot?

A full SERP API returns richer live google.com elements, broader SERP feature coverage, parsed JSON fields, and provider-specific handling for dynamic content like AI Overviews. Google's official API is tied to a Programmable Search Engine, is closed to new customers, and is not a full live google.com replica.

Do I need proxies and CAPTCHA handling for DIY scraping?

Yes. Once your query volume triggers SearchGuard, Google deploys JavaScript challenges. Evading them manually requires rotating residential proxies and sophisticated browser fingerprinting.

Next Step: Pick Your Path and Implement It

Do not let complexity paralyze your data pipeline. If you are learning the architecture and wondering how to web scrape Google search results, test the Python Playwright stack locally. If you are deciding between vendors, compare them against the criteria above and prioritize AI Overview completeness. If you are building for enterprise scale, move immediately to a production workflow.

Start with one query, one schema, and one target use case. Scale the method that survives contact with reality. Understanding how to scrape Google search results properly means acknowledging that extraction is no longer a simple script, but a rendering and compliance challenge. Choose the right tool, build robust fallback parsing, and monitor your data quality daily.
