Integrate The Home Depot Search Page Results Data with SerpApi and Python
This blog post is a step-by-step guide to scraping The Home Depot search results with SerpApi using Python.
Intro
SerpApi’s Home Depot API lets you scrape product information in real time for your automation without any web scraping knowledge. In this blog post, we'll go over how to extract data from all pages using The Home Depot Search API and the Python programming language. You can look at the complete code in the online IDE (Replit).
If you prefer video format, we have a dedicated video that shows how to do that: The Home Depot Search API overview - SerpApi.
What will be scraped
📌Note: Each page displays up to 24 products.
Why use an API?
There are a couple of reasons to use an API, ours in particular:
- No need to create a parser from scratch and maintain it.
- No need to bypass blocks from the target website yourself, such as solving CAPTCHAs or working around IP blocks.
- No need to pay for proxies and CAPTCHA solvers.
- No need to use browser automation.
SerpApi handles everything on the backend, with response times under ~2.5 seconds (~1.2 seconds with Ludicrous Speed) per request. No browser automation is involved, which makes requests much faster. Response times and status rates are shown on the SerpApi Status page.
Full Code
This code retrieves all the data with pagination:
from serpapi import GoogleSearch
import json
params = {
    'api_key': '...',           # https://serpapi.com/manage-api-key
    'engine': 'home_depot',     # SerpApi search engine
    'q': 'coffeee',             # query
    'ps': 10,                   # number of items per page
    'lowerbound': 20,           # minimum price
    'upperbound': 50,           # maximum price
    'hd_sort': 'top_rated',     # sorted by different options
    'page': 1                   # pagination
}
search = GoogleSearch(params)   # where data extraction happens on the SerpApi backend
results = search.get_dict()     # JSON -> Python dict
home_depot_results = {
    'search_information': results['search_information'],
    'filters': results['filters'],
    'products': []
}
while 'error' not in results:
    home_depot_results['products'].extend(results['products'])
    params['page'] += 1
    results = search.get_dict()
print(json.dumps(home_depot_results, indent=2, ensure_ascii=False))
Preparation
Install library:
pip install google-search-results
google-search-results is a SerpApi API package.
Code Explanation
Import libraries:
from serpapi import GoogleSearch
import json
Library | Purpose |
GoogleSearch | to scrape and parse The Home Depot results using the SerpApi web scraping library. |
json | to convert extracted data to a JSON object. |
The parameters are defined for generating the URL. If you want to pass other parameters to the URL, you can do so using the params dictionary:
params = {
    'api_key': '...',           # https://serpapi.com/manage-api-key
    'engine': 'home_depot',     # SerpApi search engine
    'q': 'coffeee',             # query
    'ps': 10,                   # number of items per page
    'lowerbound': 20,           # minimum price
    'upperbound': 50,           # maximum price
    'hd_sort': 'top_rated',     # sorted by different options
    'page': 1                   # pagination
}
Parameters | Explanation |
api_key | Parameter defines the SerpApi private key to use. You can find it under your account -> API key |
engine | Set parameter to home_depot to use The Home Depot API engine. |
q | Parameter defines the search query. You can use anything that you would use in a regular The Home Depot search. |
ps | Determines the number of items per page. There are scenarios where Home Depot overrides the ps value. By default, Home Depot returns 24 results. 48 is the max value. |
lowerbound | Defines lower bound for price in USD. |
upperbound | Defines upper bound for price in USD. |
hd_sort | Parameter defines the sort order of the results (for example, top_rated). |
page | Value is used to get the items on a specific page. (e.g., 1 (default) is the first page of results, 2 is the 2nd page of results, 3 is the 3rd page of results, etc.). |
📌Note: You can also add other API Parameters.
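As a rough sketch, the constraints from the table can be checked before a request is sent. The helper name build_params and the constants below are hypothetical and not part of the SerpApi library. The allowed hd_sort values and the ps maximum of 48 are the ones stated in this post:

```python
# Hypothetical helper: not part of google-search-results.
ALLOWED_SORTS = {
    'top_sellers', 'price_low_to_high', 'price_high_to_low',
    'top_rated', 'best_match',
}
MAX_PAGE_SIZE = 48  # maximum `ps` value stated in the table


def build_params(api_key, query, ps=24, lowerbound=None,
                 upperbound=None, hd_sort=None, page=1):
    """Validate the inputs and return a params dict for the home_depot engine."""
    if not 1 <= ps <= MAX_PAGE_SIZE:
        raise ValueError(f'ps must be between 1 and {MAX_PAGE_SIZE}')
    if hd_sort is not None and hd_sort not in ALLOWED_SORTS:
        raise ValueError(f'unknown hd_sort value: {hd_sort}')
    if (lowerbound is not None and upperbound is not None
            and lowerbound > upperbound):
        raise ValueError('lowerbound must not exceed upperbound')

    params = {
        'api_key': api_key,
        'engine': 'home_depot',
        'q': query,
        'ps': ps,
        'page': page,
    }
    # Only send optional parameters that were actually provided.
    optional = {'lowerbound': lowerbound, 'upperbound': upperbound, 'hd_sort': hd_sort}
    params.update({key: value for key, value in optional.items() if value is not None})
    return params


example_params = build_params('...', 'coffee', ps=10, lowerbound=20,
                              upperbound=50, hd_sort='top_rated')
```

The resulting dictionary can be passed to GoogleSearch in the same way as the hand-written params shown earlier.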
Then, we create a search object where the data is retrieved from the SerpApi backend. In the results dictionary we get data from JSON:
search = GoogleSearch(params) # data extraction on the SerpApi backend
results = search.get_dict() # JSON -> Python dict
Consider the parameters mentioned above to understand how they affect the request.
You may have noticed that I made a mistake when passing the value to the q parameter. This was done on purpose to demonstrate that SerpApi's The Home Depot Spell Check API allows you to extract the corrected search term and search for it:
print(results['search_information']['spelling_fix']) # coffee
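A small sketch of reading the corrected term defensively. It assumes the spelling_fix key may be missing when no correction was applied, which this post does not confirm. The function name corrected_query is hypothetical, and the sample dictionary mirrors the search_information block from the output shown in this post:

```python
def corrected_query(results, original_query):
    """Return the spelling fix if the response carries one, else the original query."""
    info = results.get('search_information', {})
    return info.get('spelling_fix', original_query)


# Sample shaped like the search_information block in the output of this post.
sample_results = {
    'search_information': {
        'results_state': 'Results for spelling fix',
        'total_results': 1111,
        'spelling_fix': 'coffee',
    }
}

print(corrected_query(sample_results, 'coffeee'))  # coffee
print(corrected_query({}, 'coffee'))               # coffee (nothing to correct)
```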
SerpApi’s Home Depot Sorting API allows you to change the ordering of scraped data according to various product details such as price, overall rating of customer reviews, etc. via hd_sort. It can be set to:
- top_sellers
- price_low_to_high
- price_high_to_low
- top_rated
- best_match
SerpApi’s Home Depot Price Bound API allows you to refine searches using the lowerbound and upperbound parameters to set the minimum and maximum price.
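If you want to double-check the price range on the client side, a filter like the one below could be applied to the returned products. This is only a sketch: within_bounds is a hypothetical name, and it assumes each product carries a numeric price field as in the sample output of this post. The second and third sample items are made up for illustration:

```python
def within_bounds(products, lowerbound, upperbound):
    """Keep products whose numeric 'price' lies in [lowerbound, upperbound]."""
    return [
        product for product in products
        if isinstance(product.get('price'), (int, float))
        and lowerbound <= product['price'] <= upperbound
    ]


sample_products = [
    {'title': 'Mardi Gras King Cake Medium Roast Single Serve Cups (54-Pack)', 'price': 34.65},
    {'title': 'Made-up item below the range', 'price': 12.0},
    {'title': 'Made-up item without a price'},
]

in_range = within_bounds(sample_products, 20, 50)
print([product['title'] for product in in_range])
```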
At the moment, the results dictionary only stores data from 1 page. Before extracting data, the home_depot_results dictionary is created where this data will be added later. Since the search_information and filters are repeated on each subsequent page, you can extract them immediately:
home_depot_results = {
    'search_information': results['search_information'],
    'filters': results['filters'],
    'products': []
}
SerpApi’s Home Depot Filtering API allows you to refine a search according to product details and delivers structured data. It also eliminates the need for complicated web scraping processes or tools required by scrapers.
The filters list contains various filters that depend on the product data and also appear on the left side of the search page. These filters can be used to extract only products with specific attributes such as warranty conditions, durability, certifications, brands, etc. to empower your e-commerce projects with the power of data scraping.
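To make the filters easier to look up, the list can be reshaped into a dictionary keyed by filter name. The sketch below assumes the structure visible in the output of this post (a list of objects with key and value, where each option has name, count, value and link). The function name index_filters is hypothetical:

```python
def index_filters(filters):
    """Map each filter key to a {option name: option value code} dictionary."""
    return {
        entry['key']: {option['name']: option['value'] for option in entry.get('value', [])}
        for entry in filters
    }


# Two entries copied from the sample output, trimmed to one option each.
sample_filters = [
    {'key': 'Brand', 'value': [{'name': "Victor Allen's", 'count': '55', 'value': 'nig'}]},
    {'key': 'Category', 'value': [{'name': 'Food & Gifts', 'count': '134', 'value': 'cigl'}]},
]

filter_index = index_filters(sample_filters)
print(filter_index['Brand']["Victor Allen's"])  # nig
```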
Extracting all products
To get all products, you need to apply pagination. This is achieved by the following check: while there is no error in the results object of the current page, we extract the data, increase the page parameter by 1 to get the results from the next page, and update the results object with the new page data:
while 'error' not in results:
    # data extraction from current page will be here
    params['page'] += 1
    results = search.get_dict()
📌Note: The error check relies on the error handling of google-search-results, which reports backend errors (failed search or no more results) as well as client errors.
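A more defensive variant of this loop, as a sketch only: it caps the number of pages and tolerates a page without a products key. The names collect_products, fetch_page and max_pages are hypothetical. fetch_page stands for any function that returns one page of results as a dictionary, and fake_fetch with its error text is a made-up stand-in so the example runs without an API key:

```python
def collect_products(fetch_page, max_pages=5):
    """Collect products page by page until an error, an empty page, or the page cap."""
    products = []
    for page in range(1, max_pages + 1):
        results = fetch_page(page)
        if 'error' in results:
            break  # failed search or no more results
        page_products = results.get('products', [])
        if not page_products:
            break  # nothing left to add
        products.extend(page_products)
    return products


def fake_fetch(page):
    """Stand-in for a real request: two pages of data, then an error."""
    pages = {
        1: {'products': [{'title': 'A'}, {'title': 'B'}]},
        2: {'products': [{'title': 'C'}]},
    }
    return pages.get(page, {'error': 'made-up error text'})


all_products = collect_products(fake_fetch)
print(len(all_products))  # 3
```

The cap keeps a broad query from consuming more requests than intended; raise max_pages when you really need every page.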
Extend the home_depot_results['products'] list with new data from this page:
home_depot_results['products'].extend(results['products'])
# filter_categories = results['filters'][1]
# product_title = results['products'][0]['title']
# product_price = results['products'][0]['price']
# product_rating = results['products'][0]['rating']
# product_reviews = results['products'][0]['reviews']
📌Note: In the comments above, I showed how to extract specific fields. You may have noticed the results['products'][0]. This is the index of a product, which means that we are extracting data from the first product. The results['products'][1] is from the second product and so on.
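Instead of indexing products one by one, the same fields can be pulled from every product in a loop. This is a sketch: summarize is a hypothetical name, and .get is used on the assumption that some products may lack a field such as rating. The first sample item uses values from the output in this post, and the second is made up:

```python
def summarize(products):
    """Return title, price, rating and reviews for every product."""
    return [
        {
            'title': product.get('title'),
            'price': product.get('price'),
            'rating': product.get('rating'),
            'reviews': product.get('reviews'),
        }
        for product in products
    ]


sample_products = [
    {
        'position': 1,
        'title': 'Mardi Gras King Cake Medium Roast Single Serve Cups (54-Pack)',
        'brand': 'Community Coffee',
        'rating': 4.9237,
        'reviews': 131,
        'price': 34.65,
    },
    {'position': 2, 'title': 'Made-up product without reviews', 'price': 25.0},
]

for row in summarize(sample_products):
    print(row)
```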
After all the data is retrieved, it is output in JSON format:
print(json.dumps(home_depot_results, indent=2, ensure_ascii=False))
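If you would rather keep the data than print it, the same dictionary can be written to a file. A minimal sketch using only the standard library follows. The function names and the file name are hypothetical, and a temporary directory is used so the example leaves nothing behind:

```python
import json
import tempfile
from pathlib import Path


def save_results(data, path):
    """Write the collected results to a UTF-8 JSON file."""
    Path(path).write_text(json.dumps(data, indent=2, ensure_ascii=False), encoding='utf-8')


def load_results(path):
    """Read the results back into a Python dict."""
    return json.loads(Path(path).read_text(encoding='utf-8'))


sample_results = {
    'search_information': {'total_results': 1111, 'spelling_fix': 'coffee'},
    'filters': [],
    'products': [{'title': 'Mardi Gras King Cake Medium Roast Single Serve Cups (54-Pack)'}],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / 'home_depot_results.json'
    save_results(sample_results, target)
    restored = load_results(target)

print(restored == sample_results)  # True
```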
Output
{
"search_information": {
"results_state": "Results for spelling fix",
"total_results": 1111,
"spelling_fix": "coffee"
},
"products": [
{
"position": 1,
"product_id": "308728897",
"title": "Mardi Gras King Cake Medium Roast Single Serve Cups (54-Pack)",
"thumbnails": [
[
"https://images.thdstatic.com/productImages/c24e5edc-38bd-4be8-b95e-c37035c158e3/svn/community-coffee-coffee-pods-k-cups-16324-64_65.jpg",
"https://images.thdstatic.com/productImages/c24e5edc-38bd-4be8-b95e-c37035c158e3/svn/community-coffee-coffee-pods-k-cups-16324-64_100.jpg",
"https://images.thdstatic.com/productImages/c24e5edc-38bd-4be8-b95e-c37035c158e3/svn/community-coffee-coffee-pods-k-cups-16324-64_145.jpg",
"https://images.thdstatic.com/productImages/c24e5edc-38bd-4be8-b95e-c37035c158e3/svn/community-coffee-coffee-pods-k-cups-16324-64_300.jpg",
"https://images.thdstatic.com/productImages/c24e5edc-38bd-4be8-b95e-c37035c158e3/svn/community-coffee-coffee-pods-k-cups-16324-64_400.jpg",
"https://images.thdstatic.com/productImages/c24e5edc-38bd-4be8-b95e-c37035c158e3/svn/community-coffee-coffee-pods-k-cups-16324-64_600.jpg",
"https://images.thdstatic.com/productImages/c24e5edc-38bd-4be8-b95e-c37035c158e3/svn/community-coffee-coffee-pods-k-cups-16324-64_1000.jpg"
]
],
"link": "https://www.homedepot.com/p/Community-Coffee-Mardi-Gras-King-Cake-Medium-Roast-Single-Serve-Cups-54-Pack-16324/308728897",
"serpapi_link": "https://serpapi.com/search.json?delivery_zip=04401&engine=home_depot_product&product_id=308728897&store_id=2414",
"model_number": "16324",
"brand": "Community Coffee",
"collection": "https://www.homedepot.com",
"favorite": 10,
"rating": 4.9237,
"reviews": 131,
"price": 34.65,
"unit": "case",
"delivery": {
"free": true,
"free_delivery_threshold": false
},
"pickup": {
"free_ship_to_store": true
}
},
... other products
],
"filters": [
{
"key": "Review Rating",
"value": [
{
"name": "5",
"count": "519",
"value": "bwo5q",
"link": "https://www.homedepot.com/b/Best-Rated/N-5yc1vZbwo5q/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "Category",
"value": [
{
"name": "Food & Gifts",
"count": "134",
"value": "cigl",
"link": "https://www.homedepot.com/b/Food-Gifts/N-5yc1vZcigl/Ntk-elastic/Ntt-coffee?NCNI-5"
},
... other results
]
},
{
"key": "Get It Fast",
"value": [
{
"name": "Pick Up Today",
"count": "16",
"value": "1z175a5",
"link": "https://www.homedepot.com/b/Pick-Up-Today/N-5yc1vZ1z175a5/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "Brand",
"value": [
{
"name": "Victor Allen's",
"count": "55",
"value": "nig",
"link": "https://www.homedepot.com/b/Victor-Allens/N-5yc1vZnig/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "Price",
"value": [
{
"name": "$10 - $20",
"count": "4",
"value": "12ky",
"link": "https://www.homedepot.com/b/N-5yc1vZ12ky/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "Coffee/Tea Type",
"value": [
{
"name": "Pods/K cups",
"count": "92",
"value": "1z0jm9l",
"link": "https://www.homedepot.com/b/Pods-K-cups/N-5yc1vZ1z0jm9l/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "Brand Compatibility",
"value": [
{
"name": "Keurig",
"count": "71",
"value": "1z0knhy",
"link": "https://www.homedepot.com/b/Keurig/N-5yc1vZ1z0knhy/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "Flavor",
"value": [
{
"name": "Variety Pack",
"count": "12",
"value": "1z0wjbu",
"link": "https://www.homedepot.com/b/Variety-Pack/N-5yc1vZ1z0wjbu/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "Package Quantity",
"value": [
{
"name": "80",
"count": "26",
"value": "1z0w7jy",
"link": "https://www.homedepot.com/b/80/N-5yc1vZ1z0w7jy/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "Coffee Appliance Type",
"value": [
{
"name": "Other",
"count": "39",
"value": "1z1ab3g",
"link": "https://www.homedepot.com/b/Other/N-5yc1vZ1z1ab3g/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "Collection Name",
"value": [
{
"name": "Coffee Pods & K-Cups",
"count": "43",
"value": "1z1usiw",
"link": "https://www.homedepot.com/b/Coffee-Pods-K-Cups/N-5yc1vZ1z1usiw/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "Number of Cups",
"value": [
{
"name": "1 cup",
"count": "30",
"value": "1z1b0v5",
"link": "https://www.homedepot.com/b/1-cup/N-5yc1vZ1z1b0v5/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "Product volume (fl. oz.)",
"value": [
{
"name": "8",
"count": "50",
"value": "1z1bpch",
"link": "https://www.homedepot.com/b/8/N-5yc1vZ1z1bpch/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "Capacity (fl. oz.)",
"value": [
{
"name": "12 fl. oz.",
"count": "19",
"value": "1z1b105",
"link": "https://www.homedepot.com/b/12-fl-oz/N-5yc1vZ1z1b105/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "Diet & Allergens",
"value": [
{
"name": "Not Applicable",
"count": "124",
"value": "1z1bjkr",
"link": "https://www.homedepot.com/b/Not-Applicable/N-5yc1vZ1z1bjkr/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "Small Appliances Color Family",
"value": [
{
"name": "Black",
"count": "72",
"value": "1z1ab15",
"link": "https://www.homedepot.com/b/Black/N-5yc1vZ1z1ab15/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "New Arrival",
"value": [
{
"name": "Recently Added",
"count": "70",
"value": "1z179pc",
"link": "https://www.homedepot.com/b/Recently-Added/N-5yc1vZ1z179pc/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
},
{
"key": "Subscription Eligible",
"value": [
{
"name": "Subscription Eligible",
"count": "5",
"value": "1z18amw",
"link": "https://www.homedepot.com/b/Subscription-Eligible/N-5yc1vZ1z18amw/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
},
... other results
]
}
]
}
📌Note: Head to the playground for a live and interactive demo.
Links
- Code in the online IDE
- The Home Depot Search API
- The Home Depot Filtering API
- The Home Depot Price Bound API
- The Home Depot Sorting API
- The Home Depot Spell Check API
Add a Feature Request💫 or a Bug🐞