How to enrich product listings with Linkup

Automate catalog enrichment by pulling specifications from manufacturer sites, gathering competitive pricing, extracting images, and compiling product data from authoritative sources.

In This Guide

Catalog Enrichment

Pull specs, images, and descriptions from manufacturer sources

Competitive Pricing

Track competitor pricing across marketplaces

Attribute Extraction

Extract structured attributes for faceted search

Review Aggregation

Gather ratings and reviews from across the web

Product Matching

Identify same products across different sources

Missing Data Gap-Fill

Fill gaps in existing listings

Overview

Product listing quality directly impacts conversion rates, search rankings, and customer trust. Yet maintaining rich, accurate listings across thousands of SKUs is a massive operational challenge. Linkup's agentic search can automate much of that enrichment work: it locates the manufacturer's page, extracts specifications and images, gathers competitive pricing, and returns the data in a structure that maps to your catalog schema.

Why Linkup for product enrichment?

deep search

Executes "find manufacturer page → scrape specs" workflows automatically

structuredOutput

Returns data mapped directly to your catalog schema

agentic retrieval

Navigates product pages, datasheets, and PDFs to extract structured attributes

scale with consistency

Enrich at scale with consistent, structured outputs

Configuration

Recommended settings for product enrichment

Parameter	Value	Why
`depth`	`deep`	Enrichment requires finding source pages, then scraping detailed specs
`outputType`	`structuredOutput`	Returns attributes in your exact catalog schema
`includeDomains`	list of domains (optional)	Restrict to manufacturer sites or trusted sources
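As a minimal sketch, the settings in the table can be collected into a request body like the one below. The helper name `build_search_payload` is hypothetical. The field names and values (`q`, `depth`, `outputType`, `structuredOutputSchema`, `includeDomains`) are copied from the sample integration code in this guide and have not been verified independently, so check them against the current API reference before relying on them.

```python
import json
from typing import Optional


def build_search_payload(prompt: str, schema: dict,
                         include_domains: Optional[list] = None) -> dict:
    """Assemble a request body using the recommended enrichment settings."""
    payload = {
        "q": prompt,
        "depth": "deep",                   # find source pages, then scrape them
        "outputType": "structuredOutput",  # value as written in this guide
        "structuredOutputSchema": json.dumps(schema),  # schema sent as a JSON string
    }
    # includeDomains is optional: only add it when a restriction is wanted.
    if include_domains:
        payload["includeDomains"] = list(include_domains)
    return payload


# Placeholder prompt, schema, and domain for illustration only.
payload = build_search_payload(
    "Find the official spec page for ExampleBrand X100.",
    {"type": "object", "properties": {"official_name": {"type": "string"}}},
    include_domains=["example.com"],
)
print(json.dumps(payload, indent=2))
```

Building the body in one place keeps the three recommended settings consistent across every enrichment call.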

Use Cases

Practical examples with prompts and schemas

Catalog Enrichment from Manufacturer Sources

Pull official specifications, descriptions, and images from manufacturer websites.

Prompt: Manufacturer Enrichment

You are a product data specialist enriching an e-commerce catalog.

Product: {product_name}
Brand: {brand}
Model/SKU: {model_number}
Current listing URL (ours): {our_listing_url}

Execute the following steps:

1. Search for the official {brand} product page for model {model_number}.

2. Once found, scrape the manufacturer's product page to extract:
   - Official product name
   - Marketing description
   - Complete technical specifications
   - All product image URLs
   - Product dimensions and weight
   - Key features list
   - Warranty information

3. Search for the product datasheet or spec sheet PDF for {model_number}. If found, extract any additional specifications not on the main product page.

4. Search for {brand} {model_number} on the brand's support/documentation page to find:
   - User manual PDF link
   - Installation guide link
   - Compatibility information

Return only official manufacturer data. Do not include third-party descriptions or reviews.
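The prompt asks for official manufacturer data only, and a local check on the returned URLs can act as a safety net. The sketch below is one possible approach, not part of Linkup: the helper names are hypothetical, and the result keys (`manufacturer_url`, `datasheet_url`, `images`) are assumed to follow the schema used in the sample integration code in this guide.

```python
from urllib.parse import urlparse


def is_on_domain(url: str, official_domain: str) -> bool:
    """True if the URL's host is official_domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    official_domain = official_domain.lower()
    return host == official_domain or host.endswith("." + official_domain)


def non_official_urls(result: dict, official_domain: str) -> list:
    """Collect URLs in an enrichment result that are not on the brand's domain."""
    urls = [result.get("manufacturer_url"), result.get("datasheet_url")]
    urls += result.get("images", [])
    return [u for u in urls if u and not is_on_domain(u, official_domain)]


# Made-up result for illustration.
sample = {
    "manufacturer_url": "https://www.example.com/products/x100",
    "datasheet_url": "https://cdn.example.com/docs/x100.pdf",
    "images": ["https://images.reseller.test/x100.jpg"],
}
print(non_official_urls(sample, "example.com"))
```

Manufacturers often serve images from a separate CDN domain, so a flagged URL is a reason to review the record rather than proof that the data is third-party.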

Competitive Price Monitoring

Track competitor pricing across marketplaces to inform your pricing strategy.

Prompt: Price Monitoring

You are a pricing analyst monitoring competitor listings for a specific product.

Product: {product_name}
Brand: {brand}
Model/SKU: {model_number}
UPC/EAN (if known): {upc}
Our current price: {our_price}

Execute the following steps:

1. Search for this exact product on major retail and marketplace sites:
   - Amazon
   - Walmart
   - Target
   - Best Buy
   - Home Depot (if applicable)
   - Category-specific marketplaces

2. For each listing found, scrape the product page to extract:
   - Seller/retailer name
   - Current price
   - Original price (if on sale)
   - Shipping cost or free shipping threshold
   - Stock availability
   - Listing URL
   - Any active promotions or coupons mentioned

3. Verify each listing matches our product by confirming model number or UPC.

Only include listings for the exact product—not similar or compatible items.
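Once listings come back, they can be compared with your own price. The following is a rough sketch under the assumption that each listing carries `retailer`, `price`, and `in_stock` keys, as in the price monitoring schema of the sample integration code; the function name and the sample data are hypothetical.

```python
def summarize_competitor_prices(our_price: float, listings: list) -> dict:
    """Compare our price with in-stock competitor listings."""
    priced = [
        item for item in listings
        if item.get("in_stock") and isinstance(item.get("price"), (int, float))
    ]
    if not priced:
        return {"lowest": None, "undercut_by": []}
    lowest = min(priced, key=lambda item: item["price"])
    # Retailers that are currently cheaper than we are.
    undercut = sorted(item["retailer"] for item in priced if item["price"] < our_price)
    return {
        "lowest": {"retailer": lowest["retailer"], "price": lowest["price"]},
        "undercut_by": undercut,
    }


# Made-up listings for illustration.
listings = [
    {"retailer": "Retailer A", "price": 189.99, "in_stock": True},
    {"retailer": "Retailer B", "price": 204.50, "in_stock": True},
    {"retailer": "Retailer C", "price": 179.00, "in_stock": False},
]
summary = summarize_competitor_prices(199.99, listings)
print(summary)
```

Out-of-stock listings are ignored here on the assumption that an unavailable offer does not compete on price; drop that filter if your pricing rules differ.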

Product Attribute Extraction for Faceted Search

Extract structured attributes to power filters and faceted navigation.

Prompt: Attribute Extraction

You are a catalog data specialist extracting product attributes for search and filtering.

Product category: {category}
Product: {product_name}
Brand: {brand}
Model: {model_number}

Your goal is to extract all filterable attributes for this {category} product.

1. Find and scrape the manufacturer's product page for {brand} {model_number}.

2. Extract these category-specific attributes:
{attribute_list_for_category}

3. Search for the product datasheet to find any technical specifications not listed on the main page.

4. Normalize all values to standard units and formats:
   - Dimensions in inches (convert from cm if needed)
   - Weight in pounds
   - Colors as standard color names
   - Capacities in standard units

Return structured attributes ready for faceted search implementation.
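Step 4 of the prompt asks for normalized units. If you also want to normalize or double-check values on your side, a small converter is enough. The sketch below is illustrative only: the function names and the set of accepted unit spellings are assumptions, and the two-decimal rounding is an arbitrary choice.

```python
CM_PER_INCH = 2.54
KG_PER_LB = 0.45359237  # definition of the international pound
OZ_PER_LB = 16


def to_inches(value: float, unit: str) -> float:
    """Convert a length to inches, rounded to two decimals."""
    unit = unit.strip().lower()
    if unit in ("in", "inch", "inches"):
        inches = value
    elif unit in ("cm", "centimeter", "centimeters"):
        inches = value / CM_PER_INCH
    elif unit in ("mm", "millimeter", "millimeters"):
        inches = value / (CM_PER_INCH * 10)
    else:
        raise ValueError(f"unsupported length unit: {unit!r}")
    return round(inches, 2)


def to_pounds(value: float, unit: str) -> float:
    """Convert a weight to pounds, rounded to two decimals."""
    unit = unit.strip().lower()
    if unit in ("lb", "lbs", "pound", "pounds"):
        pounds = value
    elif unit in ("kg", "kilogram", "kilograms"):
        pounds = value / KG_PER_LB
    elif unit in ("g", "gram", "grams"):
        pounds = value / (KG_PER_LB * 1000)
    elif unit in ("oz", "ounce", "ounces"):
        pounds = value / OZ_PER_LB
    else:
        raise ValueError(f"unsupported weight unit: {unit!r}")
    return round(pounds, 2)


print(to_inches(25.4, "cm"), to_pounds(2, "kg"))
```

Raising on unknown units, rather than passing the value through, keeps unnormalized data out of your facets.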

Review & Rating Aggregation

Gather social proof from across the web for new or thin listings.

Prompt: Review Aggregation

You are a product research analyst gathering review data from across the web.

Product: {product_name}
Brand: {brand}
Model: {model_number}

Execute the following steps:

1. Search for {brand} {model_number} reviews on major retail sites and find the aggregate rating and review count from each source.

2. Search for {brand} {model_number} reviews on specialty review sites relevant to this product category (e.g., RTINGS, Wirecutter, CNET, Tom's Guide, etc.).

3. Search for {brand} {model_number} on YouTube to find video reviews. Extract:
   - Channel name
   - Video title
   - View count
   - Overall sentiment (positive/negative/mixed)

4. Search for Reddit discussions about {brand} {model_number} to understand real user sentiment.

Compile an overall sentiment summary based on patterns across sources. Note any consistent praise or complaints.
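Aggregate ratings from several sources can be combined into one figure by weighting each source by its review count. This is a minimal sketch: the keys `rating`, `scale`, and `review_count` are hypothetical field names you would define in your own schema, and rescaling everything to a 5-point scale is an assumption.

```python
def aggregate_rating(sources: list) -> dict:
    """Review-count-weighted average rating, rescaled to a 5-point scale."""
    total_reviews = sum(s["review_count"] for s in sources)
    if total_reviews == 0:
        return {"average": None, "review_count": 0}
    weighted = sum(
        s["rating"] / s["scale"] * 5 * s["review_count"] for s in sources
    )
    return {
        "average": round(weighted / total_reviews, 2),
        "review_count": total_reviews,
    }


# Made-up per-source data for illustration.
sources = [
    {"source": "Retailer A", "rating": 4.5, "scale": 5, "review_count": 200},
    {"source": "Retailer B", "rating": 8.0, "scale": 10, "review_count": 100},
]
combined = aggregate_rating(sources)
print(combined)
```

Weighting by review count stops a source with a handful of reviews from moving the combined score as much as one with hundreds.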

Product Matching & Deduplication

Identify the same product across different sources with varying names/identifiers.

Prompt: Product Matching

You are a product data specialist identifying duplicate and matching products.

Source product:
- Name: {source_product_name}
- Brand: {brand}
- Identifiers available: {identifiers} (e.g., UPC, MPN, ASIN, etc.)
- Key specs: {key_specs}

Execute the following steps:

1. Search for this product using each available identifier to find listings on other platforms.

2. For ambiguous matches (same brand, similar name, but different identifiers), scrape both product pages and compare:
   - Exact dimensions
   - Weight
   - Key distinguishing specifications
   - Package contents

3. Search for "{brand} {product_name} vs" to find comparison articles that might clarify model differences.

4. Determine match confidence:
   - EXACT: Same UPC/EAN or identical specs
   - LIKELY: Same brand + model with minor naming variations
   - POSSIBLE: Similar product, needs manual review
   - DIFFERENT: Confirmed different product

Return match candidates with confidence levels.
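The four confidence levels in step 4 can also be applied locally as a first-pass check on returned candidates. The sketch below is a simplification under stated assumptions: it treats a shared UPC as EXACT, a different brand as DIFFERENT, and a model number that matches after stripping punctuation and case as LIKELY. It does not compare full specs, and the dictionary keys (`brand`, `model`, `upc`) are hypothetical.

```python
def normalize_model(model: str) -> str:
    """Lowercase and drop punctuation/spaces so 'X-100' and 'x100' compare equal."""
    return "".join(ch for ch in model.lower() if ch.isalnum())


def match_confidence(source: dict, candidate: dict) -> str:
    """Classify a candidate as EXACT, LIKELY, POSSIBLE, or DIFFERENT."""
    src_upc, cand_upc = source.get("upc"), candidate.get("upc")
    if src_upc and cand_upc and src_upc == cand_upc:
        return "EXACT"
    # Assumption: a different brand is treated as a confirmed different product.
    if source.get("brand", "").lower() != candidate.get("brand", "").lower():
        return "DIFFERENT"
    src_model = normalize_model(source.get("model", ""))
    cand_model = normalize_model(candidate.get("model", ""))
    if src_model and src_model == cand_model:
        return "LIKELY"
    return "POSSIBLE"  # same brand, unclear model: send to manual review


# Made-up product for illustration.
src = {"brand": "ExampleBrand", "model": "X-100", "upc": "012345678905"}
print(match_confidence(src, {"brand": "ExampleBrand", "model": "X100"}))
```

Differing UPCs are deliberately not treated as DIFFERENT, since the prompt itself routes "different identifiers" cases to a spec comparison.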

Missing Data Gap-Fill

Identify and fill gaps in existing listings that hurt search visibility or conversion.

Prompt: Gap Fill

You are a catalog quality analyst identifying and filling data gaps.

Product: {product_name}
Brand: {brand}
Model: {model_number}

Current listing data:
{current_listing_json}

Missing or empty fields that need enrichment:
{missing_fields_list}

Execute the following steps:

1. Find and scrape the official {brand} product page for {model_number}.

2. For each missing field, extract the value from the manufacturer source:
{for_each_missing_field_instruction}

3. If any fields cannot be found on the main product page, search for the product datasheet or specifications PDF.

4. For fields still missing after steps 1-3, search for this product on major retailers and extract the missing data from the most authoritative listing.

Return only the fields that were missing, with their values and source URLs.
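The `{missing_fields_list}` placeholder has to come from somewhere, and the returned values have to be merged back. A minimal sketch of both steps follows. The helper names are hypothetical, and the `field`/`value`/`source_url` item shape is assumed to match the gap-fill schema in the sample integration code in this guide.

```python
EMPTY_VALUES = (None, "", [], {})


def find_missing_fields(listing: dict, required: list) -> list:
    """Return the required fields that are absent or empty in a listing."""
    return [name for name in required if listing.get(name) in EMPTY_VALUES]


def apply_enrichment(listing: dict, enriched_fields: list) -> dict:
    """Merge enriched values into a copy of the listing, filling gaps only."""
    updated = dict(listing)
    for item in enriched_fields:
        name = item["field"]
        if updated.get(name) in EMPTY_VALUES:  # never overwrite existing data
            updated[name] = item["value"]
    return updated


# Made-up listing and enrichment result for illustration.
listing = {"title": "X100 Charger", "description": "", "weight": None, "color": "Black"}
missing = find_missing_fields(listing, ["title", "description", "weight", "color"])
enriched = [
    {"field": "weight", "value": "0.4 lb", "source_url": "https://www.example.com/x100"},
    {"field": "color", "value": "Grey", "source_url": "https://www.example.com/x100"},
]
updated = apply_enrichment(listing, enriched)
print(missing, updated)
```

Filling only empty fields means a stray value in the response cannot silently replace data your team has already curated.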

Best Practices

✓ Do's

✓ Always include model numbers and UPCs: these are the most reliable identifiers for finding exact matches
✓ Prioritize manufacturer sources: official product pages are the authoritative source for specs and descriptions
✓ Use `deep` depth with explicit scrape instructions: product specs live on detail pages, not in search snippets
✓ Request image URLs in your schema: product imagery is often the most valuable enrichment
✓ Normalize units in your prompt: tell Linkup to convert to your standard units (inches, pounds, etc.)
✓ Use `includeDomains` for price monitoring: restrict to specific competitor sites for cleaner results

✗ Don'ts

✗ Don't rely on product names alone: names vary widely across retailers, so always use model numbers
✗ Don't skip verification: cross-reference identifiers to avoid enriching a listing with another product's data
✗ Don't use `standard` depth for spec extraction: you need the find-then-scrape pattern
✗ Don't mix official specs with user-generated content: keep manufacturer data separate from reviews

Integration Patterns

Batch Catalog Enrichment

For initial catalog build or bulk updates:

1. Export SKUs needing enrichment (missing specs, no images, thin descriptions)
2. Queue enrichment jobs with brand + model number
3. Call Linkup API for each SKU with category-specific schema
4. Validate returned data against category rules
5. Import enriched data to PIM/catalog system
6. Flag low-confidence enrichments for manual review
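The validation and flagging steps in this pattern might look like the sketch below. What counts as a "category rule" is up to you; here it is assumed to be a list of required specification keys. The function name is hypothetical, and the result keys (`official_name`, `images`, `specifications`) are assumed to follow the enrichment schema in the sample integration code.

```python
def validate_enrichment(result: dict, required_specs: list) -> dict:
    """Check an enrichment result against simple per-category rules."""
    problems = []
    if not result.get("official_name"):
        problems.append("missing official_name")
    if not result.get("images"):
        problems.append("no images")
    specs = result.get("specifications") or {}
    for key in required_specs:
        if not specs.get(key):
            problems.append(f"missing spec: {key}")
    # Anything with problems should go to manual review instead of the PIM.
    return {"ok": not problems, "problems": problems}


# Made-up result and rule set for illustration.
result = {
    "official_name": "X100 Charger",
    "images": [],
    "specifications": {"wattage": "65 W"},
}
report = validate_enrichment(result, ["wattage", "voltage"])
print(report)
```

Returning the list of problems, not just a pass/fail flag, gives reviewers a starting point for each flagged SKU.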

New Product Onboarding

For products added to catalog:

1. Trigger on new SKU creation
2. Call Linkup with manufacturer + model
3. Auto-populate catalog fields from structuredOutput
4. Fetch and store product images
5. Queue for QA review before publishing

Competitive Price Monitoring

For ongoing price intelligence:

1. Schedule daily/weekly price checks for priority SKUs
2. Call Linkup with competitor site restrictions
3. Compare returned prices against current pricing
4. Alert on significant competitor price changes
5. Feed data into pricing optimization system
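Alerting on significant changes requires comparing two price snapshots. The sketch below assumes each snapshot is a mapping of retailer name to price that you store between runs. The function name, the snapshot shape, and the 5% default threshold are all illustrative assumptions.

```python
def price_change_alerts(previous: dict, current: dict,
                        threshold_pct: float = 5.0) -> list:
    """Return (retailer, old, new, change_pct) for moves beyond the threshold."""
    alerts = []
    for retailer, new_price in sorted(current.items()):
        old_price = previous.get(retailer)
        if not old_price:
            continue  # no baseline yet for this retailer
        change_pct = (new_price - old_price) / old_price * 100
        if abs(change_pct) >= threshold_pct:
            alerts.append((retailer, old_price, new_price, round(change_pct, 1)))
    return alerts


# Made-up snapshots for illustration.
previous = {"Retailer A": 200.0, "Retailer B": 150.0}
current = {"Retailer A": 180.0, "Retailer B": 151.0, "Retailer C": 99.0}
alerts = price_change_alerts(previous, current)
print(alerts)
```

A percentage threshold scales across cheap and expensive SKUs better than a fixed currency amount, though you may want both.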

Catalog Quality Scoring

For maintaining catalog health:

1. Score existing listings on completeness
2. Identify SKUs below quality threshold
3. Batch enrich missing fields via Linkup
4. Re-score and track improvement
5. Report on catalog quality trends
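A completeness score can be as simple as a weighted share of populated fields. In the sketch below, the field names, the weights, and the threshold of 70 are placeholders chosen for illustration, not recommendations from Linkup.

```python
# Hypothetical weights: fields that matter more for conversion count more.
FIELD_WEIGHTS = {
    "title": 1,
    "description": 2,
    "images": 3,
    "specifications": 3,
    "upc": 1,
}


def completeness_score(listing: dict, weights: dict = FIELD_WEIGHTS) -> float:
    """Score a listing from 0 to 100 by the weighted share of populated fields."""
    total = sum(weights.values())
    earned = sum(w for name, w in weights.items() if listing.get(name))
    return round(100 * earned / total, 1)


def below_threshold(catalog: dict, threshold: float = 70.0) -> list:
    """Return the SKUs whose score falls below the quality threshold."""
    return sorted(
        sku for sku, listing in catalog.items()
        if completeness_score(listing) < threshold
    )


# Made-up catalog for illustration.
catalog = {
    "SKU-1": {"title": "X100", "description": "Fast charger", "upc": "012345678905"},
    "SKU-2": {"title": "X200", "description": "Dock", "images": ["a.jpg"],
              "specifications": {"ports": "4"}, "upc": "012345678912"},
}
print(completeness_score(catalog["SKU-1"]), below_threshold(catalog))
```

Running the same function before and after a batch enrichment gives the "re-score and track improvement" step a consistent baseline.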

Pro Tip

For high-volume enrichment, consider implementing a queue system with rate limiting to stay within API limits while maximizing throughput.
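One simple way to realize this tip is a queue drained at a fixed maximum rate. The sketch below is a single-threaded illustration: `process_queue` and its parameters are hypothetical, the handler is a stand-in for your real enrichment call, and the rate used is a placeholder, since actual API limits depend on your plan.

```python
import time
from collections import deque


def process_queue(jobs, handler, max_per_second: float = 2.0,
                  sleep=time.sleep, clock=time.monotonic) -> list:
    """Run handler on each job, spacing calls at least 1/max_per_second apart."""
    min_interval = 1.0 / max_per_second
    queue = deque(jobs)
    results = []
    last_call = None
    while queue:
        job = queue.popleft()
        if last_call is not None:
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)  # pause just long enough to respect the rate
        last_call = clock()
        results.append(handler(job))
    return results


def fake_enrich(sku: str) -> dict:
    """Stand-in for a real enrichment call; returns a dummy record."""
    return {"sku": sku, "status": "enriched"}


# Placeholder rate of 50 calls per second keeps this demo fast.
done = process_queue(["SKU-1", "SKU-2", "SKU-3"], fake_enrich, max_per_second=50)
print(done)
```

The `sleep` and `clock` parameters exist so the pacing logic can be tested without real delays. For production volumes, a task queue with retries and backoff on rate-limit responses is likely a better fit.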

Sample Integration Code

product_enrichment.py

import requests
import json

def enrich_product_listing(
    brand: str,
    model_number: str,
    category: str,
    category_attributes: list,
    api_key: str
) -> dict:
    """
    Enrich a product listing from manufacturer sources
    """

    attributes_str = "\n".join([f"   - {attr}" for attr in category_attributes])

    prompt = f"""
    You are a product data specialist enriching an e-commerce catalog.

    Brand: {brand}
    Model: {model_number}
    Category: {category}

    1. Find and scrape the official {brand} product page for model {model_number}.

    2. Extract:
       - Official product name
       - Marketing description
       - All product image URLs
       - Complete specifications including:
{attributes_str}

    3. Search for the product datasheet PDF and extract any additional specs.

    Return only official manufacturer data.
    """

    schema = {
        "type": "object",
        "properties": {
            "brand": {"type": "string"},
            "model": {"type": "string"},
            "official_name": {"type": "string"},
            "description": {"type": "string"},
            "manufacturer_url": {"type": "string"},
            "images": {
                "type": "array",
                "items": {"type": "string"}
            },
            "specifications": {
                "type": "object",
                "additionalProperties": {"type": "string"}
            },
            "datasheet_url": {"type": "string"}
        }
    }

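    response = requests.post(
        "https://api.linkup.so/v1/search",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "q": prompt,
            "depth": "deep",
            "outputType": "structuredOutput",
            # The schema is sent as a JSON-encoded string
            "structuredOutputSchema": json.dumps(schema)
        },
        timeout=120  # avoid hanging indefinitely; adjust to your needs
    )
    # Raise on HTTP errors instead of parsing an error body as data
    response.raise_for_status()

    return response.json()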


def monitor_competitor_prices(
    product_name: str,
    model_number: str,
    upc: str,
    competitor_domains: list,
    api_key: str
) -> dict:
    """
    Monitor competitor pricing for a specific product
    """

    prompt = f"""
    You are a pricing analyst monitoring competitor listings.

    Product: {product_name}
    Model: {model_number}
    UPC: {upc}

    Search for this exact product on competitor sites and scrape each listing for:
    - Current price
    - Original price (if on sale)
    - Stock availability
    - Active promotions

    Only include listings matching the model number or UPC.
    """

    schema = {
        "type": "object",
        "properties": {
            "product": {"type": "string"},
            "model": {"type": "string"},
            "listings": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "retailer": {"type": "string"},
                        "url": {"type": "string"},
                        "price": {"type": "number"},
                        "original_price": {"type": "number"},
                        "in_stock": {"type": "boolean"},
                        "promotions": {"type": "array", "items": {"type": "string"}}
                    }
                }
            }
        }
    }

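    response = requests.post(
        "https://api.linkup.so/v1/search",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "q": prompt,
            "depth": "deep",
            "outputType": "structuredOutput",
            "structuredOutputSchema": json.dumps(schema),
            # Restrict the search to the competitor sites being tracked
            "includeDomains": competitor_domains
        },
        timeout=120  # avoid hanging indefinitely; adjust to your needs
    )
    # Raise on HTTP errors instead of parsing an error body as data
    response.raise_for_status()

    return response.json()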


def fill_catalog_gaps(
    brand: str,
    model_number: str,
    current_data: dict,
    missing_fields: list,
    api_key: str
) -> dict:
    """
    Fill missing fields in an existing listing
    """

    fields_str = ", ".join(missing_fields)

    prompt = f"""
    You are a catalog quality analyst filling data gaps.

    Brand: {brand}
    Model: {model_number}

    Fields missing from our listing: {fields_str}

    1. Scrape the official {brand} product page for {model_number}.
    2. Extract values for these specific missing fields: {fields_str}
    3. If not found on main page, search for the product datasheet.

    Return only the missing fields with their values and source URLs.
    """

    schema = {
        "type": "object",
        "properties": {
            "enriched_fields": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "field": {"type": "string"},
                        "value": {"type": "string"},
                        "source_url": {"type": "string"}
                    }
                }
            },
            "still_missing": {
                "type": "array",
                "items": {"type": "string"}
            }
        }
    }

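    response = requests.post(
        "https://api.linkup.so/v1/search",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "q": prompt,
            "depth": "deep",
            "outputType": "structuredOutput",
            "structuredOutputSchema": json.dumps(schema)
        },
        timeout=120  # avoid hanging indefinitely; adjust to your needs
    )
    # Raise on HTTP errors instead of parsing an error body as data
    response.raise_for_status()

    return response.json()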
