How to enrich product listings with Linkup
Automate catalog enrichment by pulling specifications from manufacturer sites, gathering competitive pricing, extracting images, and compiling product data from authoritative sources.
In This Guide
- Catalog Enrichment: Pull specs, images, and descriptions from manufacturer sources
- Competitive Pricing: Track competitor pricing across marketplaces
- Attribute Extraction: Extract structured attributes for faceted search
- Review Aggregation: Gather ratings and reviews from across the web
- Product Matching: Identify the same products across different sources
- Missing Data Gap-Fill: Fill gaps in existing listings
Overview
Product listing quality directly impacts conversion rates, search rankings, and customer trust. Yet maintaining rich, accurate listings across thousands of SKUs is a massive operational challenge. Linkup's agentic search can automate the enrichment process—pulling specifications from manufacturer sites, gathering competitive pricing, extracting images, and compiling product data from authoritative sources.
Why Linkup for product enrichment?
- deep search: Executes "find manufacturer page → scrape specs" workflows automatically
- structuredOutput: Returns data mapped directly to your catalog schema
- agentic retrieval: Navigates product pages, datasheets, and PDFs to extract structured attributes
- scale with consistency: Enriches thousands of SKUs with consistent, structured outputs
Configuration
Recommended settings for product enrichment
| Parameter | Value | Why |
|---|---|---|
| depth | deep | Enrichment requires finding source pages, then scraping detailed specs |
| outputType | structuredOutput | Returns attributes in your exact catalog schema |
| includeDomains | optional | Restrict to manufacturer sites or trusted sources |
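As a minimal sketch, the settings in the table can be assembled into a request body like this. The helper name `build_search_payload` is hypothetical; the field names (`q`, `depth`, `outputType`, `structuredOutputSchema`, `includeDomains`) are the ones used in the sample integration code in this guide, which also passes the schema as a JSON string.

```python
import json
from typing import Optional


def build_search_payload(
    prompt: str,
    schema: dict,
    include_domains: Optional[list] = None,
) -> dict:
    """Assemble a request body using the recommended enrichment settings."""
    payload = {
        "q": prompt,
        "depth": "deep",
        "outputType": "structuredOutput",
        # The schema is serialized to a JSON string, mirroring the sample code.
        "structuredOutputSchema": json.dumps(schema),
    }
    # includeDomains is optional, so it is only added when domains are given.
    if include_domains:
        payload["includeDomains"] = list(include_domains)
    return payload
```

Keeping the payload in one helper means every use case below shares the same depth and output settings, and only the prompt, schema, and domain list vary.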
Use Cases
Practical examples with prompts and schemas
Catalog Enrichment from Manufacturer Sources
Pull official specifications, descriptions, and images from manufacturer websites.
You are a product data specialist enriching an e-commerce catalog.
Product: {product_name}
Brand: {brand}
Model/SKU: {model_number}
Current listing URL (ours): {our_listing_url}
Execute the following steps:
1. Search for the official {brand} product page for model {model_number}.
2. Once found, scrape the manufacturer's product page to extract:
- Official product name
- Marketing description
- Complete technical specifications
- All product image URLs
- Product dimensions and weight
- Key features list
- Warranty information
3. Search for the product datasheet or spec sheet PDF for {model_number}. If found, extract any additional specifications not on the main product page.
4. Search for {brand} {model_number} on the brand's support/documentation page to find:
- User manual PDF link
- Installation guide link
- Compatibility information
Return only official manufacturer data. Do not include third-party descriptions or reviews.
Competitive Price Monitoring
Track competitor pricing across marketplaces to inform your pricing strategy.
You are a pricing analyst monitoring competitor listings for a specific product.
Product: {product_name}
Brand: {brand}
Model/SKU: {model_number}
UPC/EAN (if known): {upc}
Our current price: {our_price}
Execute the following steps:
1. Search for this exact product on major retail and marketplace sites:
- Amazon
- Walmart
- Target
- Best Buy
- Home Depot (if applicable)
- Category-specific marketplaces
2. For each listing found, scrape the product page to extract:
- Seller/retailer name
- Current price
- Original price (if on sale)
- Shipping cost or free shipping threshold
- Stock availability
- Listing URL
- Any active promotions or coupons mentioned
3. Verify each listing matches our product by confirming model number or UPC.
Only include listings for the exact product, not similar or compatible items.
Product Attribute Extraction for Faceted Search
Extract structured attributes to power filters and faceted navigation.
You are a catalog data specialist extracting product attributes for search and filtering.
Product category: {category}
Product: {product_name}
Brand: {brand}
Model: {model_number}
Your goal is to extract all filterable attributes for this {category} product.
1. Find and scrape the manufacturer's product page for {brand} {model_number}.
2. Extract these category-specific attributes:
{attribute_list_for_category}
3. Search for the product datasheet to find any technical specifications not listed on the main page.
4. Normalize all values to standard units and formats:
- Dimensions in inches (convert from cm if needed)
- Weight in pounds
- Colors as standard color names
- Capacities in standard units
Return structured attributes ready for faceted search implementation.
Review & Rating Aggregation
Gather social proof from across the web for new or thin listings.
You are a product research analyst gathering review data from across the web.
Product: {product_name}
Brand: {brand}
Model: {model_number}
Execute the following steps:
1. Search for {brand} {model_number} reviews on major retail sites and find the aggregate rating and review count from each source.
2. Search for {brand} {model_number} reviews on specialty review sites relevant to this product category (e.g., RTINGS, Wirecutter, CNET, Tom's Guide, etc.).
3. Search for {brand} {model_number} on YouTube to find video reviews. Extract:
- Channel name
- Video title
- View count
- Overall sentiment (positive/negative/mixed)
4. Search for Reddit discussions about {brand} {model_number} to understand real user sentiment.
Compile an overall sentiment summary based on patterns across sources. Note any consistent praise or complaints.
Product Matching & Deduplication
Identify the same product across different sources with varying names/identifiers.
You are a product data specialist identifying duplicate and matching products.
Source product:
- Name: {source_product_name}
- Brand: {brand}
- Identifiers available: {identifiers} (e.g., UPC, MPN, ASIN, etc.)
- Key specs: {key_specs}
Execute the following steps:
1. Search for this product using each available identifier to find listings on other platforms.
2. For ambiguous matches (same brand, similar name, but different identifiers), scrape both product pages and compare:
- Exact dimensions
- Weight
- Key distinguishing specifications
- Package contents
3. Search for "{brand} {product_name} vs" to find comparison articles that might clarify model differences.
4. Determine match confidence:
- EXACT: Same UPC/EAN or identical specs
- LIKELY: Same brand + model with minor naming variations
- POSSIBLE: Similar product, needs manual review
- DIFFERENT: Confirmed different product
Return match candidates with confidence levels.
Missing Data Gap-Fill
Identify and fill gaps in existing listings that hurt search visibility or conversion.
You are a catalog quality analyst identifying and filling data gaps.
Product: {product_name}
Brand: {brand}
Model: {model_number}
Current listing data:
{current_listing_json}
Missing or empty fields that need enrichment:
{missing_fields_list}
Execute the following steps:
1. Find and scrape the official {brand} product page for {model_number}.
2. For each missing field, extract the value from the manufacturer source:
{for_each_missing_field_instruction}
3. If any fields cannot be found on the main product page, search for the product datasheet or specifications PDF.
4. For fields still missing after steps 1-3, search for this product on major retailers and extract the missing data from the most authoritative listing.
Return only the fields that were missing, with their values and source URLs.
Best Practices
✓ Do's
- ✓ Always include model numbers and UPCs: These are the most reliable identifiers for finding exact matches
- ✓ Prioritize manufacturer sources: Official product pages are the authoritative source for specs and descriptions
- ✓ Use deep with explicit scrape instructions: Product specs live on detail pages, not search snippets
- ✓ Request image URLs in your schema: Product imagery is often the most valuable enrichment
- ✓ Normalize units in your prompt: Tell Linkup to convert to your standard units (inches, pounds, etc.)
- ✓ Use includeDomains for price monitoring: Restrict to specific competitor sites for cleaner results
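A prompt instruction to normalize units may not always be followed, so a small post-processing check can help. The sketch below is one possible approach, not part of Linkup: the helper names `to_inches` and `to_pounds` are hypothetical, and it assumes your pipeline receives each value together with a unit label.

```python
CM_PER_INCH = 2.54
LB_PER_KG = 2.20462262185

# Multipliers that convert each supported unit to the catalog standard.
LENGTH_TO_INCHES = {
    "in": 1.0, "inch": 1.0, "inches": 1.0,
    "cm": 1 / CM_PER_INCH,
    "mm": 0.1 / CM_PER_INCH,
}
WEIGHT_TO_POUNDS = {
    "lb": 1.0, "lbs": 1.0, "pounds": 1.0,
    "kg": LB_PER_KG,
    "g": LB_PER_KG / 1000,
    "oz": 1 / 16,
}


def _convert(value: float, unit: str, factors: dict) -> float:
    key = unit.strip().lower()
    if key not in factors:
        # Unknown units are surfaced rather than silently passed through.
        raise ValueError(f"Unsupported unit: {unit!r}")
    return value * factors[key]


def to_inches(value: float, unit: str) -> float:
    """Convert a length to inches."""
    return _convert(value, unit, LENGTH_TO_INCHES)


def to_pounds(value: float, unit: str) -> float:
    """Convert a weight to pounds."""
    return _convert(value, unit, WEIGHT_TO_POUNDS)
```

Raising on unknown units sends unexpected values to manual review instead of letting them into faceted filters.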
✗ Don'ts
- ✗ Don't rely on product names alone: Names vary wildly across retailers; always use model numbers
- ✗ Don't skip verification: Cross-reference identifiers to avoid enriching with wrong product data
- ✗ Don't use standard depth for spec extraction: Specs live on detail pages, so you need the find-then-scrape pattern that deep provides
- ✗ Don't mix official specs with user-generated content: Keep manufacturer data separate from reviews
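The verification advice above can be sketched as a check that runs on returned listings before they are used. This is an illustrative helper, not a Linkup feature: the function names and the `upc` and `model` keys on each listing are assumptions about how your own schema is shaped.

```python
import re


def normalize_identifier(value: str) -> str:
    """Uppercase the value and drop everything except letters and digits."""
    return re.sub(r"[^A-Za-z0-9]", "", value or "").upper()


def listing_matches(listing: dict, model_number: str, upc: str = "") -> bool:
    """Return True when a listing carries our UPC or, failing that, our model number."""
    want_upc = normalize_identifier(upc)
    got_upc = normalize_identifier(listing.get("upc", ""))
    # A UPC on both sides is the strongest signal, so it decides the match.
    if want_upc and got_upc:
        return want_upc == got_upc
    want_model = normalize_identifier(model_number)
    got_model = normalize_identifier(listing.get("model", ""))
    return bool(want_model) and want_model == got_model
```

Normalizing before comparing tolerates cosmetic differences such as hyphens, spaces, and letter case, while still rejecting a different model number.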
Integration Patterns
Batch Catalog Enrichment
For initial catalog build or bulk updates:
- Export SKUs needing enrichment (missing specs, no images, thin descriptions)
- Queue enrichment jobs with brand + model number
- Call Linkup API for each SKU with category-specific schema
- Validate returned data against category rules
- Import enriched data to PIM/catalog system
- Flag low-confidence enrichments for manual review
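The batch steps above can be sketched as a loop that separates importable results from those needing review. This is a minimal outline under assumptions: `run_batch_enrichment` is a hypothetical name, `enrich` stands in for whatever function calls Linkup for one SKU, and "low confidence" is approximated here as "a required field came back empty".

```python
from typing import Callable, Iterable


def run_batch_enrichment(
    skus: Iterable[dict],
    enrich: Callable[[dict], dict],
    required_fields: Iterable[str],
) -> dict:
    """Enrich each SKU, then split results into ready-to-import and needs-review."""
    required = list(required_fields)
    ready, review = [], []
    for sku in skus:
        try:
            data = enrich(sku)
        except Exception as exc:
            # One failed SKU should not stop the rest of the batch.
            review.append({"sku": sku, "reason": f"error: {exc}"})
            continue
        missing = [field for field in required if not data.get(field)]
        if missing:
            review.append(
                {"sku": sku, "data": data, "reason": "missing: " + ", ".join(missing)}
            )
        else:
            ready.append({"sku": sku, "data": data})
    return {"ready": ready, "review": review}
```

Passing `enrich` as a parameter keeps the loop testable without network access and lets you swap in a category-specific enrichment function per batch.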
New Product Onboarding
For products newly added to the catalog:
- Trigger on new SKU creation
- Call Linkup with manufacturer + model
- Auto-populate catalog fields from structuredOutput
- Fetch and store product images
- Queue for QA review before publishing
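One way to sketch the auto-populate step is a mapping from the enrichment result onto a draft record that is held for QA. The input keys follow the schema in the sample integration code (`official_name`, `description`, `images`, `specifications`, `manufacturer_url`); the output field names, the `pending_qa` status, and the assumption that the response body is the schema-shaped object are all illustrative.

```python
def build_draft_listing(sku_id: str, enrichment: dict) -> dict:
    """Map an enrichment result onto a draft catalog record awaiting QA review."""
    return {
        "sku_id": sku_id,
        "name": enrichment.get("official_name", ""),
        "description": enrichment.get("description", ""),
        # Copy collections so later edits to the draft do not alter the raw result.
        "image_urls": list(enrichment.get("images", [])),
        "specifications": dict(enrichment.get("specifications", {})),
        "source_url": enrichment.get("manufacturer_url", ""),
        # Drafts are never published directly; QA flips this status.
        "status": "pending_qa",
    }
```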
Competitive Price Monitoring
For ongoing price intelligence:
- Schedule daily/weekly price checks for priority SKUs
- Call Linkup with competitor site restrictions
- Compare returned prices against current pricing
- Alert on significant competitor price changes
- Feed data into pricing optimization system
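The alerting step can be sketched as a comparison between the previous price snapshot and the latest listings. The function name, the 5% default threshold, and the shape of `previous` (a retailer-to-price mapping you store yourself) are assumptions; the `retailer` and `price` keys match the listing schema in the sample integration code.

```python
def detect_price_changes(
    previous: dict,
    listings: list,
    threshold_pct: float = 5.0,
) -> list:
    """Flag retailers whose price moved by at least threshold_pct since the last check."""
    changes = []
    for listing in listings:
        retailer = listing.get("retailer")
        new_price = listing.get("price")
        old_price = previous.get(retailer)
        # Skip listings with no price, or retailers with no usable baseline.
        if new_price is None or not old_price:
            continue
        change_pct = (new_price - old_price) / old_price * 100
        if abs(change_pct) >= threshold_pct:
            changes.append(
                {
                    "retailer": retailer,
                    "old_price": old_price,
                    "new_price": new_price,
                    "change_pct": round(change_pct, 2),
                }
            )
    return changes
```

A negative `change_pct` means the competitor dropped its price; the same records can be passed on to a pricing optimization system.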
Catalog Quality Scoring
For maintaining catalog health:
- Score existing listings on completeness
- Identify SKUs below quality threshold
- Batch enrich missing fields via Linkup
- Re-score and track improvement
- Report on catalog quality trends
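A completeness score for the steps above could be as simple as the share of required fields that are filled. The sketch below is one possible definition, with hypothetical function names and an assumed 0.8 threshold; the list returned by `missing_fields` is the kind of input the gap-fill prompt expects.

```python
EMPTY_VALUES = (None, "", [], {})


def missing_fields(listing: dict, required_fields: list) -> list:
    """Return the required fields that are absent or empty in a listing."""
    return [f for f in required_fields if listing.get(f) in EMPTY_VALUES]


def completeness_score(listing: dict, required_fields: list) -> float:
    """Share of required fields holding a non-empty value, from 0.0 to 1.0."""
    if not required_fields:
        return 1.0
    filled = len(required_fields) - len(missing_fields(listing, required_fields))
    return filled / len(required_fields)


def skus_below_threshold(
    catalog: dict,
    required_fields: list,
    threshold: float = 0.8,
) -> list:
    """Return SKU ids, sorted, whose score falls below the quality threshold."""
    return sorted(
        sku_id
        for sku_id, listing in catalog.items()
        if completeness_score(listing, required_fields) < threshold
    )
```

Running the same scoring before and after enrichment gives the before-and-after numbers needed to track improvement.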
Sample Integration Code
import json

import requests


def enrich_product_listing(
    brand: str,
    model_number: str,
    category: str,
    category_attributes: list,
    api_key: str
) -> dict:
    """
    Enrich a product listing from manufacturer sources
    """
    # One nested bullet per category attribute, spliced into the prompt below.
    attributes_str = "\n".join([f"       - {attr}" for attr in category_attributes])

    prompt = f"""
    You are a product data specialist enriching an e-commerce catalog.

    Brand: {brand}
    Model: {model_number}
    Category: {category}

    1. Find and scrape the official {brand} product page for model {model_number}.
    2. Extract:
       - Official product name
       - Marketing description
       - All product image URLs
       - Complete specifications including:
{attributes_str}
    3. Search for the product datasheet PDF and extract any additional specs.

    Return only official manufacturer data.
    """

    # JSON Schema describing the catalog fields we want back.
    schema = {
        "type": "object",
        "properties": {
            "brand": {"type": "string"},
            "model": {"type": "string"},
            "official_name": {"type": "string"},
            "description": {"type": "string"},
            "manufacturer_url": {"type": "string"},
            "images": {
                "type": "array",
                "items": {"type": "string"}
            },
            "specifications": {
                "type": "object",
                "additionalProperties": {"type": "string"}
            },
            "datasheet_url": {"type": "string"}
        }
    }

    response = requests.post(
        "https://api.linkup.so/v1/search",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "q": prompt,
            "depth": "deep",
            "outputType": "structuredOutput",
            "structuredOutputSchema": json.dumps(schema)
        }
    )
    # Fail loudly on HTTP errors instead of parsing an error body as data.
    response.raise_for_status()
    return response.json()


def monitor_competitor_prices(
    product_name: str,
    model_number: str,
    upc: str,
    competitor_domains: list,
    api_key: str
) -> dict:
    """
    Monitor competitor pricing for a specific product
    """
    prompt = f"""
    You are a pricing analyst monitoring competitor listings.

    Product: {product_name}
    Model: {model_number}
    UPC: {upc}

    Search for this exact product on competitor sites and scrape each listing for:
    - Current price
    - Original price (if on sale)
    - Stock availability
    - Active promotions

    Only include listings matching the model number or UPC.
    """

    # One entry in "listings" per competitor listing found.
    schema = {
        "type": "object",
        "properties": {
            "product": {"type": "string"},
            "model": {"type": "string"},
            "listings": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "retailer": {"type": "string"},
                        "url": {"type": "string"},
                        "price": {"type": "number"},
                        "original_price": {"type": "number"},
                        "in_stock": {"type": "boolean"},
                        "promotions": {"type": "array", "items": {"type": "string"}}
                    }
                }
            }
        }
    }

    response = requests.post(
        "https://api.linkup.so/v1/search",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "q": prompt,
            "depth": "deep",
            "outputType": "structuredOutput",
            "structuredOutputSchema": json.dumps(schema),
            # Restrict the search to the competitor sites being tracked.
            "includeDomains": competitor_domains
        }
    )
    # Fail loudly on HTTP errors instead of parsing an error body as data.
    response.raise_for_status()
    return response.json()


def fill_catalog_gaps(
    brand: str,
    model_number: str,
    current_data: dict,
    missing_fields: list,
    api_key: str
) -> dict:
    """
    Fill missing fields in an existing listing
    """
    # Note: current_data is accepted for context but is not sent in the prompt.
    fields_str = ", ".join(missing_fields)

    prompt = f"""
    You are a catalog quality analyst filling data gaps.

    Brand: {brand}
    Model: {model_number}
    Fields missing from our listing: {fields_str}

    1. Scrape the official {brand} product page for {model_number}.
    2. Extract values for these specific missing fields: {fields_str}
    3. If not found on main page, search for the product datasheet.

    Return only the missing fields with their values and source URLs.
    """

    # Each enriched field carries its value and the URL it was taken from.
    schema = {
        "type": "object",
        "properties": {
            "enriched_fields": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "field": {"type": "string"},
                        "value": {"type": "string"},
                        "source_url": {"type": "string"}
                    }
                }
            },
            "still_missing": {
                "type": "array",
                "items": {"type": "string"}
            }
        }
    }

    response = requests.post(
        "https://api.linkup.so/v1/search",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "q": prompt,
            "depth": "deep",
            "outputType": "structuredOutput",
            "structuredOutputSchema": json.dumps(schema)
        }
    )
    # Fail loudly on HTTP errors instead of parsing an error body as data.
    response.raise_for_status()
    return response.json()