Orchestrated Lead Generation Campaigns
Overview
v2.15.0 Feature: Orchestrated campaigns automatically chain three SpiderIQ services together:
v2.34.0 Unified Payload: Campaigns now support ALL SpiderMaps options at the top level (flat structure). Use search_query, max_results, extract_reviews, extract_photos, lang, store_images, validate_phones, fuzziq_enabled, fuzziq_unique_only, skip_proxy, and test.
- SpiderMaps - Scrape businesses from Google Maps
- SpiderSite - Crawl each business website for emails and company info
- SpiderVerify - Verify extracted email addresses
One API call, complete lead data. Instead of managing three separate job types, the orchestrator handles everything automatically. You just call /next in a loop and retrieve aggregated results.
How It Works
The Chain
| Step | Service | What Happens |
|---|---|---|
| 1 | SpiderMaps | Searches Google Maps for businesses matching your query |
| 2 | Domain Filter | Removes social media, review sites, and directories (configurable) |
| 3 | SpiderSite | Crawls each valid business website |
| 4 | Email Extract | Pulls emails from crawled pages |
| 5 | SpiderVerify | Verifies each email via SMTP |
| 6 | Aggregate | Combines all data per business |
Automatic Domain Filtering
The orchestrator automatically filters out non-scrapable domains:
Social Media (filter_social_media)
- facebook.com
- instagram.com
- linkedin.com
- twitter.com / x.com
- tiktok.com
- youtube.com
- pinterest.com
Review Sites (filter_review_sites)
- yelp.com
- tripadvisor.com
- trustpilot.com
- g2.com
- capterra.com
Directories (filter_directories)
- yellowpages.com
- bbb.org
- manta.com
- booking.com
- doordash.com
- ubereats.com
- linktr.ee
Map Links (filter_maps)
- google.com/maps
- maps.google.com
- waze.com
- apple.com/maps
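The filter behavior can be approximated with a short sketch. The domain lists below mirror the defaults documented above; the matching logic itself is an illustration, not the orchestrator's actual implementation.

```python
from urllib.parse import urlparse

# Default filter lists, keyed by the flag that controls each category.
FILTERED = {
    "filter_social_media": {"facebook.com", "instagram.com", "linkedin.com",
                            "twitter.com", "x.com", "tiktok.com",
                            "youtube.com", "pinterest.com"},
    "filter_review_sites": {"yelp.com", "tripadvisor.com", "trustpilot.com",
                            "g2.com", "capterra.com"},
    "filter_directories": {"yellowpages.com", "bbb.org", "manta.com",
                           "booking.com", "doordash.com", "ubereats.com",
                           "linktr.ee"},
    "filter_maps": {"google.com/maps", "maps.google.com", "waze.com",
                    "apple.com/maps"},
}

def is_filtered(website_url: str, config: dict) -> bool:
    """Return True if the URL belongs to a filtered domain category."""
    parsed = urlparse(website_url)
    host = parsed.netloc.lower().removeprefix("www.")
    host_and_path = host + parsed.path.lower()
    for flag, domains in FILTERED.items():
        if not config.get(flag, True):  # every category defaults to enabled
            continue
        if any(host == d or host.endswith("." + d)
               or host_and_path.startswith(d) for d in domains):
            return True
    return False

print(is_filtered("https://www.facebook.com/somecafe", {}))   # True
print(is_filtered("http://www.cafedestramways.lu/", {}))      # False
```

Setting a flag to `false` in the campaign payload lets that category through, e.g. `{"filter_review_sites": False}` keeps Yelp links.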
Creating an Orchestrated Campaign
Basic Example
- cURL
- Python
- JavaScript
curl -X POST https://spideriq.ai/api/v1/jobs/spiderMaps/campaigns/submit \
-H "Authorization: Bearer <your_token>" \
-H "Content-Type: application/json" \
-d '{
"search_query": "restaurants",
"country_code": "LU",
"max_results": 50,
"extract_reviews": true,
"lang": "en",
"workflow": {
"spidersite": {
"enabled": true,
"max_pages": 5,
"extract_company_info": true
},
"spiderverify": {
"enabled": true,
"max_emails_per_business": 5
}
}
}'
import requests
response = requests.post(
"https://spideriq.ai/api/v1/jobs/spiderMaps/campaigns/submit",
headers={"Authorization": "Bearer <your_token>"},
json={
"search_query": "restaurants", # v2.34.0: Use search_query
"country_code": "LU",
"max_results": 50, # v2.34.0: Now configurable
"extract_reviews": True,
"lang": "en",
"workflow": {
"spidersite": {
"enabled": True,
"max_pages": 5,
"extract_company_info": True
},
"spiderverify": {
"enabled": True,
"max_emails_per_business": 5
}
}
}
)
campaign = response.json()
print(f"Campaign: {campaign['campaign_id']}")
print(f"Workflow enabled: {campaign['has_workflow']}")
const response = await fetch(
'https://spideriq.ai/api/v1/jobs/spiderMaps/campaigns/submit',
{
method: 'POST',
headers: {
'Authorization': 'Bearer <your_token>',
'Content-Type': 'application/json'
},
body: JSON.stringify({
search_query: 'restaurants', // v2.34.0: Use search_query
country_code: 'LU',
max_results: 50, // v2.34.0: Now configurable
extract_reviews: true,
lang: 'en',
workflow: {
spidersite: {
enabled: true,
max_pages: 5,
extract_company_info: true
},
spiderverify: {
enabled: true,
max_emails_per_business: 5
}
}
})
}
);
const campaign = await response.json();
console.log(`Campaign: ${campaign.campaign_id}`);
Response:
{
"campaign_id": "camp_lu_restaurants_20260209_abc123",
"status": "active",
"search_query": "restaurants",
"country_code": "LU",
"total_locations": 14,
"max_results": 50,
"extract_reviews": true,
"lang": "en",
"has_workflow": true,
"workflow_config": {
"spidersite": {
"enabled": true,
"max_pages": 5,
"extract_company_info": true
},
"spiderverify": {
"enabled": true,
"max_emails_per_business": 5
}
}
}
Full Configuration Example
- Complete Workflow Payload
{
"search_query": "restaurants",
"country_code": "FR",
"name": "France Restaurant Lead Gen",
"max_results": 50,
"extract_reviews": true,
"extract_photos": false,
"lang": "fr",
"store_images": true,
"validate_phones": true,
"fuzziq_enabled": true,
"skip_proxy": false,
"test": false,
"filter": {
"mode": "population",
"min_population": 50000
},
"workflow": {
"spidersite": {
"enabled": true,
"max_pages": 10,
"crawl_strategy": "bestfirst",
"target_pages": ["contact", "about", "team"],
"enable_spa": true,
"spa_timeout": 30,
"extract_team": true,
"extract_company_info": true,
"extract_pain_points": false,
"product_description": "AI-powered restaurant management software",
"icp_description": "Restaurant owners looking to streamline operations",
"compendium": {
"enabled": true,
"cleanup_level": "fit",
"max_chars": 100000
},
"timeout": 30
},
"spiderverify": {
"enabled": true,
"check_gravatar": false,
"check_dnsbl": false,
"smtp_timeout_secs": 45,
"max_emails_per_business": 5
},
"filter_social_media": true,
"filter_review_sites": true,
"filter_directories": true,
"filter_maps": true
}
}
Workflow Configuration Reference
SpiderSite Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | false | Enable SpiderSite for each business |
| max_pages | integer | 10 | Pages to crawl per website (1-50) |
| crawl_strategy | string | "bestfirst" | bestfirst, bfs, or dfs |
| target_pages | array | ["contact", "about", "team", "news", "blog"] | Priority page types |
| enable_spa | boolean | true | Enable SPA/JavaScript rendering |
| spa_timeout | integer | 30 | SPA rendering timeout (10-120s) |
| extract_team | boolean | false | Extract team members with AI |
| extract_company_info | boolean | false | Extract company info with AI |
| extract_pain_points | boolean | false | Analyze pain points with AI |
| product_description | string | null | Your product (for CHAMP scoring) |
| icp_description | string | null | Your ICP (for CHAMP scoring) |
| timeout | integer | 30 | Overall timeout (10-120s) |
CHAMP Scoring: If you provide product_description, you must also provide icp_description (and vice versa).
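This pairing rule can be checked client-side before submitting. The helper below is an illustrative sketch, not part of any SpiderIQ SDK:

```python
def validate_champ_fields(workflow: dict) -> None:
    """Raise if only one of the two CHAMP-scoring fields is set."""
    site = workflow.get("spidersite", {})
    if bool(site.get("product_description")) != bool(site.get("icp_description")):
        raise ValueError(
            "CHAMP scoring requires both product_description and icp_description"
        )

# OK: both fields present
validate_champ_fields({"spidersite": {
    "product_description": "AI-powered restaurant management software",
    "icp_description": "Restaurant owners looking to streamline operations",
}})

# Rejected: only one field present
try:
    validate_champ_fields({"spidersite": {"product_description": "My product"}})
except ValueError as e:
    print(e)
```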
SpiderVerify Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | false | Enable email verification |
| check_gravatar | boolean | false | Check for Gravatar images |
| check_dnsbl | boolean | false | Check spam blacklists |
| smtp_timeout_secs | integer | 45 | SMTP timeout (10-120s) |
| max_emails_per_business | integer | 10 | Max emails to verify (1-50) |
Email Prioritization: The orchestrator automatically prioritizes role-based business emails such as contact@, info@, and sales@ over other extracted addresses.
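The effect of `max_emails_per_business` combined with this prioritization can be sketched as follows. The contact@/info@/sales@ prefixes come from the note above; the exact ordering the orchestrator uses is not published, so treat this ranking as an assumption:

```python
# Role prefixes in assumed priority order (illustrative, not documented).
ROLE_PREFIXES = ["contact", "info", "sales"]

def prioritize_emails(emails: list[str], limit: int) -> list[str]:
    """Rank role addresses first, then truncate to the per-business limit."""
    def rank(email: str) -> int:
        local = email.split("@", 1)[0].lower()
        return ROLE_PREFIXES.index(local) if local in ROLE_PREFIXES else len(ROLE_PREFIXES)
    return sorted(emails, key=rank)[:limit]

print(prioritize_emails(["jdoe@acme.lu", "info@acme.lu", "contact@acme.lu"], 2))
# → ['contact@acme.lu', 'info@acme.lu']
```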
Domain Filter Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| filter_social_media | boolean | true | Filter Facebook, Instagram, etc. |
| filter_review_sites | boolean | true | Filter Yelp, TripAdvisor, etc. |
| filter_directories | boolean | true | Filter YellowPages, BBB, etc. |
| filter_maps | boolean | true | Filter Google Maps links, Waze |
Compendium Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | true | Generate markdown compendium |
| cleanup_level | string | "fit" | raw, fit, citations, minimal |
| max_chars | integer | 100000 | Max compendium size |
| include_in_response | boolean | true | Include in API response |
| remove_duplicates | boolean | true | Deduplicate content |
Monitoring Progress
Status Endpoint
Check real-time progress with the /status endpoint:
- cURL
- Python
curl https://spideriq.ai/api/v1/jobs/spiderMaps/campaigns/{campaign_id}/status \
-H "Authorization: Bearer <your_token>"
response = requests.get(
f"https://spideriq.ai/api/v1/jobs/spiderMaps/campaigns/{campaign_id}/status",
headers={"Authorization": "Bearer <your_token>"}
)
status = response.json()
print(f"SpiderSite completed: {status['workflow_progress']['sites_completed']}")
print(f"Emails found: {status['workflow_progress']['emails_found']}")
print(f"Emails verified: {status['workflow_progress']['emails_verified']}")
Response with Workflow Progress:
{
"campaign_id": "camp_lu_restaurants_20251223_abc123",
"status": "active",
"progress": {
"completed": 5,
"failed": 0,
"pending": 9,
"total": 14,
"percentage": 35.7
},
"workflow_progress": {
"businesses_total": 150,
"sites_queued": 5,
"sites_completed": 120,
"sites_failed": 2,
"verifies_queued": 10,
"verifies_completed": 80,
"verifies_failed": 1,
"emails_found": 350,
"emails_verified": 280
},
"has_workflow": true
}
Workflow Progress Fields
| Field | Description |
|---|---|
| businesses_total | Total businesses with valid domains |
| sites_queued | SpiderSite jobs waiting |
| sites_completed | SpiderSite jobs finished |
| sites_failed | SpiderSite jobs failed |
| verifies_queued | SpiderVerify jobs waiting |
| verifies_completed | SpiderVerify jobs finished |
| verifies_failed | SpiderVerify jobs failed |
| emails_found | Total emails extracted |
| emails_verified | Total emails verified |
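These fields are enough to decide when a campaign's downstream jobs have drained. A minimal completion check, using the same condition as the Complete Python Example further down (sites all done, success or failure, and no verifies still queued):

```python
def workflow_complete(wp: dict) -> bool:
    """True when all SpiderSite jobs are done and no verifies remain queued."""
    sites_done = wp.get("sites_completed", 0) + wp.get("sites_failed", 0)
    return (sites_done >= wp.get("businesses_total", 0)
            and wp.get("verifies_queued", 0) == 0)

wp = {"businesses_total": 150, "sites_completed": 148,
      "sites_failed": 2, "verifies_queued": 0}
print(workflow_complete(wp))  # True
```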
Getting Results
Per-Job Blocking Results (v2.16.0)
For real-time integrations where you need to wait for a specific job to complete, use the blocking endpoint:
- cURL
- Python
# Wait for job completion (blocks up to 10 minutes)
curl "https://spideriq.ai/api/v1/jobs/spiderMaps/campaigns/{campaign_id}/jobs/{job_id}/results" \
-H "Authorization: Bearer <your_token>"
# Or poll without blocking
curl "https://spideriq.ai/api/v1/jobs/spiderMaps/campaigns/{campaign_id}/jobs/{job_id}/results?wait=false" \
-H "Authorization: Bearer <your_token>"
import requests
# Submit and immediately wait for results
next_resp = requests.post(
f"{API_URL}/jobs/spiderMaps/campaigns/{campaign_id}/next",
headers=HEADERS
)
job_id = next_resp.json()['current_task']['job_id']
# Block until complete (up to 10 minutes)
results = requests.get(
f"{API_URL}/jobs/spiderMaps/campaigns/{campaign_id}/jobs/{job_id}/results",
headers=HEADERS
).json()
print(f"Status: {results['status']}")
print(f"Businesses: {results['businesses_total']}")
print(f"Valid emails: {results['total_valid_emails']}")
# Process each business
for biz in results['businesses']:
if biz['valid_emails_count'] > 0:
print(f"{biz['business_name']}: {biz['emails_verified']}")
Use Case: Perfect for n8n/Xano webhooks where you need to wait for results before proceeding to the next step.
Timeouts & Partial Results:
- SpiderSite: 5 minutes per business
- SpiderVerify: 2 minutes per business
- Maximum wait: 10 minutes
- If a timeout occurs, returns status: "partial" with all available data
See the Get Job Results (Blocking) endpoint for complete documentation.
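Because a partial response carries the same shape as a completed one (just fewer businesses), consumers can treat both statuses uniformly and filter on what actually arrived. A minimal sketch, using the field names from the example above:

```python
def leads_from_results(results: dict) -> list[dict]:
    """Return businesses that yielded at least one valid email.

    Works for both status "completed" and status "partial": a partial
    response simply carries fewer businesses.
    """
    if results.get("status") == "partial":
        print("Timed out; processing the data that completed in time")
    return [b for b in results.get("businesses", [])
            if b.get("valid_emails_count", 0) > 0]

sample = {"status": "partial", "businesses": [
    {"business_name": "A", "valid_emails_count": 2},
    {"business_name": "B", "valid_emails_count": 0},
]}
print([b["business_name"] for b in leads_from_results(sample)])  # ['A']
```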
Workflow Results Endpoint
Get aggregated results combining SpiderMaps + SpiderSite + SpiderVerify data:
- cURL
- Python
curl https://spideriq.ai/api/v1/jobs/spiderMaps/campaigns/{campaign_id}/workflow-results \
-H "Authorization: Bearer <your_token>"
response = requests.get(
f"https://spideriq.ai/api/v1/jobs/spiderMaps/campaigns/{campaign_id}/workflow-results",
headers={"Authorization": "Bearer <your_token>"}
)
results = response.json()
for location in results['locations']:
print(f"\n=== {location['search_string']} ===")
for biz in location['businesses']:
print(f" {biz['business_name']}")
print(f" Domain: {biz['domain']}")
print(f" Emails: {biz['emails_found']}")
for email in biz['emails_verified']:
print(f" - {email['email']}: {email['status']} (score: {email['score']})")
Response Structure:
{
"campaign_id": "camp_lu_restaurants_20251223_abc123",
"status": "active",
"query": "restaurants",
"country_code": "LU",
"workflow_progress": {
"businesses_total": 69,
"sites_completed": 69,
"emails_found": 201,
"emails_verified": 84
},
"total_businesses": 69,
"total_with_domains": 69,
"total_emails_found": 201,
"total_valid_emails": 8,
"locations": [
{
"location_id": 843,
"search_string": "Luxembourg, Luxembourg",
"status": "completed",
"businesses_count": 69,
"businesses": [
{
"business_name": "Café des Tramways",
"business_place_id": "0x47954f2add89aa79:0x74c726ae28575bec",
"business_address": "79 Av. Pasteur, 2311 Luxembourg",
"business_phone": "35226201136",
"business_rating": 4.4,
"business_reviews_count": 706,
"business_categories": ["Bar", "Coffee shop"],
"original_website": "http://www.cafedestramways.lu/",
"domain": "cafedestramways.lu",
"domain_filtered": false,
"spidersite_status": "completed",
"pages_crawled": 2,
"emails_found": ["info@cafedestramways.lu"],
"company_info": {
"industry": "Restaurant/Bar",
"key_services": ["Flammekueches", "Burgers", "Cocktails"],
"target_audience": "Locals and tourists in Luxembourg",
"one_sentence_summary": "Cozy bar offering drinks and homemade food"
},
"spiderverify_status": "completed",
"emails_verified": [
{
"email": "info@cafedestramways.lu",
"status": "risky",
"score": 90,
"is_deliverable": true,
"is_role_account": true
}
],
"valid_emails_count": 0,
"workflow_stage": "complete"
}
]
}
]
}
Complete Python Example
import requests
import time
import csv
# Configuration
API_URL = "https://spideriq.ai/api/v1"
TOKEN = "your_token_here"
HEADERS = {"Authorization": f"Bearer {TOKEN}"}
# 1. Create Campaign with Workflow (v2.34.0+ unified payload)
campaign_response = requests.post(
f"{API_URL}/jobs/spiderMaps/campaigns/submit",
headers=HEADERS,
json={
"search_query": "restaurants", # v2.34.0: Use search_query
"country_code": "LU",
"name": "Luxembourg Restaurant Leads",
"max_results": 50, # v2.34.0: Now configurable
"extract_reviews": True,
"lang": "en",
"store_images": True,
"validate_phones": True,
"fuzziq_enabled": True,
"workflow": {
"spidersite": {
"enabled": True,
"max_pages": 5,
"extract_company_info": True
},
"spiderverify": {
"enabled": True,
"max_emails_per_business": 3
}
}
}
)
campaign = campaign_response.json()
campaign_id = campaign['campaign_id']
print(f"Created campaign: {campaign_id}")
print(f"Total locations: {campaign['total_locations']}")
# 2. Process all locations
while True:
next_response = requests.post(
f"{API_URL}/jobs/spiderMaps/campaigns/{campaign_id}/next",
headers=HEADERS
)
next_data = next_response.json()
if next_data.get('current_task'):
task = next_data['current_task']
print(f"Processing: {task['search_string']} (job: {task['job_id']})")
progress = next_data['progress']
print(f"Progress: {progress['completed']}/{progress['total']} ({progress['percentage']:.1f}%)")
if not next_data['has_more']:
print("All locations processed!")
break
time.sleep(2) # Rate limit between calls
# 3. Wait for workflow jobs to complete
print("\nWaiting for SpiderSite and SpiderVerify jobs to complete...")
while True:
status_response = requests.get(
f"{API_URL}/jobs/spiderMaps/campaigns/{campaign_id}/status",
headers=HEADERS
)
status = status_response.json()
wp = status.get('workflow_progress', {})
sites_done = wp.get('sites_completed', 0) + wp.get('sites_failed', 0)
sites_total = wp.get('businesses_total', 0)
verifies_done = wp.get('verifies_completed', 0) + wp.get('verifies_failed', 0)
print(f"Sites: {sites_done}/{sites_total} | Verifies: {verifies_done} | Emails: {wp.get('emails_found', 0)}")
# Check if workflow is complete (all sites done and verifies caught up)
if sites_done >= sites_total and wp.get('verifies_queued', 0) == 0:
break
time.sleep(5)
# 4. Get aggregated results
results_response = requests.get(
f"{API_URL}/jobs/spiderMaps/campaigns/{campaign_id}/workflow-results",
headers=HEADERS
)
results = results_response.json()
# 5. Export to CSV
with open('leads.csv', 'w', newline='', encoding='utf-8') as f:
writer = csv.writer(f)
writer.writerow([
'Business Name', 'Address', 'Phone', 'Website', 'Rating',
'Industry', 'Email', 'Email Status', 'Email Score'
])
for location in results['locations']:
for biz in location['businesses']:
for email in biz.get('emails_verified', []):
writer.writerow([
biz['business_name'],
biz.get('business_address', ''),
biz.get('business_phone', ''),
biz.get('domain', ''),
biz.get('business_rating', ''),
biz.get('company_info', {}).get('industry', ''),
email['email'],
email['status'],
email['score']
])
print(f"\nExported verified emails to leads.csv ({results['total_valid_emails']} marked valid)")
print(f"Total businesses: {results['total_businesses']}")
print(f"Total emails found: {results['total_emails_found']}")
Best Practices
Start Small: Test with a small country like Luxembourg (14 locations) before running large campaigns.
Use Population Filters: For large countries, filter by population to focus on major cities first.
Monitor Progress: Check /status periodically to track SpiderSite and SpiderVerify completion.
Rate Limiting: Add 1-2 second delays between /next calls to avoid rate limits.
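A jittered delay keeps the /next loop polite, and an exponential backoff handles the occasional HTTP 429. The limits below are a suggestion, not documented SpiderIQ thresholds:

```python
import random

def next_call_delay(rate_limited_attempts: int = 0, base: float = 1.5) -> float:
    """Seconds to sleep before the next /next call.

    Roughly 1.5-2s normally; doubles per consecutive HTTP 429, capped at 60s.
    """
    return min(base * (2 ** rate_limited_attempts) + random.uniform(0, 0.5), 60.0)

# Usage between /next calls:
#   time.sleep(next_call_delay())            # normal pacing
#   time.sleep(next_call_delay(attempts))    # after `attempts` consecutive 429s
```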
Recommended Settings for Lead Generation
{
"workflow": {
"spidersite": {
"enabled": true,
"max_pages": 5,
"crawl_strategy": "bestfirst",
"extract_company_info": true,
"compendium": {
"enabled": false
}
},
"spiderverify": {
"enabled": true,
"max_emails_per_business": 3
}
}
}
Disable the compendium for lead-gen campaigns to speed up processing. Compendiums are useful for content analysis but add overhead.