Get Job Results
GET /api/v1/jobs/{job_id}/results
Overview
Retrieve the complete results for a scraping job. This endpoint returns different status codes based on job state.
Path Parameters
job_id (string, required): The unique identifier of the job (UUID format)
Example: 550e8400-e29b-41d4-a716-446655440000
Query Parameters
format (string): Response format for AI agent integration (v2.60.0)
Options:
- yaml: Token-efficient YAML format (40% savings)
- md: Human-readable Markdown format (50% savings)
Default: JSON (no format parameter)
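As a minimal sketch of how the format parameter might be attached to the results URL: the helper name build_results_url is hypothetical, and the base URL is taken from the request examples on this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://spideriq.ai/api/v1"  # base URL as used in this page's examples
ALLOWED_FORMATS = {"yaml", "md"}  # documented options; omit the parameter for JSON


def build_results_url(job_id, fmt=None):
    """Return the results URL, adding ?format=... only when a format is requested.

    Hypothetical helper for illustration; not part of any official client.
    """
    url = f"{BASE_URL}/jobs/{job_id}/results"
    if fmt is None:
        return url  # no format parameter means the JSON default
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    query = urlencode({"format": fmt})
    return f"{url}?{query}"


print(build_results_url("550e8400-e29b-41d4-a716-446655440000", "yaml"))
```

Validating the value client-side is a design choice here, not documented API behavior; how the server treats an unknown format value is not stated on this page.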
Response Status Codes
200 OK: Job completed successfully; results are available
202 Accepted: Job is still processing; poll again later
410 Gone: Job failed or was cancelled
404 Not Found: Job ID does not exist
Response Structure
Flat Structure (v2.7.1): Responses now use a simplified 2-3 level nesting structure (previously 5 levels). All fields are always present - fields not applicable to your request will be null.
Top-Level Response Fields
success (boolean): true if the job completed successfully; false otherwise, including while the job is still queued or processing
job_id (string): Unique job identifier (UUID format)
type (string): Job type: spiderSite or spiderMaps
status (string): Job status: completed, failed, processing, queued, or cancelled
processing_time_seconds (number): Time taken to process the job, in seconds (null until the job finishes)
worker_id (string): Identifier of the worker that processed the job
completed_at (string): Completion timestamp in ISO 8601 format
message (string): Additional context about job state (e.g., "Job is being processed")
data (object): Job results data (structure varies by job type, see below)
error_message (string): Error message if the job failed (null otherwise)
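The top-level fields above can be mirrored in a small typed wrapper. This is a sketch only: the JobResult class and its is_terminal property are hypothetical names, and treating completed, failed, and cancelled as final states is an assumption based on the status descriptions on this page.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class JobResult:
    """Top-level response fields as listed above (hypothetical wrapper type)."""
    success: Optional[bool]
    job_id: Optional[str]
    type: Optional[str]
    status: Optional[str]
    processing_time_seconds: Optional[float]
    worker_id: Optional[str]
    completed_at: Optional[str]
    message: Optional[str]
    data: Optional[dict]
    error_message: Optional[str]

    @classmethod
    def from_dict(cls, payload: dict) -> "JobResult":
        # .get() turns a missing key into None instead of raising KeyError.
        return cls(**{f.name: payload.get(f.name) for f in fields(cls)})

    @property
    def is_terminal(self) -> bool:
        # Assumption: these three states are final; processing and queued are not.
        return self.status in {"completed", "failed", "cancelled"}


processing = JobResult.from_dict({
    "success": False,
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "type": "spiderSite",
    "status": "processing",
})
print(processing.status, processing.is_terminal)
```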
SpiderSite Data Fields
Flat Structure: Social media fields are at the top level of data (e.g., data.linkedin), not nested under data.contact_info.social_media.linkedin.
Basic Information
data.url (string): Website URL that was crawled
data.pages_crawled (integer): Number of pages successfully crawled
data.crawl_status (string): Crawl result: success, partial, or failed
Contact Information (Flat - Top Level)
data.emails (array): Email addresses found (filtered; tracking emails removed)
data.phones (array): Phone numbers found
data.addresses (array): Physical addresses found
Social Media Profiles (All Flat - Top Level)
data.linkedin (string): LinkedIn company/profile URL (null if not found)
data.twitter (string): Twitter/X profile URL (null if not found)
data.facebook (string): Facebook page URL (null if not found)
data.instagram (string): Instagram profile URL (null if not found)
data.youtube (string): YouTube channel URL (null if not found)
data.github (string): GitHub organization/user URL (null if not found)
data.tiktok (string): TikTok profile URL (null if not found)
data.pinterest (string): Pinterest profile URL (null if not found)
data.medium (string): Medium profile URL (null if not found)
data.discord (string): Discord server invite URL (null if not found)
data.whatsapp (string): WhatsApp contact/business URL (null if not found)
data.telegram (string): Telegram contact/channel URL (null if not found)
data.snapchat (string): Snapchat profile URL (null if not found)
data.reddit (string): Reddit profile/subreddit URL (null if not found)
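Because every social field sits directly on data and is null when nothing was found, collecting the profiles that exist is a single pass over the field names. A minimal sketch (the function name social_profiles is hypothetical):

```python
# Field names as listed above; each is either a URL string or null.
SOCIAL_FIELDS = (
    "linkedin", "twitter", "facebook", "instagram", "youtube", "github",
    "tiktok", "pinterest", "medium", "discord", "whatsapp", "telegram",
    "snapchat", "reddit",
)


def social_profiles(data):
    """Return {platform: url} for every social field that is not null."""
    return {name: data[name] for name in SOCIAL_FIELDS if data.get(name)}


sample = {
    "linkedin": "https://linkedin.com/company/example",
    "twitter": "https://twitter.com/example",
    "instagram": None,
    "github": "https://github.com/example",
}
print(social_profiles(sample))
```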
Markdown Compendium (v2.14.0: SpiderMedia Storage)
data.markdown_compendium (string): AI-generated markdown summary of the website (if enabled and included inline)
data.compendium (object): Compendium metadata and storage information
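One way a client might decide where to read the compendium from is sketched below. It assumes the available and download_url keys that appear inside data.compendium in the example responses on this page; the function name compendium_source is hypothetical.

```python
def compendium_source(data):
    """Pick the compendium location: inline text first, then the stored file.

    Returns ("inline", markdown), ("download", url), or None.
    Assumes the `available` and `download_url` keys shown in the example responses.
    """
    inline = data.get("markdown_compendium")
    if inline:
        return ("inline", inline)
    meta = data.get("compendium") or {}
    if meta.get("available") and meta.get("download_url"):
        return ("download", meta["download_url"])
    return None


print(compendium_source({"markdown_compendium": "# Example Company"}))
```

Preferring the inline text avoids a second HTTP request when the summary is already in the response.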
AI Features (Always Present - Null If Not Enabled)
data.company_vitals (object): Company information extracted with AI (null if extract_company_info: false)
data.pain_points (array): Business pain points identified by AI (null if extract_pain_points: false)
data.team_members (array): Team members found with AI extraction (empty array if extract_team: false)
data.lead_scoring (object): CHAMP framework lead scoring (null if product/ICP not provided)
data.personalization_hooks (object): Personalization data for outreach (null if not available)
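Since the AI fields are always present but may be null (or an empty array for team_members), a truthiness check is enough to see which ones carry data. A minimal sketch with a hypothetical helper name:

```python
AI_FIELDS = (
    "company_vitals", "pain_points", "team_members",
    "lead_scoring", "personalization_hooks",
)


def populated_ai_fields(data):
    """List the AI fields that carry data; null and empty containers count as absent."""
    return [name for name in AI_FIELDS if data.get(name)]


minimal = {
    "company_vitals": None,
    "pain_points": None,
    "team_members": [],
    "lead_scoring": None,
    "personalization_hooks": None,
}
print(populated_ai_fields(minimal))
```

Note that this cannot tell "feature disabled" apart from "feature enabled but nothing found"; the response alone may not carry that distinction.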
Technical Metadata
data.metadata (object): Crawl metadata and statistics, including:
- browser_rendering_available: Whether SPA rendering was used
- spa_enabled: Whether SPA detection was enabled
- sitemap_used: Whether sitemap-first crawling was used
- crawl_strategy: Strategy used (sitemap, bestfirst, bfs, dfs)
- total_emails_found: Total emails before filtering
- total_phones_found: Total phone numbers found
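The metadata counters can be compared with the filtered arrays. The sketch below estimates how many emails the filter removed, assuming total_emails_found counts the same addresses that data.emails was filtered from (the page implies this but does not state it outright). The function name is hypothetical.

```python
def filtered_email_count(data):
    """Estimate how many emails were removed by filtering.

    Assumes metadata.total_emails_found is the pre-filter count behind data.emails.
    Returns None when the counter is missing.
    """
    meta = data.get("metadata") or {}
    total = meta.get("total_emails_found")
    if total is None:
        return None
    kept = len(data.get("emails") or [])
    return max(total - kept, 0)  # clamp in case the two counts disagree


sample = {
    "emails": ["contact@example.com"],
    "metadata": {"total_emails_found": 3},
}
print(filtered_email_count(sample))
```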
SpiderMaps Data Fields
Basic Information
data.query (string): Search query used for the scrape
data.results_count (integer): Number of business listings returned
data.businesses (array): Array of business listings (see structure below)
data.metadata (object): Search metadata (max_results, extract_reviews, language, etc.)
Business Listing Structure
Each business in the businesses array contains:
name (string): Business name
place_id (string): Google Place ID
rating (number): Average rating (1.0-5.0)
reviews_count (integer): Number of reviews
address (string): Full street address
phone (string): Phone number
website (string): Business website URL
categories (array): Business categories/types
coordinates (object): Latitude and longitude coordinates
link (string): Google Maps link to the business
business_status (string): Status: OPERATIONAL, CLOSED_TEMPORARILY, etc.
price_range (string): Price range: $, $$, $$$, or $$$$
working_hours (object): Working hours by day of week
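A typical post-processing step on the businesses array is to keep only listings that are open and well rated. The following is a sketch using the field names listed above; the function name and the 4.0 default threshold are illustrative choices, not part of the API.

```python
def open_and_well_rated(businesses, min_rating=4.0):
    """Keep listings that are OPERATIONAL and rated at least min_rating.

    Listings with a missing or null rating are treated as unrated and dropped.
    """
    return [
        b for b in businesses
        if b.get("business_status") == "OPERATIONAL"
        and (b.get("rating") or 0) >= min_rating
    ]


listings = [
    {"name": "Mamma Maria", "rating": 4.6, "business_status": "OPERATIONAL"},
    {"name": "Closed Cafe", "rating": 4.8, "business_status": "CLOSED_TEMPORARILY"},
    {"name": "Unrated Place", "rating": None, "business_status": "OPERATIONAL"},
]
print([b["name"] for b in open_and_well_rated(listings)])
```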
Example Request
cURL:
curl https://spideriq.ai/api/v1/jobs/550e8400-e29b-41d4-a716-446655440000/results \
  -H "Authorization: Bearer <your_token>"

Python:
import requests

job_id = "550e8400-e29b-41d4-a716-446655440000"
url = f"https://spideriq.ai/api/v1/jobs/{job_id}/results"
headers = {
    "Authorization": "Bearer <your_token>"
}

response = requests.get(url, headers=headers)

if response.status_code == 200:
    print("Job completed!")
    print(response.json())
elif response.status_code == 202:
    print("Job still processing, poll again later")
elif response.status_code == 410:
    print("Job failed or was cancelled")
    print(response.json())

JavaScript:
const jobId = '550e8400-e29b-41d4-a716-446655440000';

const response = await fetch(
  `https://spideriq.ai/api/v1/jobs/${jobId}/results`,
  {
    headers: {
      'Authorization': 'Bearer <your_token>'
    }
  }
);

if (response.status === 200) {
  const data = await response.json();
  console.log('Job completed!', data);
} else if (response.status === 202) {
  console.log('Job still processing');
} else if (response.status === 410) {
  const error = await response.json();
  console.log('Job failed:', error);
}
Example Responses
- SpiderSite - Minimal
- SpiderSite - With AI
- SpiderSite - Full CHAMP
- SpiderMaps
- Processing (202)
- Failed (410)
- Not Found (404)
SpiderSite - Minimal
Basic contact extraction without AI features:
{
  "success": true,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "spiderSite",
  "status": "completed",
  "processing_time_seconds": 12.4,
  "worker_id": "spider-site-main-1",
  "completed_at": "2025-10-27T14:30:15Z",
  "message": null,
  "data": {
    "url": "https://example.com",
    "pages_crawled": 5,
    "crawl_status": "success",
    "emails": ["contact@example.com", "sales@example.com"],
    "phones": ["+1-555-123-4567"],
    "addresses": ["123 Main St, San Francisco, CA 94105"],
    "linkedin": "https://linkedin.com/company/example",
    "twitter": "https://twitter.com/example",
    "facebook": "https://facebook.com/example",
    "instagram": null,
    "youtube": null,
    "github": "https://github.com/example",
    "tiktok": null,
    "pinterest": null,
    "medium": null,
    "discord": null,
    "whatsapp": null,
    "telegram": null,
    "snapchat": null,
    "reddit": null,
    "markdown_compendium": "# Example Company\n\nLeading provider of...",
    "compendium": {
      "available": true,
      "chars": 8450,
      "cleanup_level": "fit",
      "storage_location": "spidermedia",
      "download_url": "https://media.spideriq.ai/client-xxx/compendiums/550e8400-e29b-41d4-a716-446655440000.md",
      "filename": "compendiums/550e8400-e29b-41d4-a716-446655440000.md",
      "size_bytes": 8450,
      "content_hash": "a1b2c3d4e5f6...",
      "estimated_tokens": 2100
    },
    "company_vitals": null,
    "pain_points": null,
    "lead_scoring": null,
    "team_members": [],
    "personalization_hooks": null,
    "metadata": {
      "spa_enabled": true,
      "sitemap_used": true,
      "browser_rendering_available": true,
      "crawl_strategy": "sitemap",
      "total_emails_found": 2,
      "total_phones_found": 1
    }
  },
  "error_message": null
}
SpiderSite - With AI
Request with AI company and team extraction enabled:
{
  "success": true,
  "job_id": "660e8400-e29b-41d4-a716-446655440001",
  "type": "spiderSite",
  "status": "completed",
  "processing_time_seconds": 18.7,
  "worker_id": "spider-site-main-2",
  "completed_at": "2025-10-27T14:35:22Z",
  "message": null,
  "data": {
    "url": "https://techstart.com",
    "pages_crawled": 12,
    "crawl_status": "success",
    "emails": ["contact@techstart.com", "sales@techstart.com"],
    "phones": ["+1-555-987-6543"],
    "addresses": ["456 Tech Ave, Palo Alto, CA 94301"],
    "linkedin": "https://linkedin.com/company/techstart",
    "twitter": "https://twitter.com/techstart",
    "facebook": null,
    "instagram": "https://instagram.com/techstart",
    "youtube": "https://youtube.com/techstart",
    "github": "https://github.com/techstart",
    "tiktok": null,
    "pinterest": null,
    "medium": "https://medium.com/@techstart",
    "discord": null,
    "whatsapp": null,
    "telegram": null,
    "snapchat": null,
    "reddit": null,
    "markdown_compendium": "# TechStart Solutions\n\nAI-powered customer support...",
    "compendium": {
      "available": true,
      "chars": 45230,
      "cleanup_level": "fit",
      "storage_location": "spidermedia",
      "download_url": "https://media.spideriq.ai/client-xxx/compendiums/660e8400-e29b-41d4-a716-446655440001.md",
      "filename": "compendiums/660e8400-e29b-41d4-a716-446655440001.md",
      "size_bytes": 45230,
      "content_hash": "b2c3d4e5f6g7...",
      "estimated_tokens": 11300
    },
    "company_vitals": {
      "name": "TechStart Solutions",
      "summary": "AI-powered customer support automation for SaaS companies",
      "industry": "B2B SaaS",
      "services": ["AI Chatbots", "Support Ticket Automation", "Customer Analytics"],
      "target_audience": "Mid-market SaaS companies with 50-500 employees"
    },
    "pain_points": [
      "Struggling to scale customer support operations",
      "High ticket resolution times impacting customer satisfaction"
    ],
    "lead_scoring": null,
    "team_members": [
      {
        "name": "Sarah Johnson",
        "title": "CEO & Founder",
        "email": "sarah@techstart.com",
        "linkedin": "https://linkedin.com/in/sarahjohnson"
      },
      {
        "name": "Mike Chen",
        "title": "VP of Sales",
        "email": "mike@techstart.com",
        "linkedin": "https://linkedin.com/in/mikechen"
      }
    ],
    "personalization_hooks": null,
    "metadata": {
      "spa_enabled": true,
      "sitemap_used": true,
      "browser_rendering_available": true,
      "crawl_strategy": "sitemap",
      "total_emails_found": 2,
      "total_phones_found": 1
    }
  },
  "error_message": null
}
SpiderSite - Full CHAMP
Complete lead scoring with CHAMP framework:
{
  "success": true,
  "job_id": "770e8400-e29b-41d4-a716-446655440002",
  "type": "spiderSite",
  "status": "completed",
  "processing_time_seconds": 24.1,
  "worker_id": "spider-site-main-3",
  "completed_at": "2025-10-27T14:40:30Z",
  "message": null,
  "data": {
    "url": "https://techstart.com",
    "pages_crawled": 15,
    "crawl_status": "success",
    "emails": ["contact@techstart.com", "sales@techstart.com"],
    "phones": ["+1-555-987-6543"],
    "addresses": ["456 Tech Ave, Palo Alto, CA 94301"],
    "linkedin": "https://linkedin.com/company/techstart",
    "twitter": "https://twitter.com/techstart",
    "facebook": null,
    "instagram": "https://instagram.com/techstart",
    "youtube": "https://youtube.com/techstart",
    "github": "https://github.com/techstart",
    "tiktok": null,
    "pinterest": null,
    "medium": "https://medium.com/@techstart",
    "discord": null,
    "whatsapp": null,
    "telegram": null,
    "snapchat": null,
    "reddit": null,
    "markdown_compendium": "# TechStart Solutions\n\nAI-powered customer support...",
    "compendium": {
      "available": true,
      "chars": 52340,
      "cleanup_level": "fit",
      "storage_location": "spidermedia",
      "download_url": "https://media.spideriq.ai/client-xxx/compendiums/770e8400-e29b-41d4-a716-446655440002.md",
      "filename": "compendiums/770e8400-e29b-41d4-a716-446655440002.md",
      "size_bytes": 52340,
      "content_hash": "c3d4e5f6g7h8...",
      "estimated_tokens": 13085
    },
    "company_vitals": {
      "name": "TechStart Solutions",
      "summary": "AI-powered customer support automation for SaaS companies",
      "industry": "B2B SaaS",
      "services": ["AI Chatbots", "Support Ticket Automation", "Customer Analytics"],
      "target_audience": "Mid-market SaaS companies with 50-500 employees"
    },
    "pain_points": [
      "Struggling to scale customer support operations",
      "High ticket resolution times impacting customer satisfaction",
      "Manual ticket triage consuming 40% of support team time"
    ],
    "lead_scoring": {
      "icp_fit_score": 0.85,
      "champ_analysis": {
        "challenges": [
          "Manual ticket triage consuming 40% of support team time",
          "Customer complaints about slow response times increasing CSAT risk"
        ],
        "authority": "VP of Customer Success (Mike Chen) leads vendor selection, CEO approval required for deals >$50k",
        "money": "Series B funded ($20M raised in 2024), actively allocating budget for Q1 2025 operational efficiency tools",
        "prioritization": "High - support automation listed as top Q1 2025 initiative in recent blog post"
      }
    },
    "team_members": [
      {
        "name": "Sarah Johnson",
        "title": "CEO & Founder",
        "email": "sarah@techstart.com",
        "linkedin": "https://linkedin.com/in/sarahjohnson"
      },
      {
        "name": "Mike Chen",
        "title": "VP of Sales",
        "email": "mike@techstart.com",
        "linkedin": "https://linkedin.com/in/mikechen"
      }
    ],
    "personalization_hooks": {
      "company_name": "TechStart Solutions",
      "decision_maker": "Mike Chen (VP of Customer Success)",
      "key_challenge": "Manual ticket triage consuming 40% of support team time",
      "urgency_signal": "Q1 2025 initiative mentioned in recent blog",
      "personalization_angle": "Show how automation can free up 40% of support team capacity"
    },
    "metadata": {
      "spa_enabled": true,
      "sitemap_used": true,
      "browser_rendering_available": true,
      "crawl_strategy": "sitemap",
      "total_emails_found": 2,
      "total_phones_found": 1
    }
  },
  "error_message": null
}
SpiderMaps
Google Maps business listings:
{
  "success": true,
  "job_id": "880e8400-e29b-41d4-a716-446655440003",
  "type": "spiderMaps",
  "status": "completed",
  "processing_time_seconds": 32.5,
  "worker_id": "spider-maps-main-1",
  "completed_at": "2025-10-27T14:45:10Z",
  "message": null,
  "data": {
    "query": "italian restaurants in Boston",
    "results_count": 20,
    "businesses": [
      {
        "name": "Mamma Maria",
        "place_id": "0x89e3709876543210:0xabcdef1234567890",
        "rating": 4.6,
        "reviews_count": 823,
        "address": "3 North Square, Boston, MA 02113",
        "phone": "+1-617-523-9077",
        "website": "https://www.mammamaria.com/",
        "categories": ["Italian restaurant", "Fine dining"],
        "coordinates": {
          "latitude": 42.3647,
          "longitude": -71.0542
        },
        "link": "https://www.google.com/maps/place/Mamma+Maria/...",
        "business_status": "OPERATIONAL",
        "price_range": "$$$",
        "working_hours": {
          "Monday": "5:00-10:00 PM",
          "Tuesday": "5:00-10:00 PM",
          "Wednesday": "5:00-10:00 PM",
          "Thursday": "5:00-10:00 PM",
          "Friday": "5:00-11:00 PM",
          "Saturday": "5:00-11:00 PM",
          "Sunday": "5:00-10:00 PM"
        }
      }
    ],
    "metadata": {
      "max_results": 20,
      "extract_reviews": true,
      "extract_photos": false,
      "language": "en"
    }
  },
  "error_message": null
}
Processing (202)
Job still being processed:
{
  "success": false,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "spiderSite",
  "status": "processing",
  "processing_time_seconds": null,
  "worker_id": null,
  "completed_at": null,
  "message": "Job is still being processed. Please poll again in a few seconds.",
  "data": null,
  "error_message": null
}
Failed (410)
Job failed with error:
{
  "success": false,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "spiderSite",
  "status": "failed",
  "processing_time_seconds": 5.2,
  "worker_id": "spider-site-main-1",
  "completed_at": "2025-10-27T14:50:00Z",
  "message": null,
  "data": null,
  "error_message": "Failed to connect to target URL: Connection timeout after 30s"
}
Not Found (404)
Job ID does not exist:
{
  "detail": "Job not found"
}
Handling Different Status Codes
Python - Complete Flow:
import time

import requests


def get_job_results(job_id, auth_token, max_retries=60):
    """Get job results with automatic polling."""
    url = f"https://spideriq.ai/api/v1/jobs/{job_id}/results"
    headers = {"Authorization": f"Bearer {auth_token}"}

    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)

        if response.status_code == 200:
            # Success - return results
            return response.json()
        elif response.status_code == 202:
            # Still processing - wait and retry
            print(f"Job processing... (attempt {attempt + 1}/{max_retries})")
            time.sleep(3)
            continue
        elif response.status_code == 410:
            # Job failed or was cancelled - the reason is in error_message
            error_data = response.json()
            raise Exception(f"Job failed: {error_data.get('error_message')}")
        elif response.status_code == 404:
            raise Exception("Job not found")
        else:
            # Raises for other 4xx/5xx codes; anything else is unexpected
            response.raise_for_status()
            raise Exception(f"Unexpected status: {response.status_code}")

    raise TimeoutError("Job did not complete within maximum retries")


# Usage
try:
    results = get_job_results(
        "550e8400-e29b-41d4-a716-446655440000",
        "<your_token>"
    )
    print("Results:", results["data"])
except Exception as e:
    print(f"Error: {e}")
JavaScript - Complete Flow:
async function getJobResults(jobId, authToken, maxRetries = 60) {
  const url = `https://spideriq.ai/api/v1/jobs/${jobId}/results`;
  const headers = {
    'Authorization': `Bearer ${authToken}`
  };

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, { headers });

    if (response.status === 200) {
      // Success - return results
      return await response.json();
    }

    if (response.status === 202) {
      // Still processing - wait and retry
      console.log(`Job processing... (attempt ${attempt + 1}/${maxRetries})`);
      await new Promise(resolve => setTimeout(resolve, 3000));
      continue;
    }

    if (response.status === 410) {
      // Job failed or was cancelled - the reason is in error_message
      const error = await response.json();
      throw new Error(`Job failed: ${error.error_message}`);
    }

    if (response.status === 404) {
      throw new Error('Job not found');
    }

    throw new Error(`Unexpected status: ${response.status}`);
  }

  throw new Error('Job did not complete within maximum retries');
}

// Usage
try {
  const results = await getJobResults(
    '550e8400-e29b-41d4-a716-446655440000',
    '<your_token>'
  );
  console.log('Results:', results.data);
} catch (error) {
  console.error('Error:', error.message);
}
Data Storage
Screenshot Storage: SpiderSite job screenshots are stored in Cloudflare R2 and accessible via CDN at cdn.spideriq.ai. URLs are permanent and do not expire.
Best Practices
Don't poll too frequently: Respect the 100 requests/minute rate limit. Poll every 3-5 seconds for optimal balance between responsiveness and rate limit compliance.
Save job IDs: Store job IDs in your database to retrieve results later. Results remain available indefinitely.
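The polling guidance above implies a ceiling on how many jobs one client can poll at once. The arithmetic is sketched below, assuming the 100 requests/minute limit is shared by all polling requests from one client and that nothing else consumes it; the function name is hypothetical.

```python
RATE_LIMIT_PER_MINUTE = 100  # limit stated on this page


def max_concurrent_polls(interval_seconds):
    """Return how many jobs can be polled in parallel without exceeding the limit.

    Each job polled every `interval_seconds` costs 60 / interval_seconds
    requests per minute, so the ceiling is limit * interval / 60, rounded down.
    Assumes polling is the only traffic counted against the limit.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return int(RATE_LIMIT_PER_MINUTE * interval_seconds // 60)


print(max_concurrent_polls(3))
print(max_concurrent_polls(5))
```

Under these assumptions, a 3-second interval allows 5 jobs in parallel and a 5-second interval allows 8; polling more jobs than that calls for a longer interval.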