Guides Overview
Welcome to SpiderIQ Guides
This section provides comprehensive guides and tutorials to help you get the most out of SpiderIQ's web scraping, Google Maps extraction, email verification, and lead enrichment capabilities.
What is SpiderIQ?
SpiderIQ is a high-performance API service that provides six specialized capabilities:
SpiderSite
Website Scraping
Extract content from any website using the Crawl4AI library with optional AI-powered data extraction.
- Full-page markdown conversion
- AI-powered content extraction
- Screenshot capture
- Metadata extraction
SpiderMaps
Google Maps Scraping
Extract business information from Google Maps using Playwright browser automation.
- Business details (name, address, phone)
- Reviews and ratings
- Business hours
- Categories and photos
- Campaign System (v2.14.0): Multi-location orchestration
SpiderVerify
Email Verification
Verify email addresses at the SMTP level without sending actual emails.
- Deliverability checking
- Disposable email detection
- Role account identification
- Quality scoring (0-100)
SpiderPeople
Decision Maker Discovery (v2.17.0)
Find the right people behind companies using ICP-based search.
- Natural language search by role + location
- Profile lookup by LinkedIn URL
- AI research reports
- Experience & education data
SpiderBrowser
Anti-Detect Browser Management (v2.29.0)
Manage persistent authenticated browser sessions at scale with Camoufox.
- C++-level fingerprint spoofing
- VNC web access for manual login/CAPTCHA
- Cookie export (Netscape format for yt-dlp)
- SpiderProxy mobile IP integration
- Profile warmup automation
SpiderCompanyData
Company Data Enrichment (v2.36.0)
Enrich leads with official company data from government registries.
- US SEC EDGAR (public companies)
- UK Companies House (5M+ companies)
- EU VIES VAT validation
- Directors and officers data
- 24-hour caching
Quick Links
Getting Started
Submit your first job in 5 minutes
Authentication
Learn about API authentication
API Reference
Complete API documentation
Available Guides
Scraping Websites
Complete guide to website scraping with SpiderSite
Google Maps
Extract business data from Google Maps
Email Verification
Verify emails via SMTP without sending
People Research
Research LinkedIn profiles with AI insights
Browser Automation
Manage anti-detect browser profiles with VNC access, automated login, and cookie export
Company Data
Enrich leads with official company data from US, UK, and EU registries
v2.18.0: SpiderFuzzer Deduplication
SpiderFuzzer Deduplication
Automatic record deduplication across all job types with per-client data isolation.
- Per-Record Unique Flag: Each record is marked with fuzziq_unique: true/false
- Standalone API: Check and manage records via /api/v1/fuzziq/* endpoints
- Response Filtering: Use fuzziq_unique_only: true to return only new records
- Isolated Schemas: Separate PostgreSQL schemas per client for complete data isolation
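The per-record fuzziq_unique flag and the fuzziq_unique_only request option described above can be used either server-side or client-side. A minimal sketch, assuming an illustrative record shape and job fields (the exact payload schema is not shown on this page):

```python
def only_unique(records):
    """Keep only the records FuzzIQ marked as unseen for this client."""
    return [r for r in records if r.get("fuzziq_unique") is True]

# Alternatively, ask the server to filter for you by setting the
# response-filtering option on the job payload (other fields are
# hypothetical placeholders):
payload = {
    "query": "coffee shops in Austin",  # hypothetical job field
    "fuzziq_unique_only": True,         # return only new records
}

records = [
    {"name": "Acme Coffee", "fuzziq_unique": True},
    {"name": "Acme Coffee", "fuzziq_unique": False},  # duplicate seen before
]
print(only_unique(records))  # → [{'name': 'Acme Coffee', 'fuzziq_unique': True}]
```

Server-side filtering saves bandwidth on large result sets; client-side filtering lets you keep duplicates around for auditing.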
v2.15.0: Orchestrated Campaigns
Orchestrated Campaigns
Chain SpiderMaps + SpiderSite + SpiderVerify in a single workflow
Xano Integration
Build lead gen systems with Xano no-code backend
n8n Integration
Automate campaigns with n8n workflow automation
Common Use Cases
Content Aggregation
Extract articles, blog posts, and documentation from multiple sources for content analysis or aggregation platforms.
Example: News monitoring, competitor content analysis, research aggregation
E-commerce Data
Scrape product information, prices, and reviews from e-commerce sites for price monitoring or market research.
Example: Price comparison tools, inventory monitoring, product catalog building
Local Business Research
Extract business information from Google Maps for lead generation, market research, or directory creation.
Example: B2B prospecting, competitive analysis, local SEO research
Real Estate & Property Data
Gather property listings, prices, and details for real estate analysis and market trends.
Example: Property aggregators, market analysis tools, investment research
Job Board Aggregation
Collect job postings from multiple sources to create comprehensive job search platforms.
Example: Job aggregators, salary analysis, hiring trend research
How SpiderIQ Works
Processing Flow
- Submit - Client submits a job via API
- Queue - Job is queued for processing
- Process - Available worker picks up and processes the job
- Store - Results are saved (screenshots to Cloudflare R2, data to Database)
- Retrieve - Client polls for results and receives data
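The submit-and-poll flow above can be sketched as a small helper. This is illustrative client code, not the official SDK: the status values ("completed", "failed") and the idea of a fetch callable wrapping a GET on the job's result endpoint are assumptions, and the 2-5 second interval follows the polling guidance later on this page.

```python
import time

def poll_for_result(fetch_status, interval=3.0, timeout=120.0, sleep=time.sleep):
    """Poll until a submitted job finishes or the timeout expires.

    fetch_status is any callable returning the job's current status
    dict, e.g. a wrapper around a GET on the job's result endpoint.
    """
    waited = 0.0
    while waited < timeout:
        job = fetch_status()
        if job["status"] in ("completed", "failed"):
            return job
        sleep(interval)
        waited += interval
    raise TimeoutError("job did not finish within %.0f s" % timeout)

# Hypothetical usage with an HTTP client of your choice:
# job_id = submit_job(...)["job_id"]
# result = poll_for_result(lambda: get_job_status(job_id))
```

Injecting the fetch callable keeps the retry logic reusable across all job types (SpiderSite, SpiderMaps, SpiderVerify, and so on).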
Architecture
SpiderIQ is built on a scalable, distributed architecture:
- API Gateway - FastAPI-based REST API
- Message Queue - Job distribution system
- Workers - Distributed scraping workers (Docker containers)
- Database - Database for job metadata and results
- Cache - Redis for performance optimization
- CDN Storage - Cloudflare R2 for screenshots
Worker Types
- SpiderSite Workers - 70 workers for website scraping
- SpiderMaps Workers - 42 workers for Google Maps scraping
- SpiderVerify Workers - 10 workers for email verification
- SpiderPeople Workers - 1 worker for LinkedIn research
- SpiderBrowser Workers - 1 worker for anti-detect browser management
Performance & Limits
Rate Limits
Standard Rate Limit: 100 requests per minute per client
A burst allowance of 20 requests covers occasional spikes. Contact us for higher limits.
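To stay under the limit proactively rather than reacting to 429s, you can throttle on your side. A minimal client-side token-bucket sketch tuned to the numbers above (100 requests/minute refill, burst capacity of 20); how the server enforces its limit internally is not specified here:

```python
import time

class TokenBucket:
    """Client-side throttle: ~100 requests/min refill, burst of 20."""

    def __init__(self, rate_per_min=100, burst=20, now=time.monotonic):
        self.rate = rate_per_min / 60.0   # tokens added per second
        self.capacity = burst
        self.tokens = float(burst)
        self.now = now
        self.last = now()

    def try_acquire(self):
        """Take one token if available; return False when throttled."""
        t = self.now()
        self.tokens = min(self.capacity, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

When try_acquire returns False, wait briefly and retry instead of sending the request and burning a 429 against your limit.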
Processing Times
| Job Type | Average Time | Range |
|---|---|---|
| SpiderSite (simple page) | 5-15s | 3-30s |
| SpiderSite (with AI) | 10-25s | 5-45s |
| SpiderMaps | 3-8s | 2-15s |
| SpiderVerify (single) | 2-5s | 1-10s |
| SpiderVerify (bulk 100) | 30-60s | 20-120s |
| SpiderPeople (profile) | 5-10s | 3-15s |
| SpiderPeople (search) | 5-15s | 3-20s |
| SpiderPeople (research) | 15-30s | 10-45s |
Queue Capacity
- Normal load: < 20 jobs queued
- Moderate load: 20-50 jobs queued
- High load: > 50 jobs queued
Use the Queue Stats endpoint to monitor current load.
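The load bands above translate directly into a pre-submit check. A sketch, assuming the Queue Stats response exposes a queued-job count (the exact field name is an assumption):

```python
def queue_load(queued_jobs):
    """Map a queued-job count to the load bands documented above."""
    if queued_jobs < 20:
        return "normal"
    if queued_jobs <= 50:
        return "moderate"
    return "high"

# Hypothetical gate before a bulk submission:
# stats = http_get("/system/queue-stats")   # field name "queued" assumed
# if queue_load(stats["queued"]) == "high":
#     defer_bulk_submission()
```

Deferring bulk submissions under high load keeps your individual jobs inside the average processing times listed in the table above.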
Best Practices
Poll efficiently: Use 2-5 second intervals when polling for results to balance responsiveness and rate limit compliance.
Handle rate limits: Implement exponential backoff when you receive 429 (Too Many Requests) responses.
Check queue load: Use /system/queue-stats before submitting bulk jobs to avoid overwhelming the queue.
Store job IDs: Save job IDs in your database to retrieve results later if needed.
Respect robots.txt: While SpiderIQ can scrape most sites, ensure you have permission and respect robots.txt directives.
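The backoff advice above can be sketched as a small retry wrapper. The 429 status code is the one this page says to expect; the send callable and response shape are placeholders for whatever HTTP client you use:

```python
import random
import time

def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff with full jitter: ~1s, 2s, 4s, ... capped at 60s."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def with_retries(send, max_attempts=5, sleep=time.sleep):
    """Retry `send` (a zero-arg callable returning a response dict)
    whenever the API answers 429 Too Many Requests."""
    for attempt in range(max_attempts):
        response = send()
        if response["status_code"] != 429:
            return response
        sleep(backoff_delay(attempt))
    raise RuntimeError("still rate limited after %d attempts" % max_attempts)
```

Full jitter (a random delay up to the exponential cap) avoids synchronized retry storms when many of your workers hit the limit at once.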
Need Help?
API Reference
Complete API documentation with all endpoints
Support
Contact our support team
System Status
Check API health and queue stats
Get API Access
Request API credentials
Next Steps
Contact admin@spideriq.ai to get your API credentials
Follow our 5-minute quickstart guide to submit your first job
Learn about website scraping, then use the API reference to build your integration