Error Handling & Troubleshooting
Overview
This guide covers common errors you may encounter when using the SpiderIQ API and how to handle them gracefully in your applications.
HTTP Status Codes
200 OK - Success
Job completed successfully and results are available.
{
  "success": true,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "spiderSite",
  "status": "completed",
  "data": { ... }
}
Action: Process the results
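As a minimal sketch of acting on a completed response, the dict below mirrors the example payload above; the contents of "data" are purely illustrative:

```python
# A completed-job response shaped like the example above;
# the "data" contents here are illustrative, not a real result.
result = {
    "success": True,
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "type": "spiderSite",
    "status": "completed",
    "data": {"title": "Example Domain"},
}

if result["success"] and result["status"] == "completed":
    data = result["data"]  # the scraped payload
    print(f"Job {result['job_id']} returned {len(data)} field(s)")
```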
201 Created - Job Submitted
Job was successfully submitted and queued for processing.
{
  "success": true,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "spiderSite",
  "status": "queued",
  "message": "Job submitted successfully"
}
Action: Save the job_id and poll for results
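A submission sketch might look like the following. `submit_and_save` is a hypothetical helper, and the endpoint path assumes the `/jobs/{job_type}/submit` pattern used later in this guide:

```python
import requests

def submit_and_save(url, token, job_type="spiderSite"):
    """Submit a job and return its job_id for later polling (sketch)."""
    response = requests.post(
        f"https://spideriq.ai/api/v1/jobs/{job_type}/submit",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        json={"url": url},
    )
    response.raise_for_status()  # raises on 4xx/5xx responses
    return response.json()["job_id"]  # persist this for the results endpoint
```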
202 Accepted - Job Processing
Job is still being processed. Results not yet available.
{
  "success": false,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "message": "Job is still being processed"
}
Action: Wait and poll again
Handling Example:
import time
import requests

def wait_for_job(job_id, headers, max_wait=120):
    """Poll for job completion with timeout"""
    start_time = time.time()
    while time.time() - start_time < max_wait:
        response = requests.get(
            f"https://spideriq.ai/api/v1/jobs/{job_id}/results",
            headers=headers
        )
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 202:
            print("⏳ Job still processing...")
            time.sleep(3)
        else:
            raise Exception(f"Error: {response.status_code}")
    raise TimeoutError("Job did not complete within timeout period")
400 Bad Request - Invalid Input
The request was malformed or contains invalid data.
Common causes:
Invalid URL Format
Error:
{
  "detail": "Invalid URL format. Please provide a valid HTTP/HTTPS URL."
}
Solution:
- Ensure the URL starts with http:// or https://
- Check for typos in the URL
- Validate the URL format before submitting
from urllib.parse import urlparse

def is_valid_url(url):
    try:
        result = urlparse(url)
        return all([result.scheme, result.netloc])
    except ValueError:
        return False

url = "https://example.com"
if is_valid_url(url):
    # Submit job
    pass
else:
    print("❌ Invalid URL format")
Missing Required Fields
Error:
{
  "detail": "Missing required field: url"
}
Solution: Ensure all required fields are present in your request:
data = {
    "url": "https://example.com",  # Required
    "instructions": "Extract..."   # Optional
}
Invalid Job Type
Error:
{
"detail": "Invalid job_type. Must be 'spiderSite' or 'spiderMaps'"
}
Solution: Use correct job type values:
# Correct
data = {"url": "...", "job_type": "spiderSite"}
data = {"url": "...", "job_type": "spiderMaps"}
# Incorrect
data = {"url": "...", "job_type": "scrape"} # ❌ Invalid
401 Unauthorized - Authentication Failed
Your credentials are missing, invalid, or malformed.
Error Response:
{
  "detail": "Invalid authentication token format. Expected: client_id:api_key:api_secret"
}
Common causes:
Missing Authorization Header
Ensure you're sending the Authorization header:
# Correct
headers = {
    "Authorization": "Bearer <your_token>"
}

# Incorrect - missing header
headers = {}
Malformed Token
SpiderIQ expects a three-part token format:
Authorization: Bearer client_id:api_key:api_secret
# Correct
token = "cli_abc123:sk_def456:secret_ghi789"
headers = {"Authorization": f"Bearer {token}"}
# Incorrect - missing parts
token = "cli_abc123:sk_def456" # ❌ Missing secret
Contact support if credentials are not working:
- Email: admin@spideriq.ai
- Include your client ID (do NOT send API key/secret)
Handling Example:
import os
import requests

def make_authenticated_request(url, data):
    """Make request with proper error handling"""
    headers = {
        "Authorization": f"Bearer {os.getenv('SPIDERIQ_TOKEN')}",
        "Content-Type": "application/json"
    }
    response = requests.post(url, headers=headers, json=data)
    if response.status_code == 401:
        raise Exception(
            "Authentication failed. Please check your credentials."
        )
    return response
403 Forbidden - Access Denied
Your account exists but is inactive or lacks permission.
Error Response:
{
  "detail": "Client account is inactive"
}
Action: Contact support at admin@spideriq.ai to reactivate your account
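A small guard like the following can surface this case early. This is a sketch, and `check_forbidden` is a hypothetical helper:

```python
def check_forbidden(response):
    """Raise a clear error on 403 responses (sketch)."""
    if response.status_code == 403:
        detail = response.json().get("detail", "Access denied")
        raise PermissionError(
            f"{detail} - contact admin@spideriq.ai to reactivate your account"
        )
    return response
```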
404 Not Found - Resource Doesn't Exist
The requested job ID doesn't exist.
Error Response:
{
  "detail": "Job not found"
}
Common causes:
- Typo in job ID
- Job ID from different environment
- Very old job that was cleaned up
Solution:
def get_job_results(job_id, headers):
    """Get results with 404 handling"""
    response = requests.get(
        f"https://spideriq.ai/api/v1/jobs/{job_id}/results",
        headers=headers
    )
    if response.status_code == 404:
        print(f"❌ Job {job_id} not found")
        return None
    return response.json()
410 Gone - Job Failed or Cancelled
The job has failed, been cancelled, or encountered an error during processing.
Error Response:
{
  "success": false,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "failed",
  "error": "Target URL is not accessible",
  "message": "Job failed during processing"
}
Common failure reasons:
URL Not Accessible
- Website is down
- URL is invalid or broken
- Site requires authentication
- Connection timeout
Scraping Blocked
- Site blocks bots
- Rate limiting by target site
- CAPTCHA protection
- IP blocked
Timeout
- Page took too long to load
- Large website with many pages
- Slow server response
Worker Error
- Internal processing error
- Resource constraints
- Unexpected page structure
Handling Example:
def handle_job_result(job_id, headers):
    """Handle all job result scenarios"""
    response = requests.get(
        f"https://spideriq.ai/api/v1/jobs/{job_id}/results",
        headers=headers
    )
    if response.status_code == 200:
        # Success
        return response.json()
    elif response.status_code == 202:
        # Still processing
        print("⏳ Job still processing...")
        return None
    elif response.status_code == 410:
        # Job failed
        error_data = response.json()
        print(f"❌ Job failed: {error_data.get('error', 'Unknown error')}")
        # Check if we should retry
        if "timeout" in error_data.get('error', '').lower():
            print("💡 Try submitting again with longer timeout")
        elif "not accessible" in error_data.get('error', '').lower():
            print("💡 Check if the URL is valid and publicly accessible")
        return None
    else:
        print(f"⚠️ Unexpected status: {response.status_code}")
        return None
429 Too Many Requests - Rate Limited
You've exceeded the rate limit (100 requests per minute).
Error Response:
{
  "detail": "Rate limit exceeded. Maximum 100 requests per minute."
}
Response Headers:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1698345678
Retry-After: 42
Handling with Exponential Backoff:
import time
import requests

def make_request_with_backoff(url, headers, data, max_retries=3):
    """Make request with exponential backoff on rate limits"""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=data)
        if response.status_code == 429:
            # Rate limited
            retry_after = int(response.headers.get('Retry-After', 60))
            if attempt < max_retries - 1:
                # Wait at least as long as Retry-After asks,
                # backing off further on repeated hits
                wait_time = max(retry_after, 2 ** attempt * 5)
                print(f"⏳ Rate limited. Waiting {wait_time}s before retry...")
                time.sleep(wait_time)
            else:
                raise Exception("Max retries exceeded")
        else:
            return response
    raise Exception("Request failed after all retries")
Best Practices for Rate Limiting:
Track your rate: Monitor the X-RateLimit-Remaining header to know how many requests you have left
Implement backoff: Always use exponential backoff when you hit rate limits
Batch wisely: Submit jobs in controlled batches (e.g., 10-20 at a time) rather than all at once
Respect Retry-After: Always check and respect the Retry-After header value
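The first two practices above can be combined into a proactive throttle. This is a sketch; the `floor` and `pause` values are illustrative defaults, not API requirements:

```python
import time

def throttle_if_low(response, floor=5, pause=10):
    """Pause before the limit is hit, based on X-RateLimit-Remaining (sketch)."""
    remaining = response.headers.get("X-RateLimit-Remaining")
    if remaining is not None and int(remaining) <= floor:
        print(f"Only {remaining} requests left this window; pausing {pause}s")
        time.sleep(pause)
```

Calling this after every response slows bursts down before the server has to answer with 429.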
500 Internal Server Error - Server Issue
An unexpected error occurred on the server.
Error Response:
{
  "detail": "Internal server error"
}
Action:
- Retry the request after a brief delay
- If error persists, contact support
- Include your job ID or request details when reporting
def make_request_with_retry(url, headers, data, max_retries=3):
    """Retry on server errors"""
    for attempt in range(max_retries):
        try:
            response = requests.post(url, headers=headers, json=data)
            if response.status_code == 500:
                if attempt < max_retries - 1:
                    wait_time = (attempt + 1) * 5
                    print(f"⚠️ Server error. Retrying in {wait_time}s...")
                    time.sleep(wait_time)
                else:
                    raise Exception("Server error persists after retries")
            else:
                return response
        except requests.exceptions.RequestException as e:
            print(f"❌ Request failed: {e}")
            if attempt < max_retries - 1:
                time.sleep(5)
            else:
                raise
    return None
Job Status Errors
Checking Job Status
Use the status endpoint to check if a job failed:
def check_job_status(job_id, headers):
    """Check current job status"""
    response = requests.get(
        f"https://spideriq.ai/api/v1/jobs/{job_id}/status",
        headers=headers
    )
    if response.status_code == 200:
        status_data = response.json()
        status = status_data['status']
        if status == 'completed':
            print("✓ Job completed successfully")
        elif status == 'processing':
            print("⏳ Job is being processed")
        elif status == 'queued':
            print("📋 Job is queued, waiting for worker")
        elif status == 'failed':
            print(f"❌ Job failed: {status_data.get('error')}")
        elif status == 'cancelled':
            print("⚠️ Job was cancelled")
        return status_data
    return None
Comprehensive Error Handler
Here's a complete error handling implementation:
import requests
import time
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class SpiderIQClient:
    def __init__(self, token):
        self.base_url = "https://spideriq.ai/api/v1"
        self.headers = {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json"
        }

    def submit_job(self, job_type, url, instructions=None, max_retries=3):
        """Submit job with error handling and retries"""
        data = {"url": url}
        if instructions:
            data["instructions"] = instructions
        endpoint = f"{self.base_url}/jobs/{job_type}/submit"
        for attempt in range(max_retries):
            try:
                response = requests.post(endpoint, headers=self.headers, json=data)
                if response.status_code == 201:
                    result = response.json()
                    logger.info(f"✓ Job submitted: {result['job_id']}")
                    return result['job_id']
                elif response.status_code == 400:
                    error = response.json()
                    logger.error(f"❌ Bad request: {error['detail']}")
                    return None  # Don't retry on client errors
                elif response.status_code == 401:
                    logger.error("❌ Authentication failed")
                    return None  # Don't retry on auth errors
                elif response.status_code == 429:
                    retry_after = int(response.headers.get('Retry-After', 60))
                    logger.warning(f"⏳ Rate limited. Waiting {retry_after}s...")
                    time.sleep(retry_after)
                elif response.status_code == 500:
                    if attempt < max_retries - 1:
                        wait = (attempt + 1) * 5
                        logger.warning(f"⚠️ Server error. Retrying in {wait}s...")
                        time.sleep(wait)
                    else:
                        logger.error("❌ Server error persists")
                        return None
            except requests.exceptions.Timeout:
                logger.warning(f"⏱️ Request timeout (attempt {attempt + 1})")
                if attempt < max_retries - 1:
                    time.sleep(5)
                else:
                    return None
            except requests.exceptions.ConnectionError:
                logger.warning(f"🔌 Connection error (attempt {attempt + 1})")
                if attempt < max_retries - 1:
                    time.sleep(5)
                else:
                    return None
        return None

    def get_results(self, job_id, max_wait=120, poll_interval=3):
        """Poll for results with timeout"""
        endpoint = f"{self.base_url}/jobs/{job_id}/results"
        start_time = time.time()
        while time.time() - start_time < max_wait:
            try:
                response = requests.get(endpoint, headers=self.headers)
                if response.status_code == 200:
                    logger.info(f"✓ Job {job_id} completed")
                    return response.json()
                elif response.status_code == 202:
                    logger.info(f"⏳ Job {job_id} still processing...")
                    time.sleep(poll_interval)
                elif response.status_code == 404:
                    logger.error(f"❌ Job {job_id} not found")
                    return None
                elif response.status_code == 410:
                    error_data = response.json()
                    logger.error(f"❌ Job {job_id} failed: {error_data.get('error')}")
                    return None
                else:
                    logger.warning(f"⚠️ Unexpected status: {response.status_code}")
                    return None
            except requests.exceptions.RequestException as e:
                logger.error(f"❌ Request error: {e}")
                time.sleep(poll_interval)
        logger.error(f"⏱️ Timeout waiting for job {job_id}")
        return None

# Usage
client = SpiderIQClient("<your_token>")

# Submit job
job_id = client.submit_job(
    job_type="spiderSite",
    url="https://example.com",
    instructions="Extract contact information"
)

if job_id:
    # Get results
    results = client.get_results(job_id)
    if results:
        print("Success!")
        print(results['data'])
    else:
        print("Job failed or timed out")
Debugging Tips
Enable Verbose Logging
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
# requests/urllib3 connection activity is now logged at DEBUG level
Inspect Response Headers
response = requests.post(url, headers=headers, json=data)
# Check rate limit status
print(f"Rate limit remaining: {response.headers.get('X-RateLimit-Remaining')}")
print(f"Rate limit resets at: {response.headers.get('X-RateLimit-Reset')}")
# Check response details
print(f"Status: {response.status_code}")
print(f"Body: {response.text}")
Test with System Health
Before submitting jobs, verify API connectivity:
import requests

def check_api_health():
    """Test API connectivity"""
    try:
        response = requests.get(
            "https://spideriq.ai/api/v1/system/health",
            timeout=5
        )
        if response.status_code == 200:
            health = response.json()
            print("✓ API is healthy")
            print(f"  Database: {health.get('database')}")
            print(f"  Queue: {health.get('queue')}")
            return True
        else:
            print(f"⚠️ API returned status {response.status_code}")
            return False
    except requests.exceptions.RequestException as e:
        print(f"❌ Cannot reach API: {e}")
        return False

# Run before submitting jobs
if check_api_health():
    # Proceed with job submission
    pass
Save Failed Requests
import json
from datetime import datetime

def log_failed_request(url, data, response):
    """Log failed requests for debugging"""
    timestamp = datetime.now().isoformat()
    log_entry = {
        "timestamp": timestamp,
        "url": url,
        "request_data": data,
        "status_code": response.status_code,
        "response": response.text
    }
    with open('failed_requests.log', 'a') as f:
        f.write(json.dumps(log_entry) + '\n')
    print("⚠️ Failed request logged to failed_requests.log")
When to Contact Support
Contact support at admin@spideriq.ai if:
- ✉️ Authentication errors persist after verifying credentials
- ✉️ Server errors (500) continue for extended periods
- ✉️ Jobs consistently fail with the same error
- ✉️ You need higher rate limits for your use case
- ✉️ You encounter unexpected behavior not covered in docs
Include in your support request:
- Your client ID (NOT your API key or secret)
- Job ID(s) if applicable
- Error messages received
- Steps to reproduce the issue
- Timestamp of when the issue occurred