
Error Handling & Troubleshooting

Overview

This guide covers common errors you may encounter when using the SpiderIQ API and how to handle them gracefully in your applications.

HTTP Status Codes

200 OK - Success

Job completed successfully and results are available.

{
  "success": true,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "spiderSite",
  "status": "completed",
  "data": { ... }
}

Action: Process the results


201 Created - Job Submitted

Job was successfully submitted and queued for processing.

{
  "success": true,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "spiderSite",
  "status": "queued",
  "message": "Job submitted successfully"
}

Action: Save the job_id and poll for results
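A minimal submission sketch, assuming the `/jobs/{job_type}/submit` endpoint used elsewhere in this guide (`build_submit_payload` is a hypothetical helper, not part of the API):

```python
def build_submit_payload(url, instructions=None):
    """Build the JSON body for a job submission request."""
    data = {"url": url}
    if instructions:
        data["instructions"] = instructions
    return data

payload = build_submit_payload("https://example.com", "Extract contact info")
# With the requests library and your auth headers:
# response = requests.post(
#     "https://spideriq.ai/api/v1/jobs/spiderSite/submit",
#     headers=headers, json=payload
# )
# job_id = response.json()["job_id"]  # save this for polling
```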


202 Accepted - Job Processing

Job is still being processed; results are not yet available.

{
  "success": false,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "message": "Job is still being processed"
}

Action: Wait and poll again

Handling Example:

import time
import requests

def wait_for_job(job_id, headers, max_wait=120):
    """Poll for job completion with timeout"""
    start_time = time.time()

    while time.time() - start_time < max_wait:
        response = requests.get(
            f"https://spideriq.ai/api/v1/jobs/{job_id}/results",
            headers=headers
        )

        if response.status_code == 200:
            return response.json()
        elif response.status_code == 202:
            print("⏳ Job still processing...")
            time.sleep(3)
        else:
            raise Exception(f"Error: {response.status_code}")

    raise TimeoutError("Job did not complete within timeout period")

400 Bad Request - Invalid Input

The request was malformed or contains invalid data.

Common causes:

Invalid URL Format

Error:

{
  "detail": "Invalid URL format. Please provide a valid HTTP/HTTPS URL."
}

Solution:

  • Ensure URL starts with http:// or https://
  • Check for typos in the URL
  • Validate URL format before submitting
from urllib.parse import urlparse

def is_valid_url(url):
    """Return True if the URL has both a scheme and a host."""
    try:
        result = urlparse(url)
        return all([result.scheme, result.netloc])
    except ValueError:
        return False

url = "https://example.com"
if is_valid_url(url):
    # Submit job
    pass
else:
    print("❌ Invalid URL format")
Missing Required Fields

Error:

{
  "detail": "Missing required field: url"
}

Solution: Ensure all required fields are present in your request:

data = {
    "url": "https://example.com",    # Required
    "instructions": "Extract..."     # Optional
}
Invalid Job Type

Error:

{
  "detail": "Invalid job_type. Must be 'spiderSite' or 'spiderMaps'"
}

Solution: Use correct job type values:

# Correct
data = {"url": "...", "job_type": "spiderSite"}
data = {"url": "...", "job_type": "spiderMaps"}

# Incorrect
data = {"url": "...", "job_type": "scrape"} # ❌ Invalid

401 Unauthorized - Authentication Failed

Your credentials are missing, invalid, or malformed.

Error Response:

{
  "detail": "Invalid authentication token format. Expected: client_id:api_key:api_secret"
}

Common causes:

Missing Authorization Header

Ensure you're sending the Authorization header:

# Correct
headers = {
    "Authorization": "Bearer <your_token>"
}

# Incorrect - missing header
headers = {}
Incorrect Token Format

SpiderIQ expects a three-part token format:

Authorization: Bearer client_id:api_key:api_secret
# Correct
token = "cli_abc123:sk_def456:secret_ghi789"
headers = {"Authorization": f"Bearer {token}"}

# Incorrect - missing parts
token = "cli_abc123:sk_def456" # ❌ Missing secret
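A quick client-side check can catch a malformed token before any request is sent (`is_valid_token_format` is a hypothetical helper, not part of the SpiderIQ SDK):

```python
def is_valid_token_format(token):
    """Check that a token has exactly three non-empty
    colon-separated parts: client_id:api_key:api_secret."""
    parts = token.split(":")
    return len(parts) == 3 and all(parts)

assert is_valid_token_format("cli_abc123:sk_def456:secret_ghi789")
assert not is_valid_token_format("cli_abc123:sk_def456")  # missing secret
```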
Expired or Invalid Credentials

Contact support if your credentials stop working after you have verified the token format.

Handling Example:

import os
import requests

def make_authenticated_request(url, data):
    """Make request with proper error handling"""
    headers = {
        "Authorization": f"Bearer {os.getenv('SPIDERIQ_TOKEN')}",
        "Content-Type": "application/json"
    }

    response = requests.post(url, headers=headers, json=data)

    if response.status_code == 401:
        raise Exception(
            "Authentication failed. Please check your credentials."
        )

    return response

403 Forbidden - Access Denied

Your account exists but is inactive or lacks permission.

Error Response:

{
  "detail": "Client account is inactive"
}

Action: Contact support at admin@spideriq.ai to reactivate your account


404 Not Found - Resource Doesn't Exist

The requested job ID doesn't exist.

Error Response:

{
  "detail": "Job not found"
}

Common causes:

  • Typo in job ID
  • Job ID from different environment
  • Very old job that was cleaned up

Solution:

def get_job_results(job_id, headers):
    """Get results with 404 handling"""
    response = requests.get(
        f"https://spideriq.ai/api/v1/jobs/{job_id}/results",
        headers=headers
    )

    if response.status_code == 404:
        print(f"❌ Job {job_id} not found")
        return None

    return response.json()

410 Gone - Job Failed or Cancelled

The job has failed, been cancelled, or encountered an error during processing.

Error Response:

{
  "success": false,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "failed",
  "error": "Target URL is not accessible",
  "message": "Job failed during processing"
}

Common failure reasons:

🔗 URL Not Accessible

  • Website is down
  • URL is invalid or broken
  • Site requires authentication
  • Connection timeout

🚫 Scraping Blocked

  • Site blocks bots
  • Rate limiting by target site
  • CAPTCHA protection
  • IP blocked

🕐 Timeout

  • Page took too long to load
  • Large website with many pages
  • Slow server response

🖥️ Worker Error

  • Internal processing error
  • Resource constraints
  • Unexpected page structure

Handling Example:

def handle_job_result(job_id, headers):
    """Handle all job result scenarios"""
    response = requests.get(
        f"https://spideriq.ai/api/v1/jobs/{job_id}/results",
        headers=headers
    )

    if response.status_code == 200:
        # Success
        return response.json()

    elif response.status_code == 202:
        # Still processing
        print("⏳ Job still processing...")
        return None

    elif response.status_code == 410:
        # Job failed
        error_data = response.json()
        print(f"❌ Job failed: {error_data.get('error', 'Unknown error')}")

        # Check if we should retry
        if "timeout" in error_data.get('error', '').lower():
            print("💡 Try submitting again with longer timeout")
        elif "not accessible" in error_data.get('error', '').lower():
            print("💡 Check if the URL is valid and publicly accessible")

        return None

    else:
        print(f"⚠️ Unexpected status: {response.status_code}")
        return None

429 Too Many Requests - Rate Limited

You've exceeded the rate limit (100 requests per minute).

Error Response:

{
  "detail": "Rate limit exceeded. Maximum 100 requests per minute."
}

Response Headers:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1698345678
Retry-After: 42

Handling with Exponential Backoff:

import time
import requests

def make_request_with_backoff(url, headers, data, max_retries=3):
    """Make request with exponential backoff on rate limits"""

    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=data)

        if response.status_code == 429:
            # Rate limited
            retry_after = int(response.headers.get('Retry-After', 60))

            if attempt < max_retries - 1:
                # Wait at least Retry-After, backing off further on repeated hits
                wait_time = max(retry_after, 2 ** attempt * 5)
                print(f"⏳ Rate limited. Waiting {wait_time}s before retry...")
                time.sleep(wait_time)
            else:
                raise Exception("Max retries exceeded")
        else:
            return response

    raise Exception("Request failed after all retries")

Best Practices for Rate Limiting:

  • Track your rate: Monitor the X-RateLimit-Remaining header to know how many requests you have left
  • Implement backoff: Always use exponential backoff when you hit rate limits
  • Batch wisely: Submit jobs in controlled batches (e.g., 10-20 at a time) rather than all at once
  • Respect Retry-After: Always check and respect the Retry-After header value
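These practices can be combined into a small header-driven throttle. The header names match the 429 response shown above; the fallback logic is a sketch, not an official client:

```python
import time

def seconds_until_reset(headers, now=None):
    """How long to pause, based on rate-limit response headers.
    Prefers Retry-After; falls back to X-RateLimit-Reset (epoch seconds)."""
    if "Retry-After" in headers:
        return int(headers["Retry-After"])
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        now = now if now is not None else time.time()
        return max(0, int(reset) - int(now))
    return 0

def should_pause(headers, min_remaining=5):
    """Pause proactively when few requests remain in the current window."""
    remaining = int(headers.get("X-RateLimit-Remaining", min_remaining + 1))
    return remaining <= min_remaining
```

Call `should_pause(response.headers)` after each request and sleep for `seconds_until_reset(response.headers)` when it returns True, instead of waiting to hit a hard 429.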


500 Internal Server Error - Server Issue

An unexpected error occurred on the server.

Error Response:

{
  "detail": "Internal server error"
}

Action:

  • Retry the request after a brief delay
  • If error persists, contact support
  • Include your job ID or request details when reporting
import time
import requests

def make_request_with_retry(url, headers, data, max_retries=3):
    """Retry on server errors"""

    for attempt in range(max_retries):
        try:
            response = requests.post(url, headers=headers, json=data)

            if response.status_code == 500:
                if attempt < max_retries - 1:
                    wait_time = (attempt + 1) * 5
                    print(f"⚠️ Server error. Retrying in {wait_time}s...")
                    time.sleep(wait_time)
                else:
                    raise Exception("Server error persists after retries")
            else:
                return response

        except requests.exceptions.RequestException as e:
            print(f"❌ Request failed: {e}")
            if attempt < max_retries - 1:
                time.sleep(5)
            else:
                raise

    return None

Job Status Errors

Checking Job Status

Use the status endpoint to check if a job failed:

def check_job_status(job_id, headers):
    """Check current job status"""
    response = requests.get(
        f"https://spideriq.ai/api/v1/jobs/{job_id}/status",
        headers=headers
    )

    if response.status_code == 200:
        status_data = response.json()
        status = status_data['status']

        if status == 'completed':
            print("✓ Job completed successfully")
        elif status == 'processing':
            print("⏳ Job is being processed")
        elif status == 'queued':
            print("📋 Job is queued, waiting for worker")
        elif status == 'failed':
            print(f"❌ Job failed: {status_data.get('error')}")
        elif status == 'cancelled':
            print("⚠️ Job was cancelled")

        return status_data

    return None

Comprehensive Error Handler

Here's a complete error handling implementation:

import requests
import time
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class SpiderIQClient:
    def __init__(self, token):
        self.base_url = "https://spideriq.ai/api/v1"
        self.headers = {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json"
        }

    def submit_job(self, job_type, url, instructions=None, max_retries=3):
        """Submit job with error handling and retries"""

        data = {"url": url}
        if instructions:
            data["instructions"] = instructions

        endpoint = f"{self.base_url}/jobs/{job_type}/submit"

        for attempt in range(max_retries):
            try:
                response = requests.post(endpoint, headers=self.headers, json=data)

                if response.status_code == 201:
                    result = response.json()
                    logger.info(f"✓ Job submitted: {result['job_id']}")
                    return result['job_id']

                elif response.status_code == 400:
                    error = response.json()
                    logger.error(f"❌ Bad request: {error['detail']}")
                    return None  # Don't retry on client errors

                elif response.status_code == 401:
                    logger.error("❌ Authentication failed")
                    return None  # Don't retry on auth errors

                elif response.status_code == 429:
                    retry_after = int(response.headers.get('Retry-After', 60))
                    logger.warning(f"⏳ Rate limited. Waiting {retry_after}s...")
                    time.sleep(retry_after)

                elif response.status_code == 500:
                    if attempt < max_retries - 1:
                        wait = (attempt + 1) * 5
                        logger.warning(f"⚠️ Server error. Retrying in {wait}s...")
                        time.sleep(wait)
                    else:
                        logger.error("❌ Server error persists")
                        return None

            except requests.exceptions.Timeout:
                logger.warning(f"⏱️ Request timeout (attempt {attempt + 1})")
                if attempt < max_retries - 1:
                    time.sleep(5)
                else:
                    return None

            except requests.exceptions.ConnectionError:
                logger.warning(f"🔌 Connection error (attempt {attempt + 1})")
                if attempt < max_retries - 1:
                    time.sleep(5)
                else:
                    return None

        return None

    def get_results(self, job_id, max_wait=120, poll_interval=3):
        """Poll for results with timeout"""

        endpoint = f"{self.base_url}/jobs/{job_id}/results"
        start_time = time.time()

        while time.time() - start_time < max_wait:
            try:
                response = requests.get(endpoint, headers=self.headers)

                if response.status_code == 200:
                    logger.info(f"✓ Job {job_id} completed")
                    return response.json()

                elif response.status_code == 202:
                    logger.info(f"⏳ Job {job_id} still processing...")
                    time.sleep(poll_interval)

                elif response.status_code == 404:
                    logger.error(f"❌ Job {job_id} not found")
                    return None

                elif response.status_code == 410:
                    error_data = response.json()
                    logger.error(f"❌ Job {job_id} failed: {error_data.get('error')}")
                    return None

                else:
                    logger.warning(f"⚠️ Unexpected status: {response.status_code}")
                    return None

            except requests.exceptions.RequestException as e:
                logger.error(f"❌ Request error: {e}")
                time.sleep(poll_interval)

        logger.error(f"⏱️ Timeout waiting for job {job_id}")
        return None

# Usage
client = SpiderIQClient("<your_token>")

# Submit job
job_id = client.submit_job(
    job_type="spiderSite",
    url="https://example.com",
    instructions="Extract contact information"
)

if job_id:
    # Get results
    results = client.get_results(job_id)

    if results:
        print("Success!")
        print(results['data'])
    else:
        print("Job failed or timed out")

Debugging Tips

Enable Verbose Logging

import logging

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

# Now all requests will be logged
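For even more detail, the raw HTTP traffic can be surfaced too. This is a common Python idiom for the stdlib HTTP layer used by requests, not anything SpiderIQ-specific:

```python
import logging
import http.client

# Print raw request/response lines from the stdlib HTTP layer
http.client.HTTPConnection.debuglevel = 1

logging.basicConfig(level=logging.DEBUG)
# urllib3 (the transport used by requests) logs connection details at DEBUG
logging.getLogger("urllib3").setLevel(logging.DEBUG)
```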

Inspect Response Headers

response = requests.post(url, headers=headers, json=data)

# Check rate limit status
print(f"Rate limit remaining: {response.headers.get('X-RateLimit-Remaining')}")
print(f"Rate limit resets at: {response.headers.get('X-RateLimit-Reset')}")

# Check response details
print(f"Status: {response.status_code}")
print(f"Body: {response.text}")

Test with System Health

Before submitting jobs, verify API connectivity:

def check_api_health():
    """Test API connectivity"""
    try:
        response = requests.get(
            "https://spideriq.ai/api/v1/system/health",
            timeout=5
        )

        if response.status_code == 200:
            health = response.json()
            print("✓ API is healthy")
            print(f"  Database: {health.get('database')}")
            print(f"  Queue: {health.get('queue')}")
            return True
        else:
            print(f"⚠️ API returned status {response.status_code}")
            return False

    except requests.exceptions.RequestException as e:
        print(f"❌ Cannot reach API: {e}")
        return False

# Run before submitting jobs
if check_api_health():
    # Proceed with job submission
    pass

Save Failed Requests

import json
from datetime import datetime

def log_failed_request(url, data, response):
    """Log failed requests for debugging"""
    log_entry = {
        "timestamp": datetime.now().isoformat(),
        "url": url,
        "request_data": data,
        "status_code": response.status_code,
        "response": response.text
    }

    with open('failed_requests.log', 'a') as f:
        f.write(json.dumps(log_entry) + '\n')

    print("⚠️ Failed request logged to failed_requests.log")

When to Contact Support

Contact support at admin@spideriq.ai if:

  • ✉️ Authentication errors persist after verifying credentials
  • ✉️ Server errors (500) continue for extended periods
  • ✉️ Jobs consistently fail with the same error
  • ✉️ You need higher rate limits for your use case
  • ✉️ You encounter unexpected behavior not covered in docs

Include in your support request:

  • Your client ID (NOT your API key or secret)
  • Job ID(s) if applicable
  • Error messages received
  • Steps to reproduce the issue
  • Timestamp of when the issue occurred

Next Steps