Landing Page Capture
Overview
SpiderLanding captures landing pages from URLs (typically Facebook/Google ad tracking links) and provides:
- Screenshots: Above-fold and full-page captures
- HTML Bundles: Self-contained HTML with embedded assets
- AI Extraction: Marketing content analysis via Claude
- Redirect Tracking: Full chain from tracking URL to final page
Use Cases:
- Competitor ad research and landing page analysis
- Marketing funnel documentation
- A/B test monitoring
- Archiving campaign landing pages before they change
Quick Start
1. Submit a Landing Page Capture
curl -X POST "https://spideriq.ai/api/v1/jobs/spiderLanding/submit" \
-H "Authorization: Bearer $CLIENT_ID:$API_KEY:$API_SECRET" \
-H "Content-Type: application/json" \
-d '{
"payload": {
"url": "https://example.com/landing"
}
}'
2. Get Results
curl "https://spideriq.ai/api/v1/jobs/{job_id}/results" \
-H "Authorization: Bearer $CLIENT_ID:$API_KEY:$API_SECRET"
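The two curl calls above can be mirrored in Python with only the standard library. This is a minimal sketch, not an official client: the helper names `auth_header`, `build_submit_request`, and `build_results_request` are hypothetical. The code only constructs the requests, because the shape of the submit response (for example, where the job ID appears) is not shown in this guide.

```python
import json
import urllib.request

BASE_URL = "https://spideriq.ai/api/v1"


def auth_header(client_id: str, api_key: str, api_secret: str) -> str:
    """Format the Bearer credential the same way the curl examples do."""
    return f"Bearer {client_id}:{api_key}:{api_secret}"


def build_submit_request(url: str, authorization: str) -> urllib.request.Request:
    """Build (but do not send) the POST that submits a capture job."""
    body = json.dumps({"payload": {"url": url}}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/jobs/spiderLanding/submit",
        data=body,
        headers={"Authorization": authorization, "Content-Type": "application/json"},
        method="POST",
    )


def build_results_request(job_id: str, authorization: str) -> urllib.request.Request:
    """Build (but do not send) the GET that fetches a job's results."""
    return urllib.request.Request(
        f"{BASE_URL}/jobs/{job_id}/results",
        headers={"Authorization": authorization},
        method="GET",
    )
```

Either request can then be passed to `urllib.request.urlopen` to send it.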
3. Access Your Files
Results include public URLs for:
- screenshot_fold_url: Above-fold screenshot (PNG)
- screenshot_full_url: Full-page screenshot (PNG)
- html_bundle_url: Self-contained HTML file
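Pulling the three file URLs out of a results response might look like the sketch below. It assumes the fields sit under a top-level `data` object (as the `errors` array does in the Error Handling example); adjust the lookup if your responses are shaped differently. The function name `file_urls` is hypothetical.

```python
from typing import Any, Dict, Optional

FILE_URL_KEYS = ("screenshot_fold_url", "screenshot_full_url", "html_bundle_url")


def file_urls(results: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Return the three file URLs, using None for any that are absent or null."""
    # Assumption: the URL fields live under "data" in the results response.
    data = results.get("data") or {}
    return {key: data.get(key) for key in FILE_URL_KEYS}
```

A `None` value means that file was not produced, either because the feature was disabled or because that capture step failed.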
What You Get
1. Screenshots
Above-Fold Screenshot
Captures exactly what users see when the page loads (viewport only).
File: fold.png
Default Size: 1440x900px
Full-Page Screenshot
Captures the entire page height, including content below the fold.
File: full.png
Height: Varies by page
2. HTML Bundle
The HTML bundle is a self-contained file with all assets embedded:
- CSS stylesheets (inline)
- JavaScript files (inline)
- Images (base64 data URIs)
- Fonts (embedded)
File Sizes: Simple pages produce 5-50 KB bundles. Complex marketing pages can reach 10-50 MB because of embedded images.
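To see why embedded images dominate bundle size, it helps to look at what a base64 data URI costs. The sketch below is illustrative only and is not SpiderLanding's bundler: it encodes raw bytes as a data URI and computes the encoded length, which is always 4 characters for every 3 input bytes (rounded up).

```python
import base64
import math


def data_uri(raw: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URI, the form used for inlined images."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def base64_length(n_bytes: int) -> int:
    """Length in characters of the base64 encoding of n_bytes of input."""
    return 4 * math.ceil(n_bytes / 3)


# A 300,000-byte image becomes 400,000 characters of text once inlined.
assert base64_length(300_000) == 400_000
```

Every inlined image therefore grows by roughly one third, which adds up quickly on image-heavy marketing pages.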
3. AI-Extracted Content
When extract_content is enabled, Claude AI analyzes the page and extracts:
{
"extracted": {
"meta": {
"title": "Page Title",
"description": "Meta description",
"og_image": "https://..."
},
"content": {
"headline": "Main H1 Headline",
"subheadline": "Supporting text",
"cta_primary": {
"text": "Start Free Trial",
"url": "https://...",
"style": "button"
},
"value_propositions": [
"Benefit 1",
"Benefit 2"
],
"testimonials": [
{
"quote": "Great product!",
"author": "John",
"title": "CEO"
}
],
"trust_signals": [
"10,000+ customers",
"Featured in Forbes"
]
},
"design": {
"layout_type": "hero_features_testimonials_cta",
"color_palette": ["#FF5733", "#1A1A1A"],
"font_families": ["Inter", "system-ui"],
"has_video": false,
"has_chat_widget": true,
"form_fields": ["email", "name"]
}
}
}
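A short sketch of reading this structure defensively: `extracted` can be null when AI extraction fails or is disabled, and individual fields may be missing on sparse pages. The function name `summarize_extracted` and the summary keys it returns are hypothetical, chosen for illustration.

```python
from typing import Any, Dict, Optional


def summarize_extracted(extracted: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Flatten the most commonly used extraction fields into one dict."""
    if not extracted:
        # Extraction was disabled or failed, so there is nothing to read.
        return {"available": False}
    content = extracted.get("content") or {}
    cta = content.get("cta_primary") or {}
    design = extracted.get("design") or {}
    return {
        "available": True,
        "headline": content.get("headline"),
        "cta_text": cta.get("text"),
        "cta_url": cta.get("url"),
        "value_proposition_count": len(content.get("value_propositions") or []),
        "form_fields": list(design.get("form_fields") or []),
    }
```

Pass it the value of the `extracted` key from the example above.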
4. Redirect Chain
Track the full journey from ad click to landing page:
{
"capture": {
"initial_url": "https://tracking.example.com/ad?id=123",
"final_url": "https://landing-page.com/offer",
"redirect_chain": [
"https://tracking.example.com/ad?id=123",
"https://tracking.example.com/redirect/abc",
"https://landing-page.com/offer"
]
}
}
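Given a `capture` object like the one above, a few lines of Python can summarize the chain: how many hops it took and which hosts were involved. This is a sketch; `redirect_summary` is a hypothetical helper, not part of the API.

```python
from typing import Any, Dict, List
from urllib.parse import urlsplit


def redirect_summary(capture: Dict[str, Any]) -> Dict[str, Any]:
    """Count hops and list the hosts visited, collapsing consecutive repeats."""
    chain: List[str] = capture.get("redirect_chain") or []
    hosts: List[str] = []
    for url in chain:
        host = urlsplit(url).hostname or ""
        if not hosts or hosts[-1] != host:
            hosts.append(host)
    return {
        "hops": max(len(chain) - 1, 0),  # a chain of N URLs has N-1 redirects
        "hosts": hosts,
        "crossed_domains": len(set(hosts)) > 1,
    }
```

For the example chain this reports 2 hops across `tracking.example.com` and `landing-page.com`.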
Configuration Options
Viewport Size
Customize the browser viewport for different device simulations:
{
"payload": {
"url": "https://example.com",
"options": {
"viewport": {
"width": 1920,
"height": 1080
}
}
}
}
Common Viewports:
| Device | Width | Height |
|---|---|---|
| Desktop (default) | 1440 | 900 |
| Desktop HD | 1920 | 1080 |
| Laptop | 1366 | 768 |
| Tablet | 768 | 1024 |
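If you capture the same page at several sizes, the table above can be kept as a lookup and turned into payloads. A minimal sketch follows; the preset keys (`desktop`, `desktop_hd`, and so on) and the function name are hypothetical labels for this example, and only the width and height values come from the table.

```python
from typing import Any, Dict

# Width and height values from the Common Viewports table; key names are illustrative.
VIEWPORTS = {
    "desktop": (1440, 900),
    "desktop_hd": (1920, 1080),
    "laptop": (1366, 768),
    "tablet": (768, 1024),
}


def viewport_payload(url: str, device: str) -> Dict[str, Any]:
    """Build a submit body that requests the viewport for the named preset."""
    width, height = VIEWPORTS[device]  # raises KeyError for an unknown preset
    return {
        "payload": {
            "url": url,
            "options": {"viewport": {"width": width, "height": height}},
        }
    }
```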
Feature Toggles
Disable features you don't need to speed up processing:
{
"payload": {
"url": "https://example.com",
"options": {
"capture_screenshot": true,
"capture_full_page": true,
"capture_html_bundle": false,
"extract_content": false,
"dismiss_popups": false,
"scroll_for_lazy_load": false
}
}
}
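One way to avoid depending on server-side defaults (which this guide does not list) is to send every toggle explicitly. The sketch below builds an `options` dict from the names you want enabled and sets the rest to false. `toggle_options` is a hypothetical helper, not part of the API.

```python
from typing import Dict

TOGGLES = (
    "capture_screenshot",
    "capture_full_page",
    "capture_html_bundle",
    "extract_content",
    "dismiss_popups",
    "scroll_for_lazy_load",
)


def toggle_options(*enabled: str) -> Dict[str, bool]:
    """Return all six toggles, true only for the names passed in."""
    unknown = set(enabled) - set(TOGGLES)
    if unknown:
        raise ValueError(f"unknown toggle(s): {sorted(unknown)}")
    return {name: name in enabled for name in TOGGLES}
```

Calling `toggle_options("capture_screenshot", "capture_full_page")` reproduces the options shown in the example above.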
Timeouts
Adjust timeouts for slow-loading pages:
{
"payload": {
"url": "https://slow-page.com",
"options": {
"timeout_seconds": 90,
"max_redirects": 15
}
}
}
Common Patterns
Competitor Landing Page Research
Capture competitor landing pages for analysis:
curl -X POST "https://spideriq.ai/api/v1/jobs/spiderLanding/submit" \
-H "Authorization: Bearer $CLIENT_ID:$API_KEY:$API_SECRET" \
-H "Content-Type: application/json" \
-d '{
"payload": {
"url": "https://competitor.com/pricing",
"options": {
"capture_screenshot": true,
"capture_full_page": true,
"capture_html_bundle": true,
"extract_content": true
}
}
}'
Facebook Ad Tracking URLs
Capture landing pages from Facebook ad tracking links:
curl -X POST "https://spideriq.ai/api/v1/jobs/spiderLanding/submit" \
-H "Authorization: Bearer $CLIENT_ID:$API_KEY:$API_SECRET" \
-H "Content-Type: application/json" \
-d '{
"payload": {
"url": "https://l.facebook.com/l.php?u=https%3A%2F%2Fexample.com",
"ad_id": "fb_123456789"
}
}'
The ad_id field lets you correlate captures with your ad tracking system.
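Before submitting, it can be useful to preview where a Facebook link shim points. In the example URL above the destination is carried, percent-encoded, in the `u` query parameter; the sketch below decodes it. Treat this as a convenience check only: real tracking links may add further redirects, so the captured redirect chain remains the authoritative record. `facebook_target` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def facebook_target(tracking_url: str) -> Optional[str]:
    """Return the decoded `u` parameter of an l.facebook.com link, if present."""
    parts = urlsplit(tracking_url)
    if parts.hostname != "l.facebook.com":
        return None
    values = parse_qs(parts.query).get("u")  # parse_qs percent-decodes values
    return values[0] if values else None
```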
Screenshots Only (Fast Mode)
When you only need visual documentation:
curl -X POST "https://spideriq.ai/api/v1/jobs/spiderLanding/submit" \
-H "Authorization: Bearer $CLIENT_ID:$API_KEY:$API_SECRET" \
-H "Content-Type: application/json" \
-d '{
"payload": {
"url": "https://example.com",
"options": {
"capture_screenshot": true,
"capture_full_page": true,
"capture_html_bundle": false,
"extract_content": false,
"scroll_for_lazy_load": false
}
}
}'
This mode typically completes in 3-5 seconds because it skips HTML bundling, AI extraction, and the lazy-load scroll.
Handling Popups
SpiderLanding automatically handles common overlays:
- Cookie consent banners
- Newsletter signup popups
- Chat widgets
- Interstitial ads
The AI identifies dismiss buttons and clicks them before capture.
If popup dismissal fails, the page is still captured; the popup simply remains visible in the screenshot.
Lazy Loading
Modern landing pages often lazy-load images and content. When scroll_for_lazy_load is enabled, SpiderLanding performs these steps in order:
- Scrolls through the entire page
- Waits for network activity to settle
- Scrolls back to the top
- Captures the page
This ensures lazy-loaded content appears in your screenshots and HTML bundle.
Storage
Files are stored in your SpiderMedia bucket:
https://media.spideriq.ai/{your-bucket}/landings/{job-id}/
├── fold.png
├── full.png
└── page.html
Files are retained according to your storage policy.
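The layout above can be expressed as a small helper when you need to predict file locations, for example in logs or tests. This is a sketch based only on the path pattern shown here; the URLs returned in the job results remain authoritative, and `landing_file_urls` is a hypothetical name.

```python
from typing import Dict

LANDING_FILES = ("fold.png", "full.png", "page.html")


def landing_file_urls(bucket: str, job_id: str) -> Dict[str, str]:
    """Map each stored file name to its expected URL under the job's folder."""
    base = f"https://media.spideriq.ai/{bucket}/landings/{job_id}"
    return {name: f"{base}/{name}" for name in LANDING_FILES}
```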
Error Handling
SpiderLanding uses graceful degradation:
| Scenario | Behavior |
|---|---|
| Page never reaches "network idle" | Captures after timeout with whatever loaded |
| Popup dismissal fails | Continues capture (popup visible) |
| Screenshot fails | Other captures still attempted |
| HTML bundling fails | Returns null for html_bundle_url |
| AI extraction fails | Returns extracted: null |
All errors are recorded in the errors array:
{
"data": {
"errors": [
"Navigation timeout: Page did not reach network idle",
"HTML bundle: Monolith process timed out"
]
}
}
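Because a job can succeed with partial output, callers may want to check what actually came back. The sketch below inspects the `data` object for null outputs and recorded errors. It assumes the output fields (`screenshot_fold_url`, `screenshot_full_url`, `html_bundle_url`, `extracted`) sit alongside `errors` under `data`, and `capture_status` is a hypothetical helper name.

```python
from typing import Any, Dict, List

OUTPUT_KEYS = ("screenshot_fold_url", "screenshot_full_url", "html_bundle_url", "extracted")


def capture_status(data: Dict[str, Any]) -> Dict[str, Any]:
    """Report which outputs are null or absent and which errors were recorded."""
    # Assumption: the output fields live in the same "data" object as "errors".
    missing: List[str] = [key for key in OUTPUT_KEYS if data.get(key) is None]
    errors: List[str] = list(data.get("errors") or [])
    return {
        "complete": not missing and not errors,
        "missing": missing,
        "errors": errors,
    }
```

Outputs you disabled through feature toggles will also appear in `missing`, so compare the list against what you requested.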
Performance
| Page Type | Expected Time |
|---|---|
| Simple static page | 3-5 seconds |
| Marketing landing page | 15-30 seconds |
| Complex SPA (React/Vue) | 30-60 seconds |
| Heavy page with timeout | 60-90 seconds |
Factors affecting speed:
- Page load time
- Number of assets to embed
- AI extraction (adds 2-5 seconds)
- Lazy loading scroll time
Best Practices
Disable unused features
If you don't need AI extraction, disable it to save 2-5 seconds per capture.
Use appropriate timeouts
Most pages finish loading within 30 seconds. Increase timeout_seconds only for pages you know load slowly.
Store ad_id for correlation
Always include ad_id when capturing ad landing pages to correlate with your ad spend data.
Check redirect chain
The redirect chain lists every intermediate URL between the tracking link and the final page, which is useful for understanding ad attribution.