Landing Page Capture

Overview

SpiderLanding captures landing pages from URLs (typically Facebook/Google ad tracking links) and provides:

  • Screenshots: Above-fold and full-page captures
  • HTML Bundles: Self-contained HTML with embedded assets
  • AI Extraction: Marketing content analysis via Claude
  • Redirect Tracking: Full chain from tracking URL to final page

Use Cases:

  • Competitor ad research and landing page analysis
  • Marketing funnel documentation
  • A/B test monitoring
  • Archiving campaign landing pages before they change

Quick Start

1. Submit a Landing Page Capture

curl -X POST "https://spideriq.ai/api/v1/jobs/spiderLanding/submit" \
  -H "Authorization: Bearer $CLIENT_ID:$API_KEY:$API_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "payload": {
      "url": "https://example.com/landing"
    }
  }'

2. Get Results

curl "https://spideriq.ai/api/v1/jobs/{job_id}/results" \
  -H "Authorization: Bearer $CLIENT_ID:$API_KEY:$API_SECRET"
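
The two calls above can also be prepared from a script. The following is a minimal Python sketch using only the standard library; the helper names (`auth_header`, `build_submit_request`, `build_results_request`, `send`) are illustrative and not part of any SpiderIQ SDK. The shape of the submit response is not shown in this guide, so the sketch leaves reading the job ID to the caller.

```python
import json
import urllib.request

BASE_URL = "https://spideriq.ai/api/v1"


def auth_header(client_id: str, api_key: str, api_secret: str) -> dict:
    """Build the Bearer header in the client_id:api_key:api_secret form used above."""
    return {"Authorization": f"Bearer {client_id}:{api_key}:{api_secret}"}


def build_submit_request(url: str, headers: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST that submits a capture job."""
    body = json.dumps({"payload": {"url": url}}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/jobs/spiderLanding/submit",
        data=body,
        headers={**headers, "Content-Type": "application/json"},
        method="POST",
    )


def build_results_request(job_id: str, headers: dict) -> urllib.request.Request:
    """Prepare (but do not send) the GET that fetches a job's results."""
    return urllib.request.Request(
        f"{BASE_URL}/jobs/{job_id}/results", headers=headers, method="GET"
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from `send` makes it possible to inspect or log the exact URL, headers, and body before anything goes over the network.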

3. Access Your Files

Results include public URLs for:

  • screenshot_fold_url - Above-fold screenshot (PNG)
  • screenshot_full_url - Full-page screenshot (PNG)
  • html_bundle_url - Self-contained HTML file
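
A small helper can collect these three fields from a results document. This guide does not show where the fields sit inside the response, so the sketch below searches the decoded JSON recursively instead of assuming a fixed nesting; `find_key` and `file_urls` are hypothetical names chosen for illustration.

```python
from typing import Any, Optional

FILE_URL_KEYS = ("screenshot_fold_url", "screenshot_full_url", "html_bundle_url")


def find_key(node: Any, key: str) -> Optional[Any]:
    """Depth-first search for the first occurrence of `key` in nested dicts/lists."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def file_urls(results: dict) -> dict:
    """Map each documented file field to its URL, or None when absent or null."""
    return {key: find_key(results, key) for key in FILE_URL_KEYS}
```

A value of `None` means the field was missing or returned as null, for example when that capture was disabled or failed.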

What You Get

1. Screenshots

Above-Fold Screenshot

Captures exactly what users see when the page loads (viewport only).

File: fold.png
Default Size: 1440x900px

Full-Page Screenshot

Captures the entire page height, including content below the fold.

File: full.png
Height: Varies by page

2. HTML Bundle

The HTML bundle is a self-contained file with all assets embedded:

  • CSS stylesheets (inline)
  • JavaScript files (inline)
  • Images (base64 data URIs)
  • Fonts (embedded)

File Sizes: Simple pages produce 5-50KB bundles. Complex marketing pages can be 10-50MB due to embedded images.

3. AI-Extracted Content

When extract_content is enabled, Claude AI analyzes the page and extracts:

{
  "extracted": {
    "meta": {
      "title": "Page Title",
      "description": "Meta description",
      "og_image": "https://..."
    },
    "content": {
      "headline": "Main H1 Headline",
      "subheadline": "Supporting text",
      "cta_primary": {
        "text": "Start Free Trial",
        "url": "https://...",
        "style": "button"
      },
      "value_propositions": [
        "Benefit 1",
        "Benefit 2"
      ],
      "testimonials": [
        {
          "quote": "Great product!",
          "author": "John",
          "title": "CEO"
        }
      ],
      "trust_signals": [
        "10,000+ customers",
        "Featured in Forbes"
      ]
    },
    "design": {
      "layout_type": "hero_features_testimonials_cta",
      "color_palette": ["#FF5733", "#1A1A1A"],
      "font_families": ["Inter", "system-ui"],
      "has_video": false,
      "has_chat_widget": true,
      "form_fields": ["email", "name"]
    }
  }
}
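
When comparing many captures, it can help to flatten this nested object into a handful of fields. The sketch below assumes the `extracted` object has the shape shown above and tolerates `extracted` being null; the function name and the output keys are illustrative only.

```python
def summarize_extraction(result: dict) -> dict:
    """Flatten the nested `extracted` object into a few comparison-friendly fields."""
    extracted = result.get("extracted") or {}
    meta = extracted.get("meta") or {}
    content = extracted.get("content") or {}
    design = extracted.get("design") or {}
    cta = content.get("cta_primary") or {}
    return {
        "title": meta.get("title"),
        "headline": content.get("headline"),
        "cta_text": cta.get("text"),
        "cta_url": cta.get("url"),
        "value_proposition_count": len(content.get("value_propositions") or []),
        "testimonial_count": len(content.get("testimonials") or []),
        "collects_email": "email" in (design.get("form_fields") or []),
    }
```

Each row produced this way can be written to a spreadsheet or database to track how a competitor's headline or primary CTA changes over time.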

4. Redirect Chain

Track the full journey from ad click to landing page:

{
  "capture": {
    "initial_url": "https://tracking.example.com/ad?id=123",
    "final_url": "https://landing-page.com/offer",
    "redirect_chain": [
      "https://tracking.example.com/ad?id=123",
      "https://tracking.example.com/redirect/abc",
      "https://landing-page.com/offer"
    ]
  }
}
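
A chain like the one above can be reduced to a few facts: how many hops occurred and which hosts were involved. This is a minimal sketch that assumes the `capture` object has the fields shown in the example; `summarize_redirects` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def summarize_redirects(capture: dict) -> dict:
    """Count hops and list the distinct hosts visited, in order of first appearance."""
    chain = capture.get("redirect_chain") or []
    hosts = []
    for url in chain:
        host = urlsplit(url).hostname
        if host and host not in hosts:
            hosts.append(host)
    return {
        "hops": max(len(chain) - 1, 0),
        "hosts": hosts,
        "crossed_domains": len(hosts) > 1,
        "ends_at_final_url": bool(chain) and chain[-1] == capture.get("final_url"),
    }
```

For the example chain this reports two hops across `tracking.example.com` and `landing-page.com`.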

Configuration Options

Viewport Size

Customize the browser viewport for different device simulations:

{
  "payload": {
    "url": "https://example.com",
    "options": {
      "viewport": {
        "width": 1920,
        "height": 1080
      }
    }
  }
}

Common Viewports:

| Device | Width (px) | Height (px) |
| --- | --- | --- |
| Desktop (default) | 1440 | 900 |
| Desktop HD | 1920 | 1080 |
| Laptop | 1366 | 768 |
| Tablet | 768 | 1024 |
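
The presets in the table can be kept in a lookup so that a payload is built from a device name. A minimal sketch; the preset keys (`desktop`, `desktop_hd`, `laptop`, `tablet`) and the function name are local conventions for this example, not values the API itself recognizes.

```python
# Width and height in pixels, taken from the table above.
VIEWPORTS = {
    "desktop": (1440, 900),
    "desktop_hd": (1920, 1080),
    "laptop": (1366, 768),
    "tablet": (768, 1024),
}


def payload_for_device(url: str, device: str = "desktop") -> dict:
    """Build a submit body whose viewport matches one of the common presets."""
    if device not in VIEWPORTS:
        raise ValueError(f"Unknown device preset: {device!r}")
    width, height = VIEWPORTS[device]
    return {
        "payload": {
            "url": url,
            "options": {"viewport": {"width": width, "height": height}},
        }
    }
```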

Feature Toggles

Disable features you don't need to speed up processing:

{
  "payload": {
    "url": "https://example.com",
    "options": {
      "capture_screenshot": true,
      "capture_full_page": true,
      "capture_html_bundle": false,
      "extract_content": false,
      "dismiss_popups": false,
      "scroll_for_lazy_load": false
    }
  }
}
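
A misspelled toggle name is easy to miss in a hand-written JSON body. The sketch below checks option names against the six toggles shown above before a payload is built; it is a client-side convenience written for this guide, and how the API itself treats unknown keys is not documented here.

```python
KNOWN_TOGGLES = frozenset({
    "capture_screenshot",
    "capture_full_page",
    "capture_html_bundle",
    "extract_content",
    "dismiss_popups",
    "scroll_for_lazy_load",
})


def build_options(**toggles: bool) -> dict:
    """Return an options dict, rejecting unknown names and non-boolean values."""
    unknown = set(toggles) - KNOWN_TOGGLES
    if unknown:
        raise ValueError(f"Unknown option(s): {sorted(unknown)}")
    for name, value in toggles.items():
        if not isinstance(value, bool):
            raise TypeError(f"{name} must be true or false, got {value!r}")
    return dict(toggles)
```

For example, `build_options(capture_html_bundle=False, extract_content=False)` yields the options for a capture that skips bundling and AI extraction.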

Timeouts

Adjust timeouts for slow-loading pages:

{
  "payload": {
    "url": "https://slow-page.com",
    "options": {
      "timeout_seconds": 90,
      "max_redirects": 15
    }
  }
}

Common Patterns

Competitor Landing Page Research

Capture competitor landing pages for analysis:

curl -X POST "https://spideriq.ai/api/v1/jobs/spiderLanding/submit" \
  -H "Authorization: Bearer $CLIENT_ID:$API_KEY:$API_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "payload": {
      "url": "https://competitor.com/pricing",
      "options": {
        "capture_screenshot": true,
        "capture_full_page": true,
        "capture_html_bundle": true,
        "extract_content": true
      }
    }
  }'

Facebook Ad Tracking URLs

Capture landing pages from Facebook ad tracking links:

curl -X POST "https://spideriq.ai/api/v1/jobs/spiderLanding/submit" \
  -H "Authorization: Bearer $CLIENT_ID:$API_KEY:$API_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "payload": {
      "url": "https://l.facebook.com/l.php?u=https%3A%2F%2Fexample.com",
      "ad_id": "fb_123456789"
    }
  }'

The ad_id field lets you correlate captures with your ad tracking system.
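
Facebook link-shim URLs such as the one above carry the destination in the percent-encoded `u` query parameter. The sketch below decodes it client-side, so the destination the ad declares can be compared with the `final_url` a capture reports, and builds the submit body with `ad_id` placed alongside `url` as in the example. Both function names are hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def wrapped_destination(tracking_url: str) -> Optional[str]:
    """Return the decoded `u` parameter of an l.facebook.com link, if present."""
    parts = urlsplit(tracking_url)
    if parts.hostname != "l.facebook.com":
        return None
    values = parse_qs(parts.query).get("u")
    return values[0] if values else None


def ad_capture_payload(tracking_url: str, ad_id: str) -> dict:
    """Build a submit body that carries the ad identifier next to the URL."""
    return {"payload": {"url": tracking_url, "ad_id": ad_id}}
```

A mismatch between the declared destination and the captured `final_url` usually means further redirects happened after the link shim, which the redirect chain will show.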

Screenshots Only (Fast Mode)

When you only need visual documentation:

curl -X POST "https://spideriq.ai/api/v1/jobs/spiderLanding/submit" \
  -H "Authorization: Bearer $CLIENT_ID:$API_KEY:$API_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "payload": {
      "url": "https://example.com",
      "options": {
        "capture_screenshot": true,
        "capture_full_page": true,
        "capture_html_bundle": false,
        "extract_content": false,
        "scroll_for_lazy_load": false
      }
    }
  }'

With HTML bundling, AI extraction, and lazy-load scrolling disabled, simple pages typically finish in 3-5 seconds.

Handling Popups

SpiderLanding attempts to dismiss common overlays before capture (controlled by the dismiss_popups option):

  • Cookie consent banners
  • Newsletter signup popups
  • Chat widgets
  • Interstitial ads

The AI identifies dismiss buttons and clicks them before capture.

Note: If popup dismissal fails, the page is still captured; the popup will simply be visible in the screenshot.

Lazy Loading

Modern landing pages often lazy-load images and content. To trigger that loading (see the scroll_for_lazy_load option), SpiderLanding:

  1. Scrolls through the entire page
  2. Waits for network activity to settle
  3. Scrolls back to top
  4. Then captures

This ensures lazy-loaded content appears in your screenshots and HTML bundle.

Storage

Files are stored in your SpiderMedia bucket:

https://media.spideriq.ai/{your-bucket}/landings/{job-id}/
├── fold.png
├── full.png
└── page.html

Files are retained according to your storage policy.
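
The layout above means the three file URLs for a job can be derived from the bucket name and job ID. This is a sketch of that path convention only; the URLs returned in the job results remain the authoritative source, and `storage_urls` is an illustrative name.

```python
MEDIA_BASE = "https://media.spideriq.ai"
FILES = ("fold.png", "full.png", "page.html")


def storage_urls(bucket: str, job_id: str) -> dict:
    """Derive the expected file URLs for a job from the documented path layout."""
    prefix = f"{MEDIA_BASE}/{bucket}/landings/{job_id}"
    return {name: f"{prefix}/{name}" for name in FILES}
```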

Error Handling

SpiderLanding uses graceful degradation:

| Scenario | Behavior |
| --- | --- |
| Page never reaches "network idle" | Captures after timeout with whatever loaded |
| Popup dismissal fails | Continues capture (popup visible) |
| Screenshot fails | Other captures still attempted |
| HTML bundling fails | Returns null for html_bundle_url |
| AI extraction fails | Returns extracted: null |

All errors are recorded in the errors array:

{
  "data": {
    "errors": [
      "Navigation timeout: Page did not reach network idle",
      "HTML bundle: Monolith process timed out"
    ]
  }
}
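
Because a capture can succeed partially, callers may want to group the recorded errors by stage. The sketch below assumes each entry follows the "stage: detail" pattern seen in the two example strings; that pattern is inferred from the example rather than guaranteed, so entries without a separator are filed under "unknown".

```python
def group_errors(response: dict) -> dict:
    """Group entries of data.errors by the text before the first ': '."""
    errors = (response.get("data") or {}).get("errors") or []
    grouped = {}
    for entry in errors:
        stage, separator, detail = entry.partition(": ")
        if not separator:
            # No "stage: detail" structure; keep the whole message.
            stage, detail = "unknown", entry
        grouped.setdefault(stage, []).append(detail)
    return grouped
```

An empty result means no errors were recorded; a key such as "HTML bundle" indicates which capture to retry or to expect as null.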

Performance

| Page Type | Expected Time |
| --- | --- |
| Simple static page | 3-5 seconds |
| Marketing landing page | 15-30 seconds |
| Complex SPA (React/Vue) | 30-60 seconds |
| Heavy page with timeout | 60-90 seconds |

Factors affecting speed:

  • Page load time
  • Number of assets to embed
  • AI extraction (adds 2-5 seconds)
  • Lazy loading scroll time

Best Practices

Disable unused features

If you don't need AI extraction, disable it to save 2-5 seconds per capture.

Use appropriate timeouts

Most pages load within 30 seconds. Increase the timeout only for pages you know to be slow.

Store ad_id for correlation

Always include ad_id when capturing ad landing pages to correlate with your ad spend data.

Check redirect chain

The redirect chain lists every intermediate hop between the tracking URL and the final page, which is useful for understanding ad attribution.
