Bulk Import Records

POST/api/v1/fuzziq/canonical/import

Overview

Bulk import records to your canonical database. This is the most efficient way to:

  • Seed your database from CRM exports
  • Import existing customer lists
  • Pre-populate records from external sources

Import up to 1000 records in a single request. Duplicates can be automatically skipped.

Request Body

records (array, required)

Array of records to import (1-1000 records).

Each record can contain:

  • email, full_name, first_name, last_name, linkedin_url, position
  • company_name, company_domain, google_place_id, website, phone
  • city, country

record_type (string, required)

Type of records. One of: business, contact, email, profile.

skip_duplicates (boolean, default: true)

If true, duplicate records are silently skipped. If false, duplicates cause errors.
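The constraints above (1-1000 records, the four allowed record types) can be checked client-side before sending a request. A minimal sketch; the helper name build_import_payload is illustrative, not part of the API:

```python
# Client-side validation sketch for the import request body.
# Constraints mirror the parameter descriptions above.

ALLOWED_RECORD_TYPES = {"business", "contact", "email", "profile"}

def build_import_payload(records, record_type, skip_duplicates=True):
    """Validate and assemble the JSON body for the import endpoint."""
    if not 1 <= len(records) <= 1000:
        raise ValueError("records must contain 1-1000 items")
    if record_type not in ALLOWED_RECORD_TYPES:
        raise ValueError(f"record_type must be one of {sorted(ALLOWED_RECORD_TYPES)}")
    return {
        "records": records,
        "record_type": record_type,
        "skip_duplicates": skip_duplicates,
    }
```

Failing fast on an oversized batch avoids burning a request against your rate limit.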

Examples

Import Business Records

curl -X POST https://spideriq.ai/api/v1/fuzziq/canonical/import \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <client_id>:<api_key>:<api_secret>" \
  -d '{
    "records": [
      {
        "company_name": "Acme Corp",
        "company_domain": "acme.com",
        "phone": "+1-555-100-0001"
      },
      {
        "company_name": "Beta Inc",
        "company_domain": "beta.io",
        "phone": "+1-555-100-0002"
      },
      {
        "company_name": "Gamma LLC",
        "company_domain": "gamma.com",
        "phone": "+1-555-100-0003"
      }
    ],
    "record_type": "business",
    "skip_duplicates": true
  }'

Import Contact Records from CRM

import requests
import json

headers = {"Authorization": "Bearer <your_token>", "Content-Type": "application/json"}
url = "https://spideriq.ai/api/v1/fuzziq/canonical/import"

# Example: Import HubSpot contacts export
with open("hubspot_contacts.json") as f:
    hubspot_data = json.load(f)

# Transform to SpiderFuzzer format
records = [
    {
        "email": contact["email"],
        "full_name": f"{contact['firstname']} {contact['lastname']}",
        "company_name": contact.get("company"),
        "linkedin_url": contact.get("linkedin_profile"),
    }
    for contact in hubspot_data["contacts"]
    if contact.get("email")  # Skip records without email
]

# Import
response = requests.post(url, headers=headers, json={
    "records": records[:1000],  # First 1000
    "record_type": "contact",
    "skip_duplicates": True,
})

result = response.json()
print(f"Imported {result['imported_count']} contacts")
print(f"Skipped {result['duplicate_count']} duplicates")

if result["errors"]:
    print(f"Errors: {result['errors']}")

Import with Error Checking

Set skip_duplicates: false to get errors for duplicates:

curl -X POST https://spideriq.ai/api/v1/fuzziq/canonical/import \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <client_id>:<api_key>:<api_secret>" \
  -d '{
    "records": [
      {"email": "john@example.com"},
      {"email": "john@example.com"}
    ],
    "record_type": "email",
    "skip_duplicates": false
  }'

Response:

{
  "success": false,
  "imported_count": 1,
  "duplicate_count": 1,
  "error_count": 1,
  "errors": ["Duplicate record: john@example.com"]
}

Response

Success Response

{
  "success": true,
  "imported_count": 847,
  "duplicate_count": 153,
  "error_count": 0,
  "errors": []
}

Partial Success (with skip_duplicates: false)

{
  "success": false,
  "imported_count": 500,
  "duplicate_count": 300,
  "error_count": 300,
  "errors": [
    "Duplicate record: john@example.com",
    "Duplicate record: jane@example.com",
    "Duplicate record: info@acme.com"
  ]
}

Response Fields

Field            Type     Description
success          boolean  true if all records imported or skip_duplicates=true
imported_count   integer  Number of new records added
duplicate_count  integer  Number of duplicate records found
error_count      integer  Number of errors encountered
errors           array    First 10 error messages (truncated for large imports)
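Since the errors array is capped at 10 messages, error_count is the reliable signal for how many records failed. A small sketch of reading these fields; summarize_import is an illustrative helper, not part of the API:

```python
# Sketch of interpreting a bulk-import response using the fields above.

def summarize_import(result):
    """Return a one-line summary of a bulk-import response dict."""
    line = (f"imported={result['imported_count']} "
            f"duplicates={result['duplicate_count']} "
            f"errors={result['error_count']}")
    # The errors array is truncated to 10 entries; flag when the full
    # count exceeds what was returned.
    if result["error_count"] > len(result["errors"]):
        line += " (error list truncated)"
    return line
```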

Use Cases

1. Seed from Salesforce Export

import requests
import pandas as pd

# Load Salesforce export
df = pd.read_csv("salesforce_accounts.csv")

# Transform to SpiderFuzzer format
records = df[["Name", "Website", "Phone"]].rename(columns={
    "Name": "company_name",
    "Website": "company_domain",
    "Phone": "phone",
}).to_dict("records")

# Clean domain (strip scheme and path). Empty CSV cells load as NaN,
# so only process string values.
for r in records:
    if isinstance(r["company_domain"], str):
        r["company_domain"] = r["company_domain"].replace("https://", "").replace("http://", "").split("/")[0]

# Import
response = requests.post(url, headers=headers, json={
    "records": records,
    "record_type": "business",
    "skip_duplicates": True,
})

2. Block Competitor Domains

# Import competitor domains to exclude from campaigns
competitors = [
{"company_domain": "competitor1.com"},
{"company_domain": "competitor2.com"},
{"company_domain": "competitor3.io"},
# ... more competitors
]

requests.post(url, headers=headers, json={
"records": competitors,
"record_type": "business",
"skip_duplicates": True
})

3. Import LinkedIn Profiles

# Import known LinkedIn profiles
profiles = [
{
"linkedin_url": "https://linkedin.com/in/johndoe",
"full_name": "John Doe",
"company_name": "Acme Corp"
},
{
"linkedin_url": "https://linkedin.com/in/janesmith",
"full_name": "Jane Smith",
"company_name": "Beta Inc"
}
]

requests.post(url, headers=headers, json={
"records": profiles,
"record_type": "profile",
"skip_duplicates": True
})

Limits

  • Maximum records: 1000 per request
  • Rate limit: Standard API rate limits apply
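A synchronous batching sketch that respects the 1000-record cap and backs off when the API returns HTTP 429. It assumes url and headers are defined as in the earlier examples; the Retry-After header handling is a common convention, not something this API documents:

```python
import time
import requests

def chunk(records, size=1000):
    """Split a record list into request-sized batches (max 1000 each)."""
    return [records[i:i + size] for i in range(0, len(records), size)]

def import_all(url, headers, records, record_type):
    """Import records batch by batch, retrying after rate-limit responses."""
    results = []
    for batch in chunk(records):
        while True:
            resp = requests.post(url, headers=headers, json={
                "records": batch,
                "record_type": record_type,
                "skip_duplicates": True,
            })
            if resp.status_code == 429:  # rate limited: wait, then retry batch
                time.sleep(int(resp.headers.get("Retry-After", 5)))
                continue
            results.append(resp.json())
            break
    return results
```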
tip

For very large imports (millions of records), split into batches of 1000 and use async processing:

import asyncio
import aiohttp

async def import_batch(session, batch):
    async with session.post(url, json={"records": batch, "record_type": "business", "skip_duplicates": True}) as resp:
        return await resp.json()

async def bulk_import(all_records):
    async with aiohttp.ClientSession(headers=headers) as session:
        batches = [all_records[i:i+1000] for i in range(0, len(all_records), 1000)]
        results = await asyncio.gather(*[import_batch(session, b) for b in batches])
        return results

# Run from synchronous code:
# results = asyncio.run(bulk_import(all_records))

Error Responses

SpiderFuzzer Not Configured

{
  "detail": "FuzzIQ is not configured"
}

Status Code: 503 Service Unavailable