Check Batch Records

POST /api/v1/fuzziq/check-batch

Overview

Check multiple records against your canonical database in a single request. This is more efficient than individual checks when processing SpiderMaps results or bulk imports.

Returns records split into two arrays: unique (new records) and duplicates (matched existing records).

Request Body

records (array, required)

Array of records to check (1-100 records)

Each record can contain:

  • email, full_name, first_name, last_name, linkedin_url, position
  • company_name, company_domain, google_place_id, website, phone
  • city, country

record_type (string, required)

Type of records. One of: business, contact, email, profile

add_to_canonical (boolean, default: true)

If true, unique records are automatically added to the canonical database.

campaign_id (string)

Campaign ID for scoped deduplication. When provided, deduplication is scoped to records from this campaign only.

threshold (number, default: 0.5)

Confidence threshold for ML matching (0.0-1.0)

idempotency_key (string)

Unique key (max 100 characters) for retry safety. Repeating a request with the same key returns the cached result.
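The constraints listed above can be checked client-side before a request is sent. The following is a minimal sketch, not part of the API: the helper name build_check_batch_payload and its constants are hypothetical, and the server remains the authority on what it accepts.

```python
from typing import Any, Optional

# Values taken from the parameter descriptions above.
ALLOWED_RECORD_TYPES = {"business", "contact", "email", "profile"}
MAX_RECORDS = 100
MAX_IDEMPOTENCY_KEY_LENGTH = 100


def build_check_batch_payload(
    records: list[dict[str, Any]],
    record_type: str,
    add_to_canonical: bool = True,
    campaign_id: Optional[str] = None,
    threshold: float = 0.5,
    idempotency_key: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: validate inputs and build the JSON request body."""
    if not 1 <= len(records) <= MAX_RECORDS:
        raise ValueError(f"records must contain 1-{MAX_RECORDS} items")
    if record_type not in ALLOWED_RECORD_TYPES:
        raise ValueError(f"record_type must be one of {sorted(ALLOWED_RECORD_TYPES)}")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if idempotency_key is not None and len(idempotency_key) > MAX_IDEMPOTENCY_KEY_LENGTH:
        raise ValueError("idempotency_key must be at most 100 characters")

    payload: dict[str, Any] = {
        "records": records,
        "record_type": record_type,
        "add_to_canonical": add_to_canonical,
        "threshold": threshold,
    }
    # Optional fields are only sent when provided.
    if campaign_id is not None:
        payload["campaign_id"] = campaign_id
    if idempotency_key is not None:
        payload["idempotency_key"] = idempotency_key
    return payload
```

The returned dictionary can be passed as the json argument of requests.post.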

Examples

Check SpiderMaps Business Results

curl -X POST https://spideriq.ai/api/v1/fuzziq/check-batch \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <client_id>:<api_key>:<api_secret>" \
  -d '{
    "records": [
      {
        "company_name": "McDonald'\''s Paris",
        "google_place_id": "ChIJ123456789",
        "phone": "+33-1-23-45-67-89"
      },
      {
        "company_name": "McDonald'\''s Lyon",
        "google_place_id": "ChIJ987654321",
        "phone": "+33-4-56-78-90-12"
      },
      {
        "company_name": "Burger King Paris",
        "google_place_id": "ChIJabcdefghi",
        "phone": "+33-1-98-76-54-32"
      }
    ],
    "record_type": "business",
    "add_to_canonical": true
  }'

Check Contacts with Idempotency

Use idempotency_key to safely retry failed requests without creating duplicate canonical records:

curl -X POST https://spideriq.ai/api/v1/fuzziq/check-batch \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <client_id>:<api_key>:<api_secret>" \
  -d '{
    "records": [
      {"email": "john@example.com", "full_name": "John Doe"},
      {"email": "jane@example.com", "full_name": "Jane Smith"}
    ],
    "record_type": "contact",
    "idempotency_key": "job-550e8400-batch-1"
  }'
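The same retry pattern can be sketched in Python. This is an illustration under assumptions: new_idempotency_key and post_with_retry are hypothetical helpers, and send stands for any callable that performs the POST (for example a thin wrapper around requests.post) and raises an OSError subclass on transport failure, which requests' exceptions are. Because the Limits section states that idempotent results are cached for 5 minutes, retries should finish within that window.

```python
import time
import uuid
from typing import Any, Callable


def new_idempotency_key(job_id: str) -> str:
    """Hypothetical helper: build a unique key that fits the 100-character limit."""
    # 67 chars of job id + "-" + 32 hex chars = 100 characters at most.
    return f"{job_id[:67]}-{uuid.uuid4().hex}"


def post_with_retry(
    send: Callable[[dict[str, Any]], Any],
    payload: dict[str, Any],
    attempts: int = 3,
    backoff_seconds: float = 1.0,
) -> Any:
    """Resend the same payload, with the same idempotency_key, on transport errors."""
    if "idempotency_key" not in payload:
        raise ValueError("payload needs an idempotency_key for safe retries")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return send(payload)
        except OSError:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            # Exponential backoff between attempts: 1x, 2x, 4x, ...
            time.sleep(backoff_seconds * 2**attempt)
```

The key is generated once, stored in the payload, and reused for every attempt; generating a new key per attempt would defeat the purpose.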

Response

Success Response

{
  "success": true,
  "unique": [
    {
      "company_name": "McDonald's Paris",
      "google_place_id": "ChIJ123456789",
      "phone": "+33-1-23-45-67-89"
    }
  ],
  "duplicates": [
    {
      "record": {
        "company_name": "McDonald's Lyon",
        "google_place_id": "ChIJ987654321",
        "phone": "+33-4-56-78-90-12"
      },
      "matched_canonical_id": 12345,
      "confidence": 1.0,
      "match_type": "google_place_id"
    }
  ],
  "stats": {
    "total_checked": 2,
    "unique_count": 1,
    "duplicate_count": 1,
    "added_to_canonical": 1,
    "skipped": false,
    "reason": null
  }
}

SpiderFuzzer Not Configured

When SpiderFuzzer is not enabled, all records are returned as unique:

{
  "success": true,
  "unique": [
    {"company_name": "McDonald's Paris", "google_place_id": "ChIJ123456789"}
  ],
  "duplicates": [],
  "stats": {
    "total_checked": 1,
    "unique_count": 1,
    "duplicate_count": 0,
    "skipped": true,
    "reason": "fuzziq_not_configured"
  }
}
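Because a skipped check still reports success and returns every record as unique, callers may want to inspect stats.skipped before treating the unique array as deduplicated. A minimal sketch, where dedup_was_applied is a hypothetical helper operating on the parsed JSON response:

```python
from typing import Any


def dedup_was_applied(response: dict[str, Any]) -> bool:
    """Hypothetical helper: True only if the request succeeded and was not skipped."""
    stats = response.get("stats", {})
    return bool(response.get("success")) and not stats.get("skipped", False)


# Sample mirroring the "not configured" response shown above.
skipped_response = {
    "success": True,
    "unique": [{"company_name": "McDonald's Paris", "google_place_id": "ChIJ123456789"}],
    "duplicates": [],
    "stats": {
        "total_checked": 1,
        "unique_count": 1,
        "duplicate_count": 0,
        "skipped": True,
        "reason": "fuzziq_not_configured",
    },
}

if not dedup_was_applied(skipped_response):
    # The records were passed through unchecked; report why.
    print("Deduplication skipped:", skipped_response["stats"]["reason"])
```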

Response Fields

Field                      Type            Description
success                    boolean         Whether the request succeeded
unique                     array           Records that are unique (not duplicates)
duplicates                 array           Records that matched existing canonical records
stats.total_checked        integer         Total records checked
stats.unique_count         integer         Number of unique records
stats.duplicate_count      integer         Number of duplicates found
stats.added_to_canonical   integer         Records added to the canonical database
stats.skipped              boolean         Whether deduplication was skipped
stats.reason               string or null  Reason deduplication was skipped; null otherwise

Duplicate Match Object

Each item in the duplicates array contains:

Field                  Type     Description
record                 object   The original record that was checked
matched_canonical_id   integer  ID of the matched canonical record
confidence             number   Match confidence (0.0-1.0)
match_type             string   Type of match (exact_hash, email, google_place_id, etc.)
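As an illustration of consuming these fields, the sketch below groups matched canonical IDs by match_type and optionally drops low-confidence matches. The function name group_duplicates and the min_confidence parameter are hypothetical, chosen for this example only.

```python
from collections import defaultdict
from typing import Any


def group_duplicates(
    duplicates: list[dict[str, Any]], min_confidence: float = 0.0
) -> dict[str, list[int]]:
    """Hypothetical helper: map each match_type to the canonical IDs it matched."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for item in duplicates:
        # Keep only matches at or above the requested confidence.
        if item["confidence"] >= min_confidence:
            grouped[item["match_type"]].append(item["matched_canonical_id"])
    return dict(grouped)


# Sample shaped like the duplicates array in the success response above.
sample_duplicates = [
    {
        "record": {"company_name": "McDonald's Lyon", "google_place_id": "ChIJ987654321"},
        "matched_canonical_id": 12345,
        "confidence": 1.0,
        "match_type": "google_place_id",
    }
]

by_type = group_duplicates(sample_duplicates, min_confidence=0.9)
```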

Use Cases

1. Deduplicate SpiderMaps Results

When running multi-location campaigns, the same business may appear in the results for more than one location:

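import requests

# Credentials use the same Bearer format as the curl examples above
headers = {"Authorization": "Bearer <client_id>:<api_key>:<api_secret>"}

# After receiving SpiderMaps results (job_result is the parsed result of a finished job)
spidermaps_results = job_result["data"]["businesses"]

# Check for duplicates with SpiderFuzzer (at most 100 records per request)
dedup_response = requests.post(
    "https://spideriq.ai/api/v1/fuzziq/check-batch",
    headers=headers,
    json={
        "records": [
            {
                "google_place_id": b["place_id"],
                "company_name": b["name"],
                "phone": b.get("phone"),
            }
            for b in spidermaps_results
        ],
        "record_type": "business",
        "campaign_id": "camp_restaurants_paris",
    },
)
dedup_response.raise_for_status()  # Stop on HTTP errors before reading the body

# Only process unique businesses
unique_businesses = dedup_response.json()["unique"]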

2. Pre-Check Before Expensive Operations

Check records before running SpiderSite or SpiderVerify to save costs:

import requests

url = "https://spideriq.ai/api/v1/fuzziq/check-batch"
headers = {"Authorization": "Bearer <client_id>:<api_key>:<api_secret>"}

# Check contacts before email verification
contacts = [{"email": "john@example.com"}, {"email": "jane@example.com"}]

dedup = requests.post(url, headers=headers, json={
    "records": contacts,
    "record_type": "email",
    "add_to_canonical": False,  # Don't add yet
})

# Only verify unique emails
unique_emails = [r["email"] for r in dedup.json()["unique"]]

Limits

  • Maximum records: 100 per request
  • Idempotency cache: Results cached for 5 minutes when using idempotency_key

Note: For larger imports (100-1000 records), use the Bulk Import endpoint instead.
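If a list only slightly exceeds the 100-record limit and you prefer to stay on this endpoint, one option is to split it client-side and send one request per chunk. The sketch below shows only the splitting step; chunk_records is a hypothetical helper, and whether chunking or the Bulk Import endpoint is the better fit depends on your workload.

```python
from typing import Any, Iterator

MAX_RECORDS_PER_REQUEST = 100  # Limit stated above


def chunk_records(
    records: list[dict[str, Any]], size: int = MAX_RECORDS_PER_REQUEST
) -> Iterator[list[dict[str, Any]]]:
    """Hypothetical helper: yield consecutive slices of at most `size` records."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(records), size):
        yield records[start:start + size]


# Example: 250 placeholder records become batches of 100, 100, and 50.
records = [{"email": f"user{i}@example.com"} for i in range(250)]
batch_sizes = [len(batch) for batch in chunk_records(records)]
```

Each batch can then be sent as the records field of its own check-batch request, ideally with a distinct idempotency_key per batch.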