Check Batch Records
/api/v1/fuzziq/check-batch
Overview
Check multiple records against your canonical database in a single request. This is more efficient than individual checks when processing SpiderMaps results or bulk imports.
Returns records split into two arrays: unique (new records) and duplicates (matched existing records).
Request Body
records (array, required): Array of records to check (1-100 records).
Each record can contain:
- email, full_name, first_name, last_name, linkedin_url, position
- company_name, company_domain, google_place_id, website, phone
- city, country
record_type (string, required): Type of records. One of: business, contact, email, profile.
add_to_canonical (boolean, default: true): If true, unique records are automatically added to the canonical database.
campaign_id (string): Campaign ID for scoped deduplication. When provided, deduplication is scoped to records from this campaign only.
threshold (number, default: 0.5): Confidence threshold for ML matching (0.0-1.0).
idempotency_key (string): Unique key for retry safety (max 100 chars). Repeating a request with the same key returns the cached result.
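The limits listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: the helper name `build_check_batch_payload` and its constants are hypothetical, and the checks only mirror the limits stated in this section.

```python
from __future__ import annotations

from typing import Any, Optional

# Values copied from the parameter descriptions above.
ALLOWED_RECORD_TYPES = {"business", "contact", "email", "profile"}
MAX_RECORDS = 100
MAX_IDEMPOTENCY_KEY_LEN = 100


def build_check_batch_payload(
    records: list[dict[str, Any]],
    record_type: str,
    add_to_canonical: bool = True,
    campaign_id: Optional[str] = None,
    threshold: Optional[float] = None,
    idempotency_key: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: validate arguments and return the JSON body."""
    if not 1 <= len(records) <= MAX_RECORDS:
        raise ValueError(f"records must contain 1-{MAX_RECORDS} items")
    if record_type not in ALLOWED_RECORD_TYPES:
        raise ValueError(f"record_type must be one of {sorted(ALLOWED_RECORD_TYPES)}")
    if threshold is not None and not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if idempotency_key is not None and len(idempotency_key) > MAX_IDEMPOTENCY_KEY_LEN:
        raise ValueError(f"idempotency_key is limited to {MAX_IDEMPOTENCY_KEY_LEN} chars")

    payload: dict[str, Any] = {
        "records": records,
        "record_type": record_type,
        "add_to_canonical": add_to_canonical,
    }
    # Optional fields are left out when unset so the documented defaults apply.
    if campaign_id is not None:
        payload["campaign_id"] = campaign_id
    if threshold is not None:
        payload["threshold"] = threshold
    if idempotency_key is not None:
        payload["idempotency_key"] = idempotency_key
    return payload
```

The returned dictionary can be passed as the `json=` argument of `requests.post`.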
Examples
Check SpiderMaps Business Results
cURL:
```bash
curl -X POST https://spideriq.ai/api/v1/fuzziq/check-batch \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <client_id>:<api_key>:<api_secret>" \
  -d '{
    "records": [
      {
        "company_name": "McDonald'\''s Paris",
        "google_place_id": "ChIJ123456789",
        "phone": "+33-1-23-45-67-89"
      },
      {
        "company_name": "McDonald'\''s Lyon",
        "google_place_id": "ChIJ987654321",
        "phone": "+33-4-56-78-90-12"
      },
      {
        "company_name": "Burger King Paris",
        "google_place_id": "ChIJabcdefghi",
        "phone": "+33-1-98-76-54-32"
      }
    ],
    "record_type": "business",
    "add_to_canonical": true
  }'
```
Python:
```python
import requests

# Token format follows the cURL example: <client_id>:<api_key>:<api_secret>
headers = {
    "Authorization": "Bearer <client_id>:<api_key>:<api_secret>",
    "Content-Type": "application/json",
}

# Example: Deduplicate SpiderMaps results
businesses = [
    {
        "company_name": "McDonald's Paris",
        "google_place_id": "ChIJ123456789",
        "phone": "+33-1-23-45-67-89",
    },
    {
        "company_name": "McDonald's Lyon",
        "google_place_id": "ChIJ987654321",
        "phone": "+33-4-56-78-90-12",
    },
]

data = {
    "records": businesses,
    "record_type": "business",
    "add_to_canonical": True,
}

response = requests.post(
    "https://spideriq.ai/api/v1/fuzziq/check-batch",
    headers=headers,
    json=data,
)
response.raise_for_status()  # Fail early on HTTP errors
result = response.json()

# Only process unique businesses
for business in result["unique"]:
    print(f"New business: {business['company_name']}")

# Log duplicates
for dup in result["duplicates"]:
    print(f"Duplicate: {dup['record']['company_name']} matched ID {dup['matched_canonical_id']}")
```
JavaScript:
```javascript
// Token format follows the cURL example: <client_id>:<api_key>:<api_secret>
const response = await fetch(
  'https://spideriq.ai/api/v1/fuzziq/check-batch',
  {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer <client_id>:<api_key>:<api_secret>',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      records: [
        {
          company_name: "McDonald's Paris",
          google_place_id: "ChIJ123456789",
          phone: "+33-1-23-45-67-89"
        },
        {
          company_name: "McDonald's Lyon",
          google_place_id: "ChIJ987654321",
          phone: "+33-4-56-78-90-12"
        }
      ],
      record_type: "business",
      add_to_canonical: true
    })
  }
);

const data = await response.json();
console.log(`Unique: ${data.stats.unique_count}, Duplicates: ${data.stats.duplicate_count}`);
```
Check Contacts with Idempotency
Use idempotency_key to safely retry failed requests without creating duplicate canonical records:
```bash
curl -X POST https://spideriq.ai/api/v1/fuzziq/check-batch \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <client_id>:<api_key>:<api_secret>" \
  -d '{
    "records": [
      {"email": "john@example.com", "full_name": "John Doe"},
      {"email": "jane@example.com", "full_name": "Jane Smith"}
    ],
    "record_type": "contact",
    "idempotency_key": "job-550e8400-batch-1"
  }'
```
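For a retry to hit the cache, it must send exactly the same key as the original attempt. One way to get that is to derive the key from the job ID and the batch contents, so it never has to be stored between attempts. This is a sketch under assumptions: the function name `make_idempotency_key` and the key layout are illustrative and not prescribed by the API, which only requires a unique string of at most 100 characters.

```python
from __future__ import annotations

import hashlib
import json
from typing import Any


def make_idempotency_key(job_id: str, records: list[dict[str, Any]], max_len: int = 100) -> str:
    """Hypothetical helper: build a deterministic key from a job ID and a batch."""
    # Serialize with sorted keys and fixed separators so equal batches hash equally.
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    # Truncate the job ID so the full key stays within the documented 100-char limit.
    prefix = job_id[: max_len - len(digest) - 1]
    return f"{prefix}-{digest}"
```

With this scheme, retrying the same batch reuses the key, while a batch with different records gets a new key and is processed normally.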
Response
Success Response
```json
{
  "success": true,
  "unique": [
    {
      "company_name": "McDonald's Paris",
      "google_place_id": "ChIJ123456789",
      "phone": "+33-1-23-45-67-89"
    }
  ],
  "duplicates": [
    {
      "record": {
        "company_name": "McDonald's Lyon",
        "google_place_id": "ChIJ987654321",
        "phone": "+33-4-56-78-90-12"
      },
      "matched_canonical_id": 12345,
      "confidence": 1.0,
      "match_type": "google_place_id"
    }
  ],
  "stats": {
    "total_checked": 2,
    "unique_count": 1,
    "duplicate_count": 1,
    "added_to_canonical": 1,
    "skipped": false,
    "reason": null
  }
}
```
SpiderFuzzer Not Configured
When SpiderFuzzer is not enabled, all records are returned as unique:
```json
{
  "success": true,
  "unique": [
    {"company_name": "McDonald's Paris", "google_place_id": "ChIJ123456789"}
  ],
  "duplicates": [],
  "stats": {
    "total_checked": 1,
    "unique_count": 1,
    "duplicate_count": 0,
    "skipped": true,
    "reason": "fuzziq_not_configured"
  }
}
```
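Because a skipped check still reports `success: true`, a client that only reads `unique` cannot tell "nothing matched" from "nothing was checked". The sketch below looks at `stats.skipped` first. The names `DedupSkipped` and `unique_records` are hypothetical and only illustrate one way to handle this case.

```python
from __future__ import annotations

from typing import Any


class DedupSkipped(RuntimeError):
    """Raised when the response says deduplication did not run."""


def unique_records(result: dict[str, Any], allow_skipped: bool = False) -> list[dict[str, Any]]:
    """Hypothetical helper: return result["unique"], refusing skipped checks by default."""
    stats = result.get("stats", {})
    if stats.get("skipped") and not allow_skipped:
        # stats.reason explains the skip, e.g. "fuzziq_not_configured".
        raise DedupSkipped(stats.get("reason") or "unknown reason")
    return result["unique"]
```

Passing `allow_skipped=True` keeps the pipeline running and treats every record as new, which matches what the endpoint returns in this situation.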
Response Fields
| Field | Type | Description |
|---|---|---|
| success | boolean | Whether the request succeeded |
| unique | array | Records that are unique (not duplicates) |
| duplicates | array | Records that matched existing canonical records |
| stats.total_checked | integer | Total records checked |
| stats.unique_count | integer | Number of unique records |
| stats.duplicate_count | integer | Number of duplicates found |
| stats.added_to_canonical | integer | Records added to canonical database |
| stats.skipped | boolean | Whether deduplication was skipped |
| stats.reason | string | Reason if skipped |
Duplicate Match Object
Each item in the duplicates array contains:
| Field | Type | Description |
|---|---|---|
| record | object | The original record that was checked |
| matched_canonical_id | integer | ID of the matched canonical record |
| confidence | number | Match confidence (0.0-1.0) |
| match_type | string | Type of match (exact_hash, email, google_place_id, etc.) |
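The fields of a duplicate match object are enough to triage matches on the client. As a rough sketch, the function below counts duplicates per `match_type` and sets aside matches whose `confidence` falls under a cutoff. The name `summarize_duplicates` and the default cutoff `review_below=0.9` are assumptions made for illustration and are not values defined by the API.

```python
from __future__ import annotations

from collections import Counter
from typing import Any


def summarize_duplicates(
    duplicates: list[dict[str, Any]],
    review_below: float = 0.9,  # hypothetical cutoff, tune for your data
) -> tuple[Counter, list[dict[str, Any]]]:
    """Hypothetical helper: count matches by type and collect low-confidence ones."""
    by_type = Counter(d["match_type"] for d in duplicates)
    needs_review = [d for d in duplicates if d["confidence"] < review_below]
    return by_type, needs_review
```

Called with `result["duplicates"]`, it returns a per-type count together with the list of matches that may deserve a manual look.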
Use Cases
1. Deduplicate SpiderMaps Results
When running multi-location campaigns, the same business may appear in multiple locations:
```python
# After receiving SpiderMaps results
spidermaps_results = job_result["data"]["businesses"]

# Check for duplicates with SpiderFuzzer
dedup_response = requests.post(
    "https://spideriq.ai/api/v1/fuzziq/check-batch",
    headers=headers,
    json={
        "records": [
            {
                "google_place_id": b["place_id"],
                "company_name": b["name"],
                "phone": b.get("phone"),
            }
            for b in spidermaps_results
        ],
        "record_type": "business",
        "campaign_id": "camp_restaurants_paris",
    },
)

# Only process unique businesses
unique_businesses = dedup_response.json()["unique"]
```
2. Pre-Check Before Expensive Operations
Check records before running SpiderSite or SpiderVerify to save costs:
```python
# Check contacts before email verification
url = "https://spideriq.ai/api/v1/fuzziq/check-batch"
contacts = [{"email": "john@example.com"}, {"email": "jane@example.com"}]

dedup = requests.post(
    url,
    headers=headers,  # Same auth headers as in the earlier examples
    json={
        "records": contacts,
        "record_type": "email",
        "add_to_canonical": False,  # Don't add yet
    },
)

# Only verify unique emails
unique_emails = [r["email"] for r in dedup.json()["unique"]]
```
Limits
- Maximum records: 100 per request
- Idempotency cache: Results cached for 5 minutes when using idempotency_key
For larger imports (100-1000 records), use the Bulk Import endpoint instead.
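If the Bulk Import endpoint is not an option, a list longer than the per-request limit can be split on the client into batches of at most 100 records. The generator below is a sketch of that fallback. The name `chunk_records` is hypothetical, and each yielded batch would be sent as a separate check-batch request.

```python
from __future__ import annotations

from typing import Any, Iterator


def chunk_records(records: list[dict[str, Any]], size: int = 100) -> Iterator[list[dict[str, Any]]]:
    """Hypothetical helper: yield consecutive slices of at most `size` records."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(records), size):
        yield records[start:start + size]
```

Each batch produces its own `stats` object, so totals for the whole import have to be summed by the caller.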