InvestorLift Scraper API
Pull clean, structured data from any InvestorLift marketplace listing, one URL or thousands at a time. Every example below uses curl and is ready to paste into a terminal.
Paste your API_KEY below — every curl snippet auto-fills with it.
One property → /api/scrape. Many → /api/bulk-scrape.
Address, price, ARV, beds/baths, photos — direct from the listing's NUXT_DATA.
Your key is stored only in your browser's localStorage and is never sent anywhere except as the X-API-Key header. Once saved, it replaces {KEY} in every curl block below; {BASE} stands for this server's base URL.
Regenerate rotates the server-side key. The old key stops working immediately; the new one is persisted to .api_key and stays fixed across restarts until you regenerate again.
Authentication
Send the key with each request in one of three forms:
Header: -H "X-API-Key: {KEY}"
Bearer token: -H "Authorization: Bearer {KEY}"
Query string: ?api_key={KEY}
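The three forms listed under Authentication appear to be alternatives for sending the same key. A minimal Python sketch, assuming any one of them is accepted on its own; the `auth_parts` helper and its `style` names are hypothetical and not part of the API:

```python
from urllib.parse import urlencode


def auth_parts(key: str, style: str = "header") -> tuple[dict, str]:
    """Return (headers, query_string) for one of the three documented auth forms."""
    if style == "header":
        return {"X-API-Key": key}, ""
    if style == "bearer":
        return {"Authorization": f"Bearer {key}"}, ""
    if style == "query":
        # urlencode escapes characters that are unsafe in a URL
        return {}, "?" + urlencode({"api_key": key})
    raise ValueError(f"unknown auth style: {style!r}")
```

Header-based forms keep the key out of URLs and server access logs; the query-string form is mainly convenient for direct browser downloads.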
/api/scrape
Extract a single property page. Returns the full record from the page's NUXT_DATA blob.
Request body
{
"url": "https://investorlift.com/marketplace/p/abc123", # required
"mode": "clean", # "clean" | "raw" (default: clean)
"filters": { "min_price": 100000, "states": ["TX"] } # optional, see Filter schema
}
curl example
curl -X POST {BASE}/api/scrape \
  -H "X-API-Key: {KEY}" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://investorlift.com/marketplace/p/abc123","mode":"clean"}'
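The same single-property request can be built from Python with only the standard library. This is a sketch, not an official client: `BASE` is a placeholder value (an assumption, replace it with your server's address) and `build_scrape_request` is a hypothetical helper name.

```python
import json
from urllib.request import Request

BASE = "http://localhost:8000"  # assumption: replace with your server's base URL


def build_scrape_request(url: str, key: str, mode: str = "clean") -> Request:
    """Build (but do not send) the POST request for /api/scrape."""
    if mode not in ("clean", "raw"):
        raise ValueError("mode must be 'clean' or 'raw'")
    body = json.dumps({"url": url, "mode": mode}).encode("utf-8")
    return Request(
        f"{BASE}/api/scrape",
        data=body,
        headers={"X-API-Key": key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_scrape_request("https://investorlift.com/marketplace/p/abc123", "my-key")
# To send it: urllib.request.urlopen(req), then json.load() the response.
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.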
/api/bulk-scrape
Run a parallel extraction over many URLs. Returns immediately with the queued count;
poll /api/bulk-scrape/status for live progress.
Request body
{
"urls": ["https://investorlift.com/marketplace/p/abc", "…/xyz"],
"mode": "clean", # "clean" | "raw"
"workers": 8, # 1..20
"filters": { # optional
"min_price": 100000,
"max_price": 400000,
"min_beds": 3,
"states": ["TX", "FL"],
"property_types": ["Single-Family"],
"exclude_under_contract": true
}
}
curl example
curl -X POST {BASE}/api/bulk-scrape \
  -H "X-API-Key: {KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": [
      "https://investorlift.com/marketplace/p/abc",
      "https://investorlift.com/marketplace/p/xyz"
    ],
    "mode": "clean",
    "workers": 8
  }'
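When the URL list comes from a script, it can help to check the documented constraints (`mode` is "clean" or "raw", `workers` is 1..20) before posting. A minimal sketch; `build_bulk_body` is a hypothetical helper, and rejecting an empty `urls` list is a client-side choice made here, not a documented server rule.

```python
import json


def build_bulk_body(urls, mode="clean", workers=8, filters=None) -> str:
    """Serialize a /api/bulk-scrape request body after basic range checks."""
    if not urls:
        raise ValueError("urls must not be empty")  # client-side assumption
    if mode not in ("clean", "raw"):
        raise ValueError("mode must be 'clean' or 'raw'")
    if not 1 <= workers <= 20:
        raise ValueError("workers must be between 1 and 20")
    body = {"urls": list(urls), "mode": mode, "workers": workers}
    if filters:  # optional; the key is omitted entirely when unused
        body["filters"] = filters
    return json.dumps(body)
```

The returned string can be passed to curl with `-d @-` on stdin or sent with any HTTP client.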
Full pipeline (start → poll → download)
# 1) Start the bulk job
curl -X POST {BASE}/api/bulk-scrape \
  -H "X-API-Key: {KEY}" \
  -H "Content-Type: application/json" \
  -d '{"urls":["https://investorlift.com/marketplace/p/abc"],"mode":"clean","workers":8}'

# 2) Poll live status until "running": false
curl -H "X-API-Key: {KEY}" {BASE}/api/bulk-scrape/status

# 3a) Pull results as JSON
curl -H "X-API-Key: {KEY}" {BASE}/api/bulk-scrape/json

# 3b) Or download the Excel workbook (use -OJ to preserve the server filename)
curl -H "X-API-Key: {KEY}" -OJ {BASE}/api/bulk-scrape/excel
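Step 2 of the pipeline is a polling loop. A sketch of that loop in Python, kept independent of any HTTP library: `fetch_status` is a caller-supplied function (hypothetical) that returns the parsed JSON from /api/bulk-scrape/status, and the one-second default interval follows the documented 1×/sec ceiling. The timeout value is an arbitrary assumption.

```python
import time
from typing import Callable


def wait_for_bulk(fetch_status: Callable[[], dict],
                  interval: float = 1.0,
                  timeout: float = 600.0,
                  sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the status payload reports "running": false, then return it."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if not status.get("running", False):
            return status  # final snapshot, including results and failures
        if time.monotonic() >= deadline:
            raise TimeoutError("bulk run still in progress after timeout")
        sleep(interval)
```

Passing `sleep` as a parameter lets the loop be exercised with canned status payloads and no real waiting.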
/api/bulk-scrape/status
Live progress for the current bulk run. Safe to call as often as 1×/sec.
curl
curl -H "X-API-Key: {KEY}" {BASE}/api/bulk-scrape/status
Response shape
{
"running": false,
"done": 42,
"total": 42,
"succeeded": 39,
"failed": 1,
"filtered_out": 2,
"results": [{ /* clean property records */ }],
"failures": [{ "url": "…", "error": "HTTP 503" }],
"log": [{ "url": "…", "ok": true }],
"started": 1736380000.123,
"finished": 1736380012.456,
"excel_ready": true,
"run_dir": "2026-05-24_..."
}
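A status payload of this shape can be reduced to a few headline numbers. The sketch below rests on two assumptions drawn from the sample values rather than from any stated contract: `started` and `finished` are Unix timestamps in seconds, and `succeeded + failed + filtered_out` adds up to `done`. `summarize_status` is a hypothetical helper.

```python
def summarize_status(status: dict) -> dict:
    """Condense a /api/bulk-scrape/status payload into progress figures."""
    total = status.get("total", 0)
    done = status.get("done", 0)
    summary = {
        "percent_done": round(100 * done / total, 1) if total else 0.0,
        # assumption: these three counters partition the processed URLs
        "accounted_for": (status.get("succeeded", 0)
                          + status.get("failed", 0)
                          + status.get("filtered_out", 0)),
    }
    started, finished = status.get("started"), status.get("finished")
    if started is not None and finished is not None:
        # assumption: both values are Unix timestamps in seconds
        summary["elapsed_s"] = round(finished - started, 3)
    return summary
```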
/api/bulk-scrape/excel
Two-sheet workbook (Properties + Failures). Frozen header, AutoFilter, hyperlinked URL cells.
# Save as the server-suggested filename via -OJ:
curl -H "X-API-Key: {KEY}" -OJ {BASE}/api/bulk-scrape/excel

# Or via query-string auth (handy for direct browser download):
curl -OJ "{BASE}/api/bulk-scrape/excel?api_key={KEY}"
/api/bulk-scrape/json
Pretty-printed JSON of the last bulk run — includes results, failures, filters, started/finished timestamps.
curl -H "X-API-Key: {KEY}" {BASE}/api/bulk-scrape/json -o investorlift_bulk.json
Job tracking
Every meaningful API call this server handles is persisted to Supabase as a row in api_jobs.
Single-property scrapes additionally land in scrape_results; bulk runs land in bulk_scrape_runs.
Recording is fire-and-forget on a small background pool — a Supabase outage delays the audit trail but never blocks a real response.
High-frequency polling endpoints (/status, /history, /log) are intentionally not tracked so the audit table stays signal-rich. Toggle the whole layer off with TRACK_API_JOBS=false if needed.
/api/jobs
Recent API jobs, newest first. Defaults to 50 rows; ?limit=N goes up to 500.
curl
# 50 most-recent jobs
curl -H "X-API-Key: {KEY}" {BASE}/api/jobs

# Last 10 jobs (URL quoted so the shell does not treat ? as a glob)
curl -H "X-API-Key: {KEY}" "{BASE}/api/jobs?limit=10"
Response shape
{
"configured": true, # false → Supabase not set up yet
"tracking": true, # TRACK_API_JOBS env toggle
"count": 3,
"jobs": [
{
"job_id": "job-20260524T034512-ab12cd34",
"endpoint": "/api/scrape",
"method": "POST",
"status": "succeeded",
"http_status": 200,
"remote_addr": "127.0.0.1",
"api_key_hint": "…aB12",
"request_summary": { ... }, # redacted body + query
"response_summary": { "ok": true, "matched": true },
"started_at": "2026-05-24T03:45:12Z",
"finished_at": "2026-05-24T03:45:13Z",
"duration_ms": 812
}
]
}
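Once the jobs payload is parsed, its fields are easy to work with. A small sketch that parses the `Z`-suffixed timestamps and picks out slow jobs; both helper names and the 500 ms threshold are hypothetical choices for illustration.

```python
from datetime import datetime


def parse_utc(ts: str) -> datetime:
    """Parse timestamps like 2026-05-24T03:45:12Z (trailing Z means UTC)."""
    # fromisoformat only accepts a bare "Z" on Python 3.11+, so normalize it
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def slow_jobs(payload: dict, threshold_ms: int = 500) -> list[str]:
    """Return the job_id of every job whose duration_ms exceeds the threshold."""
    return [
        job["job_id"]
        for job in payload.get("jobs", [])
        if job.get("duration_ms", 0) > threshold_ms
    ]
```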
/api/jobs/<job_id>
One job by id, with its persisted scrape result or bulk run attached when present.
curl
curl -H "X-API-Key: {KEY}" {BASE}/api/jobs/job-20260524T034512-ab12cd34
/api/admin/regenerate-key
Rotate the server's API key. You must authenticate with the current key. The response contains the new key once — store it immediately. The old key stops working as soon as this endpoint returns. The new key is persisted to .api_key on the server, so it survives restarts and stays fixed until you call this endpoint again.
curl
curl -X POST {BASE}/api/admin/regenerate-key \
  -H "X-API-Key: {KEY}"
Response
{
"ok": true,
"api_key": "il_…new key here…",
"warning": "Store this key now — it will not be shown again."
}
Marketplace → Supabase sync
Full-marketplace pipeline: Selenium URL discovery → parallel NUXT extraction → bulk write to your Supabase database.
Sync endpoints don't require an API key (they're internal-UI routes), but they do require the server's SUPABASE_URL and SUPABASE_SERVICE_ROLE_KEY env vars to be set.
/api/marketplace-sync/run
Trigger a sync. Returns 202 immediately; poll /status for progress.
Request body (all optional)
{
"discovery_mode": "all", # "active" | "historical" | "all"
"states": ["TX", "FL"], # omit for all 50
"trigger": "API" # shown in scrape_runs.triggered_by
}
curl
# Full marketplace, all 50 states, both active + historical passes
curl -X POST {BASE}/api/marketplace-sync/run \
  -H "Content-Type: application/json" \
  -d '{"discovery_mode":"all","trigger":"API"}'

# Just TX + FL, active listings only
curl -X POST {BASE}/api/marketplace-sync/run \
  -H "Content-Type: application/json" \
  -d '{"discovery_mode":"active","states":["TX","FL"],"trigger":"API"}'
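Because every field of the sync request is optional, a body builder only needs to include what the caller sets. A minimal sketch under that reading; `build_sync_body` is a hypothetical helper, and the only validation it performs is against the three documented `discovery_mode` values.

```python
import json

DISCOVERY_MODES = ("active", "historical", "all")


def build_sync_body(discovery_mode=None, states=None, trigger=None) -> str:
    """Serialize a /api/marketplace-sync/run body; unset fields are left out."""
    body = {}
    if discovery_mode is not None:
        if discovery_mode not in DISCOVERY_MODES:
            raise ValueError(f"discovery_mode must be one of {DISCOVERY_MODES}")
        body["discovery_mode"] = discovery_mode
    if states:  # omitting the key means all 50 states
        body["states"] = list(states)
    if trigger is not None:
        body["trigger"] = trigger
    return json.dumps(body)
```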
/api/marketplace-sync/status
Snapshot of the current/last sync — phase, per-table progress, recent rows.
curl {BASE}/api/marketplace-sync/status
/api/marketplace-sync/stop
Ask the in-flight sync to wind down. Kills the scraper subprocess and rolls sync_control back to idle.
curl -X POST {BASE}/api/marketplace-sync/stop
/api/marketplace-sync/probe
Cheap connectivity check against Supabase — verifies the URL + service-role key are valid.
curl {BASE}/api/marketplace-sync/probe
/api/marketplace-sync/history
Recent sync runs (last 25), newest first.
curl {BASE}/api/marketplace-sync/history
/api/marketplace-sync/log
Last N lines of the live sync log (default 200, max 800). Plain JSON fallback for environments that buffer SSE.
curl "{BASE}/api/marketplace-sync/log?n=200"
/api/info
Self-describing JSON: endpoints, auth requirements, current limits, supported filter keys.
curl {BASE}/api/info
/api/states
All 50 US state codes + display names + the canonical property-type list. Useful for building dropdowns.
curl {BASE}/api/states
/health
Liveness probe — returns {"ok": true}. No auth required.
curl {BASE}/health
Filter schema
Every filters key is optional. Records that don't match are dropped from
results and counted under filtered_out.
Filters apply after extraction — they don't reduce HTTP calls.
| Key | Type | Example | Matches when |
|---|---|---|---|
| min_price / max_price | number | 100000 | price within range |
| min_arv / max_arv | number | 300000 | ARV estimate within range |
| min_arv_pct / max_arv_pct | number | 70 | price as a percentage of ARV (price ÷ ARV × 100) in range |
| min_beds / max_beds | number | 3 | bedroom count in range |
| min_baths / max_baths | number | 2 | bathroom count in range |
| min_sqft / max_sqft | number | 1200 | square footage in range |
| min_year / max_year | number | 1990 | year_built in range |
| min_days / max_days | number | 30 | days_published in range |
| states | string[] | ["TX","FL"] | state code matches (case-insensitive) |
| cities | string[] | ["Houston"] | city matches (case-insensitive) |
| zips | string[] | ["33101"] | zip is in the list |
| property_types | string[] | ["Single-Family"] | type contains, or is contained in, any listed item |
| exclude_under_contract | bool | true | drops sold/pending listings |
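To make the matching rules concrete, here is a rough client-side approximation of the range and `states` filters. It is an illustration only, not the server's code: the record field names other than `year_built` and `days_published` are assumptions, bounds are assumed inclusive, and a record missing a filtered field is assumed to be dropped.

```python
# filter suffix -> record field; names other than year_built and
# days_published are assumptions made for this sketch
RANGE_FIELDS = {
    "price": "price", "arv": "arv", "beds": "beds", "baths": "baths",
    "sqft": "sqft", "year": "year_built", "days": "days_published",
}


def matches(record: dict, filters: dict) -> bool:
    """Return True when the record passes every filter that is set."""
    for suffix, field in RANGE_FIELDS.items():
        lo, hi = filters.get(f"min_{suffix}"), filters.get(f"max_{suffix}")
        if lo is None and hi is None:
            continue  # this range filter is not in use
        value = record.get(field)
        if value is None:
            return False  # assumption: a missing value fails an active filter
        if (lo is not None and value < lo) or (hi is not None and value > hi):
            return False
    states = filters.get("states")
    if states:
        # state codes compare case-insensitively, as the table notes
        if str(record.get("state", "")).upper() not in {s.upper() for s in states}:
            return False
    return True
```

An empty `filters` dict matches every record, which mirrors the rule that every key is optional.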
Error codes
Every error returns {"error":"…","kind":"…"}.
Missing/invalid url or mode; non-investorlift.com URL.
No or invalid API key — pass X-API-Key.
A bulk run is already in progress — poll /api/bulk-scrape/status first.
Too many URLs in one bulk call — split into batches of 5,000 or fewer.
Page loaded but no NUXT data was found — the listing may be offline or behind a login.
Upstream HTTP failure after retries — wait 30s and retry.
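The batch-size error can be avoided by splitting a long URL list before submitting it. A minimal sketch; `batches` is a hypothetical helper and the default of 5,000 comes from the limit quoted in the error description.

```python
from typing import Iterator


def batches(urls: list[str], size: int = 5000) -> Iterator[list[str]]:
    """Yield consecutive slices of urls, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]
```

Since a second bulk call is rejected while a run is in progress, each batch should be submitted only after /api/bulk-scrape/status reports that the previous one has finished.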