What is it
Per-API-Call Pricing is a billing unit where customers are charged per API request, regardless of payload size or processing time. The unit is the request itself: one call to the endpoint produces one charge, whether the response is a single search result or a full page of structured data.
It is the simplest of all usage meters. There is nothing to estimate about tokens, megabytes, or seconds of compute — a developer can forecast a bill by multiplying expected request volume by a published rate. That legibility is the model’s defining strength, and it is why per-call pricing dominates categories where the unit of work maps cleanly to a single request: web search, scraping and extraction, and per-image or per-clip media processing.
The model shows up two ways across the corpus. For pure-request APIs like SerpApi, Exa, and Tavily, the call is the product, and the entire rate card is denominated in requests (often expressed as credits that map back to calls). For broader platforms — image generators like Recraft and Ideogram, or speech vendors that also meter audio minutes — per-call pricing is one line on a multi-dimensional bill, used for the discrete operations (a background removal, an upscale) that don’t fit a continuous meter.
The model’s weakness is the flip side of its simplicity: not all calls cost the same to serve. A deep-research request and a cached lookup both count as one call unless the vendor splits them into separate SKUs or tiers — which, as the worked examples below show, is exactly what the better-designed APIs do.
How it works
The mechanic is a multiplication: bill = calls × rate. Vendors layer three things on top of that base formula:
| Lever | What it does | Example from the corpus |
|---|---|---|
| Free allowance | A monthly call quota at $0 to seed adoption | SerpApi Free: 250 searches/mo; Tavily Free: 1,000 credits/mo |
| Per-1k quoting | Normalizes sub-cent prices into readable figures | Exa Search “$7 / 1k” (~$0.007/call); ScraperAPI $3/1k requests |
| Endpoint / mode tiers | Different call types priced differently | Exa fixed-effort Low/Med/High at $0.025 / $0.10 / $0.50 per request |
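The base formula and the first two levers in the table can be sketched in a few lines of Python. This is a minimal illustration rather than any vendor's billing logic: the function names are hypothetical, the volumes are made up, and it assumes a free allowance is simply subtracted from the month's call count before the rate is applied.

```python
from decimal import Decimal


def per_call_rate(price_per_1k: Decimal) -> Decimal:
    """Convert a per-1,000 quote into the underlying per-call rate."""
    return price_per_1k / Decimal(1000)


def monthly_bill(calls: int, rate: Decimal, free_calls: int = 0) -> Decimal:
    """bill = billable calls x rate, after deducting any free allowance.

    Assumption: the allowance is subtracted from the month's volume first.
    """
    billable_calls = max(calls - free_calls, 0)
    return billable_calls * rate


# "$7 / 1k" is the same price as $0.007 per call; only the display unit differs.
rate = per_call_rate(Decimal("7"))

# Hypothetical volumes, chosen only to exercise the formula.
print(monthly_bill(20_000, rate))                    # no allowance
print(monthly_bill(20_000, rate, free_calls=1_000))  # first 1,000 calls at $0
print(monthly_bill(500, rate, free_calls=1_000))     # fully inside the allowance
```

Using `Decimal` rather than floats keeps sub-cent rates exact, which matters once a rate like $0.007 is multiplied by tens of thousands of calls.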
Worked example — search API at scale. A team running an agent on Exa issues 50,000 standard searches a month. At the raw searchResults rate of $0.005 per call, that’s $250. Switch every call to sourcedAnswer (an LLM-composed answer) at $0.006 and the same volume costs $300: a 20% per-call premium that adds $50/month purely from the output type chosen. Add 10,000 deep sourcedAnswer searches at $0.055 and another $550 lands on the bill. The base formula never changes; the rate does, per endpoint and per mode.
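The arithmetic in this example can be checked with a small per-mode rate card. The rates and volumes are the ones quoted above; the dictionary layout and the `cost` helper are illustrative assumptions, not Exa's API.

```python
from decimal import Decimal

# Per-call rates as quoted in the worked example.
RATES = {
    "searchResults": Decimal("0.005"),
    "sourcedAnswer": Decimal("0.006"),
    "deep sourcedAnswer": Decimal("0.055"),
}


def cost(volumes: dict[str, int]) -> Decimal:
    """Sum calls x rate over every mode in the workload."""
    return sum((RATES[mode] * calls for mode, calls in volumes.items()), Decimal(0))


raw = cost({"searchResults": 50_000})        # $250
answers = cost({"sourcedAnswer": 50_000})    # $300
premium = answers - raw                      # $50, i.e. 20% of the raw bill
deep = cost({"deep sourcedAnswer": 10_000})  # $550

print(raw, answers, premium, deep)
```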
Worked example — success-based metering. SerpApi charges only for successful searches. Its enterprise plan bundles 100,000 complimentary searches into a $3,750/month base, then meters extra calls per 1,000 by speed mode: on-demand at $7.50 (Best Effort), $15.00 (Ludicrous Speed), or $30.00 (Ludicrous Speed Max), or reserved at $2.75 / $5.50 / $11.00 per 1,000 on a commit. Blocked, errored, and CAPTCHA’d requests are not counted, so the meter tracks delivered value rather than attempted calls — a meaningful difference for scraping-adjacent workloads where failure rates are real.
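A rough sketch of how such a bill might be computed, using the on-demand figures above. Two things here are assumptions rather than documented SerpApi behavior: overage is pro-rated per search instead of rounded to whole blocks of 1,000, and the request counts in the usage line are invented for illustration.

```python
from decimal import Decimal

BASE_FEE = Decimal("3750")  # monthly base quoted above
INCLUDED_SEARCHES = 100_000  # searches bundled into the base

# On-demand overage price per 1,000 successful searches, by speed mode.
ON_DEMAND_PER_1K = {
    "Best Effort": Decimal("7.50"),
    "Ludicrous Speed": Decimal("15.00"),
    "Ludicrous Speed Max": Decimal("30.00"),
}


def enterprise_bill(attempted: int, failed: int, mode: str) -> Decimal:
    """Bill only successful searches beyond the included allotment.

    Assumption: overage is pro-rated rather than rounded up to full 1,000s.
    """
    successful = attempted - failed
    extra = max(successful - INCLUDED_SEARCHES, 0)
    return BASE_FEE + Decimal(extra) / 1000 * ON_DEMAND_PER_1K[mode]


# Hypothetical month: 130,000 attempts, of which 10,000 were blocked or errored.
print(enterprise_bill(130_000, 10_000, "Best Effort"))
```

Under these assumptions the 10,000 failed requests never reach the meter, so the overage is charged on 20,000 searches rather than 30,000.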
Worked example — per-operation media. On Recraft, API units are prepaid at $1 = 1,000 units and each operation deducts a fixed amount: a background removal or vectorization costs $0.01 per request, a V4 raster generation $0.04, and a V4 Pro vector image $0.30. Five thousand background removals is a flat $50 — no tokens, no compute time, just calls. See the introduction to usage-based pricing guide for how this maps to value-metric selection, and the usage invoicing and billing cycles guide for how per-call counts roll into a monthly invoice.
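The prepaid-unit mechanic can be sketched as a lookup table of fixed deductions. The unit costs below are derived from the dollar prices above at $1 = 1,000 units (so $0.01 becomes 10 units); the operation keys and helper names are hypothetical and are not Recraft's API identifiers.

```python
from decimal import Decimal

UNITS_PER_DOLLAR = 1_000

# Fixed deduction per request, derived from the dollar prices quoted above.
UNIT_COST = {
    "background_removal": 10,  # $0.01
    "vectorization": 10,       # $0.01
    "v4_raster": 40,           # $0.04
    "v4_pro_vector": 300,      # $0.30
}


def units_needed(operations: dict[str, int]) -> int:
    """Total units a batch of operations would deduct from a prepaid balance."""
    return sum(UNIT_COST[op] * count for op, count in operations.items())


def units_to_dollars(units: int) -> Decimal:
    return Decimal(units) / UNITS_PER_DOLLAR


batch = {"background_removal": 5_000}
print(units_needed(batch), units_to_dollars(units_needed(batch)))
```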
Companies using this
The companies below all list api-calls as a billing unit in their blueprint entry — verified against each vendor’s official pricing page. The category clusters tightly into three groups: search APIs, scraping/extraction APIs, and per-image or per-clip media generation, with a long tail of inference platforms that expose a per-request meter alongside tokens.
Patterns observed
- Search APIs are the purest expression of the model. SerpApi, Tavily, Exa, Linkup, and You.com all price the search request as the atomic unit. Tavily denominates it in credits ($0.008/credit pay-as-you-go), Exa quotes per-1k ($7/1k), and SerpApi sells monthly search allotments, but underneath, the meter is the call.
- Per-1k quoting is near-universal for sub-cent prices. Because a single search or scrape often costs a fraction of a cent, vendors quote per 1,000 to stay legible. Exa’s “$7 / 1k” (~$0.007 each) and ScraperAPI’s “$3 per 1,000 requests without rendering, $7 with” are the canonical examples. The model stays purely usage-based; only the display unit is scaled.
- The better APIs split one “call” into priced sub-types. Rather than charging one flat rate, Exa prices Search, Contents, and Research separately and adds a deep-search 10x premium; Tavily publishes a per-endpoint credit table where a Research call ranges from 4 to 250 credits. This preserves the simplicity of per-call billing while letting price track the real cost of serving each request type.
- Success-based metering is a credible variant. SerpApi and ZenRows bill only on successful results: ZenRows charges a CPM on successful Scraper API responses, and SerpApi excludes blocked/errored/CAPTCHA’d searches. For any API with a meaningful failure rate, charging per delivered result rather than per attempted call aligns the meter with value.
- Media generators converge on a penny for utility calls. Recraft, Ideogram, and PhotoRoom all price background removal at $0.01 per request, even as their generative image calls span $0.022 to $0.30. The utility operation is cheap and standardized; the generative one carries the model premium.
Counterexamples & variants
The cleanest counterexample sits inside the model itself. Exa’s Research endpoint shows where flat per-call pricing breaks down: a single Research call ranges from $0.025 to $0.50 depending on reasoning depth, a 20x spread on what is nominally “one request.” Tavily has the same problem: its Research endpoint consumes anywhere from 4 to 250 credits per call (roughly a 60x spread), and that range lives only in the docs, making the per-call abstraction actively misleading for buyers who budget off the headline “$0.008/credit” rate. When the work behind a request varies by more than an order of magnitude, “per call” stops being a useful unit and the vendor has to fall back to dynamic credit ranges.
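To see how wide the budgeting gap gets, the Tavily figures above (4 to 250 credits per Research call, $0.008 per credit) can be turned into a best-case and worst-case bound. The helper below is a hypothetical sketch, and the 1,000-call volume is invented for illustration.

```python
from decimal import Decimal

PRICE_PER_CREDIT = Decimal("0.008")  # headline pay-as-you-go rate
RESEARCH_CREDITS = (4, 250)          # min and max credits per Research call


def budget_range(calls: int) -> tuple[Decimal, Decimal]:
    """Best-case and worst-case spend for a given number of Research calls."""
    low_credits, high_credits = RESEARCH_CREDITS
    return (
        calls * low_credits * PRICE_PER_CREDIT,
        calls * high_credits * PRICE_PER_CREDIT,
    )


low, high = budget_range(1_000)
print(f"1,000 Research calls: ${low} to ${high}")
```

A single call lands anywhere between $0.032 and $2.00 under these numbers, so a budget built on call count alone cannot narrow the estimate without knowing the mix of request depths.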
The variant worth calling out is credits as a per-call proxy. Diffbot, Firecrawl, PhotoRoom, and Tavily all expose a credit currency rather than a raw per-call price, where each endpoint deducts a different number of credits. This lets a vendor keep a single legible unit on the pricing page while charging more for expensive calls — a deliberate softening of pure per-call pricing that trades transparency for flexibility. ZenRows goes further still, layering a CPM on successful scraper calls plus a separate per-GB charge for proxy and browser data, so the request is only one of several meters.
Finally, the large inference platforms — OpenAI, Anthropic, Google, DeepSeek, Cohere, and Mistral AI — list api-calls as a billing unit but are not really per-call businesses. Their economics run on tokens; the request count is incidental. They appear here because the corpus records the meter, not because per-call pricing drives their bill. For those vendors, see the token-based pricing theme.
What this means for buyers vs vendors
For buyers
Per-call pricing is the easiest model to forecast — until it isn’t. For fixed-cost endpoints (a SerpApi search, a Recraft background removal), multiply your expected request volume by the published rate and you have a budget. The trap is the variable-cost endpoint hiding inside an otherwise flat rate card: Exa’s Research at up to $0.50/request, Tavily’s Research at up to 250 credits/call. Before committing, map which endpoints your workload actually hits and price the expensive ones explicitly. Watch for success-based metering too — on SerpApi and ZenRows, high error rates won’t inflate your bill, but on APIs that charge per attempted call, a flaky target site can double your cost. Model your scenario with the pricing calculator and read the choosing the right usage metric guide to sanity-check that the vendor’s unit matches your value.
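The "flaky target site can double your cost" point follows from simple expected-value arithmetic, sketched below. It assumes every failed request is retried until it succeeds and that failures are independent, so the expected number of attempts per delivered result is 1 / success rate. The function name and the $0.007 rate are illustrative.

```python
from decimal import Decimal


def effective_cost_per_result(
    rate: Decimal, success_rate: Decimal, bills_failures: bool
) -> Decimal:
    """Expected spend per delivered result.

    Assumption: failures are retried until success, so a meter that counts
    every attempt is hit 1 / success_rate times per result on average.
    """
    if not (Decimal(0) < success_rate <= Decimal(1)):
        raise ValueError("success_rate must be in (0, 1]")
    return rate / success_rate if bills_failures else rate


rate = Decimal("0.007")
half = Decimal("0.5")

print(effective_cost_per_result(rate, half, bills_failures=False))  # success-based meter
print(effective_cost_per_result(rate, half, bills_failures=True))   # per-attempt meter
```

At a 50% success rate the per-attempt meter costs twice as much per delivered result, while the success-based meter stays at the published rate.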
For vendors
Per-call pricing buys you legibility, which converts: a developer who can estimate a bill in ten seconds signs up faster. But the unit only works if your cost-to-serve is roughly uniform per call. The moment one endpoint costs 100x another, flat per-call pricing either loses you money on the expensive calls or overcharges on the cheap ones. The corpus’s best designs solve this without abandoning the model: split calls into priced sub-types (Exa, Tavily), quote per-1k to keep sub-cent prices readable, and offer success-based metering where failure rates are real so buyers trust the meter. If your endpoints vary wildly, a credit currency (Diffbot, Firecrawl) lets you keep one unit on the page while charging differentially underneath — at the cost of the transparency that made per-call attractive in the first place. See usage invoicing and billing cycles for how to roll per-call counts into a clean monthly invoice.
| Company | Product | Pricing model | Billing units | Free tier | Verified |
|---|---|---|---|---|---|
| Anthropic | Claude API (token-based) + Claude.ai consumer subscriptions (Free/Pro/Team/Enterprise) | freemium, subscription, seat-based, +1 | tokens, seats, api-calls | Yes | 2026-05-29 |
| AssemblyAI | Speech-to-Text & Audio AI APIs | pure-usage | api-calls, tokens | Yes | 2026-05-29 |
| Bland AI | AI phone call automation platform: inbound and outbound voice agents at scale | hybrid, pure-usage, subscription | api-calls, credits, media-minutes | Yes | 2026-05-29 |
| Browserbase | Browser-agent infrastructure: headless browser sessions, web Search/Fetch APIs, agent identity, runtime, and a model gateway behind one API key | freemium, hybrid, pure-usage | browser-hours, api-calls, requests, +2 | Yes | 2026-06-02 |
| Cartesia | Real-time voice AI platform (Sonic TTS, voice cloning, voice agents) | freemium, subscription, hybrid, +1 | credits, requests, api-calls, +1 | Yes | 2026-05-29 |
| Cerebras | Wafer-scale AI inference cloud and WSE hardware systems | pure-usage, subscription, commitment | tokens, api-calls, gpu-hours | Yes | 2026-05-30 |
| Clipdrop | AI image-editing and generation tools (background removal, upscaling, text-to-image), now part of Jasper | freemium, subscription | requests, credits, api-calls | Yes | 2026-06-05 |
| Cohere | Command, Embed, Rerank APIs | pure-usage | tokens, api-calls, requests | Yes | 2026-05-29 |
| Deepgram | Usage-based speech-to-text, text-to-speech, and voice agent APIs | pure-usage, freemium | media-minutes, tokens, credits, +1 | Yes | 2026-05-31 |
| DeepSeek | DeepSeek API (V4-Flash + V4-Pro models, 1M context) with token-based pricing and aggressive cache discounts | freemium, pure-usage | tokens, api-calls | Yes | 2026-06-05 |
| Diffbot | Web-extraction APIs (Extract, Crawl, Natural Language) plus a Knowledge Graph, metered on monthly credits | hybrid, freemium | credits, api-calls | Yes | 2026-06-04 |
| Exa | AI web search API for agents: search, contents, deep research, and monitoring endpoints billed per request | pure-usage, freemium | requests, credits, api-calls, +1 | Yes | 2026-06-01 |
| Firecrawl | Web-scraping and data-extraction API for AI agents: scrape, crawl, map, search, and extract pages into clean markdown/JSON | subscription, hybrid, freemium | credits, pages-rendered, api-calls, +1 | Yes | 2026-06-02 |
| Freepik | AI creative suite: image, video, audio generation plus a 200M+ stock library | subscription, hybrid, pure-usage, +1 | seats, credits, api-calls | Yes | 2026-06-05 |
| Google | Gemini API & AI Studio | pure-usage, freemium | tokens, requests, api-calls | Yes | 2026-05-29 |
| Groq | GroqCloud: LPU-based ultra-low-latency inference API for Llama, GPT-OSS, Qwen, Whisper, and Mixtral | pure-usage, hybrid, commitment | tokens, requests, api-calls | Yes | 2026-05-29 |
| Hedra | AI video, avatar, image, and audio generation platform (Hedra Studio + API) | subscription, freemium | credits, media-minutes, characters, +1 | Yes | 2026-06-04 |
| HeyGen | AI avatar and video generation platform | subscription, freemium | credits, seats, api-calls | Yes | 2026-05-30 |
| Ideogram | Text-aware AI image generation platform | freemium, subscription, hybrid | credits, api-calls | Yes | 2026-05-31 |
| Jina AI | Search Foundation API (Embeddings, Reranker, Reader, DeepSearch, Classifier) | pure-usage, freemium | tokens, requests, api-calls | Yes | 2026-06-03 |
| Linkup | Web search API for AI agents: Search, Fetch, and async Research endpoints with grounded, structured results | pure-usage, freemium | requests, credits, api-calls | Yes | 2026-06-04 |
| Mistral AI | Open and commercial LLM APIs | pure-usage, freemium | tokens, seats, api-calls, +2 | Yes | 2026-05-31 |
| Novita AI | Pay-as-you-go AI cloud: 200+ model inference APIs, on-demand GPUs, and per-second agent sandboxes under one API | pure-usage, freemium | tokens, gpu-hours, cpu-hours, +2 | Yes | 2026-06-02 |
| OpenAI | ChatGPT consumer subscriptions + GPT-5.x API with token-based usage billing | freemium, subscription, seat-based, +1 | tokens, seats, api-calls, +1 | Yes | 2026-05-30 |
| OpenMeter | Open-source usage metering and billing platform for AI, agentic, and developer tools | freemium | events, api-calls | Yes | 2026-06-03 |
| Patronus AI | LLM and AI agent evaluation, monitoring, and guardrail platform | freemium, pure-usage | api-calls, credits | Yes | 2026-06-04 |
| Perplexity AI | AI-native answer engine with citations and multi-model search | freemium, subscription, seat-based, +1 | seats, tokens, requests, +1 | Yes | 2026-05-29 |
| PhotoRoom | AI image-editing app and per-image Image Editing / Remove Background API for e-commerce product visuals | subscription, pure-usage, freemium | api-calls, credits, seats | Yes | 2026-06-05 |
| Playground | AI image generation and graphic-design studio with a monthly credit pool | freemium, subscription, hybrid | credits, api-calls | Yes | 2026-06-04 |
| Recraft | AI image and vector generation studio plus a per-image generation API | freemium, subscription, hybrid | credits, api-calls, seats | Yes | 2026-06-01 |
| Rev AI | Pay-as-you-go speech-to-text, transcription, and audio-intelligence APIs | pure-usage, freemium | media-minutes, credits, api-calls | Yes | 2026-06-04 |
| Rows | Rows AI spreadsheet | subscription, hybrid | seats, tasks, api-calls | Yes | 2026-06-08 |
| ScraperAPI | Web scraping API that handles proxies, browsers, and CAPTCHAs behind a single endpoint | subscription, pure-usage | credits, requests, api-calls | No | 2026-06-04 |
| SerpApi | Real-time search-results API (Google, Bing, and other engines) | subscription, pure-usage | api-calls, requests | Yes | 2026-06-04 |
| Tavily | Tavily Search API | pure-usage, freemium | credits, api-calls, requests | Yes | 2026-06-03 |
| Upstash | Upstash (Redis, Vector, QStash, Search, Workflow) | pure-usage, freemium, hybrid | requests, api-calls, vectors-indexed, +3 | Yes | 2026-06-03 |
| You.com | Web search, contents, research, and finance-research APIs for AI systems | pure-usage, freemium | api-calls, requests, pages-rendered | Yes | 2026-06-01 |
| ZenRows | Universal Scraper API, Scraping Browser, and Residential Proxies | hybrid, subscription, pure-usage | requests, api-calls, bandwidth-gb, +2 | Yes | 2026-06-04 |
FAQ
What is per-API-call pricing?
Per-API-call pricing is a billing unit where the customer is charged per API request, regardless of payload size or processing time. One request triggers one charge, usually quoted per call or per 1,000 calls.
How is per-API-call pricing different from token-based pricing?
Token pricing meters the volume of text processed inside a request, so two calls can cost very differently. Per-call pricing charges a flat amount per request no matter how large the input or output is, which makes it simpler to forecast but a looser fit to the vendor's actual serving cost.
Do vendors charge for failed API calls?
It varies. Some, like SerpApi and ZenRows, bill only on successful results and do not count blocked, errored, or CAPTCHA'd requests. Many other request-metered APIs charge for every call that hits the endpoint, so failure handling is a real cost line for high-error workloads like scraping.
Which companies use per-API-call pricing?
It dominates the search-API category (SerpApi, Tavily, Exa, Linkup, You.com), web scraping and extraction (Firecrawl, ScraperAPI, ZenRows, Diffbot), and image and media generation (PhotoRoom, Recraft, Ideogram, HeyGen). Across the corpus, 38 companies list api-calls as a billing unit.
Why do vendors quote prices per 1,000 calls instead of per call?
Because the per-call price is often a fraction of a cent. Quoting '$7 per 1,000 requests' instead of '$0.007 each' keeps the model purely usage-based while staying readable, and it makes volume tiers easy to compare on a pricing page.
Trivia
- SerpApi bills only successful searches (blocked, errored, or CAPTCHA'd requests are not counted) and quotes its enterprise rate as $7.50 per 1,000 searches on-demand, dropping to $2.75 per 1,000 on a reserved commit.
- Exa quotes "$7 / 1k" for its Search endpoint, about $0.007 per call, because per-1,000 quoting makes a sub-cent price legible on a pricing page, while its fixed-effort Research tier runs from $0.025 to $0.50 per request, a 20x spread on one endpoint.
- Per-image processing has converged near a penny: Recraft, Ideogram, and PhotoRoom all price background removal at exactly $0.01 per request, even though their generative image calls range from $0.022 to $0.30 each.
Related billing units
- Credit-Based Billing: A billing unit where customers pre-purchase or are allocated a pool of credits that deplete as they use the product, often at variable rates per feature.
- Token-Based Pricing: A billing unit common in LLM and AI products, where customers are charged per input and output token processed.
- Per-Seat Pricing: A billing unit where the vendor charges a fixed fee per named user, regardless of how much each user consumes.
- Per-Resolution Pricing: A billing unit unique to AI customer-support products, where the vendor charges only when an AI agent resolves a customer issue without escalation.
- Bandwidth-Based Pricing: A billing unit where customers are charged per gigabyte of data transferred out of the platform.
- Per-Function-Invocation Pricing: A billing unit where customers are charged per serverless function invocation, often combined with a separate compute-time charge.
- CPU-Hour Pricing: A billing unit where customers are charged for the CPU time their workloads consume, typically measured in vCPU-seconds or vCPU-hours.
- GB-Hour Pricing: A billing unit where customers are charged for the memory their workloads consume over time, measured in gigabyte-hours.
- GPU-Hour Pricing: A billing unit where customers are charged for GPU time consumed, typically measured per-second or per-hour by GPU type.
- Per-GB Storage Pricing: A billing unit where customers are charged per gigabyte of data stored on the platform per month.
- Media-Minute Pricing: A billing unit where customers are charged per minute of audio or video processed; used by speech, voice, and video AI vendors.