How much does Jina AI cost?

Jina AI prices all of its Search Foundation APIs from one shared token balance. Every new API key includes 10 million free tokens (non-commercial). Paid top-ups are $50 for 1 billion tokens ($0.050 per 1M) on a Standard key and $500 for 11 billion tokens ($0.045 per 1M) on a Premium key.

Does Jina AI have a free tier?

Yes. Every new API key comes with 10 million free tokens and no credit card is required, but those free tokens are restricted to non-commercial use under a CC-BY-NC license. Free-tier rate limits apply (for example 500 RPM on the Reader API).

How does Jina AI's token billing work across products?

All five products — Embeddings, Reranker, Reader, DeepSearch and Classifier — draw from one shared per-key token balance. Embeddings and Reranker count input tokens; Reader counts output-response tokens; web search charges a fixed cost starting at 10,000 tokens per request; DeepSearch counts every token across the whole reason-search-read process.

What changed in Jina AI's pricing on May 6, 2025?

Jina AI introduced its current pricing model on May 6, 2025. Customers who enabled auto-recharge before that date are grandfathered onto their original price; the newer pricing applies only if they modify their auto-recharge settings or purchase a new API key.

Did Elastic acquiring Jina AI change the pricing?

Elastic (NYSE: ESTC) completed its acquisition of Jina AI on October 9, 2025, and founder Han Xiao became Elastic's VP of AI. As of the June 2026 review the standalone Jina Search Foundation API and its shared token-credit pricing continue to operate independently of Elastic's own product billing.

How much does a DeepSearch query cost on Jina AI?

DeepSearch counts the total tokens used across the entire reason-search-read process. A simple query may use around 10,000 tokens, while a complex multi-hop query can consume roughly 500,000 tokens — all drawn from the same shared key balance, controllable via the budget_tokens parameter.

Jina AI Pricing

AI Summary

Jina AI uses a pure usage model built on a single shared token-credit balance: one API key holds one pool of prepaid tokens that every Search Foundation product — Embeddings, Reranker, Reader, DeepSearch and Classifier — draws from, with no seats and no per-product subscriptions.
Every new Jina AI API key includes 10 million free tokens with no credit card required, restricted to non-commercial use under a CC-BY-NC license.
Paid token top-ups are $50 for 1 billion tokens ($0.050 per 1M) on a Standard key and $500 for 11 billion tokens ($0.045 per 1M) on a Premium key, with the larger package roughly 10% cheaper per token and carrying higher rate limits.
Rate limits scale across no-key, Free, Paid and Premium tiers, measured in requests-per-minute (RPM) and tokens-per-minute (TPM), enforced whichever threshold trips first.
Jina AI introduced its current shared-balance pricing model on May 6, 2025; auto-recharge subscribers from before that date are grandfathered onto their original price until they change settings or buy a new key.
Jina AI was acquired by Elastic (NYSE: ESTC) in a deal completed October 2025, but the Jina Search Foundation API and its token-credit pricing continue to operate as a standalone product.

Pricing summary

Jina AI 2026 — shared token-credit balance

Pure usage: one API key holds one token balance that every product (Embeddings, Reranker, Reader, DeepSearch, Classifier) draws down; top up as you go.

Free (Toy Experiment)

Evaluation / non-commercial

0.050 / 1M

Standard (Prototype)

$50 /1B tokens

Prototype development

0.045 / 1M

Premium (Production)

$500 /11B tokens

Production deployment

Tokens never expire and are shared across all Search Foundation APIs. Cloud (AWS SageMaker / Azure) and on-prem / VPC deployments are billed separately through the CSP account or via sales. Dashboard token balance and top-up are login-gated; package prices above are from the public pricing block on jina.ai/embeddings.

About

Jina AI is a Berlin-based search-foundation company that builds the embedding, reranking, reading and search models behind retrieval-augmented generation (RAG) and AI-agent systems. Its product line — the Search Foundation API — bundles five hosted endpoints: Embeddings (multimodal, multilingual jina-embeddings-v5), Reranker (jina-reranker-v3), Reader (r.jina.ai URL-to-markdown and s.jina.ai web search), DeepSearch (an iterative reason-search-read agent, jina-deepsearch-v1), and Classifier (zero-/few-shot text classification).

The company targets developers and AI teams building search, RAG and agentic applications, with native integrations into vector stores and LLMOps frameworks (Pinecone, Qdrant, Weaviate, Milvus, MongoDB, LlamaIndex, LangChain, Haystack and more). Models are also distributed open-weight on Hugging Face and deployable through AWS SageMaker, Microsoft Azure, and (soon) Google Cloud for teams that prefer to handle billing through their cloud provider.

Commercially, Jina AI positions itself as a single drop-in API for the whole retrieval stack rather than a per-product subscription: one API key carries one token balance that every endpoint draws from, so a team can mix embeddings, reranking, reading and search without managing separate plans or invoices.

Founded in Berlin in 2020 by Han Xiao and co-founders, Jina AI raised a $30M Series A led by Canaan Partners in November 2021 (roughly $39M total funding). On October 9, 2025, Elastic (NYSE: ESTC) completed its acquisition of Jina AI, with Han Xiao becoming Elastic’s VP of AI. As of this review the Jina Search Foundation API continues to operate as a standalone product with its own token-credit pricing — the analysis below covers that standalone API, not Elastic’s broader platform billing.

Pricing summary : How Jina’s shared token-credit balance works

Jina AI uses a pure usage model built on a single shared token-credit balance. Each API key holds a pool of prepaid tokens, and every product — Embeddings, Reranker, Reader, DeepSearch and Classifier — consumes from that same pool. There are no seats and no per-product subscriptions; you buy tokens once and spend them anywhere.

Free allotment: every new API key includes 10 million free tokens, with no credit card required, restricted to non-commercial use (CC-BY-NC).
Paid top-ups: $50 buys 1 billion tokens (0.050 / 1M) on a Standard key; $500 buys 11 billion tokens (0.045 / 1M) on a Premium key — the larger package is ~10% cheaper per token and unlocks higher rate limits.
Rate-limit tiers: limits scale by key class — no-key, Free, Paid and Premium — measured in requests-per-minute (RPM) and tokens-per-minute (TPM), enforced whichever threshold trips first.
Per-product token counting differs: Embeddings and Reranker count input tokens; Reader (r.jina.ai) counts output-response tokens; web search (s.jina.ai) charges a fixed cost starting at 10,000 tokens per request; DeepSearch counts every token across the whole reason-search-read process; the Classifier counts input (and, for zero-shot, label) tokens.

What makes this different: most retrieval vendors price each capability (embeddings vs. rerank vs. search) on its own meter — Jina collapses the entire stack onto one fungible token balance, so the buying decision is “how many tokens” rather than “which products.”

Pricing by product

All five products draw from the same shared token balance and the same three rate-limit tiers (Free / Paid / Premium). The token packages are universal; what differs per product is how tokens are counted and the rate limits that apply.

Token packages (shared across all APIs)

Tier	Price	Included	Key mechanics
Free (Toy Experiment)	$0	10M tokens	One-time per new key; non-commercial only (CC-BY-NC); no card required
Standard (Prototype)	$50	1B tokens	0.050 / 1M tokens; Standard key; basic key management + technical support
Premium (Production)	$500	11B tokens	0.045 / 1M tokens; Premium key; higher rate limits + priority support

Embeddings API (`api.jina.ai/v1/embeddings`)

Tier	Rate limit	Token counting	Key mechanics
Free	100 RPM & 100K TPM	Count input tokens	`jina-embeddings-v5` text + omni models
Paid	500 RPM & 2M TPM	Count input tokens	Standard key
Premium	5,000 RPM & 50M TPM	Count input tokens	Premium key

Reranker API (`api.jina.ai/v1/rerank`)

Tier	Rate limit	Token counting	Key mechanics
Free	100 RPM & 100K TPM	Count input tokens	`jina-reranker-v3`
Paid	500 RPM & 2M TPM	Count input tokens	Standard key
Premium	5,000 RPM & 50M TPM	Count input tokens	Premium key

Reader API (`r.jina.ai` URL-to-markdown · `s.jina.ai` web search)

Tier	Reader (`r.jina.ai`)	Search (`s.jina.ai`)	Token counting
Without key	20 RPM	blocked	Reader: output-response tokens
Free	500 RPM	100 RPM	Search: fixed cost, starting at 10,000 tokens per request
Paid	500 RPM	100 RPM	ReaderLM-v2 conversion costs 3× tokens
Premium	5,000 RPM	1,000 RPM	—

DeepSearch API (`deepsearch.jina.ai/v1/chat/completions`)

Tier	Rate limit	Token counting	Key mechanics
Free	50 RPM	Count total tokens in the whole process	`jina-deepsearch-v1`; ~10K tokens for a simple query, ~500K for a complex one
Paid	50 RPM	Count total tokens in the whole process	`budget_tokens` / `team_size` parameters control depth and breadth
Premium	500 RPM	Count total tokens in the whole process	Highest throughput

Classifier API (`api.jina.ai/v1/train` · `/v1/classify`)

Tier	Rate limit	Token counting	Key mechanics
Free	25 RPM & 25K TPM	Train: input × num_iters · Few-shot: input · Zero-shot: input + label tokens	Train + zero/few-shot
Paid	125 RPM & 500K TPM	(as above)	Standard key
Premium	1,250 RPM & 12M TPM	(as above)	Premium key

Sales motions across products: PLG / self-serve token top-ups for all individual and prototype usage (Free / Standard / Premium keys); sales-led for customized Kubernetes / VPC and on-premises deployments, and for cloud-marketplace (AWS SageMaker / Azure) billing through the customer’s CSP account.

Hidden costs : where DeepSearch and Reader silently drain the shared balance

The shared token balance is simple to reason about for embeddings — but the same balance funds products whose per-call token cost is wildly different. The two archetypes below show how a clean “$0.050/1M” headline rate translates into real monthly bills, and where the shared pool gets drained faster than teams expect.

Archetype A: A RAG startup indexing a knowledge base on Embeddings + Reranker

A small team embeds a 50M-token corpus once, re-embeds ~10M tokens/month of new content, and reranks 1M user queries/month (averaging ~2,000 input tokens per rerank call):

Line item	Monthly cost
Initial corpus embedding: 50M tokens × $0.050/1M (one-time, amortized)	~$2.50
Incremental embedding: 10M tokens × $0.050/1M	$0.50
Reranker: 1M queries × ~2,000 tokens = 2B tokens × $0.050/1M	$100.00
Estimated total (steady state)	~$101/month

The embeddings are nearly free; the reranker is the cost center because it re-tokenizes the full candidate document set on every query. At scale, switching to the $0.045/1M Premium package (the $500/11B top-up) and trimming candidate-set size matters far more than optimizing the embedding step.

Archetype B: An agent product running DeepSearch on every user question

A research-assistant app routes every user question through DeepSearch at default depth — say 20,000 complex questions/month at ~500K tokens each:

Line item	Monthly cost
DeepSearch: 20,000 queries × 500K tokens = 10B tokens × $0.045/1M	$450.00
Plus Reader/search sub-calls already counted inside DeepSearch’s total	(included)
Estimated total	~$450/month

A single DeepSearch query at ~500K tokens costs ~$0.0225 — cheap per call, but because the agent counts every token across search, page-reads, reflection and synthesis, an app that runs it on every question can exhaust an 11B-token Premium package in a single month. The budget_tokens and reasoning_effort parameters are the real cost levers here, not the package price.

Hidden costs to watch:

Reader output, not input, is metered — r.jina.ai counts the output markdown tokens, so reading a long page costs more than the URL implies; the ReaderLM-v2 high-quality mode costs 3× tokens.
Web search is a flat 10,000-token floor — every s.jina.ai request costs at least 10,000 tokens regardless of result size, so high-frequency search loops add up fast.
DeepSearch is unbounded by default — without setting budget_tokens, a single complex query can silently consume ~500K tokens; multi-agent team_size > 1 multiplies consumption while sharing one budget.
Free tokens are non-commercial — the 10M free allotment is CC-BY-NC; any production use requires a paid top-up, so the “free tier” is an evaluation budget, not a production runway.

Want to estimate your own Jina AI bill? Use the Jina AI pricing calculator to model your monthly cost across embedding volume, rerank query count, Reader pages and DeepSearch depth.

Pricing evolution : from open-source neural search to a shared token balance

Cadence

Quarter	Price changes	Product / SKU additions	Notes
2023 Q4	0	1	Open-source 8K-context embeddings + hosted Embeddings API; per-token rate from this era is not screenshot-verified
2024 Q2	0	1	Reader API (`r.jina.ai`) and web search (`s.jina.ai`) launched on the shared token model
2024 Q3	0	1	ReaderLM small language model released (3× token cost in high-quality mode)
2025 Q1	0	1	DeepSearch API (`jina-deepsearch-v1`) launched, priced on total tokens per reason-search-read process
2025 Q2	1	0	2025-05-06 new shared-balance pricing model: 10M free tokens, $50/1B Standard, $500/11B Premium; pre-date auto-recharge grandfathered
2025 Q4	0	0	2025-10-09 acquired by Elastic (NYSE: ESTC); standalone API pricing unchanged

Tracked range: 2023 Q4–2026 Q2. Per-token rates and credit-package figures prior to the May 6, 2025 model change could not be screenshot-verified — Wayback snapshots of jina.ai/embeddings and jina.ai/api-dashboard archived as blank client-rendered skeletons, so historical price values are recorded as unknown rather than guessed. Quarters not listed were verified stable.

Notable changes

2023-10 — Hosted Embeddings API launched alongside open-source jina-embeddings-v2; a 563-point Hacker News thread marked Jina’s arrival as a credible OpenAI embeddings alternative.
2024-04 — Reader API launched, extending the token model from pure embeddings to URL-reading and web search, and establishing the “one key, many products” shared-balance pattern.
2025-02 — DeepSearch launched (jina-deepsearch-v1), the first Jina product where a single call can consume hundreds of thousands of tokens — making token budgeting a first-class API concern.
2025-05-06 — Current shared-balance pricing model introduced (first-party attested on the live pricing page). This is the single confirmed pricing inflection in the tracked range; older subscribers were grandfathered.
2025-10-09 — Elastic acquisition completed; pricing left intact.

The May 6, 2025 pricing change in detail

The May 6, 2025 change is the one pricing inflection Jina attests to directly: the live pricing surface carries the notice “We introduced a new pricing model on May 6th, 2025. If you enabled auto-recharge before this date, you’ll continue to pay the old price (the one when you purchased). The new pricing only applies if you modify your auto-recharge settings or purchase a new API key.” The grandfathering clause is a deliberate trust move — it protects existing auto-recharge customers from a silent price increase, and it means two customers on the same product today can be paying different per-token rates depending on when they first enabled auto-recharge. Because the pre-change rate card is not screenshot-verifiable (the archived pages rendered blank), the exact magnitude of the change is recorded as unknown; third-party trackers describe the earlier v3-era rate as roughly $0.02/1M with ~1M free tokens per key, which would make the current $0.050/1M / 10M-free structure a re-basing rather than a simple cut or hike.

What’s unique : one fungible token balance for an entire retrieval stack

1. One shared token balance across five products — not five meters. Most retrieval vendors price embeddings, reranking, and search on separate meters with separate invoices. Jina collapses the whole stack onto a single fungible per-key token balance, so the buying decision is “how many tokens” rather than “which products.” This is a genuinely different take on usage-based pricing models: the value metric is normalized to one unit (tokens) even though the underlying products do very different work.

2. Per-product token counting differs even though the unit is shared. The clever part is that the same token unit means different things per product: Embeddings and Reranker count input tokens, Reader counts output tokens, web search charges a flat 10,000-token floor, and DeepSearch counts every token across an entire agent loop. This lets Jina price radically different cost profiles on one balance — a careful exercise in choosing the right usage metric where the headline meter stays constant but the per-call cost reflects real compute.

3. Tokens never expire and the free tier is generous but license-gated. 10M free tokens per key with no credit card is unusually generous for an embeddings API, but the CC-BY-NC restriction means it is explicitly an evaluation budget, not a production runway. This is a sharp free-tier design decision: maximize trial conversion while legally forcing commercial users to pay.

4. Grandfathered pricing as a trust mechanic. The May 6, 2025 change preserved old prices for existing auto-recharge customers — a deliberate signal that Jina would not silently reprice active workloads. Few usage-based billing vendors document grandfathering this explicitly on the live pricing page.

5. Cloud-marketplace billing as a parallel rail. Beyond the self-serve token top-up, Jina lets AWS SageMaker / Azure customers deploy models and pay through their own CSP account — a second pricing rail that bypasses the token balance entirely for enterprises that prefer consolidated cloud billing.

Strengths & weaknesses

Strengths	Weaknesses
One shared token balance across five products is radically simpler than per-product meters	A single DeepSearch query can silently consume ~500K tokens, making spend hard to predict without `budget_tokens` discipline
10M free tokens with no credit card is a generous, frictionless evaluation budget	Free tokens are non-commercial (CC-BY-NC) — the “free tier” cannot back a production app
Transparent public package prices ($50/1B, $500/11B) with tokens that never expire	The actual per-key balance, usage view and top-up confirmation are login-gated and not publicly inspectable
Grandfathered pricing on the May 2025 change protects existing customers from silent hikes	Historical rate card is not publicly recoverable — archived pages rendered blank, so price history is opaque
Premium $500/11B package is ~10% cheaper per token and unlocks higher rate limits	Reranker re-tokenizes full candidate sets, so rerank cost dominates RAG bills at query scale
Cloud-marketplace (AWS/Azure) and VPC deployment give enterprises a non-token billing rail	Post-Elastic-acquisition roadmap for standalone API pricing is uncertain

Billing UX : prepaid token controls and auto top-up

Jina exposes its billing controls in the per-key API Key & Billing dashboard (login-gated) and in the public top-up flow on each product page. Named controls observed on the live pricing surfaces:

Available tokens — a live per-key balance counter shown beside the API key on every product playground.
Top up this API key with more tokens — recharge an existing key with the $50 (1B) or $500 (11B) package rather than creating a new key.
Auto top-up on low token balance — when the balance drops below a set threshold (e.g. < 1M tokens), Jina automatically recharges the saved payment method for the last-purchased package. Pre-May-6-2025 auto-recharge subscribers are grandfathered onto their original price until they change settings or buy a new key.
Two Ways to Purchase — a toggle between (a) the Jina Search Foundation API top-up and (b) deployment through a cloud service provider (AWS / Azure), where billing is handled in the CSP account.
Rate Limit panel — a per-product RPM/TPM reference (no-key / Free / Paid / Premium) reachable from every playground, used to size which key class a workload needs.
Usage view — a per-key usage breakdown in the dashboard (login-gated).

Currency note: depending on location, charges may be in USD, EUR or other currencies, and taxes may apply; figures above are the public USD package prices. The dashboard token balance and top-up confirmation are login-gated and were not captured.

Strategic wins : where Jina’s token-balance model paid off

1. Collapsing five products onto one token balance removed packaging friction entirely

By making one API key carry one token balance for embeddings, reranking, reading, search, deep-search and classification, Jina eliminated the most common adoption blocker in retrieval tooling: deciding which products to buy and how to budget each. A developer who buys 1B tokens can experiment across the whole stack without new contracts or invoices. This is a textbook application of usage-based pricing as a growth lever — the buying decision shrinks to a single number, which lowers the activation energy for trying additional products.

2. A 10M-token, no-card free tier maximized top-of-funnel while CC-BY-NC protected revenue

Jina’s free allotment is generous enough to run real evaluation workloads, which feeds developer adoption and the kind of organic community discussion that put its embeddings on a 563-point Hacker News thread. But the CC-BY-NC license gate means commercial users must convert to paid — so the free tier drives trials without cannibalizing production revenue. The design separates “prove it works” from “run it in production” cleanly.

3. Open-weight distribution turned the free models into a paid-API funnel

By publishing models open-weight on Hugging Face while selling the hosted API, Jina built a two-sided funnel: self-hosters validate the models for free and become advocates, while teams that don’t want to run GPUs convert to the token-balance API. This open-core distribution mirrors the land-and-expand patterns seen across developer-first AI infrastructure — the open weights are the top of the funnel, the managed API is the monetization.

4. Grandfathering the May 2025 change preserved trust during a pricing migration

Repricing live workloads is one of the riskiest moves in usage-based billing. By explicitly grandfathering pre-May-2025 auto-recharge customers onto their old price, Jina migrated to a cleaner model without forcing a price increase on its most committed users — and said so publicly on the pricing page. That transparency is itself a retention asset.

Areas to improve : where the shared balance creates blind spots

1. DeepSearch needs guardrails surfaced before the spend, not after

A single DeepSearch query can consume ~500K tokens, and the cost is invisible until the balance drops. The budget_tokens parameter exists, but it is opt-in and buried in API docs. A concrete fix: a per-key default DeepSearch budget cap plus a dashboard estimate (“this configuration will cost up to ~X tokens per query”) shown before the first call — turning cost unpredictability into a visible, controllable number.

Package prices are public, but the live per-key balance, usage breakdown and top-up confirmation all sit behind login. For a product that prides itself on transparent pricing, exposing a read-only public usage estimator (tokens-per-product, projected monthly spend) would let prospects model their bill before committing — closing the gap between transparent list prices and real cost forecasting.

3. The opaque price history is a missed trust signal

Because the pre-2025 rate card is not publicly recoverable, customers cannot see how Jina’s pricing has evolved — which matters for teams betting a production stack on it. Publishing a simple changelog of pricing changes (even just “May 6, 2025: moved to shared-balance model”) would convert an opacity weakness into a pricing-stability signal, especially valuable now that Jina sits inside a public company (Elastic).

Monetization stack & signals : how Jina AI builds & buys its revenue engine

Buys 1 Builds 2

The read — where the monetization investment is going

A textbook self-serve spine: Jina builds the meter and prepaid auto-recharge first-party behind its own 'API Key & Billing' dashboard (not Metronome/Orb) and buys only the money movement from Stripe. Metering built, payments bought.

Stack — build vs buy

Builds in-house · 2

Per-key token meter In-house build Docs Jun 2026

“Token usage can be monitored in the 'API Key & Billing' tab by entering your API key, allowing you to view the recent usage history and remaining tokens.”
Auto top-up / prepaid recharge Billing Docs Jun 2026

“When your token balance drops below the set threshold, we will automatically recharge your saved payment method for the last purchased package, until the threshold is met.”

Buys (vendor) · 1

Stripe Payments Docs Jun 2026

“Payments are processed through Stripe, supporting a variety of payment methods including credit cards, Google Pay, and PayPal. An invoice will be issued to the email address associated with your Stripe account upon the purchase of tokens.”

Signals reviewed Jun 2026 · derived from product docs

Key takeaways

A single fungible meter can unify products with very different cost profiles — if the counting rules adapt. Jina charges one token unit across embeddings, reading, search and agents, but counts input tokens for some products, output tokens for others, and a flat floor for search. The lesson for other pricing teams: a shared value metric simplifies buying without forcing identical cost structures, as long as the metering rules reflect each product’s real compute.
A generous free tier and a license gate are not contradictory — they’re complementary. 10M free tokens drive trials; the CC-BY-NC restriction forces commercial conversion. Teams designing free tiers should separate “evaluation budget” from “production runway” deliberately rather than letting a free tier accidentally become a production subsidy.
Agent products break per-call cost intuition — budget controls must be first-class. DeepSearch’s ~500K-token queries show that once you sell an iterative agent, a single “request” can cost 50× a normal call. Pricing teams shipping agentic products need spend caps and budget parameters surfaced in the UI, not buried in docs.
Grandfathering is a cheap, high-trust way to migrate pricing models. Jina changed its entire pricing structure in May 2025 without alienating committed customers by freezing their old price. Other teams facing a pricing migration should treat explicit grandfathering as the default, not the exception.
Open weights plus a hosted token API is a durable developer-funnel. Publishing models open-weight while monetizing the managed API lets self-hosters and paying customers coexist in one funnel. The open models are marketing; the token-balance API is the business.

UBP implications

Normalizing multiple products to one consumption unit is a powerful packaging simplification — but it shifts complexity into the metering layer. Jina proves you can sell an entire stack on one token balance, but the engineering cost moves into per-product token-counting rules. For usage-based pricing strategy broadly, the question becomes whether a unified meter’s buying simplicity outweighs the metering complexity it creates.
Agentic products force a rethink of the request as a billing unit. When one DeepSearch call can consume 500K tokens, “per request” pricing collapses and token-budget pricing becomes mandatory. As more AI products ship iterative agents, value-metric design will increasingly need budget-bounded consumption units rather than flat per-call rates.
Acquisition by an incumbent is a live test of whether a clean usage model survives platform consolidation. Jina’s shared-balance API now sits inside Elastic. Whether it stays a standalone token-priced product or folds into Elastic’s platform billing will be a useful signal for how independent usage-based pricing models fare after enterprise acquisition.

Sources

Jina AI Embeddings API & pricing (accessed 2026-06-03)
Jina AI Reranker API (accessed 2026-06-03)
Jina AI Reader API (accessed 2026-06-03)
Jina AI DeepSearch API (accessed 2026-06-03)
Jina AI Classifier API (accessed 2026-06-03)
Jina AI API dashboard (login-gated billing) (accessed 2026-06-03)
Jina AI news & model release log (accessed 2026-06-03)
AIToolWorth — Jina AI pricing tiers — independent second source for token packages (accessed 2026-06-04)
CostBench — Jina embeddings pricing — independent second source for token packages (accessed 2026-06-04)
Browse the full pricing blueprint corpus for more usage-based pricing teardowns.

Bottom line

Jina AI runs one of the cleanest pure-usage models in retrieval tooling: a single fungible token balance that funds an entire stack — embeddings, reranking, URL-reading, web search, deep-search and classification — from one API key, with 10M free tokens to start and transparent $50/1B and $500/11B top-ups that never expire. The simplicity is real, but it hides sharp per-product cost differences: Reranker re-tokenizes full candidate sets, web search has a 10,000-token floor, and a single DeepSearch query can quietly burn ~500K tokens. The May 6, 2025 shift to this shared-balance model — explicitly grandfathered for existing customers — is the one confirmed pricing inflection, while the pre-2025 rate card stays opaque because the archives rendered blank. Now folded into Elastic, Jina’s standalone token-credit API is a live experiment in whether a clean usage model survives acquisition by a public incumbent.

Compare Jina AI’s shared-token model against other embeddings and retrieval vendors in the pricing blueprint — including the Cohere blueprint for a per-token-plus-private-deployment contrast and the Pinecone blueprint for a vector-database take on usage pricing.

Pricing timeline : Major events on a vertical axis

Each milestone below corresponds to a public pricing change, product launch, or material adjustment. Major events use a filled marker; minor adjustments use a faded one.

Current snapshot: shared token-credit balance across all Search Foundation APIs

Jun 2026

Jina AI prices Embeddings, Reranker, Reader, DeepSearch and Classifier from one shared API-key token balance. Every new key includes 10M free tokens (non-commercial, CC-BY-NC). Paid top-ups: $50 for 1B tokens ($0.050/1M) on a Standard key, $500 for 11B tokens ($0.045/1M) on a Premium key. Rate limits scale by tier (Free / Paid / Premium). Billing dashboard balance and top-up are login-gated and were not captured.

captured 2026-06-03

Acquired by Elastic (NYSE: ESTC)

Oct 2025

Elastic completes its acquisition of Jina AI; founder/CEO Han Xiao becomes Elastic's VP of AI. The standalone Jina Search Foundation API and its token-credit pricing continue to operate independently post-acquisition. No public repricing announced as of the June 2026 review.

New shared-balance pricing model introduced

May 2025

Jina introduces its current pricing model: one shared per-key token balance across all Search Foundation APIs with 10M free tokens per new key, $50/1B ($0.050/1M) Standard and $500/11B ($0.045/1M) Premium top-ups, and auto top-up on low balance. Customers who enabled auto-recharge before this date are grandfathered onto their prior price. First-party attested on the live pricing page ('We introduced a new pricing model on May 6th, 2025').

DeepSearch API (jina-deepsearch-v1) launched

Feb 2025

Jina launches DeepSearch (search.jina.ai live Feb 2; jina-deepsearch-v1 announced Feb 14, 2025): an iterative reason-search-read agent priced on total tokens consumed across the whole process — a single complex query can use ~500K tokens. Adds Free/Paid 50 RPM and Premium 500 RPM tiers to the same shared balance.

ReaderLM small language model

Sep 2024

Jina releases ReaderLM (later ReaderLM-v2, 1.5B params) for HTML-to-markdown/JSON conversion, drawing a 199-point Hacker News thread. ReaderLM-v2 conversion is metered at 3x normal token cost on the shared balance.

Reader API (r.jina.ai) launched

Apr 2024

Jina ships the Reader API: prepend r.jina.ai to any URL to convert a webpage to LLM-friendly markdown, and s.jina.ai for web search (SERP). Reader joins the shared token model; output-response tokens are counted, and search costs a fixed 10,000 tokens per request.

Open-source 8K-context embeddings + hosted Embeddings API

Oct 2023

Jina launches jina-embeddings-v2, marketed as the first open-source 8K-context text embedding rivaling OpenAI, alongside a hosted Embeddings API. The launch drew a 563-point / 201-comment Hacker News thread. Per-token rates from this era are NOT screenshot-verified (Wayback snapshots of jina.ai/embeddings archived as blank JS skeletons); third-party trackers later reported the v3-era rate at roughly $0.02 / 1M tokens with ~1M free tokens per key — recorded here as indicative, exact figure unknown.

$30M Series A (Canaan Partners)

Nov 2021

Jina AI raises a $30M Series A led by Canaan Partners (with GGV Capital, SAP.iO and others), bringing total funding to roughly $39M. Company shifts toward hosted neural-search and, later, a managed embeddings API.

Jina AI founded in Berlin

Feb 2020

Han Xiao and co-founders launch Jina AI as an open-source neural-search company. Initial product is the open-source Jina search framework; no hosted per-token API pricing yet.

Trivia

· Jina AI collapses its entire retrieval stack — embeddings, reranking, URL-reading, web search, deep-search and classification — onto ONE shared token balance per API key, so the buying decision is 'how many tokens' rather than 'which products'. Most retrieval vendors meter each capability separately.
· Jina AI was acquired by Elastic (NYSE: ESTC) in a deal completed October 9, 2025; founder/CEO Han Xiao became Elastic's VP of AI, yet the standalone Jina Search Foundation API and its token-credit pricing continue to operate independently.
· Jina's free tier hands every new API key 10 million tokens with no credit card — but they are restricted to non-commercial use under a CC-BY-NC license, an unusually explicit license-gate on a free API allotment.

Questions & answers

How much does Jina AI cost?: Jina AI prices all of its Search Foundation APIs from one shared token balance. Every new API key includes 10 million free tokens (non-commercial). Paid top-ups are $50 for 1 billion tokens ($0.050 per 1M) on a Standard key and $500 for 11 billion tokens ($0.045 per 1M) on a Premium key.
Does Jina AI have a free tier?: Yes. Every new API key comes with 10 million free tokens and no credit card is required, but those free tokens are restricted to non-commercial use under a CC-BY-NC license. Free-tier rate limits apply (for example 500 RPM on the Reader API).
How does Jina AI's token billing work across products?: All five products — Embeddings, Reranker, Reader, DeepSearch and Classifier — draw from one shared per-key token balance. Embeddings and Reranker count input tokens; Reader counts output-response tokens; web search charges a fixed cost starting at 10,000 tokens per request; DeepSearch counts every token across the whole reason-search-read process.
What changed in Jina AI's pricing on May 6, 2025?: Jina AI introduced its current pricing model on May 6, 2025. Customers who enabled auto-recharge before that date are grandfathered onto their original price; the newer pricing applies only if they modify their auto-recharge settings or purchase a new API key.
Did Elastic acquiring Jina AI change the pricing?: Elastic (NYSE: ESTC) completed its acquisition of Jina AI on October 9, 2025, and founder Han Xiao became Elastic's VP of AI. As of the June 2026 review the standalone Jina Search Foundation API and its shared token-credit pricing continue to operate independently of Elastic's own product billing.
How much does a DeepSearch query cost on Jina AI?: DeepSearch counts the total tokens used across the entire reason-search-read process. A simple query may use around 10,000 tokens, while a complex multi-hop query can consume roughly 500,000 tokens — all drawn from the same shared key balance, controllable via the budget_tokens parameter.