AI Summary
About
Jina AI is a Berlin-based search-foundation company that builds the embedding, reranking, reading and search models behind retrieval-augmented generation (RAG) and AI-agent systems. Its product line — the Search Foundation API — bundles five hosted endpoints: Embeddings (multimodal, multilingual jina-embeddings-v5), Reranker (jina-reranker-v3), Reader (r.jina.ai URL-to-markdown and s.jina.ai web search), DeepSearch (an iterative reason-search-read agent, jina-deepsearch-v1), and Classifier (zero-/few-shot text classification).
The company targets developers and AI teams building search, RAG and agentic applications, with native integrations into vector stores and LLMOps frameworks (Pinecone, Qdrant, Weaviate, Milvus, MongoDB, LlamaIndex, LangChain, Haystack and more). Models are also distributed open-weight on Hugging Face and deployable through AWS SageMaker, Microsoft Azure, and (soon) Google Cloud for teams that prefer to handle billing through their cloud provider.
Commercially, Jina AI positions itself as a single drop-in API for the whole retrieval stack rather than a per-product subscription: one API key carries one token balance that every endpoint draws from, so a team can mix embeddings, reranking, reading and search without managing separate plans or invoices.
Founded in Berlin in 2020 by Han Xiao and co-founders, Jina AI raised a $30M Series A led by Canaan Partners in November 2021 (roughly $39M total funding). On October 9, 2025, Elastic (NYSE: ESTC) completed its acquisition of Jina AI, with Han Xiao becoming Elastic’s VP of AI. As of this review the Jina Search Foundation API continues to operate as a standalone product with its own token-credit pricing — the analysis below covers that standalone API, not Elastic’s broader platform billing.
Pricing summary : How Jina’s shared token-credit balance works
Jina AI uses a pure usage model built on a single shared token-credit balance. Each API key holds a pool of prepaid tokens, and every product — Embeddings, Reranker, Reader, DeepSearch and Classifier — consumes from that same pool. There are no seats and no per-product subscriptions; you buy tokens once and spend them anywhere.
- Free allotment: every new API key includes 10 million free tokens, with no credit card required, restricted to non-commercial use (CC-BY-NC).
- Paid top-ups: $50 buys 1 billion tokens (0.050 / 1M) on a Standard key; $500 buys 11 billion tokens (0.045 / 1M) on a Premium key — the larger package is ~10% cheaper per token and unlocks higher rate limits.
- Rate-limit tiers: limits scale by key class — no-key, Free, Paid and Premium — measured in requests-per-minute (RPM) and tokens-per-minute (TPM), enforced whichever threshold trips first.
- Per-product token counting differs: Embeddings and Reranker count input tokens; Reader (
r.jina.ai) counts output-response tokens; web search (s.jina.ai) charges a fixed cost starting at 10,000 tokens per request; DeepSearch counts every token across the whole reason-search-read process; the Classifier counts input (and, for zero-shot, label) tokens.
What makes this different: most retrieval vendors price each capability (embeddings vs. rerank vs. search) on its own meter — Jina collapses the entire stack onto one fungible token balance, so the buying decision is “how many tokens” rather than “which products.”
Pricing by product
All five products draw from the same shared token balance and the same three rate-limit tiers (Free / Paid / Premium). The token packages are universal; what differs per product is how tokens are counted and the rate limits that apply.
Token packages (shared across all APIs)
| Tier | Price | Included | Key mechanics |
|---|---|---|---|
| Free (Toy Experiment) | $0 | 10M tokens | One-time per new key; non-commercial only (CC-BY-NC); no card required |
| Standard (Prototype) | $50 | 1B tokens | 0.050 / 1M tokens; Standard key; basic key management + technical support |
| Premium (Production) | $500 | 11B tokens | 0.045 / 1M tokens; Premium key; higher rate limits + priority support |
Embeddings API (api.jina.ai/v1/embeddings)
| Tier | Rate limit | Token counting | Key mechanics |
|---|---|---|---|
| Free | 100 RPM & 100K TPM | Count input tokens | jina-embeddings-v5 text + omni models |
| Paid | 500 RPM & 2M TPM | Count input tokens | Standard key |
| Premium | 5,000 RPM & 50M TPM | Count input tokens | Premium key |
Reranker API (api.jina.ai/v1/rerank)
| Tier | Rate limit | Token counting | Key mechanics |
|---|---|---|---|
| Free | 100 RPM & 100K TPM | Count input tokens | jina-reranker-v3 |
| Paid | 500 RPM & 2M TPM | Count input tokens | Standard key |
| Premium | 5,000 RPM & 50M TPM | Count input tokens | Premium key |
Reader API (r.jina.ai URL-to-markdown · s.jina.ai web search)
| Tier | Reader (r.jina.ai) | Search (s.jina.ai) | Token counting |
|---|---|---|---|
| Without key | 20 RPM | blocked | Reader: output-response tokens |
| Free | 500 RPM | 100 RPM | Search: fixed cost, starting at 10,000 tokens per request |
| Paid | 500 RPM | 100 RPM | ReaderLM-v2 conversion costs 3× tokens |
| Premium | 5,000 RPM | 1,000 RPM | — |
DeepSearch API (deepsearch.jina.ai/v1/chat/completions)
| Tier | Rate limit | Token counting | Key mechanics |
|---|---|---|---|
| Free | 50 RPM | Count total tokens in the whole process | jina-deepsearch-v1; ~10K tokens for a simple query, ~500K for a complex one |
| Paid | 50 RPM | Count total tokens in the whole process | budget_tokens / team_size parameters control depth and breadth |
| Premium | 500 RPM | Count total tokens in the whole process | Highest throughput |
Classifier API (api.jina.ai/v1/train · /v1/classify)
| Tier | Rate limit | Token counting | Key mechanics |
|---|---|---|---|
| Free | 25 RPM & 25K TPM | Train: input × num_iters · Few-shot: input · Zero-shot: input + label tokens | Train + zero/few-shot |
| Paid | 125 RPM & 500K TPM | (as above) | Standard key |
| Premium | 1,250 RPM & 12M TPM | (as above) | Premium key |
Sales motions across products: PLG / self-serve token top-ups for all individual and prototype usage (Free / Standard / Premium keys); sales-led for customized Kubernetes / VPC and on-premises deployments, and for cloud-marketplace (AWS SageMaker / Azure) billing through the customer’s CSP account.
Hidden costs : where DeepSearch and Reader silently drain the shared balance
The shared token balance is simple to reason about for embeddings — but the same balance funds products whose per-call token cost is wildly different. The two archetypes below show how a clean “$0.050/1M” headline rate translates into real monthly bills, and where the shared pool gets drained faster than teams expect.
Archetype A: A RAG startup indexing a knowledge base on Embeddings + Reranker
A small team embeds a 50M-token corpus once, re-embeds ~10M tokens/month of new content, and reranks 1M user queries/month (averaging ~2,000 input tokens per rerank call):
| Line item | Monthly cost |
|---|---|
| Initial corpus embedding: 50M tokens × $0.050/1M (one-time, amortized) | ~$2.50 |
| Incremental embedding: 10M tokens × $0.050/1M | $0.50 |
| Reranker: 1M queries × ~2,000 tokens = 2B tokens × $0.050/1M | $100.00 |
| Estimated total (steady state) | ~$101/month |
The embeddings are nearly free; the reranker is the cost center because it re-tokenizes the full candidate document set on every query. At scale, switching to the $0.045/1M Premium package (the $500/11B top-up) and trimming candidate-set size matters far more than optimizing the embedding step.
Archetype B: An agent product running DeepSearch on every user question
A research-assistant app routes every user question through DeepSearch at default depth — say 20,000 complex questions/month at ~500K tokens each:
| Line item | Monthly cost |
|---|---|
| DeepSearch: 20,000 queries × 500K tokens = 10B tokens × $0.045/1M | $450.00 |
| Plus Reader/search sub-calls already counted inside DeepSearch’s total | (included) |
| Estimated total | ~$450/month |
A single DeepSearch query at ~500K tokens costs ~$0.0225 — cheap per call, but because the agent counts every token across search, page-reads, reflection and synthesis, an app that runs it on every question can exhaust an 11B-token Premium package in a single month. The budget_tokens and reasoning_effort parameters are the real cost levers here, not the package price.
Hidden costs to watch:
- Reader output, not input, is metered —
r.jina.aicounts the output markdown tokens, so reading a long page costs more than the URL implies; theReaderLM-v2high-quality mode costs 3× tokens. - Web search is a flat 10,000-token floor — every
s.jina.airequest costs at least 10,000 tokens regardless of result size, so high-frequency search loops add up fast. - DeepSearch is unbounded by default — without setting
budget_tokens, a single complex query can silently consume ~500K tokens; multi-agentteam_size > 1multiplies consumption while sharing one budget. - Free tokens are non-commercial — the 10M free allotment is CC-BY-NC; any production use requires a paid top-up, so the “free tier” is an evaluation budget, not a production runway.
Want to estimate your own Jina AI bill? Use the Jina AI pricing calculator to model your monthly cost across embedding volume, rerank query count, Reader pages and DeepSearch depth.
Pricing evolution : from open-source neural search to a shared token balance
Cadence
| Quarter | Price changes | Product / SKU additions | Notes |
|---|---|---|---|
| 2023 Q4 | 0 | 1 | Open-source 8K-context embeddings + hosted Embeddings API; per-token rate from this era is not screenshot-verified |
| 2024 Q2 | 0 | 1 | Reader API (r.jina.ai) and web search (s.jina.ai) launched on the shared token model |
| 2024 Q3 | 0 | 1 | ReaderLM small language model released (3× token cost in high-quality mode) |
| 2025 Q1 | 0 | 1 | DeepSearch API (jina-deepsearch-v1) launched, priced on total tokens per reason-search-read process |
| 2025 Q2 | 1 | 0 | 2025-05-06 new shared-balance pricing model: 10M free tokens, $50/1B Standard, $500/11B Premium; pre-date auto-recharge grandfathered |
| 2025 Q4 | 0 | 0 | 2025-10-09 acquired by Elastic (NYSE: ESTC); standalone API pricing unchanged |
Tracked range: 2023 Q4–2026 Q2. Per-token rates and credit-package figures prior to the May 6, 2025 model change could not be screenshot-verified — Wayback snapshots of jina.ai/embeddings and jina.ai/api-dashboard archived as blank client-rendered skeletons, so historical price values are recorded as unknown rather than guessed. Quarters not listed were verified stable.
Notable changes
- 2023-10 — Hosted Embeddings API launched alongside open-source
jina-embeddings-v2; a 563-point Hacker News thread marked Jina’s arrival as a credible OpenAI embeddings alternative. - 2024-04 — Reader API launched, extending the token model from pure embeddings to URL-reading and web search, and establishing the “one key, many products” shared-balance pattern.
- 2025-02 — DeepSearch launched (
jina-deepsearch-v1), the first Jina product where a single call can consume hundreds of thousands of tokens — making token budgeting a first-class API concern. - 2025-05-06 — Current shared-balance pricing model introduced (first-party attested on the live pricing page). This is the single confirmed pricing inflection in the tracked range; older subscribers were grandfathered.
- 2025-10-09 — Elastic acquisition completed; pricing left intact.
The May 6, 2025 pricing change in detail
The May 6, 2025 change is the one pricing inflection Jina attests to directly: the live pricing surface carries the notice “We introduced a new pricing model on May 6th, 2025. If you enabled auto-recharge before this date, you’ll continue to pay the old price (the one when you purchased). The new pricing only applies if you modify your auto-recharge settings or purchase a new API key.” The grandfathering clause is a deliberate trust move — it protects existing auto-recharge customers from a silent price increase, and it means two customers on the same product today can be paying different per-token rates depending on when they first enabled auto-recharge. Because the pre-change rate card is not screenshot-verifiable (the archived pages rendered blank), the exact magnitude of the change is recorded as unknown; third-party trackers describe the earlier v3-era rate as roughly $0.02/1M with ~1M free tokens per key, which would make the current $0.050/1M / 10M-free structure a re-basing rather than a simple cut or hike.
What’s unique : one fungible token balance for an entire retrieval stack
1. One shared token balance across five products — not five meters. Most retrieval vendors price embeddings, reranking, and search on separate meters with separate invoices. Jina collapses the whole stack onto a single fungible per-key token balance, so the buying decision is “how many tokens” rather than “which products.” This is a genuinely different take on usage-based pricing models: the value metric is normalized to one unit (tokens) even though the underlying products do very different work.
2. Per-product token counting differs even though the unit is shared. The clever part is that the same token unit means different things per product: Embeddings and Reranker count input tokens, Reader counts output tokens, web search charges a flat 10,000-token floor, and DeepSearch counts every token across an entire agent loop. This lets Jina price radically different cost profiles on one balance — a careful exercise in choosing the right usage metric where the headline meter stays constant but the per-call cost reflects real compute.
3. Tokens never expire and the free tier is generous but license-gated. 10M free tokens per key with no credit card is unusually generous for an embeddings API, but the CC-BY-NC restriction means it is explicitly an evaluation budget, not a production runway. This is a sharp free-tier design decision: maximize trial conversion while legally forcing commercial users to pay.
4. Grandfathered pricing as a trust mechanic. The May 6, 2025 change preserved old prices for existing auto-recharge customers — a deliberate signal that Jina would not silently reprice active workloads. Few usage-based billing vendors document grandfathering this explicitly on the live pricing page.
5. Cloud-marketplace billing as a parallel rail. Beyond the self-serve token top-up, Jina lets AWS SageMaker / Azure customers deploy models and pay through their own CSP account — a second pricing rail that bypasses the token balance entirely for enterprises that prefer consolidated cloud billing.
Strengths & weaknesses
| Strengths | Weaknesses |
|---|---|
| One shared token balance across five products is radically simpler than per-product meters | A single DeepSearch query can silently consume ~500K tokens, making spend hard to predict without budget_tokens discipline |
| 10M free tokens with no credit card is a generous, frictionless evaluation budget | Free tokens are non-commercial (CC-BY-NC) — the “free tier” cannot back a production app |
| Transparent public package prices ($50/1B, $500/11B) with tokens that never expire | The actual per-key balance, usage view and top-up confirmation are login-gated and not publicly inspectable |
| Grandfathered pricing on the May 2025 change protects existing customers from silent hikes | Historical rate card is not publicly recoverable — archived pages rendered blank, so price history is opaque |
| Premium $500/11B package is ~10% cheaper per token and unlocks higher rate limits | Reranker re-tokenizes full candidate sets, so rerank cost dominates RAG bills at query scale |
| Cloud-marketplace (AWS/Azure) and VPC deployment give enterprises a non-token billing rail | Post-Elastic-acquisition roadmap for standalone API pricing is uncertain |
Billing UX : prepaid token controls and auto top-up
Jina exposes its billing controls in the per-key API Key & Billing dashboard (login-gated) and in the public top-up flow on each product page. Named controls observed on the live pricing surfaces:
- Available tokens — a live per-key balance counter shown beside the API key on every product playground.
- Top up this API key with more tokens — recharge an existing key with the $50 (1B) or $500 (11B) package rather than creating a new key.
- Auto top-up on low token balance — when the balance drops below a set threshold (e.g.
< 1M tokens), Jina automatically recharges the saved payment method for the last-purchased package. Pre-May-6-2025 auto-recharge subscribers are grandfathered onto their original price until they change settings or buy a new key. - Two Ways to Purchase — a toggle between (a) the Jina Search Foundation API top-up and (b) deployment through a cloud service provider (AWS / Azure), where billing is handled in the CSP account.
- Rate Limit panel — a per-product RPM/TPM reference (no-key / Free / Paid / Premium) reachable from every playground, used to size which key class a workload needs.
- Usage view — a per-key usage breakdown in the dashboard (login-gated).
Currency note: depending on location, charges may be in USD, EUR or other currencies, and taxes may apply; figures above are the public USD package prices. The dashboard token balance and top-up confirmation are login-gated and were not captured.
Strategic wins : where Jina’s token-balance model paid off
1. Collapsing five products onto one token balance removed packaging friction entirely
By making one API key carry one token balance for embeddings, reranking, reading, search, deep-search and classification, Jina eliminated the most common adoption blocker in retrieval tooling: deciding which products to buy and how to budget each. A developer who buys 1B tokens can experiment across the whole stack without new contracts or invoices. This is a textbook application of usage-based pricing as a growth lever — the buying decision shrinks to a single number, which lowers the activation energy for trying additional products.
2. A 10M-token, no-card free tier maximized top-of-funnel while CC-BY-NC protected revenue
Jina’s free allotment is generous enough to run real evaluation workloads, which feeds developer adoption and the kind of organic community discussion that put its embeddings on a 563-point Hacker News thread. But the CC-BY-NC license gate means commercial users must convert to paid — so the free tier drives trials without cannibalizing production revenue. The design separates “prove it works” from “run it in production” cleanly.
3. Open-weight distribution turned the free models into a paid-API funnel
By publishing models open-weight on Hugging Face while selling the hosted API, Jina built a two-sided funnel: self-hosters validate the models for free and become advocates, while teams that don’t want to run GPUs convert to the token-balance API. This open-core distribution mirrors the land-and-expand patterns seen across developer-first AI infrastructure — the open weights are the top of the funnel, the managed API is the monetization.
4. Grandfathering the May 2025 change preserved trust during a pricing migration
Repricing live workloads is one of the riskiest moves in usage-based billing. By explicitly grandfathering pre-May-2025 auto-recharge customers onto their old price, Jina migrated to a cleaner model without forcing a price increase on its most committed users — and said so publicly on the pricing page. That transparency is itself a retention asset.
Areas to improve : where the shared balance creates blind spots
1. DeepSearch needs guardrails surfaced before the spend, not after
A single DeepSearch query can consume ~500K tokens, and the cost is invisible until the balance drops. The budget_tokens parameter exists, but it is opt-in and buried in API docs. A concrete fix: a per-key default DeepSearch budget cap plus a dashboard estimate (“this configuration will cost up to ~X tokens per query”) shown before the first call — turning cost unpredictability into a visible, controllable number.
2. The login-gated balance undercuts the otherwise public price transparency
Package prices are public, but the live per-key balance, usage breakdown and top-up confirmation all sit behind login. For a product that prides itself on transparent pricing, exposing a read-only public usage estimator (tokens-per-product, projected monthly spend) would let prospects model their bill before committing — closing the gap between transparent list prices and real cost forecasting.
3. The opaque price history is a missed trust signal
Because the pre-2025 rate card is not publicly recoverable, customers cannot see how Jina’s pricing has evolved — which matters for teams betting a production stack on it. Publishing a simple changelog of pricing changes (even just “May 6, 2025: moved to shared-balance model”) would convert an opacity weakness into a pricing-stability signal, especially valuable now that Jina sits inside a public company (Elastic).
Key takeaways
-
A single fungible meter can unify products with very different cost profiles — if the counting rules adapt. Jina charges one token unit across embeddings, reading, search and agents, but counts input tokens for some products, output tokens for others, and a flat floor for search. The lesson for other pricing teams: a shared value metric simplifies buying without forcing identical cost structures, as long as the metering rules reflect each product’s real compute.
-
A generous free tier and a license gate are not contradictory — they’re complementary. 10M free tokens drive trials; the CC-BY-NC restriction forces commercial conversion. Teams designing free tiers should separate “evaluation budget” from “production runway” deliberately rather than letting a free tier accidentally become a production subsidy.
-
Agent products break per-call cost intuition — budget controls must be first-class. DeepSearch’s ~500K-token queries show that once you sell an iterative agent, a single “request” can cost 50× a normal call. Pricing teams shipping agentic products need spend caps and budget parameters surfaced in the UI, not buried in docs.
-
Grandfathering is a cheap, high-trust way to migrate pricing models. Jina changed its entire pricing structure in May 2025 without alienating committed customers by freezing their old price. Other teams facing a pricing migration should treat explicit grandfathering as the default, not the exception.
-
Open weights plus a hosted token API is a durable developer-funnel. Publishing models open-weight while monetizing the managed API lets self-hosters and paying customers coexist in one funnel. The open models are marketing; the token-balance API is the business.
UBP implications
-
Normalizing multiple products to one consumption unit is a powerful packaging simplification — but it shifts complexity into the metering layer. Jina proves you can sell an entire stack on one token balance, but the engineering cost moves into per-product token-counting rules. For usage-based pricing strategy broadly, the question becomes whether a unified meter’s buying simplicity outweighs the metering complexity it creates.
-
Agentic products force a rethink of the request as a billing unit. When one DeepSearch call can consume 500K tokens, “per request” pricing collapses and token-budget pricing becomes mandatory. As more AI products ship iterative agents, value-metric design will increasingly need budget-bounded consumption units rather than flat per-call rates.
-
Acquisition by an incumbent is a live test of whether a clean usage model survives platform consolidation. Jina’s shared-balance API now sits inside Elastic. Whether it stays a standalone token-priced product or folds into Elastic’s platform billing will be a useful signal for how independent usage-based pricing models fare after enterprise acquisition.
Sources
- Jina AI Embeddings API & pricing (accessed 2026-06-03)
- Jina AI Reranker API (accessed 2026-06-03)
- Jina AI Reader API (accessed 2026-06-03)
- Jina AI DeepSearch API (accessed 2026-06-03)
- Jina AI Classifier API (accessed 2026-06-03)
- Jina AI API dashboard (login-gated billing) (accessed 2026-06-03)
- Jina AI news & model release log (accessed 2026-06-03)
- AIToolWorth — Jina AI pricing tiers — independent second source for token packages (accessed 2026-06-04)
- CostBench — Jina embeddings pricing — independent second source for token packages (accessed 2026-06-04)
- Browse the full pricing blueprint corpus for more usage-based pricing teardowns.
Bottom line
Jina AI runs one of the cleanest pure-usage models in retrieval tooling: a single fungible token balance that funds an entire stack — embeddings, reranking, URL-reading, web search, deep-search and classification — from one API key, with 10M free tokens to start and transparent $50/1B and $500/11B top-ups that never expire. The simplicity is real, but it hides sharp per-product cost differences: Reranker re-tokenizes full candidate sets, web search has a 10,000-token floor, and a single DeepSearch query can quietly burn ~500K tokens. The May 6, 2025 shift to this shared-balance model — explicitly grandfathered for existing customers — is the one confirmed pricing inflection, while the pre-2025 rate card stays opaque because the archives rendered blank. Now folded into Elastic, Jina’s standalone token-credit API is a live experiment in whether a clean usage model survives acquisition by a public incumbent.
Compare Jina AI’s shared-token model against other embeddings and retrieval vendors in the pricing blueprint — including the Cohere blueprint for a per-token-plus-private-deployment contrast and the Pinecone blueprint for a vector-database take on usage pricing.
Pricing timeline : Major events on a vertical axis
Each milestone below corresponds to a public pricing change, product launch, or material adjustment. Major events use a filled marker; minor adjustments use a faded one.
Current snapshot: shared token-credit balance across all Search Foundation APIs
Jina AI prices Embeddings, Reranker, Reader, DeepSearch and Classifier from one shared API-key token balance. Every new key includes 10M free tokens (non-commercial, CC-BY-NC). Paid top-ups: $50 for 1B tokens ($0.050/1M) on a Standard key, $500 for 11B tokens ($0.045/1M) on a Premium key. Rate limits scale by tier (Free / Paid / Premium). Billing dashboard balance and top-up are login-gated and were not captured.
Acquired by Elastic (NYSE: ESTC)
Elastic completes its acquisition of Jina AI; founder/CEO Han Xiao becomes Elastic's VP of AI. The standalone Jina Search Foundation API and its token-credit pricing continue to operate independently post-acquisition. No public repricing announced as of the June 2026 review.
New shared-balance pricing model introduced
Jina introduces its current pricing model: one shared per-key token balance across all Search Foundation APIs with 10M free tokens per new key, $50/1B ($0.050/1M) Standard and $500/11B ($0.045/1M) Premium top-ups, and auto top-up on low balance. Customers who enabled auto-recharge before this date are grandfathered onto their prior price. First-party attested on the live pricing page ('We introduced a new pricing model on May 6th, 2025').
DeepSearch API (jina-deepsearch-v1) launched
Jina launches DeepSearch (search.jina.ai live Feb 2; jina-deepsearch-v1 announced Feb 14, 2025): an iterative reason-search-read agent priced on total tokens consumed across the whole process — a single complex query can use ~500K tokens. Adds Free/Paid 50 RPM and Premium 500 RPM tiers to the same shared balance.
ReaderLM small language model
Jina releases ReaderLM (later ReaderLM-v2, 1.5B params) for HTML-to-markdown/JSON conversion, drawing a 199-point Hacker News thread. ReaderLM-v2 conversion is metered at 3x normal token cost on the shared balance.
Reader API (r.jina.ai) launched
Jina ships the Reader API: prepend r.jina.ai to any URL to convert a webpage to LLM-friendly markdown, and s.jina.ai for web search (SERP). Reader joins the shared token model; output-response tokens are counted, and search costs a fixed 10,000 tokens per request.
Open-source 8K-context embeddings + hosted Embeddings API
Jina launches jina-embeddings-v2, marketed as the first open-source 8K-context text embedding rivaling OpenAI, alongside a hosted Embeddings API. The launch drew a 563-point / 201-comment Hacker News thread. Per-token rates from this era are NOT screenshot-verified (Wayback snapshots of jina.ai/embeddings archived as blank JS skeletons); third-party trackers later reported the v3-era rate at roughly $0.02 / 1M tokens with ~1M free tokens per key — recorded here as indicative, exact figure unknown.
$30M Series A (Canaan Partners)
Jina AI raises a $30M Series A led by Canaan Partners (with GGV Capital, SAP.iO and others), bringing total funding to roughly $39M. Company shifts toward hosted neural-search and, later, a managed embeddings API.
Jina AI founded in Berlin
Han Xiao and co-founders launch Jina AI as an open-source neural-search company. Initial product is the open-source Jina search framework; no hosted per-token API pricing yet.
- · Jina AI collapses its entire retrieval stack — embeddings, reranking, URL-reading, web search, deep-search and classification — onto ONE shared token balance per API key, so the buying decision is 'how many tokens' rather than 'which products'. Most retrieval vendors meter each capability separately.
- · Jina AI was acquired by Elastic (NYSE: ESTC) in a deal completed October 9, 2025; founder/CEO Han Xiao became Elastic's VP of AI, yet the standalone Jina Search Foundation API and its token-credit pricing continue to operate independently.
- · Jina's free tier hands every new API key 10 million tokens with no credit card — but they are restricted to non-commercial use under a CC-BY-NC license, an unusually explicit license-gate on a free API allotment.
Questions & answers
- How much does Jina AI cost?
- Jina AI prices all of its Search Foundation APIs from one shared token balance. Every new API key includes 10 million free tokens (non-commercial). Paid top-ups are $50 for 1 billion tokens ($0.050 per 1M) on a Standard key and $500 for 11 billion tokens ($0.045 per 1M) on a Premium key.
- Does Jina AI have a free tier?
- Yes. Every new API key comes with 10 million free tokens and no credit card is required, but those free tokens are restricted to non-commercial use under a CC-BY-NC license. Free-tier rate limits apply (for example 500 RPM on the Reader API).
- How does Jina AI's token billing work across products?
- All five products — Embeddings, Reranker, Reader, DeepSearch and Classifier — draw from one shared per-key token balance. Embeddings and Reranker count input tokens; Reader counts output-response tokens; web search charges a fixed cost starting at 10,000 tokens per request; DeepSearch counts every token across the whole reason-search-read process.
- What changed in Jina AI's pricing on May 6, 2025?
- Jina AI introduced its current pricing model on May 6, 2025. Customers who enabled auto-recharge before that date are grandfathered onto their original price; the newer pricing applies only if they modify their auto-recharge settings or purchase a new API key.
- Did Elastic acquiring Jina AI change the pricing?
- Elastic (NYSE: ESTC) completed its acquisition of Jina AI on October 9, 2025, and founder Han Xiao became Elastic's VP of AI. As of the June 2026 review the standalone Jina Search Foundation API and its shared token-credit pricing continue to operate independently of Elastic's own product billing.
- How much does a DeepSearch query cost on Jina AI?
- DeepSearch counts the total tokens used across the entire reason-search-read process. A simple query may use around 10,000 tokens, while a complex multi-hop query can consume roughly 500,000 tokens — all drawn from the same shared key balance, controllable via the budget_tokens parameter.