Anyscale Pricing

AI Summary

Anyscale runs a pure-usage compute model denominated in Anyscale Credits (ACU), where 1 ACU = $1 USD on the published rate card; CPU-only at $0.0135/hr scales up to NVIDIA H200 at $10.6812/hr, with T4 ($0.5682), L4 ($0.9542), A10G ($1.3635), A100 ($4.9591), and H100 ($9.2880) in between.
New users receive $100 in free ACU credits to evaluate the platform; template projects (Ray Train, Ray Serve, Ray Data starter projects) cost $3–$5 in credits each, making the trial generous enough for non-trivial proof-of-value.
Two deployment modes: Anyscale-hosted (Anyscale-managed VPC, credit-card billing, business-hours support, 5 case submissions) and BYOC bring-your-own-cloud (your VPC, invoice or cloud marketplace billing, 24x7 enterprise SLAs, unlimited cases) — pricing structure differs materially between the two.
RayTurbo is Anyscale's proprietary runtime extension to open-source Ray, claiming 4.5× faster LLM inference, 50% lower training cost, and 90% faster autoscale — the marketed justification for the ACU markup over raw cloud compute.
Anyscale Endpoints (LLM inference at $1/1M tokens) launched August 2023 and was sunset January 14, 2025; the company refocused on enterprise platform sales, RayTurbo, and BYOC deployment for committed contracts.
Founded 2019 by the original Ray authors (Nishihara, Moritz, Stoica); raised a Series C at ~$1B valuation in 2021 (Andreessen Horowitz, Addition) with subsequent extension rounds maintaining the $1B+ valuation through 2025.

Pricing summary

Anyscale 2026 — ACU per-hour compute + RayTurbo runtime

Hosted (Anyscale-managed VPC) + BYOC (your cloud); $100 free trial credit; annual commit discounts

Plan	Price	Best for
Developer	$100 free credit	Individual developers, proof-of-value
Hosted	From $0.0135/hr (CPU)	Teams running on Anyscale-managed infra
Enterprise BYOC (annual commit)	Custom ACU + cloud	Large orgs with cloud commits and SLAs
GPU per-hour (ACU)	From $0.5682/hr (T4)	Hosted GPU compute on Anyscale-managed infra
RayTurbo runtime	Included in ACU	Proprietary runtime over open-source Ray

1 ACU = $1 USD on the rate card. Hosted prices include underlying cloud compute. BYOC prices are platform-only, and customers pay for cloud compute separately. Customer-reported savings: Handshake/Canva 50%, Attentive 99% on specific workloads.

About

Anyscale is a San Francisco-based AI infrastructure company founded in December 2019 by Robert Nishihara, Philipp Moritz, and Ion Stoica — three of the original UC Berkeley RISELab authors of Ray, the open-source distributed computing framework that underpins modern Python-based AI training and inference at scale. The product is a managed Ray platform: customers write standard Ray code (Ray Train, Ray Serve, Ray Data, RLlib) and Anyscale provisions the underlying cluster, applies its RayTurbo runtime optimizations, and exposes per-hour billing through an Anyscale Compute Unit (ACU) denominated 1:1 with USD.

By 2026 Anyscale runs production workloads for enterprises including Handshake, Canva, Attentive, Pinterest, Cohere, and Roblox — covering use cases that span large-scale LLM training, batch inference, RAG indexing, recommendation systems, and time-series feature engineering. The company raised a $99M Series C in late 2021 led by Addition with Andreessen Horowitz participation at ~$1B post-money valuation, with subsequent extension rounds maintaining the $1B+ valuation through 2025.

Anyscale competes with hyperscaler ML platforms (AWS SageMaker, Vertex AI, Azure ML), distributed-training specialists (Modal, Together AI for inference, MosaicML pre-Databricks-acquisition), and the increasingly viable “build it yourself on Kubernetes + open-source Ray” path. Anyscale’s differentiation is the combination of being both maintainer and commercializer of Ray, the RayTurbo runtime that claims meaningful performance gains over raw open-source Ray, and a dual hosted+BYOC delivery model that lets enterprises pick managed-VPC simplicity or own-VPC control without changing tooling.

Pricing summary: How Anyscale’s ACU per-hour + commit + BYOC stack works

Anyscale runs a pure-usage compute model denominated in Anyscale Credits (ACU), where 1 ACU = $1 USD on the published rate card. Hosted customers pay an all-in ACU rate per hour that includes underlying cloud compute (AWS or GCP), the RayTurbo runtime, the Anyscale control plane, and standard support. BYOC customers pay a lower platform-only ACU rate plus their own cloud compute bill directly to AWS, GCP, or Azure — using existing GPU reservations and committed-spend agreements to materially reduce total cost.

The free tier is structured as a $100 ACU credit grant on signup, which covers 20+ template-project runs (Ray Train, Serve, Data starters at $3–$5 each). Anyscale publishes no self-serve volume-discount tier; instead, it routes mid-market and enterprise customers into annual committed-contract deals that deliver discounts proportional to commit size, with overages billed at standard ACU rates. This hybrid of pure usage and commitment is increasingly the canonical AI infrastructure pricing pattern.
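
As a rough check on the trial-credit arithmetic above, the following Python sketch computes how many template runs the $100 grant covers at each end of the $3–$5 range. The helper name `runs_covered` is illustrative only and is not part of any Anyscale tooling.

```python
TRIAL_CREDIT_USD = 100  # signup grant; 1 ACU = $1 USD

def runs_covered(credit_usd: float, cost_per_run_usd: float) -> int:
    """Whole template runs that fit inside the credit (hypothetical helper)."""
    return int(credit_usd // cost_per_run_usd)

worst_case = runs_covered(TRIAL_CREDIT_USD, 5)  # every run costs the $5 upper bound
best_case = runs_covered(TRIAL_CREDIT_USD, 3)   # every run costs the $3 lower bound
print(f"Trial covers {worst_case} to {best_case} template runs")
```

The lower bound is what supports the "20+" figure; a mix of cheaper templates stretches the credit further.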

What makes this different: Anyscale’s three founders are also the original Ray authors and core maintainers. This makes the implicit value proposition (“we run Ray better than anyone else can self-host it”) credible in a way that few open-source-commercialization stories are. The RayTurbo runtime quantifies that claim (4.5× inference, 50% training cost reduction, 90% faster autoscale), turning marketing positioning into a measurable justification for the ACU markup.

Pricing by product

Anyscale Compute Units (ACU) — hosted per-hour rates

Instance	VRAM / type	ACU per hour	Notes
CPU-only	n/a	$0.0135	Head node, lightweight orchestration
NVIDIA T4	16 GiB	$0.5682	Low-cost inference, embeddings
NVIDIA L4	24 GiB	$0.9542	Mid-range inference, video
NVIDIA A10G	24 GiB	$1.3635	7B–13B inference, image generation
NVIDIA A100	80 GiB	$4.9591	30B–70B training and inference
NVIDIA H100	80 GiB	$9.2880	Frontier training, low-latency inference
NVIDIA H200	141 GiB	$10.6812	Largest open-model training, batch inference
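
To turn the hourly rate card into a monthly estimate, a minimal Python sketch follows. It assumes a fixed daily schedule and a 30-day month; the function name `monthly_cost` and its parameters are illustrative, and only the rates come from the table above.

```python
# Hosted ACU rates per hour, copied from the rate card above (1 ACU = $1 USD).
ACU_PER_HOUR = {
    "CPU": 0.0135, "T4": 0.5682, "L4": 0.9542, "A10G": 1.3635,
    "A100": 4.9591, "H100": 9.2880, "H200": 10.6812,
}

def monthly_cost(instance: str, nodes: int, hours_per_day: float, days: int = 30) -> float:
    """Estimated hosted cost in USD for a fixed daily schedule (assumed model)."""
    return ACU_PER_HOUR[instance] * nodes * hours_per_day * days

# The small dev cluster used as an example in this article: 1 A10G, 4 hours/day.
dev_cluster = monthly_cost("A10G", nodes=1, hours_per_day=4)
print(f"1x A10G, 4 h/day: ${dev_cluster:.2f}/month")
```

The estimate ignores the CPU head node and any storage or data-transfer charges, which are billed separately.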

Tier and deployment matrix

Feature	Developer ($100 trial)	Hosted	Enterprise BYOC
ACU rate	Published	Published	Lower (platform-only)
Regions	Limited	Limited	Any cloud / region / on-prem
Infrastructure	Anyscale-managed	Anyscale-managed	Your VPC
GPU sourcing	Anyscale cloud-list	Anyscale cloud-list	Your committed-spend agreements
Billing	Credit-card	Monthly credit-card	Invoice or cloud marketplace
Support	Business hours, 5 cases	Business hours, 5 cases	24x7 enterprise SLAs, unlimited
Free credit	$100 ACU	None	None
Annual commit discounts	n/a	Yes	Yes (larger)

Customer-reported workload savings

Customer	Reported savings	Workload type
Handshake	~50%	Distributed training + recommendations
Canva	~50%	Inference and batch processing
Attentive	~99% on specific workload	Batch ML pipeline migration

Sales motions across products: PLG / self-serve for Developer trial and Hosted credit-card customers; sales-led for Enterprise BYOC and annual committed contracts. All prices accessed 2026-05-29 from anyscale.com/pricing.

Hidden costs: What Anyscale customers actually pay beyond the ACU rate card

Archetype A: ML startup running RayTrain training jobs on hosted Anyscale

A growth-stage AI startup running distributed Ray Train jobs on A10G clusters for fine-tuning, with traffic concentrated in weekly training runs:

Line item	Monthly cost
A10G cluster (4 nodes × 8h × 4 runs/mo × $1.3635)	$174
Head node CPU ($0.0135 × 80 hours active)	$1.08
Storage for model checkpoints (S3, billed separately)	~$25
Cross-region data transfer (negligible at this scale)	<$5
Estimated total	~$205/month

The $100 trial credit covers a little over half of the first month’s ACU spend for this workload (about $176 of GPU and head-node hours). After the trial, real costs scale linearly with cluster hours. The customer also learns that warm-pool autoscale (RayTurbo’s claimed 90% improvement) lets the cluster downsize faster between jobs than a hand-rolled Ray cluster would, materially reducing idle billing.
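
The Archetype A line items can be reproduced with a short Python sketch. The rates and hours come from the table above; the storage and transfer figures are the table's own approximations (the transfer value is its upper bound), and the variable names are illustrative.

```python
A10G_RATE = 1.3635  # hosted ACU per hour
CPU_RATE = 0.0135   # hosted ACU per hour (head node)

gpu_cost = 4 * 8 * 4 * A10G_RATE   # 4 nodes x 8 h x 4 runs per month
head_cost = 80 * CPU_RATE          # 80 active head-node hours
acu_total = gpu_cost + head_cost   # portion billed in ACU

storage_cost = 25.0    # approximate S3 checkpoint storage, billed separately
transfer_cost = 5.0    # upper bound on cross-region transfer

monthly_total = acu_total + storage_cost + transfer_cost
trial_share = 100 / acu_total      # fraction of ACU spend offset by the $100 credit

print(f"ACU spend: ${acu_total:.2f}, all-in: ${monthly_total:.2f}")
print(f"Trial credit offsets {trial_share:.0%} of first-month ACU spend")
```

Separating the ACU portion from the cloud-side charges matters because the trial credit applies only to ACU consumption.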

Archetype B: Mid-market enterprise on BYOC with annual commit

A 200-person enterprise running production Ray Serve inference on BYOC inside their AWS Reserved Instance pool, with a $250K annual commit:

Line item	Monthly cost
Anyscale ACU (platform-only on BYOC) — committed	$20,833
AWS compute (their RI pool, billed direct)	~$45,000
Anyscale overage above commit (occasional spike)	$0–$3,000
24x7 enterprise SLA support	Included in commit
Estimated total	~$66,000–$69,000/month

The platform-only ACU rate on BYOC is materially lower than the all-in hosted rate — Anyscale doesn’t publish the BYOC discount schedule, but the trade-off is real: customers buy a $250K annual platform commit and use it on their own RI pool. For organizations with existing AWS Enterprise Discount Programs (EDP), this can deliver the 50% infrastructure savings that Handshake and Canva publicly cite.
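
The Archetype B range follows from spreading the annual commit evenly across twelve months and adding the cloud bill and any overage. The sketch below assumes that even spread (actual invoicing terms are contract-specific) and uses illustrative names.

```python
ANNUAL_COMMIT_USD = 250_000
AWS_COMPUTE_USD = 45_000          # approximate monthly RI-pool spend, billed by AWS

monthly_commit = ANNUAL_COMMIT_USD / 12  # assumes the commit is drawn down evenly

def monthly_total(overage_usd: float) -> float:
    """All-in monthly cost: platform commit + cloud compute + overage."""
    return monthly_commit + AWS_COMPUTE_USD + overage_usd

low = monthly_total(0)        # no spike above the commit
high = monthly_total(3_000)   # occasional spike at the top of the stated range
print(f"Monthly commit: ${monthly_commit:,.0f}")
print(f"Estimated total: ${low:,.0f} to ${high:,.0f}")
```

Roughly two thirds of the total goes to the cloud provider rather than to Anyscale, which is why existing reservations and discount programs dominate the BYOC cost picture.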

Want to estimate your own Anyscale bill? Use the Anyscale pricing calculator to model ACU consumption by instance type, weekly active hours, and BYOC versus hosted deployment.

Pricing evolution: Anyscale’s pricing history from managed-Ray to RayTurbo-led enterprise

Cadence

Quarter	Price changes	Product / SKU additions	Notes
2019 Q4	0	1	Anyscale founded; initial managed-Ray product
2021 Q4	0	0	Series C ($99M) at ~$1B; funded Endpoints + BYOC
2023 Q3	0	1	Anyscale Endpoints launched at $1/1M tokens
2024 Q1	0	1	RayTurbo runtime launched as proprietary differentiator
2024 Q3	1	0	ACU rate card published (CPU → H100)
2025 Q1	0	-1	Anyscale Endpoints sunset January 14, 2025
2025 Q2	0	1	H200 GPU availability added at $10.6812/hr
2025 Q4	0	1	AWS + GCP Marketplace billing for BYOC contracts

Tracked range: 2019 Q4–2025 Q4. Quarters not listed above were verified stable (0 price changes, 0 SKU additions).

Notable changes

2023-08-22 — Anyscale Endpoints launched at $1 per million tokens, briefly entering the per-token LLM inference market alongside Together AI and Fireworks.
2024-03-19 — RayTurbo runtime launched, quantifying performance gains (4.5× inference, 50% training cost, 90% autoscale) to justify the ACU markup.
2024-09-10 — Full ACU rate card published across CPU through H100; transparency play to enable self-serve mid-market evaluation.
2025-01-14 — Anyscale Endpoints sunset; strategic refocus onto enterprise platform and BYOC.
2025-11-19 — AWS Marketplace and GCP Marketplace billing added for BYOC, letting enterprise customers consume Anyscale spend through existing cloud-provider commit agreements.

The 2025 Endpoints sunset in detail

When Anyscale Endpoints launched in August 2023, the per-token LLM inference market was an open contest among Together AI, Fireworks AI, Anyscale, and a long tail of smaller providers. The product priced at $1 per million tokens for Llama 2 7B — competitive but not dramatically cheaper than peers, with the differentiating pitch being “the same Ray Serve infrastructure that runs production inference at Pinterest now runs your LLM.”

By late 2024, the inference market had bifurcated: first-party model providers (DeepSeek, Mistral, Cohere) shipped competitive per-token rates and OpenAI-compatible APIs, while specialized inference clouds (Together AI, Fireworks, Baseten) invested heavily in latency optimization and model-specific quantization. Anyscale’s product was neither the cheapest nor the fastest — and its Endpoints customers were mostly evaluation traffic rather than production workloads. The January 2025 sunset reflected a strategic decision to refocus on the larger and stickier enterprise-platform market where Anyscale’s Ray credibility creates more durable competitive advantage than per-token inference middleware.

What’s unique: Anyscale’s distinctive pricing mechanics

1. A 1 ACU = $1 USD denomination removes credit-translation cognitive load. Some usage-based platforms bill in non-dollar units whose dollar value varies (Snowflake credits, for example, carry different USD rates per edition). Anyscale’s 1:1 denomination means a customer reading “this notebook costs 5 ACU” can immediately read $5. This transparency reduces conversion friction for self-serve developers and simplifies finance-team review of enterprise bills.

2. RayTurbo as the quantified markup justification. When customers question why hosted Anyscale costs more per A100 hour than raw cloud A100 list, the answer is RayTurbo’s claimed 4.5× inference / 50% training cost / 90% autoscale gains — measurable performance multipliers that, if accurate, make the all-in ACU rate net-cheaper than self-hosted Ray. Few infrastructure commercializations quantify their value-add this explicitly.

3. Two-mode delivery (Hosted vs BYOC) at materially different prices. Hosted ACU includes cloud compute; BYOC ACU is platform-only and the customer pays cloud directly. This dual-mode pricing lets customers with existing AWS Enterprise Discount Programs or GPU reservations capture those discounts without giving them back to a hosted provider — and Anyscale captures the platform-revenue regardless. The pricing architecture accommodates both managed-simplicity buyers and committed-spend optimizers.

4. Endpoints sunset signaled strategic discipline rather than weakness. When competitors held inference middleware as a defensive must-have product, Anyscale’s January 2025 decision to sunset Endpoints freed engineering and GTM capacity to focus on enterprise platform sales — where Ray creator-credibility creates a defensible moat. Many infrastructure companies refuse to kill underperforming products; Anyscale’s willingness to do so improved the focus of the remaining roadmap.

5. Cloud Marketplace billing for BYOC consolidates procurement. Letting enterprises pay Anyscale via AWS Marketplace or GCP Marketplace means the Anyscale spend counts against existing cloud commits and goes through existing procurement workflows. For Fortune-500 buyers with multi-million-dollar cloud-spend agreements, this is a meaningful procurement-friction reduction that locks in cloud-marketplace spend without requiring new vendor onboarding.

Strengths & weaknesses

Strengths	Weaknesses
Ray creator-and-commercializer credibility unique among inference platforms	RayTurbo performance claims (4.5×, 50%, 90%) require customer validation — not third-party benchmarked publicly
1:1 ACU = USD denomination removes billing-unit cognitive overhead	BYOC platform-only ACU rates are not published — must contact sales
Two-mode delivery (Hosted + BYOC) accommodates both managed and committed-spend buyers	Endpoints sunset (Jan 2025) means customers wanting per-token LLM inference must go elsewhere
$100 free trial is generous enough for non-trivial proof-of-value	Hosted support capped at 5 case submissions — 24x7 SLAs only on Enterprise BYOC
Customer-reported savings (Handshake 50%, Canva 50%, Attentive 99%) credible at scale	Hosted regions are limited; global enterprise customers must adopt BYOC
Cloud-marketplace billing on BYOC consolidates procurement for large customers	Pricing page does not surface RayTurbo performance benchmarks — claims live in blog content rather than rate-card supporting docs

Billing UX: Anyscale’s account controls and payment experience

Self-serve signup with $100 ACU trial — Sign up at console.anyscale.com with email; $100 in free credits applied automatically. No credit card required to start template projects.
ACU usage console — Real-time view of ACU consumption per cluster, per workload type (Train, Serve, Data), and per node. Historical usage downloads available as CSV.
Per-cluster cost meter — Each running Anyscale cluster shows live ACU burn rate and projected cost-to-completion based on workload type.
Spend alerts and caps — Configurable threshold alerts at $X ACU spend per workspace per period; hard spend caps available on hosted (not on BYOC commits).
Payment methods — Credit card on Developer and Hosted; ACH, wire transfer, invoice, and AWS/GCP Marketplace billing on Enterprise BYOC.
Annual commit + overage — Enterprise BYOC contracts include annual ACU commits with negotiated discount; overage above commit bills at standard ACU rates without renegotiation.
Support case management — In-app support case submission with SLA targets shown per tier (business-hours on Hosted, 24x7 on Enterprise BYOC).
Workspace and project RBAC — Workspace-level RBAC available on all tiers; SOC 2 audit-log exports on Enterprise.
Cloud-cost separation on BYOC — BYOC customers see Anyscale ACU billing and AWS/GCP/Azure compute billing as separate line items, which simplifies finance reconciliation against existing cloud-commit agreements.
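
A per-cluster cost meter and threshold alerts of the kind listed above can be approximated from the rate card alone. The sketch below is not Anyscale's implementation: `burn_rate`, `projected_cost`, and `crossed_thresholds` are hypothetical helpers, and the remaining-hours and threshold values are made up for illustration.

```python
from typing import Dict, List

ACU_PER_HOUR = {"CPU": 0.0135, "A10G": 1.3635}  # subset of the hosted rate card

def burn_rate(nodes: Dict[str, int]) -> float:
    """ACU per hour for a running cluster, given node counts by instance type."""
    return sum(ACU_PER_HOUR[kind] * count for kind, count in nodes.items())

def projected_cost(nodes: Dict[str, int], remaining_hours: float) -> float:
    """Cost to completion if the cluster keeps its current shape."""
    return burn_rate(nodes) * remaining_hours

def crossed_thresholds(spend_usd: float, thresholds: List[float]) -> List[float]:
    """Alert thresholds already reached by the current period's spend."""
    return [t for t in sorted(thresholds) if spend_usd >= t]

cluster = {"CPU": 1, "A10G": 4}               # one head node plus four GPU workers
rate = burn_rate(cluster)
remaining = projected_cost(cluster, remaining_hours=6)  # hypothetical estimate
alerts = crossed_thresholds(120.0, [50, 100, 200])
print(f"Burn rate: {rate:.4f} ACU/h, projected: ${remaining:.2f}, alerts: {alerts}")
```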

Strategic wins: Why Anyscale’s pricing decisions worked

1. Ray maintainer-commercializer alignment as the foundational moat

Anyscale’s founders are the original Ray authors and continue as core maintainers, which makes the implicit “we run Ray better than anyone” pitch credible in a way that few open-source-commercialization stories achieve. For enterprise buyers comparing Anyscale to “we’ll self-host Ray on our K8s,” the maintainer alignment creates structural switching-cost advantages: bug fixes land first on the commercial roadmap, performance optimizations ship to Anyscale first, and the same engineers writing Ray write RayTurbo.

2. Quantified RayTurbo claims as the markup-justification anchor

Rather than positioning the ACU markup as “the cost of management convenience,” Anyscale frames it as net savings via RayTurbo (4.5× inference, 50% training cost, 90% autoscale). This quantified framing lets enterprise procurement leaders run defensible math: “we’d run 1.5× as long on raw cloud Ray, so any ACU premium below 50% is a net saving.” Whether or not customers validate the multipliers independently, the framing wins.
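
The procurement math can be sketched in a few lines of Python. The hosted A100 rate is the published one; the raw cloud rate of $4.00/hr is a hypothetical placeholder chosen only for illustration, and `hosted_is_cheaper` is an illustrative helper rather than a vendor formula.

```python
HOSTED_A100_RATE = 4.9591  # published hosted ACU rate per hour
RAW_A100_RATE = 4.00       # HYPOTHETICAL raw cloud price per hour, illustration only

def hosted_is_cheaper(hosted_rate: float, raw_rate: float, raw_slowdown: float) -> bool:
    """True when a job costs less on hosted Anyscale than on raw cloud Ray.

    raw_slowdown is how many times longer the same job runs on raw
    open-source Ray (1.5 means 1.5x as long).
    """
    return hosted_rate < raw_rate * raw_slowdown

premium = HOSTED_A100_RATE / RAW_A100_RATE - 1  # markup over the assumed raw rate
break_even_slowdown = 1 + premium               # slowdown at which costs are equal

print(f"Premium: {premium:.1%}, break-even slowdown: {break_even_slowdown:.2f}x")
print(hosted_is_cheaper(HOSTED_A100_RATE, RAW_A100_RATE, 1.5))  # slowdown above break-even
print(hosted_is_cheaper(HOSTED_A100_RATE, RAW_A100_RATE, 1.0))  # no speedup at all
```

The comparison only favors the hosted rate when the realized slowdown on raw Ray exceeds the break-even value, which is why independent validation of the RayTurbo multipliers matters.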

3. Two-mode pricing (Hosted + BYOC) captures both managed and committed-spend buyers

Hosted ACU includes underlying cloud compute and wins managed-simplicity buyers. BYOC ACU is platform-only and wins committed-spend optimizers who already have AWS EDP or GCP committed-use discount agreements. Most infrastructure commercializations force a single delivery mode and lose one buyer segment; Anyscale’s dual-mode pricing architecture doesn’t.

4. Endpoints sunset as a strategic discipline signal

When competitors held loss-making inference middleware as a must-have defensive product, Anyscale’s willingness to kill Endpoints in January 2025 signaled strategic discipline — and freed engineering capacity to focus on RayTurbo and enterprise platform. The decision likely reduced 2025 revenue (lost Endpoints customers) but improved long-term focus on the larger, stickier platform market. Most companies cannot make this trade-off; the ones that can earn enterprise procurement trust.

Areas to improve: Gaps in Anyscale’s pricing approach

1. BYOC platform-only ACU rates should be published

Today the published ACU rate card covers only hosted (cloud-compute-inclusive) pricing. BYOC platform-only rates, which can be materially lower since customers pay cloud compute directly, are gated behind a sales call. Publishing a tiered BYOC schedule (e.g., $X/ACU at a $100K annual commit, $Y/ACU at $1M) would let large customers self-qualify into BYOC without sales friction and likely accelerate enterprise pipeline conversion.

2. RayTurbo benchmarks should live on the pricing page

The 4.5× / 50% / 90% performance claims live in Anyscale blog content rather than as rate-card supporting evidence. For procurement leaders evaluating the ACU markup, surfacing benchmark methodology and reproducibility next to the rate card would convert more “we’ll self-host Ray” deals. Without that proximity, the markup argument requires sales-led handholding rather than self-serve qualification.

3. No published path between Hosted credit-card and Enterprise BYOC

Hosted is credit-card billing, business-hours support, 5 case max. Enterprise BYOC is invoice or marketplace, 24x7 SLA, unlimited cases — but there is no published middle tier (e.g., 24x7 support without BYOC, or invoice billing without committing to BYOC). Mid-market customers growing past credit-card billing must jump straight to Enterprise BYOC, which is friction-heavy. A bridge tier would smooth the pricing transition and reduce churn at the boundary.

4. Spend-cap mechanics differ between Hosted and BYOC

Hosted supports hard spend caps; BYOC commits enforce overage at standard ACU rate but have no hard-cap mechanism (the assumption being that enterprise customers want overage availability rather than service stops). For mid-market BYOC customers, this asymmetry can produce unexpected overage bills. Adding an opt-in hard cap on BYOC overage — even at the cost of service throttling — would reduce bill-shock for the segment most sensitive to it.
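
The asymmetry can be illustrated with a small billing sketch. It assumes a simplified model in which a commit acts as a per-period floor and a hard cap stops workloads once reached; the function `billed_amount` and all dollar figures are hypothetical and do not describe Anyscale's actual billing engine.

```python
from typing import Optional

def billed_amount(
    usage_usd: float,
    commit_usd: float = 0.0,
    hard_cap_usd: Optional[float] = None,
) -> float:
    """Amount billed for one period under the simplified model described above."""
    if hard_cap_usd is not None:
        # Workloads stop at the cap, so usage beyond it never accrues.
        usage_usd = min(usage_usd, hard_cap_usd)
    # The commit is owed even if usage falls short; usage above it is overage.
    return max(usage_usd, commit_usd)

hosted_capped = billed_amount(800, hard_cap_usd=500)           # Hosted with a hard cap
byoc_overage = billed_amount(23_000, commit_usd=20_000)        # BYOC today: overage bills
byoc_opt_in_cap = billed_amount(23_000, commit_usd=20_000, hard_cap_usd=21_500)
print(hosted_capped, byoc_overage, byoc_opt_in_cap)
```

The third case shows the proposed opt-in cap: the customer trades throttled workloads for a bounded bill.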

Monetization stack & signals: How Anyscale builds & buys its revenue engine

Buys: 0 · Builds: 0 · Signal roles: 2

The read — where the monetization investment is going

Hybrid on one spine: a single 1:1 ACU meter (metering vendor undisclosed) feeds both self-serve credit-card billing and sales-led commits. The Customer Engineer role is the expansion lever, and there is no deal-desk or RevOps hire yet. The Ray Data PM role below marks where the monetization frontier actually sits.

Stack — build vs buy

Unconfirmed (1): Metering (inferred)

What the hiring reveals

Senior / Staff Product Manager - Ray Data (Monetization, Jun 19, 2026)

The JD is explicitly about where to draw the open-source boundary ('balancing open source growth with commercial differentiation' and drawing 'the subtle line between growth and commercialization'), confirming that the monetization frontier is the proprietary Anyscale RunTime layer that justifies the ACU markup over free Ray, not a packaging tweak.

Customer Engineer (Customer success, Jun 19, 2026)

Post-sale role scoped to 'onboard, adopt and grow on Anyscale... and driving consumption' — on a pure-usage ACU model, consumption growth IS the revenue metric, so CS is staffed as a land-and-expand lever rather than ticket triage.

Signals reviewed Jun 2026 · derived from public job posts

Job postings fill and close over time. Once a posting is filled, it is kept here as a dated citation (the quoted evidence remains).

Key takeaways

Open-source maintainer credibility is the most defensible commercialization moat. Anyscale’s founders being the original Ray authors creates durable competitive advantage that competitors with similar managed-Ray offerings cannot replicate. Infrastructure commercializations that lack maintainer alignment must invest disproportionately in performance benchmarks and customer-success motion to compensate.
Quantified runtime claims (RayTurbo) reframe markup as savings. Anyscale’s 4.5× / 50% / 90% framing turns the ACU markup conversation from “convenience tax” into “net cost reduction.” For any value-metric-priced infrastructure product, quantified performance claims attached to the rate card materially improve procurement conversion versus generic “managed service” positioning.
Two-mode delivery (managed + BYOC) accommodates both buyer archetypes. Hosted ACU wins managed-simplicity buyers; BYOC platform-only ACU wins committed-spend optimizers. The dual structure captures both segments without forcing a delivery choice on customers who have legitimate reasons to prefer either path.
Strategic sunsets (Endpoints) build long-term procurement trust. Anyscale’s willingness to kill its per-token inference product in January 2025 — accepting short-term revenue loss — improved roadmap focus and signaled discipline to enterprise buyers. For infrastructure platforms building decade-long customer relationships, this kind of pruning is more valuable than the lost revenue.
Cloud-marketplace billing as procurement-friction reduction matters for enterprise. Letting BYOC customers pay Anyscale via AWS or GCP Marketplace means the Anyscale spend consolidates with existing cloud-vendor commits — eliminating the new-vendor onboarding workflow that delays Fortune-500 procurement. For enterprise-platform companies, marketplace integration is now table stakes.

UBP implications

Maintainer-commercializer alignment is the structural moat for OSS-derived usage-based platforms. When the people writing the open-source project also run the commercial product, the implicit pitch (“we run it best”) creates durable competitive advantage that pure-play commercial competitors cannot replicate without acquiring the maintainers themselves.
Quantified performance multipliers can carry an infrastructure markup if the math holds. Anyscale’s RayTurbo claims (4.5× inference, 50% training, 90% autoscale) turn the ACU markup conversation into a net-savings calculation. Other pure-usage infrastructure platforms considering managed-service markups should invest in published, reproducible performance benchmarks that attach directly to the rate card.
Two-mode pricing (managed + customer-cloud) is the canonical enterprise UBP delivery model. As enterprise cloud spend consolidates into committed-spend agreements (AWS EDP, GCP CUD), infrastructure platforms that force managed-only delivery lose committed-spend buyers. The BYOC platform-only price + customer-direct cloud spend model is becoming the default enterprise architecture.

Sources

Anyscale pricing page (accessed 2026-05-29)
Anyscale docs (accessed 2026-05-29)
Anyscale blog — Introducing RayTurbo (accessed 2026-05-29)
Anyscale blog — Endpoints sunset announcement (accessed 2026-05-29)
Ray project (open-source) (accessed 2026-05-29)
Anyscale customer stories — Handshake, Canva, Attentive (accessed 2026-05-29)
Related infra blueprint — Baseten
Related infra blueprint — Fireworks AI

Bottom line

Anyscale priced its managed-Ray platform around two structural ideas: that the original Ray authors should commercialize their own framework better than anyone else can, and that the ACU markup over raw cloud compute should be net-zero or net-negative once RayTurbo’s claimed 4.5× / 50% / 90% gains are realized. The two-mode delivery (Hosted with cloud compute included, BYOC with platform-only ACU and customer-direct cloud billing) accommodates both managed-simplicity and committed-spend buyers without forcing a delivery choice.

For enterprise AI engineering leaders running distributed training or inference at scale, Anyscale is the most credible “we maintain the framework and run it for you” platform on the market — and the 2025 sunset of Anyscale Endpoints signaled the kind of strategic discipline that improves long-term procurement trust. The remaining gaps (unpublished BYOC discount schedules, RayTurbo benchmarks living in blog rather than rate-card-adjacent docs, no mid-tier between Hosted and Enterprise BYOC) are GTM polish problems rather than structural pricing flaws.

Compare with peers via the blueprint corpus, or model your own spend with the Anyscale pricing calculator.

Pricing timeline: Major events

Each milestone below corresponds to a public pricing change, product launch, or material adjustment.

Cloud Marketplace Billing for BYOC

Nov 2025

Anyscale added support for AWS Marketplace and GCP Marketplace billing on BYOC contracts, letting enterprise customers consume committed cloud spend through their existing cloud-provider agreements while paying Anyscale via marketplace settlement.

H200 GPU Availability Added

Jun 2025

Anyscale added NVIDIA H200 (141 GiB) at $10.6812/hr ACU, the first Hopper-refresh-class GPU available on the hosted platform. Positioned for frontier model training and large-scale batch inference.

Anyscale Endpoints Sunset

Jan 2025

Anyscale sunset Anyscale Endpoints, ending its participation in the per-token LLM inference middleware market. Existing customers were migrated to RayTurbo + Ray Serve dedicated deployments or pointed to first-party model providers. The company refocused on enterprise platform sales and BYOC commitments.

ACU (Anyscale Credit) Rate Card Published

Sep 2024

Anyscale published its full per-hour ACU rate card across CPU, T4, L4, A10G, A100, and H100, denominated 1:1 with USD (H200 followed in June 2025). Established price transparency for the hosted compute SKU.

RayTurbo Runtime Launched

Mar 2024

Anyscale launched RayTurbo — a proprietary runtime extension to open-source Ray claiming 4.5× faster LLM inference, 50% lower training cost, and 90% faster autoscale. Positioned the ACU markup as net-savings for customers compared to raw cloud + open-source Ray.

Anyscale Endpoints Launched at $1/1M Tokens

Aug 2023

Anyscale launched Anyscale Endpoints — a managed LLM inference API for Llama 2, Code Llama, and Mistral models — at $1 per million tokens. Positioned to compete with Together AI and Fireworks AI for the per-token inference market.

Series C ($99M) at ~$1B Valuation

Dec 2021

Anyscale raised a $99M Series C led by Addition with Andreessen Horowitz participation, reaching ~$1B post-money valuation. Funded the launch of Anyscale Endpoints, the managed Ray platform GA, and the enterprise BYOC offering.

Anyscale Founded

Dec 2019

Robert Nishihara, Philipp Moritz, and Ion Stoica (UC Berkeley RISELab — original Ray authors) founded Anyscale to commercialize the Ray distributed computing framework. Initial product was a hosted managed-Ray platform with simple per-hour compute pricing on top of underlying cloud rates.

Trivia

· Anyscale's ACU (Anyscale Compute Unit) is denominated 1:1 with USD on the published rate card — meaning a $4.9591/hour A100 line item literally bills $4.9591 in cash per hour, with cloud-list compute already included for hosted customers.
· Anyscale was founded in 2019 by Robert Nishihara, Philipp Moritz, and Ion Stoica — three of the original UC Berkeley RISELab authors of Ray — making it the rare commercial product where the open-source maintainers, the company founders, and the lead committers are the same people.
· Anyscale Endpoints (LLM inference at $1/1M tokens) launched August 2023 to compete with Together AI and Fireworks; the product was sunset on January 14, 2025 as Anyscale pivoted to RayTurbo and the broader enterprise platform — one of the highest-profile product sunsets in inference middleware.

Questions & answers

How much does Anyscale cost per month?
Anyscale has no monthly subscription fee; you pay only for the Anyscale Credits (ACU) consumed by your compute. ACU rates per hour: CPU-only $0.0135, T4 $0.5682, L4 $0.9542, A10G $1.3635, A100 $4.9591, H100 $9.2880, H200 $10.6812. A small dev cluster of 1 A10G running 4 hours/day would cost about $164/month ($1.3635 × 4 hours × 30 days).

What is an Anyscale Compute Unit (ACU)?
An ACU is Anyscale's billing unit, denominated 1:1 with USD on the published rate card. For Anyscale-hosted customers, ACU includes underlying cloud compute, RayTurbo runtime, the Anyscale control plane, and standard support. For BYOC customers, the ACU rate is lower (platform-only) and the cloud-compute bill goes directly to the customer's AWS/GCP/Azure account.

Does Anyscale have a free tier or free trial?
Yes. New users receive $100 in free ACU credits to evaluate the platform with no credit card required. Template projects (Ray Train, Ray Serve, Ray Data starters) cost $3–$5 in credits each, so the trial covers 20+ non-trivial proof-of-value runs.

What is the difference between Anyscale Hosted and BYOC (Bring Your Own Cloud)?
Hosted runs in Anyscale's managed VPC with credit-card billing, business-hours support, and a 5-case-submission cap; it is available in a limited set of regions. BYOC runs in your own AWS, GCP, or Azure VPC (or on-prem), with invoice or cloud-marketplace billing, 24x7 enterprise SLAs, unlimited support cases, and any cloud region or on-prem deployment your team can stand up.

What does RayTurbo do and is it required?
RayTurbo is Anyscale's proprietary runtime extension to open-source Ray, claiming 4.5× faster LLM inference, 50% lower training cost, and 90% faster autoscale. It is included with hosted Anyscale at no separate charge; customers who want pure open-source Ray can self-host. RayTurbo's claimed performance gains are the marketed justification for the ACU markup over raw cloud compute.

What happened to Anyscale Endpoints?
Anyscale Endpoints, the managed per-token LLM inference API launched August 2023 at $1/1M tokens, was sunset on January 14, 2025. Existing customers were migrated to RayTurbo + Ray Serve dedicated deployments or pointed to other providers (DeepSeek, Together AI, Fireworks). The sunset reflected Anyscale's strategic refocus on enterprise platform sales and BYOC commitments.