AI Summary
About
Anyscale is a San Francisco-based AI infrastructure company founded in December 2019 by Robert Nishihara, Philipp Moritz, and Ion Stoica — three of the original UC Berkeley RISELab authors of Ray, the open-source distributed computing framework that underpins modern Python-based AI training and inference at scale. The product is a managed Ray platform: customers write standard Ray code (Ray Train, Ray Serve, Ray Data, RLlib) and Anyscale provisions the underlying cluster, applies its RayTurbo runtime optimizations, and exposes per-hour billing through an Anyscale Compute Unit (ACU) denominated 1:1 with USD.
By 2026 Anyscale runs production workloads for enterprises including Handshake, Canva, Attentive, Pinterest, Cohere, and Roblox — covering use cases that span large-scale LLM training, batch inference, RAG indexing, recommendation systems, and time-series feature engineering. The company raised a $99M Series C in late 2021 led by Addition with Andreessen Horowitz participation at ~$1B post-money valuation, with subsequent extension rounds maintaining the $1B+ valuation through 2025.
Anyscale competes with hyperscaler ML platforms (AWS Sagemaker, Vertex AI, Azure ML), distributed-training specialists (Modal, Together AI for inference, MosaicML pre-Databricks-acquisition), and the increasingly viable “build it yourself on Kubernetes + open-source Ray” path. Anyscale’s differentiation is the combination of being the maintainers-and-commercializers of Ray, the RayTurbo runtime that claims meaningful performance gains over raw open-source Ray, and a dual hosted+BYOC delivery model that lets enterprises pick managed-VPC simplicity or own-VPC control without changing tooling.
Pricing summary : How Anyscale’s ACU per-hour + commit + BYOC stack works
Anyscale runs a pure-usage compute model denominated in Anyscale Credits (ACU), where 1 ACU = $1 USD on the published rate card. Hosted customers pay an all-in ACU rate per hour that includes underlying cloud compute (AWS or GCP), the RayTurbo runtime, the Anyscale control plane, and standard support. BYOC customers pay a lower platform-only ACU rate plus their own cloud compute bill directly to AWS, GCP, or Azure — using existing GPU reservations and committed-spend agreements to materially reduce total cost.
The free tier is structured as a $100 ACU credit grant on signup, which covers 20+ template-project runs (Ray Train, Serve, Data starters at $3–$5 each). Pro-equivalent volume discounts are not published; instead, Anyscale routes mid-market and enterprise customers into annual committed-contract deals that deliver discounts proportional to commit size, with overages billing at standard ACU rates. This pure-usage + commitment hybrid is increasingly the canonical AI infrastructure pricing pattern.
What makes this different: Anyscale’s three founders are also the original Ray authors and core maintainers. This makes the implicit value proposition — “we run Ray better than anyone else can self-host it” — credible in a way that few open-source-commercialization stories are. The RayTurbo runtime quantifies that claim (4.5× inference, 50% training cost reduction, 90% faster autoscale), turning a marketing positioning into a measurable ACU markup justification.
Pricing by product
Anyscale Compute Units (ACU) — hosted per-hour rates
| Instance | VRAM / type | ACU per hour | Notes |
|---|---|---|---|
| CPU-only | n/a | $0.0135 | Head node, lightweight orchestration |
| NVIDIA T4 | 16 GiB | $0.5682 | Low-cost inference, embeddings |
| NVIDIA L4 | 24 GiB | $0.9542 | Mid-range inference, video |
| NVIDIA A10G | 24 GiB | $1.3635 | 7B–13B inference, image generation |
| NVIDIA A100 | 80 GiB | $4.9591 | 30B–70B training and inference |
| NVIDIA H100 | 80 GiB | $9.2880 | Frontier training, low-latency inference |
| NVIDIA H200 | 141 GiB | $10.6812 | Largest open-model training, batch inference |
Tier and deployment matrix
| Feature | Developer ($100 trial) | Hosted | Enterprise BYOC |
|---|---|---|---|
| ACU rate | Published | Published | Lower (platform-only) |
| Regions | Limited | Limited | Any cloud / region / on-prem |
| Infrastructure | Anyscale-managed | Anyscale-managed | Your VPC |
| GPU sourcing | Anyscale cloud-list | Anyscale cloud-list | Your committed-spend agreements |
| Billing | Credit-card | Monthly credit-card | Invoice or cloud marketplace |
| Support | Business hours, 5 cases | Business hours, 5 cases | 24x7 enterprise SLAs, unlimited |
| Free credit | $100 ACU | None | None |
| Annual commit discounts | n/a | Yes | Yes (larger) |
Customer-reported workload savings
| Customer | Reported savings | Workload type |
|---|---|---|
| Handshake | ~50% | Distributed training + recommendations |
| Canva | ~50% | Inference and batch processing |
| Attentive | ~99% on specific workload | Batch ML pipeline migration |
Sales motions across products: PLG / self-serve for Developer trial and Hosted credit-card customers; sales-led for Enterprise BYOC and annual committed contracts. All prices accessed 2026-05-29 from anyscale.com/pricing.
Hidden costs : What Anyscale customers actually pay beyond the ACU rate card
Archetype A: ML startup running RayTrain training jobs on hosted Anyscale
A growth-stage AI startup running distributed Ray Train jobs on A10G clusters for fine-tuning, with traffic concentrated in weekly training runs:
| Line item | Monthly cost |
|---|---|
| A10G cluster (4 nodes × 8h × 4 runs/mo × $1.3635) | $174 |
| Head node CPU ($0.0135 × 80 hours active) | $1.08 |
| Storage for model checkpoints (S3, billed separately) | ~$25 |
| Cross-region data transfer (negligible at this scale) | <$5 |
| Estimated total | ~$205/month |
The $100 trial credit covers roughly the first month of this workload. After the trial, real costs scale linearly with cluster hours — and the customer learns that warm-pool autoscale (RayTurbo’s 90% improvement) means the cluster downsizes faster between jobs than a hand-rolled Ray cluster would, materially reducing idle billing.
Archetype B: Mid-market enterprise on BYOC with annual commit
A 200-person enterprise running production Ray Serve inference on BYOC inside their AWS Reserved Instance pool, with a $250K annual commit:
| Line item | Monthly cost |
|---|---|
| Anyscale ACU (platform-only on BYOC) — committed | $20,833 |
| AWS compute (their RI pool, billed direct) | ~$45,000 |
| Anyscale overage above commit (occasional spike) | $0–$3,000 |
| 24x7 enterprise SLA support | Included in commit |
| Estimated total | ~$66,000–$69,000/month |
The platform-only ACU rate on BYOC is materially lower than the all-in hosted rate — Anyscale doesn’t publish the BYOC discount schedule, but the trade-off is real: customers buy a $250K annual platform commit and use it on their own RI pool. For organizations with existing AWS Enterprise Discount Programs (EDP), this can deliver the 50% infrastructure savings that Handshake and Canva publicly cite.
Want to estimate your own Anyscale bill? Use the Anyscale pricing calculator to model ACU consumption by instance type, weekly active hours, and BYOC versus hosted deployment.
Pricing evolution : Anyscale’s pricing history from managed-Ray to RayTurbo-led enterprise
Cadence
| Quarter | Price changes | Product / SKU additions | Notes |
|---|---|---|---|
| 2019 Q4 | 0 | 1 | Anyscale founded; initial managed-Ray product |
| 2021 Q4 | 0 | 0 | Series C ($99M) at ~$1B; funded Endpoints + BYOC |
| 2023 Q3 | 0 | 1 | Anyscale Endpoints launched at $1/1M tokens |
| 2024 Q1 | 0 | 1 | RayTurbo runtime launched as proprietary differentiator |
| 2024 Q3 | 1 | 0 | ACU rate card published (CPU → H100) |
| 2025 Q1 | 0 | -1 | Anyscale Endpoints sunset January 14, 2025 |
| 2025 Q2 | 0 | 1 | H200 GPU availability added at $10.6812/hr |
| 2025 Q4 | 0 | 1 | AWS + GCP Marketplace billing for BYOC contracts |
Tracked range: 2019 Q4–2025 Q4. Quarters not listed above were verified stable (0 price changes, 0 SKU additions).
Notable changes
- 2023-08-22 — Anyscale Endpoints launched at $1 per million tokens, briefly entering the per-token LLM inference market alongside Together AI and Fireworks.
- 2024-03-19 — RayTurbo runtime launched, quantifying performance gains (4.5× inference, 50% training cost, 90% autoscale) to justify the ACU markup.
- 2024-09-10 — Full ACU rate card published across CPU through H100; transparency play to enable self-serve mid-market evaluation.
- 2025-01-14 — Anyscale Endpoints sunset; strategic refocus onto enterprise platform and BYOC.
- 2025-11-19 — AWS Marketplace and GCP Marketplace billing added for BYOC, letting enterprise customers consume Anyscale spend through existing cloud-provider commit agreements.
The 2025 Endpoints sunset in detail
When Anyscale Endpoints launched in August 2023, the per-token LLM inference market was an open contest among Together AI, Fireworks AI, Anyscale, and a long tail of smaller providers. The product priced at $1 per million tokens for Llama 2 7B — competitive but not dramatically cheaper than peers, with the differentiating pitch being “the same Ray Serve infrastructure that runs production inference at Pinterest now runs your LLM.”
By late 2024, the inference market had bifurcated: first-party model providers (DeepSeek, Mistral, Cohere) shipped competitive per-token rates and OpenAI-compatible APIs, while specialized inference clouds (Together AI, Fireworks, Baseten) invested heavily in latency optimization and model-specific quantization. Anyscale’s product was neither the cheapest nor the fastest — and its Endpoints customers were mostly evaluation traffic rather than production workloads. The January 2025 sunset reflected a strategic decision to refocus on the larger and stickier enterprise-platform market where Anyscale’s Ray credibility creates more durable competitive advantage than per-token inference middleware.
What’s unique : Anyscale’s distinctive pricing mechanics
1. ACU = $1 USD denomination removes credit-translation cognitive load. Most cloud platforms denominate credits in opaque points (AWS credits, GCP credits) or in non-dollar quantities (Snowflake credits at variable USD rates per edition). Anyscale’s 1:1 ACU = $1 USD means a customer reading “this notebook costs 5 ACU” can immediately read $5. This transparent unit economics reduces conversion friction for self-serve developers and finance-team review of enterprise bills.
2. RayTurbo as the quantified markup justification. When customers question why hosted Anyscale costs more per A100 hour than raw cloud A100 list, the answer is RayTurbo’s claimed 4.5× inference / 50% training cost / 90% autoscale gains — measurable performance multipliers that, if accurate, make the all-in ACU rate net-cheaper than self-hosted Ray. Few infrastructure commercializations quantify their value-add this explicitly.
3. Two-mode delivery (Hosted vs BYOC) at materially different prices. Hosted ACU includes cloud compute; BYOC ACU is platform-only and the customer pays cloud directly. This dual-mode pricing lets customers with existing AWS Enterprise Discount Programs or GPU reservations capture those discounts without giving them back to a hosted provider — and Anyscale captures the platform-revenue regardless. The pricing architecture accommodates both managed-simplicity buyers and committed-spend optimizers.
4. Endpoints sunset signaled strategic discipline rather than weakness. When competitors held inference middleware as a defensive must-have product, Anyscale’s January 2025 decision to sunset Endpoints freed engineering and GTM capacity to focus on enterprise platform sales — where Ray creator-credibility creates a defensible moat. Many infrastructure companies refuse to kill underperforming products; Anyscale’s willingness to do so improved the focus of the remaining roadmap.
5. Cloud Marketplace billing for BYOC consolidates procurement. Letting enterprises pay Anyscale via AWS Marketplace or GCP Marketplace means the Anyscale spend counts against existing cloud commits and goes through existing procurement workflows. For Fortune-500 buyers with multi-million-dollar cloud-spend agreements, this is a meaningful procurement-friction reduction that locks in cloud-marketplace spend without requiring new vendor onboarding.
Strengths & weaknesses
| Strengths | Weaknesses |
|---|---|
| Ray creator-and-commercializer credibility unique among inference platforms | RayTurbo performance claims (4.5×, 50%, 90%) require customer validation — not third-party benchmarked publicly |
| 1:1 ACU = USD denomination removes billing-unit cognitive overhead | BYOC platform-only ACU rates are not published — must contact sales |
| Two-mode delivery (Hosted + BYOC) accommodates both managed and committed-spend buyers | Endpoints sunset (Jan 2025) means customers wanting per-token LLM inference must go elsewhere |
| $100 free trial is generous enough for non-trivial proof-of-value | Hosted support capped at 5 case submissions — 24x7 SLAs only on Enterprise BYOC |
| Customer-reported savings (Handshake 50%, Canva 50%, Attentive 99%) credible at scale | Hosted regions are limited; global enterprise customers must adopt BYOC |
| Cloud-marketplace billing on BYOC consolidates procurement for large customers | Pricing page does not surface RayTurbo performance benchmarks — claims live in blog content rather than rate-card supporting docs |
Billing UX : Anyscale’s account controls and payment experience
- Self-serve signup with $100 ACU trial — Sign up at
console.anyscale.comwith email; $100 in free credits applied automatically. No credit card required to start template projects. - ACU usage console — Real-time view of ACU consumption per cluster, per workload type (Train, Serve, Data), and per node. Historical usage downloads available as CSV.
- Per-cluster cost meter — Each running Anyscale cluster shows live ACU burn rate and projected cost-to-completion based on workload type.
- Spend alerts and caps — Configurable threshold alerts at $X ACU spend per workspace per period; hard spend caps available on hosted (not on BYOC commits).
- Payment methods — Credit card on Developer and Hosted; ACH, wire transfer, invoice, and AWS/GCP Marketplace billing on Enterprise BYOC.
- Annual commit + overage — Enterprise BYOC contracts include annual ACU commits with negotiated discount; overage above commit bills at standard ACU rates without renegotiation.
- Support case management — In-app support case submission with SLA targets shown per tier (business-hours on Hosted, 24x7 on Enterprise BYOC).
- Workspace and project RBAC — Workspace-level RBAC available on all tiers; SOC 2 audit-log exports on Enterprise.
- Cloud-cost separation on BYOC — BYOC customers see Anyscale ACU billing and AWS/GCP/Azure compute billing as separate line items, which simplifies finance reconciliation against existing cloud-commit agreements.
Strategic wins : Why Anyscale’s pricing decisions worked
1. Ray maintainer-commercializer alignment as the foundational moat
Anyscale’s founders are the original Ray authors and continue as core maintainers — making the implicit “we run Ray better than anyone” pitch credible in a way that few open-source-commercialization stories achieve. For enterprise buyers comparing Anyscale to “we’ll self-host Ray on our K8s,” the maintainer alignment creates structural switching cost advantages: bugs land first on the commercial roadmap, performance optimizations ship to Anyscale first, and the same engineers writing Ray write RayTurbo.
2. Quantified RayTurbo claims as the markup-justification anchor
Rather than positioning the ACU markup as “the cost of management convenience,” Anyscale frames it as net-savings via RayTurbo (4.5× inference, 50% training cost, 90% autoscale). This quantified-value-metric framing lets enterprise procurement leaders run a defensible math: “we’d run 1.5× as long on raw cloud Ray; therefore the ACU premium under-prices the savings.” Whether or not customers validate the multipliers independently, the framing wins.
3. Two-mode pricing (Hosted + BYOC) captures both managed and committed-spend buyers
Hosted ACU includes underlying cloud compute and wins managed-simplicity buyers. BYOC ACU is platform-only and wins committed-spend optimizers who already have AWS EDP or GCP committed-use discount agreements. Most infrastructure commercializations force a single delivery mode and lose one buyer segment; Anyscale’s dual-mode pricing architecture doesn’t.
4. Endpoints sunset as a strategic discipline signal
When competitors held loss-making inference middleware as a must-have defensive product, Anyscale’s willingness to kill Endpoints in January 2025 signaled strategic discipline — and freed engineering capacity to focus on RayTurbo and enterprise platform. The decision likely reduced 2025 revenue (lost Endpoints customers) but improved long-term focus on the larger, stickier platform market. Most companies cannot make this trade-off; the ones that can earn enterprise procurement trust.
Areas to improve : Gaps in Anyscale’s pricing approach
1. BYOC platform-only ACU rates should be published
Today the published ACU rate card covers only hosted (cloud-compute-inclusive) pricing. BYOC platform-only rates — which can be materially lower since customers pay cloud compute directly — are gated behind a sales call. Publishing a tiered BYOC schedule (e.g., $X/ACU at 100K annual commit, $Y/ACU at 1M) would let large customers self-qualify into BYOC without sales friction and likely accelerate enterprise pipeline conversion.
2. RayTurbo benchmarks should live on the pricing page
The 4.5× / 50% / 90% performance claims live in Anyscale blog content rather than as rate-card supporting evidence. For procurement leaders evaluating the ACU markup, surfacing benchmark methodology and reproducibility next to the rate card would convert more “we’ll self-host Ray” deals. Without that proximity, the markup argument requires sales-led handholding rather than self-serve qualification.
3. No published path between Hosted credit-card and Enterprise BYOC
Hosted is credit-card billing, business-hours support, 5 case max. Enterprise BYOC is invoice or marketplace, 24x7 SLA, unlimited cases — but there is no published middle tier (e.g., 24x7 support without BYOC, or invoice billing without committing to BYOC). Mid-market customers growing past credit-card billing must jump straight to Enterprise BYOC, which is friction-heavy. A bridge tier would smooth the pricing transition and reduce churn at the boundary.
4. Spend-cap mechanics differ between Hosted and BYOC
Hosted supports hard spend caps; BYOC commits enforce overage at standard ACU rate but have no hard-cap mechanism (the assumption being that enterprise customers want overage availability rather than service stops). For mid-market BYOC customers, this asymmetry can produce unexpected overage bills. Adding an opt-in hard cap on BYOC overage — even at the cost of service throttling — would reduce bill-shock for the segment most sensitive to it.
Key takeaways
-
Open-source maintainer credibility is the most defensible commercialization moat. Anyscale’s founders being the original Ray authors creates durable competitive advantage that competitors with similar managed-Ray offerings cannot replicate. Infrastructure commercializations that lack maintainer alignment must invest disproportionately in performance benchmarks and customer-success motion to compensate.
-
Quantified runtime claims (RayTurbo) reframe markup as savings. Anyscale’s 4.5× / 50% / 90% framing turns the ACU markup conversation from “convenience tax” into “net cost reduction.” For any value-metric-priced infrastructure product, quantified performance claims attached to the rate card materially improve procurement conversion versus generic “managed service” positioning.
-
Two-mode delivery (managed + BYOC) accommodates both buyer archetypes. Hosted ACU wins managed-simplicity buyers; BYOC platform-only ACU wins committed-spend optimizers. The dual structure captures both segments without forcing a delivery choice on customers who have legitimate reasons to prefer either path.
-
Strategic sunsets (Endpoints) build long-term procurement trust. Anyscale’s willingness to kill its per-token inference product in January 2025 — accepting short-term revenue loss — improved roadmap focus and signaled discipline to enterprise buyers. For infrastructure platforms building decade-long customer relationships, this kind of pruning is more valuable than the lost revenue.
-
Cloud-marketplace billing as procurement-friction reduction matters for enterprise. Letting BYOC customers pay Anyscale via AWS or GCP Marketplace means the Anyscale spend consolidates with existing cloud-vendor commits — eliminating the new-vendor onboarding workflow that delays Fortune-500 procurement. For enterprise-platform companies, marketplace integration is now table stakes.
UBP implications
-
Maintainer-commercializer alignment is the structural moat for OSS-derived usage-based platforms. When the people writing the open-source project also run the commercial product, the implicit pitch (“we run it best”) creates durable competitive advantage that pure-play commercial competitors cannot replicate without acquiring the maintainers themselves.
-
Quantified performance multipliers can carry an infrastructure markup if the math holds. Anyscale’s RayTurbo claims (4.5× inference, 50% training, 90% autoscale) turn the ACU markup conversation into a net-savings calculation. Other pure-usage infrastructure platforms considering managed-service markups should invest in published, reproducible performance benchmarks that attach directly to the rate card.
-
Two-mode pricing (managed + customer-cloud) is the canonical enterprise UBP delivery model. As enterprise cloud spend consolidates into committed-spend agreements (AWS EDP, GCP CUD), infrastructure platforms that force managed-only delivery lose committed-spend buyers. The BYOC platform-only price + customer-direct cloud spend model is becoming the default enterprise architecture.
Sources
- Anyscale pricing page (accessed 2026-05-29)
- Anyscale docs (accessed 2026-05-29)
- Anyscale blog — Introducing RayTurbo (accessed 2026-05-29)
- Anyscale blog — Endpoints sunset announcement (accessed 2026-05-29)
- Ray project (open-source) (accessed 2026-05-29)
- Anyscale customer stories — Handshake, Canva, Attentive (accessed 2026-05-29)
- Related infra blueprint — Baseten
- Related infra blueprint — Fireworks AI
Bottom line
Anyscale priced its managed-Ray platform around two structural ideas: that the original Ray authors should commercialize their own framework better than anyone else can, and that the ACU markup over raw cloud compute should be net-zero or net-negative once RayTurbo’s claimed 4.5× / 50% / 90% gains are realized. The two-mode delivery (Hosted with cloud compute included, BYOC with platform-only ACU and customer-direct cloud billing) accommodates both managed-simplicity and committed-spend buyers without forcing a delivery choice.
For enterprise AI engineering leaders running distributed training or inference at scale, Anyscale is the most credible “we maintain the framework and run it for you” platform on the market — and the 2025 sunset of Anyscale Endpoints signaled the kind of strategic discipline that improves long-term procurement trust. The remaining gaps (unpublished BYOC discount schedules, RayTurbo benchmarks living in blog rather than rate-card-adjacent docs, no mid-tier between Hosted and Enterprise BYOC) are GTM polish problems rather than structural pricing flaws.
Compare with peers via the blueprint corpus, or model your own spend with the Anyscale pricing calculator.
Pricing timeline : Major events on a vertical axis
Each milestone below corresponds to a public pricing change, product launch, or material adjustment. Major events use a filled marker; minor adjustments use a faded one.
Cloud Marketplace Billing for BYOC
Anyscale added support for AWS Marketplace and GCP Marketplace billing on BYOC contracts, letting enterprise customers consume committed cloud spend through their existing cloud-provider agreements while paying Anyscale via marketplace settlement.
H200 GPU Availability Added
Anyscale added NVIDIA H200 (141GB) at $10.6812/hr ACU — first availability of Hopper-refresh-class GPU on the hosted platform. Positioned for frontier model training and large-scale batch inference.
Anyscale Endpoints Sunset
Anyscale sunset Anyscale Endpoints, ending its participation in the per-token LLM inference middleware market. Existing customers were migrated to RayTurbo + Ray Serve dedicated deployments or pointed to first-party model providers. The company refocused on enterprise platform sales and BYOC commitments.
ACU (Anyscale Credit) Rate Card Published
Anyscale published its full per-hour ACU rate card across CPU, T4, L4, A10G, A100, H100, and H200 — denominated 1:1 with USD. Established price transparency for the hosted compute SKU.
RayTurbo Runtime Launched
Anyscale launched RayTurbo — a proprietary runtime extension to open-source Ray claiming 4.5× faster LLM inference, 50% lower training cost, and 90% faster autoscale. Positioned the ACU markup as net-savings for customers compared to raw cloud + open-source Ray.
Anyscale Endpoints Launched at $1/1M Tokens
Anyscale launched Anyscale Endpoints — a managed LLM inference API for Llama 2, Code Llama, and Mistral models — at $1 per million tokens. Positioned to compete with Together AI and Fireworks AI for the per-token inference market.
Series C ($99M) at ~$1B Valuation
Anyscale raised a $99M Series C led by Addition with Andreessen Horowitz participation, reaching ~$1B post-money valuation. Funded the launch of Anyscale Endpoints, the managed Ray platform GA, and the enterprise BYOC offering.
Anyscale Founded
Robert Nishihara, Philipp Moritz, and Ion Stoica (UC Berkeley RISELab — original Ray authors) founded Anyscale to commercialize the Ray distributed computing framework. Initial product was a hosted managed-Ray platform with simple per-hour compute pricing on top of underlying cloud rates.
- · Anyscale's ACU (Anyscale Compute Unit) is denominated 1:1 with USD on the published rate card — meaning a $4.9591/hour A100 line item literally bills $4.9591 in cash per hour, with cloud-list compute already included for hosted customers.
- · Anyscale was founded in 2019 by Robert Nishihara, Philipp Moritz, and Ion Stoica — three of the original UC Berkeley RISELab authors of Ray — making it the rare commercial product where the open-source maintainers, the company founders, and the lead committers are the same people.
- · Anyscale Endpoints (LLM inference at $1/1M tokens) launched August 2023 to compete with Together AI and Fireworks; the product was sunset on January 14, 2025 as Anyscale pivoted to RayTurbo and the broader enterprise platform — one of the highest-profile product sunsets in inference middleware.
Questions & answers
- How much does Anyscale cost per month?
- Anyscale has no monthly subscription fee — you pay only for the Anyscale Credits (ACU) consumed by your compute. ACU rates per hour: CPU-only $0.0135, T4 $0.5682, L4 $0.9542, A10G $1.3635, A100 $4.9591, H100 $9.2880, H200 $10.6812. A small dev cluster of 1 A10G running 4 hours/day would cost ~$163/month.
- What is an Anyscale Compute Unit (ACU)?
- An ACU is Anyscale's billing unit, denominated 1:1 with USD on the published rate card. For Anyscale-hosted customers, ACU includes underlying cloud compute, RayTurbo runtime, the Anyscale control plane, and standard support. For BYOC customers, the ACU rate is lower (platform-only) and the cloud-compute bill goes directly to the customer's AWS/GCP/Azure account.
- Does Anyscale have a free tier or free trial?
- Yes — new users receive $100 in free ACU credits to evaluate the platform with no credit card required. Template projects (Ray Train, Ray Serve, Ray Data starters) cost $3–$5 in credits each, so the trial covers 20+ non-trivial proof-of-value runs.
- What is the difference between Anyscale Hosted and BYOC (Bring Your Own Cloud)?
- Hosted runs in Anyscale's managed VPC with credit-card billing, business-hours support, and a 5-case-submission cap; available in a limited set of regions. BYOC runs in your own AWS, GCP, or Azure VPC (or on-prem), with invoice or cloud-marketplace billing, 24x7 enterprise SLAs, unlimited support cases, and any cloud region or on-prem deployment your team can stand up.
- What does RayTurbo do and is it required?
- RayTurbo is Anyscale's proprietary runtime extension to open-source Ray, claiming 4.5× faster LLM inference, 50% lower training cost, and 90% faster autoscale. It is included with hosted Anyscale at no separate charge — customers who want pure open-source Ray can self-host. RayTurbo's claimed performance gains are the marketed justification for the ACU markup over raw cloud compute.
- What happened to Anyscale Endpoints?
- Anyscale Endpoints — the managed per-token LLM inference API launched August 2023 at $1/1M tokens — was sunset on January 14, 2025. Existing customers were migrated to RayTurbo + Ray Serve dedicated deployments or pointed to first-party providers (DeepSeek, Together AI, Fireworks). The sunset reflected Anyscale's strategic refocus on enterprise platform sales and BYOC commitments.