How much does Modal cost per month?

Modal's Starter tier is free with $30/month in compute credits and pay-as-you-go billing above that. The Team plan is $250/month plus compute (with $100/month in credits). A typical mid-volume workload running an A100 80GB ($0.000694/sec) for 4 hours/day would cost about $300/month in compute on top of the plan fee.

What GPU rates does Modal publish?

Modal publishes per-second rates: T4 $0.000164, L4 $0.000222, A10 $0.000306, L40S $0.000542, A100 40GB $0.000583, A100 80GB $0.000694, RTX PRO 6000 $0.000842, H100 $0.001097, H200 $0.001261, B200 $0.001736, B300 $0.001972. All bill per second of active use — idle containers cost only memory and storage. Region selection adds 1.5–1.75× and non-preemptible execution 3× on top of base rates.

What does Modal include in the free Starter tier?

Starter is free with $30/month in compute credits, 3 workspace seats, 100 container concurrency, and 10 GPU concurrency. At base rates, the credits cover roughly 51 hours of T4 or 7.6 hours of H100. That is sufficient for evaluation and small-scale production, and no credit card is required.
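
As a rough check on how far the monthly credit stretches, the sketch below divides the $30 Starter credit by the per-second rates quoted in this document. It assumes base (preemptible, default-region) rates and ignores CPU and memory charges; the function name is illustrative only.

```python
# Per-second GPU rates quoted in this document (USD).
RATES_PER_SEC = {"T4": 0.000164, "H100": 0.001097}

def credit_hours(credit_usd: float, rate_per_sec: float) -> float:
    """Hours of GPU time a credit balance buys at an unmodified base rate."""
    return credit_usd / rate_per_sec / 3600

for gpu, rate in RATES_PER_SEC.items():
    print(f"{gpu}: {credit_hours(30.0, rate):.1f} hours")
```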

How is Modal's CPU and memory billed?

CPU is $0.0000131 per physical core per second (minimum 0.125 cores per task); memory is $0.00000222 per GiB per second. A typical lightweight function using 1 physical core and 4 GiB for 60 seconds costs about $0.0013 (roughly $0.0008 for CPU plus $0.0005 for memory). Persistent volume storage is $0.09/GiB-month with 1 TiB/month free.
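
The CPU and memory arithmetic can be sketched as follows. This is a minimal estimate, assuming the 0.125-core minimum acts as a floor on billed cores; the function name is hypothetical and not part of any Modal API.

```python
CPU_PER_CORE_SEC = 0.0000131   # USD per physical core per second
MEM_PER_GIB_SEC = 0.00000222   # USD per GiB per second
MIN_CORES = 0.125              # minimum billable cores per task

def function_cost(cores: float, mem_gib: float, seconds: float) -> float:
    """CPU + memory cost of one non-GPU function run (assumed billing model)."""
    billable_cores = max(cores, MIN_CORES)  # assumption: minimum is a floor
    per_second = billable_cores * CPU_PER_CORE_SEC + mem_gib * MEM_PER_GIB_SEC
    return seconds * per_second

# 1 physical core, 4 GiB, 60 seconds
print(f"${function_cost(1, 4, 60):.5f}")
```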

Does Modal offer startup credits?

Yes. Qualifying startups and academic researchers can receive up to $10,000 in Modal credits. Applications go through Modal's startup program; eligibility typically requires early-stage status, an ML/AI focus, or academic affiliation. The grant is among the most generous in serverless GPU.

How does Modal's Team plan differ from Enterprise?

Team ($250/mo + compute, $100/mo credits) offers unlimited seats, 1,000 container concurrency, 50 GPU concurrency, and standard compliance (SOC 2). Enterprise is quote-based and adds custom volume discounts, dedicated capacity reservations, advanced security (audit logs, RBAC), VPC deployment, and marketplace billing via AWS/GCP.

Modal Pricing

AI Summary

Modal runs a per-second pure-usage compute model: GPU rates from T4 at $0.000164/sec to B300 at $0.001972/sec, with B200 at $0.001736/sec, H200 at $0.001261/sec, H100 at $0.001097/sec, RTX PRO 6000 at $0.000842/sec, A100 80GB at $0.000694/sec, A100 40GB at $0.000583/sec, L40S at $0.000542/sec, A10 at $0.000306/sec, and L4 at $0.000222/sec.
CPU is billed at $0.0000131/physical-core/sec (minimum 0.125 cores); memory at $0.00000222/GiB/sec; persistent volume storage at $0.09/GiB-month with 1 TiB/month free. Region selection is billed at 1.5–1.75× base prices and non-preemptible execution at 3× base prices. Sandbox + Notebooks compute is metered separately at $0.00003942/core/sec CPU and $0.00000667/GiB/sec memory.
Three plan tiers: Starter (free, $30/mo in credits, 3 workspace seats, 100 containers, 10 GPU concurrency); Team ($250/mo + compute, $100/mo in credits, unlimited seats, 1,000 containers, 50 GPU concurrency); Enterprise (custom, volume discounts, enhanced security).
Modal grants up to $10,000 in credits to qualifying startups and academic researchers; AWS and GCP marketplace integration available for committed-spend customers.
Founded 2021 by Erik Bernhardsson (creator of Luigi and Annoy, ex-Spotify ML) and Akshat Bubna; raised Series A ($16M) led by Redpoint in 2023 and Series B ($31M+ reported) in 2024.
Marketing emphasizes Python-native developer experience with `@app.function()` decorators that wrap Python code as serverless functions — making Modal one of the most opinionated 'Python as the only language' serverless platforms.

Pricing summary

Modal 2026 — Per-second serverless compute

Starter ($0 + $30 credits) → Team ($250/mo + compute) → Enterprise (custom). H100 at $0.001097/sec.

Starter

$0 /mo + usage

Solo developers, prototypes, evaluation

$30 credits/mo

Team

$250 /mo + compute

Growth-stage teams, sustained production

$100 credits/mo

Enterprise

Custom

Compliance-heavy, large-scale orgs

Annual commit

GPU per-second

From $0.000164 /sec (T4)

Per-second metered GPU compute

CPU + Memory + Storage

$0.0000131 /core/sec

Non-GPU function execution

Per-second billing granularity on every SKU. Starter free with $30/month credits; Team adds unlimited seats and higher concurrency. AWS/GCP Marketplace billing available on Enterprise commits.

About

Modal is a New York-based serverless compute platform founded in September 2021 by Erik Bernhardsson (creator of Luigi and Annoy, ex-Spotify ML lead) and Akshat Bubna. The product is an opinionated Python-native serverless cloud: customers wrap functions with @app.function() decorators specifying GPU type, container image, memory, and concurrency, and Modal handles container build, image registry, scheduling, autoscaling, and per-second billing without exposing Kubernetes or infrastructure abstractions. The platform supports both interactive workloads (Jupyter, web endpoints) and batch jobs (training, batch inference, ETL).

By 2026 Modal serves Suno, Substack, Ramp, Notion, Anthropic (for select internal workloads), and roughly 1,000 paying customers spanning AI-native startups, academic research labs, and mid-market product engineering teams running Python data pipelines. The company raised a $16M Series A in February 2023 led by Redpoint with Lux Capital participation, followed by a Series B (reportedly $31M+) in mid-2024. The startup credit program grants up to $10,000 to qualifying companies — one of the most generous in the serverless GPU category.

Modal competes with RunPod, Replicate, Baseten (for AI inference), Lambda Labs, and serverless platforms such as AWS Lambda paired with GPU options. Its differentiation is per-second billing granularity (finest in the category), Python-native developer ergonomics (@app.function() decorators rather than Docker + Kubernetes manifests), and a founder whose creation of widely deployed open-source data tooling (Luigi, Annoy) gives the developer-experience pitch unusual credibility.

Pricing summary : How Modal’s per-second compute + plan-tier stack works

Modal runs a hybrid pricing model: pure-usage per-second compute (GPU, CPU, memory) plus flat monthly subscription fees that unlock concurrency limits, seat counts, and included credit grants. Starter is free with $30/month in compute credits, 3 workspace seats, 100 container concurrency, and 10 GPU concurrency — enough for evaluation and small production. Team is $250/month + compute with $100/month in credits, unlimited seats, 1,000 container concurrency, and 50 GPU concurrency. Enterprise is quote-based with volume discounts, dedicated capacity, VPC deployment, and AWS/GCP Marketplace billing.

The compute rate card is denominated per second across every SKU: T4 at $0.000164/sec, L4 at $0.000222, A10 at $0.000306, L40S at $0.000542, A100 40GB at $0.000583, A100 80GB at $0.000694, RTX PRO 6000 at $0.000842, H100 at $0.001097, H200 at $0.001261, B200 at $0.001736, and the newest B300 at $0.001972. CPU is $0.0000131/physical-core/sec (minimum 0.125 cores); memory is $0.00000222/GiB/sec; persistent volume storage is $0.09/GiB-month with 1 TiB/month free. Per-second granularity is 60× finer than per-minute billing and 3,600× finer than per-hour billing.
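
To see why billing granularity matters for short bursts, the sketch below prices a 7-second H100 run under three billing increments. It assumes each increment rounds runtime up to the next whole unit and that the same per-second rate applies in every case; the per-minute and per-hour platforms are hypothetical comparisons, not quotes from any vendor.

```python
import math

H100_PER_SEC = 0.001097  # USD per second, from the rate card

def billed_cost(runtime_s: float, increment_s: int,
                rate_per_sec: float = H100_PER_SEC) -> float:
    """Cost when runtime is rounded up to whole billing increments."""
    billed_seconds = math.ceil(runtime_s / increment_s) * increment_s
    return billed_seconds * rate_per_sec

for label, increment in [("per-second", 1), ("per-minute", 60), ("per-hour", 3600)]:
    print(f"{label:>10}: ${billed_cost(7, increment):.4f}")
```

Under these assumptions the same 7 seconds of work is billed as 7, 60, or 3,600 seconds, which is where the scale-to-zero advantage comes from.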

Two per-second modifiers ride on top of the base rate card: region selection costs 1.5–1.75× base prices, and non-preemptible (guaranteed, non-interruptible) execution costs 3× base prices — both apply across Starter, Team, and Enterprise. Modal’s Sandbox + Notebooks product is metered separately at a higher rate: CPU at $0.00003942/physical-core/sec and memory at $0.00000667/GiB/sec (GPU follows the standard rate card), reflecting the burst-oriented, over-allocation-free profile of interactive sandbox and notebook workloads.
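
The effect of each modifier on an hourly H100 price can be sketched as below. Each multiplier is applied on its own to the base rate; how the two modifiers combine when both apply is not stated in this document, so that case is deliberately not modeled.

```python
H100_BASE_PER_SEC = 0.001097  # USD per second, base rate from the rate card

# Multipliers quoted in this document, each applied independently.
MODIFIERS = {
    "base": 1.0,
    "region selection (low)": 1.5,
    "region selection (high)": 1.75,
    "non-preemptible": 3.0,
}

def hourly_cost(base_per_sec: float, multiplier: float) -> float:
    """Hourly price after applying a single rate multiplier."""
    return base_per_sec * multiplier * 3600

for name, multiplier in MODIFIERS.items():
    print(f"{name:<24} ${hourly_cost(H100_BASE_PER_SEC, multiplier):.2f}/hour")
```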

What makes this different: Modal is a Python-native serverless platform where @app.function() decorators are the primary interface, with no Dockerfile or Kubernetes manifest in the canonical workflow. For data scientists and ML engineers who write Python but not YAML, this developer-experience choice is a material improvement over Dockerfile-and-manifest workflows, and the founder’s open-source pedigree (Luigi, Annoy) makes the experience pitch credible.

Pricing by product

Compute per second (the canonical SKU)

Resource	Per-second rate
Nvidia T4 (16GB)	$0.000164
Nvidia L4 (24GB)	$0.000222
Nvidia A10 (24GB)	$0.000306
Nvidia L40S (48GB)	$0.000542
Nvidia A100 (40GB)	$0.000583
Nvidia A100 (80GB)	$0.000694
Nvidia RTX PRO 6000	$0.000842
Nvidia H100 (80GB)	$0.001097
Nvidia H200 (141GB)	$0.001261
Nvidia B200	$0.001736
Nvidia B300	$0.001972
CPU (physical core)	$0.0000131 (min 0.125 cores)
Memory	$0.00000222/GiB

Per-second rate modifiers (apply on all plans): region selection is billed at 1.5–1.75× base prices; non-preemptible (guaranteed, non-interruptible) execution is billed at 3× base prices.
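
Because most competing rate cards are quoted per hour, it can help to convert the per-second table into hourly equivalents. The snippet below is a plain unit conversion of the base rates listed above, with no modifiers applied.

```python
# Base per-second GPU rates from the table above (USD).
GPU_RATES_PER_SEC = {
    "T4": 0.000164, "L4": 0.000222, "A10": 0.000306, "L40S": 0.000542,
    "A100 40GB": 0.000583, "A100 80GB": 0.000694, "RTX PRO 6000": 0.000842,
    "H100": 0.001097, "H200": 0.001261, "B200": 0.001736, "B300": 0.001972,
}

# Hourly equivalent = per-second rate × 3,600 seconds.
hourly = {gpu: rate * 3600 for gpu, rate in GPU_RATES_PER_SEC.items()}

for gpu, usd in hourly.items():
    print(f"{gpu:<13} ${usd:.2f}/hour")
```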

Sandbox + Notebooks (separately metered compute)

Resource	Per-second rate
CPU (physical core)	$0.00003942 (min 0.125 cores)
Memory	$0.00000667/GiB
GPU	Standard per-second rate card (above)

Storage (persistent volumes)

Resource	Rate
First 1 TiB/month	Free
Above 1 TiB	$0.09/GiB-month
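
A minimal sketch of the storage tiering, assuming the free 1 TiB is deducted from the average amount stored during the month and that only the remainder is billed:

```python
FREE_GIB = 1024             # 1 TiB free per month
RATE_PER_GIB_MONTH = 0.09   # USD per GiB-month above the free tier

def monthly_storage_cost(stored_gib: float) -> float:
    """Monthly volume cost; only storage above the free tier is billed."""
    billable_gib = max(0.0, stored_gib - FREE_GIB)
    return billable_gib * RATE_PER_GIB_MONTH

for gib in (50, 1024, 3 * 1024):
    print(f"{gib:>5} GiB -> ${monthly_storage_cost(gib):.2f}/month")
```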

Plan tiers

Tier	Monthly fee	Included credits	Seats	Container concurrency	GPU concurrency	Deployed apps	Log retention
Starter	$0	$30/month	Up to 3	100	10	200	1 day
Team	$250 + compute	$100/month	Unlimited	1,000	50	1,000	30 days
Enterprise	Custom	Custom	Unlimited	Custom	Custom	1,000	Custom

Sales motions across products: PLG / self-serve for Starter and Team; sales-led for Enterprise commits, private-Slack support, and AWS/GCP marketplace billing. All prices accessed 2026-07-14 from modal.com/pricing.

Hidden costs : What Modal customers actually pay beyond the per-second rates

Archetype A: Solo developer on Starter running batch inference on H100

A solo developer running occasional H100 batch jobs (~30 minutes/day, scale-to-zero in between):

Line item	Monthly cost
H100 compute (30 min/day × 30 × 60 × $0.001097)	$59
CPU + memory overhead (1 core, 16 GiB, ~5% overhead)	$4
Volume storage for model weights (50 GiB, under 1 TiB free)	$0
Egress (small response payloads, negligible)	<$1
Subtotal compute	$63
Starter credit ($30/mo)	-$30
Estimated total	~$33/month

The Starter $30 credit covers roughly half the bill for typical small batch workloads. Pure scale-to-zero on per-second granularity means a 5-minute idle period costs nothing — a meaningful advantage over per-minute or per-hour platforms.
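
The Archetype A estimate can be reproduced with a few lines. This sketch holds the CPU and memory overhead at the table's $4 estimate regardless of usage (a simplifying assumption) and floors the bill at zero once the credit exceeds the spend.

```python
H100_PER_SEC = 0.001097   # USD per second, from the rate card
STARTER_CREDIT = 30.0     # USD per month

def starter_bill(minutes_per_day: float, days: int = 30,
                 overhead_usd: float = 4.0) -> float:
    """Estimated monthly bill after the Starter credit (never below zero)."""
    gpu_usd = minutes_per_day * 60 * days * H100_PER_SEC
    return max(0.0, gpu_usd + overhead_usd - STARTER_CREDIT)

print(f"30 min/day: ${starter_bill(30):.2f}")
print(f"10 min/day: ${starter_bill(10):.2f}")
```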

Archetype B: Mid-market team on Team plan running mixed inference + training

A 20-person AI engineering team on Team plan running production inference (A100 80GB during business hours) plus weekly training (H100 cluster):

Line item	Monthly cost
Team plan fee	$250
A100 80GB inference (8h/day × 30 × 60 × 60 × $0.000694)	$600
H100 training (4h/week × 4 × 60 × 60 × $0.001097)	$63
CPU + memory for orchestration (~$20)	$20
Storage (200 GiB models, under 1 TiB free)	$0
Subtotal	$933
Team credit ($100/mo)	-$100
Estimated total	~$833/month

For sustained mid-market workloads, Team plan economics are reasonable: the $250 plan fee covers seats and higher concurrency limits, the $100 credit offsets ~15% of compute, and per-second billing keeps idle costs near zero. The fixed $250 plan fee is the main differentiator from pure pay-as-you-go competitors.
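
The Archetype B line items follow directly from the stated formulas. The sketch below recomputes them from the per-second rates; the $20 orchestration figure is taken as given, and the credit is subtracted from the full subtotal as in the table.

```python
SECONDS_PER_HOUR = 3600
TEAM_CREDIT = 100.0  # USD per month

line_items = {
    "Team plan fee": 250.0,
    # 8 h/day × 30 days at the A100 80GB rate
    "A100 80GB inference": 8 * 30 * SECONDS_PER_HOUR * 0.000694,
    # 4 h/week × 4 weeks at the H100 rate
    "H100 training": 4 * 4 * SECONDS_PER_HOUR * 0.001097,
    "CPU + memory orchestration": 20.0,
    "Storage (under 1 TiB free)": 0.0,
}

subtotal = sum(line_items.values())
total = subtotal - TEAM_CREDIT
compute = subtotal - line_items["Team plan fee"]
credit_share = TEAM_CREDIT / compute  # fraction of compute offset by the credit

for name, usd in line_items.items():
    print(f"{name:<28} ${usd:,.0f}")
print(f"Subtotal ${subtotal:,.0f}, total ${total:,.0f}, "
      f"credit covers {credit_share:.0%} of compute")
```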

Want to estimate your own Modal bill? Use the Modal pricing calculator to model GPU seconds, CPU core-seconds, memory, and storage by workload profile.

Pricing evolution : Modal’s pricing history from Python serverless to multi-tier platform

Cadence

Quarter	Price changes	Product / SKU additions	Notes
2021 Q3	0	1	Modal founded; closed beta
2023 Q1	0	1	Public beta + Series A ($16M) + per-second pricing
2024 Q1	0	1	Team plan launched at $250/mo + compute
2024 Q2	1	0	Series B + H100 per-second pricing published
2025 Q1	0	1	Persistent volumes + 1 TiB/mo free storage
2025 Q3	0	1	B200 + H200 + L40S per-second pricing added
2025 Q4	0	0	Startup credit program expanded to $10k
2026 Q1	0	1	AWS + GCP Marketplace billing for Enterprise
2026 Q3	0	2	B300 GPU + separately-metered Sandbox/Notebooks tier; region & non-preemptible modifiers itemized

Tracked range: 2021 Q3–2026 Q3. Quarters not listed above were verified stable (0 price changes, 0 SKU additions).

Notable changes

2023-02-15 — Public beta launch + Series A; established per-second CPU/GPU/memory billing as the canonical pricing model.
2024-01-22 — Team plan launched at $250/month + compute with $100/month in credits and higher concurrency limits.
2024-06-10 — Series B + H100 per-second pricing at $0.001097/sec ($3.95/hour) published as competitive differentiator.
2025-03-19 — Persistent volumes launched at $0.09/GiB-month with 1 TiB/month free; vertically integrated storage with compute.
2025-09-04 — B200, H200, L40S added at per-second rates; maintained per-second granularity across new generations.
2025-12-11 — Startup credit program expanded to $10,000 grants; most generous in the category.
2026-02-25 — AWS + GCP Marketplace billing added for Enterprise contracts.
2026-07-14 — B300 (Blackwell Ultra) added at $0.001972/sec as the new top-end SKU above B200, extending the per-second rate card to the latest Blackwell generation without breaking granularity. Two structural additions accompanied it: a separately-metered Sandbox + Notebooks compute tier (CPU $0.00003942/core/sec, memory $0.00000667/GiB/sec — roughly 3× the standard function rates, pricing the burst-oriented interactive profile explicitly), and the first on-page itemization of two per-second modifiers — region selection at 1.5–1.75× and non-preemptible (guaranteed) execution at 3× base prices. All existing GPU/CPU/memory/storage rates and plan fees ($0 / $250 / Custom) were unchanged, so the headline economics held while the rate card gained resolution.

What’s unique : Modal’s distinctive pricing mechanics

1. Per-second billing across every SKU (not just GPU). Modal bills GPU, CPU, and memory all per second, among the finest billing granularities in serverless GPU compute. For bursty serverless workloads where containers spin up for 2–10 seconds and shut down, this granularity makes scale-to-zero economics genuinely cheap in a way that per-minute or per-hour billing cannot match.

2. Python-decorator-as-deployment-API is the canonical interface. Most serverless platforms accept Docker images and Kubernetes manifests; Modal’s primary interface is @app.function(gpu="H100") in a Python file. For data scientists and ML engineers who write Python but not infrastructure-as-code, this is a meaningful developer-experience cost advantage — and the founder’s open-source pedigree (Luigi, Annoy) makes the experience pitch credible.

3. Subscription-plus-compute hybrid (Team $250/mo + compute). Most serverless GPU competitors offer pure pay-as-you-go. Modal’s Team plan adds a $250/month flat fee for seats, higher concurrency, Slack support, and $100 in compute credits. This subscription-plus-usage hybrid is unusual in the category and signals Modal’s bet that mid-market teams will pay for predictable plan economics even when usage is variable.

4. Generous startup credit program (up to $10k). Modal grants up to $10,000 in credits to qualifying startups and academic researchers — substantially more generous than competitors’ typical $30–$500 trial offers. This credit-led PLG strategy prioritizes long-term developer adoption over short-term revenue from evaluation traffic.

5. Vertical storage integration with free 1 TiB/month tier. Persistent volumes at $0.09/GiB-month with 1 TiB free per month absorb model-weight and dataset storage without forcing customers to S3. For ML workloads where checkpoints and datasets compose most storage, this bundling reduces vendor sprawl and simplifies developer workflow — at the cost of multi-cloud flexibility.

Strengths & weaknesses

Strengths	Weaknesses
Finest billing granularity in cloud compute (per second, every SKU)	Network egress and function invocation charges not itemized on pricing page
Python-decorator-as-deployment-API is the most opinionated DX in serverless GPU	Python-only orientation limits TAM for teams using Go, Rust, or polyglot stacks
Founder credibility from Luigi + Annoy open-source pedigree	Team plan $250/mo + compute is higher than pure pay-as-you-go competitors
Startup credit program (up to $10k) among most generous in category	Storage tier ($0.09/GiB-month) above free 1 TiB is roughly 4× S3 Standard pricing
Subscription-plus-usage hybrid (Team) captures predictable mid-market spend	Cold-start latency on infrequently-used containers (~3–8 seconds)
Free 1 TiB/month storage absorbs ML model weights without S3	Concurrency limits gate scale-up — exceeding requires plan upgrade or commit
Region and non-preemptible rate multipliers now published on the pricing page (2026-07-14)	Headline per-second rates assume preemptible, in-default-region scheduling — guaranteed (non-preemptible) execution costs 3× base and region selection 1.5–1.75×

Billing UX : Modal’s account controls and payment experience

Self-serve signup — Sign up at modal.com/signup with email or GitHub; Starter tier $30 credits applied automatically. No credit card required to deploy first function.
Real-time per-second cost meter — Console shows per-function per-container cost in real time; each task has a live cost projection.
Workspace and project organization — Workspace-level usage aggregation; per-environment (dev / staging / prod) separation supported.
Workspace- and Environment-level budgets — Spend budgets configurable at both the Workspace and per-Environment (dev / staging / prod) level; incremental usage is auto-charged the first time certain thresholds are exceeded within a cycle.
Programmatic billing reports (Team & Enterprise) — modal.billing APIs and the modal billing CLI export tabular spend-over-time reports broken down by App or resource (pre-credit / pre-reservation figures).
Cost-attribution tags — Key-value tags on Apps (tags={"team": "llm-platform"}) let spend be allocated across teams and projects in billing reports.
Payment methods — Stripe-hosted payment management (card / ACH) on Starter and Team; invoiced billing, international bank transfer, and split invoices available to Enterprise customers with a usage commitment; AWS/GCP Marketplace committed-spend billing on Enterprise.
Credit grants — Startup and academic credit grants up to $10k applied to workspace balance; offset against compute, storage, and Team plan fees.
Annual commit pricing — Enterprise customers receive volume discounts in exchange for annual usage commitments and dedicated capacity reservations.
Audit logging + RBAC — Workspace-level RBAC on Team+; SOC 2 audit-log exports on Enterprise.
Multi-region availability — US standard; EU and APAC regions on Enterprise with VPC deployment.

Strategic wins : Why Modal’s pricing decisions worked

1. Per-second billing granularity as the structural cost moat

By billing GPU, CPU, and memory per second rather than per minute or hour, Modal made scale-to-zero economics genuinely cheap — and forced competitors to either match the granularity or accept the comparison gap. For bursty serverless workloads, the unit-economics advantage is real and reproducible. Most legacy cloud platforms cannot match per-second billing without re-architecting their metering infrastructure.

2. Python-decorator-as-deployment-API removed the YAML / Docker friction barrier

Most serverless platforms require Dockerfile + Kubernetes manifests; Modal’s @app.function() decorator hides the infrastructure entirely. For data scientists and ML engineers — Modal’s primary target persona — this is a meaningful productivity gain. The fact that Erik Bernhardsson created Luigi and Annoy makes the developer-experience promise credible in a way that pure-marketing positioning cannot replicate.

3. Generous startup credit program as the developer-led GTM engine

The up-to-$10k startup credit grants signal Modal’s bet on long-term developer adoption over short-term revenue from evaluation traffic. AI-native startups that build their initial product on Modal-funded compute tend to stay on Modal at scale — a classic PLG flywheel that competitors with smaller trial credits cannot match.

4. Team plan ($250/mo + compute) captures predictable mid-market revenue

Most serverless GPU competitors are pure pay-as-you-go, leaving predictable revenue uncaptured. The Team plan flat fee covers seats, concurrency, and Slack support — capturing customers who value predictability even at variable usage. This hybrid pricing is also a leading indicator of mature mid-market product-market fit.

Areas to improve : Gaps in Modal’s pricing approach

1. Egress and function-invocation charges are still not itemized

The pricing page lists compute, memory, and storage rates but not network egress or per-function-invocation charges. Modal moved in the right direction on 2026-07-14 by itemizing the region (1.5–1.75×) and non-preemptible (3×) rate modifiers that had previously been implicit — a genuine transparency step that turned two hidden multipliers into published line items. But egress and per-invocation costs remain the largest un-itemized items on the page, so for high-volume customers the residual opacity still creates bill-shock risk and erodes the per-second transparency advantage. Publishing egress rates (even at a “first X GB/month free, then $Y/GB” structure) would close the remaining transparency gap versus AWS, GCP, and Azure rate cards.

2. Python-only orientation limits TAM

The @app.function() decorator pattern is Python-only. Teams using Go, Rust, or polyglot stacks cannot use Modal as their primary serverless platform. Adding a generic container interface (with Docker images as input, similar to RunPod and Replicate) would broaden TAM without abandoning the Python-native DX as the canonical path.

3. Storage above 1 TiB is expensive relative to S3

At $0.09/GiB-month, Modal’s storage tier is roughly 4× S3 Standard. For ML workloads with multi-TiB dataset storage, customers will need to either offload to S3 (introducing the vendor sprawl Modal’s free 1 TiB was designed to prevent) or accept significantly higher storage spend. A tiered storage rate (cheaper for archival, premium for hot) would close the cost gap.

4. Concurrency limits create scale-up friction

Starter caps at 100 container concurrency / 10 GPU concurrency; Team at 1,000 / 50. Customers hitting these limits during burst events face either degraded service or a plan-upgrade scramble. Soft caps with overage billing (rather than hard caps with service throttling) would smooth the scale-up experience and reduce churn at the boundary.
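
The plan boundaries described above amount to a simple lookup. The sketch below picks the smallest self-serve tier whose published caps cover a peak workload and falls back to Enterprise (custom limits) otherwise; the function and data layout are illustrative, not a Modal API.

```python
# (plan name, container concurrency cap, GPU concurrency cap)
PLAN_LIMITS = [
    ("Starter", 100, 10),
    ("Team", 1000, 50),
]

def smallest_plan(peak_containers: int, peak_gpus: int) -> str:
    """Return the first plan whose caps cover the peak workload."""
    for name, max_containers, max_gpus in PLAN_LIMITS:
        if peak_containers <= max_containers and peak_gpus <= max_gpus:
            return name
    return "Enterprise"  # custom limits, negotiated per contract

print(smallest_plan(80, 8))
print(smallest_plan(80, 12))
print(smallest_plan(2000, 10))
```

Note that either cap can force the upgrade: a workload with only 80 containers still needs Team once it exceeds 10 concurrent GPUs.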

Monetization stack & signals : how Modal builds & buys its revenue engine

Buys: 2 · Builds: 1 · Signal roles: 1

The read — where the monetization investment is going

Modal buys its analytics warehouse (Snowflake + dbt, confirmed on its own engineering blog) while the GPU-cost instrumentation behind its per-second pricing stays in-house, which fits a serverless GPU vendor whose gross margin depends directly on its own fleet utilization. The signal worth watching is the Business Operations hire below: a self-serve/PLG company spinning up its first deal desk alongside a dedicated pricing-and-packaging analytics function is the classic point at which a sales-led motion is layered onto a self-serve core.

Stack — build vs buy

Builds in-house · 1

In-house GPU-utilization / cost instrumentation · Cost & FinOps · inferred · Blog, Feb 2025

“At Modal, we have our own GPU utilization challenges to solve and we help our users solve theirs — an educational guide; an internal cost/metering system is implied by the fleet-margin model, not disclosed as a named build.”

Buys (vendor) · 2

Snowflake · Data platform · Blog (1), Job post (2) · Jun 2026

“one of our most important data loading use cases is copying our production read replica Postgres instance to Snowflake, our data warehouse.”
dbt · Data platform · Blog, Sep 2024

“After our data has been loaded into Snowflake, we still need to transform it to make it analysis ready. dbt is the de facto standard for this — Modal's own transformation layer over Snowflake.”

Unconfirmed · 2

rev-rec · Revenue recognition · inferred · Job post, Jun 2026

“Hands-on experience with QuickBooks, NetSuite, Ramp, and other systems — Controller JD (named in a candidate-skills basket alongside QuickBooks; not confirmed as Modal's operated GL).”
Metering · Metering · inferred

What the hiring reveals

Business Operations Manager · Deal desk, Monetization · Jun 22, 2026

Modal's pure-PLG self-serve core is acquiring a sales-led spine: one BizOps generalist owns BOTH the pricing-and-packaging analytics and standing up the first deal desk — the textbook inflection where a usage-billed product layers negotiated enterprise contracts onto its self-serve meter.

“Drive in-depth quantitative analyses to inform our pricing and packaging strategy. Help spin up our deal desk and streamline enterprise deals.”

3 more matched roles — supporting evidence

Member of Technical Staff - Product (Growth) · Growth · May 18, 2026
VP, Finance · RevOps · seen Feb 24, 2026
Controller · RevOps · seen Sep 19, 2025

Signals reviewed Jun 2026 · derived from public job posts, engineering blogs

Job postings fill and close over time; once a posting is filled we keep it as a dated citation (the quoted evidence remains).

Key takeaways

Per-second billing granularity is a structural cost moat that legacy clouds cannot match. Modal’s decision to bill GPU, CPU, and memory per second forced a category-wide expectation shift — competitors now have to either match or justify per-minute / per-hour billing. For usage-based platforms where scale-to-zero is a marketing claim, billing granularity must match the claim.
Founder open-source pedigree is the most credible developer-experience trust anchor. Erik Bernhardsson’s authorship of Luigi and Annoy gives the Python-native DX promise weight that pure-engineering teams cannot replicate. Infrastructure commercializations targeting developer audiences should consider founder open-source visibility as a structural advantage.
Subscription-plus-usage hybrids capture predictable mid-market revenue. The Team plan ($250/mo + compute) is unusual among serverless GPU competitors and likely reflects mature mid-market product-market fit. Other usage-based platforms considering subscription components should look at Modal’s Team economics as a reference.
Generous startup credit programs are PLG levers, not lost revenue. Modal’s up-to-$10k credit grants signal long-term thinking: AI-native startups that build on Modal-funded compute tend to stay at scale. Trial credit generosity is a strategic GTM choice, not just a marketing line item.
Vertical bundling (storage with compute) simplifies developer experience at the cost of multi-cloud flexibility. Modal’s free 1 TiB/month storage absorbs ML model weights and reduces S3 dependence — a developer-experience win that comes at the cost of locking customers into Modal’s storage economics. For value-metric design, the bundling trade-off is structural.

UBP implications

Per-second billing is becoming the new transparency floor for serverless compute. Per-minute or per-hour billing is increasingly perceived as legacy-cloud opacity. New usage-based platforms entering serverless GPU should default to per-second granularity as a baseline expectation.
Subscription-plus-usage hybrids capture revenue that pure pay-as-you-go leaves on the table. Mid-market customers value predictable plan economics even when usage is variable; the Modal Team plan demonstrates the segment exists. Other usage-based platforms should consider hybrid SKUs for the predictable-spend mid-market.
Developer-experience as a pricing-adjacent value metric. Modal’s Python-decorator API is not a pricing mechanic, but it shapes which customers can use the platform — and which can therefore become paying customers. For value-metric pricing, the developer-experience surface area is as load-bearing as the rate card.

Sources

Modal pricing page (accessed 2026-05-30)
Modal docs — billing (accessed 2026-05-30)
Modal blog — Series B announcement (accessed 2026-05-29)
Modal startup credit program (accessed 2026-05-29)
Erik Bernhardsson — Luigi and Annoy author (accessed 2026-05-29)
Related infra blueprint — Baseten
Related infra blueprint — Replicate
Blueprint corpus index

Bottom line

Modal priced its serverless compute platform around three structural ideas: per-second billing granularity (finest in the category, GPU + CPU + memory) that makes scale-to-zero economics genuinely cheap, Python-decorator-as-deployment-API (@app.function(gpu="H100")) as the canonical interface for data scientists and ML engineers who write Python but not YAML, and founder credibility from Erik Bernhardsson’s open-source pedigree (Luigi, Annoy) that makes the developer-experience promise believable. The subscription-plus-usage hybrid (Team plan at $250/mo + compute) captures predictable mid-market revenue, and the up-to-$10k startup credit grant program drives long-term developer adoption.

For AI-native engineering teams that prioritize Python developer experience and per-second cost transparency over multi-cloud flexibility, Modal is the most ergonomic serverless GPU platform on the market. The remaining gaps (egress not itemized, Python-only orientation, storage above 1 TiB expensive, concurrency hard-caps) are TAM-expansion and transparency-polish problems rather than structural pricing flaws.

Compare with peers via the blueprint corpus, or model your own spend with the Modal pricing calculator.

Pricing timeline : Major events on a vertical axis

Each milestone below corresponds to a public pricing change, product launch, or material adjustment.

B300 GPU + Sandbox/Notebooks Pricing + Rate Modifiers

Jul 2026

Modal added the NVIDIA B300 (Blackwell Ultra) at $0.001972/sec — its new top-end GPU SKU above the B200 ($0.001736/sec). A separately-metered Sandbox + Notebooks compute tier appeared at $0.00003942/core/sec CPU and $0.00000667/GiB/sec memory (roughly 3× the standard function rates, reflecting burst-oriented interactive workloads). The pricing page also now itemizes two per-second modifiers: region selection at 1.5–1.75× base prices and non-preemptible (guaranteed) execution at 3× base prices. All existing GPU, CPU, memory, storage rates and plan fees ($0 / $250 / Custom) were unchanged.

AWS + GCP Marketplace Billing for Enterprise

Feb 2026

Modal added support for AWS Marketplace and GCP Marketplace billing on Enterprise contracts, letting customers consume Modal spend through existing cloud-provider committed-spend agreements. Reduces procurement friction for Fortune-500 buyers with EDP or CUD agreements.

Startup Credit Program Expanded to $10k

Dec 2025

Modal expanded its startup credit grant program to up to $10,000 in credits for qualifying companies and academic researchers. Established Modal as the most generous credit-grant program among serverless GPU competitors, reflecting developer-led GTM emphasis.

B200 + H200 + L40S Added at Per-Second Rates

Sep 2025

Modal added NVIDIA B200 (180GB) at $0.001736/sec, H200 (141GB) at $0.001261/sec, and L40S (48GB) at $0.000542/sec. The Blackwell-class addition came alongside continued per-second billing rather than per-hour, preserving granularity advantage at the high end.

Persistent Volumes + 1 TiB/mo Free Storage

Mar 2025

Modal launched persistent volumes at $0.09/GiB-month with 1 TiB/month free. The free tier absorbed model-weight storage for most production workloads, vertically integrating storage with compute. Eliminated S3-dependence for many use cases.

Series B + H100 Per-Second Pricing

Jun 2024

Modal raised a $31M+ Series B (reportedly led by Lux Capital with Redpoint and others) and published H100 per-second pricing at $0.001097/sec — roughly $3.95/hour, undercutting major competitors' on-demand H100 rates. Per-second granularity emphasized as a competitive differentiator.
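The hourly equivalent and the effect of billing granularity can be checked with a few lines of arithmetic. The sketch below is illustrative only: `per_hour` and `billed_cost` are hypothetical helper names, and the 60-second increment is a made-up comparison point, not a rate published by any particular provider.

```python
import math

H100_PER_SEC = 0.001097  # USD, rate quoted above


def per_hour(rate_per_sec: float) -> float:
    """Convert a per-second rate to its hourly equivalent."""
    return rate_per_sec * 3600


def billed_cost(runtime_sec: float, rate_per_sec: float,
                granularity_sec: int = 1) -> float:
    """Round the runtime up to the next billing increment, then price it."""
    increments = math.ceil(runtime_sec / granularity_sec)
    return increments * granularity_sec * rate_per_sec


if __name__ == "__main__":
    print(f"H100 hourly equivalent: ${per_hour(H100_PER_SEC):.2f}")
    # A 5-second run billed per second vs. rounded up to a full minute.
    print(f"5 s, per-second billing: ${billed_cost(5, H100_PER_SEC, 1):.4f}")
    print(f"5 s, per-minute billing: ${billed_cost(5, H100_PER_SEC, 60):.4f}")
```

For short, bursty runs the rounding dominates: the same 5 seconds of H100 time costs 12 times as much when rounded up to a full minute.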

Team Plan Launched at $250/mo + Compute

Jan 2024

Modal introduced its Team plan at $250/month plus compute, with $100/month in included credits. The plan added unlimited workspace seats, 1,000 container concurrency, and 50 GPU concurrency — bridging the gap between free Starter and quote-based Enterprise tiers.

Public Beta + Series A ($16M)

Feb 2023

Modal went into public beta and raised a $16M Series A led by Redpoint Ventures. Pricing model established: per-second CPU, GPU, and memory billing with a free tier of $30/month credits. Positioned as the developer-experience leader in serverless compute.

Modal Founded

Sep 2021

Erik Bernhardsson (creator of Luigi and Annoy, ex-Spotify ML) and Akshat Bubna founded Modal Labs to build serverless infrastructure for Python data and ML workloads. The initial closed beta launched with a function-decorator pattern (written today as @app.function()) as the canonical interface.

Trivia

· Modal's per-second billing ($0.001097/sec on H100) means a 5-second cold start costs about $0.0055. That granularity is 60× finer than per-minute billing and 3,600× finer than per-hour billing.
· Modal was founded in 2021 by Erik Bernhardsson (ex-Spotify ML, creator of Luigi and Annoy) — making it one of the few infrastructure platforms where the founder is the primary author of widely-deployed open-source data tooling, lending unusual credibility to the developer-experience pitch.
· Modal grants up to $10,000 in credits to qualifying startups and academic researchers — one of the most generous credit programs in cloud compute, reflecting the founder's bet on developer-led GTM rather than enterprise sales-led growth.

Questions & answers

How much does Modal cost per month?: Modal's Starter tier is free with $30/month in compute credits, plus pay-as-you-go above that. Team plan is $250/month + compute (with $100/month in credits). A typical mid-volume workload running an A100 80GB ($0.000694/sec) for 4 hours/day would cost about $300/month in compute on top of the plan fee.
What GPU rates does Modal publish?: Modal publishes per-second rates: T4 $0.000164, L4 $0.000222, A10 $0.000306, L40S $0.000542, A100 40GB $0.000583, A100 80GB $0.000694, RTX PRO 6000 $0.000842, H100 $0.001097, H200 $0.001261, B200 $0.001736, B300 $0.001972. All bill per second of active use; idle containers cost only memory and storage. Region selection multiplies base rates by 1.5–1.75×, and non-preemptible execution multiplies them by 3×.
What does Modal include in the free Starter tier?: Starter is free with $30/month in compute credits, 3 workspace seats, 100 container concurrency, and 10 GPU concurrency. At GPU rates alone, the credits cover roughly 50 hours of T4 ($30 ÷ $0.59/hour) or 7.5 hours of H100 ($30 ÷ $3.95/hour). Sufficient for evaluation and small-scale production; no credit card required.
How is Modal's CPU and memory billed?: CPU is $0.0000131 per physical core per second (minimum 0.125 cores per task); memory is $0.00000222 per GiB per second. A typical lightweight function using 1 physical core and 4 GiB for 60 seconds costs about $0.0013 (about $0.0008 for CPU plus $0.0005 for memory). Persistent volume storage is $0.09/GiB-month with 1 TiB/month free.
Does Modal offer startup credits?: Yes — qualifying startups and academic researchers can receive up to $10,000 in Modal credits. Application is via Modal's startup program; eligibility typically requires early-stage status, ML/AI focus, or academic affiliation. Among the most generous credit grants in serverless GPU.
How does Modal Team plan differ from Enterprise?: Team ($250/mo + compute, $100/mo credits) offers unlimited seats, 1,000 container concurrency, 50 GPU concurrency, and standard compliance (SOC 2). Enterprise is quote-based and adds custom volume discounts, dedicated capacity reservations, advanced security (audit logs, RBAC), VPC deployment, and marketplace billing via AWS/GCP.
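
The arithmetic behind these answers can be reproduced with a small estimator. This is a sketch under stated assumptions, not Modal's billing logic: the function names are hypothetical, a month is taken as 30 days, and the GPU figures count the GPU rate only (CPU and memory used alongside the GPU would be billed on top).

```python
# Per-second rates quoted in the answers above (USD).
T4_PER_SEC = 0.000164
A100_80GB_PER_SEC = 0.000694
H100_PER_SEC = 0.001097
CPU_PER_CORE_SEC = 0.0000131
MEM_PER_GIB_SEC = 0.00000222


def monthly_gpu_cost(rate_per_sec: float, hours_per_day: float,
                     days: int = 30) -> float:
    """GPU-only cost of running hours_per_day for a month (assumed 30 days)."""
    return rate_per_sec * 3600 * hours_per_day * days


def gpu_hours_for_credits(credits_usd: float, rate_per_sec: float) -> float:
    """Hours of GPU time a credit balance buys at the GPU rate alone."""
    return credits_usd / (rate_per_sec * 3600)


def cpu_memory_cost(cores: float, mem_gib: float, seconds: float) -> float:
    """Cost of a CPU-only task: physical cores plus memory, billed per second."""
    return seconds * (cores * CPU_PER_CORE_SEC + mem_gib * MEM_PER_GIB_SEC)


if __name__ == "__main__":
    print(f"A100 80GB, 4 h/day: ${monthly_gpu_cost(A100_80GB_PER_SEC, 4):.2f}/month")
    print(f"$30 of T4:   {gpu_hours_for_credits(30, T4_PER_SEC):.1f} h")
    print(f"$30 of H100: {gpu_hours_for_credits(30, H100_PER_SEC):.1f} h")
    print(f"1 core + 4 GiB for 60 s: ${cpu_memory_cost(1, 4, 60):.4f}")
```

Swapping in a different rate constant or duty cycle gives a first-pass monthly estimate before plan fees, credits, and any rate multipliers are taken into account.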