Parallel token generation and on-device SLMs are breaking token-based AI billing. Flat subscriptions cover local inference; usage overages handle cloud fallback.