Modal
Cloud platform for serverless GPU computing that deploys ML inference and batch jobs as Python functions.
Modal deploys Python functions as serverless GPU jobs: no Kubernetes, no Dockerfiles, just decorated Python functions.
Explanation
Modal eliminates infrastructure management: Python functions are decorated with @app.function and run in Modal's cloud, with GPU access requested per function. Additional features include container image caching, secrets management, scheduled jobs, and web endpoints.
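A minimal sketch of this pattern, based on Modal's documented API (modal.App, @app.function, .remote(), @app.local_entrypoint); the app name, GPU type, and function body here are illustrative, and exact parameters may differ by Modal version:

```python
import modal

app = modal.App("example-inference")  # app name is illustrative

# Request a GPU for this function; Modal provisions a container on demand.
@app.function(gpu="T4")
def generate(prompt: str) -> str:
    # Real model inference would run here, inside Modal's cloud container.
    return f"completion for: {prompt}"

@app.local_entrypoint()
def main():
    # .remote() executes the function in Modal's cloud rather than locally.
    print(generate.remote("hello"))
```

Launched with `modal run app.py`, Modal builds and caches the container image, schedules the function on a GPU worker, and streams the result back to the local entrypoint.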
Marketing Relevance
Modal is ideal for ML teams needing GPU compute without Kubernetes or cloud infrastructure expertise.
Common Pitfalls
Vendor lock-in through Modal-specific APIs and decorators; cold-start latency for infrequently called functions; costs that can climb quickly under GPU-intensive workloads.
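The cold-start pitfall can be traded for idle cost by keeping containers warm. A hedged sketch using the keep_warm option documented for @app.function (newer Modal releases may expose this behavior under a different parameter name):

```python
import modal

app = modal.App("warm-example")  # app name is illustrative

# keep_warm=1 asks Modal to keep one idle container running at all times,
# avoiding cold starts at the price of continuous billing for that container.
@app.function(gpu="T4", keep_warm=1)
def infer(prompt: str) -> str:
    # Placeholder for latency-sensitive inference.
    return prompt.upper()
```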
Origin & History
Modal was founded in 2021 by Erik Bernhardsson (formerly of Spotify). The platform quickly gained traction in the ML community thanks to its simple GPU provisioning, and in 2024 the company raised a Series B of over $100M.
Comparisons & Differences
Modal vs. Replicate
Replicate specializes in model hosting via Cog; Modal is a general-purpose serverless GPU platform for arbitrary Python code.
Modal vs. AWS Lambda
Lambda is CPU-only serverless compute; Modal offers serverless GPUs with container image support and ML-focused optimizations.