Vaclav Kosar
Software And Machine Learning Blog

BentoML vs Cortex - ML Serving Showdown

A comparison of the open-source MLOps platforms BentoML and Cortex, to help you find the best model serving tool.

BentoML and Cortex logo

Note, however, that I have much more experience with Cortex.


  • Both deploy and serve models via an API.
  • Both support major ML frameworks (TensorFlow, PyTorch).
  • Both have good documentation.

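In both tools, serving a model comes down to exposing a predict function over HTTP. A minimal stdlib-only sketch of that idea, with a dummy model standing in for a real framework call (this is not BentoML's or Cortex's actual API; both generate an equivalent server for you):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    # Dummy "model": summing the inputs stands in for a real framework call.
    return {"prediction": sum(features)}

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read a JSON request body like {"features": [1.0, 2.0]}.
        length = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(predict(payload["features"])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), PredictHandler).serve_forever()
```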

| Feature | BentoML | Cortex |
|---|---|---|
| Language | Python - easier to modify and integrate? | Go |
| Deployment | Delegated to other tools: Docker, Kubernetes, AWS Lambda, SageMaker, GCP, Azure ML, and more | Currently works only with local Docker and AWS EKS (GCP is planned) |
| Service configuration | Via decorators directly in Python code | Via YAML files, or the Cortex Serving Client for Python |
| Service packaging and distribution | Can be packaged and saved via a Python command to a management repository with a web dashboard, or to PyPI | Only via Docker images, without explicit support |
| Horizontal scaling | Configured separately in other clustering tools; an opinionated Kubernetes deployment is in the works | Part of Cortex configs, but thus less flexible (private cloud or HTTPS deploys require custom scripts) |
| User interface | CLI, Web UI | CLI |
| API auto-docs | Swagger/OpenAPI | N/A |
| Metrics | Prometheus metrics | AWS EKS metrics |
| User support | Responsive unpaid Slack channel, though Slack is not the best tool for support | Very responsive Gitter |
Suggest anything else?
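To make the "via YAML files" row concrete, a Cortex deployment is described declaratively, one entry per API. A sketch of such a config under the Cortex versions of this era; the field names are from memory and may differ between versions, and `predictor.py` is a hypothetical file:

```yaml
# cortex.yaml - one entry per API to deploy
- name: my-classifier
  predictor:
    type: python
    path: predictor.py   # implements a predictor class per Cortex's interface
  compute:
    cpu: 1
    mem: 1G
  autoscaling:
    min_replicas: 1
    max_replicas: 3
```

Note how horizontal scaling lives directly in this config, whereas BentoML leaves it to the surrounding cluster tooling.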

My Experience with Cortex

Here is a blog post on Cortex use at GLAMI. Consider using the Cortex Serving Client for Python, a wrapper around the Cortex CLI that we use at GLAMI for MLOps.
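The "wrapper around the Cortex CLI" idea can be illustrated in a few lines: assemble the CLI invocation and shell out to it. This is a hypothetical sketch, not the actual Cortex Serving Client API; `deploy_command`, the flag layout, and the `aws` environment name are illustrative:

```python
import subprocess

def deploy_command(config_path, env="aws"):
    # Assemble a `cortex deploy` invocation (flags are illustrative).
    return ["cortex", "deploy", config_path, "--env", env]

def deploy(config_path, env="aws"):
    # Shell out to the Cortex CLI; raise on a non-zero exit code.
    subprocess.run(deploy_command(config_path, env), check=True)
```

Wrapping the CLI this way lets deployments be triggered, awaited, and rolled back from ordinary Python code instead of shell scripts.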


11 May 2020