Scale large Deep Learning models, deliver blazing fast inference and optimize infrastructure costs with the Stochastic tools.

1. Basic concepts

The Stochastic Acceleration Platform aims to simplify the life cycle of a Deep Learning model, from uploading and versioning the model, through training, compression and acceleration, to putting it into production.

The platform revolves around four basic concepts: models, datasets, optimization jobs and deployments.

In this guide we will be using the graphical interface, but you can also use the Python library or the CLI.

2. Advantages of using the Stochastic Platform

  • Save time. Focus your effort on what matters. Build a model and we will take care of making it work.

  • Reduce your engineering costs. In a few hours you will have a model ready for production.

  • Reduce the latency and increase the throughput of your model.

  • Reduce your infrastructure costs. The more requests each replica of your model can handle, the fewer replicas you will need.

  • Monitor your model.

  • Right now, we only support HuggingFace, PyTorch and ONNX models, but we are planning to support TensorFlow models soon. Contact us to get early access to TensorFlow models!
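
The link between throughput and cost mentioned in the list above can be illustrated with a rough back-of-the-envelope calculation. This is a minimal sketch, not part of the Stochastic tools: the function name `replicas_needed` and all the numbers are hypothetical, and it assumes that load is spread evenly across replicas and that each replica sustains a fixed number of requests per second.

```python
import math


def replicas_needed(peak_requests_per_second: float,
                    throughput_per_replica: float) -> int:
    """Smallest number of replicas able to serve the peak load.

    Hypothetical helper for illustration only. Assumes an even load
    split and a constant per-replica throughput.
    """
    if throughput_per_replica <= 0:
        raise ValueError("throughput_per_replica must be positive")
    # Always keep at least one replica running, even with no traffic.
    return max(1, math.ceil(peak_requests_per_second / throughput_per_replica))


# Made-up numbers: the same peak load before and after optimization.
peak_load = 1000            # requests per second at peak
baseline_throughput = 50    # requests per second per replica, before
optimized_throughput = 125  # requests per second per replica, after

baseline = replicas_needed(peak_load, baseline_throughput)
optimized = replicas_needed(peak_load, optimized_throughput)
print(f"replicas before: {baseline}, after: {optimized}")
```

With these illustrative numbers, a 2.5x throughput gain reduces the fleet from 20 replicas to 8. Real savings depend on your traffic pattern and on the speed-up the optimization actually achieves for your model.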

3. Set up a Stochastic account

Before you start using the platform, you will have to sign up for a free account at

After signing up, you should see a dashboard where you can monitor the number of jobs you are running, the costs you have saved, etc. Currently, we only support desktop screens.


We are giving you a $20 credit that you can spend on the platform. Keep in mind that you will have some storage and job execution limits. If you want to remove these limits, go to your profile (you should see it in the bottom left corner), open the billing section and enter your payment details.