Courselet Title: Using cloud servers for GPU-based inference
Author: Fraida Fund
Contact Email: ffund@nyu.edu
Machine learning models are most often trained in the cloud, on powerful centralized servers with specialized resources (like GPU acceleration). These servers are also well resourced for inference, that is, for making predictions on new data.
In this experiment, we will use a cloud server equipped with GPU acceleration for fast inference in an image classification context.
This notebook assumes you already have a “lease” available for an RTX6000 GPU server on the CHI@UC testbed. It will then show you how to:
- launch a server using that lease (a scripted sketch of the server lifecycle follows this list)
- attach an IP address to the server, so that you can access it over SSH
- install some fundamental machine learning libraries on the server
- use a pre-trained image classification model to do inference on the server (see the inference timing sketch below)
- optimize the model for fast inference on NVIDIA GPUs, and measure the reduced inference times
- delete the server
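The server lifecycle steps above (launch, attach an IP, delete) can be scripted from a Jupyter notebook. Below is a minimal sketch using the python-chi library; the project name, lease name, server name, and image name are placeholders you would replace with your own:

```python
# A minimal sketch of the server lifecycle, assuming the python-chi library.
# The project name, lease name, server name, and image are all placeholders.
import chi
from chi import lease, server

chi.use_site("CHI@UC")                     # the courselet targets CHI@UC
chi.set("project_name", "CHI-XXXXXX")      # placeholder Chameleon project

# Find the node reservation inside the existing RTX6000 lease
reservation_id = lease.get_node_reservation("my-rtx6000-lease")

# Launch a server on the reserved node
s = server.create_server(
    "gpu-inference-server",                # placeholder server name
    reservation_id=reservation_id,
    image_name="CC-Ubuntu20.04-CUDA",      # placeholder image
)
server.wait_for_active(s.id)

# Attach a floating IP so the server is reachable over SSH
ip = server.associate_floating_ip(s.id)
server.wait_for_tcp(ip, port=22)
print(f"ssh cc@{ip}")

# When the experiment is done, release the node:
# server.delete_server(s.id)
```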
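For the inference step, a rough sketch of timing a pre-trained model on the GPU might look like the following. The choice of ResNet-50 from torchvision and the random input tensor are illustrative stand-ins, not necessarily what the courselet itself uses:

```python
import time
import torch
import torchvision.models as models

device = torch.device("cuda")
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model = model.to(device).eval()
x = torch.randn(1, 3, 224, 224, device=device)  # stand-in for a preprocessed image

with torch.no_grad():
    for _ in range(10):          # warm-up: first calls include one-time CUDA setup cost
        model(x)
    torch.cuda.synchronize()     # wait for all queued GPU work before timing
    start = time.time()
    for _ in range(100):
        model(x)
    torch.cuda.synchronize()
print(f"mean inference time: {(time.time() - start) / 100 * 1e3:.2f} ms")
```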
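One possible route to the optimization step is Torch-TensorRT, assuming the torch-tensorrt package is installed for your CUDA/PyTorch combination; the courselet may use a different toolchain, so treat this as a sketch under those assumptions:

```python
import torch
import torch_tensorrt
import torchvision.models as models

# Same illustrative model as in the timing sketch above
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).cuda().eval()

# Compile the model into a TensorRT-backed module for the given input shape,
# allowing reduced-precision (FP16) kernels
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.half},
)
```

Re-running the timing loop from the previous sketch on `trt_model` rather than `model` should show the reduced inference times the courselet asks you to measure.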
Available at https://github.com/teaching-on-testbeds/cloud-gpu-inference
Link to Artifact: https://chameleoncloud.org/experiment/share/3546a1d7-ea72-4b58-80eb-8cd95ff8965b