PyDataTT Meetup #25: Deploying Models on Open-Source Inference Servers

On March 30th, 2023, we welcome Mark Moyou, a Data Scientist at NVIDIA whom you've met in a prior talk, back for another eye-opening session. This time, he is teaching us more about deploying models on open-source inference servers.

Description: In this talk we will discuss best practices for reducing inference times and increasing model throughput when deploying machine learning models on GPUs. We will explore how model compression and speedup are accomplished by building hardware-specific inference engines. Then, by leveraging open-source inference servers, we can maximize throughput on those GPUs by hosting multiple optimized models and taking advantage of multiple backends for PyTorch, TensorFlow, and Python-based models.
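As a concrete illustration of the multi-model, multi-backend hosting described above: one widely used open-source inference server is NVIDIA's Triton Inference Server, which configures each hosted model with a small `config.pbtxt` file. The sketch below is a hypothetical configuration (model name, shapes, and tuning values are illustrative assumptions, not from the talk) for serving a hardware-optimized TensorRT engine with two GPU instances and dynamic batching:

```
# Hypothetical Triton config.pbtxt for an optimized image classifier
name: "resnet50_trt"            # model directory name (illustrative)
platform: "tensorrt_plan"       # hardware-specific inference engine backend
max_batch_size: 8

input [
  { name: "input", data_type: TYPE_FP32, dims: [ 3, 224, 224 ] }
]
output [
  { name: "output", data_type: TYPE_FP32, dims: [ 1000 ] }
]

# Run two copies of the model on the GPU to raise throughput
instance_group [
  { count: 2, kind: KIND_GPU }
]

# Batch concurrent requests together to keep the GPU saturated
dynamic_batching {
  max_queue_delay_microseconds: 100
}
```

A sibling model in the same model repository could use a `pytorch_libtorch` or `tensorflow_savedmodel` backend instead, which is how one server process hosts models from several frameworks at once.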