Model latency is critical to a successful roll-out of a production machine learning model: no one wants to wait, least of all customers. These notes cover tools for investigating model performance and detecting bottlenecks within the model graph.
Speculative Decoding with vLLM using Gemma
Improving LLM inference with speculative decoding using Gemma
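To make the idea concrete before diving into vLLM and Gemma, here is a minimal toy sketch of greedy speculative decoding. It is not vLLM's implementation and the "models" are hypothetical stand-ins (simple next-token functions): a cheap draft model proposes a block of tokens, and the target model verifies them, accepting the matching prefix and correcting the first mismatch.

```python
# Toy sketch of greedy speculative decoding. The "models" below are
# hypothetical stand-ins (deterministic next-token functions), not
# Gemma or vLLM; they only illustrate the draft-and-verify loop.

def draft_model(tokens):
    # Cheap approximation: next token is last token + 1, modulo 10.
    return (tokens[-1] + 1) % 10

def target_model(tokens):
    # "Ground truth": same rule, except it emits 0 after a 7.
    if tokens[-1] == 7:
        return 0
    return (tokens[-1] + 1) % 10

def speculative_decode(prompt, num_tokens, k=4):
    tokens = list(prompt)
    while len(tokens) - len(prompt) < num_tokens:
        # 1. Draft model proposes k tokens autoregressively (cheap).
        draft = []
        for _ in range(k):
            draft.append(draft_model(tokens + draft))
        # 2. Target model verifies the proposals; in a real system this
        #    is a single batched forward pass, which is the speed-up.
        accepted = []
        for i in range(k):
            expected = target_model(tokens + accepted)
            if draft[i] == expected:
                accepted.append(draft[i])
            else:
                accepted.append(expected)  # correct the mismatch, stop
                break
        else:
            # All k proposals accepted: target model adds a bonus token.
            accepted.append(target_model(tokens + accepted))
        tokens.extend(accepted)
    return tokens[:len(prompt) + num_tokens]

print(speculative_decode([5], 8))
# → [5, 6, 7, 0, 1, 2, 3, 4, 5]
```

Note the key property: the output is identical to decoding with the target model alone; the draft model only changes how many target-model calls are needed, not the result.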