CPU vs GPU Inference
Created: 03 Jan 2023, 11:27 AM | Modified: =dateformat(this.file.mtime,"dd MMM yyyy, hh:mm a")
Tags: knowledge, GeneralDL
Technically, during deployment it might not be necessary to have GPUs for inference!
e.g. on edge devices, the model most likely runs in real time, receiving only 1 input at a time. Hence it is unlikely to require batches of input data into the model.
Which means that GPU utilisation will be low, since without large batches the GPU's parallelism is not fully exploited.
But then again, even at batch size 1, a GPU can still run inference faster per sample than a CPU.
⇒ it is not an obvious choice whether to immediately use a GPU for inference! (See the benchmarking sketch below.)
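
A minimal sketch (assuming PyTorch; the model here is just a toy placeholder) of how single-sample latency could be measured on CPU vs GPU before committing to a deployment target:

```python
import time
import torch
import torch.nn as nn

def benchmark(model, x, n_iters=100):
    """Return mean latency in milliseconds for a single forward pass."""
    with torch.no_grad():
        # Warm-up runs so lazy initialisation / CUDA kernel launch
        # overheads do not skew the timing.
        for _ in range(10):
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()  # GPU ops are async; sync before timing
        start = time.perf_counter()
        for _ in range(n_iters):
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()
        return (time.perf_counter() - start) / n_iters * 1e3

# Toy stand-in for a deployed model (hypothetical sizes).
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).eval()
x = torch.randn(1, 512)  # batch size 1, as in real-time edge inference

print(f"CPU: {benchmark(model, x):.3f} ms/sample")
if torch.cuda.is_available():
    model_gpu, x_gpu = model.cuda(), x.cuda()
    print(f"GPU: {benchmark(model_gpu, x_gpu):.3f} ms/sample")
```

For a tiny model like this, the CPU may well win at batch size 1 (data transfer and kernel launch overheads dominate), while a large model can still favour the GPU, which is exactly why the choice is not obvious.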