LLMs on a Homelab Without a GPU? Here's What I Found
I’ve recently been experimenting more with the power of Large Language Models (LLMs) and how they can supercharge my productivity. They’ve especially helped with the blank-page problem of not knowing where to start writing: I can dump my unsorted thoughts into a chatbot and get a skeleton of a document or implementation back in a few minutes. What comes out is by no means perfect, but it’s (usually) a very good start. I’ve been playing with both ChatGPT and a series of local models running on my laptop with Ollama, Fabric, and OpenWebUI.
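To give a concrete idea of that workflow, here’s a minimal sketch of the kind of thing I do with a local model: throw a pile of rough notes at Ollama’s local HTTP API and get an outline back. The notes, model name, and prompt are just placeholders for illustration; it assumes Ollama is already running locally on its default port (11434) with a model pulled.

```python
import requests

# Rough, unsorted notes to turn into a document skeleton
# (placeholder text for illustration).
notes = """
- homelab server, Ryzen 5600G, no dedicated GPU
- want self-hosted LLMs reachable while away from my laptop
- can the integrated GPU help at all?
"""

# Ollama exposes a local HTTP API on port 11434 by default.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",  # any model already pulled with `ollama pull`
        "prompt": f"Turn these rough notes into a blog post outline:\n{notes}",
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=300,
)
response.raise_for_status()

# The generated text comes back in the "response" field of the JSON body.
print(response.json()["response"])
```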
I wanted to be able to keep using some of these self-hosted models while I’m out on the go, so I wondered: could I host them on my homelab server? My main homelab server runs an AMD Ryzen 5600G processor, which includes an integrated GPU. Could that integrated GPU be used to improve the performance of a local model when there is no dedicated GPU available?
That doesn’t seem possible right now, though, at least not in an efficient way. This post is a point-in-time snapshot of what I found when looking into it.