How I Built a Free Local LLM Pipeline on a 10-Year-Old GTX 1080 with llama.cpp
A ten-year-old GTX 1080 running llama.cpp with the Vulkan backend delivers 15 tokens per second on 26-billion-parameter models, showing that a self-hosted local LLM setup can be both free and capable.