Local LLMs changed how I use Home Assistant, and now my smart devices actually listen
At a glance:
- Local LLMs enable voice-driven Home Assistant control using smartphones and lightweight AI tools.
- A smartphone-based voice satellite adds wake word detection without extra hardware.
- MCP servers expand Home Assistant control with nuanced, tool-rich LLM interactions.
What happened
As a Home Assistant aficionado, I’ve got dozens of apps and integrations set up for my IoT management hub, and that’s before you include all the community offerings I’ve grabbed from HACS. While I consider my dashboards, cards, and device-specific integrations borderline essential, the AI-specific tools have a special place in my setup. After all, few things can match the convenience of an LLM-controlled Home Assistant. In fact, I’ve got two different AI-powered pipelines set up in my Home Assistant hub: one lets me manage my smart devices using my voice, and the other bridges the gap between local AI apps and HASS by tossing MCP servers into the mix. In both cases, local LLMs serve as the foundation.
Starting with the workflow I rely on practically every day, I’ve got a local voice assistant that initiates audio processing once it hears a wake word, performs the necessary Home Assistant tasks, and reports back with an AI-generated voice. A few months ago, creating this setup would’ve required dedicated microphones, speakers, and ESP32/Raspberry Pi systems, but my current iteration of the HASS voice assistant simply needs an LLM provider, a smartphone, and a handful of other lightweight AI tools that run on my Home Assistant server. On the AI side, I use a GTX 1080-powered Ollama LXC to host the models, and it runs on a separate Proxmox node. Specifically, I rely on the Qwen3 (8B) model, as it understands most of my queries without hallucinating the answers, and I’ve used the Ollama integration built into HASS to pair the two together. For speech-to-text processing, I’ve deployed faster-whisper on the machine hosting Home Assistant, with Piper acting as the text-to-speech agent (and running on the same system, no less).
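Under the hood, Home Assistant's Ollama integration talks to the Ollama server over its HTTP API. As a minimal sketch of that exchange, the snippet below builds (but does not send) a request against Ollama's real `/api/chat` endpoint; the LAN address of the Ollama LXC is an assumption, and the `qwen3:8b` tag is simply the model named above in Ollama's naming convention.

```python
import json
import urllib.request

# Assumed LAN address of the Ollama LXC; adjust to your own network.
OLLAMA_URL = "http://192.168.1.50:11434/api/chat"

def build_chat_request(prompt: str, model: str = "qwen3:8b") -> urllib.request.Request:
    """Build a request against Ollama's /api/chat endpoint (not sent here)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one complete reply instead of a token stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Is the living room lamp on?")
# urllib.request.urlopen(req) would actually send it; omitted so this runs offline.
```

In the real pipeline, faster-whisper produces the `prompt` text from the wake-word-triggered audio, and Piper speaks the model's reply back out.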
Without anything else, this setup is great for querying LLMs about my HASS devices via the Assist section. However, tossing an old smartphone into the mix is what added wake-word detection to my voice assistant pipeline. After a recent update, the Home Assistant Companion app can finally detect hot words (though there are only three of them to choose from) and process voice instructions locally, meaning I don’t need additional hardware for the task. Combine this phone-based HASS satellite with some LLMs, text-to-speech, and speech-to-text agents, and you can control your Home Assistant hub with simple voice commands, as I do.
How it works in practice
If I want to check device statistics, toggle some IoT products, or modify the elements of my Home Assistant shopping list, my voice-controlled pipeline works really well. However, I often work on automations, blueprints, and other complex tasks that can be a bit of a pain to configure with voice commands alone. That’s where MCP servers come in handy, as they let my local LLMs access dozens of tools and API calls to control every aspect of my Home Assistant server. Although Home Assistant includes an MCP bridge, I haven’t had the best luck setting it up on my everyday system. Instead, I rely on the HA-MCP Server repository, which is not only easy to configure but also includes dozens of extra tools compared to Home Assistant’s official offering. Connecting it to LM Studio was just as simple, though I had to expand the context length on my LLMs quite a bit before loading the MCP server.
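Most MCP servers for Home Assistant ultimately wrap its standard REST API, which is worth understanding even if the MCP layer hides it. As a sketch, the snippet below builds a service call against the real `/api/services/<domain>/<service>` endpoint; the hostname is Home Assistant's default mDNS name, and the token is a placeholder for a long-lived access token created under your user profile.

```python
import json
import urllib.request

HA_URL = "http://homeassistant.local:8123"  # default mDNS address; adjust as needed
HA_TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"   # placeholder; generate one in your HA profile

def build_service_call(domain: str, service: str, entity_id: str) -> urllib.request.Request:
    """Build a POST against HA's REST API, e.g. /api/services/light/turn_on."""
    payload = {"entity_id": entity_id}
    return urllib.request.Request(
        f"{HA_URL}/api/services/{domain}/{service}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {HA_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_service_call("light", "turn_on", "light.desk_lamp")
# urllib.request.urlopen(req) would toggle the lamp on a live HA instance.
```

An MCP tool call like "turn on the desk lamp" from LM Studio boils down to exactly this kind of request, with the LLM picking the domain, service, and entity.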
Speaking of models, I primarily use Qwen 3.5 (9B) when controlling Home Assistant from LM Studio. It’s far more accurate than other LLMs in this range, while being small enough to fit inside my RTX 3080 Ti’s VRAM even with a larger context window. I’ve used it to define new automations and tinker with entities using simple conversational messages, and it hasn’t let me down yet. Of course, the biggest caveat of an LLM-controlled setup is that I could potentially brick my Home Assistant hub with a single misinterpreted input. I tend to use detailed prompts to avoid such a scenario, but I’ve also disabled a bunch of high-level access tools on the MCP server to placate my worrywart self. That way, the LLM is denied access any time it tries to update the core Home Assistant packages, delete devices, or write new data to specific config files. It’s a lot safer than it sounds, and remains handy when I want to set up quick trigger-action workflows without sifting through menus finding the right entities.
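The tool-disabling approach above is just configuration on the MCP server's side, but the underlying idea can be sketched as a simple denylist filter over the tools exposed to the LLM. The tool names below are hypothetical; HA-MCP's actual tool list differs.

```python
# Hypothetical destructive tool names; check your MCP server's real tool list.
BLOCKED_TOOLS = {"update_core", "delete_device", "write_config_file"}

def filter_tools(available: list[str]) -> list[str]:
    """Return only the MCP tools that are safe to expose to the LLM."""
    return [tool for tool in available if tool not in BLOCKED_TOOLS]

safe = filter_tools(["call_service", "get_state", "delete_device", "create_automation"])
# safe == ["call_service", "get_state", "create_automation"]
```

Because the filtering happens before the tool list ever reaches the model, a misinterpreted prompt simply has no destructive tool to call, which is why the setup is safer than it sounds.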
Integration details and ecosystem
Home Assistant can be deployed on Windows, macOS, and Linux hosts (via Home Assistant OS in a VM or the container-based installs), with official companion apps for iOS and Android. This broad platform support underpins the flexibility of the voice-assisted workflow I’ve built, allowing the smartphone satellite and server-side components to work in tandem regardless of host OS. The combination of Ollama for local model hosting, the Home Assistant Companion app for wake word detection, and HA-MCP Server for extended tool access creates a multi-layered control surface that is both powerful and configurable. These moving parts let me iterate quickly on automations while keeping the voice pipeline responsive and contextually aware.
What to watch next
As local LLM tooling matures, tighter integration between voice assistants, MCP servers, and Home Assistant could reduce configuration overhead and unlock more nuanced automations. I expect future updates to the Home Assistant Companion app to expand hot word options and improve offline reliability, while model hosts like Ollama may offer smaller yet more capable variants that fit even tighter hardware constraints. For now, the setup demonstrates how a carefully chosen stack of local AI tools can turn a smartphone into a capable Home Assistant satellite without sacrificing control or safety.
FAQ
Which models and tools are used in the described Home Assistant voice assistant pipeline?
The Qwen3 (8B) model runs on a GTX 1080-powered Ollama LXC, paired with Home Assistant via the built-in Ollama integration, while faster-whisper handles speech-to-text and Piper handles text-to-speech on the Home Assistant server itself.
What smartphone capabilities are leveraged for wake word detection in this setup?
A recent update to the Home Assistant Companion app added on-device hot word detection (with three hot words to choose from) and local voice instruction processing, turning an old phone into a voice satellite without extra hardware.
How does the HA-MCP Server repository enhance control over Home Assistant compared to the built-in MCP bridge?
It proved easier to configure than the official MCP bridge and ships dozens of extra tools, giving local LLMs finer-grained control over automations, entities, and other aspects of the Home Assistant server.
Prepared by the editorial stack from public data and external sources.