Home Assistant Voice Preview Edition: Local vs Cloud Smart Home Voice Control
Review of Home Assistant Voice Preview Edition, voice control hardware for your smart home. I'll explain what it is, how to set it up, what it can and cannot do, compare local voice vs. cloud LLM conversation agents, and offer tips to optimize your experience with it.
Introduction
I can’t believe I’m saying this, but this device made me appreciate my Amazon Echo even more. Let me explain.
This is the Home Assistant Voice Preview Edition, the first dedicated voice hardware for Home Assistant. It’s a huge step forward for open-source, local home automation, but is it ready to replace your current smart speaker?
No, it’s not. But…don’t be quick to dismiss it.
I’m going to show you everything you need to know about Home Assistant Voice: what it is, how to set it up, what it can and cannot do, compare local voice vs. cloud LLM conversation agents, offer tips to optimize your experience with it, and show you why it’s called Preview Edition.
Main Points
Unboxing
Home Assistant Voice Preview Edition features an injection-moulded case that is transparent on the sides and bottom. There is a USB-C port for power, and a stereo 3.5mm output for connecting an external speaker for improved audio quality. The device also has a physical switch that stops the voice assistant from listening. You can also mute the device digitally from within Home Assistant.
The top of Home Assistant Voice Preview Edition has a click wheel similar to the one on an iPod Classic. Also on top are two microphones. Combined with an XMOS chip, these allow the device to hear the wake word from across the room. It occupies about the same footprint as an Amazon Echo Dot (3rd Gen), but is much thinner and lighter, at the expense of audio quality.
Note that it does not include a USB-C cable or charging brick.
Setup
There is a dedicated quick start guide for Home Assistant Voice Preview Edition, but in short, you'll follow these steps:
- Connect the device to power. You should then see a twinkling white light.
- Open up the Home Assistant companion app on your phone, or visit Home Assistant on your computer if your Home Assistant server has a Bluetooth connection.
- Go to Settings > Devices & services and click Add on the auto-discovered device.
- Enter the log-in credentials for your home's 2.4GHz WiFi network.
- Once connected to WiFi, click OK to set up with ESPHome.
- Click your device from the list, and then Submit.
- From the device settings, choose to use either Home Assistant Cloud or DIY. The former requires a subscription. The latter is free, but requires installing the Piper (text-to-speech) and Whisper (speech-to-text) add-ons.
- Select your desired wake word from the pre-defined options, such as "Okay Nabu" or "Hey Jarvis."
- Choose your assistant. This could be the default Assist in Home Assistant, or a local or cloud LLM if you connected one.
- Choose your preferred voice from the drop-down menu.
Testing Overview
I’ll test Home Assistant Voice Preview Edition for both accuracy and response time with both local and cloud LLM conversation agents.
For local voice, I used Home Assistant’s native Assist running on a Beelink S12 Pro Mini PC. This is an N100-based machine with 16GB of memory. Your response times may vary if you are using different hardware.
For the cloud LLM, I used Google Gemini. You could also use ChatGPT, but you may incur a cost.
I gave both the local and cloud LLM conversation agents a combination of general knowledge and smart home requests. No special prep was done: I did not take the extra effort to expose more entities to voice assist, or re-name devices. This was intentional to simulate the out-of-box experience that you might expect, without having to do a bunch of additional configuration.
Accuracy was measured by the conversation agent’s ability to successfully execute my prompt as I intended. Response time was measured from the time I finished saying my prompt, to the time the conversation agent began its reply.
Testing with a Cloud LLM and with Local Voice
Test Results
Overall, and as expected, the cloud LLM achieved a much higher pass rate, correctly addressing 11 out of 19 prompts (58%). Local voice correctly answered just 7 of my 19 prompts (37%). While the cloud LLM’s pass rate was 21 percentage points higher, I do not consider a 58% pass rate to be a tremendous success.
On the flip side, the average response time for local voice was 3.89 seconds. That was 25% faster than the cloud LLM’s average of 5.21 seconds. However, I don’t consider 3.89 seconds to be a total victory. Response times for setting timers or executing basic smart home commands are basically instantaneous on an Amazon Echo or a Sonos smart speaker. By comparison, response times for both local voice and the cloud LLM felt painfully slow on this device.
Additionally, and also as expected, the cloud LLM was smarter at answering a range of prompts, from general knowledge questions, to understanding my true intent when it came to smart home requests. The cloud LLM just had a better understanding of context. It knew that a lock or unlock command directed toward my door was in reference to a smart lock. It also knew that the device it spoke from was physically located in my basement bedroom. Local voice failed on both counts.
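For anyone who wants to check the arithmetic behind those figures, here is a minimal Python sketch. It uses only the counts and averages quoted above; the variable names are my own and are not part of any Home Assistant tooling.

```python
# Reproduce the pass rates and response-time gap from the test results.
# All input numbers come from the review; names are illustrative only.

TOTAL_PROMPTS = 19
CLOUD_PASSED = 11
LOCAL_PASSED = 7

cloud_rate = CLOUD_PASSED / TOTAL_PROMPTS
local_rate = LOCAL_PASSED / TOTAL_PROMPTS

# Difference between two percentages is expressed in percentage points.
gap_points = (cloud_rate - local_rate) * 100

LOCAL_AVG_SECONDS = 3.89
CLOUD_AVG_SECONDS = 5.21

# Share of the cloud response time that local voice saves.
time_saved = (CLOUD_AVG_SECONDS - LOCAL_AVG_SECONDS) / CLOUD_AVG_SECONDS

print(f"Cloud LLM pass rate:   {cloud_rate:.0%}")
print(f"Local voice pass rate: {local_rate:.0%}")
print(f"Gap: {gap_points:.0f} percentage points")
print(f"Local voice responds in {time_saved:.0%} less time")
```

Note that the "25% faster" figure is measured relative to the cloud LLM's average, so it reads as "local voice takes about a quarter less time to begin its reply."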
Additionally, local voice often misunderstood the word “is” in my prompts. For example, I might say, “Is the garage open?” to which it would reply, “Sorry, I am not aware of an area called ‘is.’”
And many times, local voice would begin a reply only to cut itself off mid-sentence or mid-word without me saying anything, which seemed like a bug.
I also noticed another frustrating issue that I don’t experience with Amazon’s voice assistant. If the conversation agent was answering my voice prompt but talking for a long time, I would interrupt it by saying the wake word. While this stopped it from continuing to speak, it did not continue to listen for new instructions, even though I just said the wake word. This meant I had to say the wake word twice to make it stop speaking and be ready to listen for my new request.
And in case you’re wondering about music playback, those results were no more encouraging.
Tips
Before you give up hope on local voice, there are some things you can do to improve your experience.
First, you can take the time to confirm all desired entities are exposed to voice assist, and that each device or entity has a friendly name that reflects your natural way of speaking.
One example of this in my testing was the garage door. In my Home Assistant, the device is named “ratgdov2.5i f61565.” I will never remember that name when making a voice command. Re-naming this to “Garage Door” should help improve the passing rate. But, with literally thousands of entities in my Home Assistant instance, this effort may take some time.
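With thousands of entities, it may help to shortlist the worst names first. The sketch below is a rough heuristic of my own, not a Home Assistant feature: it flags any name containing a word that mixes letters and digits, which is typical of model numbers and hardware IDs. The list of names is a hypothetical sample that you would have to export or type in yourself; only the garage door name comes from my setup.

```python
import re


def looks_unfriendly(name: str) -> bool:
    """Return True if any word in the name mixes letters and digits.

    This is only a heuristic for spotting model numbers or hardware IDs;
    it will miss some bad names and may flag a few good ones.
    """
    for token in name.split():
        has_letter = re.search(r"[A-Za-z]", token) is not None
        has_digit = re.search(r"\d", token) is not None
        if has_letter and has_digit:
            return True
    return False


# Hypothetical sample of device names (assumption: supplied by hand).
device_names = [
    "ratgdov2.5i f61565",
    "Front Door Lock",
    "Bedroom Lamp 2",
]

needs_rename = [name for name in device_names if looks_unfriendly(name)]
print(needs_rename)
```

A name like "Bedroom Lamp 2" passes because the digit stands alone as its own word, which is still easy to say out loud.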
My advice is this: Now that Home Assistant is offering its own voice hardware, it’s a good opportunity to name any new devices that you add in a way that reflects how you might refer to them in natural conversation.
However, even with this effort you may still run into issues. My front door lock is named “Front Door Lock” in Home Assistant. While the cloud LLM was generally able to understand a lock or unlock command by just calling it, “Front Door,” local voice was not. Even when I referred to it as “Front Door Lock,” local voice still failed to understand.
Second, if you are using an LLM, I suggest experimenting with the instructions you give it. For example, you might try giving it instructions to provide shorter replies to avoid run-on responses. Though I will say, I gave it instructions to provide times and dates in a human-readable format, but it gave the time of day including the seconds, which was confusing.
Finally, and perhaps most important, take the time to get your lights in order. I suspect “turn on” or “turn off” lights will be one of the most used voice commands. But you may quickly run into at least two issues.
- You may have devices that are lights in your mind, but are actually switches in Home Assistant. If left that way, they will be ignored by your light-related requests.
- Chances are you have far more lights exposed to Home Assistant than you realize. You might tell it to “Turn on the living room lights” and be surprised to see the red light on your motion sensor turn on, and the green LED on your plant sensor light up as well. Cleaning up the entities exposed to voice assist can help here.
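Both issues can be spotted with a quick audit of entity IDs and names. The sketch below is one way to do that in Python, under a few assumptions: the entity list is a hypothetical sample you provide yourself, and the keyword lists are my guesses at what a light or an indicator LED tends to be called. The only Home Assistant convention it relies on is that an entity ID takes the form `domain.object_id`, such as `light.kitchen` or `switch.porch_light`.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    entity_id: str  # e.g. "light.kitchen"
    name: str       # friendly name shown in Home Assistant


# Keyword lists are assumptions; adjust them to match your own naming.
LIGHT_WORDS = {"light", "lights", "lamp", "lamps"}
INDICATOR_WORDS = {"led", "indicator", "status"}


def domain(entity_id: str) -> str:
    """Return the part of the entity ID before the first dot."""
    return entity_id.split(".", 1)[0]


def name_words(name: str) -> set[str]:
    """Split a friendly name into lowercase words."""
    return set(name.lower().split())


def switches_that_look_like_lights(entities: list[Entity]) -> list[str]:
    """Switch entities whose names suggest they control a light."""
    return [
        e.entity_id
        for e in entities
        if domain(e.entity_id) == "switch" and name_words(e.name) & LIGHT_WORDS
    ]


def lights_that_look_like_indicators(entities: list[Entity]) -> list[str]:
    """Light entities whose names suggest a sensor LED, not room lighting."""
    return [
        e.entity_id
        for e in entities
        if domain(e.entity_id) == "light" and name_words(e.name) & INDICATOR_WORDS
    ]


# Hypothetical sample entities for illustration only.
entities = [
    Entity("switch.porch_light", "Porch Light"),
    Entity("switch.coffee_maker", "Coffee Maker"),
    Entity("light.living_room_ceiling", "Living Room Ceiling"),
    Entity("light.motion_sensor_led", "Motion Sensor LED"),
    Entity("light.plant_sensor_led", "Plant Sensor LED"),
]

print("Treat as lights:", switches_that_look_like_lights(entities))
print("Hide from voice:", lights_that_look_like_indicators(entities))
```

The first list gives candidates to convert or re-categorize as lights, and the second gives candidates to un-expose from voice assist so they stop responding to room-wide light commands.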
Final Thoughts
There is a reason that this Home Assistant Voice hardware is called Preview Edition. It is an early, open-source product that requires testing at scale to help it improve. It is certainly not ready to replace your Amazon Echo, Apple HomePod, or Google Nest smart speakers.
While some of the results were disappointing, most of this was honestly expected. However, I’m also confident that today is the worst it will ever be, and that it will only improve over time in response to user feedback and developer support.
Featured Tech
Home Assistant Voice Preview Edition: https://www.home-assistant.io/voice-pe/
Beelink S12 Pro Mini PC: https://amzn.to/49WrA9U