Home Assistant Voice Preview Edition
Home Assistant’s voice device is a $60 box that’s both focused and evolving.
Credit: Home Assistant Foundation
Home Assistant announced today the availability of the Voice Preview Edition, its own design of a living-room-friendly box to offer voice assistance with home automation. Having used it for a few weeks, I think it's a great start, at least for those comfortable with digging into the settings. That caveat is why Home Assistant is calling it a "Preview Edition."
Using its privacy-minded Nabu Casa cloud—or your own capable computer—to handle the processing, the Voice Preview Edition (VPE) ($60/60 euros, available today) has the rough footprint of a modern Apple TV but is thinner. It works similarly to an Amazon Echo, Google Assistant, or Apple Siri device, but with a more focused goal. Start with a wake word—the default, and most well-trained version, is "Okay, Nabu," but "Hey, Jarvis" and "Hey, Mycroft" are available. Follow that with a command, typically something that targets a smart home device: "Turn on living room lights," "Set thermostat to 68," "Activate TV time." And then, that thing usually happens.
"That thing" is primarily controlling devices, scenes, and automations around your home, set up in Home Assistant. That means you have to have assigned them a name or alias that you can remember. Coming up with naming schemes is something you end up doing in big-tech smart home systems, too, but it's a bit more important with the VPE.
You won't need to start over with all your gear if you've got a Google Home, Alexa, or Apple Home ecosystem, at least. Home Assistant has good "bridge" options built into it for connecting all the devices you've set up and named inside those ecosystems.
It's important to have a decently organized smart home setup before adding a VPE box, because the box doesn't really do much else, for better or worse, unless you hook it up to an AI model.
The voice device that is intentionally not very chatty
The VPE box can run timers (with neat LED ring progress indicators), and with a little bit of settings tweaking, you can connect it to Home Assistant's built-in shopping lists and task lists or most any other plug-in or extension of your system. If you're willing to mess with LLMs—like ChatGPT or Google's Gemini—locally or through cloud subscriptions, you could trigger prompts with your voice, though performance will vary.
What else does Home Assistant's hardware do? Nothing, at least by default. It listens for its prompts, passes them on to a Home Assistant server, and that's it. You can't ask it how tall Buffalo Bills quarterback Josh Allen is or how many consecutive Super Bowls the Bills lost. It won't do simple math calculations or metric conversions. It cannot tell you whether you should pack an umbrella tomorrow or suggest a good substitute if you're out of eggs.
For some people either hesitant to bring a voice device into their home or fatigued by the failures of supposedly "smart" assistants that can seem quite dumb, this might be perfect. When the Home Assistant VPE hears me clearly (more on that in a moment), it almost always understands what I'm saying, so long as I remember what I named everything.
During the month-long period when I muted Google Assistant and stuck with Home Assistant, there were times I missed the ability to ask questions I would normally just look up on a search engine. The upside is that I didn't have to sit through 15 seconds of Google explaining at length something I didn't ask for.
If you want the VPE to automatically fall back to AI for answering non-home-specific questions, you can set that up. And that's something we'll likely dig into for a future post.
The hardware
Home Assistant's Voice Preview Edition device, with Apple TV (4K, 2022) for scale.
Credit: Kevin Purdy
As a product you want to keep somewhere it can hear you, the Home Assistant VPE blends in, is reasonably small, and has more useful buttons and switches than the competition. It looks entirely innocuous sitting on a bookshelf, entertainment center, kitchen counter, or wall mount. It's quite nice to pay for a functional device that has absolutely no branding visible.
There are four neat things on top. First are two microphone inputs, which are pretty important. There's an LED ring that shows you the VPE is listening by spinning, then spinning the other way to show that it's "thinking" and reversing again when responding. A button in the middle can activate the device without speech or cancel a response.
Best of all, there is a physically rotating dial wheel around the button. It feels great to spin, even if it's not something you'll need to do very often.
Around the sides is clear plastic, with speaker holes on three sides. The speakers are built specifically for voice clarity, according to Home Assistant, and I agree. I can always hear what the VPE is trying to tell me, at any distance in my living room.
There's a hardware mute switch on one side, with USB-C inputs (power and connection) and a stereo headphone/speaker jack. On the bottom is a Grove port for deeper development.
Hearing is still the challenge
The last quasi-official way to get a smart speaker experience with Home Assistant was the ESP32 S3 Box 3, which was okay or decent in a very quiet room or at dining room table distance. The VPE is a notable improvement over that device in both input and output. If I make a small effort to speak clearly and enunciate, it catches me pretty much everywhere in my open-plan living room/dining room/kitchen. It's not too bad at working around music or TV sound, either, so long as that speaker is not between me and the VPE box. It is best with its default wake phrase, "Okay, Nabu," because that's the most trained and sampled by the Open Wake Word community.
And yet, every smart speaker I've had in my home at some point—a Google Home/Nest Mini, Amazon Echo (full-size or Dot), Apple HomePod (original), the microphones on Sonos speakers—has seemed better at catching its wake word, given similar placement as the VPE. After all, Home Assistant, a not-for-profit foundation, cannot subsidize powerful microphone arrays with advertising, Prime memberships, or profitable computer hardware ecosystems. I don't have lab tests to prove this, just my own experiences—with my particular voice, accent, phrasing, room shape, and noise levels.
I’ve been using this device with pre-release firmware and software, and it’s under active development, so it will almost certainly get better. But as a device you can buy and set up right now, it’s very close—but not quite—to the level of the big ecosystems. It is notably better than the hodgepodge of other devices you can technically use with Home Assistant voice prompts.
Is it better for my privacy that the VPE is not great at being triggered by ambient speech in the room? Maybe. At the same time, I'm more likely to switch away from said big-tech voice devices only if I don't feel like I have to say everything two or three times.
It’s fun to craft your own voice system
I've been able to use the VPE on a bookshelf in my living room for weeks, asking it to turn on lights, adjust thermostats, set scenes with blinds and speakers, and other automations, and the successes are far more common than failures. I still want to test some different placements and try out local hardware processing (requiring an Intel N100 or better for common languages), since I've only tested it with Home Assistant's cloud servers, the generally faster solution.
The best things about the VPE are not the things you'll notice by looking at or speaking to it. It's a smart speaker that seems a lot more reasonable for private places, especially if you're running on local hardware. It's not a smart speaker that is going to read you an entire Wikipedia page when it misunderstands what you want. And it doesn't demand that you use an app tied into an ecosystem, other than the web app running off your Home Assistant server.
Paulus Schoutsen said on the VPE's launch stream that the VPE might not be the best choice for someone switching over from an established Google/Amazon/Apple ecosystem. That might be true, but I think the VPE also works as a single-user device at a desk, or for anyone who's been waiting to step into voice but is concerned about privacy, ecosystem lock-in, or their kids' demands to play Taylor Swift songs on repeat.
This post was updated at 5 p.m. to note the author's wake word experience may relate to his voice and room characteristics.