Basement AI lab captures 10,000 hours of brain scans to train thought-to-text AI models — largest known neural dataset collected from thousands of humans over six months

A keyboard modified for Conduit's project, with several keys removed.
(Image credit: Conduit)

A San Francisco start-up has spent the past six months running one of the more unusual data projects in AI. Conduit says it has collected roughly 10,000 hours of non-invasive neural data from “thousands of unique individuals” in a basement studio, forming what it believes is the largest neuro-language dataset assembled to date. The company is using the recordings to train thought-to-text AI models that attempt to decode semantic content from brain activity in the seconds before a participant speaks or types.

Participants sit for two-hour sessions in small booths and converse freely with an LLM, either by speaking or by typing on “simplified” keyboards. Early sessions relied on rigid tasks, but Conduit shifted to personalized back-and-forth conversation after noticing that engagement strongly influenced data quality. The goal is to maximize the amount of natural language produced during each recording while keeping text, audio, and neural signals tightly time-aligned.
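Conduit hasn't published its pipeline, but the alignment requirement it describes amounts to stamping every keystroke, audio frame, and neural sample against a shared clock, so that each stretch of text can later be matched to the brain activity recorded just before it was produced. A minimal sketch of that bookkeeping in Python, assuming an in-memory event log on a shared clock (the names, the two-second window, and the pairing logic here are illustrative, not Conduit's):

```python
from bisect import bisect_left
from dataclasses import dataclass

# Hypothetical parameters for illustration only.
PRE_EVENT_WINDOW_S = 2.0  # seconds of neural data kept before each utterance


@dataclass
class Utterance:
    t_start: float   # shared-clock timestamp when speech/typing began
    text: str        # transcribed or typed text


def neural_window_before(utterance, neural_timestamps, neural_samples):
    """Return the neural samples recorded in the window just before an utterance.

    neural_timestamps must be sorted and on the same clock as utterance.t_start.
    """
    lo = bisect_left(neural_timestamps, utterance.t_start - PRE_EVENT_WINDOW_S)
    hi = bisect_left(neural_timestamps, utterance.t_start)
    return neural_samples[lo:hi]


def build_training_pairs(utterances, neural_timestamps, neural_samples):
    """Pair each utterance's text with the neural window that preceded it."""
    return [
        (neural_window_before(u, neural_timestamps, neural_samples), u.text)
        for u in utterances
    ]
```

In a real setup the three streams come from separate devices with separate clocks, so the hard part is clock synchronization and drift correction rather than the lookup itself; the sketch above only shows how aligned timestamps turn a recording session into (neural window, text) training pairs.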

A render of a training headset concept designed by Conduit.

(Image credit: Conduit)


Luke James is a freelance writer and journalist. Although his background is in law, he has a personal interest in all things tech, especially hardware and microelectronics, and anything regulatory.
