A research team from Carnegie Mellon University built an AI model called LegoGPT that outputs valid LEGO designs from text inputs. According to the team’s research paper, which is posted on GitHub, they trained “an autoregressive large language model to predict the next brick to add via next-token prediction,” but the key takeaway is that the model creates LEGO designs from scratch.
The model was trained on a dataset of more than 47,000 LEGO structures representing over 28,000 unique 3D objects, including bookshelves, tables, chairs, cars, ships, guitars, and more. That training allows it to create unique and original designs solely from text inputs.
The tool is available for free on GitHub, and you can pair this with a computer vision model or image processing AI. For example, you can take a photo of your available LEGO bricks and let the AI give you a multitude of unique options for building with what you already have.
The team added a validity check and physics-aware rollback during autoregressive inference, ensuring that the final output will always be valid (i.e., no overlapping bricks) and stable (i.e., no floating bricks). Furthermore, LegoGPT’s final output can be built by both humans and robots.
This is how the team created StableText2Lego, the dataset used to train LegoGPT: the process starts with a 3D mesh from ShapeNetCore. The mesh is fitted into a 20 x 20 x 20 voxel grid, from which the initial LEGO brick layout is determined.
This layout is then varied while keeping the overall shape, and unstable designs are filtered out. The remaining designs are rendered from 24 different viewpoints, and GPT-4o is used to generate text descriptions for them.
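To make the voxel grid step more concrete, here is a minimal Python sketch of how a set of 3D points could be mapped onto a 20 x 20 x 20 grid. It is an illustration only, not the team's code: the `voxelize` function and the idea of feeding it points sampled from a mesh are assumptions made for this example.

```python
GRID = 20  # grid resolution per axis, matching the 20 x 20 x 20 grid in the article


def voxelize(points, grid=GRID):
    """Return the set of (x, y, z) grid cells occupied by the given 3D points.

    `points` is a hypothetical input: a list of (x, y, z) floats, for example
    sampled from the surface of a mesh.
    """
    if not points:
        return set()
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    # Use one scale for all three axes so the shape keeps its proportions.
    extent = max(maxs[i] - mins[i] for i in range(3)) or 1.0
    cells = set()
    for p in points:
        cell = tuple(
            # Clamp so points on the far boundary land in the last cell.
            min(grid - 1, int((p[i] - mins[i]) / extent * grid))
            for i in range(3)
        )
        cells.add(cell)
    return cells


# Example: the eight corners of a unit cube end up in the eight grid corners.
corners = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
occupied = voxelize(corners)
```

A brick layout would then be fitted to the occupied cells; that step is not shown here.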
This is how the model learns to create a new design from text: each LEGO design is converted into text tokens, with the bricks ordered from bottom to top. Instructions are then created that pair these brick sequences with descriptions of the design, so that the AI learns the relationship between the text prompt and the physical bricks.
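As a rough illustration of that bottom-to-top ordering, the Python sketch below turns a list of bricks into text and pairs it with a description. The `Brick` fields, the one-line-per-brick text format, and the `prompt`/`completion` keys are all assumptions made for this example and may differ from what the team actually uses.

```python
from typing import NamedTuple


class Brick(NamedTuple):
    h: int  # footprint size along x, in studs
    w: int  # footprint size along y, in studs
    x: int  # position of the brick's corner on the grid
    y: int
    z: int  # layer index, 0 is the bottom layer


def to_text(bricks):
    """Serialize bricks one per line, sorted from the bottom layer upward."""
    ordered = sorted(bricks, key=lambda b: (b.z, b.y, b.x))
    return "\n".join(f"{b.h}x{b.w} ({b.x},{b.y},{b.z})" for b in ordered)


def make_example(caption, bricks):
    """Pair a text description with the serialized brick sequence."""
    return {"prompt": caption, "completion": to_text(bricks)}


# The bricks are listed out of order on purpose; to_text sorts them by layer.
design = [Brick(2, 4, 0, 0, 1), Brick(2, 4, 0, 0, 0), Brick(1, 2, 2, 0, 0)]
example = make_example("a small step", design)
print(example["completion"])
```

Because every brick becomes a short line of plain text, predicting the next brick turns into ordinary next-token prediction for a language model.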
From there, LegoGPT predicts the next brick needed to build the design using an autoregressive model. At each step, it verifies the predicted brick’s validity, checking that it is well-formatted, exists in the brick library, and does not overlap with existing bricks. This continues until the design is completed, after which its stability is tested.
If the AI determines that the output is unstable, it will roll back to the last stable state and continue generating from that point. Once it gets a stable final output, then the design is complete.
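The generate, check, and roll back loop described above can be sketched in Python as follows. This is a simplified illustration under several assumptions, not the team's implementation: the brick library is made up, the language model is replaced by a scripted stand-in (`scripted_model`), and the stability test is reduced to a crude "no floating bricks" check (`first_floating`) rather than a real physics analysis.

```python
LIBRARY = {(1, 1), (1, 2), (2, 1), (2, 2), (2, 4), (4, 2)}  # hypothetical brick library
GRID = 20


def cells(brick):
    """Grid cells covered by a brick given as (h, w, x, y, z)."""
    h, w, x, y, z = brick
    return {(x + dx, y + dy, z) for dx in range(h) for dy in range(w)}


def is_valid(brick, placed):
    """Per-step check: well-formatted, in the library, inside the grid, no overlap."""
    if len(brick) != 5 or not all(isinstance(v, int) for v in brick):
        return False
    h, w, x, y, z = brick
    if (h, w) not in LIBRARY:
        return False
    if min(x, y, z) < 0 or x + h > GRID or y + w > GRID or z >= GRID:
        return False
    occupied = set().union(*(cells(b) for b in placed))
    return cells(brick).isdisjoint(occupied)


def first_floating(placed):
    """Crude stability stand-in: index of the first brick with nothing under it."""
    supported = set()
    for i, brick in enumerate(placed):
        below = {(x, y, z - 1) for (x, y, z) in cells(brick)}
        if brick[4] != 0 and below.isdisjoint(supported):
            return i
        supported |= cells(brick)
    return None


def generate(propose, max_steps=100, max_rollbacks=10):
    """Ask `propose` for bricks until it returns None, rolling back if unstable."""
    placed, rollbacks = [], 0
    for _ in range(max_steps):
        brick = propose(placed)
        if brick is None:  # the stand-in model signals that the design is finished
            bad = first_floating(placed)
            if bad is None:
                return placed
            if rollbacks >= max_rollbacks:
                break
            placed = placed[:bad]  # roll back to the last stable state
            rollbacks += 1
        elif is_valid(brick, placed):
            placed.append(brick)
        # Invalid bricks are simply rejected and the model is asked again.
    bad = first_floating(placed)
    return placed if bad is None else placed[:bad]


# Scripted stand-in for the model: an overlapping brick, an unknown brick size,
# and a floating brick are all proposed before the design settles.
script = iter([
    (2, 4, 0, 0, 0),    # valid base brick
    (2, 4, 1, 0, 0),    # overlaps the base brick, rejected
    (3, 3, 5, 5, 0),    # 3x3 is not in the library, rejected
    (2, 2, 10, 10, 1),  # passes the per-step check but floats in mid-air
    None,               # "finished": the stability test fails, triggering a rollback
    (2, 2, 0, 0, 1),    # sits on top of the base brick
    None,               # "finished": now stable
])


def scripted_model(placed):
    return next(script, None)


result = generate(scripted_model)
print(result)
```

Note the split between the two kinds of checks: overlap and library membership are cheap and run on every brick, while stability is only tested once a full design exists, which mirrors the order described in the article.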
If you want to play with the AI yourself, the team has released its dataset, code, and models, making it easy for anyone to fork the team’s work. One development we could see is someone converting this into a downloadable AI app with a customizable brick library.