Researchers teach machines how to follow Lego's instructions

Lego’s enduring appeal comes not from the complexity of its sets, nor from cool miniature versions of pop culture icons, but from the building process itself: transforming a box of seemingly random pieces into a completed model. It’s a satisfying experience, and yet another one robots may steal from you one day, thanks to researchers at Stanford University.

Lego instruction manuals are a master class in visually conveying an assembly process to a builder, regardless of their background, experience level, or the language they speak. Pay close attention to the pieces needed and the differences between one image of the partially assembled model and the next, and you can see where all the pieces need to go before moving on to the next step. Lego has refined the design of its instruction manuals over the years, and as easy as they are for humans to follow, machines are only now learning how to interpret them step by step.

One of the biggest challenges when it comes to teaching machines to build with Lego is interpreting the 2D images of 3D models in traditional printed instruction manuals (though some Lego models can now be assembled through the company’s mobile app, which provides full 3D models of each step that can be rotated and examined from any angle). Humans can look at a picture of a Lego brick and instantly identify its 3D structure in order to find it in a pile of bricks, but for robots to do the same, researchers at Stanford University had to develop a new learning-based framework they call the Manual-to-Executable-Plan Network – or MEPNet, for short – as detailed in a recently published paper.

The neural network not only has to extrapolate the 3D shape and structure of the individual pieces specified in the manual at each step, but also needs to interpret the overall shape of the semi-assembled model present at each step, regardless of its orientation. Depending on where a piece needs to be added, Lego manuals will often show the semi-assembled model from a completely different perspective than the previous step. The MEPNet framework has to decode what it sees and how it relates to the 3D model generated from the previous steps.
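To illustrate why orientation matters, here is a minimal sketch in which a semi-assembled model is stored as a set of occupied voxel cells and rotated a quarter turn. The voxel representation and function name are illustrative assumptions, not MEPNet’s actual internals.

```python
# Hypothetical sketch: the same semi-assembled model, stored as occupied
# (x, y, z) voxel cells, looks different after a 90-degree turn about the
# vertical axis. A pose-aware framework must match the model across such
# viewpoint changes between manual steps.

def rotate_y_90(model):
    """Rotate a set of (x, y, z) cells 90 degrees about the vertical
    (y) axis: (x, y, z) -> (z, y, -x)."""
    return {(z, y, -x) for (x, y, z) in model}

# An L-shaped arrangement of three bricks...
model = {(0, 0, 0), (1, 0, 0), (0, 0, 1)}
turned = rotate_y_90(model)

# Four quarter turns bring the model back to its original orientation.
assert rotate_y_90(rotate_y_90(rotate_y_90(turned))) == model
```

The point of the sketch: `turned` occupies a different set of cells than `model`, even though it is the same object, which is exactly the ambiguity the framework has to resolve when consecutive manual steps change the viewing angle.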

Image of LEGO instructions converted into a 3D model by machine learning

screenshot: Ruochen Wang, Yunzhi Zhang, Jiayuan Mao, Chin-Yi Cheng, Jiajun Wu

The framework then needs to determine where the new pieces at each step fit into the previously created 3D model by comparing the next iteration of the semi-assembled model with the previous one. Lego manuals don’t use arrows to indicate part placement, and will at most use a slightly different color to show where new pieces go – which may be too subtle to detect from a scan of a printed page. The MEPNet framework has to figure this out on its own, but a unique feature of Lego bricks makes the process a little easier: the studs on top and the anti-studs on the underside that allow bricks to be tightly attached to one another. MEPNet understands the positional constraints on how Lego bricks can be stacked based on the location of a piece’s studs, helping to narrow down where they attach to the semi-assembled model.
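A rough way to see how the stud lattice narrows down the search: candidate placements for a new piece can only sit on integer stud positions, must not overlap existing bricks, and must rest on studs below (or on the ground). The following sketch is a simplified assumption of that constraint, with made-up names and a plain voxel model – it is not MEPNet’s method.

```python
# Hypothetical sketch: using Lego's stud lattice to constrain where a new
# piece can attach. Positions are integer stud cells (x, y, z), with y as
# the layer; all names here are illustrative, not from the MEPNet paper.

def candidate_placements(model, piece_size):
    """Enumerate lattice-aligned positions where a piece of size
    (width, depth) could sit without overlapping existing bricks,
    resting either on studs below or on the ground plane (y == 0)."""
    w, d = piece_size
    placements = []
    top = max((y for (_, y, _) in model), default=-1)
    for y in range(top + 2):              # any layer up to one above the top
        for x in range(-10, 11):
            for z in range(-10, 11):
                cells = {(x + i, y, z + j)
                         for i in range(w) for j in range(d)}
                if cells & model:         # overlaps an existing brick
                    continue
                below = {(cx, y - 1, cz) for (cx, _, cz) in cells}
                if y == 0 or below & model:   # supported by studs or ground
                    placements.append((x, y, z))
    return placements

# Usage: a single 2x2 brick on the ground; where can a 1x2 piece attach?
base = {(0, 0, 0), (1, 0, 0), (0, 0, 1), (1, 0, 1)}
spots = candidate_placements(base, (1, 2))
```

Even this crude filter turns an unbounded continuous search into a short list of discrete lattice positions, which is the intuition behind using stud locations to narrow down attachments.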

Can you drop a pile of plastic bricks and a manual in front of a robot arm and expect to come back to a completed model in a few hours? Not quite yet. The aim of this research was to translate the 2D images of a Lego manual into assembly steps a machine can functionally understand. Teaching a robot to physically handle and assemble Lego bricks is another challenge entirely – this is just the first step – although we’re not sure any Lego fans would want to hand off the actual building process to a machine.

Where this research could have more interesting applications is in automatically converting older Lego instruction manuals into the interactive 3D building guides now included in the Lego mobile app. And with a better understanding of how to translate 2D images into 3D brick structures, the framework could be used to develop software that takes images of any object and generates instructions for turning it into a Lego model.
