Science & Technology
A method for designing neural networks optimally suited for certain tasks
With the right building blocks, machine-learning models can more accurately perform tasks like fraud detection or spam filtering
Written by Adam Zewe, MIT News Office
Neural networks, a type of machine-learning model, are being used to help humans complete a wide variety of tasks, from predicting if someone’s credit score is high enough to qualify for a loan to diagnosing whether a patient has a certain disease. But researchers still have only a limited understanding of how these models work. Whether a given model is optimal for a certain task remains an open question.
MIT researchers have found some answers. They conducted an analysis of neural networks and proved that they can be designed so they are “optimal,” meaning they minimize the probability of misclassifying borrowers or patients into the wrong category when the networks are given a lot of labeled training data. To achieve optimality, these networks must be built with a specific architecture.
The researchers discovered that, in certain situations, the building blocks that enable a neural network to be optimal are not the ones developers use in practice. These optimal building blocks, derived through the new analysis, are unconventional and haven’t been considered before, the researchers say.
In a paper published this week in the Proceedings of the National Academy of Sciences, they describe these optimal building blocks, called activation functions, and show how they can be used to design neural networks that achieve better performance on any dataset. The results hold even as the neural networks grow very large. This work could help developers select the correct activation function, enabling them to build neural networks that classify data more accurately in a wide range of application areas, explains senior author Caroline Uhler, a professor in the Department of Electrical Engineering and Computer Science (EECS).
“While these are new activation functions that have never been used before, they are simple functions that someone could actually implement for a particular problem. This work really shows the importance of having theoretical proofs. If you go after a principled understanding of these models, that can actually lead you to new activation functions that you would otherwise never have thought of,” says Uhler, who is also co-director of the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard, and a researcher at MIT’s Laboratory for Information and Decision Systems (LIDS) and its Institute for Data, Systems and Society (IDSS).
Joining Uhler on the paper are lead author Adityanarayanan Radhakrishnan, an EECS graduate student and an Eric and Wendy Schmidt Center Fellow, and Mikhail Belkin, a professor in the Halicioğlu Data Science Institute at the University of California at San Diego.
A neural network is a type of machine-learning model that is loosely based on the human brain. Many layers of interconnected nodes, or neurons, process data. Researchers train a network to complete a task by showing it millions of examples from a dataset.
For instance, a network that has been trained to classify images into categories, say dogs and cats, is given an image that has been encoded as numbers. The network performs a series of complex multiplication operations, layer by layer, until the result is just one number. If that number is positive, the network classifies the image as a dog, and if it is negative, a cat.
Activation functions help the network learn complex patterns in the input data. They do this by applying a transformation to the output of one layer before data are sent to the next layer. When researchers build a neural network, they select one activation function to use. They also choose the width of the network (how many neurons are in each layer) and the depth (how many layers are in the network).
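The description above can be made concrete with a minimal sketch. It is illustrative only and is not the researchers' code: the function names, the random weights, and the choice of ReLU as the "standard" activation are all assumptions made for this example.

```python
import math
import random

def relu(x):
    """A commonly used activation: keep positive values, clamp negatives to zero."""
    return max(0.0, x)

def forward(x, layers, activation):
    """Run input vector x through the network and return a single number.

    `layers` is a list of weight matrices (each a list of rows). The activation
    is applied to the output of every layer except the last one.
    """
    h = list(x)
    for i, weights in enumerate(layers):
        h = [sum(w * v for w, v in zip(row, h)) for row in weights]
        if i < len(layers) - 1:
            h = [activation(v) for v in h]
    return h[0]

def make_network(n_inputs, width, depth, seed=0):
    """Random weights for `depth` hidden layers of `width` neurons plus one output."""
    rng = random.Random(seed)
    sizes = [n_inputs] + [width] * depth + [1]
    return [
        [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
         for _ in range(n_out)]
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

def classify(x, layers, activation=relu):
    """Positive output means 'dog'; anything else means 'cat'."""
    return "dog" if forward(x, layers, activation) > 0 else "cat"
```

Swapping a different function in for `relu` changes the activation without touching the width or the depth, which are set separately in `make_network`.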
“It turns out that, if you take the standard activation functions that people use in practice, and keep increasing the depth of the network, it gives you really terrible performance. We show that if you design with different activation functions, as you get more data, your network will get better and better,” says Radhakrishnan.
He and his collaborators studied a situation in which a neural network is infinitely deep and wide — which means the network is built by continually adding more layers and more nodes — and is trained to perform classification tasks. In classification, the network learns to place data inputs into separate categories.
“A clean picture”
After conducting a detailed analysis, the researchers determined that there are only three ways this kind of network can learn to classify inputs. One method classifies an input based on the majority of inputs in the training data; if there are more dogs than cats, it will decide every new input is a dog. Another method classifies by choosing the label (dog or cat) of the training data point that most resembles the new input.
The third method classifies a new input based on a weighted average of all the training data points that are similar to it. Their analysis shows that this is the only method of the three that leads to optimal performance. They identified a set of activation functions that always use this optimal classification method.
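The three classifiers described above can be sketched in a few lines. This is a simplified illustration, not the formulas from the paper: the labels (+1 for dog, -1 for cat), the Euclidean distance, the Gaussian weighting, and the `bandwidth` parameter are assumptions chosen for the example.

```python
import math

def majority_vote(train_x, train_y, x):
    """Ignore x entirely and predict whichever label is more common."""
    return 1 if sum(train_y) >= 0 else -1

def nearest_neighbor(train_x, train_y, x):
    """Copy the label of the single training point closest to x."""
    i = min(range(len(train_x)), key=lambda j: math.dist(train_x[j], x))
    return train_y[i]

def weighted_average(train_x, train_y, x, bandwidth=1.0):
    """Average all labels, giving training points near x more weight."""
    weights = [math.exp(-(math.dist(p, x) / bandwidth) ** 2) for p in train_x]
    score = sum(w * y for w, y in zip(weights, train_y))
    return 1 if score >= 0 else -1
```

In this sketch, the weighted average depends on the new input (unlike the majority vote) but is less sensitive to one mislabeled neighbor than the nearest-neighbor rule.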
“That was one of the most surprising things — no matter what you choose for an activation function, it is just going to be one of these three classifiers. We have formulas that will tell you explicitly which of these three it is going to be. It is a very clean picture,” he says.
They tested this theory on several classification benchmarking tasks and found that it led to improved performance in many cases. Neural network builders could use their formulas to select an activation function that yields improved classification performance, Radhakrishnan says.
In the future, the researchers want to use what they’ve learned to analyze situations where they have a limited amount of data and for networks that are not infinitely wide or deep. They also want to apply this analysis to situations where data do not have labels.
“In deep learning, we want to build theoretically grounded models so we can reliably deploy them in some mission-critical setting. This is a promising approach at getting toward something like that — building architectures in a theoretically grounded way that translates into better results in practice,” he says.
This work was supported, in part, by the National Science Foundation, Office of Naval Research, the MIT-IBM Watson AI Lab, the Eric and Wendy Schmidt Center at the Broad Institute, and a Simons Investigator Award.
Speedy robo-gripper reflexively organizes cluttered spaces
Rather than start from scratch after a failed attempt, the pick-and-place robot adapts in the moment to get a better hold
Written by Jennifer Chu, MIT News Office
When manipulating an arcade claw, a player can plan all she wants. But once she presses the joystick button, it’s a game of wait-and-see. If the claw misses its target, she’ll have to start from scratch for another chance at a prize.
The slow and deliberate approach of the arcade claw is similar to state-of-the-art pick-and-place robots, which use high-level planners to process visual images and plan out a series of moves to grab for an object. If a gripper misses its mark, it’s back to the starting point, where the controller must map out a new plan.
Looking to give robots a more nimble, human-like touch, MIT engineers have now developed a gripper that grasps by reflex. Rather than start from scratch after a failed attempt, the team’s robot adapts in the moment to reflexively roll, palm, or pinch an object to get a better hold. It’s able to carry out these “last centimeter” adjustments (a riff on the “last mile” delivery problem) without engaging a higher-level planner, much like how a person might fumble in the dark for a bedside glass without much conscious thought.
The new design is the first to incorporate reflexes into a robotic planning architecture. For now, the system is a proof of concept and provides a general organizational structure for embedding reflexes into a robotic system. Going forward, the researchers plan to program more complex reflexes to enable nimble, adaptable machines that can work with and among humans in ever-changing settings.
“In environments where people live and work, there’s always going to be uncertainty,” says Andrew SaLoutos, a graduate student in MIT’s Department of Mechanical Engineering. “Someone could put something new on a desk or move something in the break room or add an extra dish to the sink. We’re hoping a robot with reflexes could adapt and work with this kind of uncertainty.”
SaLoutos and his colleagues will present a paper on their design in May at the IEEE International Conference on Robotics and Automation (ICRA). His MIT co-authors include postdoc Hongmin Kim, graduate student Elijah Stanger-Jones, Menglong Guo SM ’22, and professor of mechanical engineering Sangbae Kim, the director of the Biomimetic Robotics Laboratory at MIT.
High and low
Many modern robotic grippers are designed for relatively slow and precise tasks, such as repetitively fitting together the same parts on a factory assembly line. These systems depend on visual data from onboard cameras; processing that data limits a robot’s reaction time, particularly if it needs to recover from a failed grasp.
“There’s no way to short-circuit out and say, oh shoot, I have to do something now and react quickly,” SaLoutos says. “Their only recourse is just to start again. And that takes a lot of time computationally.”
In their new work, Kim’s team built a more reflexive and reactive platform, using fast, responsive actuators that they originally developed for the group’s mini cheetah — a nimble, four-legged robot designed to run, leap, and quickly adapt its gait to various types of terrain.
The team’s design includes a high-speed arm and two lightweight, multijointed fingers. In addition to a camera mounted to the base of the arm, the team incorporated custom high-bandwidth sensors at the fingertips that instantly record the force and location of any contact as well as the proximity of the finger to surrounding objects more than 200 times per second.
The researchers designed the robotic system such that a high-level planner initially processes visual data of a scene, marking an object’s current location where the gripper should pick the object up, and the location where the robot should place it down. Then, the planner sets a path for the arm to reach out and grasp the object. At this point, the reflexive controller takes over.
Rather than have the gripper back out and start again after a failed grasp, as most grippers do, the team wrote an algorithm that instructs the robot to quickly act out any of three grasp maneuvers, which they call “reflexes,” in response to real-time measurements at the fingertips. The three reflexes kick in within the last centimeter of the robot approaching an object and enable the fingers to grab, pinch, or drag an object until the gripper has a better hold.
They programmed the reflexes to be carried out without having to involve the high-level planner. Instead, the reflexes are organized at a lower decision-making level, so that they can respond as if by instinct, rather than having to carefully evaluate the situation to plan an optimal fix.
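The two-level arrangement described above, in which a planner sets the approach and a low-level loop reacts to fingertip readings, might be sketched as follows. Everything here is hypothetical: the `FingertipReading` fields, the thresholds, and the rules mapping readings to the three reflex names are invented for illustration and are not the team's algorithm.

```python
from dataclasses import dataclass

@dataclass
class FingertipReading:
    force: float           # contact force, arbitrary units (hypothetical)
    contact_offset: float  # contact position on the fingertip, 0 = center
    proximity_cm: float    # distance to the nearest surface, in centimeters

def choose_reflex(left, right):
    """Pick a reflex from two fingertip readings, or None if no reflex applies.

    The thresholds and rules below are made up for this sketch.
    """
    touching_left = left.force > 0.1
    touching_right = right.force > 0.1
    if touching_left and touching_right:
        if abs(left.contact_offset) > 0.5 or abs(right.contact_offset) > 0.5:
            return "pinch"  # off-center contact: re-close on the object
        return None         # firm, centered contact on both fingers
    if touching_left or touching_right:
        return "drag"       # only one finger touching: pull the object inward
    if min(left.proximity_cm, right.proximity_cm) < 1.0:
        return "grab"       # object within the last centimeter but untouched
    return None             # nothing in reach: leave it to the planner

def reflex_loop(readings, max_steps=200):
    """React to each sensor sample in turn without calling the planner."""
    actions = []
    for left, right in readings[:max_steps]:
        reflex = choose_reflex(left, right)
        if reflex is None:
            break  # grasp is secure, or there is nothing to react to
        actions.append(reflex)
    return actions
```

The point of the structure is that `reflex_loop` never replans: each decision is a cheap lookup on the latest sensor sample, so it can run at the sensor rate.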
“It’s like how, instead of having the CEO micromanage and plan every single thing in your company, you build a trust system and delegate some tasks to lower-level divisions,” Kim says. “It may not be optimal, but it helps the company react much more quickly. In many cases, waiting for the optimal solution makes the situation much worse or irrecoverable.”
Cleaning via reflex
The team demonstrated the gripper’s reflexes by clearing a cluttered shelf. They set a variety of household objects on a shelf, including a bowl, a cup, a can, an apple, and a bag of coffee grounds. They showed that the robot was able to quickly adapt its grasp to each object’s particular shape and, in the case of the coffee grounds, squishiness. Out of 117 attempts, the gripper quickly and successfully picked and placed objects more than 90 percent of the time, without having to back out and start over after a failed grasp.
A second experiment showed how the robot could also react in the moment. When researchers shifted a cup’s position, the gripper, despite having no visual update of the new location, was able to readjust and essentially feel around until it sensed the cup in its grasp. Compared to a baseline grasping controller, the gripper’s reflexes increased the area of successful grasps by over 55 percent.
Now, the engineers are working to include more complex reflexes and grasp maneuvers in the system, with a view toward building a general pick-and-place robot capable of adapting to cluttered and constantly changing spaces.
“Picking up a cup from a clean table — that specific problem in robotics was solved 30 years ago,” Kim notes. “But a more general approach, like picking up toys in a toybox, or even a book from a library shelf, has not been solved. Now with reflexes, we think we can one day pick and place in every possible way, so that a robot could potentially clean up the house.”
This research was supported, in part, by Advanced Robotics Lab of LG Electronics and the Toyota Research Institute.
Researchers 3D print a miniature vacuum pump
The device would be a key component of a portable mass spectrometer that could help monitor pollutants or perform medical diagnoses in remote parts of the world
Written by Adam Zewe, MIT News Office
Mass spectrometers are extremely precise chemical analyzers that have many applications, from evaluating the safety of drinking water to detecting toxins in a patient’s blood. But building an inexpensive, portable mass spectrometer that could be deployed in remote locations remains a challenge, partly due to the difficulty of miniaturizing the vacuum pump it needs to operate.
MIT researchers utilized additive manufacturing to take a major step toward solving this problem. They 3D printed a miniature version of a type of vacuum pump, known as a peristaltic pump, that is about the size of a human fist.
Their pump can create and maintain a vacuum that has an order of magnitude lower pressure than another type of commonly used pump. The unique design, which can be printed in one pass on a multimaterial 3D printer, prevents fluid or gas from leaking while minimizing heat from friction during the pumping process. This increases the lifetime of the device.
This pump could be incorporated into a portable mass spectrometer used to monitor soil contamination in isolated parts of the world, for instance. The device could also be ideal for use in geological survey equipment bound for Mars, since it would be cheaper to launch the lightweight pump into space.
“We are talking about very inexpensive hardware that is also very capable. With mass spectrometers, the 500-pound gorilla in the room has always been the issue of pumps. What we have shown here is groundbreaking, but it is only possible because it is 3D-printed. If we wanted to do this the standard way, we wouldn’t have been anywhere close,” says Luis Fernando Velásquez-García, a principal scientist in MIT’s Microsystems Technology Laboratories (MTL) and senior author of a paper describing the new pump.
Velásquez-García is joined on the paper by lead author Han-Joo Lee, a former MIT postdoc; and Jorge Cañada Pérez-Sala, an electrical engineering and computer science graduate student. The paper appears today in Additive Manufacturing.
As a sample is pumped through a mass spectrometer, it is hit with an electric charge to turn its atoms into ions. An electromagnetic field manipulates these ions in a vacuum so their masses can be determined. This information can be used to identify the molecules in the sample. Maintaining the vacuum is key because if the ions collide with gas molecules from the air, their dynamics will change.
Peristaltic pumps are commonly used to move fluids or gases that would contaminate the pump’s components, such as reactive chemicals. The substance is entirely contained within a flexible tube that is looped around a set of rollers. The rollers squeeze the tube against its housing as they rotate. The pinched parts of the tube expand in the wake of the rollers, creating a vacuum that draws the liquid or gas through the tube.
While the pumps do create a vacuum, design problems have limited their use in mass spectrometers. The tube material redistributes when force is applied by the rollers, leading to gaps that cause leaks. This problem can be overcome by operating the pump rapidly, forcing the fluid through faster than it can leak out. But this causes excessive heat that damages the pump, and the gaps remain. To fully seal the tube and create the vacuum needed for a mass spectrometer, the mechanism must exert additional force to squeeze the bulged areas, causing more damage, explains Velásquez-García.
An additive solution
He and his team rethought the peristaltic pump design from the bottom up, looking for ways they could use additive manufacturing to make improvements. First, by using a multimaterial 3D printer, they were able to make the flexible tube out of a special type of hyperelastic material that can withstand a huge amount of deformation.
Then, through an iterative design process, they determined that adding notches to the walls of the tube would reduce the stress on the material when squeezed. With notches, the tube material does not need to redistribute to counteract the force from the rollers.
The precision afforded by 3D printing enabled the researchers to produce the exact notch size needed to eliminate the gaps. They were also able to vary the tube’s thickness so the walls are stronger in areas where connectors attach, further reducing stress on the material.
Using a multimaterial 3D printer, they printed the entire tube in one pass, which is important since postassembly can introduce defects that can cause leaks. To do this, they had to find a way to print the narrow, flexible tube vertically while preventing it from wobbling during the process. In the end, they created a lightweight structure that stabilizes the tube during printing but can be easily peeled off later without damaging the device.
“One of the key advantages of using 3D printing is that it allows us to aggressively prototype. If you do this work in a clean room, where a lot of these miniaturized pumps are made, it takes a lot of time. If you want to make a change, you have to start the entire process over. In this case, we can print our pump in a matter of hours, and every time it can be a new design,” Velásquez-García says.
Portable, yet performant
When they tested their final design, the researchers found that it was able to create a vacuum that had an order of magnitude lower pressure than state-of-the-art diaphragm pumps. Lower pressure yields a higher-quality vacuum. To reach that same pressure with standard pumps, one would need to connect three in series, Velásquez-García says.
The pump reached a maximum temperature of 50 degrees Celsius, half that of state-of-the-art pumps used in other studies, and only required half as much force to fully seal the tube.
In the future, the researchers plan to explore ways to further reduce the maximum temperature, which would enable the tube to actuate faster, creating a better vacuum and increasing the flow rate. They are also working to 3D print an entire miniaturized mass spectrometer. As they develop that device, they will continue fine-tuning the specifications of the peristaltic pump.
“Some people think that when you 3D print something there must be some kind of tradeoff. But here our group has shown that is not the case. It really is a new paradigm. Additive manufacturing is not going to solve all the problems of the world, but it is a solution that has real legs,” Velásquez-García says.
This work was supported, in part, by the Empiriko Corporation.
Minuscule device could help preserve the battery life of tiny sensors
Researchers demonstrate a low-power “wake-up” receiver one-tenth the size of other devices
Written by Adam Zewe, MIT News Office
Scientists are striving to develop ever-smaller internet-of-things devices, like sensors tinier than a fingertip that could make nearly any object trackable. These diminutive sensors have minuscule batteries that are often nearly impossible to replace, so engineers incorporate wake-up receivers that keep devices in low-power “sleep” mode when not in use, preserving battery life.
Researchers at MIT have developed a new wake-up receiver that is less than one-tenth the size of previous devices and consumes only a few microwatts of power. Their receiver also incorporates a low-power, built-in authentication system, which protects the device from a certain type of attack that could quickly drain its battery.
Many common types of wake-up receivers are built on the centimeter scale since their antennas must be proportional to the wavelength of the radio waves they use to communicate. Instead, the MIT team built a receiver that utilizes terahertz waves, which are about one-tenth the length of radio waves. Their chip is barely more than 1 square millimeter in size.
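The link between frequency and antenna scale follows from the standard relation wavelength = speed of light / frequency. The short calculation below illustrates it. The two frequencies are assumptions picked only to reproduce the one-tenth ratio mentioned above; the article does not state the frequencies involved.

```python
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def wavelength_mm(frequency_hz):
    """Free-space wavelength in millimeters for a given frequency in hertz."""
    return SPEED_OF_LIGHT / frequency_hz * 1000.0

# Illustrative values only (assumed, not taken from the paper).
radio_mm = wavelength_mm(30e9)       # a 30 GHz signal
terahertz_mm = wavelength_mm(300e9)  # a 0.3 THz signal
```

With these assumed values, the 30 GHz wavelength is about 10 millimeters and the 0.3 THz wavelength is about 1 millimeter, so an antenna sized to the wavelength shrinks by the same factor of 10.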
They used their wake-up receiver to demonstrate effective, wireless communication with a signal source that was several meters away, showcasing a range that would enable their chip to be used in miniaturized sensors.
For instance, the wake-up receiver could be incorporated into microrobots that monitor environmental changes in areas that are either too small or hazardous for other robots to reach. Also, since the device uses terahertz waves, it could be utilized in emerging applications, such as field-deployable radio networks that work as swarms to collect localized data.
“By using terahertz frequencies, we can make an antenna that is only a few hundred micrometers on each side, which is a very small size. This means we can integrate these antennas to the chip, creating a fully integrated solution. Ultimately, this enabled us to build a very small wake-up receiver that could be attached to tiny sensors or radios,” says Eunseok Lee, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on the wake-up receiver.
Lee wrote the paper with his co-advisors and senior authors Anantha Chandrakasan, dean of the MIT School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science, who leads the Energy-Efficient Circuits and Systems Group, and Ruonan Han, an associate professor in EECS, who leads the Terahertz Integrated Electronics Group in the Research Laboratory of Electronics; as well as others at MIT, the Indian Institute of Science, and Boston University. The research is being presented at the IEEE Custom Integrated Circuits Conference.
Scaling down the receiver
Terahertz waves, found on the electromagnetic spectrum between microwaves and infrared light, have much higher frequencies, and therefore much shorter wavelengths, than radio waves. Sometimes called “pencil beams,” terahertz waves travel in a more direct path than other signals, which makes them more secure, Lee explains.
However, the waves have such high frequencies that terahertz receivers often multiply the terahertz signal by another signal to alter the frequency, a process known as frequency mixing modulation. Terahertz mixing consumes a great deal of power.
Instead, Lee and his collaborators developed a zero-power-consumption detector that can detect terahertz waves without the need for frequency mixing. The detector uses a pair of tiny transistors as antennas, which consume very little power.
Even with both antennas on the chip, their wake-up receiver was only 1.54 square millimeters in size and consumed less than 3 microwatts of power. This dual-antenna setup maximizes performance and makes it easier to read signals.
Once received, their chip amplifies a terahertz signal and then converts analog data into a digital signal for processing. This digital signal carries a token, which is a string of bits (0s and 1s). If the token corresponds to the wake-up receiver’s token, it will activate the device.
Ramping up security
In most wake-up receivers, the same token is reused multiple times, so an eavesdropping attacker could figure out what it is. Then the hacker could send a signal that would activate the device over and over again, using what is called a denial-of-sleep attack.
“With a wake-up receiver, the lifetime of a device could be improved from one day to one month, for instance, but an attacker could use a denial-of-sleep attack to drain that entire battery life in even less than a day. That is why we put authentication into our wake-up receiver,” he explains.
They added an authentication block that utilizes an algorithm to randomize the device’s token each time, using a key that is shared with trusted senders. This key acts like a password — if a sender knows the password, they can send a signal with the right token. The researchers do this using a technique known as lightweight cryptography, which ensures the entire authentication process only consumes a few extra nanowatts of power.
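The idea of a token that changes on every wake-up can be sketched with a counter and a keyed hash. This is a stand-in, not the chip's design: the article does not name the algorithm, and the HMAC-SHA-256 call from Python's standard library, the 4-byte token length, and the synchronized counter are all assumptions made for illustration. A real low-power chip would use a dedicated lightweight cipher instead.

```python
import hashlib
import hmac

def token_for(key: bytes, counter: int, n_bytes: int = 4) -> bytes:
    """Derive the wake-up token for one counter value from the shared key."""
    digest = hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
    return digest[:n_bytes]

class WakeUpReceiver:
    """Hypothetical receiver that accepts each token at most once."""

    def __init__(self, key: bytes):
        self.key = key
        self.counter = 0  # advances after every successful wake-up

    def receive(self, token: bytes) -> bool:
        """Wake only if the token matches the one expected right now."""
        expected = token_for(self.key, self.counter)
        if hmac.compare_digest(token, expected):
            self.counter += 1  # the token just used will not be accepted again
            return True
        return False
```

A trusted sender holding the same key computes `token_for(key, counter)` for the current counter. An eavesdropper who replays a captured token is rejected because the receiver has already moved on to the next one, which is what blunts a denial-of-sleep attack in this sketch.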
They tested their device by sending terahertz signals to the wake-up receiver as they increased the distance between the chip and the terahertz source. In this way, they tested the sensitivity of their receiver — the minimum signal power needed for the device to successfully detect a signal. Signals that travel farther have less power.
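The relationship between distance, received power, and sensitivity can be illustrated with the textbook Friis free-space equation. This is an idealized sketch, not the researchers' measurement setup: it assumes unobstructed line-of-sight propagation, and the transmit power, gains, and frequency in any call are placeholders supplied by the reader.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def received_power_w(tx_power_w, distance_m, frequency_hz, tx_gain=1.0, rx_gain=1.0):
    """Friis free-space estimate of the received power, in watts."""
    wavelength = SPEED_OF_LIGHT / frequency_hz
    return tx_power_w * tx_gain * rx_gain * (wavelength / (4 * math.pi * distance_m)) ** 2

def max_range_m(tx_power_w, sensitivity_w, frequency_hz, tx_gain=1.0, rx_gain=1.0):
    """Largest distance at which the received power still meets the sensitivity."""
    wavelength = SPEED_OF_LIGHT / frequency_hz
    return wavelength / (4 * math.pi) * math.sqrt(tx_power_w * tx_gain * rx_gain / sensitivity_w)
```

In this model the received power falls with the square of the distance, so doubling the distance cuts the power to one quarter, and a receiver with better sensitivity (a smaller `sensitivity_w`) can be reached from farther away.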
“We achieved 5- to 10-meter longer distance demonstrations than others, using a device with a very small size and microwatt level power consumption,” Lee says.
But to be most effective, terahertz waves need to hit the detector dead-on. If the chip is at an angle, some of the signal will be lost. So, the researchers paired their device with a terahertz beam-steerable array, recently developed by the Han group, to precisely direct the terahertz waves. Using this technique, communication could be sent to multiple chips with minimal signal loss.
In the future, Lee and his collaborators want to tackle this problem of signal degradation. If they can find a way to maintain signal strength when receiver chips move or tilt slightly, they could increase the performance of these devices. They also want to demonstrate their wake-up receiver in very small sensors and fine-tune the technology for use in real-world devices.
“We have developed a rich technology portfolio for future millimeter-sized sensing, tagging, and authentication platforms, including terahertz backscattering, energy harvesting, and electrical beam steering and focusing. Now, this portfolio is more complete with Eunseok’s first-ever terahertz wake-up receiver, which is critical to save the extremely limited energy available on those mini platforms,” Han says.
Additional co-authors include Muhammad Ibrahim Wasiq Khan PhD ’22; Xibi Chen, an EECS graduate student; Ustav Banerjee PhD ’21, an assistant professor at the Indian Institute of Science; Nathan Monroe PhD ’22; and Rabia Tugce Yazicigil, an assistant professor of electrical and computer engineering at Boston University.