Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory have developed a new computational model of a neural circuit in the brain, which could shed light on the biological role of inhibitory neurons — neurons that keep other neurons from firing.
The model describes a neural circuit consisting of an array of input neurons and an equivalent number of output neurons. The circuit performs what neuroscientists call a “winner-take-all” operation, in which signals from multiple input neurons induce a signal in just one output neuron.
Using the tools of theoretical computer science, the researchers prove that, within the context of their model, a certain configuration of inhibitory neurons provides the most efficient means of enacting a winner-take-all operation. Because the model makes empirical predictions about the behavior of inhibitory neurons in the brain, it offers a good example of the way in which computational analysis could aid neuroscience.
The researchers will present their results this week at the conference on Innovations in Theoretical Computer Science. Nancy Lynch, the NEC Professor of Software Science and Engineering at MIT, is the senior author on the paper. She’s joined by Merav Parter, a postdoc in her group, and Cameron Musco, an MIT graduate student in electrical engineering and computer science.
For years, Lynch’s group has studied communication and resource allocation in ad hoc networks — networks whose members are continually leaving and rejoining. But recently, the team has begun using the tools of network analysis to investigate biological phenomena.
“There’s a close correspondence between the behavior of networks of computers or other devices like mobile phones and that of biological systems,” Lynch says. “We’re trying to find problems that can benefit from this distributed-computing perspective, focusing on algorithms for which we can prove mathematical properties.”
In recent years, artificial neural networks — computer models roughly based on the structure of the brain — have been responsible for some of the most rapid improvements in artificial-intelligence systems, from speech transcription to face-recognition software.
An artificial neural network consists of “nodes” that, like individual neurons, have limited information-processing power but are densely interconnected. Data are fed into the first layer of nodes. If the data received by a given node meet some threshold criterion — for instance, if they exceed a particular value — the node “fires,” or sends signals along all of its outgoing connections.
Each of those outgoing connections, however, has an associated “weight,” which can augment or diminish a signal. Each node in the next layer of the network receives weighted signals from multiple nodes in the first layer; it adds them together, and again, if their sum exceeds some threshold, it fires. Its outgoing signals pass to the next layer, and so on.
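As a rough sketch of that computation (not code from the paper, and with made-up weights and thresholds), a single layer of threshold nodes might look like this in Python:

```python
def node_output(inputs, weights, threshold):
    # Weighted sum of incoming signals; the node "fires" (outputs 1) only if the sum crosses its threshold.
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0

def layer_output(inputs, weight_rows, thresholds):
    # Every node in the layer sees the same inputs but applies its own weights and threshold.
    return [node_output(inputs, w, t) for w, t in zip(weight_rows, thresholds)]

# Two nodes receiving three input signals (all values invented for illustration).
print(layer_output([0.5, 1.0, 0.2],
                   [[0.8, -0.4, 1.1], [0.3, 0.9, -0.7]],
                   [0.5, 0.5]))  # prints [0, 1]: only the second node's weighted sum exceeds its threshold
```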
In artificial-intelligence applications, a neural network is “trained” on sample data, constantly adjusting its weights and firing thresholds until the output of its final layer consistently represents the solution to some computational problem.
Lynch, Parter, and Musco made several modifications to this design to make it more biologically plausible. The first was the addition of inhibitory “neurons.” In a standard artificial neural network, the weights on a node’s connections are usually all positive, or else allowed to be either positive or negative. But in the brain, some neurons appear to play a purely inhibitory role, preventing other neurons from firing. The MIT researchers modeled those neurons as nodes whose connections have only negative weights.
Many artificial-intelligence applications also use “feed-forward” networks, in which signals pass through the network in only one direction, from the first layer, which receives input data, to the last layer, which provides the result of a computation. But connections in the brain are much more complex. Lynch, Parter, and Musco’s circuit thus includes feedback: Signals from the output neurons pass to the inhibitory neurons, whose output in turn passes back to the output neurons. The signaling of the output neurons also feeds back on itself, which proves essential to enacting the winner-take-all strategy.
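One way to get intuition for how inhibitory feedback can single out a winner is a toy simulation (purely illustrative, with invented parameters and stochastic firing; it is not the circuit or the proof technique in the researchers' paper). A single inhibitory node receives signals from every output node and feeds a negative-weight signal back to all of them, while each output node also excites itself:

```python
import math
import random

def winner_take_all(inputs, max_rounds=500, threshold=1.0,
                    self_weight=0.5, inhib_weight=0.4, steepness=20.0):
    # Toy winner-take-all dynamics: each output node gets its input, positive
    # self-feedback if it fired last round, and negative feedback from one
    # inhibitory node whose activity scales with how many outputs are firing.
    firing = [False] * len(inputs)
    for _ in range(max_rounds):
        inhibition = inhib_weight * sum(firing)          # inhibitory feedback
        new_firing = []
        for x, was_firing in zip(inputs, firing):
            drive = x + (self_weight if was_firing else 0.0) - inhibition
            p = 1.0 / (1.0 + math.exp(-steepness * (drive - threshold)))
            new_firing.append(random.random() < p)       # stochastic spiking
        firing = new_firing
        if sum(firing) == 1:                             # a single winner has emerged
            break
    return firing

print(winner_take_all([1.2, 0.9, 1.3, 1.1]))  # typically exactly one True
```

The self-feedback keeps a lone winner firing, while the inhibition it triggers holds the other outputs below threshold.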
“When you’re part of a community, you want to leave it better than you found it,” says Keertan Kini, an MEng student in the Department of Electrical Engineering and Computer Science, or Course 6. That philosophy has guided Kini throughout his years at MIT, as he works to improve policy both inside and outside of MIT.
As a member of the Undergraduate Student Advisory Group, former chair of the Course 6 Underground Guide Committee, a member of the Internet Policy Research Initiative (IPRI), and a member of the Advanced Network Architecture group, Kini has focused his research on finding ways that technology and policy can work together. As Kini puts it, “there can be unintended consequences when you don’t have technology makers who are talking to policymakers and you don’t have policymakers talking to technologists.” His goal is to get them talking to each other.
Kini first became interested in politics at 14. He volunteered for President Obama’s 2008 campaign, making calls and putting up posters. “That was the point I became civically engaged,” says Kini. He went on to campaign for a ballot initiative to raise more funding for his high school, and he has remained interested in public policy ever since.
High school was also where Kini became interested in computer science. He took a computer science class on the recommendation of his sister, and in his senior year he started watching computer science lectures on MIT OpenCourseWare (OCW) by Hal Abelson, a professor in MIT’s Department of Electrical Engineering and Computer Science.
“That lecture reframed what computer science was. I loved it,” Kini recalls. “The professor said ‘it’s not about computers, and it’s not about science.’ It might be an art or engineering, but it’s not science, because what we’re working with are idealized components, and ultimately the power of what we can actually achieve with them is based not so much on physical limitations as on the limitations of the mind.”
In part thanks to Abelson’s OCW lectures, Kini came to MIT to study electrical engineering and computer science. He is now pursuing an MEng in the same field, a fifth-year master’s program that follows his undergraduate studies.
Combining two disciplines
Kini set his policy interest to the side his freshman year, until he took 6.805J (Foundations of Information Policy), with Abelson, the same professor who inspired Kini to study computer science. After taking Abelson’s course, Kini joined him and Daniel Weitzner, a principal research scientist in the Computer Science and Artificial Intelligence Laboratory, in putting together a big data and privacy workshop for the White House in the wake of the Edward Snowden leak of classified information from the National Security Agency. Four years later, Kini is now a teaching assistant for 6.805J.
With Weitzner as his advisor, Kini went on to work on a SuperUROP, an advanced version of the Undergraduate Research Opportunities Program in which students take on their own research project for a full year. Kini’s project focused on making it easier for organizations that had experienced a cybersecurity breach to share how the breach happened with other organizations, without accidentally sharing private or confidential information as well.
Typically, when a security breach happens, there is a “human bottleneck,” as Kini puts it. Humans have to manually check all information they share with other organizations to ensure they don’t share private information or get themselves into legal hot water. The process is time-consuming, slowing down the improvement of cybersecurity for all organizations involved. Kini created a prototype of a system that could automatically screen information about cybersecurity breaches, determining what data had to be checked by a human, and what was safe to send along.
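A heavily simplified sketch of that kind of screening, in Python, might flag reports containing sensitive strings for human review (the patterns and fields here are hypothetical illustrations, not taken from Kini's prototype):

```python
import re

# Hypothetical patterns for data an organization would not want to share.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "internal_ip": re.compile(r"\b10\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"),
}

def screen_report(text):
    # Only reports that trip a pattern get routed to a human reviewer.
    hits = [name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)]
    return {"needs_human_review": bool(hits), "flags": hits}

print(screen_report("Attacker pivoted from 10.0.3.7 after phishing admin@example.com"))
```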
Machines that predict the future, robots that patch wounds, and wireless emotion-detectors are just a few of the exciting projects that came out of MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) this year. Here’s a sampling of 16 highlights from 2016 that span the many computer science disciplines that make up CSAIL.
Robots for exploring Mars — and your stomach
- A team led by CSAIL director Daniela Rus developed an ingestible origami robot that unfolds in the stomach to patch wounds and remove swallowed batteries.
- Researchers are working on NASA’s humanoid robot, “Valkyrie,” which will be programmed to travel into outer space and perform tasks autonomously.
- A 3-D printed robot was made of both solids and liquids and printed in a single step, with no assembly required.
Keeping data safe and secure
- CSAIL hosted a cyber summit that convened members of academia, industry, and government, including featured speakers Admiral Michael Rogers, director of the National Security Agency; and Andrew McCabe, deputy director of the Federal Bureau of Investigation.
- Researchers came up with a system for staying anonymous online that uses less bandwidth to transfer large files between anonymous users.
- A deep-learning system called AI2 was shown to be able to predict 85 percent of cyberattacks with the help of some human input.
Advancements in computer vision
- A new imaging technique called Interactive Dynamic Video lets you reach in and “touch” objects in videos using a normal camera.
- Researchers from CSAIL and Israel’s Weizmann Institute of Science produced a movie display called Cinema 3D that uses special lenses and mirrors to allow viewers to watch 3-D movies in a theater without having to wear those clunky 3-D glasses.
- A new deep-learning algorithm can predict human interactions more accurately than ever before, by training itself on footage from TV shows like “Desperate Housewives” and “The Office.”
- A group from MIT and Harvard University developed an algorithm that may help astronomers produce the first image of a black hole, stitching together telescope data to essentially turn the planet into one large telescope dish.
Tech to help with health
- A team produced a robot that can help schedule and assign tasks by learning from humans, in fields like medicine and the military.
- Researchers came up with an algorithm for identifying organs in fetal MRI scans to extensively evaluate prenatal health.
- A wireless device called EQ-Radio can tell if you’re excited, happy, angry, or sad, by measuring breathing and heart rhythms.
One way to handle big data is to shrink it. If you can identify a small subset of your data set that preserves its salient mathematical relationships, you may be able to perform useful analyses on it that would be prohibitively time consuming on the full set.
The methods for creating such “coresets” vary according to application, however. Last week, at the Annual Conference on Neural Information Processing Systems, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory and the University of Haifa in Israel presented a new coreset-generation technique that’s tailored to a whole family of data analysis tools with applications in natural-language processing, computer vision, signal processing, recommendation systems, weather prediction, finance, and neuroscience, among many others.
“These are all very general algorithms that are used in so many applications,” says Daniela Rus, the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science at MIT and senior author on the new paper. “They’re fundamental to so many problems. By figuring out the coreset for a huge matrix for one of these tools, you can enable computations that at the moment are simply not possible.”
As an example, in their paper the researchers apply their technique to a matrix — that is, a table — that maps every article on the English version of Wikipedia against every word that appears on the site. That’s 1.4 million articles, or matrix rows, and 4.4 million words, or matrix columns.
That matrix would be much too large to analyze using low-rank approximation, an algorithm that can deduce the topics of free-form texts. But with their coreset, the researchers were able to use low-rank approximation to extract clusters of words that denote the 100 most common topics on Wikipedia. The cluster that contains “dress,” “brides,” “bridesmaids,” and “wedding,” for instance, appears to denote the topic of weddings; the cluster that contains “gun,” “fired,” “jammed,” “pistol,” and “shootings” appears to designate the topic of shootings.
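The flavor of that low-rank topic extraction can be seen in a small sketch using scikit-learn on a made-up four-document corpus (this is ordinary latent semantic analysis, not the researchers' coreset construction, and the corpus is invented for illustration):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "the bride and bridesmaids wore dresses at the wedding",
    "the wedding dress was chosen by the bride",
    "the pistol jammed before it fired in the shooting",
    "police recovered the gun after the shootings",
]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)        # sparse document-by-word matrix

svd = TruncatedSVD(n_components=2)        # low-rank approximation
svd.fit(X)

terms = vectorizer.get_feature_names_out()
for i, component in enumerate(svd.components_):
    top = component.argsort()[::-1][:4]   # highest-weight words per "topic"
    print(f"topic {i}:", [terms[j] for j in top])
```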
Joining Rus on the paper are Mikhail Volkov, an MIT postdoc in electrical engineering and computer science, and Dan Feldman, director of the University of Haifa’s Robotics and Big Data Lab and a former postdoc in Rus’s group.
The researchers’ new coreset technique is useful for a range of tools with names like singular-value decomposition, principal-component analysis, and latent semantic analysis. But what they all have in common is dimension reduction: They take data sets with large numbers of variables and find approximations of them with far fewer variables.
In this respect, these tools are similar to coresets. But coresets are application-specific, while dimension-reduction tools are general-purpose. That generality makes them much more computationally intensive than coreset generation — too computationally intensive for practical application to large data sets.
The researchers believe that their technique could be used to winnow a data set with, say, millions of variables — such as descriptions of Wikipedia pages in terms of the words they use — to merely thousands. At that point, a widely used technique like principal-component analysis could reduce the number of variables to mere hundreds, or even lower.
The researchers’ technique works with what is called sparse data. Consider, for instance, the Wikipedia matrix, with its 4.4 million columns, each representing a different word. Any given article on Wikipedia will use only a few thousand distinct words. So in any given row — representing one article — only a few thousand matrix slots out of 4.4 million will have any values in them. In a sparse matrix, most of the values are zero.
Crucially, the new technique preserves that sparsity, which makes its coresets much easier to deal with computationally. Calculations become a lot easier when they involve a lot of multiplication by, and addition of, zeroes.
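A small sketch with SciPy shows the idea (the sizes are tiny stand-ins for the Wikipedia-scale matrix described above):

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
dense = np.zeros((1000, 5000))
rows = rng.integers(0, 1000, size=20000)
cols = rng.integers(0, 5000, size=20000)
dense[rows, cols] = rng.random(20000)      # well under 1 percent of entries are nonzero

sparse_matrix = sparse.csr_matrix(dense)   # stores only the nonzero entries
print(dense.nbytes)                        # 40 MB as a dense array
print(sparse_matrix.data.nbytes)           # a small fraction of that in sparse form

# Multiplying by a vector only touches the nonzero entries.
v = rng.random(5000)
result = sparse_matrix @ v
```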
During January of her junior year at MIT, Caroline Colbert chose to do a winter externship at Massachusetts General Hospital (MGH). Her job was to shadow the radiation oncology staff, including the doctors who care for patients and the medical physicists who design radiation treatment plans.
Colbert, now a senior in the Department of Nuclear Science and Engineering (NSE), had expected to pursue a career in nuclear power. But after working in a medical environment, she changed her plans.
She stayed at MGH to work on building a model to automate the generation of treatment plans for patients who will undergo a form of radiation therapy called volumetric-modulated arc therapy (VMAT). The work was so interesting that she is still involved with it and has now decided to pursue a doctoral degree in medical physics, a field that allows her to blend her training in nuclear science and engineering with her interest in medical technologies.
She has even narrowed her focus to schools with programs accredited by the Commission on Accreditation of Medical Physics Education Programs, so she’ll have the option of having a more direct impact on patients. “I don’t know yet if I’ll be more interested in clinical work, research, or both,” she says. “But my hope is to work in a hospital setting.”
Many NSE students and faculty focus on nuclear energy technologies. But, says Colbert, “the department is really supportive of students who want to go into other industries.”
It was as a middle school student that Colbert first became interested in engineering. Later, in a chemistry class, a lesson about nuclear decay set her on a path towards nuclear science and engineering. “I thought it was so cool that one element can turn into another,” she says. “You think of elements as the fundamental building blocks of the physical world.”
Colbert’s parents, both from the Boston area, had encouraged her to apply to MIT. They also encouraged her towards the medical field. “They loved the idea of me being a doctor, and then when I decided on nuclear engineering, they wanted me to look into medical physics,” she says. “I was trying to make my own way. But when I did look seriously into medical physics, I had to admit that my parents were right.”
At MGH, Colbert’s work began with searching for practical ways to improve the generation of VMAT treatment plans. As with another form of radiation therapy called intensity-modulated radiation therapy (IMRT), the technology focuses radiation doses on the tumor and away from the healthy tissue surrounding it. The more accurate the dosing, the fewer side effects patients have after therapy.
With VMAT, a main challenge is in devising an accurate individualized treatment plan. Each plan is customized specifically to the patient’s anatomy. This design process is well defined for IMRT, which uses a set of intersecting beams to deliver radiation. VMAT also intersects beams but rotates them around the patient. “There are more degrees of freedom, so it should provide more accurate treatment, but it’s also more computationally difficult to optimize an individual treatment plan,” says Colbert.
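At its core, this is an optimization problem: choose beam intensities so the delivered dose matches a prescription that is high in the tumor and low everywhere else. The toy example below (purely illustrative; it is not Colbert's model, and real VMAT planning involves far more physics, geometry, and clinical constraints) solves a miniature version with a nonnegative least-squares fit:

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)
n_voxels, n_beams = 60, 8
dose_per_beam = rng.random((n_voxels, n_beams))   # dose each beam deposits in each voxel

prescription = np.zeros(n_voxels)
prescription[:20] = 1.0   # first 20 voxels stand in for the tumor: full dose there, zero elsewhere

# Find nonnegative beam weights whose combined dose best matches the prescription.
beam_weights, residual = nnls(dose_per_beam, prescription)
print(beam_weights.round(2), round(residual, 3))
```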
People generally associate graphics processing units (GPUs) with image processing. Developed for video games in the 1990s, modern GPUs are specialized circuits with thousands of small, efficient processing units, or “cores,” that work simultaneously to rapidly render graphics on screen.
But for the better part of a decade, GPUs have also been used for general-purpose computing. Because of their incredible parallel-computing speeds and high-performance memory, GPUs are today used for advanced lab simulations and deep-learning programming, among other things.
Now, Todd Mostak, a former researcher at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), is using GPUs to develop an analytic database and visualization platform called MapD, which is the fastest of its kind in the world, according to Mostak.
MapD is essentially a commonly used type of database-management system, modified to run on GPUs instead of the central processing units (CPUs) that power most traditional database-management systems. By running on GPUs, MapD can process billions of data points in milliseconds, making it 100 times faster than traditional systems. Moreover, MapD visualizes all processed data points nearly instantaneously — such as, say, plotting tweets on a world map — and parameters can be modified on the fly to adjust the visualized display.
With its first product launched last March, MapD’s clients already include Verizon and other big-name telecommunications companies, a social media giant, and financial and advertising firms. In October, the investment arm of the U.S. Central Intelligence Agency, In-Q-Tel, announced that it had invested in MapD’s latest funding round to accelerate the development of certain features for the U.S. intelligence community.
“[The CIA has] a lot of geospatial data, and they need to be able to form, visualize, and query that data in real-time. It’s a real need across the intelligence community,” Mostak says.
“Making GPUs first-class citizens”
GPUs are designed specifically for parallel computing, with thousands of energy-efficient cores that can, for example, simultaneously determine the color of each pixel on a computer screen to render an image. GPUs also use high-bandwidth memory, a form of random access memory (RAM) that’s about an order of magnitude faster than the memory that serves CPUs.
Today, some databases are being powered by GPUs. But these systems suffer from a major design flaw, Mostak says: “In most implementations, the data is initially stored on a CPU, moved to the GPU for a query, and results are moved back to the CPU for storage. Even if you speed up the computation time of a query [by using a GPU], you lose most of the speed by transferring from CPU to GPU and back.”
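The bottleneck is easy to see in a general-purpose GPU framework such as PyTorch (used here only as a stand-in; MapD itself is a GPU database, not a Python library):

```python
import torch

if torch.cuda.is_available():
    data_cpu = torch.rand(50_000_000)

    # Round-trip per query: copy to the GPU, compute, copy the result back.
    # For simple queries, the transfers dominate the total time.
    hits = (data_cpu.to("cuda") > 0.5).sum().to("cpu")

    # Keeping the data resident on the GPU pays the transfer cost once;
    # subsequent "queries" run at GPU speed with no copies.
    data_gpu = data_cpu.to("cuda")
    hits_resident = (data_gpu > 0.5).sum()
    mean_resident = data_gpu.mean()
    print(hits.item(), hits_resident.item(), mean_resident.item())
```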