Tushar Krishna will have two of his recent research papers featured in the IEEE Micro “Top Picks from Computer Architecture Conferences,” to be published in the May/June 2019 issue. One paper was selected as an IEEE Micro Top Pick, and another paper was selected as an Honorable Mention.
Krishna is an assistant professor in the Georgia Tech School of Electrical and Computer Engineering (ECE), where he leads the Synergy Lab. He has been on the faculty since 2015.
Every year, IEEE Micro publishes this special issue, which recognizes the year’s top papers that have potential for long-term impact. In order for a paper to be chosen as a top pick, it must first have been accepted in a major computer architecture conference that year. Out of 123 top pick submissions in 2018, 12 were selected as Top Picks and 11 were selected as Honorable Mentions.
IEEE Micro Top Pick
Krishna’s paper that was selected as a Top Pick is entitled “Synchronized Progress in Interconnection Networks (SPIN): A New Theory for Deadlock Freedom.” The paper was published at the 45th International Symposium on Computer Architecture (ISCA), held June 2-6, 2018 in Los Angeles, California. Krishna’s coauthors are his recently graduated M.S. student, Aniruddh Ramrakhyani, and Paul Gratz, an ECE associate professor at Texas A&M University.
All high-performance computers today are built by connecting many processors together. These could be cores on a single chip inside a smartphone or laptop, or servers inside a supercomputer or datacenter. A key challenge in designing the interconnection network connecting these processors is that of “deadlocks.” A deadlock is a scenario where a set of packets is stuck indefinitely and cannot move forward because the packets form a cyclic dependence. An analogy is a traffic jam in a road network where each car waits for the car in front of it to move, but no car can move if the cars end up forming a cycle. The traditional approaches to avoiding deadlocks either restrict routes (leading to lower performance) or add more queues (leading to more area and power). Unfortunately, paying one of these two costs is unavoidable today, since a deadlock can bring the whole system to a standstill and has to be avoided for the functional correctness of any interconnection network.
In this paper, Krishna and his coauthors challenge the theoretical notion of viewing deadlocks as a resource-dependence problem (the resources in this case being queues), and view them instead as a lack of coordination between distributed packets. They demonstrate that if every packet in a deadlocked cycle moves forward at exactly the same time, all of the packets make progress and the deadlock is resolved. Imagine the same traffic jam as before, but with every car in the jam agreeing to move forward at exactly the same time, so that each car takes the spot vacated by the car ahead of it without any collisions. This was the first work to show a deadlock-free interconnection network with fully adaptive routing, without any routing restrictions, and with only a single queue at every router port.
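The traffic-jam analogy can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions, not the SPIN protocol from the paper: the ring of four routers, the one-packet buffers, and the function names `can_move_alone` and `synchronized_step` are all hypothetical, and the sketch does not model how a real network would detect the deadlock or coordinate the move.

```python
# Toy model (an assumption for illustration, not the paper's design):
# ring[i] is the packet held in the single buffer of router i, or None if
# the buffer is empty. Every packet wants to move to router (i + 1) % n.

def can_move_alone(ring):
    """For each router, report whether its packet could advance by itself.

    A packet can advance on its own only if the next buffer is empty.
    """
    n = len(ring)
    return [
        ring[i] is not None and ring[(i + 1) % n] is None
        for i in range(n)
    ]


def synchronized_step(ring):
    """Advance every packet one hop in the same step.

    Each packet takes the slot vacated by the packet ahead of it, so no
    extra buffer is needed even when the ring is completely full.
    """
    return [ring[-1]] + ring[:-1]


# A full ring: every buffer is occupied, so the packets form a cycle.
ring = ["A", "B", "C", "D"]

# Moving one packet at a time is impossible: this is the deadlock.
print(any(can_move_alone(ring)))

# Moving all packets together makes progress for every packet.
print(synchronized_step(ring))
```

In this model, no packet can move while the others stay put, yet a single coordinated step advances all four, which is the intuition behind treating deadlock as a coordination problem rather than a shortage of queues.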
IEEE Micro Honorable Mention
Krishna’s paper that was selected as an Honorable Mention is entitled “MAERI: Enabling Flexible Dataflow Mapping over DNN Accelerators via Reconfigurable Interconnects.” The paper was published at the 23rd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), held March 24-28, 2018 in Williamsburg, Virginia. Krishna’s coauthors are his Ph.D. students, Hyoukjun Kwon and Ananda Samajdar.
Machine Learning (ML) and Artificial Intelligence (AI) are becoming ubiquitous. Deep Neural Networks (DNNs) have demonstrated highly promising results across applications like computer vision, speech recognition, language translation, recommendation systems, and games. The computational complexity of DNNs and the need for high energy efficiency have led to a surge in research on hardware accelerators. These AI accelerators are designed with the target DNN algorithm in mind, and use custom datapaths and memory hierarchies to provide 10-1000x better performance or energy efficiency than traditional CPUs and GPUs. Almost every major company today is building its own version of an AI accelerator. However, a key challenge is that AI/ML algorithms are evolving at an extremely rapid rate, almost daily, while designing and taping out a hardware chip costs millions of dollars, so replacing these chips every time the algorithm changes is not practical. Thus, an open question today is how to design an accelerator chip that can be built and deployed (on smartphones and/or in datacenters) and will be able to run both current and future algorithms efficiently, without having to be replaced frequently.
In their paper, Krishna and his students address this issue with MAERI, a DNN accelerator design that adds lightweight, non-blocking, and reconfigurable interconnects within the accelerator. They demonstrate that almost any DNN model can be mapped while utilizing close to 100 percent of the accelerator’s compute resources, simply by reconfiguring the proposed interconnects appropriately. This makes the MAERI approach future-proof against innovations across DNN models and dataflow/mapping techniques.