Krishna to Have Two Papers Featured in IEEE Micro Top Picks Issue



Tushar Krishna will have two of his recent research papers featured in the IEEE Micro“Top Picks from Computer Architecture Conferences,” to be published in the May/June 2019 issue. One paper was selected as an IEEE Micro Top Pick, and another paper was selected as an Honorable Mention. 

Krishna is an assistant professor in the Georgia Tech School of Electrical and Computer Engineering (ECE), where he leads the Synergy Lab. He has been on the faculty since 2015. 

Every year, IEEE Micro publishes this special issue, which recognizes the year’s top papers that have potential for long-term impact. In order for a paper to be chosen as a top pick, it must first have been accepted in a major computer architecture conference that year. Out of 123 top pick submissions in 2018, 12 were selected as Top Picks and 11 were selected as Honorable Mentions. 

IEEE Micro Top Pick

Krishna’s paper that was selected as a Top Pick is entitled “Synchronized Progress in Interconnection Networks (SPIN): A New Theory for Deadlock Freedom.” The paper was published at the 45th International Symposium on Computer Architecture (ISCA), held June 2-6, 2018 in Los Angeles, California. Krishna’s coauthors are his recently graduated M.S. student, Aniruddh Ramrakhyani, and Paul Gratz, an ECE associate professor at Texas A&M University.

All high-performance computers today are built by connecting many processors together. These could be cores on a single-chip inside a smartphone or laptop, or servers inside a supercomputer or datacenter. A key challenge in designing the interconnection network connecting these processors is that of “deadlocks”. A deadlock is a scenario where a set of packets is stuck indefinitely and cannot move forward because they form a cyclic dependence. An analogy is that of a traffic jam in road networks where each car waits for the car in front of it to move, but no car can move if they end up forming a cycle. The traditional approaches to avoid deadlocks either restricts routes (leading to lower performance) or adds more queues (leading to more area and power). Unfortunately, paying one of these two expenses is unavoidable today since a deadlock can bring the whole system to a standstill and has to be avoided for functional correctness of any interconnection network.

In this paper, Krishna and his co-authors challenge the theoretical notion of viewing deadlocks as a resource (in this case queues) dependence problem, and view it instead as a lack of coordination between distributed packets. They demonstrate that enabling every packet to move forward at exactly the same time can help them all move forward and get out of the deadlock. Imagine the same traffic jam as before, but every car in the jam agreeing to move forward at exactly the same time to avoid any collisions. This was the first work to show a deadlock-free interconnection network with fully adaptive routing, without any routing restrictions, with only a single queue at every router port.

IEEE Micro Honorable Mention

Krishna’s paper that was selected as an Honorable Mention is entitled “MAERI: Enabling Flexible Dataflow Mapping over DNN Accelerators via Reconfigurable Interconnects.” The paper was published at the 23rd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), held March 24-28, 2018 in Williamsburg, Virginia. Krishna’s coauthors are his Ph.D. students, Hyoukjun Kwon and Ananda Samajdar.

Machine Learning (ML) and Artificial Intelligence (AI) are becoming ubiquitous. Deep Neural Networks (DNN) have demonstrated highly promising results across applications like computer vision, speech recognition, language translation, recommendation systems, and games. The computational complexity of DNNs and a need for high energy-efficiency has led to a surge in research on hardware accelerators. These AI accelerators are designed for keeping the target DNN algorithm in mind, and use custom datapaths and memory hierarchies to provide 10-1000x better performance or energy-efficiency than traditional CPUs and GPUs. Almost every major company today is building its own version of an AI accelerator. However, a key challenge today is that AI/ML algorithms are evolving at an extremely rapid rate - almost daily, while designing and taping out a hardware chip takes millions of dollars, and replacing these chips every time the algorithm changes is not practical. Thus, an open question today is how to design an accelerator chip that can be built and deployed (on smartphones and/or datacenters) and will be able to run both current and future algorithms efficiently, without having to be replaced frequently.

In their paper, Krishna and his students address this issue by adding lightweight, non-blocking, and reconfigurable interconnects within a DNN accelerator called MAERI. They demonstrate that almost any DNN model can be mapped while utilizing close to 100 percent of the accelerator’s compute resources, by simply reconfiguring the proposed interconnects appropriately. This makes the MAERI approach future-proof to innovations across DNN models and dataflow/mapping techniques.

Krishna Tapped for Google Faculty Research Award



Tushar Krishna has been named as one of the recipients of the Google Faculty Research Award (FRA). Krishna is an assistant professor in the Georgia Tech School of Electrical and Computer Engineering (ECE).

The Google FRA program focuses on funding world-class technical research in Computer Science, Engineering, and related fields. Among 910 proposals from 40 countries and over 320 universities submitted this year, 158 projects were selected for funding. The goal of the Google FRA is to identify and strengthen long-term collaborative relationships with faculty working on problems that will impact how future generations use technology. The award is highly competitive – only 15 percent of applicants receive funding – and each proposal goes through a rigorous Google-wide review process.

The title of Krishna’s award-winning proposal was "Using ML to Design ML Accelerators." With the end of Moore’s Law, the performance of conventional CPUs has stagnated. The growing computing demands from applications, such as Machine Learning (ML), has led to an explosion of custom hardware accelerators (across several companies and startups) for running Machine Learning algorithms at real-time latency and high energy-efficiency. 

A key challenge, however, is that the design space of these accelerators is extremely huge due to an increasing and rapidly evolving suite of Artificial Intelligence/ML models, different requirements for training vs. inference, a plethora of dataflow approaches for minimizing data movement, and varying area-power budgets depending on the target device. Krishna’s proposal seeks to enable rapid design and deployment of ML accelerators by leveraging ML algorithms to efficiently represent and search through this hardware design space.

An ECE faculty member since 2015, Krishna leads the Synergy Lab at Georgia Tech. He and his team focus on architecting next-generation intelligent computer systems and interconnection networks for emerging application areas such as machine learning. Krishna also received an NSF CISE Research Initiation Initiative Award in 2018. He recently had one of his papers selected as an IEEE Micro Top Pick from computer architecture conferences and a second paper was selected as an Honorable Mention; Krishna’s work will be acknowledged in the May/June 2019 issue of IEEE Micro.

Krishna’s Research to be Featured in IEEE Micro Top Picks Issue



Tushar Krishna will have one of his recent research papers featured in the IEEE Micro “Top Picks from Computer Architecture Conferences,” to be published in the May/June 2020 issue. 

Krishna is an assistant professor in the Georgia Tech School of Electrical and Computer Engineering, where he leads the Synergy Lab. This is the second year in a row that one of Krishna’s papers has been chosen as an IEEE Micro Top Pick.

Every year, IEEE Micro publishes this special issue, which recognizes the year’s top papers in computer architecture that have potential for long-term impact. In order for a paper to be considered for a top pick, it must first have been accepted in a major computer architecture conference that year and that have acceptance rates of ~18-22%. Out of 96 submissions this year, twelve were selected as "Top Picks." 

Krishna's paper was titled "Understanding Reuse, Performance, and Hardware Cost of DNN Dataflows: A Data-Centric Approach.” The co-authors were his Ph.D. student Hyoukjun Kwon; Vivek Sarkar, a professor from the School of Computer Science; Sarkar's Ph.D. student Prasanth Chatarasi; and two NVIDIA collaborators, Michael Pellauer and Angshuman Parashar. 

Deep Learning is being deployed at an increasing scale—across the cloud and IoT platforms—to solve complex regression and classification problems in image recognition, speech recognition, language translation, and many more fields, with accuracy close to and even surpassing that of humans. Tight latency, throughput, and energy constraints when running Deep Neural Networks (DNNs) have led to a meteoric increase in specialized hardware–known as accelerators–to run them.

Running DNNs efficiently is challenging for two reasons. First, DNNs today are massive and require billions of computations, and secondly, DNNs have millions of inputs/weights that need to be moved from memory to the accelerator chip which consumes orders of magnitude more energy than the actual computation. DNN accelerators try to address these two challenges by mapping these computations in parallel across hundreds of processing elements to improve performance and by reusing inputs/weights on-chip across multiple outputs to improve energy efficiency. Unfortunately, there can be trillions of ways of slicing and dicing the DNN (also known as “dataflow”) to map it over the finite compute and memory resources within an accelerator.

Krishna’s paper demonstrates a principled approach and framework called MAESTRO to estimate data reuse, performance, power, and area of DNN dataflows. MAESTRO enables rapid design-space exploration of DNN accelerator architectures and mapping strategies, depending on the target DNNs or domain (cloud or IoT). MAESTRO is available as an open-source tool at, and it has already seen adoption within NVIDIA, Facebook, and Sandia National Labs.


Atlanta, GA



Jackie Nemeth

School of Electrical and Computer Engineering


Subscribe to IEEE Micro