Prof. Vijay Janapa Reddi

I am an Associate Professor in the John A. Paulson School of Engineering and Applied Sciences (SEAS) at Harvard University.

Prior to joining Harvard, I was an Associate Professor at The University of Texas at Austin from 2011 to 2018. I started at Harvard University in the spring of 2019.

My research centers on mobile and edge computing systems, with an occasional taste for cloud computing, mostly as it pertains to edge computing or my students' interests. I direct the Edge Computing Lab. I believe in solving computing problems rather than associating myself with a particular domain or field of computing (e.g., hardware or software). I take great pride in that, and it is reflected in my research group's training. I generally publish in computer architecture, robotics, and ML venues.


My teaching focuses on both hardware and software. At the freshman/sophomore level, I teach topics that are related to embedded systems and help explain their relationship to the rest of the connected world (i.e., the Internet of Things). At the junior and senior undergraduate level, I teach Computer Architecture. Going beyond that, at the graduate level, I teach courses that are focused on several edge computing aspects, which covers a variety of state-of-the-art research challenges and issues. Examples include how we design better systems for robotics, autonomous cars, drones, etc.


Tiny Robot Learning: Tiny Machine Learning (tinyML) for Robotics, keynote talk at the Conference on Robot Learning (2021), Wednesday, November 10, 2021:
Tiny machine learning (tinyML) is a fast-growing, emerging field at the intersection of machine learning (ML) algorithms and low-cost embedded systems. It enables on-device analysis of sensor data (vision, audio, IMU, etc.) at ultra-low power consumption (< 1 mW). Moving machine learning compute close to the sensor(s) allows for an expansive new variety of always-on ML use cases, especially in size, weight, and power (SWaP)-constrained robots. This talk introduces the broad vision behind tinyML, and specifically, it focuses on exciting new applications that tinyML enables for cheap and...
Read more about Tiny Robot Learning: Tiny Machine Learning (tinyML) for Robotics
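To give a rough sense of what the < 1 mW figure means for always-on operation, here is a back-of-the-envelope battery-life estimate. The battery parameters (a CR2032-class coin cell, roughly 225 mAh at 3 V) are illustrative assumptions, not numbers from the talk:

```python
# Back-of-the-envelope battery life for an always-on tinyML node.
# Assumed (illustrative) figures: a coin cell holding ~225 mAh at 3 V,
# drained at a constant 1 mW average power draw.

def battery_life_days(capacity_mah: float, voltage_v: float, draw_mw: float) -> float:
    """Estimated runtime in days at a constant average power draw."""
    energy_mwh = capacity_mah * voltage_v  # stored energy in milliwatt-hours
    return energy_mwh / draw_mw / 24.0

days = battery_life_days(capacity_mah=225, voltage_v=3.0, draw_mw=1.0)
print(f"~{days:.0f} days of always-on operation")  # roughly a month on a coin cell
```

The point of the arithmetic is simply that at milliwatt budgets, months of unattended sensing become plausible, which is what makes the always-on use cases above viable.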
TinyMLPerf: Benchmarking Ultra-low Power Machine Learning Systems, at TinyML Summit 2020, Thursday, February 13, 2020:

Tiny machine learning (ML) is poised to drive enormous growth within the IoT hardware and software industry. Measuring the performance of these rapidly proliferating systems and comparing them in a meaningful way presents a considerable challenge; the complexity and dynamicity of the field obscure the measurement of progress and make embedded ML application and system design and deployment intractable. To foster more systematic development, while enabling innovation, a fair, replicable, and robust method of evaluating tinyML systems is required. A reliable and widely accepted tinyML...

Read more about TinyMLPerf: Benchmarking Ultra-low Power Machine Learning Systems
Gables: A Roofline Model for Mobile SoCs, at Workshop on Infrastructure and Methodology for SoC-level Performance and Power Modeling (co-located with ASPLOS 2019), Saturday, April 13, 2019:

Over a billion mobile consumer system-on-chip (SoC) chipsets ship each year. Of these, smartphones undoubtedly account for a significant share of the mobile consumer market. Most modern smartphones comprise advanced SoC architectures made up of multiple cores, GPS, and many different programmable and fixed-function accelerators, connected via a complex hierarchy of interconnects, with the goal of running a dozen or more critical software use cases under strict power, thermal, and energy constraints. The steadily growing complexity of a modern SoC challenges hardware...

Read more about Gables: A Roofline Model for Mobile SoCs
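The roofline model that Gables builds on can be stated in one line: attainable performance is capped either by peak compute or by memory bandwidth times operational intensity. A minimal sketch of that classic single-block roofline (Gables itself generalizes it across a SoC's multiple IP blocks), with illustrative numbers not taken from the talk:

```python
# A minimal sketch of the classic roofline model underlying Gables.
# Attainable throughput = min(peak compute, bandwidth * operational intensity),
# where operational intensity is FLOPs performed per byte moved from memory.

def roofline(peak_gflops: float, bandwidth_gbs: float, intensity_flop_per_byte: float) -> float:
    """Attainable GFLOP/s for a kernel with the given operational intensity."""
    return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)

# Illustrative IP block: 100 GFLOP/s peak, fed by 10 GB/s of bandwidth.
print(roofline(100, 10, 4))   # 40.0  -> bandwidth-bound at low intensity
print(roofline(100, 10, 50))  # 100.0 -> compute-bound once intensity is high
```

A kernel below the "ridge point" (here, 10 FLOPs/byte) is memory-bound; above it, the compute ceiling binds. Gables asks the same question per accelerator and for their concurrent composition.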
Ten Commandments for Mobile Computing, at Infrastructure and Methodology for SoC-level Performance and Power Modeling (co-located with ASPLOS 2019), Saturday, April 13, 2019:
Mobile computing has grown drastically over the past decade. Despite the rapid pace of advancements, mobile device understanding, benchmarking, and evaluation are still in their infancy, both in industry and academia. This article presents an industry perspective on the challenges facing mobile computer architecture, specifically involving mobile workloads, benchmarking, and experimental methodology, with the hope of fostering new research within the community to address pending problems. These challenges pose a threat to the systematic development of future mobile systems, which, if...
Read more about Ten Commandments for Mobile Computing
Mobile Robotics for Computer Architects, at International Workshop on Domain Specific System Architecture (DOSSA), Sunday, October 21, 2018:
Autonomous computing systems are marching toward ubiquity in everyday life. In recent years, Unmanned Aerial Systems (UAS) have seen an influx of attention, specifically in application areas with a strong demand for autonomy. A key challenge in making mobile robots such as UAS autonomous is their need to operate under power and energy constraints, which severely limit their onboard sensing, intelligence, and endurance capabilities. To overcome these challenges, researchers must understand how endurance, power efficiency, and computational bottlenecks in autonomous systems relate to one...
Read more about Mobile Robotics for Computer Architects
The Vision Behind MLPerf: A Broad ML Benchmark Suite for Measuring the Performance of ML Software Frameworks, ML Hardware Accelerators, and ML Cloud and Edge Platforms, at Samsung Technology Forum in Austin at Samsung Austin Research Center (SARC), Tuesday, October 16, 2018:

Deep Learning is transforming the field of machine learning (ML) from theory to practice. It has also sparked a renaissance in computer system design, fueled by the industry's need to improve ML accuracy and performance rapidly. But despite the fast pace of innovation, there is a key issue affecting the industry at large, and that is how to enable fair and useful benchmarking of ML software frameworks, ML hardware accelerators, and ML platforms. There is a need for systematic ML benchmarking that is both representative of real-world use cases and useful for fair comparisons...

Read more about The Vision Behind MLPerf: A Broad ML Benchmark Suite for Measuring the Performance of ML Software Frameworks, ML Hardware Accelerators, and ML Cloud and Edge Platforms


CS249r: Tiny Machine Learning (TinyML)

Offered: 2020Fa

The explosive growth in machine learning and the ease of use of platforms like TensorFlow (TF) make it an indispensable topic of study for the modern computer science student. At the same time, the pervasiveness of ultra-low-power embedded devices, coupled with the introduction of embedded machine learning frameworks like TensorFlow Lite for Microcontrollers, will enable the mass proliferation of AI-powered IoT devices. As such, we have designed an introductory course at the intersection of Machine Learning and Embedded...

Read more about CS249r: Tiny Machine Learning (TinyML)

CS141: Computing Hardware

Offered: 2020Sp

The main emphasis of this course is on the basic concepts of digital computing hardware and fundamental digital design principles and practices for computer systems. This course will cover topics ranging from logic design to machine organization and will address the impact of hardware design on applications and system software. An integral component of this course will be a sequence of hands-on hardware laboratory assignments where you will build digital circuits using simple logic gates and make use of some common software packages...

Read more about CS141: Computing Hardware
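The lab assignments described above build digital circuits out of simple logic gates. As a flavor of what that involves, here is a software model of one such building block, a one-bit full adder (the gate decomposition is the standard one, not taken from the course materials):

```python
# Illustrative only: a one-bit full adder modeled from simple gates,
# the kind of building block assembled in digital-logic lab work.

def xor(a: int, b: int) -> int:
    return a ^ b

def full_adder(a: int, b: int, cin: int):
    """Return (sum, carry_out) for one-bit inputs a, b, and carry-in."""
    s = xor(xor(a, b), cin)                 # sum bit: a XOR b XOR cin
    cout = (a & b) | (cin & xor(a, b))      # carry out of the column
    return s, cout

# Exhaustive check of all 8 input combinations against integer addition.
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            s, cout = full_adder(a, b, cin)
            assert a + b + cin == s + 2 * cout
```

Chaining the carry-out of one such adder into the carry-in of the next yields a ripple-carry adder, one of the first machine-organization structures the course's topics lead up to.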

CS249r: Special Topics in Edge Computing - Autonomous Machines

Offered: 2019Fa

The course covers a broad range of hardware and software topics in the context of smart/intelligent embedded systems. Traditional embedded systems are passive electronic devices that perform a single task and operate in isolation. In contrast, modern embedded systems are intelligent devices that involve complex hardware and software to perform a multitude of cognitive functions collaboratively. Designing such systems requires us to have a deep understanding of the target application domains, as well as an appreciation for...

Read more about CS249r: Special Topics in Edge Computing - Autonomous Machines

EE382V: Dynamic Compilation

Offered: 2014Sp, 2013Sp, 2012Sp, 2010Sp

In this course, we focus on an important category of compilation called dynamic compilation, which occurs as a program is running. With modern software heavily utilizing shared libraries, dynamic class loading, and runtime binding, the scope of static analysis and transformation has grown restrictive. Dynamic compilation overcomes this challenge by postponing code generation and optimization until the initial stages of execution are complete. Dynamic compilers enable effective feedback-directed optimization, architecture-...

Read more about EE382V: Dynamic Compilation
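To make the idea of generating code at runtime concrete, here is a toy sketch in Python: once a "hot" parameter value is known at run time, we emit and compile a specialized version of the function, loosely in the spirit of the feedback-directed optimization mentioned above. The function names are invented for illustration and real dynamic compilers work on machine code, not source strings:

```python
# Toy illustration of dynamic (runtime) code generation: once the exponent
# is observed at run time, emit a specialized power function as straight-line
# multiplications and compile it on the spot.

def make_power_fn(exponent: int):
    """Generate and compile x**exponent as a chain of multiplications."""
    body = " * ".join(["x"] * exponent) or "1"   # exponent 0 -> constant 1
    src = f"def power(x):\n    return {body}\n"
    namespace = {}
    exec(compile(src, "<generated>", "exec"), namespace)
    return namespace["power"]

cube = make_power_fn(3)   # compiles: def power(x): return x * x * x
print(cube(5))            # 125
```

The payoff in a real system is that the specialized code has no loop and no exponent check, trading a one-time compilation cost for faster steady-state execution, exactly the trade-off the course examines.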

EE382V: Code Generation and Optimization

Offered: 2017Fa, 2013Fa, 2012Fa

Ever wonder how high-level C/C++/<your favorite language> syntax statements eventually get converted to the binary 1's and 0's that machines understand?! In CGO, we focus on the backend rather than the front end: it isn't about parsing and lexical analysis, but about code generation and optimization. You'd want to take this course if you are interested in how compilers generate code for ARM vs. x86, or how they optimize for different ISA types (RISC vs. CISC vs. VLIW). It assumes limited computer...

Read more about EE382V: Code Generation and Optimization
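As a taste of the backend work the course description alludes to, here is a toy constant-folding pass, one of the simplest code optimizations a compiler performs. The tuple-based expression encoding is an illustrative convention, not something from the course:

```python
# Toy constant-folding pass over a tiny expression "IR".
# Expressions are nested tuples ("+", l, r) / ("*", l, r),
# integer constants, or variable names (strings).

import operator

OPS = {"+": operator.add, "*": operator.mul}

def fold(expr):
    """Recursively replace constant subexpressions with their values."""
    if not isinstance(expr, tuple):
        return expr                      # constant or variable: nothing to do
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)      # both operands known: compute now
    return (op, left, right)             # otherwise keep the operation

print(fold(("+", ("*", 2, 3), "x")))     # ('+', 6, 'x')
```

A production compiler does the same thing over its real intermediate representation, so that `2 * 3 + x` costs one addition at run time instead of a multiply and an add.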
More Classes