Defense Event

Enabling Program Analysis Through Deterministic Replay and Optimistic Hybrid Analysis

David Devecsery

Friday, October 20, 2017
10:00am - 12:00pm
3725 Beyster Building

Add to Google Calendar

About the Event

As software continues to evolve, software systems increase in complexity. With software systems composed of many distinct but interacting components, today’s system program- mers, users, and administrators find themselves requiring automated ways to find, under- stand, and handle system mis-behavior. Recent information breaches such as the Equifax breach of 2017, and the Heartbleed vulnerability of 2014 show the need to understand and debug prior states of computer systems. In this thesis I focus on enabling practical entire-system retroactive analysis, allowing programmers, users, and system administrators to diagnose and understand the impact of these devastating mishaps. I focus primarly on two techniques. First, I discuss a novel deterministic record, and replay system which enables fast, practical recollection of entire systems of computer state. Second, I discuss optimistic hybrid analysis, a novel optimiza- tion method capable of dramatically accelerating retroactive program analysis. Record and replay systems greatly aid in solving a variety of problems, such as fault tolerance, forensic analysis, and information providence. These solutions, however, assume ubiquitous recording of any application which may have a problem. Current record and replay systems are forced to trade-off between disk space and replay speed. This trade-off viihas historically made it impractical to both record and replay large histories of system-level computation. I present Arnold, a novel record and replay system which efficiently records and replays years of computation on a commodity hard-drive. Arnold combines caching with a unique process-group granularity of recording to produce both small, and quickly recalled recordings. My experiments show that under a desktop workload, Arnold could store 4 years of computation on a commodity 4TB hard drive. Dynamic analysis is used to retroactively identify and address many forms of system mis-behaviors including: programming errors, data-races, private information leakage, and memory errors. Unfortunately, the runtime overhead of dynamic analysis has precluded its adoption in many instances. I present a new dynamic analysis methodology called optimistic hybrid analysis (OHA). OHA uses knowledge of the past to predict program behaviors in the future. These predictions, or likely invariants are speculatively assumed true by a static analysis. This creates a static analysis which can be far more accurate than its traditional counterpart. Once this predicated static analysis is created, it is speculatively used to optimize a final dynamic analysis, creating a far more efficient dynamic analysis than otherwise possible. I demonstrate the effectiveness of OHA by creating an optimistic hybrid backward slicer, OptSlice, and optimistic data-race detector OptFT. OptSlice and OptFT are just as accurate as their traditional hybrid counterparts, but run on average 8.3x and 1.6x faster respectively. In this thesis I demonstrate that Arnold’s ability to record and replay entire computer systems, combined with optimistic hybrid analysis’s ability to quickly analyze prior com- putation, enable a practical and useful entire system retroactive analysis that has been pre- viously unrealized.

Additional Information

Sponsor(s): Peter Chen

Open to: Public