SCAM 2019: Modeling Observation-based Approximate Dependence

A paper titled “MOAD: Modeling Observation-based Approximate Dependency” has been accepted at International Working Conference on Source Code Analysis and Manipulation (SCAM 2019). It is a collaboration between Seongmin Lee, Dave Binkley, Robert Feldt, Nicolas Gold, and Shin Yoo.

MOAD: Modeling Observation-based Approximate Dependency, Lee, S., Binkley, D., Feldt, R., Gold, N. and Yoo, S., 19th IEEE International Working Conference on Source Code Analysis and Manipulation.

Abstract

While dependency analysis is foundational to many applications of program analysis, the static nature of the existing technique presents challenges such as limited scalability and inability to cope with multi-lingual systems. We present a novel dependency analysis technique that aims to approximate program dependency from a relatively small number of perturbed executions. Our technique, called MOAD (Modeling Observation-based Approximate Dependency), reformulates program dependency as the likelihood of one program element being dependent on another based on a set of observations, instead of a Boolean relationship. MOAD generates a set of program variants by deleting parts of the source code, and executes them while observing the impacts of deletions on various program points. From these observations, MOAD infers a model of program dependency that captures the dependency relationship between the modification and observation points. While MOAD is a purely dynamic dependency analysis technique similar to Observational Slicing, it does not require iterative deletions for a single slicing criterion. Rather, MOAD makes a much smaller number of multiple, independent observations in parallel and infers dependency relationship for multiple program elements simultaneously, significantly reducing the cost of dynamic dependency analysis. We evaluate MOAD by instantiating program slices from the obtained probabilistic dependency model and comparing the slices to the results of Observational Slicing. The results show that, with a certain configuration, MOAD requires only 18.7% of the number of observations to construct the model; the size of the slice generated by MOAD is 16% larger than the slice from ORBS, on average.