Bridging Cognitive Gaps of Software Development Dynamics

“Dynamism.  The quality of being characterized by change and progress.” Software, in particular, is defined by this quality.  From changing source code, contributed by teams of developers over the course of years, to billions of instructions performing calculations in the blink of an eye, software is highly dynamic. 

Although this quality can be awe-inspiring, it can also be intimidating, and it makes software challenging to comprehend even for its own developers.  Bridging the gap between this ever-changing complexity and the human need to comprehend it is the subject of Professor Jim Jones’s research at UC Irvine.  

Jones explains, “We architect software, concept by concept, often with an incomplete vision of our intentions. Those intentions get fleshed out over the course of years, with influences coming from customers, other developers, and many other stakeholders.  So, what starts as a coherent architecture and plan often morphs into a product whose structure and behavior are emergent and not easily fully comprehended by any single developer.”

An early success in Jones’s career was an innovation that uses information from test suites to automatically identify the locations of bugs in large codebases.  The technique observes the instructions that are executed by each test case, computes the correlation of each instruction’s execution with failure and success, and renders the code as a heat map that highlights areas that may be buggy.  The technique and tool, called “Tarantula,” sparked a now-popular field of research called spectra-based fault localization. 
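The scoring step described above can be sketched in a few lines. This is a minimal illustration, not the Tarantula tool itself: it assumes the suspiciousness ratio commonly associated with Tarantula (the share of failing tests that execute a line, divided by the sum of the failing and passing shares), and the function name, coverage data, and test names are all hypothetical.

```python
def tarantula_scores(coverage, outcomes):
    """Score each line by how strongly its execution correlates with failure.

    coverage: dict mapping test name -> set of line numbers that test executed
    outcomes: dict mapping test name -> True if the test passed, False if it failed
    Returns a dict mapping line number -> score in [0, 1]; higher is more suspicious.
    """
    total_passed = sum(1 for ok in outcomes.values() if ok)
    total_failed = len(outcomes) - total_passed
    lines = set().union(*coverage.values()) if coverage else set()

    scores = {}
    for line in lines:
        passed = sum(1 for t, cov in coverage.items() if line in cov and outcomes[t])
        failed = sum(1 for t, cov in coverage.items() if line in cov and not outcomes[t])
        pass_ratio = passed / total_passed if total_passed else 0.0
        fail_ratio = failed / total_failed if total_failed else 0.0
        denom = pass_ratio + fail_ratio
        scores[line] = fail_ratio / denom if denom else 0.0
    return scores


# Hypothetical data: three tests over a four-line program, one test failing.
coverage = {"t1": {1, 2, 3}, "t2": {1, 2, 4}, "t3": {1, 3}}
outcomes = {"t1": True, "t2": False, "t3": True}

# Print lines from most to least suspicious, as a heat map would color them.
for line, score in sorted(tarantula_scores(coverage, outcomes).items(),
                          key=lambda item: item[1], reverse=True):
    print(f"line {line}: {score:.2f}")
```

In this toy data, the line executed only by the failing test ranks highest and the line executed only by passing tests ranks lowest, which is the ordering a heat map would convey with color.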

Debugging has been a focus of Jones’s research, then and since, because it “exemplifies the cognitive gap between intention and actuality that often plagues software developers.”  Jones explains, “Our code is designed and implemented as static structures—large constructions of logic and intention.  But only when we actually run the system, does it come to life—and the logic and those intentions are put to the test.  Unfortunately, because these systems are so large and complicated, those intentions often are not fully realized and the symptoms of bugs are observed—resulting in the launch of a potentially time-consuming and frustratingly difficult debugging process.”

One direction that Jones and his research group, The Spider Lab, are taking is to model and analyze the entire change-history of a software system.  The analysis is performed at a fine-grained level that allows for powerfully expressive querying and discovery of trends, culpability, and rationale.  Current revision control systems are not well suited for answering queries such as “which developers ever changed any past version of these particular lines of code, and when were all those changes made?”  Such a query could support investigations of culpability, recovery of design rationale, or discovery of neglected bug patches. 
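To make the example query concrete, the following sketch shows one way a line-level history could be modeled and searched. It is an illustration under stated assumptions, not a description of the Spider Lab’s system: the `Revision` record, its fields, and the `who_changed` function are hypothetical, and the sketch assumes each revision already records how its line numbers map back to the previous revision.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Revision:
    """One revision of a single file (hypothetical model)."""
    author: str
    when: date
    changed: set   # line numbers in this revision that it added or modified
    origin: dict   # line number in this revision -> line number in the previous
                   # revision; newly added lines have no entry


def who_changed(history, lines):
    """Return (author, date) for every revision that touched any ancestor
    of the given lines. `history` is ordered oldest first; `lines` are line
    numbers in the newest revision."""
    tracked = set(lines)
    hits = []
    for rev in reversed(history):
        if tracked & rev.changed:
            hits.append((rev.author, rev.when))
        # Follow each tracked line back to its position in the previous
        # revision; lines that were first added here drop out.
        tracked = {rev.origin[n] for n in tracked if n in rev.origin}
        if not tracked:
            break
    return hits


# Hypothetical three-revision history of one file.
history = [
    Revision("alice", date(2012, 1, 10), {1, 2, 3}, {}),                      # creates lines 1-3
    Revision("bob", date(2012, 3, 5), {1}, {2: 1, 3: 2, 4: 3}),              # inserts a new first line
    Revision("carol", date(2013, 6, 20), {3}, {1: 1, 2: 2, 3: 3, 4: 4}),     # edits line 3
]

for author, when in who_changed(history, {3}):
    print(author, when.isoformat())
```

The key design choice is that lines are followed through renumbering: the line that is number 3 today was number 2 before an earlier insertion, so a query keyed on current line numbers alone would miss part of its history.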

At a more microscopic (or “micro-chronologic”, as Jones puts it) scale, Jones’s research models and analyzes the fine-grained internal behavior of software in execution.  Jones and his group are researching ways to observe and record internal software execution so that key events that stray from the developers’ intentions can be discovered. Jones has been inspired by modern brain-mapping technologies, such as functional magnetic resonance imaging (or “fMRI”), which show the areas of the human brain that are active during different sorts of activity. Similarly, Jones is mapping internal software behavior to visualizations of execution.  Code is first “clustered” according to observed cooperation in computation; then the software is executed to observe how those clusters (Jones calls them “emergent features”) activate in response to input stimuli to form the externally observable behavior of the system.  Through such observation, patterns and anti-patterns emerge, yielding insights into behavior that can otherwise be difficult to diagnose and comprehend. Moreover, observation and analysis of these internal behaviors will allow developers to navigate executions, challenge hypotheses, and query distant effects of code execution in futuristic debugging environments.
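The cluster-then-activate idea can be illustrated with a small sketch. The article does not specify how “cooperation in computation” is measured, so this example substitutes a simple stand-in: functions are grouped when the sets of runs in which they execute overlap strongly (Jaccard similarity above a threshold), and a cluster’s activation for a new run is the fraction of its members that executed. The function names, traces, and threshold are all hypothetical.

```python
from itertools import combinations


def cluster_by_coexecution(traces, threshold=0.6):
    """Group functions that tend to execute in the same runs.

    traces: list of sets, each holding the function names executed in one run.
    Returns a list of frozensets, one per cluster.
    """
    # Record, for each function, the indices of the runs in which it executed.
    runs_of = {}
    for i, trace in enumerate(traces):
        for fn in trace:
            runs_of.setdefault(fn, set()).add(i)

    # Union-find structure used to merge functions with similar run sets.
    parent = {fn: fn for fn in runs_of}

    def find(fn):
        while parent[fn] != fn:
            parent[fn] = parent[parent[fn]]
            fn = parent[fn]
        return fn

    for a, b in combinations(sorted(runs_of), 2):
        jaccard = len(runs_of[a] & runs_of[b]) / len(runs_of[a] | runs_of[b])
        if jaccard >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for fn in runs_of:
        clusters.setdefault(find(fn), set()).add(fn)
    return [frozenset(members) for members in clusters.values()]


def activation(clusters, trace):
    """Fraction of each cluster's functions that executed in one run."""
    return {cluster: len(cluster & trace) / len(cluster) for cluster in clusters}


# Hypothetical execution traces from four earlier runs.
traces = [
    {"parse", "tokenize", "render"},
    {"parse", "tokenize"},
    {"render", "layout"},
    {"render", "layout", "parse", "tokenize"},
]
clusters = cluster_by_coexecution(traces)

# Activation of each cluster for a new run.
for cluster, level in activation(clusters, {"parse", "tokenize", "render"}).items():
    print(sorted(cluster), level)
```

A visualization could then color each cluster by its activation level over time, in the spirit of the brain-imaging analogy above.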

Further assisting human comprehension of large and complex software systems, Jones’s group has worked to provide automatic, natural-language characterizations of software behavior. By analyzing the software codebase for common concepts described by developers (in code “comments” and identifiers) and pairing these with observations of execution invocations, Jones’s work produces high-level natural-language descriptions of software behavior: both internal and external, and both intentional and unintentional.  These automatic natural-language cues provide early and fast hints of the nature of complex behaviors.

Through such innovations and tools, Jones and his research group are working to facilitate deeper comprehension of complex software dynamics, such as a system’s runtime behavior and the evolution of its codebase over time.  Practicality and usability are ever-present goals for Jones, and he is working to turn his ideas and prototypes into systems usable by all software developers.

Jones’s research group—The Spider Lab—includes Ph.D. students Francisco Servant, Nicholas DiGiuseppe, and Vijay Krishna Palepu, and recent undergraduate students Ethan Wessel and Lawrence Yu. 

Jones is a recipient of the prestigious National Science Foundation CAREER Award (2014), which recognizes early-career faculty whose activities form a firm foundation for a lifetime of leadership in integrating education and research.  Jones is also a recipient of a National Science Foundation Computing and Communication Foundations Award (2011) and a Google Faculty Research Award (2011).  His research has also been funded by grants from Boeing and Tata Consultancy Services. 

For more information on Jones’s research, visit his website:
http://www.isr.uci.edu/~jajones

and his research group’s website:
http://spideruci.org

This article appeared in ISR Connector issue: