Bridging the Abstraction Gaps from Architecture to Code

Over the past few years, Prof. Joshua Garcia has been working extensively in the areas of mobile security, testing, and analysis; software architecture; and software maintenance and re-engineering. Garcia’s research utilizes static and dynamic analysis techniques, machine learning, and artificial intelligence to address problems in the area of mobile applications and decay of software architecture.

One of his most recent works, led by Negar Ghorbani, a Ph.D. student co-advised by Garcia and ISR Director Prof. Sam Malek, examines the architectural inconsistencies that arise from differences between the prescriptive architecture of a software system, i.e., the architecture as intended or designed by the system’s architects, and the descriptive architecture, i.e., the architecture as implemented or found in the code-level artifacts of the system. “All software systems have an architecture, even if that architecture is not explicitly documented or resides primarily in the minds of a system’s architects,” says Garcia.

A particularly exciting aspect of this research is that it has been conducted using actual specifications of a system’s architecture written by its developers and architects. Obtaining such specifications is possible due to the recent emergence of true architectural components in Java, one of the most widely used programming languages in the world. Specifically, starting with Java 9, a Java program can include module descriptor files (module-info.java) that specify Java modules, which is the term used to refer to architectural components. Java modules expose Java packages at compile time or runtime and even control the extent to which packages of a module may be accessed through reflection, i.e., the ability of a program to inspect or modify itself during runtime. This new system, called the Java Platform Module System (JPMS), provides five module directives (requires, exports, opens, uses, and provides) that control the extent to which a Java module requires internals of other modules or exposes its own internals. In fact, starting with Java 9, developers either explicitly specify modules or have their code placed in a single unnamed module that exposes all of its internals.
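To make the five directives concrete, here is a minimal sketch, not taken from the research itself. It builds a descriptor with the standard java.lang.module.ModuleDescriptor builder API, and the comment shows a module-info.java that would declare the same thing. The module, package, and class names (com.example.orders and so on) are hypothetical and chosen only for illustration.

```java
import java.lang.module.ModuleDescriptor;
import java.util.List;

public class ModuleDirectiveSketch {

    // Programmatic counterpart of this hypothetical module-info.java:
    //
    //   module com.example.orders {
    //       requires java.sql;                           // depends on another module
    //       exports com.example.orders.api;              // visible to other modules
    //       opens com.example.orders.model;              // reflective access at runtime
    //       uses com.example.orders.spi.PaymentProvider; // consumes a service
    //       provides com.example.orders.spi.PaymentProvider
    //           with com.example.orders.internal.CardPayment; // offers a service
    //   }
    static ModuleDescriptor describe() {
        return ModuleDescriptor.newModule("com.example.orders")
                .requires("java.sql")
                .exports("com.example.orders.api")
                .opens("com.example.orders.model")
                .uses("com.example.orders.spi.PaymentProvider")
                .provides("com.example.orders.spi.PaymentProvider",
                        List.of("com.example.orders.internal.CardPayment"))
                .build();
    }

    public static void main(String[] args) {
        ModuleDescriptor d = describe();
        // Print each group of directives recorded in the descriptor.
        d.requires().forEach(r -> System.out.println("requires " + r.name()));
        d.exports().forEach(e -> System.out.println("exports  " + e.source()));
        d.opens().forEach(o -> System.out.println("opens    " + o.source()));
        d.uses().forEach(u -> System.out.println("uses     " + u));
        d.provides().forEach(p ->
                System.out.println("provides " + p.service() + " with " + p.providers()));
    }
}
```

The descriptor is the prescriptive side of the architecture: it states what a module intends to depend on and expose, independently of what the code actually does.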

[Figure: Example Java modules and their inter-dependencies.]

“After about two decades of research about architecture-based development, a mainstream programming language finally has an explicit notion of components with rich architectural interfaces, opening up opportunities for architectural research that could not be conducted before, especially from an empirical standpoint,” says Garcia.

A variety of inconsistencies arise due to mismatches between prescriptive architectures as specified in a module descriptor file and the actual dependencies implemented in a Java system. For example, a module that exports internals that are not used by other modules in an application increases the attack surface of a module, reducing its security. At the same time, creating spurious dependencies reduces encapsulation and, in turn, maintainability. As another example, a module that requires more internals than it actually uses at runtime increases an application’s memory consumption, creating software bloat. Garcia and his team specified eight different types of Java module inconsistencies derived from the five different types of module directives available in JPMS. To detect and repair these inconsistencies, the team created a novel approach called Darcy.
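The core idea behind two of these inconsistencies (unused exports and unused requires) can be illustrated as a set difference between what a descriptor declares and what the code actually uses. The sketch below is only an illustration under that simplifying assumption; it is not Darcy’s algorithm, and the class name, method name, and sample data are all hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

public class InconsistencySketch {

    /** Returns the declared items that never show up in actual usage (declared minus used). */
    static Set<String> unused(Set<String> declared, Set<String> used) {
        Set<String> result = new TreeSet<>(declared);
        result.removeAll(used);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical prescriptive data, as declared in a module descriptor.
        Set<String> declaredExports =
                Set.of("com.example.orders.api", "com.example.orders.internal");
        Set<String> declaredRequires = Set.of("java.sql", "java.xml");

        // Hypothetical descriptive data, as an analysis of the code might report it.
        Set<String> packagesUsedByOthers = Set.of("com.example.orders.api");
        Set<String> modulesActuallyUsed = Set.of("java.sql");

        // Exports nobody uses widen the attack surface; unused requires add bloat.
        System.out.println("Unused exports:  " + unused(declaredExports, packagesUsedByOthers));
        System.out.println("Unused requires: " + unused(declaredRequires, modulesActuallyUsed));
    }
}
```

In this framing, a repair would amount to deleting the flagged directives from the descriptor, which is why checking that the program still compiles and passes its tests is a natural way to validate it.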

To evaluate Darcy, Garcia and his team obtained 38 Java applications. Using Darcy, they found that 74% of the applications had architectural inconsistencies, totaling 124 inconsistencies across 28 Java applications. Through manual inspection, they confirmed that all of the detected inconsistencies were correctly identified. To assess the correctness of Darcy repairs, the team verified that repaired programs still compile and, for applications with test suites, that the passing rate of the suites is the same before and after the repairs. Both assessments showed that Darcy repairs programs in a way that preserves their ability to compile and their test passing rates, confirming Darcy’s ability to automatically repair architectural inconsistencies.

[Figure: A high-level overview of Darcy.]

The team further evaluated Darcy in terms of security, encapsulation, and software bloat. For security, they assessed the extent to which Darcy repairs can reduce the attack surface of modules, finding an average reduction of 60.33% across 25 apps with overexposed modules. In terms of encapsulation, the team found that Darcy repairs reduce undesirable coupling by 20.7% to 25.3% on average, and by up to 80.5%. For six applications suffering from software bloat, Darcy repairs reduce memory consumption by 14% on average, with reductions of up to 54.7%.

Besides work on inconsistencies among Java modules, Garcia is one of the lead researchers addressing problems involving the decay of software architectures through the construction of a communitywide research infrastructure called the Software Architecture INstrument (SAIN). This infrastructure is funded by the NSF and involves multiple teams across the United States interested in addressing problems of reproducibility, interoperability, and a lack of datasets and benchmarks when conducting software architecture research from the perspective of software maintenance and empirical software engineering. Collaborators on the project include nearly 50 researchers from industry and academia across the globe.

Recently, and in support of SAIN, Garcia was one of the co-organizers of the 2nd International Workshop on Establishing the Community-Wide Infrastructure for Architecture-Based Software Engineering (ECASE 2019). ECASE workshops explore issues at the intersection of software architecture and empirical software engineering, and identify plausible solutions that jointly move both areas forward. One of ECASE’s goals is to support the construction of SAIN. ECASE 2019 was also co-organized by Prof. Malek of ISR and Prof. Nenad Medvidović of USC.

Beyond research focused on software architecture, Garcia continues to study issues involving mobile security. In one project led by Garcia’s Ph.D. student Sumaya Almanee, Garcia’s team is examining third-party library vulnerabilities in native code, a major attack surface of Android apps that has been largely ignored thus far by the research community but which can, nevertheless, lead to severe security issues, including privilege escalations and memory-oriented vulnerabilities. The team, which includes Prof. Mathias Payer from École polytechnique fédérale de Lausanne (EPFL), has been gathering many versions of the top 600 apps from Google Play and studying the extent to which they contain vulnerable third-party native libraries and how often apps update their usage of such native libraries.

Garcia is an Assistant Professor in the Department of Informatics in the Donald Bren School of Information and Computer Sciences (ICS). Prior to being appointed an Assistant Professor, he was an Associate Project Scientist at UCI ISR, working under the supervision of Prof. Malek. Garcia received his Ph.D. in 2014 from the University of Southern California under the advisement of Professor Medvidović.

To find out more about Prof. Garcia and his research, visit his website.

This article appeared in ISR Connector issue: