Sourcerer is an ongoing research project at the University of California, Irvine aimed at exploring open source projects through the use of code analysis. The existence of an extremely large body of open source code presents a tremendous opportunity for software engineering research. Not only do we leverage this code for our own research, but we also provide the open source Sourcerer Infrastructure and curated datasets for other researchers to use.
The Sourcerer Infrastructure is composed of a number of layers.
Over the last several years we have been studying how digital media affects people’s lives. Rather than bring people into a laboratory, I view the real world as a living laboratory: I go where people live, study, and work, to study them as they go about their normal activities. Digital media use significantly affects people’s mood, stress, and behavior. In particular, people experience disruptions when working with digital media due to multi-tasking and interruptions.
Collaboration is becoming ubiquitous; at the same time, the emergence of new technologies has been changing the landscape of interaction and collaboration. I am interested in the effect that information technologies have on collaboration and the development of new organizational practices such as network-centricity, group-to-group collaboration, nomadic work, and large-scale collaboration. I am also very interested in how Web 2.0 technologies (blogs, wikis, social-networking sites, etc.) are used in collaboration and how they can be integrated into the course of daily work.
We developed a fault-localization technique that uses correlation-based heuristics. The technique and tool were called Tarantula. Tarantula uses the pass/fail status of each test case, together with the events that occurred during its execution, to recommend to the developer where the faults causing test-case failures may lie. The intuition of the approach is to find correlations between execution events and test-case outcomes: the events that correlate most highly with failure are suggested as places to begin investigation.
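The intuition can be made concrete with a small sketch. The code below assumes the suspiciousness formula commonly associated with Tarantula in the fault-localization literature (the fraction of failing tests that execute an event, divided by the sum of that fraction and the corresponding fraction of passing tests). The function names, the dictionary-based input format, and the statement labels are hypothetical choices made for illustration; they do not describe the actual tool's interface.

```python
from typing import Dict, List, Set


def tarantula_scores(coverage: Dict[str, Set[str]],
                     outcomes: Dict[str, bool]) -> Dict[str, float]:
    """Score each execution event by how strongly it correlates with failure.

    coverage: test name -> set of events (e.g. statements) the test executed.
    outcomes: test name -> True if the test passed, False if it failed.
    Both structures are illustrative assumptions, not the tool's real input.
    """
    total_passed = sum(1 for ok in outcomes.values() if ok)
    total_failed = len(outcomes) - total_passed
    events = set().union(*coverage.values())

    scores: Dict[str, float] = {}
    for event in events:
        passed = sum(1 for t, evs in coverage.items()
                     if event in evs and outcomes[t])
        failed = sum(1 for t, evs in coverage.items()
                     if event in evs and not outcomes[t])
        # Guard against suites with no passing or no failing tests.
        pass_ratio = passed / total_passed if total_passed else 0.0
        fail_ratio = failed / total_failed if total_failed else 0.0
        denom = pass_ratio + fail_ratio
        scores[event] = fail_ratio / denom if denom else 0.0
    return scores


def rank_events(scores: Dict[str, float]) -> List[str]:
    """Most suspicious events first; ties are broken by name."""
    return sorted(scores, key=lambda e: (-scores[e], e))


# Hypothetical example: three tests over three statements, one failure.
coverage = {
    "t1": {"s1", "s2"},
    "t2": {"s1", "s3"},
    "t3": {"s1", "s2", "s3"},
}
outcomes = {"t1": True, "t2": False, "t3": True}

scores = tarantula_scores(coverage, outcomes)
ranking = rank_events(scores)
```

In this example the statement `s3` is executed by the only failing test and by just one of the two passing tests, so it ranks first; `s2` is executed only by passing tests and ranks last.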
Test suites often need to adapt to the software that they are intended to test. As the core software changes and grows, its test suite also needs to change and grow. However, test suites can grow so large as to become unmaintainable. We have developed techniques to assist in the maintenance of these test suites, specifically test-suite reduction (while preserving coverage adequacy) and test-suite prioritization.
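As a rough illustration of these two maintenance tasks, the sketch below uses a simple greedy heuristic over coverage data. This is a generic, textbook-style approach chosen for brevity, not necessarily the techniques developed in this project; the function names and the input format (test name mapped to the set of covered requirements) are assumptions made for the example.

```python
from typing import Dict, List, Set


def reduce_suite(coverage: Dict[str, Set[str]]) -> List[str]:
    """Greedily pick tests until every requirement covered by the full
    suite is still covered, so coverage adequacy is preserved."""
    uncovered = set().union(*coverage.values())
    selected: List[str] = []
    while uncovered:
        # Sorting first makes tie-breaking deterministic (by test name).
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & uncovered))
        selected.append(best)
        uncovered -= coverage[best]
    return selected


def prioritize_suite(coverage: Dict[str, Set[str]]) -> List[str]:
    """Order all tests so that those adding the most not-yet-covered
    requirements run first ("additional coverage" ordering)."""
    remaining = sorted(coverage)
    uncovered = set().union(*coverage.values())
    order: List[str] = []
    while remaining:
        best = max(remaining, key=lambda t: len(coverage[t] & uncovered))
        if not coverage[best] & uncovered:
            # Nothing new can be covered: order the rest by total coverage.
            order.extend(sorted(remaining, key=lambda t: (-len(coverage[t]), t)))
            break
        order.append(best)
        remaining.remove(best)
        uncovered -= coverage[best]
    return order


# Hypothetical suite: four tests covering requirements a through d.
suite = {
    "t1": {"a", "b"},
    "t2": {"b", "c"},
    "t3": {"a", "b", "c"},
    "t4": {"d"},
}

reduced = reduce_suite(suite)
ordered = prioritize_suite(suite)
```

Here the reduced suite keeps only `t3` and `t4`, which together cover everything the full suite covers, while the prioritized order keeps all four tests but runs those two first. A greedy heuristic does not guarantee a minimum-size reduced suite; it trades optimality for simplicity.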
One method that we have employed to help developers understand the complex inner nature of software is information visualization. Software is often so complex that even the developers who initially created it cannot understand all of the possible runtime behaviors that it can exhibit, and in particular all of the bugs that it may contain. In order to present large code bases with innumerable characteristics and relationships among their components (e.g., instructions, variables, values, and timings), we have developed a number of novel visualizations of software.
ISR has long been an internationally recognized leader in research into all aspects of open source software development. In this role, researchers at ISR along with colleagues throughout the U.S. helped to develop a new agenda that can help guide future research into open source software development.