2018  |  2017  |  2016  |  2015  |  2014  |  2013  |  2012  |  2011  |  2010  |  2009  |  2008  |  2007  |  2006  |  2005  |  2004  |  2003  |  2002  |  2001  |  2000

Research Projects

The development of a software system is now ever more frequently a part of a larger development effort, including multiple software systems that co-exist in the same environment: a software ecosystem. Though most studies of the evolution of software have focused on a single software system, there is much that we can learn from the analysis of a set of interrelated systems. Topic modeling techniques show promise for mining the data stored in software repositories to understand the evolution of a system.

Project Dates: 
September 2012

When there is a major environmental disruption such as a natural disaster or war, it is not only the technical infrastructure that needs to be repaired but also the human infrastructure. I am currently studying collaboration resilience-the extent to which people continue to work and socialize despite such a disruption. In this project we are examining the role that information technology plays in helping people repair their human infrastructure.

Research Area(s): 
Project Dates: 
January 2008

In order to produce effective fault-localization, debugging, failure-clustering, and test-suite maintenance techniques, researchers would benefit from a deeper understanding of how faults (i.e., bugs) behave and interact with each other. Some faults, even if executed, may or may not propagate to the output, and even still may or may not influence the output in a way to cause failure. Furthermore, in the presence of multiple faults, faults may interact in a way to obscure each other or in a way to produce behavior not seen in their isolation.

Research Area(s): 
Project Dates: 
August 2011

In the era of big data and personalization, websites and (mobile) applications collect an increasingly large amount of personal information about their users. The large majority of users decide to disclose some but not all information that is requested from them. They trade off the anticipated benefits with the privacy risks of disclosure, a decision process that has been dubbed privacy calculus. Such decisions are inherently difficult though, because they may have uncertain repercussions later on that are difficult to weigh against the (possibly immediate) gratification of disclosure. How can we help users to balance the benefits and risks of information disclosure in a user-friendly manner, so that they can make good privacy decisions?

Project Dates: 
September 2010

Test suites often need to adapt to the software that it is intended to test. The core software changes and grows, and as such, its test suite also needs to change and grow. However, the test suites can often grow so large as to be unmaintainable. We have developed techniques to assist in the maintenance of these test suites, specifically in allowing for test-suite reduction (while preserving coverage adequacy) and test-suite prioritization.

Research Area(s): 
Project Dates: 
May 2001

We developed a token-based approach for large scale code clone detection which is based on a filtering heuristic that reduces the number of token comparisons when the two code blocks are compared. We also developed a MapReduce based parallel algorithm that uses the filtering heuristic and scales to thousands of projects. The filtering heuristic is generic and can also be used in conjunction with other token-based approaches. In that context, we demonstrated how it can increase the retrieval speed and decrease the memory usage of the index-based approaches.

Project Dates: 
July 2011 to January 2014

We developed techniques for clustering of failures. Failure-clustering techniques attempt to categorize failing test cases according to the bugs that caused them. Test cases are clustered by utilizing their execution profiles (gathered from instrumented versions of the code) as a means to encode the behavior of those executions. Such techniques can offer suggestions for duplicate submissions of bug reports.

Research Area(s): 
Project Dates: 
July 2007

Microtask crowdsourcing systems such as FoldIt and ESP partition work into short, self-contained microtasks, reducing barriers to contribute, increasing parallelism, and reducing the time to complete work. Could this model be applied to software development? To explore this question, we are designing a development process and cloud-based IDE for crowd development.

Project Dates: 
May 2012

Pages