
Research Projects

Bitcoin is a digital currency and payment platform that has attracted much media attention. Unlike most conventional currencies, it is not backed by a government; instead, it is part of a democratic and decentralized movement. Bitcoin transactions are pseudo-anonymous, much like cash transactions. Why do people use this currency? How do their political values align with their use of Bitcoin? How does the community regulate itself in the absence of a formal hierarchical structure? And how do anonymous users form communities?

Project Dates: 
October 2013

Previous studies have shown that there is a non-trivial amount of duplication in source code. We analyzed a corpus of 2.6 million non-fork projects hosted on GitHub, representing over 258 million files written in Java, C++, Python, and JavaScript. We found that this corpus contains only 54 million unique files. In other words, 79% of the code on GitHub consists of clones of previously created files. There is considerable variation between language ecosystems: JavaScript has the highest rate of file duplication, with only 7% of its files distinct.
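
The 79% figure follows from the two counts above: 1 - 54/258 is roughly 0.79. As a minimal sketch of how such a rate could be measured, the Python below hashes file contents and counts distinct hashes. The function name `duplication_rate` and the use of exact SHA-256 content hashes are assumptions made for illustration; the study may have compared files differently (for example, on token sequences rather than raw bytes).

```python
import hashlib


def duplication_rate(file_contents):
    """Return the fraction of files that duplicate some other file's content.

    file_contents: iterable of bytes objects, one per file (hypothetical input).
    Two files count as duplicates only if their bytes are identical.
    """
    unique_hashes = set()
    total = 0
    for content in file_contents:
        total += 1
        unique_hashes.add(hashlib.sha256(content).hexdigest())
    if total == 0:
        return 0.0
    # Every file beyond the first with a given hash is counted as a clone.
    return 1 - len(unique_hashes) / total


# The reported counts: 54 million unique files out of 258 million.
reported_rate = 1 - 54 / 258
```

For a toy corpus with contents `a, a, b, a`, the function reports that half of the files are clones, since only two of the four are distinct.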

Project Dates: 
January 2017

Microtask crowdsourcing systems such as FoldIt and ESP partition work into short, self-contained microtasks, lowering the barriers to contribution, increasing parallelism, and reducing the time needed to complete work. Could this model be applied to software development? To explore this question, we are designing a development process and cloud-based IDE for crowd development.

Project Dates: 
May 2012

We developed a token-based approach for large-scale code clone detection, based on a filtering heuristic that reduces the number of token comparisons needed when two code blocks are compared. We also developed a MapReduce-based parallel algorithm that uses the filtering heuristic and scales to thousands of projects. The filtering heuristic is generic and can also be used in conjunction with other token-based approaches. In that context, we demonstrated how it can increase the retrieval speed and decrease the memory usage of index-based approaches.
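
To illustrate how a filter can cut token comparisons, here is a minimal Python sketch of prefix filtering, a common heuristic for overlap-based similarity. It is an assumption that this resembles the project's heuristic; the description above does not specify it. The names `tokenize`, `may_be_clone`, `is_clone_full`, and the threshold `theta` are hypothetical, blocks are treated as sets of distinct tokens, and tokens are ordered lexicographically for simplicity (real tools often order by global token frequency).

```python
import math
import re


def tokenize(code):
    """Split a code block into its set of distinct tokens."""
    return set(re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code))


def required_overlap(a, b, theta):
    """Minimum number of shared tokens for two blocks to count as clones."""
    return math.ceil(theta * max(len(a), len(b)))


def is_clone_full(a, b, theta):
    """Reference check: compare the complete token sets."""
    return len(a & b) >= required_overlap(a, b, theta)


def may_be_clone(a, b, theta):
    """Cheap filter: inspect only a short prefix of each sorted token list.

    If the pair needs t shared tokens and the prefixes (lengths
    len(a) - t + 1 and len(b) - t + 1) share nothing, then fewer than t
    tokens can be shared overall, so the pair is rejected without a
    full comparison. A True result still requires the full check.
    """
    t = required_overlap(a, b, theta)
    if t > min(len(a), len(b)):
        return False  # the smaller block cannot supply t shared tokens
    if t == 0:
        return True
    prefix_a = sorted(a)[: len(a) - t + 1]
    prefix_b = sorted(b)[: len(b) - t + 1]
    return not set(prefix_a).isdisjoint(prefix_b)


block_a = tokenize("int sum = a + b;")
block_b = tokenize("int total = a + b;")
block_c = tokenize("while (x) { y(); }")
```

With `theta = 0.8`, `block_a` and `block_b` pass the filter and the full check, while `block_a` and `block_c` are rejected after looking at only a few tokens. The filter never discards a true clone pair; it only skips pairs that cannot reach the threshold.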

Project Dates: 
July 2011 to January 2014

This project documents observational results from the playtesting-based evaluation of twenty-six computer games focused on science learning or scientific research. We refer to this little-studied genre of computer games as science learning games (SLGs). Our goal was to begin to identify a new set of criteria, play mechanics, and play experiences that give rise to play-based learning experiences across different scientific topics.

Project Dates: 
October 2014

We developed a fault-localization technique that uses correlation-based heuristics. The technique and tool are called Tarantula. Tarantula uses the pass/fail statuses of test cases, together with the events that occurred during the execution of each test case, to recommend to the developer which parts of the program may contain the faults that are causing test-case failures. The intuition of the approach is to find correlations between execution events and test-case outcomes: the events that correlate most highly with failure are suggested as places to begin investigation.
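
The intuition can be sketched with the suspiciousness score commonly associated with Tarantula in the fault-localization literature: for each statement, the fraction of failing tests that execute it, divided by the sum of that fraction and the fraction of passing tests that execute it. The Python below is a minimal illustration, not the tool's implementation; the function name, the input format, and the use of statements as the "events" are assumptions.

```python
def tarantula_suspiciousness(coverage, outcomes):
    """Score each statement by how strongly its execution correlates with failure.

    coverage: list of sets, the statement ids executed by each test (hypothetical format).
    outcomes: list of bools, True if the corresponding test passed.
    Returns a dict mapping statement id to a score in [0, 1].
    """
    total_passed = sum(1 for ok in outcomes if ok)
    total_failed = len(outcomes) - total_passed
    statements = set().union(*coverage)
    scores = {}
    for stmt in statements:
        passed = sum(1 for cov, ok in zip(coverage, outcomes) if ok and stmt in cov)
        failed = sum(1 for cov, ok in zip(coverage, outcomes) if not ok and stmt in cov)
        pass_ratio = passed / total_passed if total_passed else 0.0
        fail_ratio = failed / total_failed if total_failed else 0.0
        denominator = pass_ratio + fail_ratio
        scores[stmt] = fail_ratio / denominator if denominator else 0.0
    return scores


# Three tests over statements 1..3; only the third test fails.
example_scores = tarantula_suspiciousness(
    coverage=[{1, 2, 3}, {1, 2}, {1, 3}],
    outcomes=[True, True, False],
)
ranking = sorted(example_scores, key=example_scores.get, reverse=True)
```

In this toy run, statement 3 ranks first because it is executed by the failing test and by only one of the two passing tests, while statement 2, never executed by a failing test, ranks last.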

Project Dates: 
May 2001

Computer games may well be the quintessential domain for software engineering R&D. Why? Modern massively multiplayer online games (MMOGs) must address core issues in just about every major area of computer science and software engineering research and education.

Project Dates: 
January 2010
