We developed a token-based approach for large-scale code clone detection, based on a filtering heuristic that reduces the number of token comparisons needed when two code blocks are compared. We also developed a MapReduce-based parallel algorithm that uses the filtering heuristic and scales to thousands of projects. The filtering heuristic is generic and can also be used in conjunction with other token-based approaches. In that context, we demonstrated how it can increase the retrieval speed and decrease the memory usage of index-based approaches.
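The abstract does not spell out the filtering heuristic, so the following is only a minimal sketch of one common technique of this kind, prefix filtering from the set-similarity join literature, and not necessarily the authors' method. It assumes that two non-empty blocks count as clones when their token overlap reaches `theta * max(|B1|, |B2|)`; the threshold `theta`, the rarest-first global ordering, and all function names are illustrative assumptions. The idea is that if each block's tokens are sorted in one global order, two blocks can only reach the required overlap if their short prefixes share at least one token, so most pairs can be rejected without a full comparison.

```python
from collections import Counter
from math import ceil


def to_elements(tokens):
    """Turn a token list into a set of (token, occurrence) pairs so that
    repeated tokens count separately (multiset semantics)."""
    seen = Counter()
    elements = set()
    for tok in tokens:
        seen[tok] += 1
        elements.add((tok, seen[tok]))
    return elements


def global_order(blocks):
    """Sort key that ranks elements by ascending frequency across all
    blocks (rare first), with the element itself as a tie-breaker."""
    freq = Counter(e for b in blocks for e in b)
    return lambda e: (freq[e], e)


def prefix(block, theta, key):
    """First |B| - ceil(theta * |B|) + 1 elements of B in global order."""
    ordered = sorted(block, key=key)
    k = len(ordered) - ceil(theta * len(ordered)) + 1
    return set(ordered[:k])


def is_clone(b1, b2, theta):
    """Full comparison: overlap must reach theta * max(|b1|, |b2|)."""
    return len(b1 & b2) >= ceil(theta * max(len(b1), len(b2)))


def detect_clones(token_lists, theta=0.8):
    """Return (clone pairs, number of full comparisons performed)."""
    blocks = [to_elements(t) for t in token_lists]
    key = global_order(blocks)
    prefixes = [prefix(b, theta, key) for b in blocks]
    clones, full_comparisons = [], 0
    for i in range(len(blocks)):
        for j in range(i + 1, len(blocks)):
            # Filter: disjoint prefixes mean the pair cannot reach the threshold.
            if prefixes[i].isdisjoint(prefixes[j]):
                continue
            full_comparisons += 1
            if is_clone(blocks[i], blocks[j], theta):
                clones.append((i, j))
    return clones, full_comparisons


# Hypothetical example: three tiny token sequences standing in for code blocks.
example = [
    ["x", "=", "a", "+", "b"],
    ["y", "=", "a", "+", "b"],
    ["return", "foo", "(", "z", ")"],
]
clones, full_comparisons = detect_clones(example, theta=0.8)
```

In this toy input only the first two blocks survive the filter, so one full comparison is performed instead of three. The pairwise loop is kept for clarity; an index-based variant would store only the prefix tokens in an inverted index, which is where the memory savings mentioned above would come from.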
Yelp reviews and ratings are an important source of information for making informed decisions about a venue. We conjecture that further classifying Yelp reviews into relevant categories can help users make an informed decision based on their personal preferences for those categories. This is especially useful when users do not have time to read many reviews to infer the popularity of venues across these categories.
This research addresses challenges in understanding and developing lightweight, Web-based informal music education environments that bring the complexity and joy of orchestral music to diverse audiences. The challenges range from fostering awareness and appreciation of different classical music genres to supporting the creation of multi-instrument musical compositions, in ways that are fun and interactive.
Sustainability has become a pressing concern, especially given the looming effects of climate change. Sustainable development aims to meet current needs while preserving natural systems and the environment, so as not to compromise the ability of future generations to meet their own needs. Current software engineering methods, however, do not explicitly support sustainability or sustainable development.
In the era of big data and personalization, websites and (mobile) applications collect an increasingly large amount of personal information about their users. The large majority of users decide to disclose some but not all of the information that is requested from them. They trade off the anticipated benefits against the privacy risks of disclosure, a decision process that has been dubbed privacy calculus. Such decisions are inherently difficult, though, because they may have uncertain repercussions later on that are hard to weigh against the (possibly immediate) gratification of disclosure. How can we help users balance the benefits and risks of information disclosure in a user-friendly manner, so that they can make good privacy decisions?
Computer games may well be the quintessential domain for software engineering R&D. Why? Modern multi-player online games (MMOG) must address core issues in just about every major area of Computer Science and SE research and education.
Research shows that sharing one’s location can help people stay connected and coordinate daily activities, and can provide a sense of comfort and safety [1]. Recently, smartphones and location-based services (LBS) have become widely available in developed countries [7], but only a small percentage of smartphone users have ever tried sharing their location with other people [8]. Our work aims to understand the real-world factors shaping behaviors and attitudes towards social location-sharing, especially with regard to why people avoid or abandon the technology, or limit their usage.
One of the many challenges of software development and maintenance is the need for collaboration among many constituents and stakeholders. For example, clients interact with software development organizations; software development organizations consist of many developers and maintainers, both within the same location and across different locations; and the development organization often outsources some of the testing effort to independent test agencies. Each of these parties may reside in a different location, often across widely disparate time zones.