Failure Clustering

Project Dates: 
July 2007
Research Area(s): 
Project Description: 

We developed techniques for clustering failures. Failure-clustering techniques attempt to group failing test cases according to the bugs that caused them. Each failing execution is represented by its execution profile, gathered from an instrumented version of the code, which encodes the behavior of that execution; test cases are then clustered by comparing these profiles. Such techniques can suggest which submitted bug reports are duplicates of one another. Today, bug reports submitted by users or developers are identified as duplicates of existing reports based on the textual descriptions of the symptoms they contain. Alternatively, duplicates are recognized only after the fact: a developer finds and fixes the bug behind one report, and later, while investigating other reports, discovers that they are no longer valid because their bugs were already fixed by the earlier debugging effort. Missed or late duplicate identification can cause information overload (for example, thousands of open bug reports), and it means that each bug investigation uses less information than would have been available had the duplicates been linked correctly. The automated techniques would provide heuristic suggestions that help the developer find similar bug reports.
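
As a rough illustration of the idea, the sketch below groups failing tests by the similarity of their execution profiles. It is not the project's actual technique: it assumes that a profile is simply the set of statements a test covered, that similarity is measured with the Jaccard index, and that tests are merged by single-link clustering at a fixed threshold. The test names, statement identifiers, and the 0.6 threshold are all hypothetical.

```python
from itertools import combinations


def jaccard(a: frozenset, b: frozenset) -> float:
    """Similarity of two coverage profiles: shared entities / all entities."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_failures(profiles: dict, threshold: float = 0.6) -> list:
    """Group failing tests whose profiles are at least `threshold` similar.

    Single-link clustering via union-find: two tests land in the same
    cluster if a chain of pairwise-similar tests connects them.
    """
    parent = {name: name for name in profiles}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if jaccard(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical profiles: each failing test maps to the statements it executed.
failing_profiles = {
    "test_parse_empty": frozenset({"parse:10", "parse:11", "parse:14", "util:3"}),
    "test_parse_blank": frozenset({"parse:10", "parse:11", "parse:14", "util:3", "util:4"}),
    "test_save_readonly": frozenset({"io:40", "io:41", "io:47"}),
    "test_save_missing_dir": frozenset({"io:40", "io:41", "io:47", "io:52"}),
}

clusters = cluster_failures(failing_profiles)
for members in clusters:
    print(members)
```

Each resulting cluster is a heuristic suggestion that its failures may share a cause, so the bug reports attached to them are candidates for being duplicates. A developer would still need to confirm this, since similar profiles do not guarantee a common bug.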