Code smells are an example of a pattern-oriented trend applied to software evolution. Because smells are abstract and informal, detecting them requires support from lower-level indicators such as individual metric values, the history of code changes, knowledge about program structure, or the results of program execution. We examine how the presence of program structures called micro patterns is correlated with code smells, and how micro patterns can be exploited for the purpose of smell detection.
Over the last 15 years, pattern-oriented research has been a growing field in software engineering. Starting with design patterns, the idea of providing generic, proven solutions to problems of a diverse nature has reached nearly every stage of the software development cycle. Patterns are finally being applied to software evolution, which is currently one of the most important challenges in modern software engineering.
Code smells are high-level, intuitive and informal characteristics of program source code that can make software hard to maintain. Using a medical metaphor, they play the role of easily observable symptoms that may indicate a serious illness, but they do not provide information about the root cause. The original catalogue of code smells was proposed by Fowler in 1999 and has been successively extended.
Because smells are vague and abstract, no single detection method exists for a given code smell. The Large Class smell, for example, describes a class that carries too much responsibility and probably should be split up. Several indicators can suggest its presence: numerous class members, many implemented interfaces, high internal complexity, or low values of cohesion metrics. A number of smell indicators should therefore usually be analyzed together. In earlier work we identified a number of such indicators and provided a few examples of how they support the detection of particular smells. They include, for instance, metric values, results of dynamic analysis, the history of code revisions, and knowledge about other flaws that have already been detected or rejected.
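As an illustration of combining several indicators for Large Class, consider the following minimal sketch. The `ClassMetrics` record, its field names and all threshold values are assumptions made for this example; they are not taken from our experiment or from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class ClassMetrics:
    """Hypothetical per-class measurements (field names are illustrative)."""
    name: str
    member_count: int      # fields plus methods
    interface_count: int   # implemented interfaces
    complexity: int        # e.g. summed complexity of all methods
    cohesion: float        # 0.0 (no cohesion) to 1.0 (fully cohesive)


# Illustrative thresholds only; real values must be calibrated per project.
MAX_MEMBERS = 40
MAX_INTERFACES = 5
MAX_COMPLEXITY = 100
MIN_COHESION = 0.3


def large_class_indicators(m):
    """Return the names of the Large Class indicators that fire for a class."""
    fired = []
    if m.member_count > MAX_MEMBERS:
        fired.append("member_count")
    if m.interface_count > MAX_INTERFACES:
        fired.append("interface_count")
    if m.complexity > MAX_COMPLEXITY:
        fired.append("complexity")
    if m.cohesion < MIN_COHESION:
        fired.append("cohesion")
    return fired


def is_large_class_candidate(m, min_indicators=2):
    """Flag a class only when several independent indicators agree."""
    return len(large_class_indicators(m)) >= min_indicators
```

Requiring at least two indicators to agree is one possible design choice: a single extreme metric value may simply reflect the nature of the class, whereas agreement between indicators makes the suspicion stronger.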
Figure 1: Hypothetical relationships between code smells and micro patterns
Micro patterns are another example of a pattern-related trend in software engineering. They are intended to capture common low-level programming techniques, both positive and negative. They can be thought of as class-level traceable patterns, i.e., structures similar to design patterns that can be mechanically recognized. The detection of micro patterns can provide hints on good and bad programming practices. In our research we analyze how micro patterns can be used as indicators of a code smell's presence.
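To give a feeling for what "mechanically recognized" means, the sketch below checks a class summary for a Data Manager-like shape, read here informally as a class whose methods are all getters or setters. The `ClassSummary` structure and the name-based accessor test are simplifying assumptions for illustration; an actual detector would inspect method bodies rather than method names.

```python
import re
from dataclasses import dataclass


@dataclass
class ClassSummary:
    """Hypothetical, simplified view of a class: field and method names only."""
    name: str
    fields: list
    methods: list


# Assumption: accessors are recognized purely by the JavaBeans naming style.
ACCESSOR_NAME = re.compile(r"^(get|set|is)[A-Z]")


def looks_like_data_manager(cls):
    """True when the class has state and every method looks like an accessor."""
    if not cls.fields or not cls.methods:
        return False
    return all(ACCESSOR_NAME.match(name) for name in cls.methods)


point = ClassSummary("Point", ["x", "y"], ["getX", "setX", "getY", "setY"])
parser = ClassSummary("Parser", ["buffer"], ["getBuffer", "parse"])
```

Under these assumptions `point` qualifies, while `parser` does not, because `parse` is not an accessor.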
The research involves the following steps:
- Identify hypothetical relationships that exist between micro patterns and code smells.
- Conduct an experiment to verify the hypotheses formulated in step one.
- Perform a post-mortem analysis to find associations between smells and micro patterns and their impact on the detection process.
Brief description of experiment
A number of smell-detection tools are currently on the market. Based on our experience and early experiments, we chose two: InFusion and PMD. Both are designed for analyzing Java programs, but they employ different definitions of the smells they detect.
The codebase for the experiment included six instances of software systems, coming from two open-source projects: GanttProject (in five versions) and JHotDraw (v. 7.0.9), with sizes varying from 393 to 800 classes.
An initial analysis was performed to determine the number of smells in all six instances. The GanttProject instances had 11 different code smells, which recurred, albeit with varying strength, across almost all versions of the software. JHotDraw was found to be affected by 13 smells, in most cases overlapping with those detected in GanttProject. Similarly, all instances were examined for the presence of micro patterns. We found examples of 13 different micro patterns in all instances of the analyzed projects.
Analysis of results
To find relationships between micro patterns and code smells, we first counted positive and negative matches of a given micro pattern to a particular smell and calculated how they correlate. Next, the dataset of smells and micro patterns was processed with the Weka implementation of the Predictive Apriori algorithm. This step produced a set of rules that express the identified relationships between code smells and micro patterns, each with a certain confidence value. It revealed 12 correlations with a confidence level higher than 50% (e.g., the co-existence of the Data Manager and Extender micro patterns implies the presence of the Significant Duplication smell with an accuracy of 71.5%). However, not all analyzed micro patterns appeared to be correlated with code smells: only six (out of 13) patterns and three (out of 11) smells were included in the generated rules.
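The two steps can be illustrated with a small sketch. It counts matches in a 2x2 table, computes the phi coefficient as one possible correlation measure, and evaluates the plain confidence of a rule. This is not the Weka Predictive Apriori implementation, which ranks rules by predictive accuracy; the function names and the toy data are assumptions made for illustration and do not come from the experiment.

```python
from math import sqrt


def contingency(rows, pattern, smell):
    """Count classes by presence of one micro pattern and one smell.

    Returns (both, pattern_only, smell_only, neither).
    """
    both = pattern_only = smell_only = neither = 0
    for patterns, smells in rows:
        has_pattern, has_smell = pattern in patterns, smell in smells
        if has_pattern and has_smell:
            both += 1
        elif has_pattern:
            pattern_only += 1
        elif has_smell:
            smell_only += 1
        else:
            neither += 1
    return both, pattern_only, smell_only, neither


def phi(a, b, c, d):
    """Phi coefficient of a 2x2 table; 0.0 when a row or column is empty."""
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


def rule_confidence(rows, antecedent, smell):
    """Share of classes with all antecedent patterns that also show the smell."""
    covered = [smells for patterns, smells in rows if antecedent <= patterns]
    if not covered:
        return None
    return sum(smell in smells for smells in covered) / len(covered)


# Toy data: one (micro patterns, smells) pair per class. Purely illustrative.
DUP = "Significant Duplication"
rows = [
    ({"Data Manager", "Extender"}, {DUP}),
    ({"Data Manager", "Extender"}, {DUP}),
    ({"Data Manager", "Extender"}, set()),
    ({"Data Manager"}, set()),
    ({"Extender"}, {DUP}),
    (set(), set()),
]

table = contingency(rows, "Extender", DUP)
correlation = phi(*table)
confidence = rule_confidence(rows, {"Data Manager", "Extender"}, DUP)
```

On the toy data, three classes exhibit both micro patterns and two of them also show the smell, so the rule's confidence is two thirds.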
Our results indicate that the presence of some micro patterns is correlated with the existence of code smells, and can be used to support the smell detection process, and later to evaluate the internal quality of the software. However, the experiment involved a relatively small code base, so a more thorough investigation is needed.
Francesca Arcelli, University of Milano Bicocca, Italy
Bartosz Walter, Poznań University of Technology, Poland