1. INTRODUCTION
Life on Earth presents humans with some very complex and complicated concepts. It is easy to suppose that cancer and quantum mechanics are both complex and complicated, as nature defines them that way. The best humans can do is build simplified model representations of complex physical phenomena. Software development is unique among scientific and mathematical disciplines in that developers can choose whether their work is going to be complicated.
Our research begins by examining and affirming the definitions of complicated and complex. Afterward, the consequences of complication are measured through a thorough examination of the bug reports and source code of prominent software projects. The analysis focused on the components of a LAMP server (Linux, Apache, MySQL, PHP). The results of this work show that when source code becomes complicated, it requires more effort to maintain. By establishing indicators of the complicated and the complex, a protocol can be defined to reduce the development of complicated software. As mentioned before, regardless of problem complexity, developers can choose whether their code will be complicated.
2. BACKGROUND
Complicated and Complex
The words complicated and complex are often misconstrued, and it is easy to assume that the two words mean the same thing. This becomes evident when we review standard definitions of the words, as it can be challenging to notice the differences. For example, a standard definition of complicated indicates that something complicated is composed of interconnected parts and is difficult to analyze (Mish, 2004). Similarly, something complex is “composed of many interconnected parts; compound; composite” (Mish, 2004). These definitions sound almost identical.
The etymology of the word complicated shows the Latin root “plic,” which means “to fold”. The word complex has the Latin root “plex,” which means “to weave”. Something that is woven has many strands, but the strands can be well organized. When “fold” is used as a suffix, it means a “multiple”. Something complicated can have fragments folded in, making it conceptually tricky to understand (Lissack, 1999).
Finally, when we consider whether something is complicated, we must acknowledge that, many times, something is complicated when it is not understood. Upon understanding a topic, it is often no longer complicated. With this in mind, consider that the number of people involved also plays a role in creating complicatedness. Having more people involved means that it is more likely that a single person will not understand it.
Git
The use of features found in Git was an essential part of this research. Git is a prevalent and heavily used version control system developed in 2005 by Linus Torvalds (Spinellis, 2012). Logs created by Git were used extensively and were examined using the git log command. A Python script was used to parse the logs, focusing on Git commits. A Git commit records that a file was created or updated and stored in a local Git repository (Torek, 2019).
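As an illustration of this kind of log parsing, commit subjects produced by git log can be scanned for bug-related keywords. The sketch below assumes output in the one-line "hash subject" format of git log --oneline, and the keyword list is an invented example; it is not the study's actual script or keyword set.

```python
import re

# Illustrative keyword list; the paper does not enumerate the exact keywords used.
BUG_KEYWORDS = re.compile(r"\b(fix|fixes|fixed|bug|fault|defect)\b", re.IGNORECASE)

def count_bug_commits(log_text):
    """Count commits whose subject line mentions a bug-related keyword.

    Expects `git log --oneline`-style input: one commit per line,
    in the form "<short-hash> <subject>".
    """
    total = bug_related = 0
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        total += 1
        if BUG_KEYWORDS.search(line):
            bug_related += 1
    return total, bug_related

sample = """\
a1b2c3d Fix null pointer dereference in parser
e4f5a6b Add configuration option for timeouts
c7d8e9f bug: off-by-one in buffer length
"""
print(count_bug_commits(sample))  # → (3, 2)
```

In practice such a script would read the output of `git log --oneline` for a repository rather than a hard-coded sample.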
Static Analysis
Static analysis examines a program's source code without actually executing the program. One of the first tools for performing static analysis was Lint, created in 1978 (Bardas, 2010). Static analysis has improved considerably since then, and bugs flagged by the original Lint are now reported in compiler output. Many static analysis tools exist today, each specializing in different areas (Bardas, 2010). This study used the Cppcheck tool (Marjamäki, 2019) because it looks for multiple types of errors and can be run immediately, without being integrated into a working build.
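A minimal sketch of driving Cppcheck from Python is shown below. Cppcheck writes its diagnostics to stderr, one per line; the particular --enable flags are illustrative assumptions, not the exact configuration used in the study.

```python
import shutil
import subprocess

def cppcheck_command(path, enable="warning,style"):
    """Build a Cppcheck invocation for a source tree.

    The flag choices here are illustrative; Cppcheck supports many
    more options than are shown.
    """
    return ["cppcheck", "--enable=" + enable, "--quiet", path]

def run_cppcheck(path):
    """Run Cppcheck if installed and return its diagnostics.

    Cppcheck reports findings on stderr, one diagnostic per line.
    """
    if shutil.which("cppcheck") is None:
        raise RuntimeError("cppcheck is not installed")
    result = subprocess.run(cppcheck_command(path),
                            capture_output=True, text=True)
    return result.stderr.splitlines()

print(cppcheck_command("src/"))
# → ['cppcheck', '--enable=warning,style', '--quiet', 'src/']
```

Counting the returned lines per snapshot gives the kind of per-year error estimate reported in the results.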
Measurements and Metrics
Attempting to understand complication and complexity is not a new endeavor, and several metrics have been invented over the decades to quantify these subjects. The earliest metric was simply counting lines of code, reported in units such as LOC (lines of code) and KLOC (thousands of lines of code). In 1976, Thomas McCabe developed the cyclomatic complexity metric, a measure of the number of linearly independent paths through a program (McCabe, 1976). In 1977, Maurice Halstead published complexity metrics based on the number of distinct operators, distinct operands, total operators, and total operands (Halstead, 1977). From these counts, values for difficulty and effort are derived. The Halstead and McCabe metrics have been both praised and criticized, but they are relatively simple and practical contributions to the software engineering community. More advanced metrics exist, but as this research is focused on identifying complexity and complicatedness, we have deliberately chosen metrics that are themselves uncomplex and uncomplicated.
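The Halstead derivation from the four raw counts can be stated compactly. The sketch below uses the standard published formulas (volume, difficulty, effort); the example counts are purely illustrative.

```python
import math

def halstead_metrics(n1, n2, N1, N2):
    """Compute basic Halstead metrics from the four raw counts:
    n1 = distinct operators, n2 = distinct operands,
    N1 = total operators,    N2 = total operands.

    Standard formulas: volume V = N * log2(n) where n = n1 + n2 and
    N = N1 + N2; difficulty D = (n1 / 2) * (N2 / n2); effort E = D * V.
    """
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2) * (N2 / n2)
    effort = difficulty * volume
    return {"volume": volume, "difficulty": difficulty, "effort": effort}

# Illustrative counts for a small function (not taken from any project).
m = halstead_metrics(n1=10, n2=7, N1=25, N2=20)
print(round(m["difficulty"], 2))  # → 14.29
```

Effort is the quantity most often read as "mental work to write or understand the code," which is why it is relevant to a study of complicatedness.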
3. RELATED WORK
Existing Literature
Several research studies have considered the human aspects of complicated source code. Daniel Sturtevant's MIT thesis does excellent work describing the impact of software architectural design and connecting it to employee productivity and staff turnover. He worked with a professional organization and included a case study of a development project showing that the structure of a software system does impact productivity and employee retention. Sturtevant points out that further research on different types of projects is warranted and that there are many opportunities for it (Sturtevant, 2013).
Human considerations are further explored by Foucault et al. (2015), who researched the impact that developer turnover has on software quality. Their paper covers open-source software and demonstrates a link between developer turnover and the number of bugs found in a project (Foucault, 2015). Interestingly, Foucault and Sturtevant do not entirely agree on whether poorly designed software drives developers away or whether turnover creates poorly designed software.
The turnover of developers might also be a cause of code churn, as described in the paper Use of Relative Code Churn Measures to Predict System Defect Density (Nagappan, 2005). Code churn is a metric based on how much the code changes, and it has been shown to be an indicator of faults. The work suggests that more recently changed code is faultier than the original code (Nagappan, 2005).
Other papers analyzed measurable aspects of the software itself, rather than human influence. In the article Reexamining the Fault Density - Component Size Connection, Les Hatton discusses the relationship between the size of a program and the number of faults. Lines of code (LOC) has always been a good predictor of fault opportunities, and it is intuitively apparent why (Hatton, 1997). A four-line “Hello World” program has fewer opportunities for faults than a four-thousand-line program; more lines offer more opportunities for mistakes. Though LOC is a good indicator of bug opportunities, it does not help narrow down the specific locations where faults may be present.
Previous Project Work
In previous work on this project, we determined which programming practices software engineers found most difficult. We discovered what engineers misinterpret and what they find displeasing to review. Code that was consistently unpleasant to review also had higher Halstead and cyclomatic complexity. Stylistic constructs that programmers generally have difficulty evaluating are shown in Table 1 (Dorin, 2018).
Table 1
Unpleasant Styles (Dorin, 2018)
Style Name
Avoid too deep blocks
Do not write over 120 columns per line
Matching braces should be in the same column
Use less than five parameters
Use braces even for one statement
Indent blocks inside a function
To measure project conformance with stylistic rules, the nsiqcppstyle tool (Yoon, 2014) was used. Building on this previously developed knowledge of complex and complicated code, we now expand the work to develop indicators that relate complicated source code to ongoing maintenance effort.
4. METHODOLOGY
Existing literature covers many important subjects, but most papers are specific to particular cases and environments. In this project, we reviewed several active open-source projects maintained with Git. Popular projects were preferred because they give users the most opportunities to report bugs. GitHub.com was used as the source for all projects. The four cornerstone projects selected were Linux, Apache, MySQL, and PHP. The remaining projects were chosen from a report of active projects on GitHub. Once projects were selected, the following steps were performed:
1. Calculation of thousands of lines of code (KLOC) for C, C++, and headers. The amount of information given to a human to process is a factor in how much can be understood. For each project, for each source related file, we used the Lizard tool (Yin, 2019) to calculate the lines of code in both source files and headers.
2. Calculation of Halstead for C, C++, and headers. The Halstead metric, which evaluates complexity based on distinct operators, distinct operands, the total number of operators, and the total number of operands, was also used in this study. For each project, we used a program called commented code detector (Borowiec, 2014) to measure the Halstead metrics.
3. Calculation of McCabe cyclomatic complexity for C, C++, and headers. The McCabe cyclomatic complexity metric is based on the number of linearly independent paths through code. The cyclomatic complexity was used because more paths require more effort to understand. The Lizard tool (Yin, 2019) was also used to gather information on cyclomatic complexity for each project.
4. Determination of “bug”-sized changes. Several small utilities were written to analyze the logs maintained by Git. By searching for keywords, we determined the typical number of lines of code changed in bug fixes and used this to classify bugs by size. We classified tiny bugs as “gnats”; they are important to consider because numerous small bugs take resources away from developing essential features. Gnat-sized bugs constitute 50% of the Git check-ins, which is a good indication of how bothersome they are.
5. Measurement of conformance to stylistic rules. From our previous work, we determined what was essential and what rules had the most impact on programmer understanding. As with our past projects, we used the nsiqcppstyle tool to measure stylistic rule conformance.
6. Measurement of basic coupling between modules. Though many utilities exist for measuring coupling, they were not practical for our use. For coupling, we created a straightforward metric which we called Sheficom. The Sheficom metric computes how many external modules are coupled using a count of the number of headers in the module. Sheficom is based on the assumption that, if a module is not coupled, it would not need to include a header.
7. Measurement of the percentage of bug-related changes. The goal of this effort was to measure how much work in a project was dedicated to bug fixes. By using the technique outlined in Item 4 to identify bugs, the percentage of bug-related changes was measured. We counted the aggregate number of complete bugs as well as the total number of lines changed for fixing bugs.
8. Count of the number of authors. As mentioned, one hallmark of the complicated is an absence of understanding. More programmers mean more people who must understand the source code. Depending on the communication paths and the capabilities of the authors, understanding can suffer.
9. Performance of static analysis of modules. The Cppcheck tool (Marjamäki, 2019) was used to perform static analysis on the source code. This tool was chosen as it does not require performing any special actions or creating special configurations to analyze the source code, but it will provide an estimate of errors which exist in the project.
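The Sheficom coupling metric described in step 6 can be sketched directly from its stated assumption: an uncoupled module would need no includes, so counting a module's distinct headers approximates its coupling. The name and counting rule come from the paper; the implementation details below (regex, deduplication) are our assumptions.

```python
import re

# Match C/C++ include directives and capture the header name,
# whether written as <header> or "header".
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^">]+)[">]', re.MULTILINE)

def sheficom(source_text):
    """Sheficom coupling sketch: the number of distinct headers a
    module includes, on the assumption that an uncoupled module
    would not need to include any header."""
    return len(set(INCLUDE_RE.findall(source_text)))

module = '''
#include <stdio.h>
#include "parser.h"
#include "parser.h"   /* duplicate include counted once */
int main(void) { return 0; }
'''
print(sheficom(module))  # → 2
```

A per-project score would then be an aggregate (for instance, a sum or mean) of this count over all modules; the paper does not specify the aggregation, so that choice is left open here.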
Table 2
Metric | Software Complexity | Human Complication
Size | | X
Style | | X
Sheficom (coupling) | | X
Author Turnover | | X
Halstead | X |
Cyclomatic | X |
These metrics were used to build a picture of the effort needed for ongoing maintenance work. All of the metrics, in one way or another, influence effort. However, some metrics bear more directly on what humans perceive as complicated (see Table 2). For example, under the new Sheficom metric, tightly coupled modules take a human longer to review and therefore more time to understand. It is plausible that a human, given enough time, could understand what is going on; at that point, though the code remains complex, it is no longer complicated. Author turnover was also essential to measure, as new authors require time to build up an understanding of the existing code.
5. RESULTS
Linux
Linux is a very well-established operating system, and for this study we performed measurements every year for ten years, starting in 2008. See Table 3 and Figure 2 for a summary of the results. Coding style conformance declines year by year (see Figure 2), while improvements in module coupling and cyclomatic complexity are apparent. However, the size of the project grows linearly over time, and the number of bugs detected by static analysis per snapshot also increases. As shown in Table 3, more than forty percent of the ongoing effort on Linux consists of bug-related changes. One more thing to bear in mind is that Linux has many authors working on the project, which contributes to the project being more complicated. A large number of authors and a large amount of code lead to misunderstandings, and much effort is spent fixing bugs rather than adding new functionality.
Apache
The results of the analysis of Apache are listed in Figure 3 and Table 4. As significant projects go, Apache does an outstanding job with the level of effort dedicated to bug fixes versus improvements (~32% for bug fixes). A positive aspect of Apache is that many authors remain from release to release (~85%), which means understanding can remain high, reducing the impact of complicated code. We can also see improvements in cyclomatic complexity, while the Halstead measurements, coupling, and style all remain steady. Even the Cppcheck static analysis snapshot remains steady.
MySQL
The results for MySQL are listed in Figure 4 and Table 5. MySQL has grown dramatically in lines of code; note the codebase growth between 2015 and 2018, when a large amount of code was added. Along with this growth, coupling (Sheficom) has increased, the problems predicted by static analysis (Cppcheck) have increased, and, most importantly, the effort spent fixing bugs has also increased over time.
PHP
PHP shows conditions of more complicated code in several areas, as shown in Figure 5. Style conformance has dropped while the number of lines of code has grown. The number of veteran authors has fluctuated, leaving only a few people with an ongoing understanding of the code. Halstead and cyclomatic complexity both indicate complicated code in this product, which will also make it harder for new authors to join the project. PHP spends about half of its effort fixing issues rather than adding new features.
ImageMagick
Finally, the results for the ImageMagick project were also analyzed and are shown in Figure 6. In this project, style conformance is not much different from the other projects, and measurements such as Halstead and cyclomatic complexity are not superb either. However, note the strong connection between existing authors and bug changes: after 2014, a decrease in existing authors coincides with a dramatic increase in bug-fixing effort. The energy spent on bugs reduced the effort available for new features as new authors arrived.
6. DISCUSSION
This study used several metrics to identify how complicated a project’s source code is. The metrics covered different areas related to complicated code, including author turnover and module coupling. The most significant metrics seem to be the size of the project and the number of human contributors. All software engineers have had the experience that something is initially complicated but, as time goes by and more effort is applied, becomes less so. The metrics related to software complexity, as shown in Table 2, were consistent year over year on most of the projects. More successful projects retain veteran programmers, as new authors appear to be less effective. Complexity metrics, as shown in Table 2, are a harbinger of how long it may take new authors to understand the code and become productive. Projects with veteran programmers consistently show that less time and effort is required to address bugs during maintenance.
7. RISKS TO VALIDITY
Several difficulties had to be addressed in this project. One was ascertaining the number of bugs in a project: developers often find and fix bugs without mentioning them in bug-tracking systems, and a single Git commit may include multiple bug fixes. To overcome this, we evaluated bug-sized fixes and compared them to Git commits in general; we may therefore be overestimating or underestimating the bug count. Another area of risk lies in the tools selected. They were all open source and easy to find, but an exhaustive test of the tools was not performed as part of this study.
Finally, an understanding of the terms complex and complicated is difficult to arrive at, as humans will all perceive these concepts differently. Something may be thought complicated, but after some time and understanding, the judgment of complicated no longer seems accurate.
8. CONCLUSION
The paper shows that there are consequences to having complicated code. Many aspects play a part in how complicated source code is, but the most significant contributors seem to be the size of the project and the number of participating authors. More lines of code mean more to understand, and the more authors who participate in a project, the more people must understand its code. These factors increase the amount of time spent fixing bugs as opposed to improving the product.
Simple metrics, as used in this paper, can be employed to identify complicated areas of a project. Areas flagged as complicated will take longer to understand, and the longer understanding takes, the more effort bug fixing requires. If code is hard to understand, new authors cannot come up to speed quickly, and existing authors may not wish to remain on the project. Projects must either retain loyal authors or spend more time fixing problems instead of adding features.
Areas for future study include further effort mapping static analysis findings back to complicated code and measuring how many identified problem areas remain in a delivered product. A second prospective study would focus on projects that follow all of the rules established here and verify whether such projects truly have fewer errors than those that ignore them.