[go: nahoru, domu]

Jump to content

Essential complexity: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
for starters, these are clearly not indended to be the same notion.
Changing short description from "numerical measure defined by Thomas J. McCabe" to "Numerical measure of program structure"
 
(37 intermediate revisions by 18 users not shown)
Line 1: Line 1:
{{Short description|Numerical measure of program structure}}
{{split}}
{{About-distinguish-text|the numerical measure of a program's "structuredness" defined by McCabe|the notion antonymic to accidental complexity used by Brooks in [[No Silver Bullet]] and subsequent works}}
'''Essential complexity''' has been defined in various ways by various people. One definition is as an antonym to [[accidental complexity]]. Another definition is a number attempting to measure how close (or far) a given program is from being a [[structured programming|structured program]].


'''Essential complexity''' is a [[numerical measure]] defined by Thomas J. McCabe, Sr., in his highly cited, 1976 paper better known for introducing [[cyclomatic complexity]]. McCabe defined essential complexity as the cyclomatic complexity of the reduced CFG ([[control-flow graph]]) after iteratively replacing (reducing) all [[structured programming]] [[control structure]]s, i.e. those having a single entry point and a single exit point (for example if-then-else and while loops) with placeholder single statements.<ref name="mccabe76">{{cite journal| last=McCabe| date=December 1976| journal=IEEE Transactions on Software Engineering| pages=308–320| title=A Complexity Measure| issue=4| doi=10.1109/tse.1976.233837| s2cid=9116234}}</ref>{{rp|317}}<ref>http://www.mccabe.com/pdf/mccabe-nist235r.pdf {{Bare URL PDF|date=March 2022}}</ref>{{rp|80}}<!-- note that the original paper has an error in the final formula for ev, but this is corrected in the technical report-->
== Antonym to accidental complexity ==
'''Essential complexity''' refers to a situation where all reasonable solutions to a problem must be complicated (and possibly confusing) because the "simple" solutions would not adequately solve the problem.{{cn}} It stands in contrast to [[accidental complexity]], which arises purely from mismatches in the particular choice of tools and methods applied in the solution.


McCabe's reduction process is intended to simulate the conceptual replacement of control structures (and actual statements they contain) with subroutine calls, hence the requirement for the control structures to have a single entry and a single exit point.<ref name="mccabe76"/>{{rp|317}} (Nowadays a process like this would fall under the umbrella term of [[refactoring]].) All structured programs evidently have an essential complexity of 1 as defined by McCabe because they can all be iteratively reduced to a single call to a top-level subroutine.<ref name="mccabe76"/>{{rp|318}} As McCabe explains in his paper, his essential complexity metric was designed to provide a measure of how far off this ideal (of being completely structured) a given program was.<ref name="mccabe76"/>{{rp|317}} Thus greater than 1 essential complexity numbers, which can only be obtained for non-structured programs, indicate that they are further away from the structured programming ideal.<ref name="mccabe76"/>{{rp|317}}
This term has been used since, at least, the mid-1980s. [[Turing Award]] winner [[Fred Brooks]] has used this term and its antonym of [[accidental complexity]] since the mid-1980s. He has also updated his views in 1995 for an anniversary edition of ''Mythical Man-Month,'' chapter 17 "'[[No Silver Bullet]]' Refired".<ref name="Brooks, Proc. IFIP" >[[#Brooks, Proc. IFIP|Brooks, Proc. IFIP]]</ref><ref>Brooks, IEEE Computer</ref><ref>Brooks, Mythical Man-Month, Silver Bullet Refired</ref>
<ref name="nist_web">{{cite web|
url=http://hissa.nist.gov/HHRFdata/Artifacts/ITLdoc/235/chaptera.htm|
title=Structured Testing: A Testing Methodology Using the Cyclomatic Complexity Metric Chapter 10: Essential Complexity|
author=McCabe, Watson|
year=1996}}</ref>


To avoid confusion between various notions of reducibility to structured programs, it's important to note that McCabe's paper briefly discusses and then operates in the context of a 1973 paper by [[S. Rao Kosaraju]], which gave a refinement (or alternative view) of the [[structured program theorem]]. The seminal 1966 paper of Böhm and Jacopini showed that all programs can be [re]written using only structured programming constructs, (aka the D structures: sequence, if-then-else, and while-loop), however, in transforming a random program into a structured program additional variables may need to be introduced (and used in the tests) and some code may be duplicated.<ref name="WattFindlay2004">{{cite book|author1=David Anthony Watt|author2=William Findlay|title=Programming language design concepts|year=2004|publisher=John Wiley & Sons|isbn=978-0-470-85320-7|pages=228}}</ref>
== Measure of a closeness to being a structured program ==
'''Essential complexity''' is also used with a different meaning in connection with [[cyclomatic complexity]]. In this context, essential complexity refers to the cyclomatic complexity after iteratively replacing all well structured control structures with a single statement. Structures such as if-then-else and while loops are considered well structured and then do not increase the essential cyclomatic complexity. Unconstrained use of goto, break and continue statements can produce programs which can not be reduced in this way.


In their paper, Böhm and Jacopini conjectured, but did not prove that it was necessary to introduce such additional variables for certain kinds of non-structured programs in order to transform them into structured programs.<ref name="K">{{cite journal|title=Analysis of structured programs|author=S. Rao Kosaraju|journal=Journal of Computer and System Sciences|volume=9|number=3|date=December 1974|doi=10.1016/S0022-0000(74)80043-7|pages=232–255|doi-access=}}</ref>{{rp|236}} An example of program (that we now know) does require such additional variables is a loop with two conditional exits inside it. In order to address the conjecture of Böhm and Jacopini, Kosaraju defined a more restrictive notion of program reduction than the Turing equivalence used by Böhm and Jacopini. Essentially, Kosaraju's notion of reduction imposes, besides the obvious requirement that the two programs must compute the same value (or not finish) given the same inputs, that the two programs must use the same primitive actions and predicates, the latter understood as expressions used in the conditionals. Because of these restrictions, Kosaraju's reduction does not allow the introduction of additional variables; assigning to these variables would create new primitive actions and testing their values would change the predicates used in the conditionals. Using this more restrictive notion of reduction, Kosaraju proved Böhm and Jacopini's conjecture, namely that a loop with two exits cannot be transformed into a structured program ''without introducing additional variables'', but went further and proved that programs containing multi-level breaks (from loops) form a hierarchy, such that one can always find a program with multi-level breaks of depth ''n'' that cannot be reduced to a program of multi-level breaks with depth less than ''n'', again without introducing additional variables.<ref name="K"/><ref>For more modern treatment of the same results see: Kozen, [http://www.cs.cornell.edu/~kozen/papers/bohmjacopini.pdf The Böhm–Jacopini Theorem is False, Propositionally]</ref>
However, the [[structured program theorem]] proves that goto is '''never''' necessary; any program including goto can be re-written as an equivalent program that does not contain goto. Hence, this sense of essential complexity is not essential.


McCabe notes in his paper that in view of Kosaraju's results, he intended to find a way to capture the essential properties of non-structured programs in terms of their control-flow graphs.<ref name="mccabe76"/>{{rp|315}} He proceeds by first identifying the control-flow graphs corresponding to the smallest non-structured programs (these include branching into a loop, branching out of a loop, and their if-then-else counterparts) which he uses to formulate a theorem analogous to [[Kuratowski's theorem]], and thereafter he introduces his notion of essential complexity in order to give a scale answer ("measure of the structuredness of a program" in his words) rather than a yes/no answer to the question of whether a program's control-flow graph is structured or not.<ref name="mccabe76"/>{{rp|315}} Finally, the notion of reduction used by McCabe to shrink the CFG is not the same as Kosaraju's notion of reducing flowcharts. The reduction defined on the CFG does not know or care about the program's inputs, it is simply a [[graph transformation]].<ref>McCabe footnotes the two definitions of on pages 315 and 317.</ref>
For example, the following C program fragment has an essential complexity of 1, because the inner '''if''' statement and the '''for''' can be reduced:


For example, the following C program fragment has an essential complexity of 1, because the inner '''if''' statement and the '''for''' can be reduced, i.e. it is a structured program.
for(i=0;i<3;i++) {

if(a[i] == 0) b[i] += 2;
<syntaxhighlight lang="C">
for (i = 0; i < 3; i++) {
if (a[i] == 0) b[i] += 2;
}
}
</syntaxhighlight>


The following C program fragment has an essential complexity of more than one. It finds the first row of z which is all zero and puts that index in i; if there is none, it puts -1 in i.
The following C program fragment has an essential complexity of four; its CFG is irreducible. The program finds the first row of z which is all zero and puts that index in i; if there is none, it puts -1 in i.


<syntaxhighlight lang="C">
for(i=0;i<m;i++) {
for(j=0;j<n;j++) {
for (i = 0; i < m; i++) {
if(z[i][j] != 0) '''goto non_zero''';
for (j = 0; j < n; j++) {
}
if (z[i][j] != 0)
'''goto found''';
goto non_zero;
non_zero:
}
}
goto found;
non_zero:
i = -1;
found:
}
i = -1;
found:
</syntaxhighlight>

The idea of CFG reducibility by successive collapses of sub-graphs (ultimately to a single node for well-behaved CFGs) is also used in modern compiler optimization. However the notion from structured programming of single-entry and single-exit control structure is replaced with that of [[natural loop]], which is defined as a "single-entry, multiple-exit loop, with only a single branch back to the entry from within it". The areas of the CFG that cannot be reduced to natural loops are called ''improper regions''; these regions end up having a fairly simple definition: multiple-entry, strongly connected components of the CFG. The simplest improper region is thus a loop with two entry points. Multiple exits do not cause analysis problems in modern compilers. Improper regions (multiple-entries into loops) do cause additional difficulties in optimizing code.<ref name="Muchnick1997">{{cite book|author=Steven S. Muchnick|title=Advanced Compiler Design Implementation|year=1997|publisher=Morgan Kaufmann|isbn=978-1-55860-320-2|pages=[https://archive.org/details/advancedcompiler00much/page/196 196–197 and 215]|url-access=registration|url=https://archive.org/details/advancedcompiler00much/page/196}}</ref>


==See also==
==See also==
* [[History of software engineering]]
* [[History of software engineering]]
* [[Software prototyping]] (one of the main strategies against essential complexity in "No Silver Bullet")
* [[Decision-to-decision path]]
* [[Decision-to-decision path]]
* [[Occam's Razor]]
* [[No silver bullet]]
* [[Cyclomatic complexity]]
* [[Cyclomatic complexity]]

==Further reading==
* {{cite journal
|title=No Silver Bullet - Essence and Accident in Software Engineering
|last=Brooks |first=Fred P.
|authorlink=Fred Brooks
|journal=Proceedings of the IFIP Tenth World Computing Conference
|pages=1069–1076
|year=1986
|ref=Brooks, Proc. IFIP
}}
* {{cite journal
|title=No Silver Bullet - Essence and Accidents of Software Engineering
|last=Brooks |first=Fred P.
|authorlink=Fred Brooks
|authormask=1
|journal=[[Computer (magazine)|IEEE Computer]]
|volume=20 |issue=4
|date=April 1987
|pages=10–19
|ref=Brooks, IEEE Computer
}}

* {{cite book
|work=The Mythical Man Month
|edition=Anniversary Edition with four new chapters
|chapter=Chap. 17
|title='No Silver Bullet' Refired
|last=Brooks |first=Fred P.
|authorlink=Fred Brooks
|authormask=1
|publisher=Addison-Wesley
|year=1995
|isbn=0-201-83595-9
|ref=Brooks, Mythical Man-Month, Silver Bullet Refired
}}


== References ==
== References ==

Latest revision as of 22:47, 5 March 2024

Essential complexity is a numerical measure defined by Thomas J. McCabe, Sr., in his highly cited, 1976 paper better known for introducing cyclomatic complexity. McCabe defined essential complexity as the cyclomatic complexity of the reduced CFG (control-flow graph) after iteratively replacing (reducing) all structured programming control structures, i.e. those having a single entry point and a single exit point (for example if-then-else and while loops) with placeholder single statements.[1]: 317 [2]: 80 

McCabe's reduction process is intended to simulate the conceptual replacement of control structures (and actual statements they contain) with subroutine calls, hence the requirement for the control structures to have a single entry and a single exit point.[1]: 317  (Nowadays a process like this would fall under the umbrella term of refactoring.) All structured programs evidently have an essential complexity of 1 as defined by McCabe because they can all be iteratively reduced to a single call to a top-level subroutine.[1]: 318  As McCabe explains in his paper, his essential complexity metric was designed to provide a measure of how far off this ideal (of being completely structured) a given program was.[1]: 317  Thus greater than 1 essential complexity numbers, which can only be obtained for non-structured programs, indicate that they are further away from the structured programming ideal.[1]: 317 

To avoid confusion between various notions of reducibility to structured programs, it's important to note that McCabe's paper briefly discusses and then operates in the context of a 1973 paper by S. Rao Kosaraju, which gave a refinement (or alternative view) of the structured program theorem. The seminal 1966 paper of Böhm and Jacopini showed that all programs can be [re]written using only structured programming constructs, (aka the D structures: sequence, if-then-else, and while-loop), however, in transforming a random program into a structured program additional variables may need to be introduced (and used in the tests) and some code may be duplicated.[3]

In their paper, Böhm and Jacopini conjectured, but did not prove that it was necessary to introduce such additional variables for certain kinds of non-structured programs in order to transform them into structured programs.[4]: 236  An example of program (that we now know) does require such additional variables is a loop with two conditional exits inside it. In order to address the conjecture of Böhm and Jacopini, Kosaraju defined a more restrictive notion of program reduction than the Turing equivalence used by Böhm and Jacopini. Essentially, Kosaraju's notion of reduction imposes, besides the obvious requirement that the two programs must compute the same value (or not finish) given the same inputs, that the two programs must use the same primitive actions and predicates, the latter understood as expressions used in the conditionals. Because of these restrictions, Kosaraju's reduction does not allow the introduction of additional variables; assigning to these variables would create new primitive actions and testing their values would change the predicates used in the conditionals. Using this more restrictive notion of reduction, Kosaraju proved Böhm and Jacopini's conjecture, namely that a loop with two exits cannot be transformed into a structured program without introducing additional variables, but went further and proved that programs containing multi-level breaks (from loops) form a hierarchy, such that one can always find a program with multi-level breaks of depth n that cannot be reduced to a program of multi-level breaks with depth less than n, again without introducing additional variables.[4][5]

McCabe notes in his paper that in view of Kosaraju's results, he intended to find a way to capture the essential properties of non-structured programs in terms of their control-flow graphs.[1]: 315  He proceeds by first identifying the control-flow graphs corresponding to the smallest non-structured programs (these include branching into a loop, branching out of a loop, and their if-then-else counterparts) which he uses to formulate a theorem analogous to Kuratowski's theorem, and thereafter he introduces his notion of essential complexity in order to give a scale answer ("measure of the structuredness of a program" in his words) rather than a yes/no answer to the question of whether a program's control-flow graph is structured or not.[1]: 315  Finally, the notion of reduction used by McCabe to shrink the CFG is not the same as Kosaraju's notion of reducing flowcharts. The reduction defined on the CFG does not know or care about the program's inputs, it is simply a graph transformation.[6]

For example, the following C program fragment has an essential complexity of 1, because the inner if statement and the for can be reduced, i.e. it is a structured program.

   for (i = 0; i < 3; i++) {
      if (a[i] == 0) b[i] += 2;
   }

The following C program fragment has an essential complexity of four; its CFG is irreducible. The program finds the first row of z which is all zero and puts that index in i; if there is none, it puts -1 in i.

   for (i = 0; i < m; i++) {
      for (j = 0; j < n; j++) {
         if (z[i][j] != 0)
            goto non_zero;
      }
      goto found;
non_zero:
   }
   i = -1;
found:

The idea of CFG reducibility by successive collapses of sub-graphs (ultimately to a single node for well-behaved CFGs) is also used in modern compiler optimization. However the notion from structured programming of single-entry and single-exit control structure is replaced with that of natural loop, which is defined as a "single-entry, multiple-exit loop, with only a single branch back to the entry from within it". The areas of the CFG that cannot be reduced to natural loops are called improper regions; these regions end up having a fairly simple definition: multiple-entry, strongly connected components of the CFG. The simplest improper region is thus a loop with two entry points. Multiple exits do not cause analysis problems in modern compilers. Improper regions (multiple-entries into loops) do cause additional difficulties in optimizing code.[7]

See also[edit]

References[edit]

  1. ^ a b c d e f g McCabe (December 1976). "A Complexity Measure". IEEE Transactions on Software Engineering (4): 308–320. doi:10.1109/tse.1976.233837. S2CID 9116234.
  2. ^ http://www.mccabe.com/pdf/mccabe-nist235r.pdf [bare URL PDF]
  3. ^ David Anthony Watt; William Findlay (2004). Programming language design concepts. John Wiley & Sons. p. 228. ISBN 978-0-470-85320-7.
  4. ^ a b S. Rao Kosaraju (December 1974). "Analysis of structured programs". Journal of Computer and System Sciences. 9 (3): 232–255. doi:10.1016/S0022-0000(74)80043-7.
  5. ^ For more modern treatment of the same results see: Kozen, The Böhm–Jacopini Theorem is False, Propositionally
  6. ^ McCabe footnotes the two definitions of on pages 315 and 317.
  7. ^ Steven S. Muchnick (1997). Advanced Compiler Design Implementation. Morgan Kaufmann. pp. 196–197 and 215. ISBN 978-1-55860-320-2.