Criticism of C++: Difference between revisions

From Wikipedia, the free encyclopedia
Adjusting verbiage to reflect opinion compared to fact.
 
(107 intermediate revisions by 68 users not shown)
Line 1: Line 1:
{{Short description|Criticism of the C++ programming language}}
{{Multiple issues|
{{original research|date=June 2016}}
{{Use dmy dates|date=September 2021}}
{{unreliable sources|date=June 2016}}
{{lacking overview|date=September 2021}}
Although [[C++]] is one of the most widespread programming languages,<ref>{{Cite web|title=Stack Overflow Developer Survey 2021|url=https://insights.stackoverflow.com/survey/2021/|access-date=2021-12-28|website=Stack Overflow|language=en}}</ref> many prominent software engineers criticize C++ (the language and its compilers), arguing that it is overly complex<ref>{{Cite web|title=Google executive frustrated by Java, C++ complexity - Google, software, application development, Development tools, Languages and standards, Rob Pike|url=https://www2.cio.com.au/article/print/354210/google_executive_frustrated_by_java_c_complexity/|access-date=2021-12-28|website=CIO}}</ref> and fundamentally flawed.<ref>{{Cite web|title=C++ (Al Viro; Linus Torvalds; Theodore Ts'o)|url=https://yarchive.net/comp/linux/c++.html|access-date=2021-12-28|website=yarchive.net}}</ref> Among the critics have been [[Rob Pike|Robert Pike]],<ref>{{Cite web|title=Google executive frustrated by Java, C++ complexity - Google, software, application development, Development tools, Languages and standards, Rob Pike|url=https://www2.cio.com.au/article/print/354210/google_executive_frustrated_by_java_c_complexity/|access-date=2021-12-28|website=CIO}}</ref> [[Joshua Bloch]], [[Linus Torvalds]],<ref>{{Cite web|title=C++ (Al Viro; Linus Torvalds; Theodore Ts'o)|url=https://yarchive.net/comp/linux/c++.html|access-date=2021-12-28|website=yarchive.net}}</ref> [[Donald Knuth]], [[Richard Stallman]], and [[Ken Thompson]]. C++ has been widely adopted and implemented as a [[Systems programming|systems language]] through most of its existence. It has been used to build many kinds of important software, including [[Operating system|operating systems]], [[Runtime system|runtime systems]], [[Interpreter (computing)|programming language interpreters]], [[Parsing|parsers]], [[Lexical analysis|lexers]], and [[Compiler|compilers]].

Some of the problems might be related to the compiler used and not the language itself.}}
{{Use dmy dates|date=December 2013}}
[[C++]] is a [[general-purpose programming language]] with [[imperative programming|imperative]], [[object-oriented programming|object-oriented]] and [[generic programming|generic]] programming features. Many criticisms have been leveled at the programming language from, among others, prominent software developers like [[Linus Torvalds]],<ref>{{cite mailing list |url=https://lwn.net/Articles/249460/ |title=Re: [RFC] Convert builin-mailinfo.c to use The Better String Library |date=6 September 2007 |accessdate=31 March 2015 }}</ref> [[Richard Stallman]],<ref>{{cite mailing list |url=http://harmful.cat-v.org/software/c++/rms |title=Re: Efforts to attract more users? |date=12 July 2010 |accessdate=31 March 2015 }}</ref> [[Joshua Bloch]], [[Ken Thompson]]<ref>{{cite web |url=http://www.drdobbs.com/open-source/interview-with-ken-thompson/229502480 |title=Dr. Dobb's: Interview with Ken Thompson |author=Andrew Binstock |date=18 May 2011 |accessdate=7 February 2014}}</ref><ref name="Seibel2009">{{cite book|author=Peter Seibel|title=Coders at Work: Reflections on the Craft of Programming|url=https://books.google.com/books?id=nneBa6-mWfgC&pg=PA475|date=16 September 2009|publisher=Apress|isbn=978-1-4302-1948-4|pages=475–476}}</ref><ref name="gigamonkeysWordpress">https://gigamonkeys.wordpress.com/2009/10/16/coders-c-plus-plus/</ref>, and [[Donald Knuth]]<ref name="dobbsKnuth">http://www.drdobbs.com/architecture-and-design/an-interview-with-donald-knuth/228700500</ref><ref name="knuth1993">http://tex.loria.fr/litte/knuth-interview</ref>.

C++ is a [[multiparadigm programming language]]<ref>{{cite web |url=http://www.stroustrup.com/bs_faq.html#multiparadigm|title=What is "multiparadigm programming"?}}</ref> with extensive but not complete [[backward compatibility]] with the programming language [[C (programming language)|C]].<ref>{{cite web |url=http://www.stroustrup.com/bs_faq.html#remove-from-C++|title=Are there any features you'd like to remove from C++?}}</ref> This article focuses not on C features like [[pointer arithmetic]], [[Operator precedence in C and C++|operator precedence]] or [[preprocessor]] [[Macro (computer science)|macros]], but on pure C++ features that are often criticized.


==Slow compile times==
==Slow compile times==
The natural interface between [[Translation unit (programming)|source files]] in C/C++ are [[Include directive|header files]]. Each time a header file is modified, all source files that include the header file should recompile their code. Header files are slow because they are textual and context-dependent as a consequence of the preprocessor.<ref>{{cite web|url=http://www.drdobbs.com/cpp/c-compilation-speed/228701711|title=C++ compilation speed|author=Walter Bright}}</ref> C only has limited amounts of information in header files, the most important being struct declarations and function prototypes. C++ stores its [[C++ classes|classes]] in header files and they are not only exposing their public variables and public functions (like C with its structs and function prototypes) but also their private functions. This forces unnecessary recompiles of all source files that include the header file, each time when changing these private functions. This problem is magnified where the classes are written as [[Template (C++)|templates]], forcing all of their code into the slow header files, which is the case with the whole [[C++ standard library]]. Large C++ projects can therefore be relatively slow to compile.<ref>{{cite web|url=http://commandcenter.blogspot.de/2012/06/less-is-exponentially-more.html|title=Less is exponentially more|quote=Back around September 2007, I was doing some minor but central work on an enormous Google C++ program, one you've all interacted with, and my compilations were taking about 45 minutes on our huge distributed compile cluster.|author=[[Rob Pike]]}}</ref> The problem is largely solved by precompiled header in modern compilers.
The natural interface between [[Translation unit (programming)|source files]] in C and C++ is the [[Include directive|header file]]. Each time a header file is modified, every source file that includes it must be recompiled. Header files are slow to process because they are textual and context-dependent as a consequence of the preprocessor.<ref name="02jUg">{{cite web|url=http://www.drdobbs.com/cpp/c-compilation-speed/228701711|title=C++ compilation speed|author=Walter Bright}}</ref> C keeps only a limited amount of information in header files, the most important being struct declarations and function prototypes. C++ stores its [[C++ classes|classes]] in header files, and they expose not only their public variables and public functions (like C with its structs and function prototypes) but also their private functions. This forces unnecessary recompilation of all source files that include the header file each time these private functions are edited. The problem is magnified when the classes are written as [[Template (C++)|templates]], which forces all of their code into the slow header files, as is the case with much of the [[C++ standard library]].
Large C++ projects can therefore be relatively slow to compile.<ref name="OfbVj">{{cite web|url=http://commandcenter.blogspot.de/2012/06/less-is-exponentially-more.html|title=Less is exponentially more|quote=Back around September 2007, I was doing some minor but central work on an enormous Google C++ program, one you've all interacted with, and my compilations were taking about 45 minutes on our huge distributed compile cluster.|author=Rob Pike|date=25 June 2012 |author-link=Rob Pike}}</ref> The problem is largely solved by precompiled headers in modern compilers or by the module system that was added in [[C++20]]; future C++ standards are planned to expose the functionality of the standard library through modules.<ref name="n05vr">{{cite web|url=http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0592r3.html|title=To boldly suggest an overall plan for C++23|author=Ville Voutilainen}}</ref>

One solution for this is to use the [[Pimpl idiom]]. The class declared in the header then holds only a pointer to an implementation object on the [[Memory management#HEAP|heap]], so its size and layout on the stack stay the same when private members change, and the source files that include the header need not be recompiled. This comes at the cost of an extra heap allocation for each object. Additionally, [[precompiled header]]s can be used for header files that are fairly static.
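
The Pimpl idiom can be illustrated with a minimal sketch. The class name <code>Widget</code>, its members and the split into <code>widget.h</code> and <code>widget.cpp</code> are hypothetical and chosen only for illustration; both parts are shown in one listing so that it compiles as a single file (C++14 or later is assumed for <code>std::make_unique</code>).

```cpp
#include <memory>
#include <string>
#include <utility>

// ---- widget.h (hypothetical public header) ----
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();                            // defined where Impl is a complete type
    Widget(Widget&&) noexcept;
    Widget& operator=(Widget&&) noexcept;
    const std::string& name() const;

private:
    struct Impl;                          // declared only; layout is hidden from clients
    std::unique_ptr<Impl> impl_;          // costs one heap allocation per Widget
};

// ---- widget.cpp (hypothetical implementation file) ----
struct Widget::Impl {
    explicit Impl(std::string n) : name(std::move(n)) {}
    std::string name;
    // Private members can be added here without changing widget.h.
};

Widget::Widget(std::string name)
    : impl_(std::make_unique<Impl>(std::move(name))) {}
Widget::~Widget() = default;
Widget::Widget(Widget&&) noexcept = default;
Widget& Widget::operator=(Widget&&) noexcept = default;
const std::string& Widget::name() const { return impl_->name; }
```

In this sketch, editing <code>Widget::Impl</code> only requires recompiling the implementation file, because the header never mentions the private members.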

One suggested solution is to use a module system.<ref>{{cite web|url=http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4047.pdf|title=A Module System for C++}}</ref>


==Global format state of <iostream>==
==Global format state of <iostream>==
C++ [[Input/output (C++)|<iostream>]] unlike C [[C file input/output|<stdio.h>]] relies on a global format state. This fits very poorly together with [[Exception handling|exceptions]], when a function must interrupt the control flow, after an error, but before resetting the global format state. One fix for this is to use [[Resource Acquisition Is Initialization]] (RAII) which is implemented in [[Boost (C++ libraries)|Boost]]<ref>{{cite web|url=http://www.boost.org/doc/libs/1_60_0/libs/io/doc/ios_state.html|title=iostream state saver}}</ref> but is not a part of the [[C++ Standard Library]].
C++ <code>[[Input/output (C++)|<iostream>]]</code>, unlike C <code>[[C file input/output|<stdio.h>]]</code>, relies on a global format state. This fits poorly with [[Exception handling|exceptions]]: when an exception interrupts the control flow after the format state has been changed but before it has been reset, the changed state persists. One fix for this is to use [[resource acquisition is initialization]] (RAII), which is implemented in the [[Boost (C++ libraries)|Boost]]<ref name="PzCwY">{{Cite web|url=https://www.boost.org/doc/libs/1_60_0/libs/io/doc/ios_state.html|title=I/O Stream-State Saver Library - 1.60.0|website=www.boost.org}}</ref> libraries but is not part of the [[C++ Standard Library]].


The global state of <iostream> uses static constructors which causes overhead.<ref>{{cite web|url=http://llvm.org/docs/CodingStandards.html#include-iostream-is-forbidden|title=#include <iostream> is Forbidden}}</ref> Another source of bad performance is the use of std::endl instead of '\n' when doing output, because of it calling flush as a side effect. C++ <iostream> is by default synchronized with <stdio.h> which can cause performance problems. Shutting it off can improve performance but forces giving up thread safety.
<code><iostream></code> uses static constructors, which cause overhead whenever the header is included, even if the library is not used.<ref name="omcjI">{{Cite web|url=https://llvm.org/docs/CodingStandards.html|title=LLVM Coding Standards — LLVM 12 documentation|website=llvm.org}}</ref> Another source of bad performance is the misuse of <code>std::endl</code> instead of <code>'\n'</code> when doing output, as it also calls <code>.flush()</code>. C++ <code><iostream></code> is by default synchronized with <code><stdio.h></code>, which can cause performance problems in command-line I/O-intensive applications. Shutting it off can improve performance but forces giving up some ordering guarantees.


Here follows an example where an exception interrupts the function before std::cout can be restored from hexadecimal to decimal. The error number in the catch statement will be written out in hexadecimal which probably isn't what one wants:
Here follows an example where an exception interrupts the function before <code>std::cout</code> can be restored from hexadecimal to decimal. The error number in the catch statement will be written out in hexadecimal which probably is not what one wants:
<source lang="c++">
<syntaxhighlight lang="c++">
#include <iostream>
#include <iostream>
#include <vector>
#include <vector>


int main() {
int main() {
try {
try {
std::cout << std::hex;
std::cout << std::hex
std::cout << 0xFFFFFFFF << std::endl;
<< 0xFFFFFFFF << '\n';
// std::bad_alloc will be thrown here:
std::vector<int> vector(0xFFFFFFFFFFFFFFFFL,0); // Exception
std::vector<int> vector(0xFFFFFFFFFFFFFFFFull);
std::cout << std::dec; // Never reached
} catch(std::exception &e) {
std::cout << std::dec; // Never reached
// (using scopes guards would have fixed that issue
std::cout << "Error number: " << 10 << std::endl; // Not in decimal
// and made the code more expressive)
}
}
catch (const std::exception& e) {
std::cout << "Error number: " << 10 << '\n'; // Not in decimal
}
}
}
</source>It is acknowledged even by some members of the C++ standards body<ref>{{Cite web|url=http://open-std.org/JTC1/SC22/WG21/docs/papers/2015/n4412.html|title=N4412: Shortcomings of iostreams|website=open-std.org|access-date=2016-05-03}}</ref> that the iostreams interface is an aging interface that needs to be replaced eventually. This design forces the library implementers to adopt solutions that impact performance greatly.{{citation needed|date=June 2016}}
</syntaxhighlight>It is even acknowledged by some members of the C++ standards body<ref name="BA4GS">{{Cite web|url=http://open-std.org/JTC1/SC22/WG21/docs/papers/2015/n4412.html|title=N4412: Shortcomings of iostreams|website=open-std.org|access-date=3 May 2016}}</ref> that <code><iostream></code> is an aging interface that eventually needs to be replaced.

==Heap allocations in containers==
After the inclusion of the STL in C++, its templated containers were promoted while the traditional C arrays were strongly discouraged.<ref>{{cite web|url=http://www.artima.com/intv/goldilocks.html|title=A Conversation with Bjarne Stroustrup}}</ref> One important feature of containers like std::string and std::vector is that they keep their elements on the [[Memory management#HEAP|heap]] instead of on the stack like C arrays.<ref>{{cite web|url=http://www.cplusplus.com/reference/vector/vector/|title=std::vector}}</ref><ref>{{cite web|url=http://www.cplusplus.com/reference/string/string/|title=std::string}}</ref> To stop them from allocating on the heap, one would be forced to write a custom [[Allocator (C++)|allocator]], which is not standard. Heap allocation is slower than [[Stack-based memory allocation|stack allocation]], which makes claims about the classical C++ containers being "just as fast" as C arrays somewhat untrue.<ref>{{cite web|url=http://www.artima.com/intv/goldilocks.html|title=A Conversation with Bjarne Stroustrup|quote=I think a better way of approaching C++ is to use some of the standard library facilities. For example, use a vector rather than an array. A vector knows its size. An array does not... Most of these techniques are criticized unfairly for being inefficient. The assumption is that if it is elegant, if it is higher level, it must be slow. It could be slow in a few cases, so deal with those few cases at the lower level, but start at a higher level. In some cases, you simply don't have the overhead. For example, vectors really are as fast as arrays.}}</ref><ref>{{cite web|url=http://www.stroustrup.com/bs_faq2.html#slow-containers|title=Why are the standard containers so slow?|author=Bjarne Stroustrup|quote=People sometimes worry about the cost of std::vector growing incrementally. I used to worry about that and used reserve() to optimize the growth. 
After measuring my code and repeatedly having trouble finding the performance benefits of reserve() in real programs, I stopped using it except where it is needed to avoid iterator invalidation (a rare case in my code). Again: measure before you optimize.}}</ref>{{failed verification|date=June 2016}}{{synthesis inline|date=June 2016}} They are just as fast to use, but not to construct. One way to solve this problem was to introduce stack-allocated containers like boost::array<ref>{{cite web|url=http://www.boost.org/doc/libs/1_60_0/doc/html/array.html|title=boost::array|quote=As replacement for ordinary arrays, the STL provides class std::vector. However, std::vector<> provides the semantics of dynamic arrays. Thus, it manages data to be able to change the number of elements. This results in some overhead in case only arrays with static size are needed.}}</ref> or std::array.
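
The difference in where the elements live can be made visible with a small sketch. The helper <code>stores_elements_inline</code> is a hypothetical name introduced here; it compares addresses as integers, which works on mainstream platforms but is an assumption rather than a guarantee of the C++ standard.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Hypothetical helper: reports whether the element storage of a container
// lies inside the container object itself (as with std::array) or
// elsewhere, typically on the heap (as with a non-empty std::vector).
template <typename Container>
bool stores_elements_inline(const Container& c) {
    const auto begin = reinterpret_cast<std::uintptr_t>(&c);
    const auto end = begin + sizeof(Container);
    const auto data = reinterpret_cast<std::uintptr_t>(c.data());
    return data >= begin && data < end;
}
```

For a local <code>std::array<int, 8></code> the helper returns true, because the elements are part of the object, while for a <code>std::vector<int></code> holding eight elements it returns false, because the vector object only holds a pointer to separately allocated storage.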


C++20 added <code>std::format</code> that eliminated the global formatting state and addressed other issues in iostreams.<ref name="P0645">{{Cite web|url=http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0645r10.html|title=P0645: Text Formatting|website=open-std.org|access-date=20 May 2021}}</ref> For example, the catch clause can now be written as
As for strings there is the possibility to use SSO (short string optimization) where only strings exceeding a certain size are allocated on the heap. There is however no standard way in C++ for the user to decide this SSO limit and it remains hard coded and implementation specific.<ref>{{cite web|url=http://scottmeyers.blogspot.de/2012/04/stdstring-sso-and-move-semantics.html|title=std::string, SSO, and Move Semantics|author=Scott Meyers|quote=Case in point: std::string. It supports moves, but in cases where std::string is implemented using SSO (the small string optimization), small strings are just as expensive to move as to copy! What it means to be "small" is up to the implementation. Unless I'm misreading the source files, std::strings with a capacity of up to 15 are "small" in Visual C++ 11 beta, so std::string objects of up to that capacity will not benefit when copies become moves. (The SSO buffer size seems to be hardwired to be 16 bytes in VC11, so the maximum capacity of a std::wstring that fits in the SSO buffer is smaller: 7 characters.)}}</ref><ref>{{cite web|url=http://stackoverflow.com/questions/21694302/what-are-the-mechanics-of-short-string-optimization-in-libc|title=What are the mechanics of short string optimization in libc++? A comprehensive answer by Howard Hinnant, libc++ maintainer|author=Howard Hinnant|quote=On a 32 bit machine, 10 chars will fit in the short string. sizeof(string) is 12. On a 64 bit machine, 22 chars will fit in the short string. sizeof(string) is 24. A major design goal was to minimize sizeof(string), while making the internal buffer as large as possible.}}</ref>{{acn|date=June 2016}}
<syntaxhighlight lang="c++">
std::cout << std::format("Error number: {}\n", 10);
</syntaxhighlight>
which is not affected by the stream state, although it might introduce overhead because the actual formatting is done at runtime.


==Iterators==
==Iterators==
The philosophy of the [[Standard Template Library]] (STL) embedded in the [[C++ Standard Library]] is to use generic algorithms in the form of [[Template (C++)|templates]] using [[iterator]]s. Early compilers optimized small objects such as iterators poorly, which [[Alexander Stepanov]] characterized as the "abstraction penalty", although modern compilers optimize away such small abstractions well.<ref>{{cite web|url=http://www.open-std.org/jtc1/sc22/wg21/docs/D_3.cpp|title=Stepanov Benchmark|author=Alexander Stepanov|quote=The final number printed by the benchmark is a geometric mean of the performance degradation factors of individual tests. It claims to represent the factor by which you will be punished by your compiler if you attempt to use C++ data abstraction features. I call this number "Abstraction Penalty." As with any benchmark it is hard to prove such a claim; some people told me that it does not represent typical C++ usage. It is, however, a noteworthy fact that majority of the people who so object are responsible for C++ compilers with disproportionately large Abstraction Penalty.}}</ref> The interface using pairs of iterators to denote ranges of elements has also been criticized,<ref>{{cite web|url=http://accu.org/content/conf2009/AndreiAlexandrescu_iterators-must-go.pdf|title=Iterators Must Go|author=Andrei Alexandrescu}}</ref><ref>{{cite web|url=http://dconf.org/2015/talks/alexandrescu.pdf|title=Generic Programming Must Go|author=Andrei Alexandrescu}}</ref> and ranges have been proposed for the C++ standard library.<ref>{{cite web|url=https://isocpp.org/blog/2014/10/ranges|title=Ranges for the Standard Library|author=Eric Niebler}}</ref>
The philosophy of the [[Standard Template Library]] (STL) embedded in the C++ Standard Library is to use generic algorithms in the form of templates using [[iterator]]s. Early compilers optimized small objects such as iterators poorly, which [[Alexander Stepanov]] characterized as the "abstraction penalty", although modern compilers optimize away such small abstractions well.<ref name="klbeH">{{cite web|url=http://www.open-std.org/jtc1/sc22/wg21/docs/D_3.cpp|title=Stepanov Benchmark|author=Alexander Stepanov|quote=The final number printed by the benchmark is a geometric mean of the performance degradation factors of individual tests. It claims to represent the factor by which you will be punished by your compiler if you attempt to use C++ data abstraction features. I call this number "Abstraction Penalty." As with any benchmark it is hard to prove such a claim; some people told me that it does not represent typical C++ usage. It is, however, a noteworthy fact that majority of the people who so object are responsible for C++ compilers with disproportionately large Abstraction Penalty.}}</ref> The interface using pairs of iterators to denote ranges of elements has also been criticized.<ref name="ASVsk">{{cite web|url=http://accu.org/content/conf2009/AndreiAlexandrescu_iterators-must-go.pdf|title=Iterators Must Go|author=Andrei Alexandrescu}}</ref><ref name="GNTM6">{{cite web|url=http://dconf.org/2015/talks/alexandrescu.pdf|title=Generic Programming Must Go|author=Andrei Alexandrescu}}</ref> The C++20 standard library's introduction of ranges should solve this problem.<ref name="D4QAf">{{Cite web|url=https://en.cppreference.com/w/cpp/ranges|title=Ranges library (C++20) - cppreference.com|website=en.cppreference.com}}</ref>


One big problem is that iterators often deal with heap allocated data in the C++ containers and becomes invalid if the data is independently moved by the containers. Functions that change the size of the container often invalidate all iterators pointing to it, creating dangerous cases of [[undefined behavior]].<ref>{{cite book|title=Effective STL|author=Scott Meyers|quote=Given all that allocation, deallocation, copying, and destruction. It should not stun you to learn that these steps can be expensive. Naturally, you don't want to perform them any more frequently than you have to. If that doesn't strike you as natural, perhaps it will when you consider that each time these steps occur, all iterators, pointers, and references into the vector or string are invalidated. That means that the simple act of inserting an element into a vector or string may also require updating other data structures that use iterators, pointers, or references into the vector or string being expanded.}}</ref><ref>{{cite web|url=http://www.angelikalanger.com/Conferences/Slides/CppInvalidIterators-DevConnections-2002.pdf|title=Invalidation of STL Iterators|author=Angelika Langer}}</ref> Here is an example where the iterators in the for loop get invalidated because of the std::string container changing its size on the [[Memory management#HEAP|heap]]:
One big problem is that iterators often deal with heap allocated data in the C++ containers and become invalid if the data is independently moved by the containers. Functions that change the size of the container often invalidate all iterators pointing to it, creating dangerous cases of [[undefined behavior]].<ref name="uAbkG">{{cite book|title=Effective STL|author=Scott Meyers|quote=Given all that allocation, deallocation, copying, and destruction. It should not stun you to learn that these steps can be expensive. Naturally, you don't want to perform them any more frequently than you have to. If that doesn't strike you as natural, perhaps it will when you consider that each time these steps occur, all iterators, pointers, and references into the vector or string are invalidated. That means that the simple act of inserting an element into a vector or string may also require updating other data structures that use iterators, pointers, or references into the vector or string being expanded.}}</ref><ref name="sgMa6">{{cite web|url=http://www.angelikalanger.com/Conferences/Slides/CppInvalidIterators-DevConnections-2002.pdf|title=Invalidation of STL Iterators|author=Angelika Langer}}</ref> Here is an example where the iterators in the for loop get invalidated because of the <code>std::string</code> container changing its size on the [[Memory management#HEAP|heap]]:


<source lang="c++">
<syntaxhighlight lang="c++">
#include <iostream>
#include <iostream>
#include <string>
#include <string>


int main() {
int main() {
std::string text = "One\nTwo\nThree\nFour\n";
std::string text = "One\nTwo\nThree\nFour\n";
// Let's add an '!' where we find newlines
// Let's add an '!' where we find newlines
for(auto i = text.begin(); i != text.end(); ++i) {
for (auto it = text.begin(); it != text.end(); ++it) {
if(*i == '\n') {
if (*it == '\n') {
// i =
// it =
text.insert(i,'!')+1;
text.insert(it, '!') + 1;
// Without updating the iterator this program has
// Without updating the iterator this program has
// undefined behavior and will likely crash
// undefined behavior and will likely crash
}
}
}
std::cout << text;
}
std::cout << text;
}
}
</syntaxhighlight>
</source>
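
A corrected version of the loop, as hinted at by the commented-out assignment in the example, can be sketched as follows. The function name <code>mark_line_ends</code> is hypothetical; the fix relies on <code>std::string::insert</code> returning an iterator to the inserted character, which remains valid even if the string reallocated.

```cpp
#include <string>

// Inserts '!' before every newline. The iterator is re-assigned from the
// return value of insert(), so it never refers to storage that the string
// may have released while growing.
std::string mark_line_ends(std::string text) {
    for (auto it = text.begin(); it != text.end(); ++it) {
        if (*it == '\n') {
            // insert() returns an iterator to the new '!'; adding 1 moves
            // back onto the '\n', and the loop's ++it then steps past it.
            it = text.insert(it, '!') + 1;
        }
    }
    return text;
}
```

Note that <code>text.end()</code> is evaluated again on every iteration, which is also necessary, since the end iterator is invalidated by the insertion as well.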


==Uniform initialization syntax==
==Uniform initialization syntax==
The [[C++11]] uniform initialization syntax and std::initializer_list share the same syntax which are triggered differently depending on the internal workings of the classes. If there is a std::initializer_list constructor then this is called. Otherwise the normal constructors are called with the uniform initialization syntax. This can be confusing for beginners and experts alike<ref>{{cite web|url=http://scottmeyers.blogspot.de/2015/09/thoughts-on-vagaries-of-c-initialization.html|title=Thoughts on the Vagaries of C++ Initialization|author=Scott Meyers}}</ref><ref>{{cite web|url=http://llvm.org/docs/CodingStandards.html#do-not-use-braced-initializer-lists-to-call-a-constructor|title=Do not use Braced Initializer Lists to Call a Constructor}}</ref>
The [[C++11]] uniform initialization syntax and <code>std::initializer_list</code> construction share the same brace syntax, and which of the two is triggered depends on the constructors that the class declares. If there is a matching <code>std::initializer_list</code> constructor, it is called; otherwise the normal constructors are called with the uniform initialization syntax. This can be confusing for beginners and experts alike.<ref name="tQuuk">{{cite web|url=http://scottmeyers.blogspot.de/2015/09/thoughts-on-vagaries-of-c-initialization.html|title=Thoughts on the Vagaries of C++ Initialization|author=Scott Meyers|date=7 September 2015 }}</ref><ref name="omcjI"/>


<source lang="c++">
<syntaxhighlight lang="c++">
#include <iostream>
#include <iostream>
#include <vector>
#include <vector>


int main() {
int main() {
int integer1{10}; // int
int integer1{10}; // int
int integer2(10); // int
int integer2(10); // int
std::vector<int> vector1{10,0}; // std::initializer_list
std::vector<int> vector1{10, 0}; // std::initializer_list
std::vector<int> vector2(10,0); // size_t,int
std::vector<int> vector2(10, 0); // std::size_t, int

std::cout << "Will print 10"
std::cout << "Will print 10\n" << integer1 << '\n';
<< std::endl << integer1 << std::endl;
std::cout << "Will print 10\n" << integer2 << '\n';

std::cout << "Will print 10"
<< std::endl << integer2 << std::endl;
std::cout << "Will print 10,0,\n";

for (const auto& item : vector1) {
std::cout << "Will print 10,0," << std::endl;
for(auto &i : vector1) std::cout << i << ',';
std::cout << item << ',';
}
std::cout << std::endl;

std::cout << "Will print 0,0,0,0,0,0,0,0,0,0," << std::endl;
for(auto &i : vector2) std::cout << i << ',';
std::cout << "\nWill print 0,0,0,0,0,0,0,0,0,0,\n";

for (const auto& item : vector2) {
std::cout << item << ',';
}
}
}
</syntaxhighlight>
</source>


==Exceptions==
==Exceptions==
There have been concerns that the zero-overhead principle<ref>{{cite web|url=http://www.stroustrup.com/ETAPS-corrected-draft.pdf|title=Foundations of C++|author=Bjarne Stroustrup}}</ref> isn't compatible with exceptions.<ref>{{cite web|url=http://llvm.org/docs/CodingStandards.html#do-not-use-rtti-or-exceptions|title=Do not use RTTI or Exceptions}}</ref> Most modern implementations have a zero performance overhead when exceptions are enabled but not used, but do have an overhead during exception handling and in binary size due to the need to unroll tables. Many compilers support disabling exceptions from the language to save the binary overhead. Exceptions have also been criticized for being unsafe for state-handling; this safety issue can be handled using the [[Resource Acquisition Is Initialization|RAII]] idiom,{{sfn|Stroustrup|1994|loc=16.5 Resource Management, pp. 388–89}}.
There have been concerns that the zero-overhead principle<ref name="FgU7N">{{cite web|url=http://www.stroustrup.com/ETAPS-corrected-draft.pdf|title=Foundations of C++|author=Bjarne Stroustrup}}</ref> is not compatible with exceptions.<ref name="omcjI"/> Most modern implementations have zero performance overhead when exceptions are enabled but not thrown, but they do incur overhead during exception handling and in binary size because of the unwind tables that must be emitted. Many compilers support disabling exceptions to save the binary overhead. Exceptions have also been criticized for being unsafe for state-handling. This safety issue has led to the invention of the [[Resource acquisition is initialization|RAII]] idiom,{{sfn|Stroustrup|1994|loc=16.5 Resource Management, pp. 388–89}} which has proven useful beyond making C++ exceptions safe.


==Encoding of string literals in source-code==
==Strings without Unicode==
C++ string literals, like those of C, do not consider the character encoding of the text within them: they are merely a sequence of bytes, and the C++ <code>string</code> class follows the same principle. Although source code can (since C++11) request an encoding for a literal, the compiler does not attempt to validate that the chosen encoding of the source literal is "correct" for the bytes being put into it, and the runtime does not enforce character encoding. Programmers who are used to other languages such as Java, Python or C# which try to enforce character encodings often consider this to be a defect of the language.
The C++ Standard Library offers no real support for [[Unicode]]. <code>std::basic_string::length</code> only returns the length of the underlying array, which is acceptable when using [[ASCII]] or [[UTF-32]] but not when using [[Variable-width encoding|variable length]] encodings like [[UTF-8]] or [[UTF-16]]. In these encodings, the array length is not a correct measure of the number of [[code point]]s, the [[Combining_character|number of characters]] or the [[Halfwidth and fullwidth forms|width]]. There is no support for advanced Unicode concepts like [[Unicode equivalence#Normalization|normalization]], [[UTF-16#U.2B10000 to U.2B10FFFF|surrogate pairs]], [[Bi-directional text|bidi]] or conversion between encodings, although Unicode libraries such as [[iconv]] and [[International Components for Unicode|ICU]] exist to handle these issues.
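
The gap between byte length and code point count can be sketched with a small hand-written helper. <code>count_code_points</code> is a hypothetical function, not a standard library facility; it assumes its input is well-formed UTF-8 and counts code points only, so it still does not account for combining characters or display width.

```cpp
#include <cstddef>
#include <string>

// Counts code points in a string that is assumed to hold well-formed UTF-8.
// Continuation bytes have the bit pattern 10xxxxxx, so every byte that does
// not match that pattern starts a new code point.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For a string holding the UTF-8 bytes of "Vår gård på Öland!" with precomposed characters, <code>length()</code> reports 22 while this helper reports 18.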


The example program below illustrates the phenomenon.
The example below prints the lengths of two equally long strings. The strings are equally long in characters, but the program takes their lengths in bytes. If the program's source code is saved in a constant width character set like [[ISO-8859-1]], both strings come out as 18 bytes, but in UTF-8, the first string becomes either 22 or 26 bytes depending on [[unicode normalization]].
<syntaxhighlight lang="c++">

<source lang="c++">
#include <iostream>
#include <iostream>
#include <string>
#include <string>
// note that this code is no longer valid in C++20
int main() {
// all strings are declared with the UTF-8 prefix


// file encoding determines the encoding of å and Ö
int main() {
std::string auto_enc = u8"Vår gård på Öland!";
// UTF-8 prefix just to be explicit
// this text is well-formed in both ISO-8859-1 and UTF-8
std::string utf8 = u8"Vår gård på Öland!";
std::string ascii = u8"Var gard pa Oland!";
std::string ascii = u8"Var gard pa Oland!";
// explicitly use the ISO-8859-1 byte-values for å and Ö
std::cout << "length of «" << utf8 << "» = " << utf8 .length() << '\n';
// this is invalid UTF-8
std::cout << "length of «" << ascii << "» = " << ascii.length() << '\n';
std::string iso8859_1 = u8"V\xE5r g\xE5rd p\xE5 \xD6land!";
// explicitly use the UTF-8 byte sequences for å and Ö
// this will display incorrectly in ISO-8859-1
std::string utf8 = u8"V\xC3\xA5r g\xC3\xA5rd p\xC3\xA5 \xC3\x96land!";

std::cout << "byte-count of automatically-chosen, [" << auto_enc
<< "] = " << auto_enc.length() << '\n';
std::cout << "byte-count of ASCII-only [" << ascii << "] = " << ascii.length()
<< '\n';
std::cout << "byte-count of explicit ISO-8859-1 bytes [" << iso8859_1
<< "] = " << iso8859_1.length() << '\n';
std::cout << "byte-count of explicit UTF-8 bytes [" << utf8
<< "] = " << utf8.length() << '\n';
}
}
</syntaxhighlight>
</source>


Despite the presence of the C++11 'u8' prefix, meaning "Unicode UTF-8 string literal", the output of this program actually depends on the source file's text encoding (or the compiler's settings - most compilers can be told to convert source files to a specific encoding before compiling them). When the source file is encoded using UTF-8, and the output is run on a terminal that's configured to treat its input as UTF-8, the following output is obtained:
==Code bloat==
<pre>
byte-count of automatically-chosen, [Vår gård på Öland!] = 22
byte-count of ASCII-only [Var gard pa Oland!] = 18
byte-count of explicit ISO-8859-1 bytes [Vr grd p land!] = 18
byte-count of explicit UTF-8 bytes [Vår gård på Öland!] = 22
</pre>
The output terminal has stripped the invalid UTF-8 bytes from display in the ISO-8859 example string. Passing the program's output through a [[Hex dump]] utility will reveal that they are still present in the program output, and it is the terminal application that removed them.


However, when the same source file is instead saved in ISO-8859-1 and re-compiled, the output of the program on the same terminal becomes:
Some older implementations of C++ have been accused of generating [[code bloat]].<ref name=Joyner1999>{{cite book|last=Joyner|first=Ian|title=Objects Unencapsulated: Java, Eiffel, and C++?? (Object and Component Technology)|publisher=Prentice Hall PTR; 1st edition|date=1999|isbn=978-0130142696}}</ref>{{rp|177}}
<pre>
byte-count of automatically-chosen, [Vr grd p land!] = 18
byte-count of ASCII-only [Var gard pa Oland!] = 18
byte-count of explicit ISO-8859-1 bytes [Vr grd p land!] = 18
byte-count of explicit UTF-8 bytes [Vår gård på Öland!] = 22
</pre>
One proposed solution is to make the source encoding reliable across all compilers.


==See also==
{{div col|colwidth=30em}}
* [[Most vexing parse]]
* {{slink|Function overloading|Complications}}
* {{slink|Operator overloading|Criticisms}}
* {{slink|Exception handling|Criticism}}
* {{slink|Argument-dependent name lookup|Criticism}}
* [[Object-oriented programming#Criticism|Criticism of object-oriented programming]]
* [[Multiple inheritance#The diamond problem|Diamond inheritance problem]]
* [[Template (C++)#Advantages and disadvantages of templates over macros|Advantages and disadvantages of templates over macros]]
* [[Late binding#Criticism|Late binding criticism]]
* [[Feature creep]]
* [[Gotcha (programming)]]
* [[Object slicing]]
{{div col end}}


==References==
{{Reflist}}


===Works cited===
* {{Cite book | isbn = 0-201-54330-3 | title = [[The Design and Evolution of C++]] | last = Stroustrup | first = Bjarne | author-link = Bjarne Stroustrup | year = 1994 | publisher = Addison-Wesley }}


==Further reading==
{{refbegin}}
* {{cite book|author=Ian Joyner|title=Objects Unencapsulated: Java, Eiffel, and C++?? (Object and Component Technology)|publisher=Prentice Hall PTR; 1st edition|date=1999|isbn=978-0130142696}}
* {{cite book|author=Peter Seibel|title=Coders at Work: Reflections on the Craft of Programming|publisher=Apress|date=2009|isbn=978-1430219484}}
{{refend}}
==External links==
* [http://yosefk.com/c++fqa/index.html C++ FQA Lite] by Yossi Kreinin
* [http://web.mit.edu/simsong/www/ugh.pdf#page=238 "C++ The COBOL of the 90s" chapter 10 in The Unix Haters Group Book]
* [https://gigamonkeys.wordpress.com/2009/10/16/coders-c-plus-plus/ C++ in Coders at Work] Excerpts from the book Coders at Work, by Peter Seibel
* [https://www.youtube.com/watch?v=KAWA1DuvCnQ DConf 2014: The Last Thing D Needs] A video of a talk by Scott Meyers
* [http://archive.adaic.com/intro/ada-vs-c/cppcv3.pdf A Critique of C++ and Programming and Language Trends of the 1990s - 3rd Edition] by Ian Joyner - 1996


[[Category:C++]]
[[Category:Criticisms of programming languages|C++]]

[[be:C++#Недахопы]]
[[de:C++#Kritik]]
[[pt:C++#Críticas]]
[[ru:C++#Недостатки и критика]]
[[uk:C++#Недоліки мови C++]]
[[zh:C++#争议]]

Latest revision as of 15:40, 12 July 2024

Although C++ is one of the most widespread programming languages,[1] many prominent software engineers criticize C++ (both the language and its compilers), arguing that it is overly complex[2] and fundamentally flawed.[3] Critics have included Robert Pike,[4] Joshua Bloch, Linus Torvalds,[5] Donald Knuth, Richard Stallman, and Ken Thompson. C++ has been widely adopted and implemented as a systems language through most of its existence, and it has been used to build much important software, including operating systems, runtime systems, programming language interpreters, parsers, lexers, and compilers.

Slow compile times


The natural interface between source files in C and C++ is the header file. Each time a header file is modified, every source file that includes it must be recompiled. Header files are slow to process because they are textual and context-dependent as a consequence of the preprocessor.[6] C keeps only a limited amount of information in header files, the most important being struct declarations and function prototypes. C++ classes are declared in header files, which expose not only their public variables and public functions (like C with its structs and function prototypes) but also their private functions. This forces unnecessary recompilation of every source file that includes the header each time these private functions are edited. The problem is magnified when classes are written as templates, which forces all of their code into the slow header files, as is the case with much of the C++ standard library. Large C++ projects can therefore be relatively slow to compile.[7] The problem is largely solved by precompiled headers in modern compilers or by the module system that was added in C++20; future C++ standards are planned to expose the functionality of the standard library using modules.[8]
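The recompilation issue with private members can be illustrated with a small sketch. It uses the "pointer to implementation" (pimpl) idiom, which the text above does not mention; it is shown here only as one commonly used way to keep private functions out of a header. The class name `Widget`, the file names in the comments, and the helper `decorate` are all hypothetical, and the two "files" are shown in one listing for brevity.

```cpp
#include <memory>
#include <string>
#include <utility>

// ---- widget.h (hypothetical header included by every client) ----
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();                       // defined where Impl is a complete type
    std::string describe() const;

private:
    struct Impl;                     // forward declaration only: no private
    std::unique_ptr<Impl> impl_;     // functions are visible to clients
};

// ---- widget.cpp (hypothetical implementation file) ----
struct Widget::Impl {
    std::string name;
    // Private helper: editing it changes only this file, not the header.
    std::string decorate() const { return "Widget(" + name + ")"; }
};

Widget::Widget(std::string name) : impl_(new Impl{std::move(name)}) {}
Widget::~Widget() = default;
std::string Widget::describe() const { return impl_->decorate(); }
```

With this layout, a call such as `Widget("a").describe()` goes through one extra pointer indirection, which is the usual trade-off of the idiom against shorter rebuilds.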

Global format state of <iostream>


C++ <iostream>, unlike C <stdio.h>, relies on a global format state. This fits poorly with exceptions: when an error interrupts a function's control flow before the function has reset the format state, the changed state persists. One fix for this is to use resource acquisition is initialization (RAII), which is implemented in the Boost[9] libraries and is part of the C++ Standard Library.

<iostream> uses static constructors, which cause overhead whenever the header is included, even if the library is not used.[10] Another source of poor performance is the misuse of std::endl instead of \n when doing output, because std::endl also calls .flush(). C++ <iostream> is synchronized with <stdio.h> by default, which can cause performance problems in I/O-intensive command-line applications. Turning the synchronization off can improve performance but forces giving up some ordering guarantees.

The following example shows an exception interrupting the function before std::cout can be restored from hexadecimal to decimal. The error number in the catch block is therefore written out in hexadecimal, which is probably not what was intended:

#include <iostream>
#include <vector>

int main() {
  try {
    std::cout << std::hex
              << 0xFFFFFFFF << '\n';
    // An exception is thrown here (std::length_error or
    // std::bad_alloc, depending on the implementation):
    std::vector<int> vector(0xFFFFFFFFFFFFFFFFull);
    std::cout << std::dec; // Never reached
                           // (using scope guards would have fixed that issue
                           //  and made the code more expressive)
  }
  catch (const std::exception& e) {
    std::cout << "Error number: " << 10 << '\n';  // Not in decimal
  }
}
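The RAII fix mentioned earlier can be sketched as a small guard that saves a stream's format flags and restores them when the scope ends, including when the scope is left through an exception. The class `FlagsGuard` and the function `demo` are hypothetical names for this sketch; it is not the Boost implementation, and it restores only the format flags, not other stream state such as precision or fill character.

```cpp
#include <ios>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical guard: remembers the format flags of a stream and
// restores them in its destructor.
class FlagsGuard {
public:
    explicit FlagsGuard(std::ios_base& stream)
        : stream_(stream), saved_(stream.flags()) {}
    ~FlagsGuard() { stream_.flags(saved_); }
    FlagsGuard(const FlagsGuard&) = delete;
    FlagsGuard& operator=(const FlagsGuard&) = delete;

private:
    std::ios_base& stream_;
    std::ios_base::fmtflags saved_;
};

// Writes 255 in hexadecimal, simulates a failure, then writes 10
// from the handler. Uses a string stream instead of std::cout so the
// result can be inspected.
std::string demo() {
    std::ostringstream out;
    try {
        FlagsGuard guard(out);
        out << std::hex << 255 << '\n';
        throw std::runtime_error("simulated failure");
    } catch (const std::exception&) {
        // The guard was destroyed during stack unwinding, so the
        // stream is back in decimal mode here.
        out << 10 << '\n';
    }
    return out.str();
}
```

Here `demo()` returns the text "ff" followed by "10", each on its own line, because the handler runs after the guard has restored the decimal flag.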

Some members of the C++ standards body[11] have acknowledged that <iostream> is an aging interface that eventually needs to be replaced.

C++20 added std::format, which eliminates the global formatting state and addresses other issues in iostreams.[12] For example, the output statement in the catch clause can now be written as

std::cout << std::format("Error number: {}\n", 10);

which is not affected by the stream state, although it might introduce overhead because the actual formatting is done at runtime.

Iterators


The philosophy of the Standard Template Library (STL) embedded in the C++ Standard Library is to use generic algorithms in the form of templates using iterators. Early compilers optimized small objects such as iterators poorly, which Alexander Stepanov characterized as the "abstraction penalty", although modern compilers optimize away such small abstractions well.[13] The interface using pairs of iterators to denote ranges of elements has also been criticized.[14][15] The C++20 standard library's introduction of ranges should solve this problem.[16]

One major problem is that iterators often refer to heap-allocated data in the C++ containers and become invalid if the containers move that data independently. Functions that change the size of a container often invalidate all iterators pointing into it, creating dangerous cases of undefined behavior.[17][18] In the following example, the iterators in the for loop are invalidated because the std::string container changes its size on the heap:

#include <iostream>
#include <string>

int main() {
  std::string text = "One\nTwo\nThree\nFour\n";
  // Let's add an '!' where we find newlines
  for (auto it = text.begin(); it != text.end(); ++it) {
    if (*it == '\n') {
      // it =
      text.insert(it, '!') + 1;
      // Without updating the iterator this program has
      // undefined behavior and will likely crash
    }
  }
  std::cout << text;
}
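For comparison, a corrected variant can be sketched by using the iterator that `insert` returns, as the commented-out `it =` in the example hints. The function name `mark_newlines` is made up for this sketch; `std::string::insert` with an iterator and a character is the standard overload that returns an iterator to the inserted character.

```cpp
#include <string>

// Returns a copy of text with '!' inserted before every newline.
std::string mark_newlines(std::string text) {
    for (auto it = text.begin(); it != text.end(); ++it) {
        if (*it == '\n') {
            // insert() may reallocate, so the old iterator must not be
            // reused. The returned iterator points at the new '!';
            // adding 1 moves it back onto the '\n' that was just handled,
            // and the loop's ++it then steps past it.
            it = text.insert(it, '!') + 1;
        }
    }
    return text;
}
```

The loop condition re-evaluates `text.end()` on every iteration, which matters here because the end iterator also changes whenever the string grows.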

Uniform initialization syntax


The C++11 uniform initialization syntax and std::initializer_list share the same brace syntax, which is interpreted differently depending on the constructors a class provides. If the class has a std::initializer_list constructor, that constructor is called; otherwise the normal constructors are called with the uniform initialization syntax. This can be confusing for beginners and experts alike.[19][10]

#include <iostream>
#include <vector>

int main() {
  int integer1{10};                 // int
  int integer2(10);                 // int
  std::vector<int> vector1{10, 0};  // std::initializer_list
  std::vector<int> vector2(10, 0);  // std::size_t, int

  std::cout << "Will print 10\n" << integer1 << '\n';
  std::cout << "Will print 10\n" << integer2 << '\n';

  std::cout << "Will print 10,0,\n";

  for (const auto& item : vector1) {
    std::cout << item << ',';
  }

  std::cout << "\nWill print 0,0,0,0,0,0,0,0,0,0,\n";

  for (const auto& item : vector2) {
    std::cout << item << ',';
  }
}

Exceptions


There have been concerns that the zero-overhead principle[20] is not compatible with exceptions.[10] Most modern implementations have a zero performance overhead when exceptions are enabled but not used, but do have an overhead during exception handling and in binary size due to the need for unwind tables. Many compilers support disabling exceptions from the language to save the binary overhead. Exceptions have also been criticized for being unsafe for state-handling. This safety issue has led to the invention of the RAII idiom,[21] which has proven useful beyond making C++ exceptions safe.
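A minimal sketch of how RAII keeps state consistent when an exception is thrown: the destructor of a local object runs during stack unwinding, so cleanup happens even though the function never reaches its normal end. The names `Resource`, `risky` and `released_after_throw` are hypothetical, and the boolean flag stands in for a real resource such as a file handle or a lock.

```cpp
#include <stdexcept>

// Hypothetical resource wrapper: marks the flag as released when the
// object goes out of scope, however the scope is left.
class Resource {
public:
    explicit Resource(bool& released) : released_(released) {
        released_ = false;  // "acquire"
    }
    ~Resource() { released_ = true; }  // "release"
    Resource(const Resource&) = delete;
    Resource& operator=(const Resource&) = delete;

private:
    bool& released_;
};

// Acquires the resource and then fails before any manual cleanup
// code could run.
void risky(bool& released) {
    Resource resource(released);
    throw std::runtime_error("simulated failure");
}

// Reports whether the resource was released despite the exception.
bool released_after_throw() {
    bool released = false;
    try {
        risky(released);
    } catch (const std::runtime_error&) {
        // Nothing to clean up here: the destructor already ran.
    }
    return released;
}
```

Because the exception is caught, unwinding is guaranteed to run the destructor, so `released_after_throw()` returns true.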

Encoding of string literals in source-code


C++ string literals, like those of C, do not consider the character encoding of the text within them: they are merely a sequence of bytes, and the C++ string class follows the same principle. Although source code can (since C++11) request an encoding for a literal, the compiler does not attempt to validate that the chosen encoding of the source literal is "correct" for the bytes being put into it, and the runtime does not enforce character encoding. Programmers who are used to other languages such as Java, Python or C# which try to enforce character encodings often consider this to be a defect of the language.

The example program below illustrates the phenomenon.

#include <iostream>
#include <string>
// note that this code is no longer valid in C++20
int main() {
  // all strings are declared with the UTF-8 prefix

  // file encoding determines the encoding of å and Ö
  std::string auto_enc = u8"Vår gård på Öland!";
  // this text is well-formed in both ISO-8859-1 and UTF-8
  std::string ascii = u8"Var gard pa Oland!";
  // explicitly use the ISO-8859-1 byte-values for å and Ö
  // this is invalid UTF-8
  std::string iso8859_1 = u8"V\xE5r g\xE5rd p\xE5 \xD6land!";
  // explicitly use the UTF-8 byte sequences for å and Ö
  // this will display incorrectly in ISO-8859-1
  std::string utf8 = u8"V\xC3\xA5r g\xC3\xA5rd p\xC3\xA5 \xC3\x96land!";

  std::cout << "byte-count of automatically-chosen, [" << auto_enc
            << "] = " << auto_enc.length() << '\n';
  std::cout << "byte-count of ASCII-only [" << ascii << "] = " << ascii.length()
            << '\n';
  std::cout << "byte-count of explicit ISO-8859-1 bytes [" << iso8859_1
            << "] = " << iso8859_1.length() << '\n';
  std::cout << "byte-count of explicit UTF-8 bytes [" << utf8
            << "] = " << utf8.length() << '\n';
}

Despite the presence of the C++11 'u8' prefix, meaning "Unicode UTF-8 string literal", the output of this program actually depends on the source file's text encoding (or on the compiler's settings, since most compilers can be told to convert source files to a specific encoding before compiling them). When the source file is encoded using UTF-8, and the program is run on a terminal that is configured to treat its input as UTF-8, the following output is obtained:

byte-count of automatically-chosen, [Vår gård på Öland!] = 22
byte-count of ASCII-only [Var gard pa Oland!] = 18
byte-count of explicit ISO-8859-1 bytes [Vr grd p land!] = 18
byte-count of explicit UTF-8 bytes [Vår gård på Öland!] = 22

The output terminal has stripped the invalid UTF-8 bytes from display in the ISO-8859 example string. Passing the program's output through a hex dump utility will reveal that they are still present in the program output, and that it is the terminal application that removed them.

However, when the same source file is instead saved in ISO-8859-1 and re-compiled, the output of the program on the same terminal becomes:

byte-count of automatically-chosen, [Vr grd p land!] = 18
byte-count of ASCII-only [Var gard pa Oland!] = 18
byte-count of explicit ISO-8859-1 bytes [Vr grd p land!] = 18
byte-count of explicit UTF-8 bytes [Vår gård på Öland!] = 22

One proposed solution is to make the source encoding reliable across all compilers.
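Because `length()` reports bytes rather than characters, the byte counts above differ from the 18 visible characters of the text. The sketch below shows one way to count code points in a string that is assumed to be well-formed UTF-8, by skipping continuation bytes (those of the form 10xxxxxx). The function names `count_code_points` and `sample` are made up for this example, the code does no validation, and it counts code points, not user-perceived characters. It uses ordinary string literals with explicit byte escapes, so it does not rely on the source file's encoding or on the `u8` prefix.

```cpp
#include <cstddef>
#include <string>

// Counts code points in a string ASSUMED to be well-formed UTF-8 by
// counting every byte that is not a continuation byte (10xxxxxx).
std::size_t count_code_points(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// The example text with å and Ö written as explicit UTF-8 byte sequences.
std::string sample() {
    return "V\xC3\xA5r g\xC3\xA5rd p\xC3\xA5 \xC3\x96land!";
}
```

For this text, `sample().length()` is 22 while `count_code_points(sample())` is 18, matching the byte count shown for the explicit UTF-8 string above.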

See also


References

  1. ^ "Stack Overflow Developer Survey 2021". Stack Overflow. Retrieved 28 December 2021.
  2. ^ "Google executive frustrated by Java, C++ complexity - Google, software, application development, Development tools, Languages and standards, Rob Pike". CIO. Retrieved 28 December 2021.
  3. ^ "C++ (Al Viro; Linus Torvalds; Theodore Ts'o)". yarchive.net. Retrieved 28 December 2021.
  4. ^ "Google executive frustrated by Java, C++ complexity - Google, software, application development, Development tools, Languages and standards, Rob Pike". CIO. Retrieved 28 December 2021.
  5. ^ "C++ (Al Viro; Linus Torvalds; Theodore Ts'o)". yarchive.net. Retrieved 28 December 2021.
  6. ^ Walter Bright. "C++ compilation speed".
  7. ^ Rob Pike (25 June 2012). "Less is exponentially more". Back around September 2007, I was doing some minor but central work on an enormous Google C++ program, one you've all interacted with, and my compilations were taking about 45 minutes on our huge distributed compile cluster.
  8. ^ Ville Voutilainen. "To boldly suggest an overall plan for C++23".
  9. ^ "I/O Stream-State Saver Library - 1.60.0". www.boost.org.
  10. ^ a b c "LLVM Coding Standards — LLVM 12 documentation". llvm.org.
  11. ^ "N4412: Shortcomings of iostreams". open-std.org. Retrieved 3 May 2016.
  12. ^ "P0645: Text Formatting". open-std.org. Retrieved 20 May 2021.
  13. ^ Alexander Stepanov. "Stepanov Benchmark". The final number printed by the benchmark is a geometric mean of the performance degradation factors of individual tests. It claims to represent the factor by which you will be punished by your compiler if you attempt to use C++ data abstraction features. I call this number "Abstraction Penalty." As with any benchmark it is hard to prove such a claim; some people told me that it does not represent typical C++ usage. It is, however, a noteworthy fact that majority of the people who so object are responsible for C++ compilers with disproportionately large Abstraction Penalty.
  14. ^ Andrei Alexandrescu. "Iterators Must Go" (PDF).
  15. ^ Andrei Alexandrescu. "Generic Programming Must Go" (PDF).
  16. ^ "Ranges library (C++20) - cppreference.com". en.cppreference.com.
  17. ^ Scott Meyers. Effective STL. Given all that allocation, deallocation, copying, and destruction. It should not stun you to learn that these steps can be expensive. Naturally, you don't want to perform them any more frequently than you have to. If that doesn't strike you as natural, perhaps it will when you consider that each time these steps occur, all iterators, pointers, and references into the vector or string are invalidated. That means that the simple act of inserting an element into a vector or string may also require updating other data structures that use iterators, pointers, or references into the vector or string being expanded.
  18. ^ Angelika Langer. "Invalidation of STL Iterators" (PDF).
  19. ^ Scott Meyers (7 September 2015). "Thoughts on the Vagaries of C++ Initialization".
  20. ^ Bjarne Stroustrup. "Foundations of C++" (PDF).
  21. ^ Stroustrup 1994, 16.5 Resource Management, pp. 388–89.

Works cited


Further reading

  • Peter Seibel (2009). Coders at Work: Reflections on the Craft of Programming. Apress. ISBN 978-1430219484.