<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE section PUBLIC "-//Boost//DTD BoostBook XML V1.1//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<section id="safe_numerics.tutorial">
<title>Tutorial and Motivating Examples</title>

<section id="safe_numerics.tutorial.1">
<title>Arithmetic Expressions Can Yield Incorrect Results</title>

<para>When an operation on signed integer types produces a result that
exceeds the capacity of the variable meant to hold it, the behavior is
undefined. For unsigned integer types, the same situation makes the value
wrap around according to modulo arithmetic. In either case the result
differs from what integer arithmetic in the mathematical sense would give.
This is called "overflow". Since word size can differ between machines,
code which produces mathematically correct results in one set of
circumstances may fail when recompiled for a machine with different
hardware. When this occurs, most C++ programs continue to execute with no
indication that the results are wrong. It is the programmer's
responsibility to ensure that such behavior is avoided.</para>
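
<para>As a rough illustration of the narrowing described above, the
following sketch uses only built-in types. It is not the code of the
example program included below; the helper names
<code>narrowing_sum</code> and <code>checked_sum</code> are hypothetical,
and the wrapped value assumes a two's complement platform (guaranteed by
the language only since C++20).</para>

<programlisting><![CDATA[
```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Adds two 8-bit values the way built-in arithmetic does: both operands
// are promoted to int, added, and the sum is narrowed back to 8 bits.
// The narrowing silently discards the high-order bits.
std::int8_t narrowing_sum(std::int8_t x, std::int8_t y) {
    return static_cast<std::int8_t>(x + y);
}

// Hypothetical checked alternative: reports failure instead of storing
// a value that does not fit in the destination type.
bool checked_sum(std::int8_t x, std::int8_t y, std::int8_t & result) {
    const int wide = x + y; // cannot overflow: int is wider than int8_t
    if (wide < std::numeric_limits<std::int8_t>::min() ||
        wide > std::numeric_limits<std::int8_t>::max())
        return false;
    result = static_cast<std::int8_t>(wide);
    return true;
}
```
]]></programlisting>

<para>Under those assumptions, <code>narrowing_sum(127, 2)</code> yields
-127 without any diagnostic, while <code>checked_sum(127, 2, r)</code>
returns <code>false</code> and leaves <code>r</code> untouched.</para>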

<para>This program demonstrates the problem. The solution is to replace
instances of built-in integer types with the corresponding safe types.</para>

<programlisting><xi:include href="../../example/example1.cpp" parse="text"
xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting>

<screen>example 1:undetected erroneous expression evaluation
Not using safe numerics
error NOT detected!
-127 != 127 + 2
Using safe numerics
error detected:converted signed value too large: positive overflow error
Program ended with exit code: 0</screen>
</section>

<section id="safe_numerics.tutorial.2">
<title>Arithmetic Operations Can Overflow Silently</title>

<para>A variation of the above occurs when a value is incremented or
decremented beyond the range of its type.</para>

<programlisting><xi:include href="../../example/example2.cpp" parse="text"
xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting>

<screen>example 2:undetected overflow in data type
Not using safe numerics
-2147483648 != 2147483647 + 1
error NOT detected!
Using safe numerics
addition result too large
error detected!</screen>

<para>When a variable of unsigned integer type is decremented below zero,
it "rolls over" to the highest value representable in that type. This is a
common problem which almost always goes undetected.</para>
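
<para>The roll over can be reproduced in a few lines. This is a minimal
sketch with built-in types only; <code>decrement</code> and
<code>checked_decrement</code> are hypothetical helpers, not part of the
library.</para>

<programlisting><![CDATA[
```cpp
#include <cassert>
#include <limits>

// Decrements a counter the way built-in unsigned arithmetic does.
// At zero the value wraps to the largest unsigned int; this is
// well-defined modulo arithmetic, so nothing signals the mistake.
unsigned int decrement(unsigned int count) {
    return --count;
}

// Hypothetical checked alternative: refuses to go below zero.
bool checked_decrement(unsigned int & count) {
    if (count == 0)
        return false; // decrementing would wrap around
    --count;
    return true;
}
```
]]></programlisting>

<para>Calling <code>decrement(0u)</code> returns
<code>std::numeric_limits&lt;unsigned int&gt;::max()</code>, whereas
<code>checked_decrement</code> reports the problem to its caller.</para>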
</section>

<section id="safe_numerics.tutorial.3">
<title>Arithmetic on Unsigned Integers Can Yield Incorrect Results</title>

<para>Subtracting two unsigned values of the same size yields an unsigned
value. If the first operand is less than the second, the result will be
arithmetically incorrect. But if the unsigned types are smaller than
<code>unsigned int</code>, C/C++ will promote the operands to
<code>signed int</code> before subtracting, which produces a correct
result. In either case, there is no indication of an error. Somehow, the
programmer is expected to avoid this behavior. Advice usually takes the
form of "Don't use unsigned integers for arithmetic". This is well and
good, but often not practical. C/C++ itself uses an unsigned type for the
result of <code>sizeof(T)</code>, which users then use in arithmetic.</para>
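
<para>The two cases described above can be placed side by side. The
following is a sketch with hypothetical function names, assuming a
platform where <code>int</code> is 32 bits wide.</para>

<programlisting><![CDATA[
```cpp
#include <cassert>
#include <cstdint>

// Operands narrower than int are promoted to (signed) int before the
// subtraction, so the difference can be negative and arithmetically
// correct.
int narrow_difference(std::uint8_t x, std::uint8_t y) {
    return x - y;
}

// With 32-bit int (an assumption of this sketch) 32-bit unsigned
// operands are not promoted: the subtraction is done modulo 2^32 and
// the result wraps around.
std::uint32_t wide_difference(std::uint32_t x, std::uint32_t y) {
    return x - y;
}
```
]]></programlisting>

<para>Here <code>narrow_difference(2, 127)</code> is -125, while
<code>wide_difference(2, 127)</code> is 4294967171. Neither call gives any
indication that one of the results is wrong.</para>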

<para>This program demonstrates the problem. The solution is to replace
instances of built-in integer types with the corresponding safe types.</para>

<programlisting><xi:include href="../../example/example8.cpp"
parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting>

<screen>example 8:undetected erroneous expression evaluation
Not using safe numerics
error NOT detected!
4294967171 != 2 - 127
Using safe numerics
error detected:subtraction result cannot be negative: negative overflow error
Program ended with exit code: 0</screen>
</section>

<section id="safe_numerics.tutorial.4">
<title>Implicit Conversions Can Lead to Erroneous Results</title>

<para>At CppCon 2016 Jon Kalb gave a very entertaining (and disturbing)
<ulink url="https://www.youtube.com/watch?v=wvtFGa6XJDU">lightning
talk</ulink> related to C++ expressions.</para>

<para>The talk included a very simple example similar to the
following:</para>

<para><programlisting><xi:include href="../../example/example4.cpp"
parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 3: implicit conversions change data values
Not using safe numerics
a is -1 b is 1
b is less than a
error NOT detected!
Using safe numerics
a is -1 b is 1
converted negative value to unsigned: domain error
error detected!
</screen></para>

<para>A normal person who reads the above code has to be dumbfounded by
this. The code doesn't do what the text, according to the rules of
algebra, says it does. But C++ doesn't follow the rules of algebra; it has
its own rules. There is generally no compile time error, although you can
get a compile time warning if you set some specific compiler switches. The
explanation lies in how C++ reconciles binary expressions
(<code>a &lt; b</code> is an expression here) whose operands have different
types. In processing this expression, the compiler:</para>

<para><itemizedlist>
<listitem>
<para>Determines the "best" common type for the two operands. In
this case, the rules in the C++ standard dictate that
this type will be <code>unsigned int</code>.</para>
</listitem>

<listitem>
<para>Converts each operand to this common type. The signed value
-1 is converted to the unsigned value with the same bit-wise
contents, 0xFFFFFFFF, on a machine with 32 bit integers. This
corresponds to a decimal value of 4294967295.</para>
</listitem>

<listitem>
<para>Performs the calculation, in this case
<code>&lt;</code>, the "less than" operation. Since 1 is less than
4294967295 the program prints "b is less than a".</para>
</listitem>
</itemizedlist></para>
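
<para>The three steps above can be written out explicitly. This is only a
sketch: <code>builtin_less</code> spells out what the compiler does
implicitly, and <code>value_less</code> is a hypothetical comparison that
respects the mathematical values. Neither is taken from the library.</para>

<programlisting><![CDATA[
```cpp
#include <cassert>
#include <type_traits>

// Step 1: the common type of int and unsigned int is unsigned int.
static_assert(
    std::is_same<decltype(0 + 0u), unsigned int>::value,
    "usual arithmetic conversions select unsigned int");

// Steps 2 and 3 written out: convert the signed operand to the common
// type, then compare. For a == -1 the converted value is the largest
// unsigned int.
bool builtin_less(int a, unsigned int b) {
    const unsigned int converted = static_cast<unsigned int>(a);
    return converted < b;
}

// Hypothetical value-preserving comparison: a negative signed value is
// always less than any unsigned value.
bool value_less(int a, unsigned int b) {
    return a < 0 || static_cast<unsigned int>(a) < b;
}
```
]]></programlisting>

<para>With these definitions <code>builtin_less(-1, 1u)</code> is
<code>false</code> while <code>value_less(-1, 1u)</code> is
<code>true</code>. Since C++20 the standard library offers
<code>std::cmp_less</code> for comparisons of the second kind.</para>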

<para>To detect and understand this error, a programmer has to be quite
familiar with the implicit conversion rules of the C++ standard. These are
available in a copy of the standard and also in the canonical reference
book <citetitle><link linkend="stroustrup">The C++ Programming
Language</link></citetitle> (both are over 1200 pages long!). Even
experienced programmers often fail to spot this issue and so don't know to
take precautions against it. And this is a relatively easy one to spot. In
the more general case the expression will use integers which don't
correspond to easily recognizable numbers and/or will be buried inside
some more complex expression.</para>

<para>This example generated a good amount of web traffic along with
everyone's pet suggestions. See for example <ulink
url="https://bulldozer00.com/2016/10/16/the-unsigned-conundrum/">a blog
post with everyone's favorite "solution"</ulink>. All the proposed
"solutions" have disadvantages, and attempts to agree on how to handle this
are ultimately fruitless in spite of, or maybe because of, the <ulink
url="https://twitter.com/robertramey1/status/795742870045016065">emotional
content</ulink>. Our solution is by far the simplest: just use the safe
numerics library as shown in the example above.</para>

<para>Note that in this particular case, using the safe types incurs no
runtime overhead. The code generated will equal or exceed the efficiency
of code using primitive integer types.</para>
</section>

<section id="safe_numerics.tutorial.5">
<title>Mixing Data Types Can Create Subtle Errors</title>

<para>C++ contains signed and unsigned integer types. In spite of their
similar names, they behave differently, which often produces surprising
results for some operands. Program errors from this behavior can be
exceedingly difficult to find. This has led to recommendations of various
ad hoc "rules" to avoid these problems. It's not always easy to apply
these "rules" to existing code without creating even more bugs. Here is a
typical example of this problem:</para>

<para><programlisting><xi:include href="../../example/example10.cpp"
parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting>Here
is the output of the above program:<screen>example 4: mixing types produces surprising results
Not using safe numerics
10000
4294957296
error NOT detected!
Using safe numerics
10000
error detected!converted negative value to unsigned: domain error
</screen>The solution is simple: just replace instances of <code>int</code>
with <code>safe&lt;int&gt;</code>.</para>
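
<para>One way to arrive at output like the above is sketched below. The
function <code>mixed_product</code> and its operand types are assumptions
chosen to reproduce the printed values; they are not quoted from the
example program.</para>

<programlisting><![CDATA[
```cpp
#include <cassert>
#include <cstdint>

// Multiplies an unsigned 32-bit value by a small signed value. On a
// platform with 32-bit int, y is converted to unsigned before the
// multiplication, so a negative y becomes a huge positive number and
// the product wraps modulo 2^32.
std::uint32_t mixed_product(std::uint32_t x, std::int8_t y) {
    return x * y;
}
```
]]></programlisting>

<para>With this definition <code>mixed_product(100, 100)</code> is 10000
as expected, but <code>mixed_product(100, -100)</code> is 4294957296
rather than -10000.</para>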
</section>

<section id="safe_numerics.tutorial.6">
<title>Array Index Value Can Exceed Array Limits</title>

<para>Using an intrinsic C++ array, it's very easy to exceed array limits.
This can go undetected when it occurs and create bugs which are hard to
find. There are several ways to address this, but one of the simplest is
to use <code>safe_unsigned_range</code>:</para>

<para><programlisting><xi:include href="../../example/example5.cpp"
parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 5: array index values can exceed array bounds
Not using safe numerics
error NOT detected!
Using safe numerics
error detected:Value out of range for this safe type: domain error
</screen></para>

<para>Collections like standard arrays and vectors check array indices in
some function calls and not in others, so this may not be the best
example. However, it does illustrate the use of
<code>safe_range&lt;T&gt;</code> for assigning legal ranges to variables.
This guarantees that under no circumstances will the variable contain a
value outside of the specified range.</para>
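
<para>The idea of a variable that cannot leave its legal range can be
sketched without the library. The class <code>bounded_index</code> and the
function <code>element</code> below are hypothetical and far simpler than
the library's range types; they only illustrate the principle.</para>

<programlisting><![CDATA[
```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical index type for an array of N elements. The constructor
// is the only way to set the value, so an object of this type can
// never hold an index outside [0, N).
template <std::size_t N>
class bounded_index {
    std::size_t m_value;
public:
    explicit bounded_index(std::size_t value) : m_value(value) {
        if (value >= N)
            throw std::out_of_range("index outside array bounds");
    }
    std::size_t get() const { return m_value; }
};

// Because the parameter type carries the guarantee, the element access
// itself needs no further check.
template <std::size_t N>
int & element(int (&array)[N], bounded_index<N> i) {
    return array[i.get()];
}
```
]]></programlisting>

<para>Constructing <code>bounded_index&lt;3&gt;(3)</code> throws, so an
out of range index is rejected before it can ever reach the array.</para>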
</section>

<section id="safe_numerics.tutorial.7">
<title>Checking of Input Values Can Be Easily Overlooked</title>

<para>It's all too easy to overlook the checking of parameters received
from outside the current program.<programlisting><xi:include
href="../../example/example6.cpp" parse="text"
xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 6: checking of externally produced value can be overlooked
Not using safe numerics
2147483647 0
error NOT detected!
Using safe numerics
error detected:error in file input: domain error
</screen>Without safe integers, one has to insert new checking code every
time an integer variable is retrieved. This is a tedious and error prone
procedure. Here we have used program input, but in fact this problem can
occur with any externally produced data.</para>
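
<para>For comparison, the hand-written check that has to accompany every
read might look like the following sketch. The helper
<code>read_int</code> is hypothetical and uses only standard stream
facilities.</para>

<programlisting><![CDATA[
```cpp
#include <cassert>
#include <sstream>
#include <string>

// Reads one int from text and reports whether the extraction succeeded.
// Since C++11, a value too large for int sets failbit on the stream
// (and stores the largest int), so forgetting to test the stream state
// lets a wrong value through unnoticed.
bool read_int(const std::string & text, int & value) {
    std::istringstream input(text);
    input >> value;
    return !input.fail();
}
```
]]></programlisting>

<para>A caller that ignores the return value reproduces the undetected
error shown above; a caller that tests it must remember to do so at every
single read.</para>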
</section>

<section id="safe_numerics.tutorial.8">
<title>Cannot Recover From Arithmetic Errors</title>

<para>If a divide by zero error occurs in a program, it's usually detected
by hardware. The way this manifests itself to the program can and will
depend upon:</para>

<para><itemizedlist>
<listitem>
<para>data type - int, float, etc.</para>
</listitem>

<listitem>
<para>setting of compile time command line switches</para>
</listitem>

<listitem>
<para>invocation of some configuration functions which convert these
hardware events into C++ exceptions</para>
</listitem>
</itemizedlist>It's not at all clear how one would detect and recover
from a divide by zero error in a simple portable way. Usually, users just
ignore the issue, which usually results in immediate program termination
when this situation occurs.</para>

<para>This library detects divide by zero errors before the operation is
invoked. Any errors of this nature are handled according to the <link
linkend="safe_numeric.exception_policies">exception_policy</link> selected
by the library user.</para>
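
<para>The principle of checking before the operation is invoked can be
sketched as follows. The function <code>checked_divide</code> is
hypothetical; it always throws, whereas the library lets the user choose
how errors are handled.</para>

<programlisting><![CDATA[
```cpp
#include <cassert>
#include <limits>
#include <stdexcept>

// Tests the operands before dividing and reports problems as ordinary
// C++ exceptions, so that the caller can recover in a portable way.
int checked_divide(int dividend, int divisor) {
    if (divisor == 0)
        throw std::domain_error("divide by zero");
    // The most negative int divided by -1 is not representable either.
    if (dividend == std::numeric_limits<int>::min() && divisor == -1)
        throw std::overflow_error("quotient too large");
    return dividend / divisor;
}
```
]]></programlisting>

<para>A caller can wrap the call in an ordinary <code>try</code> block and
continue after the error, which is not possible once the hardware has
trapped.</para>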

<para><programlisting><xi:include href="../../example/example13.cpp"
parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 7: cannot recover from arithmetic errors
Not using safe numerics
error NOT detectable!
Using safe numerics
error detected:divide by zero: domain error
</screen></para>
</section>

<section id="safe_numerics.tutorial.9">
<title>Compile Time Arithmetic is Not Always Correct</title>

<para>If a divide by zero error occurs while a program is being compiled,
there is no guarantee that it will be detected. This section shows a real
example compiled with a recent version of Clang.</para>

<itemizedlist>
<listitem>
<para>Source code includes a constant expression containing a simple
arithmetic error.</para>
</listitem>

<listitem>
<para>The compiler emits a warning but otherwise calculates the wrong
result.</para>
</listitem>

<listitem>
<para>Replacing <code>int</code> with <code>safe&lt;int&gt;</code>
guarantees that the error is detected at runtime.</para>
</listitem>

<listitem>
<para>Operations using safe types are marked <code>constexpr</code>, so
we can force the operations to occur at compile time by marking the
results as <code>constexpr</code>. This results in an error at compile
time if the operations cannot be correctly calculated.</para>
</listitem>
</itemizedlist>
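
<para>The mechanism in the last item can be sketched with a plain
<code>constexpr</code> function. The name <code>constexpr_divide</code> is
hypothetical and the code does not use the library; it relies only on the
language rule that a <code>throw</code> cannot be evaluated in a constant
expression.</para>

<programlisting><![CDATA[
```cpp
#include <cassert>
#include <stdexcept>

// A throw expression is not permitted during constant evaluation, so
// reaching it in a constexpr context turns the arithmetic error into a
// compile time error. At runtime the same function throws instead.
constexpr int constexpr_divide(int dividend, int divisor) {
    return divisor == 0
        ? throw std::domain_error("divide by zero")
        : dividend / divisor;
}

// Evaluated by the compiler; the result is known to be correct.
constexpr int quotient = constexpr_divide(10, 2);
static_assert(quotient == 5, "evaluated at compile time");

// Uncommenting the next line is expected to stop compilation:
// constexpr int bad = constexpr_divide(10, 0);
```
]]></programlisting>

<para>Marking the result <code>constexpr</code> is what forces the
evaluation, and therefore the check, to happen during compilation.</para>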

<para><programlisting><xi:include href="../../example/example14.cpp"
parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 8: cannot detect compile time arithmetic errors
Not using safe numerics
0error NOT detected!
Using safe numerics
error detected:positive overflow error
Program ended with exit code: 0</screen></para>
</section>

<section id="safe_numerics.tutorial.10">
<title>Programming by Contract is Too Slow</title>

<para>Programming by Contract is a highly regarded technique. Much has
been written about it and it has been proposed as an addition to the
C++ language <citation><xref linkend="garcia"/></citation><citation><xref
linkend="crowl2"/></citation>. It (mostly) depends upon runtime checking of
parameter and object values upon entry to and exit from every function.
This can slow the program down considerably, which in turn undermines the
main motivation for using C++ in the first place! One popular scheme for
addressing this issue is to enable parameter checking only during
debugging and testing, which defeats the guarantee of correctness which we
are seeking here! Programming by Contract will never be accepted by
programmers as long as it is associated with significant additional
runtime cost.</para>

<para>The Safe Numerics Library has facilities which, in many cases, can
check guaranteed parameter requirements with little or no runtime
overhead. Consider the following example:</para>

<para><programlisting><xi:include href="../../example/example7.cpp"
parse="text" xmlns:xi="http://www.w3.org/2001/XInclude"/></programlisting><screen>example 8:
enforce contracts with zero runtime cost
parameter error detected</screen></para>

<para>In the example above, the function <code>convert</code> incurs
significant runtime cost every time it is called. By using "safe" types,
this cost is moved to the moment when the parameters are constructed.
Depending on how the program is constructed, this may totally eliminate
extraneous computations for checking parameter requirements. In this
scenario, there is no reason to suppress the checking in release mode, and
our program can be guaranteed to be always arithmetically correct.</para>
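
<para>Moving a check from the function body into the parameter types can
be sketched as follows. Everything here is hypothetical: the class
<code>bounded</code>, the type names and the time-of-day scenario are
assumptions made for illustration and are much simpler than the library's
range types.</para>

<programlisting><![CDATA[
```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical range-checked value: the check runs once, when the
// object is constructed, rather than on every function call.
template <int Min, int Max>
class bounded {
    int m_value;
public:
    explicit bounded(int value) : m_value(value) {
        if (value < Min || value > Max)
            throw std::domain_error("value outside permitted range");
    }
    int get() const { return m_value; }
};

typedef bounded<0, 23> hours_t;
typedef bounded<0, 59> minutes_t;

// The parameter types state the contract, so the body needs no checks
// and cannot be handed an invalid time of day.
int minutes_since_midnight(hours_t hours, minutes_t minutes) {
    return hours.get() * 60 + minutes.get();
}
```
]]></programlisting>

<para>A value that is constructed once and passed to many functions is
checked only once, which is where the saving over per-call checking comes
from.</para>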
</section>
</section>