A roundoff error,^{[1]} also called rounding error,^{[2]} is the difference between the result produced by a given algorithm using exact arithmetic and the result produced by the same algorithm using finite-precision, rounded arithmetic.^{[3]} Rounding errors are due to inexactness in the representation of real numbers and the arithmetic operations done with them. This is a form of quantization error.^{[4]} When using approximation equations or algorithms, especially when using finitely many digits to represent real numbers (which in theory have infinitely many digits), one of the goals of numerical analysis is to estimate computation errors.^{[5]} Computation errors, also called numerical errors, include both truncation errors and roundoff errors.
When a sequence of calculations with an input involving roundoff error is made, errors may accumulate, sometimes dominating the calculation. In ill-conditioned problems, significant error may accumulate.^{[6]}
In short, there are two major facets of roundoff errors involved in numerical calculations^{[7]}:
 Digital computers have magnitude and precision limits on their ability to represent numbers.
 Certain numerical manipulations are highly sensitive to roundoff errors. This can result both from mathematical considerations and from the way in which computers perform arithmetic operations.
Representation error
The error introduced by attempting to represent a number using a finite string of digits is a form of roundoff error called representation error.^{[8]} Here are some examples of representation error in decimal representations:
Notation | Representation | Approximation | Error
1/7 | 0.142 857 142 857... | 0.142 857 | 0.000 000 142 857...
ln 2 | 0.693 147 180 559 945 309 41... | 0.693 147 | 0.000 000 180 559 945 309 41...
log_{10} 2 | 0.301 029 995 663 981 195 21... | 0.3010 | 0.000 029 995 663 981 195 21...
^{3}√2 | 1.259 921 049 894 873 164 76... | 1.25992 | 0.000 001 049 894 873 164 76...
√2 | 1.414 213 562 373 095 048 80... | 1.41421 | 0.000 003 562 373 095 048 80...
e | 2.718 281 828 459 045 235 36... | 2.718 281 828 459 045 | 0.000 000 000 000 000 235 36...
π | 3.141 592 653 589 793 238 46... | 3.141 592 653 589 793 | 0.000 000 000 000 000 238 46...
Increasing the number of digits allowed in a representation reduces the magnitude of possible roundoff errors, but any representation limited to finitely many digits will still cause some degree of roundoff error for uncountably many real numbers. Additional digits used for intermediary steps of a calculation are known as guard digits.^{[9]}
Rounding multiple times can cause error to accumulate.^{[10]} For example, if 9.945309 is rounded to two decimal places (9.95), then rounded again to one decimal place (10.0), the total error is 0.054691. Rounding 9.945309 to one decimal place (9.9) in a single step introduces less error (0.045309). This commonly occurs when performing arithmetic operations (see Loss of significance).
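This double-rounding effect can be checked directly. A minimal Python sketch, using the standard decimal module so that the rounding is exact decimal rounding rather than binary floating point:

```python
# Double rounding versus single rounding of 9.945309, using exact
# decimal arithmetic (ties round to even, as in IEEE round-to-nearest).
from decimal import Decimal, ROUND_HALF_EVEN

x = Decimal("9.945309")

# Round twice: to two decimal places, then to one.
twice = x.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)   # 9.95
twice = twice.quantize(Decimal("0.1"), rounding=ROUND_HALF_EVEN)  # 10.0

# Round once, directly to one decimal place.
once = x.quantize(Decimal("0.1"), rounding=ROUND_HALF_EVEN)     # 9.9

print(twice, abs(twice - x))  # 10.0 0.054691
print(once, abs(once - x))    # 9.9 0.045309
```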
Floating-point number system
Compared with the fixed-point number system, the floating-point number system is more efficient in representing real numbers, so it is widely used in modern computers. While the real numbers are infinite and continuous, a floating-point number system is finite and discrete. Thus, representation error, which leads to roundoff error, occurs under the floating-point number system.
Notation of floating-point number system
A floating-point number system $F$ is characterized by four integers:
 $\beta$: base or radix
 $p$: precision
 $[L, U]$: exponent range, where $L$ is the lower bound and $U$ is the upper bound
 Any $x \in F$ has the following form:
 $x = \pm (d_0.d_1 d_2 \cdots d_{p-1})_\beta \times \beta^E$
 where $d_i$ is an integer such that $0 \le d_i \le \beta - 1$ for $i = 0, 1, \ldots, p-1$, and $E$ is an integer such that $L \le E \le U$.
Normalized floating-point number system
 A floating-point number system is normalized if the leading digit $d_0$ is always nonzero unless the number is zero.^{[3]} Since the mantissa is $d_0.d_1 d_2 \cdots d_{p-1}$, the mantissa of a nonzero number in a normalized system satisfies $1 \le d_0.d_1 d_2 \cdots d_{p-1} < \beta$. Thus, the normalized form of a nonzero IEEE floating-point number is $\pm 1.b_1 b_2 \cdots b_{p-1} \times 2^E$ where $b_i \in \{0, 1\}$. In binary, the leading digit is always $1$, so it is not written out and is called the implicit bit. This yields an extra bit of precision, which reduces the roundoff error caused by representation error.
 Since a floating-point number system is finite and discrete, it cannot represent all real numbers: the infinitely many real numbers can only be approximated by finitely many floating-point numbers through rounding rules. We denote the floating-point approximation of a given real number $x$ by $fl(x)$.
 The total number of normalized floating-point numbers is
 $2(\beta - 1)\beta^{p-1}(U - L + 1) + 1$, where
 $2$ counts the choice of sign, being positive or negative
 $\beta - 1$ counts the choice of the leading digit, which must be nonzero
 $\beta^{p-1}$ counts the remaining mantissa digits
 $U - L + 1$ counts the choice of exponents
 $1$ counts the case when the number is $0$.
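For a small system these counts can be verified by brute force. A minimal Python sketch; the parameters $\beta = 2$, $p = 3$, $L = -1$, $U = 1$ are illustrative choices, not part of any standard:

```python
# Brute-force check of the counting formula for a toy normalized system.
beta, p, L, U = 2, 3, -1, 1

values = {0.0}  # the single unnormalized value, zero
for sign in (+1, -1):
    for d0 in range(1, beta):                # leading digit is nonzero
        for rest in range(beta ** (p - 1)):  # remaining p - 1 digits
            mantissa = d0 + rest / beta ** (p - 1)
            for E in range(L, U + 1):
                values.add(sign * mantissa * beta ** E)

formula = 2 * (beta - 1) * beta ** (p - 1) * (U - L + 1) + 1
print(len(values), formula)  # both are 25
```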
IEEE standard
This article focuses on the IEEE standard, since it has been adopted almost universally since its establishment in 1985. In this standard, the base is binary, i.e. $\beta = 2$, and normalization is used. The IEEE standard stores the sign, exponent, and mantissa in separate fields of a floating-point word, each of which has a fixed width (number of bits). The two most commonly used levels of precision for floating-point numbers are single precision and double precision.
Precision | Sign (bits) | Exponent (bits) | Mantissa (bits)
Single | 1 | 8 | 23
Double | 1 | 11 | 52
Machine epsilon
Machine epsilon can be used to measure the level of roundoff error in a floating-point number system. Here are two different definitions.^{[3]}
 The machine epsilon, denoted $\epsilon_{mach}$, is the maximum possible absolute relative error in representing a nonzero real number $x$ in a floating-point number system: $\epsilon_{mach} = \max_x \frac{|x - fl(x)|}{|x|}$.
 The machine epsilon, denoted $\epsilon_{mach}$, is the smallest number $\epsilon$ such that $fl(1 + \epsilon) > 1$. Thus, $fl(1 + \delta) = fl(1) = 1$ whenever $|\delta| < \epsilon_{mach}$.
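Under the second definition, machine epsilon for IEEE double precision can be found by a simple search. A minimal Python sketch, restricting the search to powers of two:

```python
# Find the smallest power of two eps with fl(1 + eps) > 1 in IEEE
# double precision (Python floats), per the second definition.
import sys

eps = 1.0
while 1.0 + eps / 2 > 1.0:
    eps /= 2

print(eps)                     # 2.220446049250313e-16 == 2**-52
print(sys.float_info.epsilon)  # the same value
```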
Roundoff error under different rounding rules
There are two common rounding rules, round-by-chop and round-to-nearest. The IEEE standard uses round-to-nearest.
 Round-by-chop: The base-$\beta$ expansion of $x$ is truncated after the $(p-1)$-th digit.
 This rounding rule is biased because it always moves the result toward zero.
 Round-to-nearest: $fl(x)$ is set to the nearest floating-point number to $x$. When there is a tie, the floating-point number whose last stored digit is even is used.
 For the IEEE standard, where the base $\beta$ is $2$, this means that when there is a tie, we round so that the last digit is equal to $0$.
 This rounding rule is more accurate but more computationally expensive.
 Rounding so that the last stored digit is even when there is a tie ensures that we do not round up or down systematically. This is to try to avoid the possibility of an unwanted slow drift in long calculations due simply to a biased rounding.
 The following example illustrates the level of roundoff error under the two rounding rules.^{[3]} The rounding rule, round-to-nearest, leads to less roundoff error in general.
x | Round-by-chop | Roundoff error | Round-to-nearest | Roundoff error
1.649 | 1.6 | 0.049 | 1.6 | 0.049
1.650 | 1.6 | 0.050 | 1.6 | 0.050
1.651 | 1.6 | 0.051 | 1.7 | 0.049
1.699 | 1.6 | 0.099 | 1.7 | 0.001
1.749 | 1.7 | 0.049 | 1.7 | 0.049
1.750 | 1.7 | 0.050 | 1.8 | 0.050
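The table can be reproduced mechanically. A minimal Python sketch using exact decimal arithmetic, rounding to one decimal place (two significant digits for these values):

```python
# Reproduce the table: round by chopping (ROUND_DOWN) and by
# round-to-nearest with ties to even (ROUND_HALF_EVEN).
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

for s in ("1.649", "1.650", "1.651", "1.699", "1.749", "1.750"):
    x = Decimal(s)
    chop = x.quantize(Decimal("0.1"), rounding=ROUND_DOWN)
    near = x.quantize(Decimal("0.1"), rounding=ROUND_HALF_EVEN)
    print(s, chop, abs(x - chop), near, abs(x - near))
```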
Calculating roundoff error in IEEE standard
Suppose we use round-to-nearest and IEEE double precision.
 Example: the decimal number $9.4 = (1001.\overline{0110})_2$ can be rearranged into
 $+1.\underbrace{0010\,1100\,1100\,1100\,1100\,1100\,\ldots\,1100}_{52\ \text{bits}}\,\overline{1100}\ldots \times 2^3$
Since the 53rd bit to the right of the binary point is a $1$ and is followed by other nonzero bits, the round-to-nearest rule requires rounding up, that is, adding $1$ to the 52nd bit. Thus, the normalized floating-point representation in IEEE standard of $9.4$ is
 $fl(9.4) = \underbrace{(1.0010\,1100\,1100\,\ldots\,1100\,1101)_2}_{52\ \text{fraction bits}} \times 2^3$.
 Now we can calculate the roundoff error when representing $9.4$ with $fl(9.4)$.
We derived this representation by discarding the infinite tail
 $(0.\overline{1100})_2 \times 2^{-52} \times 2^3 = 0.8 \times 2^{-49}$
from the right and then adding $2^{-52} \times 2^3 = 2^{-49}$ in the rounding step.
 Then $fl(9.4) = 9.4 - 0.8 \times 2^{-49} + 2^{-49} = 9.4 + 0.2 \times 2^{-49}$.
 Thus, the roundoff error is $fl(9.4) - 9.4 = 0.2 \times 2^{-49} \approx 3.55 \times 10^{-16}$.
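This can be checked in Python, where floats are IEEE doubles: converting the stored value to Decimal recovers it exactly. A minimal sketch:

```python
# Check the roundoff error of fl(9.4).  Decimal(9.4) converts the IEEE
# double nearest to 9.4 into its exact decimal value.
from decimal import Decimal, getcontext

getcontext().prec = 60
error = Decimal(9.4) - Decimal("9.4")  # fl(9.4) - 9.4, exactly
predicted = Decimal(2) ** -49 / 5      # 0.2 * 2**-49, exactly

print(error)               # 3.552713678800500929355621337890625E-16
print(error == predicted)  # True
```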
Measuring roundoff error by using machine epsilon
We can use machine epsilon to measure the level of roundoff error when using the two rounding rules above. Below are the formulas and the corresponding proof.^{[3]} The first definition of machine epsilon is used here.
Theorem
 Round-by-chop: $\epsilon_{mach} = \beta^{1-p}$
 Round-to-nearest: $\epsilon_{mach} = \frac{1}{2}\beta^{1-p}$
Proof
Let $x = d_0.d_1 d_2 \ldots d_{p-1} d_p d_{p+1} \ldots \times \beta^n \in \mathbb{R}$ where $d_0 \neq 0$, and let $fl(x) = d_0.d_1 d_2 \ldots d_{p-1} \times \beta^n$ be the floating-point representation of $x$. Since we are using round-by-chop, we have that
 $\frac{|x - fl(x)|}{|x|} = \frac{d_p.d_{p+1} d_{p+2} \ldots \times \beta^{n-p}}{d_0.d_1 d_2 \ldots \times \beta^n} = \frac{d_p.d_{p+1} d_{p+2} \ldots}{d_0.d_1 d_2 \ldots} \times \beta^{-p}$
 In order to determine the maximum of this quantity, we need to find the maximum of the numerator and the minimum of the denominator. Since $d_0 \neq 0$ (normalized system), the minimum value of the denominator is $1$. The numerator is bounded above by $(\beta - 1).\overline{(\beta - 1)} = \beta$. Thus, $\frac{|x - fl(x)|}{|x|} \leq \frac{\beta}{1} \times \beta^{-p} = \beta^{1-p}$. Therefore, $\epsilon_{mach} = \beta^{1-p}$ for round-by-chop.
The proof for round-to-nearest is similar.
 Note that the first definition of machine epsilon is not quite equivalent to the second definition when using the round-to-nearest rule, but it is equivalent for round-by-chop.
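The round-by-chop bound can also be checked empirically. A minimal Python sketch for $\beta = 10$, $p = 3$:

```python
# Empirical check: with beta = 10 and p = 3, round-by-chop never has a
# relative error above beta**(1 - p) = 0.01.
import random
from decimal import Context, Decimal, ROUND_DOWN

ctx = Context(prec=3, rounding=ROUND_DOWN)  # chop to 3 significant digits
worst = 0.0
for _ in range(100_000):
    x = Decimal(str(random.uniform(0.001, 1000.0)))
    rel = abs((x - ctx.plus(x)) / x)   # ctx.plus rounds x to the context
    worst = max(worst, float(rel))

print(worst)  # slightly below 0.01 = beta**(1 - p)
```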
Roundoff error caused by floating-point arithmetic
Even if some numbers can be represented exactly by floating-point numbers (such numbers are called machine numbers), performing floating-point arithmetic may lead to roundoff error in the final result.
Addition
Machine addition consists of lining up the radix points of the two numbers to be added, adding them, and then storing the result again as a floating-point number. The addition itself can be done in higher precision, but the result must be rounded back to the specified precision, which may lead to roundoff error.^{[3]}
For example, adding $1$ to $2^{-53}$ in IEEE double precision proceeds as follows:
 $1.\underbrace{00\ldots0}_{52\ \text{bits}} \times 2^0 + 1.\underbrace{00\ldots0}_{52\ \text{bits}} \times 2^{-53} = 1.\underbrace{00\ldots0}_{52\ \text{bits}} \times 2^0 + 0.\underbrace{00\ldots0}_{52\ \text{bits}}1 \times 2^0 = 1.\underbrace{00\ldots0}_{52\ \text{bits}}1 \times 2^0$
 This is saved as $1.\underbrace{00\ldots0}_{52\ \text{bits}} \times 2^0$ since round-to-nearest is used in the IEEE standard. Therefore, $1 + 2^{-53}$ is equal to $1$ in IEEE double precision, and the roundoff error is $2^{-53}$.
From this example, we can see that roundoff error can be introduced when adding a large number and a small number, because the shifting of the radix point in the mantissas to make the exponents match may cause the loss of some digits.
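A minimal Python check of this example (Python floats are IEEE doubles):

```python
# Adding 2**-53 to 1 has no effect: the exact sum rounds back to 1.
print(1.0 + 2**-53 == 1.0)  # True: the entire 2**-53 is lost to rounding
print(1.0 + 2**-52)         # 1.0000000000000002, the next double above 1
```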
Multiplication
In general, the product of two $p$-digit mantissas contains up to $2p$ digits, so the result might not fit in the mantissa.^{[3]} Thus roundoff error will be involved in the result.
 For example, consider a normalized floating-point number system with base $\beta = 10$ and at most $p = 2$ mantissa digits. Then $fl(77) = 7.7 \times 10^1$ and $fl(88) = 8.8 \times 10^1$. Note that $77 \times 88 = 6776$, but $fl(6776) = 6.8 \times 10^3$ since there are at most $2$ mantissa digits. The roundoff error would be $6.8 \times 10^3 - 6776 = 24$.
Division
In general, the quotient of $2p$-digit mantissas may contain more than $p$ digits.^{[3]} Thus roundoff error will be involved in the result.
 For example, if we still use the normalized floating-point number system above, then $1/3 = 0.333\ldots$ but $fl(1/3) = 3.3 \times 10^{-1}$. So the tail $1/3 - 3.3 \times 10^{-1} = 0.00333\ldots$ is cut off.
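Both examples can be simulated with Python's decimal module set to a two-digit mantissa. The helper fl2 below is a hypothetical name introduced for illustration:

```python
# Simulate the beta = 10, p = 2 system with the decimal module.
from decimal import Context, Decimal, ROUND_HALF_EVEN

ctx = Context(prec=2, rounding=ROUND_HALF_EVEN)

def fl2(x):
    """Round x to two significant digits (a 2-digit mantissa)."""
    return ctx.plus(x)  # unary plus applies the context's rounding

x, y = Decimal(77), Decimal(88)
prod = fl2(x * y)
print(prod, prod - x * y)  # 6.8E+3 24  (fl(6776) = 6800, error 24)

quot = fl2(Decimal(1) / Decimal(3))
print(quot)                # 0.33  (the tail 0.00333... is cut off)
```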
Subtractive cancellation
The subtraction of two nearly equal numbers is called subtractive cancellation.^{[3]}
 When the leading digits are cancelled, the result may be too small to be represented exactly and it will just be represented as $0$.
 For example, let $|\epsilon| < \epsilon_{mach}$, where the second definition of machine epsilon is used. What is the solution to $(1 + \epsilon) - (1 - \epsilon)$?
We know that $1 + \epsilon$ and $1 - \epsilon$ are nearly equal numbers, and that $(1 + \epsilon) - (1 - \epsilon) = 2\epsilon$. However, in the floating-point number system, $fl((1 + \epsilon) - (1 - \epsilon)) = fl(1 + \epsilon) - fl(1 - \epsilon) = 1 - 1 = 0$. We can see that $2\epsilon$ is too small, so it is represented as $0$.
 Even if the result is representable, the result is still regarded as "garbage". We do not have much faith in this value because the most uncertainty in any floating-point number is in the digits on the far right.
 For example (an illustrative five-significant-digit decimal system), $1.2345 \times 10^0 - 1.2344 \times 10^0 = 1.0000 \times 10^{-4}$. The result is clearly representable, but we do not have much faith in it, since it is built entirely from the least reliable digits of the operands.
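A minimal Python sketch of both effects in IEEE double precision:

```python
# Subtractive cancellation with Python floats (IEEE doubles).
eps = 2**-54  # small enough that both 1 + eps and 1 - eps round to 1.0
print((1 + eps) - (1 - eps))  # 0.0, although the true answer is 2*eps

# A representable but untrustworthy result: the true value is 0, and the
# digits printed come entirely from far-right roundoff in 0.1 and 0.2.
print(0.1 + 0.2 - 0.3)        # 5.551115123125783e-17
```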
Accumulation of roundoff error
Errors can be magnified or accumulated when a sequence of calculations is applied to an initial input that carries roundoff error due to inexact representation.
Unstable algorithms
An algorithm or numerical process is called stable if small changes in the input produce only small changes in the output, and it is called unstable if large changes in the output are produced.^{[11]}
A sequence of calculations normally occurs when running an algorithm. The amount of error in the result depends on the stability of the algorithm. Roundoff error will be magnified by unstable algorithms.
For example, consider the recurrence $x_{n+1} = 10 x_n - 9$ with $x_0 = 1$ given. It is easy to show that $x_n = 1$ for all $n$ in exact arithmetic. Suppose $x_0$ is our initial value and has a small representation error $\epsilon$, which means the initial input to this algorithm is $1 + \epsilon$ instead of $1$. Then the algorithm does the following sequence of calculations:
 $x_1 = 10(1 + \epsilon) - 9 = 1 + 10\epsilon$
 $x_2 = 10(1 + 10\epsilon) - 9 = 1 + 10^2 \epsilon$
 $\vdots$
 $x_n = 1 + 10^n \epsilon$
The roundoff error is amplified in succeeding calculations, so this algorithm is unstable.
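A minimal Python run of this recurrence, starting from 1 plus one unit of double-precision roundoff:

```python
# Run the unstable recurrence x_{n+1} = 10*x_n - 9 from a starting value
# that is off by one double-precision ulp (2**-52).
x = 1.0 + 2**-52
for n in range(1, 18):
    x = 10.0 * x - 9.0
print(x)  # roughly 23, far from the exact answer 1
```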
Ill-conditioned problems
Even if a stable algorithm is used, the solution to a problem may still be inaccurate due to the accumulation of roundoff error when the problem itself is ill-conditioned.
The condition number of a problem is the ratio of the relative change in the solution to the relative change in the input.^{[3]} A problem is well-conditioned if small relative changes in the input result in small relative changes in the solution. Otherwise, the problem is called ill-conditioned.^{[3]} In other words, a problem is called ill-conditioned if its condition number is "much larger" than $1$.
The condition number is introduced as a measure of the roundoff errors that can result when solving ill-conditioned problems.^{[7]}
For example, higher-order polynomials tend to be very ill-conditioned, that is, they tend to be highly sensitive to roundoff error.^{[7]}
In 1901, Carl Runge published a study on the dangers of higher-order polynomial interpolation. He looked at the following simple-looking function:
 $f(x) = \frac{1}{1 + 25x^2}$
which is now called Runge's function. He took equidistantly spaced data points from this function over the interval $[-1, 1]$. He then used interpolating polynomials of increasing order and found that, as he took more points, the polynomials and the original curve differed considerably. Further, the situation deteriorated greatly as the order was increased: the fit became even worse, particularly at the ends of the interval, as the sketch below illustrates.
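A minimal Python sketch of this experiment, using numpy's least-squares polynomial fit as a stand-in for exact interpolation (with $n + 1$ points and degree $n$ they coincide up to conditioning effects):

```python
# Fit polynomials of increasing degree to equally spaced samples of
# Runge's function and measure the worst deviation on a fine grid.
import numpy as np

def f(x):
    return 1.0 / (1.0 + 25.0 * x**2)

grid = np.linspace(-1, 1, 1001)
for n in (5, 9, 13):                # polynomial degree
    xs = np.linspace(-1, 1, n + 1)  # n + 1 equidistant sample points
    coeffs = np.polyfit(xs, f(xs), n)
    err = np.max(np.abs(np.polyval(coeffs, grid) - f(grid)))
    print(n, err)  # the maximum deviation grows with the degree
```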
Real-world example: Patriot missile failure due to magnification of roundoff error
On February 25, 1991, during the Gulf War, an American Patriot missile battery in Dhahran, Saudi Arabia, failed to intercept an incoming Iraqi Scud missile. The Scud struck an American Army barracks and killed 28 soldiers. A report of the General Accounting Office, GAO/IMTEC-92-26, entitled Patriot Missile Defense: Software Problem Led to System Failure at Dhahran, Saudi Arabia, reported on the cause of the failure: an inaccurate calculation of the time since boot due to computer arithmetic errors.
Specifically, the time in tenths of a second, as measured by the system's internal clock, was multiplied by 1/10 to produce the time in seconds. This calculation was performed using a 24-bit fixed-point register. In particular, the value 1/10, which has a non-terminating binary expansion, was chopped at 24 bits after the radix point. The small chopping error, when multiplied by the large number giving the time in tenths of a second, led to a significant error. Indeed, the Patriot battery had been up around 100 hours, and an easy calculation shows that the resulting time error due to the magnified chopping error was about 0.34 seconds. (The number 1/10 equals $\frac{1}{2^4} + \frac{1}{2^5} + \frac{1}{2^8} + \frac{1}{2^9} + \frac{1}{2^{12}} + \frac{1}{2^{13}} + \cdots$. In other words, the binary expansion of 1/10 is $0.0001100110011001100110011001100\ldots$. The 24-bit register in the Patriot stored instead $0.00011001100110011001100$, introducing an error of $0.0000000000000000000000011001100\ldots$ binary, or about $0.000000095$ decimal. Multiplying by the number of tenths of a second in 100 hours gives $0.000000095 \times 100 \times 60 \times 60 \times 10 = 0.34$.)
A Scud travels at about 1,676 meters per second, and so travels more than half a kilometer in this time. This was far enough that the incoming Scud was outside the "range gate" that the Patriot tracked. Ironically, the fact that the bad time calculation had been improved in some parts of the code, but not all, contributed to the problem, since it meant that the inaccuracies did not cancel.^{[12]}
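A minimal Python reconstruction of this arithmetic using exact rationals. Note that the quoted error of about $9.5 \times 10^{-8}$ corresponds to keeping 23 fractional bits of 1/10; that bit count is taken from the figures in the cited account, not derived here:

```python
# Reproduce the chopping-error arithmetic from the cited report.
from fractions import Fraction

one_tenth = Fraction(1, 10)
bits = 23  # fractional bits kept; chosen to match the report's figures
stored = Fraction(int(one_tenth * 2**bits), 2**bits)  # chop the tail

chop_error = one_tenth - stored
print(float(chop_error))  # ~9.54e-08, the report's "about 0.000000095"

tenths = 100 * 60 * 60 * 10  # tenths of a second in 100 hours of uptime
drift = float(chop_error * tenths)
print(drift)               # ~0.34 seconds
print(drift * 1676)        # ~575 meters at Scud speed
```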
See also
 Precision (arithmetic)
 Truncation
 Rounding
 Loss of significance
 Floating point
 Kahan summation algorithm
 Machine epsilon
 Wilkinson's polynomial
References
 ^ Butt, Rizwan (2009), Introduction to Numerical Analysis Using MATLAB, Jones & Bartlett Learning, pp. 11–18, ISBN 9780763773762.
 ^ Ueberhuber, Christoph W. (1997), Numerical Computation 1: Methods, Software, and Analysis, Springer, pp. 139–146, ISBN 9783540620587.
 ^ ^{a} ^{b} ^{c} ^{d} ^{e} ^{f} ^{g} ^{h} ^{i} ^{j} ^{k} Forrester, Dick (2018). Math/Comp241 Numerical Methods (lecture notes). Dickinson College.
 ^ Aksoy, Pelin; DeNardis, Laura (2007), Information Technology in Theory, Cengage Learning, p. 134, ISBN 9781423901402.
 ^ Ralston, Anthony; Rabinowitz, Philip (2012), A First Course in Numerical Analysis, Dover Books on Mathematics (2nd ed.), Courier Dover Publications, pp. 2–4, ISBN 9780486140292.
 ^ Chapman, Stephen (2012), MATLAB Programming with Applications for Engineers, Cengage Learning, p. 454, ISBN 9781285402796.
 ^ ^{a} ^{b} ^{c} Chapra, Steven (2012). Applied Numerical Methods with MATLAB for Engineers and Scientists (3rd ed.). The McGrawHill Companies, Inc. ISBN 9780073401102.
 ^ Laplante, Philip A. (2000). Dictionary of Computer Science, Engineering and Technology. CRC Press. p. 420. ISBN 9780849326912.
 ^ Higham, Nicholas John (2002). Accuracy and Stability of Numerical Algorithms (2nd ed.). Society for Industrial and Applied Mathematics (SIAM). pp. 43–44. ISBN 9780898715217.
 ^ Volkov, E. A. (1990). Numerical Methods. Taylor & Francis. p. 24. ISBN 9781560320111.
 ^ Collins, Charles (2005). "Condition and Stability" (PDF). Department of Mathematics, University of Tennessee. Retrieved 28 October 2018.
 ^ Arnold, Douglas. "The Patriot Missile Failure". Retrieved 29 October 2018.
External links
 Roundoff Error at MathWorld.
 Goldberg, David (March 1991). "What Every Computer Scientist Should Know About FloatingPoint Arithmetic" (PDF). ACM Computing Surveys. 23 (1): 5–48. doi:10.1145/103162.103163. Retrieved 20160120. ([1], [2])
 20 Famous Software Disasters