Previous
golang Quick Start [2.1] - go Language Development Environment Configuration - Windows
golang Quick Start [2.2] - go Language Development Environment Configuration - macOS
golang Quick Start [2.3] - go Language Development Environment Configuration - Linux
golang Quick Start [4] - How the go Language Is Compiled into Machine Code
golang Quick Start [5.2] - How the go Language Runs - Memory Overview
golang Quick Start [5.3] - How the go Language Runs - Memory Allocation
golang Quick Start [6.1] - Integrated Development Environment - goland Details
golang Quick Start [6.2] - Integrated Development Environment - emacs Details
golang Quick Start [7.1] - Project and Dependency Management - gopath
Preface
In the previous section, we learned about automatic type inference in the go language.
In this article, we take a closer look at how floating-point numbers are stored in the go language.
What does the simple program below print for 0.3 + 0.6? Some would naively assume 0.9, but the actual output is 0.8999999999999999 (go 1.13.5).
```go
var f1 float64 = 0.3
var f2 float64 = 0.6
fmt.Println(f1 + f2)
```
The problem is that most decimal fractions can only be approximated in binary: their binary expansions are infinite. Take 0.1 as an example. It may be one of the simplest decimals you can think of, yet its binary form looks complicated: 0.0001100110011001100..., an infinitely repeating sequence (how to convert to binary is described later).
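A quick way to see this approximation in Go is to print the values with more digits than the default formatting shows. This is a minimal sketch; the exact trailing digits shown in the comments depend on the float64 rounding and may differ slightly.

```go
package main

import "fmt"

func main() {
	// 0.1 cannot be represented exactly in binary; float64 stores the
	// nearest representable value, which is slightly larger than 0.1.
	fmt.Printf("%.20f\n", 0.1)     // prints something like 0.10000000000000000555
	fmt.Printf("%.20f\n", 0.3+0.6) // prints something like 0.89999999999999991118
}
```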
This surprising result tells us that we need a solid understanding of how floating-point numbers are stored in a computer, and of what they really are, in order to handle numeric calculations correctly.
Go, like many other languages (C, C++, Python), uses the IEEE-754 standard to store floating-point numbers.
How IEEE-754 Stores Floating Points
The IEEE-754 specification represents floating-point numbers in a form of scientific notation with base 2.
| Decimal number | Scientific notation | Exponential form | Coefficient | Base | Exponent | Fraction |
|----------------|---------------------|------------------|-------------|------|----------|----------|
| 700 | 7e+2 | 7 * 10^2 | 7 | 10 | 2 | 0 |
| 4,900,000,000 | 4.9e+9 | 4.9 * 10^9 | 4.9 | 10 | 9 | .9 |
| 5362.63 | 5.36263e+3 | 5.36263 * 10^3 | 5.36263 | 10 | 3 | .36263 |
| -0.00345 | -3.45e-3 | -3.45 * 10^-3 | -3.45 | 10 | -3 | .45 |
| 0.085 | 1.36e-4 | 1.36 * 2^-4 | 1.36 | 2 | -4 | .36 |
Differences between 32-bit single-precision and 64-bit double-precision floating-point numbers
| Precision | Sign | Exponent | Fraction (mantissa) | Bias |
|------------------|--------|------------|---------------|------|
| Single (32 bits) | 1 [31] | 8 [30-23] | 23 [22-00] | 127 |
| Double (64 bits) | 1 [63] | 11 [62-52] | 52 [51-00] | 1023 |
Sign bit: 1 means negative, 0 means positive.
Exponent bits: store the actual exponent plus the bias; the bias exists so that negative exponents can be expressed.
Fraction bits: store the fractional part of the coefficient, either exactly or as the closest representable value.
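Before working through the single-precision example below, here is a minimal float64 sketch of the same decomposition, assuming the 64-bit layout from the table above (1 sign bit, 11 exponent bits, 52 fraction bits, bias 1023). It is illustrative only; the worked example in the rest of the article uses float32.

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	const bias = 1023 // exponent bias for float64

	f := 0.085
	bits := math.Float64bits(f)

	sign := bits >> 63                        // 1 bit   [63]
	exponent := int((bits>>52)&0x7FF) - bias  // 11 bits [62-52], bias removed
	fraction := bits & ((1 << 52) - 1)        // 52 bits [51-00]

	// For 0.085 the sign is 0 and the actual exponent is -4,
	// matching the single-precision example that follows.
	fmt.Println(sign, exponent, fraction)
}
```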
Take the number 0.085 for example.
| Sign bit | Exponent bits (123) | Fraction bits (.36) |
|----------|---------------------|------------------------------|
| 0 | 0111 1011 | 010 1110 0001 0100 0111 1011 |
Calculating the fraction bits
Take 0.36 as an example: 010 1110 0001 0100 0111 1011 ≈ 0.36 (the first bit represents 1/2, the second bit 1/4, and so on).
Broken down bit by bit, the calculation is:
| Bit | Value | Fraction | Decimal | Total |
|-----|---------|-----------|------------------|------------------|
| 2 | 4 | 1⁄4 | 0.25 | 0.25 |
| 4 | 16 | 1⁄16 | 0.0625 | 0.3125 |
| 5 | 32 | 1⁄32 | 0.03125 | 0.34375 |
| 6 | 64 | 1⁄64 | 0.015625 | 0.359375 |
| 11 | 2048 | 1⁄2048 | 0.00048828125 | 0.35986328125 |
| 13 | 8192 | 1⁄8192 | 0.0001220703125 | 0.3599853515625 |
| 17 | 131072 | 1⁄131072 | 0.00000762939453 | 0.35999298095703 |
| 18 | 262144 | 1⁄262144 | 0.00000381469727 | 0.3599967956543 |
| 19 | 524288 | 1⁄524288 | 0.00000190734863 | 0.35999870300293 |
| 20 | 1048576 | 1⁄1048576 | 0.00000095367432 | 0.35999965667725 |
| 22 | 4194304 | 1⁄4194304 | 0.00000023841858 | 0.35999989509583 |
| 23 | 8388608 | 1⁄8388608 | 0.00000011920929 | 0.36000001430512 |
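As a sanity check on the table, the same total can be computed directly from the set bit positions. This is just illustrative arithmetic over the positions listed above, not part of the original program that follows.

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	// Positions (1-based, counted from the left of the fraction field)
	// of the bits that are set in 010 1110 0001 0100 0111 1011.
	positions := []int{2, 4, 5, 6, 11, 13, 17, 18, 19, 20, 22, 23}

	sum := 0.0
	for _, p := range positions {
		sum += 1 / math.Pow(2, float64(p)) // each set bit contributes 1/2^p
	}
	fmt.Println(sum) // approximately 0.36000001430512, matching the table's total
}
```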
Displaying Floating-Point Bits in go - Verifying the Theory Above
math.Float32bits lets us obtain the binary representation of a number.
The go code below outputs the binary representation of 0.085.
To verify the theory above, it also derives the original decimal value 0.085 back from that binary representation.
package main import ( "fmt" "math" ) func main() { var number float32 = 0.085 fmt.Printf("Starting Number: %f\n\n", number) // Float32bits returns the IEEE 754 binary representation bits := math.Float32bits(number) binary := fmt.Sprintf("%.32b", bits) fmt.Printf("Bit Pattern: %s | %s %s | %s %s %s %s %s %s\n\n", binary[0:1], binary[1:5], binary[5:9], binary[9:12], binary[12:16], binary[16:20], binary[20:24], binary[24:28], binary[28:32]) bias := 127 sign := bits & (1 << 31) exponentRaw := int(bits >> 23) exponent := exponentRaw - bias var mantissa float64 for index, bit := range binary[9:32] { if bit == 49 { position := index + 1 bitValue := math.Pow(2, float64(position)) fractional := 1 / bitValue mantissa = mantissa + fractional } } value := (1 + mantissa) * math.Pow(2, float64(exponent)) fmt.Printf("Sign: %d Exponent: %d (%d) Mantissa: %f Value: %f\n\n", sign, exponentRaw, exponent, mantissa, value) }
Output:
```
Starting Number: 0.085000

Bit Pattern: 0 | 0111 1011 | 010 1110 0001 0100 0111 1011

Sign: 0 Exponent: 123 (-4) Mantissa: 0.360000 Value: 0.085000
```
Classic Question: How to Tell Whether a Floating-Point Number Actually Stores an Integer
Think for 10 seconds....
Below is a go implementation that determines whether a floating-point number stores an integer. Let's analyze the function line by line; it deepens the understanding of floating-point numbers.
```go
func IsInt(bits uint32, bias int) {
	// Actual exponent, with an extra 23 subtracted so that it is measured
	// relative to the lowest fraction bit.
	exponent := int(bits>>23) - bias - 23

	// Fraction bits with the implicit leading 1 added back.
	coefficient := (bits & ((1 << 23) - 1)) | (1 << 23)

	// The low -exponent bits of the coefficient; they must all be zero
	// for the value to be an integer.
	intTest := coefficient & ((1 << uint32(-exponent)) - 1)

	fmt.Printf("\nExponent: %d Coefficient: %d IntTest: %d\n",
		exponent, coefficient, intTest)

	if exponent < -23 {
		fmt.Printf("NOT INTEGER\n")
		return
	}

	if exponent < 0 && intTest != 0 {
		fmt.Printf("NOT INTEGER\n")
		return
	}

	fmt.Printf("INTEGER\n")
}
```
A necessary condition for storing an integer is that the stored exponent be at least 127: if the stored exponent is 127 or more, the actual exponent is non-negative; if it is below 127, the actual exponent is negative and the value is a pure fraction. Let's take the number 234523 as an example:
```
Starting Number: 234523.000000

Bit Pattern: 0 | 1001 0000 | 110 0101 0000 0110 1100 0000

Sign: 0 Exponent: 144 (17) Mantissa: 0.789268 Value: 234523.000000

Exponent: -6 Coefficient: 15009472 IntTest: 0
INTEGER
```
The first step computes the exponent. Because an extra 23 is subtracted, the criterion in the first check becomes exponent < -23 (for example, for 0.085 the exponent is 123 - 127 - 23 = -27, which is less than -23, so it is immediately reported as NOT INTEGER).
exponent := int(bits >> 23) - bias - 23
Step 2, (bits & ((1 < < 23) - 1)) calculates the decimal places.
coefficient := (bits & ((1 << 23) - 1)) | (1 << 23) Bits: 01001000011001010000011011000000 (1 << 23) - 1: 00000000011111111111111111111111 bits & ((1 << 23) - 1): 00000000011001010000011011000000
`|(1 << 23)` means that 1 is added in front.
bits & ((1 << 23) - 1): 00000000011001010000011011000000 (1 << 23): 00000000100000000000000000000000 coefficient: 00000000111001010000011011000000
That is, coefficient = 1 + fraction.
The third step computes intTest. The value is an integer only if the exponent is large enough to cover all of the remaining fraction bits. As shown below, the actual exponent is 17, so it cannot cover the last 6 of the 23 fraction bits, that is, the bits from 1/2^18 downward. The number is still an integer because those last 6 bits are all 0.
```
exponent: 144 - 127 - 23 = -6

1 << uint32(-exponent):        00000000000000000000000001000000
(1 << uint32(-exponent)) - 1:  00000000000000000000000000111111

coefficient:                   00000000111001010000011011000000
(1 << uint32(-exponent)) - 1:  00000000000000000000000000111111
intTest:                       00000000000000000000000000000000
```
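For completeness, a small driver for IsInt might look like the sketch below. It assumes IsInt is defined in the same package as above; the bias for single precision is 127.

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	const bias = 127 // single-precision exponent bias

	// 234523 stores an integer, 0.085 does not; each call also prints the
	// Exponent/Coefficient/IntTest line before its verdict.
	IsInt(math.Float32bits(234523), bias) // ends with INTEGER
	IsInt(math.Float32bits(0.085), bias)  // ends with NOT INTEGER

	_ = fmt.Sprint // fmt is used by IsInt itself
}
```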
Extended reading: Concepts: Normal number and denormal (or subnormal) number
Wikipedia explains:
In computing, a normal number is a non-zero number in a floating-point representation which is within the balanced range supported by a given floating-point format: it is a floating point number that can be represented without leading zeros in its significand.
What does that mean? In IEEE-754 the exponent field stores a biased value, which is what makes negative exponents expressible. For example, for 0.085 in single precision the actual exponent is -4, and the value stored in the exponent field is 123.
So there is a limit to how negative the exponent can be; for single precision the smallest normal magnitude is 2^-126. A value smaller than that, for example 2^-127, has to be written as 0.1 * 2^-126 (in binary). The coefficient then no longer has a leading 1; such a value is called a denormal (or subnormal) number.
Values whose coefficient does have a leading 1 are called normal numbers.
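A quick way to see this boundary in Go is the rough sketch below, which builds the smallest positive normal and subnormal float32 values directly from their bit patterns (the printed values in the comments are approximate).

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	// Smallest positive normal float32: exponent field 1, fraction 0, i.e. 1 * 2^-126.
	smallestNormal := math.Float32frombits(0x00800000)

	// Smallest positive subnormal float32: exponent field 0, fraction field 1,
	// so the significand has no leading 1.
	smallestSubnormal := math.Float32frombits(0x00000001)

	fmt.Println(smallestNormal)    // about 1.1754944e-38
	fmt.Println(smallestSubnormal) // about 1e-45
}
```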
Extended reading: Concepts: Precision
Precision is a fairly subtle concept; here we discuss the decimal precision of binary floating-point numbers.
A precision of D digits means that, within a given range, a D-digit decimal number (written in scientific notation) can be converted to binary and back to a D-digit decimal without losing any information; in that range the format has D digits of precision.
Precision is lost because each conversion does not map a value exactly but rounds it to the nearest representable number.
We will not go further into the details here, but simply state the conclusions:
float32 has a precision of 6 to 8 decimal digits.
float64 has a precision of 15 to 17 decimal digits.
Precision is not fixed; it varies from range to range. A simple intuition: the powers of 2 and the powers of 10 do not line up in the same way across ranges.
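A small illustration of the roughly 7-digit limit of float32 is sketched below; the exact printed values are assumptions based on default formatting and may render slightly differently.

```go
package main

import "fmt"

func main() {
	var a float32 = 123456789 // 9 significant decimal digits
	var b float64 = 123456789

	fmt.Println(a) // float32 keeps only ~7 digits: prints something like 1.23456792e+08
	fmt.Println(b) // float64 keeps all of them: prints 1.23456789e+08
}
```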
Summary
This article described how floating-point numbers are stored under the IEEE-754 standard used by the go language.
It used actual code snippets and a small brain teaser to help readers understand floating-point storage.
It also introduced two important concepts: normal numbers and precision.
Reference material