From Java to Go in 21 Days, Day 4: Bits and Pieces (Basic Data Types)

Data types in Go

  • Basic types

  • Aggregate types

  • Reference types

  • Interface types

Basic types

Numbers

Integers

  • Signed integers
    -- int8 int16 int32 int64 int
  • Unsigned integers
    -- uint8 uint16 uint32 uint64 uint
  • The width of int and uint depends on the compiler and target platform: 32 or 64 bits
  • byte is an alias for uint8
  • rune is an alias for int32
  • uintptr is used for low-level programming. Its width is not specified, but it is large enough to hold all the bits of a pointer value.
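
The points above can be checked with a small program. This is a minimal sketch: the helper names wrapInt8, asUint8 and asInt32 are made up for illustration, and strconv.IntSize is used to report the width of int on the current platform.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// wrapInt8 adds two int8 values. Signed overflow wraps around silently in Go.
func wrapInt8(a, b int8) int8 {
	return a + b
}

// asUint8 returns b unchanged: byte is an alias for uint8, so no conversion is needed.
func asUint8(b byte) uint8 {
	return b
}

// asInt32 returns r unchanged: rune is an alias for int32, so no conversion is needed.
func asInt32(r rune) int32 {
	return r
}

func main() {
	fmt.Println("int size in bits:", strconv.IntSize) // 32 or 64, depending on the platform
	fmt.Println(wrapInt8(math.MaxInt8, 1))            // -128
	fmt.Println(asUint8('A'), asInt32('世'))           // 65 19990
}
```

Coming from Java, note that Go has unsigned types and that int is not fixed at 32 bits.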

Floating-point numbers

  • float32
  • float64
  • The math package provides helpers for special values (for example, math.IsNaN)
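
Why math.IsNaN matters: NaN never compares equal to anything, including itself, so an == test cannot detect it. The sketch below illustrates this; safeSqrt is a hypothetical helper, not part of the standard library.

```go
package main

import (
	"fmt"
	"math"
)

// safeSqrt returns the square root of x and reports whether the result is a valid number.
func safeSqrt(x float64) (float64, bool) {
	r := math.Sqrt(x) // math.Sqrt returns NaN for negative input
	return r, !math.IsNaN(r)
}

func main() {
	a, b := math.NaN(), math.NaN()
	fmt.Println(a == b)        // false: NaN is not equal to NaN
	fmt.Println(math.IsNaN(a)) // true
	fmt.Println(safeSqrt(-1))  // NaN false
	fmt.Println(safeSqrt(9))   // 3 true
}
```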

Complex numbers

  • complex64
  • complex128
	c := complex(1, 2)
	r := real(c)
	i := imag(c)
	fmt.Printf("real part:%f imaginary part:%f\n", r, i) // real part:1.000000 imaginary part:2.000000

Strings

  • A string is an immutable sequence of bytes and can contain arbitrary data, including bytes with value 0. Text strings are conventionally interpreted as UTF-8 encoded sequences of Unicode code points (runes).
  • The built-in len(s) returns the number of bytes, not characters. The index operation s[i] returns the i-th byte, where 0 <= i < len(s)
	s := "Hello, hello world プログラム"
	fmt.Println("Bytes:", len(s))                            // 34
	fmt.Println("First four bytes:", s[0], s[1], s[2], s[3]) // 72 101 108 108
	fmt.Println(s[len(s)])                                   // panic: runtime error: index out of range [34] with length 34

  • The + operator concatenates two strings into a new string; += likewise builds a new string and assigns it back to the variable
	s += "123"
	fmt.Println(s) // Hello, hello world プログラム123
  • Strings are immutable: the bytes inside a string value cannot be modified. Immutability means that two copies of a string, or a string and its substring, can safely share the same underlying memory, so copying a string of any length is cheap.
	s[0] = 'L' // compile error: cannot assign to s[0]
	// s[7:] shares the underlying byte array with s
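
To "modify" a string you have to build a new one. The sketch below shows one way to do that by going through a []byte copy; replaceFirstByte is a hypothetical helper written for this example.

```go
package main

import "fmt"

// replaceFirstByte returns a copy of s whose first byte is replaced by b.
// The original string is untouched because []byte(s) copies the bytes.
func replaceFirstByte(s string, b byte) string {
	if len(s) == 0 {
		return s
	}
	buf := []byte(s) // a fresh, mutable copy of the string's bytes
	buf[0] = b
	return string(buf)
}

func main() {
	s := "Hello, world"
	t := s[7:] // substring of s starting at byte offset 7
	u := replaceFirstByte(s, 'J')
	fmt.Println(s) // Hello, world
	fmt.Println(t) // world
	fmt.Println(u) // Jello, world
}
```

The byte-level replacement is only safe here because the first character is ASCII; for arbitrary text, convert to []rune instead.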
  • String literals are written in double quotes; raw string literals are written between backquotes ` `. A raw string literal can span multiple lines and processes no escape sequences. The only special treatment is that carriage return characters are deleted.
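
A small sketch of the difference between the two literal forms. The constant names interpreted, raw and multi are chosen for illustration only.

```go
package main

import "fmt"

// interpreted is a double-quoted literal: \t is an escape sequence for a tab.
const interpreted = "a\tb"

// raw is a backquoted literal: the backslash and the t are kept verbatim.
const raw = `a\tb`

// multi shows that a raw string literal may span several lines.
const multi = `first
second`

func main() {
	fmt.Println(len(interpreted)) // 3
	fmt.Println(len(raw))         // 4
	fmt.Println(multi)
}
```

Raw literals are handy for regular expressions, file paths and embedded text blocks, where backslashes would otherwise need escaping.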
  • Unicode collects the characters of all the world's writing systems and assigns each one a standard number called a Unicode code point. In Go these are called runes. The natural data type for holding a single rune is int32, and that is what Go uses; for this reason rune is an alias for int32. A sequence of runes can be represented as a sequence of int32 values. In this representation (UTF-32) every code point has the same encoded size, 32 bits. It is simple and uniform, but most computer-readable text is ASCII, which needs only 8 bits (1 byte) per character, so a lot of storage is wasted. Even the characters in widespread use number fewer than 65,536 and would fit in 16 bits. Can we do better?
  • UTF-8
    UTF-8 is a variable-length encoding of Unicode code points as bytes. It is now a Unicode standard and was invented by Ken Thompson and Rob Pike, two of the creators of Go. Each rune is represented by 1 to 4 bytes: ASCII characters take only 1 byte, and most other characters in common use take 2 or 3 bytes. The high-order bits of a rune's first byte indicate how many bytes follow. A high-order 0 indicates 7-bit ASCII: the rune takes only 1 byte and is identical to conventional ASCII. High-order bits 110 indicate that the rune takes 2 bytes, and the second byte begins with 10. Longer encodings follow the same pattern.
    A variable-length encoding makes it impossible to index directly to the n-th character of a string. In exchange, UTF-8 has many useful properties. Backing up at most 3 bytes is enough to find the start of a character. UTF-8 is also a prefix code, so it can be decoded from left to right without ambiguity or lookahead.
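
The bit patterns and the "back up at most 3 bytes" property can be sketched as follows. isContinuation and runeStart are hypothetical helpers written for this example; runeStart assumes 0 <= i < len(s) and valid UTF-8 input.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isContinuation reports whether b is a UTF-8 continuation byte (bit pattern 10xxxxxx).
func isContinuation(b byte) bool {
	return b&0xC0 == 0x80
}

// runeStart backs up from byte offset i to the first byte of the rune that contains it.
func runeStart(s string, i int) int {
	for i > 0 && isContinuation(s[i]) {
		i--
	}
	return i
}

func main() {
	// Print each rune with its code point, encoded length and bytes in binary.
	for _, r := range "A\u00e9世\U0001F600" {
		buf := make([]byte, utf8.RuneLen(r))
		utf8.EncodeRune(buf, r)
		fmt.Printf("U+%04X %d byte(s): %08b\n", r, len(buf), buf)
	}

	// In "a世b" the rune 世 occupies bytes 1..3, so offset 3 backs up to 1.
	fmt.Println(runeStart("a世b", 3)) // 1
}
```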
  • The unicode package provides functions for working with individual runes
  • The unicode/utf8 package provides functions for encoding and decoding runes as bytes using UTF-8.
package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	s := "Hello, hello world プログラム"
	fmt.Println("Bytes:", len(s))                            // 34
	fmt.Println("First four bytes:", s[0], s[1], s[2], s[3]) // 72 101 108 108
	// fmt.Println(s[len(s)]) // panic: runtime error: index out of range [34] with length 34

	s += "123"
	fmt.Println(s) // Hello, hello world プログラム123

	// RuneCountInString returns the number of UTF-8 encoded runes in s.
	fmt.Println(utf8.RuneCountInString(s)) // 27

	// range decodes UTF-8 implicitly: i is the byte offset, r is the rune.
	for i, r := range s {
		fmt.Println("for range:", i, r)
	}

	// Explicit decoding: DecodeRuneInString decodes the first UTF-8 sequence
	// in its argument and returns the rune and its length in bytes.
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		i += size
		fmt.Println(r, size)
		fmt.Println(string(r))
	}

	// Converting to []rune decodes the whole string into code points.
	runes := []rune(s)
	fmt.Println(runes)

	// Converting back re-encodes the runes as UTF-8.
	fmt.Println(string(runes))
}

  • Four standard packages are particularly important for working with strings: bytes, strings, strconv, and unicode
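
A quick tour of the four packages, as a rough sketch rather than a complete overview. countDigits and joinUpper are hypothetical helpers; every library call used is from the standard library.

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
	"strings"
	"unicode"
)

// countDigits counts the runes in s that unicode classifies as decimal digits.
func countDigits(s string) int {
	n := 0
	for _, r := range s {
		if unicode.IsDigit(r) {
			n++
		}
	}
	return n
}

// joinUpper upper-cases each part and joins them with commas,
// using a bytes.Buffer instead of repeated string concatenation.
func joinUpper(parts []string) string {
	var buf bytes.Buffer
	for i, p := range parts {
		if i > 0 {
			buf.WriteByte(',')
		}
		buf.WriteString(strings.ToUpper(p))
	}
	return buf.String()
}

func main() {
	fmt.Println(strings.Contains("hello world", "wor")) // true
	fmt.Println(joinUpper([]string{"a", "b", "c"}))     // A,B,C
	fmt.Println(countDigits("go1.17 in 2021"))          // 7

	n, err := strconv.Atoi("123") // string to int
	fmt.Println(n+1, err)         // 124 <nil>
	fmt.Println(strconv.Itoa(456)) // int to string: 456
}
```

Roughly, strings and bytes offer parallel functions for string and []byte, strconv converts between strings and other basic types, and unicode classifies and converts individual runes.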

Boolean

Aggregate types

Arrays

Structs

Reference types

Pointers

Slices

Maps

Functions

Channels

Posted by robertvideo on Tue, 30 Nov 2021 17:46:23 -0800