Go 1.8rc3 source code learning: token


Preface

The token package contains the data structures and methods used for lexical analysis of Go source code. The source is located in <go-src>/src/go/token.

token.go

The comments in the source code are great!

Token type

Token is the set of lexical tokens of the Go programming language.

type Token int

tokens

The list of tokens (token ids):

const (
    // Special tokens
    ILLEGAL Token = iota
    EOF
    COMMENT

    literal_beg
    ...
    literal_end

    operator_beg
    ...
    operator_end

    keyword_beg
    ...
    keyword_end
)

The tokens of the Go language are defined as constants. One technique here is worth learning: the unexported markers xxx_beg and xxx_end delimit each token group, so the type of a token can be determined quickly with a range check. For example, is a token id a keyword?

func (tok Token) IsKeyword() bool { return keyword_beg < tok && tok < keyword_end }

Next comes the tokens array, which maps each constant defined above to its string form.

var tokens = [...]string{
    ILLEGAL: "ILLEGAL",

    EOF:     "EOF",
    COMMENT: "COMMENT",
    ...
}

Query token string based on token id

String indexes the tokens array, checking first that the index is within bounds. If the id is out of range or has no string, it falls back to the form token(N).

func (tok Token) String() string {
    s := ""
    if 0 <= tok && tok < Token(len(tokens)) {
        s = tokens[tok]
    }
    if s == "" {
        s = "token(" + strconv.Itoa(int(tok)) + ")"
    }
    return s
}

keywords

A map stores the correspondence between keyword strings and token ids.

var keywords map[string]Token

Lookup maps a string to its keyword token, or to IDENT if the string is not a keyword.

func Lookup(ident string) Token {
    if tok, is_keyword := keywords[ident]; is_keyword {
        return tok
    }
    return IDENT
}

position.go

Position type

Position describes an arbitrary source position, including the file, line, and column location.

type Position struct {
    Filename string // filename, if any
    Offset   int    // offset, starting at 0
    Line     int    // line number, starting at 1
    Column   int    // column number, starting at 1 (byte count)
}

File type

A File is a handle for a file belonging to a FileSet; it has a name, a size, and a line offset table.

type File struct {
    set  *FileSet
    name string // file name as provided to AddFile
    base int    // Pos value range for this file is [base...base+size]
    size int    // file size as provided to AddFile

    lines []int 
    infos []lineInfo
}
  • lines and infos are protected by set.mutex

  • lines contains the offset of the first character for each line (the first entry is always 0)

lines(line table)

There are several ways to set (initialize) a File's line table, for example from the file content:

func (f *File) SetLinesForContent(content []byte) {
    var lines []int
    line := 0
    for offset, b := range content {
        if line >= 0 {
            lines = append(lines, line)
        }
        line = -1
        if b == '\n' {
            line = offset + 1
        }
    }

    // set lines table
    f.set.mutex.Lock()
    f.lines = lines
    f.set.mutex.Unlock()
}
  • lines is declared as a slice, and since line is initialized to 0, the first value appended to lines is 0.

  • When a newline character '\n' is detected, offset + 1 is saved to line; the + 1 skips past the newline itself, so line holds the offset of the first character of the next line.

  • The FileSet's mutex is held while the File's lines field is set.

Pos type

Pos is a compact encoding of a source position within a file set.

type Pos int

Pos is easy to confuse with Position above. A Position describes a location within a file in readable form (file name, offset, line, column), while a Pos is a compact integer encoding of a location within a FileSet. The two can be converted into each other.

func (f *File) PositionFor(p Pos, adjusted bool) (pos Position) {
    if p != NoPos {
        if int(p) < f.base || int(p) > f.base+f.size {
            panic("illegal Pos value")
        }
        pos = f.position(p, adjusted)
    }
    return
}

func (f *File) position(p Pos, adjusted bool) (pos Position) {
    offset := int(p) - f.base
    pos.Offset = offset
    pos.Filename, pos.Line, pos.Column = f.unpack(offset, adjusted)
    return
}

As the comment in the File struct notes, base is the file's starting position within the FileSet, so int(p) - f.base in the position method yields the offset of p within the file. f.unpack then searches lines and infos to compute the line and column for that offset.
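
The base arithmetic can be observed through the exported API. The sketch below registers two files in one FileSet and converts an offset in the second file to a Pos and back to a Position. The file names, contents and the locate helper are made up for illustration.

```go
package main

import (
	"fmt"
	"go/token"
)

// locate registers two files in a fresh FileSet and resolves the Pos of the
// given byte offset in the second file back to a Position.
func locate(offset int) (base int, p token.Pos, pos token.Position) {
	first := []byte("package a\n")
	second := []byte("package b\n\nvar x = 1\n")

	fset := token.NewFileSet()
	fset.AddFile("a.go", fset.Base(), len(first))
	f := fset.AddFile("b.go", fset.Base(), len(second))
	f.SetLinesForContent(second)

	p = f.Pos(offset)      // encode: f.Base() + offset
	pos = fset.Position(p) // decode: find the file, then the line and column
	return f.Base(), p, pos
}

func main() {
	base, p, pos := locate(15) // offset 15 is the "x" in "var x = 1"
	fmt.Println(base, int(p)-base, pos)
}
```

Since the second file is added after the first, its base is larger than the first file's whole Pos range, which is what lets a single integer identify both the file and the offset inside it.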

func (f *File) unpack(offset int, adjusted bool) (filename string, line, column int) {
    filename = f.name
    if i := searchInts(f.lines, offset); i >= 0 {
        line, column = i+1, offset-f.lines[i]+1
    }
    if adjusted && len(f.infos) > 0 {
        // almost no files have extra line infos
        if i := searchLineInfos(f.infos, offset); i >= 0 {
            alt := &f.infos[i]
            filename = alt.Filename
            if i := searchInts(f.lines, alt.Offset); i >= 0 {
                line += alt.Line - i - 1
            }
        }
    }
    return
}
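
The line and column computation in the first half of unpack can be sketched on its own. Here sort.Search from the standard library stands in for the package's unexported searchInts helper; that substitution, the lineCol name and the sample line table are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// lineCol maps a byte offset to a 1-based line and column, given a table of
// line start offsets in increasing order.
func lineCol(lines []int, offset int) (line, column int) {
	// Index of the last line whose start offset is <= offset.
	i := sort.Search(len(lines), func(i int) bool { return lines[i] > offset }) - 1
	if i < 0 {
		return 0, 0 // empty table, or offset before the first line
	}
	return i + 1, offset - lines[i] + 1
}

func main() {
	lines := []int{0, 10, 11} // line table for "package b\n\nvar x = 1\n"
	fmt.Println(lineCol(lines, 0))
	fmt.Println(lineCol(lines, 10))
	fmt.Println(lineCol(lines, 15))
}
```

Because the table is sorted, the lookup is a binary search, so resolving a position stays cheap even for files with many lines.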

FileSet type

summary

Posted by Andrew W Peace on Sat, 06 Apr 2019 20:27:31 -0700