Prior to Go 1.6, the built-in map type was only partially goroutine-safe: concurrent reads were fine, but concurrent writes could cause problems. Since Go 1.6, reading and writing a map concurrently makes the runtime throw a fatal error. This problem has appeared in some well-known open source libraries, so the solution before Go 1.9 was to pair the map with an extra lock, either encapsulating both into a new struct or managing the lock alongside the map.
This article takes you deep into the concrete implementation of sync.Map: how the code becomes more complex in order to add a feature, and the ideas its author applied in implementing sync.Map.
Concurrency problems with the built-in map
The official FAQ mentions that the built-in map is not goroutine-safe.
First, let's look at a piece of code that reads and writes concurrently. In the following program, one goroutine reads continuously while another writes. Note that the keys being read and written are different, and the map never "grows", yet the code still crashes.
```go
package main

func main() {
	m := make(map[int]int)
	go func() {
		for {
			_ = m[1]
		}
	}()
	go func() {
		for {
			m[2] = 2
		}
	}()
	select {}
}
```
The error message is: `fatal error: concurrent map read and map write`.
If you look at the Go source code (hashmap_fast.go#L118), you will see that reads check the hashWriting flag, and if the flag is set, a concurrent-access error is reported.
This flag is set when writing (hashmap.go#L542):

```go
h.flags |= hashWriting
```

and it is cleared at hashmap.go#L628 once the write completes.
Of course, there are several other concurrency checks in the code: writes check for concurrent writes, deleting a key is checked like a write, iteration checks for concurrent reads and writes, and so on.
Sometimes map concurrency problems are not so easy to spot; you can use the `-race` flag to detect them.
Solutions before Go 1.9
However, we often do need to use maps concurrently, especially in projects of any size, where a map frequently holds data shared between goroutines. In Go maps in action, a simple solution is provided.
```go
var counter = struct {
	sync.RWMutex
	m map[string]int
}{m: make(map[string]int)}
```
It uses an embedded struct field to attach a read-write lock to the map.
Reading data only requires taking the read lock:

```go
counter.RLock()
n := counter.m["some_key"]
counter.RUnlock()
fmt.Println("some_key:", n)
```
Writing data requires the write lock:

```go
counter.Lock()
counter.m["some_key"]++
counter.Unlock()
```
sync.Map
The above solution is quite concise, and using a read-write lock instead of a plain Mutex further reduces contention for readers.
However, it still has problems in some scenarios. If you are familiar with Java, compare this with Java's ConcurrentHashMap. When a map holds a very large amount of data, a single lock causes heavy contention among concurrent clients. Java's solution is sharding: internally it uses multiple locks, one per segment, so fewer clients contend for any one lock. orcaman provides an implementation of this idea for Go: concurrent-map. He also asked the Go developers whether this approach could be implemented in the standard library; because of the implementation complexity, the answer was "yes, we'll consider it", but unless there are significant performance improvements and compelling use cases, there has been no further development news.
So how is sync.Map implemented in Go 1.9? How does it solve the concurrency problem while improving performance?
The implementation of sync.Map has several optimization points, which are listed here first, and we will analyze them later.
- Trade space for time. Two redundant data structures (read and dirty) reduce the impact of locking on performance.
- Use a read-only structure (read) to avoid read-write conflicts.
- Dynamic adjustment: after enough misses, the dirty data is promoted to read.
- Double-checking.
- Lazy deletion. Deleting a key only marks it; the deleted data is cleaned up when dirty is promoted.
- Read, update, and delete from read first, because read requires no lock.
Let's introduce the key code of sync.Map in order to understand its implementation idea.
First, let's look at the data structure of sync.Map:
```go
type Map struct {
	// Lock guarding dirty; needed for any operation that touches the dirty data.
	mu Mutex

	// read is a read-only data structure; because it is read-only, there are
	// no read-write conflicts, so reading from it is always safe.
	// In fact entries in it are updated in place: if an entry is not expunged,
	// it can be updated without the lock; if it has been expunged, the lock is
	// needed so that dirty can be updated as well.
	read atomic.Value // readOnly

	// dirty contains the entries of the current map. It holds the latest
	// entries (including those not deleted in read -- redundant, but it makes
	// promoting dirty to read very fast: instead of copying elements one by
	// one, this map becomes the read field directly), plus entries that have
	// not yet been moved to read.
	// Operations on dirty require the lock, because there may be read-write
	// contention on it.
	// When dirty is nil (after initialization or a promotion), the next write
	// copies the undeleted data in read into dirty.
	dirty map[interface{}]*entry

	// When a Load cannot find the key in read, it tries dirty and increments
	// misses. When misses reaches the length of dirty, dirty is promoted to
	// read, to avoid missing into dirty too often (operating on dirty
	// requires the lock).
	misses int
}
```
Its data structure is simple: just four fields, read, mu, dirty, and misses.
It uses two redundant data structures, read and dirty. dirty contains the entries of read that have not been deleted, and newly added entries go into dirty first.
read's data structure is:
```go
type readOnly struct {
	m       map[interface{}]*entry
	amended bool // true if Map.dirty contains some key not in m.
}
```
amended indicates that Map.dirty contains data not present in readOnly.m, so if a key cannot be found in Map.read, the search must continue into Map.dirty.
Modifications of Map.read are done with atomic operations.
Although read and dirty contain redundant data, the entries are shared through pointers, so even if the Map holds a large number of values, the redundant space occupied is still limited.
The value type stored by readOnly.m and Map.dirty is *entry, which contains a pointer p to the value the user stored.
```go
type entry struct {
	p unsafe.Pointer // *interface{}
}
```
p can hold one of three kinds of value:

- nil: the entry has been deleted and m.dirty is nil
- expunged: the entry has been deleted, m.dirty is not nil, and the entry is absent from m.dirty
- anything else: the entry holds a normal value
This is the data structure of sync.Map. Next, let's focus on Load, Store, Delete and Range. Other auxiliary methods can be understood by referring to these four methods.
Load
Loading method, that is, to provide a key to find the corresponding value, if it does not exist, reflected by ok:
```go
func (m *Map) Load(key interface{}) (value interface{}, ok bool) {
	// 1. First get the readOnly value from m.read and look the key up in its
	//    map, without locking.
	read, _ := m.read.Load().(readOnly)
	e, ok := read.m[key]
	// 2. If it is not found and m.dirty contains new data, look it up in
	//    m.dirty; this requires the lock.
	if !ok && read.amended {
		m.mu.Lock()
		// Double-check: while we were waiting for the lock, m.dirty may have
		// been promoted to m.read, in which case m.read has been replaced.
		read, _ = m.read.Load().(readOnly)
		e, ok = read.m[key]
		// If m.read still has no entry and m.dirty has new data
		if !ok && read.amended {
			// Look it up in m.dirty.
			e, ok = m.dirty[key]
			// Increment the miss count whether or not the key was present in
			// m.dirty; missLocked() promotes m.dirty when the condition is met.
			m.missLocked()
		}
		m.mu.Unlock()
	}
	if !ok {
		return nil, false
	}
	return e.load()
}
```
There are two points worth noting. The first is that the load always starts from m.read; only if the key is absent there and m.dirty has new data does it lock and load from m.dirty.
The second is the use of double-checking, because the following two statements are not one atomic operation:

```go
if !ok && read.amended {
	m.mu.Lock()
```

Although the condition held when the first line executed, m.dirty may have been promoted to m.read before the lock was acquired, so m.read must be checked again after locking. This technique reappears in the methods that follow.
Java programmers are very familiar with double-checking; one implementation of the singleton pattern uses exactly this technique.
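The same double-checking idea can be shown outside of sync.Map. Below is a minimal sketch applied to lazy initialization; `getConfig` and `config` are hypothetical names, not part of any real API:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type config struct{ name string }

var (
	mu       sync.Mutex
	instance atomic.Value // holds *config once initialized
)

// getConfig uses double-checking: a lock-free fast path, then a re-check
// under the lock, mirroring how Load re-reads m.read after locking.
func getConfig() *config {
	// First check, without the lock (fast path).
	if c, ok := instance.Load().(*config); ok {
		return c
	}
	mu.Lock()
	defer mu.Unlock()
	// Second check: another goroutine may have initialized the value between
	// our first check and acquiring the lock.
	if c, ok := instance.Load().(*config); ok {
		return c
	}
	c := &config{name: "default"}
	instance.Store(c)
	return c
}

func main() {
	fmt.Println(getConfig().name) // prints "default"
}
```

Without the second check, two goroutines could both pass the first check and initialize the value twice.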
As you can see, if the key we query happens to exist in m.read, no lock is needed and the value is returned directly, which gives excellent performance. Even if the key only exists in m.dirty at first, after several misses m.dirty is promoted to m.read and subsequent loads hit m.read. So for workloads with few updates/insertions and many loads of existing keys, performance is basically the same as an unlocked map.
Let's look at how m.dirty is promoted: the missLocked method may promote m.dirty.
```go
func (m *Map) missLocked() {
	m.misses++
	if m.misses < len(m.dirty) {
		return
	}
	m.read.Store(readOnly{m: m.dirty})
	m.dirty = nil
	m.misses = 0
}
```
The last three lines perform the promotion. It is very simple: wrap m.dirty as the m field of a readOnly value and store it into m.read atomically. After promotion, m.dirty and m.misses are reset, and m.read.amended is false.
Store
This method is to update or add a new entry.
```go
func (m *Map) Store(key, value interface{}) {
	// If m.read contains this key and the entry is not marked expunged, try
	// to store directly. Because m.dirty also points to the same entry,
	// m.dirty stays up to date as well.
	read, _ := m.read.Load().(readOnly)
	if e, ok := read.m[key]; ok && e.tryStore(&value) {
		return
	}

	// The key is absent from m.read, or its entry has been expunged.
	m.mu.Lock()
	read, _ = m.read.Load().(readOnly)
	if e, ok := read.m[key]; ok {
		if e.unexpungeLocked() { // clear the expunged mark
			// The key was not in m.dirty, so add it there.
			m.dirty[key] = e
		}
		e.storeLocked(&value) // update the value
	} else if e, ok := m.dirty[key]; ok {
		// m.dirty has this key: update it.
		e.storeLocked(&value)
	} else {
		// A brand-new key.
		if !read.amended {
			// First new key for m.dirty: copy the undeleted data from
			// m.read, then mark read as amended.
			m.dirtyLocked()
			m.read.Store(readOnly{m: read.m, amended: true})
		}
		m.dirty[key] = newEntry(value) // add the entry to m.dirty
	}
	m.mu.Unlock()
}

func (m *Map) dirtyLocked() {
	if m.dirty != nil {
		return
	}
	read, _ := m.read.Load().(readOnly)
	m.dirty = make(map[interface{}]*entry, len(read.m))
	for k, e := range read.m {
		if !e.tryExpungeLocked() {
			m.dirty[k] = e
		}
	}
}

func (e *entry) tryExpungeLocked() (isExpunged bool) {
	p := atomic.LoadPointer(&e.p)
	for p == nil {
		// Mark deleted (nil) entries as expunged.
		if atomic.CompareAndSwapPointer(&e.p, nil, expunged) {
			return true
		}
		p = atomic.LoadPointer(&e.p)
	}
	return p == expunged
}
```
As you can see, the operation again starts from m.read; only if the fast path fails does it take the lock and operate on m.dirty.
In some situations (initialization, or just after m.dirty has been promoted), Store must copy the data from m.read, and if m.read holds a large amount of data at that moment, this copy can affect performance.
Delete
Delete a key value.
```go
func (m *Map) Delete(key interface{}) {
	read, _ := m.read.Load().(readOnly)
	e, ok := read.m[key]
	if !ok && read.amended {
		m.mu.Lock()
		read, _ = m.read.Load().(readOnly)
		e, ok = read.m[key]
		if !ok && read.amended {
			delete(m.dirty, key)
		}
		m.mu.Unlock()
	}
	if ok {
		e.delete()
	}
}
```
Similarly, the deletion starts from m.read; if the entry does not exist in m.read and m.dirty contains new data, it takes the lock and tries to delete from m.dirty.
Note that double-checking is still required. Keys only present in m.dirty are deleted directly, as if they never existed; but if the entry exists in m.read, it is not removed directly, only marked:
```go
func (e *entry) delete() (hadValue bool) {
	for {
		p := atomic.LoadPointer(&e.p)
		// Already marked as deleted.
		if p == nil || p == expunged {
			return false
		}
		// Atomically mark e.p as nil.
		if atomic.CompareAndSwapPointer(&e.p, p, nil) {
			return true
		}
	}
}
```
Range
Because `for ... range` over a map is a built-in language construct, there is no way to traverse a sync.Map with for range; instead, its Range method traverses the map through a callback.
```go
func (m *Map) Range(f func(key, value interface{}) bool) {
	read, _ := m.read.Load().(readOnly)
	// If m.dirty contains new data, promote m.dirty first, then iterate.
	if read.amended {
		m.mu.Lock()
		read, _ = m.read.Load().(readOnly) // double-check
		if read.amended {
			// Promote m.dirty.
			read = readOnly{m: m.dirty}
			m.read.Store(read)
			m.dirty = nil
			m.misses = 0
		}
		m.mu.Unlock()
	}
	// Iterating over read.m with for range is safe.
	for k, e := range read.m {
		v, ok := e.load()
		if !ok {
			continue
		}
		if !f(k, v) {
			break
		}
	}
}
```
Calling Range may trigger a promotion of m.dirty first, but the promotion itself is not a time-consuming operation.
Performance of sync.Map
Performance tests are provided in the Go 1.9 source code: map_bench_test.go and map_reference_test.go.
I also modified the code to obtain the following test data. Compared with the earlier solutions, performance improves somewhat; if you pay special attention to performance, you can consider sync.Map.
```
BenchmarkHitAll/*sync.RWMutexMap-4                      20000000    83.8 ns/op
BenchmarkHitAll/*sync.Map-4                             30000000    59.9 ns/op
BenchmarkHitAll_WithoutPrompting/*sync.RWMutexMap-4     20000000    96.9 ns/op
BenchmarkHitAll_WithoutPrompting/*sync.Map-4            20000000    64.1 ns/op
BenchmarkHitNone/*sync.RWMutexMap-4                     20000000    79.1 ns/op
BenchmarkHitNone/*sync.Map-4                            30000000    43.3 ns/op
BenchmarkHit_WithoutPrompting/*sync.RWMutexMap-4        20000000    81.5 ns/op
BenchmarkHit_WithoutPrompting/*sync.Map-4               30000000    44.0 ns/op
BenchmarkUpdate/*sync.RWMutexMap-4                       5000000     328 ns/op
BenchmarkUpdate/*sync.Map-4                             10000000     146 ns/op
BenchmarkUpdate_WithoutPrompting/*sync.RWMutexMap-4      5000000     336 ns/op
BenchmarkUpdate_WithoutPrompting/*sync.Map-4             5000000     324 ns/op
BenchmarkDelete/*sync.RWMutexMap-4                      10000000     155 ns/op
BenchmarkDelete/*sync.Map-4                             30000000    55.0 ns/op
BenchmarkDelete_WithoutPrompting/*sync.RWMutexMap-4     10000000     173 ns/op
BenchmarkDelete_WithoutPrompting/*sync.Map-4            10000000     147 ns/op
```
Other
sync.Map has no Len method, and there is no sign that one will be added (issue #680), so if you want the number of valid entries currently in a Map, you have to traverse it once with the Range method, which is rather painful.
The LoadOrStore method returns the existing value (a Load) if the given key already exists; otherwise it saves the given key-value pair (a Store) and returns the value just stored.