An article to understand ThreadLocal

Keywords: Java

How does ThreadLocal ensure that objects are only accessed by the current thread?

Let's dig into the internal implementation of ThreadLocal.

Naturally, we need to focus on the set() method and get() method of ThreadLocal.

set

Let's start with the set() method:

    /**
     * Sets the current thread's copy of this thread-local variable
     * to the specified value.  Most subclasses will have no need to
     * override this method, relying solely on the {@link #initialValue}
     * method to set the values of thread-locals.
     *
     * @param value the value to be stored in the current thread's copy of
     *        this thread-local.
     */
    public void set(T value) {
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null)
            map.set(this, value);
        else
            createMap(t, value);
    }

When setting a value, the method first obtains the current thread object, then gets that thread's ThreadLocalMap through getMap(), and stores the value in it. If the thread has no map yet, createMap() creates one that holds this first entry.

ThreadLocalMap can be understood as a Map (it does not implement the Map interface, but you can think of it as a simple HashMap). The important point is that each thread's map is a member field of Thread.

Note that the following definition is taken from the Thread class:

    /* ThreadLocal values pertaining to this thread. This map is maintained
     * by the ThreadLocal class. */
    ThreadLocal.ThreadLocalMap threadLocals = null;

The data passed to ThreadLocal.set() is therefore written into the threadLocals map of the current thread.

The key is the current ThreadLocal object, and the value is the value we want to store.

threadLocals itself holds all the "local variables" of the current thread, that is, one entry for every ThreadLocal variable the thread has set.
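
To make the per-thread storage concrete, here is a minimal sketch. The class name ThreadLocalIsolationDemo, the method runWorkers and the labels are made up for illustration; only ThreadLocal and Thread are JDK APIs. Each worker stores its own label in the same ThreadLocal and reads back only what it stored itself.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; the names are illustrative and not part of the JDK.
class ThreadLocalIsolationDemo {
    // One ThreadLocal instance shared by every thread; it serves as the map key.
    private static final ThreadLocal<String> NAME = new ThreadLocal<>();

    // Starts one thread per label. Each thread stores its label and reads it back.
    static Map<String, String> runWorkers(String... labels) {
        Map<String, String> seen = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[labels.length];
        for (int i = 0; i < labels.length; i++) {
            String label = labels[i];
            workers[i] = new Thread(() -> {
                NAME.set(label);              // stored in this thread's own threadLocals map
                seen.put(label, NAME.get());  // read back from the same thread's map
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(runWorkers("worker-a", "worker-b"));
        // The main thread never called set(), so its own copy is still empty.
        System.out.println("main sees: " + NAME.get()); // prints "main sees: null"
    }
}
```

Every worker maps its label to the same label, because no thread can see what another thread stored under the shared key.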

get

The get() method, naturally, reads the data back out of that Map.

    /**
     * Returns the value in the current thread's copy of this
     * thread-local variable.  If the variable has no value for the
     * current thread, it is first initialized to the value returned
     * by an invocation of the {@link #initialValue} method.
     *
     * @return the current thread's value of this thread-local
     */
    public T get() {
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null) {
            ThreadLocalMap.Entry e = map.getEntry(this);
            if (e != null) {
                @SuppressWarnings("unchecked")
                T result = (T)e.value;
                return result;
            }
        }
        return setInitialValue();
    }

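The get() method first obtains the ThreadLocalMap of the current thread, and then looks up the actual data using itself as the key. If the thread has no map yet, or the map has no entry for this ThreadLocal, setInitialValue() stores and returns the result of initialValue().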
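
The fallback to the initial value can be observed with a small sketch. It assumes Java 8 or later, where ThreadLocal.withInitial() is available as a shorthand for overriding initialValue(). The class and method names are hypothetical.

```java
import java.text.SimpleDateFormat;

// Hypothetical demo class; only ThreadLocal, Thread and SimpleDateFormat are JDK APIs.
class ThreadLocalInitialValueDemo {
    // The supplier runs the first time a given thread calls get().
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

    // Two get() calls on the same thread return the same cached instance.
    static boolean sameInstanceOnRepeatedGet() {
        return FORMAT.get() == FORMAT.get();
    }

    // A second thread gets its own instance, created by its own first get().
    static boolean otherThreadGetsOwnInstance() {
        SimpleDateFormat mine = FORMAT.get();
        SimpleDateFormat[] theirs = new SimpleDateFormat[1];
        Thread other = new Thread(() -> theirs[0] = FORMAT.get());
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return theirs[0] != null && theirs[0] != mine;
    }

    public static void main(String[] args) {
        System.out.println(sameInstanceOnRepeatedGet());   // prints "true"
        System.out.println(otherThreadGetsOwnInstance());  // prints "true"
    }
}
```

With an initial value in place, callers no longer need an explicit "if (tl.get() == null) tl.set(...)" check before each use.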

Thread.exit()

Having understood the internal implementation of ThreadLocal, a question naturally follows:

These variables are kept inside the Thread object (in the threadLocals field that holds the ThreadLocalMap), which means that as long as the Thread does not exit, the references to the stored objects remain.

When the Thread exits, the Thread class will do some cleaning work, including cleaning up ThreadLocalMap.

    /**
     * This method is called by the system to give a Thread
     * a chance to clean up before it actually exits.
     */
    private void exit() {
        if (group != null) {
            group.threadTerminated(this);
            group = null;
        }
        /* Aggressively null out all reference fields: see bug 4006245 */
        target = null;
        /* Speed the release of some of these resources */
        threadLocals = null;
        inheritableThreadLocals = null;
        inheritedAccessControlContext = null;
        blocker = null;
        uncaughtExceptionHandler = null;
    }

With a thread pool, however, a thread may never exit (in a fixed-size pool, for example, the threads live as long as the pool does).

In that case, putting large objects into a ThreadLocal (they are actually saved in the ThreadLocalMap held by the thread) may cause a memory leak.

That is: you set an object into a ThreadLocal but never clean it up. After a few uses the object is no longer needed, yet it cannot be reclaimed because the thread still references it.

To have the object reclaimed promptly, call ThreadLocal.remove() when you are done with it, just as you would habitually close a database connection.

If you really no longer need the object, removing it lets the virtual machine reclaim it and prevents the leak.
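
The cleanup habit can be sketched as a try/finally around the work that uses the thread-local value. This is an illustrative example rather than code from the JDK: the class ThreadLocalRemoveDemo, the BUFFER field and the cleanUp flag are all hypothetical. A single-thread pool is used so that both tasks are guaranteed to run on the same pooled thread.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class; BUFFER and cleanUp exist only for this illustration.
class ThreadLocalRemoveDemo {
    private static final ThreadLocal<byte[]> BUFFER = new ThreadLocal<>();

    // Runs two tasks on one pooled thread and reports whether the second task
    // still finds the value that the first task stored.
    static boolean secondTaskSeesValue(boolean cleanUp) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            pool.submit(() -> {
                BUFFER.set(new byte[1024]);
                try {
                    // ... work that uses BUFFER.get() would go here ...
                } finally {
                    if (cleanUp) {
                        BUFFER.remove(); // drop this thread's entry once the task is done
                    }
                }
            }).get();
            // The pooled thread is still alive, so anything left behind is visible here.
            return pool.submit(() -> BUFFER.get() != null).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTaskSeesValue(false)); // prints "true": value left behind
        System.out.println(secondTaskSeesValue(true));  // prints "false": value was removed
    }
}
```

Without remove(), the buffer stays reachable from the pooled thread for as long as that thread lives, which is exactly the leak described above.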

tl = null

Another interesting case is that the JDK also lets you release a ThreadLocal much like an ordinary variable.

For example, we sometimes write code like obj = null to help garbage collection.

Once no strong reference remains, the object that obj pointed to becomes eligible for collection.

Similarly, if we manually set the ThreadLocal variable to null, such as tl = null, the per-thread values that every thread holds for this ThreadLocal may be reclaimed as well.

What's the secret?

Let's start with a simple example.

package com.shockang.study.java.concurrent.thread_local;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalDemo_Gc {
    static volatile ThreadLocal<SimpleDateFormat> tl = new ThreadLocal<SimpleDateFormat>() {
        protected void finalize() throws Throwable {
            System.out.println(this.toString() + " is gc");
        }
    };
    static volatile CountDownLatch cd = new CountDownLatch(10000);

    public static class ParseDate implements Runnable {
        int i = 0;

        public ParseDate(int i) {
            this.i = i;
        }

        public void run() {
            try {
                if (tl.get() == null) {
                    tl.set(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss") {
                        protected void finalize() throws Throwable {
                            System.out.println(this.toString() + " is gc");
                        }
                    });
                    System.out.println(Thread.currentThread().getId() + ":create SimpleDateFormat");
                }
                Date t = tl.get().parse("2015-03-29 19:29:" + i % 60);
            } catch (ParseException e) {
                e.printStackTrace();
            } finally {
                cd.countDown();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService es = Executors.newFixedThreadPool(10);
        for (int i = 0; i < 10000; i++) {
            es.execute(new ParseDate(i));
        }
        cd.await();
        System.out.println("mission complete!!");
        tl = null;
        System.gc();
        System.out.println("first GC complete!!");
        // Setting a value on a new ThreadLocal gives ThreadLocalMap a chance to clear stale entries
        tl = new ThreadLocal<SimpleDateFormat>();
        cd = new CountDownLatch(10000);
        for (int i = 0; i < 10000; i++) {
            es.execute(new ParseDate(i));
        }
        cd.await();
        Thread.sleep(1000);

        System.gc();
        System.out.println("second GC complete!!");

    }
}

The example above tracks the garbage collection of the ThreadLocal object and of the SimpleDateFormat objects stored in it.

To do this, we override the finalize() method.

In this way, we can see a trace whenever one of these objects is reclaimed.

In the main function, tasks are submitted in two batches of 10000 tasks each.

After the first batch, we set tl to null and run a GC.

Then we submit the second batch and run a GC again.

Running the code above most likely produces output like the following.

19:create SimpleDateFormat
15:create SimpleDateFormat
17:create SimpleDateFormat
18:create SimpleDateFormat
20:create SimpleDateFormat
14:create SimpleDateFormat
11:create SimpleDateFormat
12:create SimpleDateFormat
13:create SimpleDateFormat
16:create SimpleDateFormat
mission complete!!
first GC complete!!
com.shockang.study.java.concurrent.thread_local.ThreadLocalDemo_Gc$1@5041865d is gc
11:create SimpleDateFormat
14:create SimpleDateFormat
20:create SimpleDateFormat
12:create SimpleDateFormat
16:create SimpleDateFormat
13:create SimpleDateFormat
18:create SimpleDateFormat
15:create SimpleDateFormat
17:create SimpleDateFormat
19:create SimpleDateFormat
second GC complete!!

Note what this output represents.

First, each of the 10 threads in the thread pool creates one SimpleDateFormat instance.

After the first GC, you can see that the ThreadLocal object is reclaimed (an anonymous class is used here, so the class name looks a little strange; this is the tl object created at the beginning).

The second batch of tasks is then submitted. Ten SimpleDateFormat objects are created again, followed by the second GC.

After the second GC, the 10 SimpleDateFormat subclass instances created during the first batch are all reclaimed.

Although we never removed these objects manually, the system is still able to reclaim them.

ThreadLocal.ThreadLocalMap

To understand the above recycling mechanism, we need to further understand the implementation of ThreadLocal.ThreadLocalMap.

As mentioned earlier, ThreadLocalMap is similar to a HashMap.

More precisely, it is closer to a WeakHashMap.

The implementation of ThreadLocalMap uses weak references.

A weak reference is much weaker than a strong reference.

During garbage collection, an object that is reachable only through weak references is reclaimed, and the weak references to it are cleared.

Internally, ThreadLocalMap is made up of a series of Entry objects, each of which is a WeakReference<ThreadLocal<?>>.

        /**
         * The entries in this hash map extend WeakReference, using
         * its main ref field as the key (which is always a
         * ThreadLocal object).  Note that null keys (i.e. entry.get()
         * == null) mean that the key is no longer referenced, so the
         * entry can be expunged from table.  Such entries are referred to
         * as "stale entries" in the code that follows.
         */
        static class Entry extends WeakReference<ThreadLocal<?>> {
            /** The value associated with this ThreadLocal. */
            Object value;

            Entry(ThreadLocal<?> k, Object v) {
                super(k);
                value = v;
            }
        }

Here, the parameter k is the key of the Map and v is the value. k is a ThreadLocal instance and is held only through a weak reference, while v is held through an ordinary strong field.

super(k) calls the WeakReference constructor.

Therefore, although the ThreadLocal is used as the key of the Map, the Map does not hold a strong reference to it.

Once the last external strong reference to the ThreadLocal is gone and a GC runs, the key in ThreadLocalMap becomes null.

When ThreadLocalMap is cleaned up (adding a new variable to the table, for example, triggers a cleanup; the JDK may not perform a full scan every time, but it evidently worked in this case), these stale entries and the values they hold are reclaimed.
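
The stale-entry idea can be illustrated with a toy model. This is a simplified sketch and not the JDK implementation: the real ThreadLocalMap is an open-addressing hash table, whereas the hypothetical StaleEntryDemo class below uses a plain list. To keep the example deterministic, WeakReference.clear() stands in for the garbage collector clearing the weak key.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Toy model only; all names here are hypothetical and not part of the JDK.
class StaleEntryDemo {
    // Same shape as ThreadLocalMap.Entry: weakly held key, strongly held value.
    static final class Entry extends WeakReference<Object> {
        final Object value;

        Entry(Object key, Object value) {
            super(key);
            this.value = value;
        }
    }

    private final List<Entry> table = new ArrayList<>();

    // Adding an entry is also the moment at which stale entries are dropped.
    void put(Object key, Object value) {
        table.removeIf(e -> e.get() == null || e.get() == key);
        table.add(new Entry(key, value));
    }

    // Stands in for the GC: clears the weak key, as happens once the key is unreachable.
    void simulateKeyCollected(Object key) {
        for (Entry e : table) {
            if (e.get() == key) {
                e.clear();
            }
        }
    }

    // Values still strongly held by the table, including those of stale entries.
    List<Object> values() {
        List<Object> result = new ArrayList<>();
        for (Entry e : table) {
            result.add(e.value);
        }
        return result;
    }

    public static void main(String[] args) {
        StaleEntryDemo map = new StaleEntryDemo();
        Object keyA = new Object();
        map.put(keyA, "value A");
        map.simulateKeyCollected(keyA);
        System.out.println(map.values()); // prints "[value A]": key is gone, value is still held
        map.put(new Object(), "value B");
        System.out.println(map.values()); // prints "[value B]": the stale entry was expunged
    }
}
```

The first printout shows why a cleared key alone does not free the value: the entry still holds it strongly until some later operation on the map expunges the stale entry.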

Recycling mechanism of ThreadLocal

The recycling mechanism of ThreadLocal is shown in the figure.

Posted by MickeySox on Tue, 16 Nov 2021 04:03:40 -0800