Use and Analysis of [Curator] Path Cache

Keywords: Apache Java

Path Cache

Path Cache is used to watch the children of a zk node. It responds to changes in the state and data of that set of child nodes, whether a child is added, updated, or removed.

1. Key APIs

org.apache.curator.framework.recipes.cache.PathChildrenCache

org.apache.curator.framework.recipes.cache.PathChildrenCacheEvent

org.apache.curator.framework.recipes.cache.PathChildrenCacheListener

org.apache.curator.framework.recipes.cache.ChildData

2. Mechanisms

PathChildrenCache uses a command pattern to encapsulate various operations:

  • Operation interface: org.apache.curator.framework.recipes.cache.Operation
    • Refresh operation: org.apache.curator.framework.recipes.cache.RefreshOperation
    • Event-triggering operation: org.apache.curator.framework.recipes.cache.EventOperation
    • Get-data operation: org.apache.curator.framework.recipes.cache.GetDataOperation

Each operation object accepts the PathChildrenCache reference in its constructor, so that the operation can call back into the cache when it is invoked:

EventOperation(PathChildrenCache cache, PathChildrenCacheEvent event)
{
    this.cache = cache;
    this.event = event;
}
GetDataOperation(PathChildrenCache cache, String fullPath)
{
    this.cache = cache;
    this.fullPath = PathUtils.validatePath(fullPath);
}
RefreshOperation(PathChildrenCache cache, PathChildrenCache.RefreshMode mode)
{
    this.cache = cache;
    this.mode = mode;
}

These operations are invoked through a single-threaded thread pool, which makes the calls asynchronous.

  • private final Set<Operation> operationsQuantizer = Sets.newSetFromMap(Maps.<Operation, Boolean>newConcurrentMap()) acts as the task-receiving queue in front of the thread pool.
    • Using a Set avoids queuing the same operation repeatedly under concurrency
    • Because there is a single thread, the operations are executed in sequence.
  • To avoid blocking Curator's watch mechanism
    • Both childrenWatcher and dataWatcher hand their commands off for asynchronous execution
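
The de-duplicating effect of the Set can be sketched with plain JDK classes. This is a minimal sketch, not Curator's code: RefreshOp and GetDataOp are hypothetical stand-ins, and it assumes that operations define equals() and hashCode() by their parameters, which is what lets the Set recognise a repeated request.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class OperationDedupSketch {
    /** Command interface: a single method that performs the work. */
    interface Operation {
        void invoke() throws Exception;
    }

    /** Hypothetical refresh command; equality depends only on the mode. */
    static final class RefreshOp implements Operation {
        private final String mode;

        RefreshOp(String mode) {
            this.mode = mode;
        }

        @Override
        public void invoke() {
            // A real implementation would re-list the children here.
        }

        @Override
        public boolean equals(Object o) {
            return (o instanceof RefreshOp) && ((RefreshOp) o).mode.equals(mode);
        }

        @Override
        public int hashCode() {
            return mode.hashCode();
        }
    }

    /** Hypothetical get-data command; equality depends only on the path. */
    static final class GetDataOp implements Operation {
        private final String fullPath;

        GetDataOp(String fullPath) {
            this.fullPath = fullPath;
        }

        @Override
        public void invoke() {
            // A real implementation would read the node's data here.
        }

        @Override
        public boolean equals(Object o) {
            return (o instanceof GetDataOp) && ((GetDataOp) o).fullPath.equals(fullPath);
        }

        @Override
        public int hashCode() {
            return fullPath.hashCode();
        }
    }

    /** Offers five commands, two of them duplicates, and counts the accepted ones. */
    static int demo() {
        Set<Operation> pending = ConcurrentHashMap.newKeySet();
        Operation[] offered = {
            new RefreshOp("STANDARD"), new RefreshOp("STANDARD"),
            new GetDataOp("/app/a"), new GetDataOp("/app/a"), new GetDataOp("/app/b")
        };
        int accepted = 0;
        for (Operation op : offered) {
            if (pending.add(op)) { // add() returns false for an equal, already pending command
                accepted++;
            }
        }
        return accepted;
    }
}
```

Only three of the five offered commands are accepted, so a burst of identical watcher callbacks collapses into one queued operation.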

Trigger operation:

void offerOperation(final Operation operation)
{
    if ( operationsQuantizer.add(operation) )
    {
        submitToExecutor
        (
            new Runnable()
            {
                @Override
                public void run()
                {
                    try
                    {
                        operationsQuantizer.remove(operation);
                        operation.invoke();
                    }
                    catch ( InterruptedException e )
                    {
                        //We expect to get interrupted during shutdown,
                        //so just ignore these events
                        if ( state.get() != State.CLOSED )
                        {
                            handleException(e);
                        }
                        Thread.currentThread().interrupt();
                    }
                    catch ( Exception e )
                    {
                        ThreadUtils.checkInterrupted(e);
                        handleException(e);
                    }
                }
            }
        );
    }
}
private synchronized void submitToExecutor(final Runnable command)
{
    if ( state.get() == State.STARTED )
    {
        executorService.submit(command);
    }
}
  • Interruption of an operation is handled
  • The cache state is taken into account
  • Exceptions from all operations are handled in one place
  • The submitting method submitToExecutor is synchronized
    • A watcher can still fire after the cache is closed, so the state has to be checked.
      • If the cache is closed first and a watcher callback arrives afterwards, submitting would cause unnecessary work.
    • The check followed by the submit is not atomic, so a synchronized lock is needed.
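
The combination of a pending set, a single worker thread and a state-gated submit can be tried out with JDK classes only. The following is a minimal sketch under stated assumptions: operations are represented by hypothetical string keys, and close() is made synchronized here as a choice of the sketch so that it cannot interleave with submit().

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

final class GatedExecutorSketch {
    enum State { LATENT, STARTED, CLOSED }

    private final AtomicReference<State> state = new AtomicReference<>(State.LATENT);
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final Set<String> pending = ConcurrentHashMap.newKeySet();
    private final AtomicInteger executed = new AtomicInteger();

    void start() {
        state.compareAndSet(State.LATENT, State.STARTED);
    }

    /** Queues the body unless an operation with the same key is already waiting. */
    void offer(String key, Runnable body) {
        if (pending.add(key)) {
            submit(() -> {
                pending.remove(key); // from here on the same key may be queued again
                body.run();
                executed.incrementAndGet();
            });
        }
    }

    /** State check and submit happen under one lock. */
    private synchronized void submit(Runnable task) {
        if (state.get() == State.STARTED) {
            executor.submit(task);
        }
    }

    /** Synchronized in this sketch so that it cannot run between the check and the submit. */
    synchronized void close() {
        if (state.compareAndSet(State.STARTED, State.CLOSED)) {
            executor.shutdown(); // tasks that are already queued still run
        }
    }

    /** Returns how many operations actually ran. */
    static int demo() {
        GatedExecutorSketch cache = new GatedExecutorSketch();
        cache.start();
        CountDownLatch gate = new CountDownLatch(1);
        // Occupy the single worker so that the following offers stay queued.
        cache.offer("blocker", () -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        for (int i = 0; i < 3; i++) {
            cache.offer("refresh", () -> { }); // only the first one is queued
        }
        gate.countDown();
        cache.close();
        cache.offer("late", () -> { }); // dropped: the state is CLOSED
        try {
            cache.executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return cache.executed.get();
    }
}
```

In demo() only the blocker and one refresh run: the two repeated refresh offers are absorbed by the pending set, and the offer made after close() never reaches the executor.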

3. Usage

3.1 Creation

public PathChildrenCache(CuratorFramework client,
                         String path,
                         boolean cacheData)
  • cacheData
    • If true, the data content of each node is cached in addition to its stat.

3.2 Use

  • The cache must be started by calling start() before it is used
    • There are two start() methods
      1. void start()
        • No parameters
      2. void start(PathChildrenCache.StartMode mode)
        • The parameter selects how the cache is initialized
        • StartMode
          • NORMAL
          • BUILD_INITIAL_CACHE
          • POST_INITIALIZED_EVENT
  • You need to call the close() method when you're done with it
  • At any time, the current state of the cache can be obtained by calling getCurrentData().
  • You can add listeners that are called back when data changes
    • public void addListener(PathChildrenCacheListener listener)

4. Error handling

The PathChildrenCache instance monitors the connection state through a ConnectionStateListener. If the connection state changes, the cache is reset and the PathChildrenCacheListener is notified (section 5.9 shows the CONNECTION_SUSPENDED, CONNECTION_LOST and CONNECTION_RECONNECTED events that are posted).

5. Source code analysis

5.1 Class Definition

public class PathChildrenCache implements Closeable{}
  • Implements the java.io.Closeable interface

5.2 Member Variables

public class PathChildrenCache implements Closeable
{
    private final Logger log = LoggerFactory.getLogger(getClass());
    private final CuratorFramework client;
    private final String path;
    private final CloseableExecutorService executorService;
    private final boolean cacheData;
    private final boolean dataIsCompressed;
    private final ListenerContainer<PathChildrenCacheListener> listeners = new ListenerContainer<PathChildrenCacheListener>();
    private final ConcurrentMap<String, ChildData> currentData = Maps.newConcurrentMap();
    private final AtomicReference<Map<String, ChildData>> initialSet = new AtomicReference<Map<String, ChildData>>();
    private final Set<Operation> operationsQuantizer = Sets.newSetFromMap(Maps.<Operation, Boolean>newConcurrentMap());
    private final AtomicReference<State> state = new AtomicReference<State>(State.LATENT);
    private final EnsureContainers ensureContainers;

    private enum State
    {
        LATENT,
        STARTED,
        CLOSED
    }

    private static final ChildData NULL_CHILD_DATA = new ChildData("/", null, null);

    private static final boolean USE_EXISTS = Boolean.getBoolean("curator-path-children-cache-use-exists");

    private volatile Watcher childrenWatcher = new Watcher()
    {
        @Override
        public void process(WatchedEvent event)
        {
            offerOperation(new RefreshOperation(PathChildrenCache.this, RefreshMode.STANDARD));
        }
    };

    private volatile Watcher dataWatcher = new Watcher()
    {
        @Override
        public void process(WatchedEvent event)
        {
            try
            {
                if ( event.getType() == Event.EventType.NodeDeleted )
                {
                    remove(event.getPath());
                }
                else if ( event.getType() == Event.EventType.NodeDataChanged )
                {
                    offerOperation(new GetDataOperation(PathChildrenCache.this, event.getPath()));
                }
            }
            catch ( Exception e )
            {
                ThreadUtils.checkInterrupted(e);
                handleException(e);
            }
        }
    };

    @VisibleForTesting
    volatile Exchanger<Object> rebuildTestExchanger;

    private volatile ConnectionStateListener connectionStateListener = new ConnectionStateListener()
    {
        @Override
        public void stateChanged(CuratorFramework client, ConnectionState newState)
        {
            handleStateChange(newState);
        }
    };
    private static final ThreadFactory defaultThreadFactory = ThreadUtils.newThreadFactory("PathChildrenCache");
}
  • log
  • client
  • path
    • The zk node path this cache corresponds to
  • executorService
    • org.apache.curator.utils.CloseableExecutorService
    • Thread pool
    • Used to execute the various operations
    • See section 2.
  • cacheData
    • Whether node data should be cached
  • dataIsCompressed
    • Whether the data is compressed
  • listeners
    • org.apache.curator.framework.listen.ListenerContainer
    • Listener container (manages multiple listeners)
    • Business listeners
    • You can add your own listeners
  • currentData
    • java.util.concurrent.ConcurrentMap
    • Current data
    • <String, ChildData>
    • Stores multiple org.apache.curator.framework.recipes.cache.ChildData
  • initialSet
    • AtomicReference
    • Initial set
    • Holds the nodes so that it can track whether each one has been initialized
      • Once all nodes are initialized, the PathChildrenCacheEvent.Type.INITIALIZED event is triggered.
  • operationsQuantizer
    • Acts as the task-receiving queue for the thread pool
  • state
    • Lifecycle state of the cache
    • AtomicReference
  • ensureContainers
    • org.apache.curator.framework.EnsureContainers
    • Creates the path nodes in a thread-safe way
  • State
    • Internal enumeration
      • LATENT
      • STARTED
      • CLOSED
  • NULL_CHILD_DATA
    • Private constant
    • Placeholder data node (stands for a child that has not been synchronized yet)
  • USE_EXISTS
    • Private constant
    • Takes its value from the curator-path-children-cache-use-exists system property
  • childrenWatcher
    • volatile
    • Watcher for child node changes
  • dataWatcher
    • volatile
    • Watcher for data changes
  • rebuildTestExchanger
    • java.util.concurrent.Exchanger
    • Used to pass values between concurrent threads
    • A signal object is passed through it when the cache is rebuilt
    • Used for testing
  • connectionStateListener
    • Connection state listener
  • defaultThreadFactory
    • Thread factory

5.3 Constructor

public PathChildrenCache(CuratorFramework client, String path, PathChildrenCacheMode mode)
{
    this(client, path, mode != PathChildrenCacheMode.CACHE_PATHS_ONLY, false, new CloseableExecutorService(Executors.newSingleThreadExecutor(defaultThreadFactory), true));
}

public PathChildrenCache(CuratorFramework client, String path, PathChildrenCacheMode mode, ThreadFactory threadFactory)
{
    this(client, path, mode != PathChildrenCacheMode.CACHE_PATHS_ONLY, false, new CloseableExecutorService(Executors.newSingleThreadExecutor(threadFactory), true));
}

public PathChildrenCache(CuratorFramework client, String path, boolean cacheData)
{
    this(client, path, cacheData, false, new CloseableExecutorService(Executors.newSingleThreadExecutor(defaultThreadFactory), true));
}

public PathChildrenCache(CuratorFramework client, String path, boolean cacheData, ThreadFactory threadFactory)
{
    this(client, path, cacheData, false, new CloseableExecutorService(Executors.newSingleThreadExecutor(threadFactory), true));
}

public PathChildrenCache(CuratorFramework client, String path, boolean cacheData, boolean dataIsCompressed, ThreadFactory threadFactory)
{
    this(client, path, cacheData, dataIsCompressed, new CloseableExecutorService(Executors.newSingleThreadExecutor(threadFactory), true));
}

public PathChildrenCache(CuratorFramework client, String path, boolean cacheData, boolean dataIsCompressed, final ExecutorService executorService)
{
    this(client, path, cacheData, dataIsCompressed, new CloseableExecutorService(executorService));
}

public PathChildrenCache(CuratorFramework client, String path, boolean cacheData, boolean dataIsCompressed, final CloseableExecutorService executorService)
{
    this.client = client;
    this.path = PathUtils.validatePath(path);
    this.cacheData = cacheData;
    this.dataIsCompressed = dataIsCompressed;
    this.executorService = executorService;
    ensureContainers = new EnsureContainers(client, path);
}

There are seven constructors, all of which end up calling the last one. They also show that:

  • By default, a single-threaded thread pool (newSingleThreadExecutor) is used
  • By default, data is not compressed

5.4 Start-up

The cache must be started with start() before it can be used.

public enum StartMode
{
    NORMAL,
    BUILD_INITIAL_CACHE,
    POST_INITIALIZED_EVENT
}

public void start() throws Exception
{
    start(StartMode.NORMAL);
}

@Deprecated
public void start(boolean buildInitial) throws Exception
{
    start(buildInitial ? StartMode.BUILD_INITIAL_CACHE : StartMode.NORMAL);
}

public void start(StartMode mode) throws Exception
{
    Preconditions.checkState(state.compareAndSet(State.LATENT, State.STARTED), "already started");
    mode = Preconditions.checkNotNull(mode, "mode cannot be null");

    client.getConnectionStateListenable().addListener(connectionStateListener);

    switch ( mode )
    {
        case NORMAL:
        {
            offerOperation(new RefreshOperation(this, RefreshMode.STANDARD));
            break;
        }

        case BUILD_INITIAL_CACHE:
        {
            rebuild();
            break;
        }

        case POST_INITIALIZED_EVENT:
        {
            initialSet.set(Maps.<String, ChildData>newConcurrentMap());
            offerOperation(new RefreshOperation(this, RefreshMode.POST_INITIALIZED));
            break;
        }
    }
}

private void processChildren(List<String> children, RefreshMode mode) throws Exception
{
    Set<String> removedNodes = Sets.newHashSet(currentData.keySet());
    for ( String child : children ) {
        removedNodes.remove(ZKPaths.makePath(path, child));
    }

    for ( String fullPath : removedNodes )
    {
        remove(fullPath);
    }

    for ( String name : children )
    {
        String fullPath = ZKPaths.makePath(path, name);

        if ( (mode == RefreshMode.FORCE_GET_DATA_AND_STAT) || !currentData.containsKey(fullPath) )
        {
            getDataAndStat(fullPath);
        }

        updateInitialSet(name, NULL_CHILD_DATA);
    }
    maybeOfferInitializedEvent(initialSet.get());
}
  • No-argument start()
    • Uses the StartMode.NORMAL strategy by default
  • start(boolean buildInitial) is deprecated
    • true
      • Uses the StartMode.BUILD_INITIAL_CACHE strategy
    • false
      • Uses the StartMode.NORMAL strategy
  • The connection state listener is added at startup

You can see that the startup process has three strategies:

  1. NORMAL mode
    1. Execute the refresh command org.apache.curator.framework.recipes.cache.RefreshOperation (command pattern)
      • RefreshMode.STANDARD refresh mode
      • Calls org.apache.curator.framework.recipes.cache.PathChildrenCache#refresh
        1. Call org.apache.curator.framework.EnsureContainers#ensure to create the path nodes
        2. Set the childrenWatcher watcher on the node
        3. The callback triggers org.apache.curator.framework.recipes.cache.PathChildrenCache#processChildren
          1. Clean up stale nodes from the locally cached data
            1. Pick out the locally cached nodes that are no longer among the children
            2. Remove them from the local cache and from the initial set
          2. If a child is not yet cached locally, or the mode is RefreshMode.FORCE_GET_DATA_AND_STAT
            1. Synchronize the node's data and stat immediately
              1. If data does not need to be cached, only check whether the node exists (cache only the node and its stat, without data)
              2. Otherwise, read the data (decompressing it if required) and build the ChildData entry
                1. Put the new entry into currentData
                2. Trigger events as appropriate (notify listeners)
                  • PathChildrenCacheEvent.Type.CHILD_ADDED event
                  • PathChildrenCacheEvent.Type.CHILD_UPDATED event
                3. Update initialSet (replace the not-yet-synchronized NULL_CHILD_DATA placeholder with the data that was read)
          3. Update initialSet
            1. Only if the Map held by initialSet is not null
              • In NORMAL mode it is null here
              • See POST_INITIALIZED_EVENT mode
  2. BUILD_INITIAL_CACHE mode
    1. Call the rebuild method (which blocks until it completes)
      • Re-queries all required data
      • No events are triggered.
        1. Make sure the path exists
        2. Clear the currentData cache
        3. Reload the child nodes under path and rebuild the cache entries one by one
          • Read each node's data and stat
          • Build a ChildData and put it into currentData
        4. Send the signal object through rebuildTestExchanger
  3. POST_INITIALIZED_EVENT mode
    1. Initialize initialSet
    2. Refresh the cache in RefreshMode.POST_INITIALIZED mode
      • Same as NORMAL mode, except for
        • Updating initialSet
          1. Only if the Map held by initialSet is not null
            • In POST_INITIALIZED_EVENT mode the Map has already been initialized here
          2. If all the data in initialSet has been synchronized (no entry equals NULL_CHILD_DATA)
            1. Clear initialSet
            2. Trigger the PathChildrenCacheEvent.Type.INITIALIZED event
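
The initialSet bookkeeping of the POST_INITIALIZED_EVENT mode can be sketched with JDK classes. This is a simplified model, not the library's implementation: the class and method names are hypothetical, events are recorded as strings, and the demo runs on a single thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

final class InitialSetSketch {
    /** Placeholder for "listed but not loaded yet"; compared by identity. */
    private static final Object NOT_LOADED = new Object();

    /** Holds null unless the cache was started in the post-initialized style. */
    private final AtomicReference<Map<String, Object>> initialSet = new AtomicReference<>();
    private final List<String> events = new ArrayList<>(); // single-threaded demo only

    void startPostInitialized() {
        initialSet.set(new ConcurrentHashMap<>());
    }

    /** A child name was returned by the listing; its data is still outstanding. */
    void childListed(String name) {
        update(name, NOT_LOADED);
    }

    /** The data of a child has arrived. */
    void childLoaded(String name, String data) {
        events.add("CHILD_ADDED " + name);
        update(name, data);
    }

    private void update(String name, Object value) {
        Map<String, Object> local = initialSet.get();
        if (local != null) { // null means: nothing to track
            local.put(name, value);
            maybeFireInitialized(local);
        }
    }

    private void maybeFireInitialized(Map<String, Object> local) {
        for (Object value : local.values()) {
            if (value == NOT_LOADED) {
                return; // at least one child is still outstanding
            }
        }
        // getAndSet(null) makes sure the event is recorded at most once.
        if (initialSet.getAndSet(null) != null) {
            events.add("INITIALIZED");
        }
    }

    static List<String> demo(boolean postInitialized) {
        InitialSetSketch cache = new InitialSetSketch();
        if (postInitialized) {
            cache.startPostInitialized();
        }
        cache.childListed("a");
        cache.childListed("b");
        cache.childLoaded("a", "data-a");
        cache.childLoaded("b", "data-b");
        cache.childLoaded("c", "data-c"); // arrives after initialization
        return cache.events;
    }
}
```

With tracking enabled, the INITIALIZED entry appears exactly once, right after the last listed child has been loaded. Without it, only the CHILD_ADDED entries are recorded.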

5.5 Node Change

start() has already set the childrenWatcher watcher on the path:

private volatile Watcher childrenWatcher = new Watcher()
{
    @Override
    public void process(WatchedEvent event)
    {
        offerOperation(new RefreshOperation(PathChildrenCache.this, RefreshMode.STANDARD));
    }
};
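  • Refreshes the cache in RefreshMode.STANDARD mode
    • The locally cached data is compared with the zk child nodes
    • Only children that are not cached yet are fetched; children that have disappeared are removed
  • Note the operation parameter PathChildrenCache.this
    • Inside the anonymous Watcher, a plain this would refer to the Watcher itself, so the qualified form is needed to pass the cache.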
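
The comparison between the local cache and the current child list, as done in processChildren (section 5.4), can be sketched as two set computations. This is a minimal sketch with hypothetical names. makePath is a simplified stand-in for ZKPaths.makePath, and the force flag plays the role of RefreshMode.FORCE_GET_DATA_AND_STAT.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class ChildrenDiffSketch {
    /** Simplified stand-in for joining a parent path and a child name. */
    static String makePath(String parent, String child) {
        return parent.endsWith("/") ? parent + child : parent + "/" + child;
    }

    /** Cached full paths that are no longer among the children. */
    static Set<String> removed(String parent, Set<String> cachedFullPaths, List<String> children) {
        Set<String> removed = new TreeSet<>(cachedFullPaths);
        for (String child : children) {
            removed.remove(makePath(parent, child));
        }
        return removed;
    }

    /** Full paths whose data has to be fetched: new children, or all of them when forced. */
    static Set<String> toFetch(String parent, Set<String> cachedFullPaths, List<String> children, boolean force) {
        Set<String> fetch = new TreeSet<>();
        for (String child : children) {
            String fullPath = makePath(parent, child);
            if (force || !cachedFullPaths.contains(fullPath)) {
                fetch.add(fullPath);
            }
        }
        return fetch;
    }
}
```

For a cache holding /app/a and /app/b and a fresh child list of b and c, removed() yields /app/a and toFetch() yields /app/c. With force set, /app/b is fetched again as well.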

5.6 Data change

Each time a node's data is fetched (getDataAndStat method), the dataWatcher watcher is set on that node:

private volatile Watcher dataWatcher = new Watcher()
{
    @Override
    public void process(WatchedEvent event)
    {
        try
        {
            if ( event.getType() == Event.EventType.NodeDeleted )
            {
                remove(event.getPath());
            }
            else if ( event.getType() == Event.EventType.NodeDataChanged )
            {
                offerOperation(new GetDataOperation(PathChildrenCache.this, event.getPath()));
            }
        }
        catch ( Exception e )
        {
            ThreadUtils.checkInterrupted(e);
            handleException(e);
        }
    }
};
  • When a node is deleted
    • Remove it from the cache
    • Trigger the PathChildrenCacheEvent.Type.CHILD_REMOVED event
  • When data changes
    • Execute the GetDataOperation operation
      • That is, run the getDataAndStat method again
  • Note the operation parameter PathChildrenCache.this
    • A plain this inside the anonymous Watcher would be the Watcher, not the cache, so the qualified form is used.
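
The difference between this and a qualified Outer.this inside an anonymous class is plain Java and can be checked in isolation. A minimal sketch with hypothetical names:

```java
final class QualifiedThisSketch {
    interface Watcher {
        String describe();
    }

    private final String name;

    QualifiedThisSketch(String name) {
        this.name = name;
    }

    final Watcher watcher = new Watcher() {
        @Override
        public String describe() {
            // Plain `this` is the anonymous Watcher instance.
            boolean thisIsAnonymous = this.getClass().isAnonymousClass();
            // The qualified form reaches the enclosing instance and its fields.
            String outerName = QualifiedThisSketch.this.name;
            return thisIsAnonymous + ":" + outerName;
        }
    };

    static String demo() {
        return new QualifiedThisSketch("cache-1").watcher.describe();
    }
}
```

demo() returns "true:cache-1": the unqualified this is the anonymous object, while the qualified form reaches the enclosing instance that owns the field.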

5.7 Getting Current Data

public List<ChildData> getCurrentData()
{
    return ImmutableList.copyOf(Sets.<ChildData>newTreeSet(currentData.values()));
}

public ChildData getCurrentData(String fullPath)
{
    return currentData.get(fullPath);
}

Both methods read only from the local data; no request is sent to zk.
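
The snapshot behaviour of the list-returning variant can be illustrated with JDK collections. This is a minimal sketch with hypothetical names. It keeps only paths and string payloads, and it uses Collections.unmodifiableList in place of Guava's ImmutableList.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class SnapshotSketch {
    private final ConcurrentMap<String, String> currentData = new ConcurrentHashMap<>();

    void put(String fullPath, String data) {
        currentData.put(fullPath, data);
    }

    void remove(String fullPath) {
        currentData.remove(fullPath);
    }

    /** Sorted, read-only copy of the cached paths at the time of the call. */
    List<String> getCurrentPaths() {
        return Collections.unmodifiableList(new ArrayList<>(new TreeSet<>(currentData.keySet())));
    }

    static String demo() {
        SnapshotSketch cache = new SnapshotSketch();
        cache.put("/app/b", "2");
        cache.put("/app/a", "1");
        List<String> before = cache.getCurrentPaths();
        cache.remove("/app/a"); // later changes do not touch the earlier snapshot
        cache.put("/app/c", "3");
        return before + " " + cache.getCurrentPaths();
    }
}
```

A caller therefore works on a stable, sorted copy and has to call the method again to see later changes.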

5.8 Clearance

5.8.1 Cleaning Cache

public void clear()
{
    currentData.clear();
}
public void clearAndRefresh() throws Exception
{
    currentData.clear();
    offerOperation(new RefreshOperation(this, RefreshMode.STANDARD));
}

clear() empties the local data.

clearAndRefresh() additionally queues a refresh in RefreshMode.STANDARD mode.

5.8.2 Cleaning Cached Data

public void clearDataBytes(String fullPath)
{
    clearDataBytes(fullPath, -1);
}
public boolean clearDataBytes(String fullPath, int ifVersion)
{
    ChildData data = currentData.get(fullPath);
    if ( data != null )
    {
        if ( (ifVersion < 0) || (ifVersion == data.getStat().getVersion()) )
        {
            if ( data.getData() != null )
            {
                currentData.replace(fullPath, data, new ChildData(data.getPath(), data.getStat(), null));
            }
            return true;
        }
    }
    return false;
}

The cache entry (path and stat) is kept, but its data bytes are set to null.
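
The version check and the conditional replace can be reproduced with a ConcurrentMap alone. This is a minimal sketch with hypothetical names. Entry is a reduced stand-in for ChildData that carries only a version and the data bytes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class ClearDataBytesSketch {
    /** Reduced stand-in for a cache entry: version plus optional data bytes. */
    static final class Entry {
        final int version;
        final byte[] data;

        Entry(int version, byte[] data) {
            this.version = version;
            this.data = data;
        }
    }

    private final ConcurrentMap<String, Entry> currentData = new ConcurrentHashMap<>();

    void put(String fullPath, int version, byte[] data) {
        currentData.put(fullPath, new Entry(version, data));
    }

    Entry get(String fullPath) {
        return currentData.get(fullPath);
    }

    /** Drops the bytes when ifVersion is negative or matches the cached version. */
    boolean clearDataBytes(String fullPath, int ifVersion) {
        Entry entry = currentData.get(fullPath);
        if (entry != null && (ifVersion < 0 || ifVersion == entry.version)) {
            if (entry.data != null) {
                // Replaces only if the map still holds the entry that was just read.
                currentData.replace(fullPath, entry, new Entry(entry.version, null));
            }
            return true;
        }
        return false;
    }
}
```

The three-argument replace means that an entry refreshed concurrently between the get and the replace is left untouched instead of being overwritten with stale metadata.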

5.9 Link State Change

The connectionStateListener is registered on the connection at startup (in start()):

private volatile ConnectionStateListener connectionStateListener = new ConnectionStateListener()
{
    @Override
    public void stateChanged(CuratorFramework client, ConnectionState newState)
    {
        handleStateChange(newState);
    }
};

private void handleStateChange(ConnectionState newState)
{
    switch ( newState )
    {
    case SUSPENDED:
    {
        offerOperation(new EventOperation(this, new PathChildrenCacheEvent(PathChildrenCacheEvent.Type.CONNECTION_SUSPENDED, null)));
        break;
    }

    case LOST:
    {
        offerOperation(new EventOperation(this, new PathChildrenCacheEvent(PathChildrenCacheEvent.Type.CONNECTION_LOST, null)));
        break;
    }

    case CONNECTED:
    case RECONNECTED:
    {
        try
        {
            offerOperation(new RefreshOperation(this, RefreshMode.FORCE_GET_DATA_AND_STAT));
            offerOperation(new EventOperation(this, new PathChildrenCacheEvent(PathChildrenCacheEvent.Type.CONNECTION_RECONNECTED, null)));
        }
        catch ( Exception e )
        {
            ThreadUtils.checkInterrupted(e);
            handleException(e);
        }
        break;
    }
    }
}

Depending on the connection state, different operations are triggered, which in turn invoke the business listeners.

  • Because the data is cached, a suspended or lost connection only triggers an event; the cached data stays available.
  • When the connection is established or restored (CONNECTED, RECONNECTED), a refresh in RefreshMode.FORCE_GET_DATA_AND_STAT mode is triggered, followed by a CONNECTION_RECONNECTED event.
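
The mapping from connection state to queued operations can be written down as a small table-like function. This is a minimal sketch: the enum is a local stand-in for Curator's ConnectionState, and the queued operations are represented as descriptive strings rather than real Operation objects.

```java
import java.util.ArrayList;
import java.util.List;

final class ConnectionStateSketch {
    /** Local stand-in for the connection states a listener may receive. */
    enum ConnectionState { CONNECTED, SUSPENDED, RECONNECTED, LOST, READ_ONLY }

    /** Operations that would be queued for a given state, in order. */
    static List<String> operationsFor(ConnectionState newState) {
        List<String> operations = new ArrayList<>();
        switch (newState) {
            case SUSPENDED:
                operations.add("EVENT CONNECTION_SUSPENDED");
                break;
            case LOST:
                operations.add("EVENT CONNECTION_LOST");
                break;
            case CONNECTED:
            case RECONNECTED:
                // Refresh first, then tell the listeners about the reconnect.
                operations.add("REFRESH FORCE_GET_DATA_AND_STAT");
                operations.add("EVENT CONNECTION_RECONNECTED");
                break;
            default:
                break; // other states queue nothing
        }
        return operations;
    }
}
```

Note that CONNECTED and RECONNECTED share one branch, so listeners see a CONNECTION_RECONNECTED event for the first connection as well.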

5.10 Close

After use, you need to call the close() method:

public void close() throws IOException
{
    if ( state.compareAndSet(State.STARTED, State.CLOSED) )
    {
        client.getConnectionStateListenable().removeListener(connectionStateListener);
        listeners.clear();
        executorService.close();
        client.clearWatcherReferences(childrenWatcher);
        client.clearWatcherReferences(dataWatcher);

        // TODO
        // This seems to enable even more GC - I'm not sure why yet - it
        // has something to do with Guava's cache and circular references
        connectionStateListener = null;
        childrenWatcher = null;
        dataWatcher = null;
    }
}
  • Atomically updates the state to CLOSED
  • Removes the connection state listener
  • Clears the business listeners
  • Closes the thread pool
  • Clears the children watcher reference
  • Clears the data watcher reference
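
The LATENT, STARTED, CLOSED life cycle guarded by compareAndSet can be sketched on its own. This is a minimal sketch with hypothetical names. It models only the state transitions, not the listener or executor cleanup.

```java
import java.util.concurrent.atomic.AtomicReference;

final class LifecycleSketch {
    enum State { LATENT, STARTED, CLOSED }

    private final AtomicReference<State> state = new AtomicReference<>(State.LATENT);

    /** Succeeds exactly once, and only from LATENT. */
    void start() {
        if (!state.compareAndSet(State.LATENT, State.STARTED)) {
            throw new IllegalStateException("already started");
        }
    }

    /** Returns true only for the call that performs the STARTED to CLOSED transition. */
    boolean close() {
        return state.compareAndSet(State.STARTED, State.CLOSED);
    }

    State state() {
        return state.get();
    }

    static String demo() {
        LifecycleSketch cache = new LifecycleSketch();
        StringBuilder log = new StringBuilder();
        log.append(cache.close()); // never started: nothing to close
        cache.start();
        try {
            cache.start();
            log.append(",accepted");
        } catch (IllegalStateException e) {
            log.append(",rejected"); // a second start is refused
        }
        log.append(',').append(cache.close());
        log.append(',').append(cache.close()); // a repeated close is a no-op
        log.append(',').append(cache.state());
        return log.toString();
    }
}
```

demo() returns "false,rejected,true,false,CLOSED". Because CLOSED is not LATENT, an instance cannot be restarted after close() in this model.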

6. Summary

PathChildrenCache has Cache in its name, but it is not a complete cache.

It is better described as unified management of the nodes under a path. When these nodes or their data change, PathChildrenCache notices and synchronizes the change into a local Map, which is what gives it cache-like behaviour.

As the API shows, it can only read data. Putting entries into the cache has to be implemented separately.

  • In fact this is simple: create a new node directly under the path and write the data to it.

You can control the cache more finely by adding a custom listener with getListenable().addListener(listener).

7. Examples

For examples, refer to the official Curator examples.

Posted by JohnResig on Sun, 23 Jun 2019 14:33:50 -0700