Shared Lock
Similar to Shared Reentrant Lock, but not reentrant.
> A complete distributed lock guarantees that there are never two holders of the same lock at the same point in time. That is, for the same lock, there is at most one holder at any given moment.
1. Key APIs
org.apache.curator.framework.recipes.locks.InterProcessSemaphoreMutex
2. Mechanisms
Here, too, the word Lock does not appear in the class name.
The name suggests an inter-process mutual exclusion built on a semaphore.
Shared Lock is in fact a Shared Reentrant Lock with customized lease management.
3. Usage
> Usage is the same as for Shared Reentrant Lock, so it is not repeated here.
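For reference, a minimal usage sketch is shown below (the connection string, retry settings, and lock path are placeholders, not values from this article):

```java
import java.util.concurrent.TimeUnit;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessSemaphoreMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class SharedLockUsageSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection string and retry policy.
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();
        try {
            InterProcessSemaphoreMutex lock =
                    new InterProcessSemaphoreMutex(client, "/examples/shared-lock");
            if (lock.acquire(5, TimeUnit.SECONDS)) {
                try {
                    // Critical section: at most one holder across all processes.
                } finally {
                    lock.release();
                }
            }
        } finally {
            client.close();
        }
    }
}
```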
4. Error handling
Error handling is also the same as for Shared Reentrant Lock, so it is not repeated here.
5. Source code analysis
5.1 Class Definition
Let's first look at the class definition:
```java
public class InterProcessSemaphoreMutex implements InterProcessLock {}
```
- Only the org.apache.curator.framework.recipes.locks.InterProcessLock interface is implemented
- InterProcessLock defines the lock API (its shape is sketched below):
    - acquire
    - release
    - isAcquiredInThisProcess
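For reference, the InterProcessLock interface is roughly the following shape (an abbreviated sketch, not the full Curator source):

```java
import java.util.concurrent.TimeUnit;

public interface InterProcessLock {
    void acquire() throws Exception;                             // block until acquired
    boolean acquire(long time, TimeUnit unit) throws Exception;  // bounded wait
    void release() throws Exception;
    boolean isAcquiredInThisProcess();
}
```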
5.2 Member Variables
```java
public class InterProcessSemaphoreMutex implements InterProcessLock
{
    private final InterProcessSemaphoreV2 semaphore;
    private volatile Lease lease;
}
```
- semaphore
    - final
    - Type: org.apache.curator.framework.recipes.locks.InterProcessSemaphoreV2
    - The inter-process semaphore that manages the lock's lease
- lease
    - volatile
    - Type: org.apache.curator.framework.recipes.locks.Lease
    - The lease currently held, obtained from the semaphore
5.2.1 InterProcessSemaphoreV2
InterProcessSemaphoreMutex's internal logic relies heavily on InterProcessSemaphoreV2, so it is worth looking at that class first:
```java
public class InterProcessSemaphoreV2
{
    private final Logger log = LoggerFactory.getLogger(getClass());
    private final InterProcessMutex lock;
    private final CuratorFramework client;
    private final String leasesPath;
    private final Watcher watcher = new Watcher()
    {
        @Override
        public void process(WatchedEvent event)
        {
            notifyFromWatcher();
        }
    };

    private volatile byte[] nodeData;
    private volatile int maxLeases;

    private static final String LOCK_PARENT = "locks";
    private static final String LEASE_PARENT = "leases";
    private static final String LEASE_BASE_NAME = "lease-";

    public static final Set<String> LOCK_SCHEMA = Sets.newHashSet(
            LOCK_PARENT,
            LEASE_PARENT
    );
}
```
- log
    - Logger
- lock
    - final
    - Type: org.apache.curator.framework.recipes.locks.InterProcessMutex (the internal distributed lock)
- client
    - final
    - The Curator ZK client
- leasesPath
    - final
    - The zk node path under which lease nodes are created
- watcher
    - A Watcher on the lease nodes
- nodeData
    - volatile
    - The data written into each lease node
- maxLeases
    - volatile
    - The maximum number of leases
- LOCK_PARENT
    - Private constant
- LEASE_PARENT
    - Private constant
- LEASE_BASE_NAME
    - Private constant
- LOCK_SCHEMA
    - Public constant
You can see that InterProcessSemaphoreV2 contains an InterProcessMutex (a Shared Reentrant Lock).
This is exactly why ==Shared Lock== is really a ==Shared Reentrant Lock== with customized lease management.
5.3 Constructor
Only one:
```java
public InterProcessSemaphoreMutex(CuratorFramework client, String path)
{
    this.semaphore = new InterProcessSemaphoreV2(client, path, 1);
}
```
It simply initializes an InterProcessSemaphoreV2 with a maximum of 1 lease, without using org.apache.curator.framework.recipes.shared.SharedCountReader.
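In other words, the mutex is conceptually just a one-permit semaphore. Here is an illustrative sketch of that equivalence (the path is a placeholder and the client is assumed to be a started CuratorFramework):

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.recipes.locks.InterProcessSemaphoreV2;
import org.apache.curator.framework.recipes.locks.Lease;

public class OnePermitSketch {
    // Illustration only: taking the single lease of a one-permit semaphore is
    // roughly what InterProcessSemaphoreMutex#acquire does internally.
    static void doWork(CuratorFramework client) throws Exception {
        InterProcessSemaphoreV2 semaphore =
                new InterProcessSemaphoreV2(client, "/examples/shared-lock", 1);
        Lease lease = semaphore.acquire();   // blocks until the single lease is free
        try {
            // critical section
        } finally {
            lease.close();                   // roughly what release() does
        }
    }
}
```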
5.3.1 InterProcessSemaphoreV2
```java
public InterProcessSemaphoreV2(CuratorFramework client, String path, int maxLeases)
{
    this(client, path, maxLeases, null);
}

public InterProcessSemaphoreV2(CuratorFramework client, String path, SharedCountReader count)
{
    this(client, path, 0, count);
}

private InterProcessSemaphoreV2(CuratorFramework client, String path, int maxLeases, SharedCountReader count)
{
    this.client = client;
    path = PathUtils.validatePath(path);
    lock = new InterProcessMutex(client, ZKPaths.makePath(path, LOCK_PARENT));
    this.maxLeases = (count != null) ? count.getCount() : maxLeases;
    leasesPath = ZKPaths.makePath(path, LEASE_PARENT);

    if ( count != null )
    {
        count.addListener
        (
            new SharedCountListener()
            {
                @Override
                public void countHasChanged(SharedCountReader sharedCount, int newCount) throws Exception
                {
                    InterProcessSemaphoreV2.this.maxLeases = newCount;
                    notifyFromWatcher();
                }

                @Override
                public void stateChanged(CuratorFramework client, ConnectionState newState)
                {
                    // no need to handle this here - clients should set their own connection state listener
                }
            }
        );
    }
}
```
- Initializes the member variables
- Initializes the internal distributed lock
- If the SharedCountReader mode is used, a count listener is registered so that maxLeases can change at runtime (see the sketch below)
- Shared Lock uses the fixed-maxLeases mode, so no listener is added here
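Shared Lock never takes that branch, but for context, a sketch of the SharedCountReader mode is shown below (the paths and seed value are made up; the client is assumed to be started):

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.recipes.locks.InterProcessSemaphoreV2;
import org.apache.curator.framework.recipes.locks.Lease;
import org.apache.curator.framework.recipes.shared.SharedCount;

public class DynamicMaxLeasesSketch {
    static void useDynamicLeases(CuratorFramework client) throws Exception {
        // The shared counter holds maxLeases; changing its value at runtime is
        // picked up by the listener registered in the private constructor above.
        SharedCount maxLeases = new SharedCount(client, "/examples/semaphore-count", 3);
        maxLeases.start();
        try {
            InterProcessSemaphoreV2 semaphore =
                    new InterProcessSemaphoreV2(client, "/examples/semaphore", maxLeases);
            Lease lease = semaphore.acquire();
            try {
                // up to "count" holders at the same time
            } finally {
                lease.close();
            }
        } finally {
            maxLeases.close();
        }
    }
}
```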
5.4 Locking
As can be seen from section 3.2, locking is done through the acquire methods, so let's look at how they work.
```java
public void acquire() throws Exception
{
    lease = semaphore.acquire();
}

public boolean acquire(long time, TimeUnit unit) throws Exception
{
    Lease acquiredLease = semaphore.acquire(time, unit);
    if ( acquiredLease == null )
    {
        return false; // important - don't overwrite lease field if couldn't be acquired
    }
    lease = acquiredLease;
    return true;
}
```
The logic is simple: acquiring the lock is essentially acquiring a lease from the semaphore.

- As the constructor showed, the semaphore has at most 1 lease, which naturally makes this a non-reentrant lock
- All of the real logic lives in the semaphore
5.4.1 InterProcessSemaphoreV2
Let's look at how a lease is acquired from this semaphore:
```java
public Lease acquire() throws Exception
{
    Collection<Lease> leases = acquire(1, 0, null);
    return leases.iterator().next();
}

public Collection<Lease> acquire(int qty) throws Exception
{
    return acquire(qty, 0, null);
}

public Lease acquire(long time, TimeUnit unit) throws Exception
{
    Collection<Lease> leases = acquire(1, time, unit);
    return (leases != null) ? leases.iterator().next() : null;
}

public Collection<Lease> acquire(int qty, long time, TimeUnit unit) throws Exception
{
    long startMs = System.currentTimeMillis();
    boolean hasWait = (unit != null);
    long waitMs = hasWait ? TimeUnit.MILLISECONDS.convert(time, unit) : 0;

    Preconditions.checkArgument(qty > 0, "qty cannot be 0");

    ImmutableList.Builder<Lease> builder = ImmutableList.builder();
    boolean success = false;
    try
    {
        while ( qty-- > 0 )
        {
            int retryCount = 0;
            long startMillis = System.currentTimeMillis();
            boolean isDone = false;
            while ( !isDone )
            {
                switch ( internalAcquire1Lease(builder, startMs, hasWait, waitMs) )
                {
                    case CONTINUE:
                    {
                        isDone = true;
                        break;
                    }

                    case RETURN_NULL:
                    {
                        return null;
                    }

                    case RETRY_DUE_TO_MISSING_NODE:
                    {
                        // gets thrown by internalAcquire1Lease when it can't find the lock node
                        // this can happen when the session expires, etc. So, if the retry allows, just try it all again
                        if ( !client.getZookeeperClient().getRetryPolicy().allowRetry(retryCount++, System.currentTimeMillis() - startMillis, RetryLoop.getDefaultRetrySleeper()) )
                        {
                            throw new KeeperException.NoNodeException("Sequential path not found - possible session loss");
                        }
                        // try again
                        break;
                    }
                }
            }
        }
        success = true;
    }
    finally
    {
        if ( !success )
        {
            returnAll(builder.build());
        }
    }

    return builder.build();
}
```
As you can see, InterProcessSemaphoreV2 has four acquire methods. Essentially all of the logic lives in the last one, acquire(int qty, long time, TimeUnit unit); the other three simply delegate to it. So let's focus on that method.
Let's first look at what the javadoc says about this method:
> Acquire qty leases. If there are not enough leases available, this method blocks until either the maximum number of leases is increased enough or other clients/processes close enough leases. However, this method will only block to a maximum of the time parameters given. If time expires before all leases are acquired, the subset of acquired leases are automatically closed.
> The client must close the leases when it is done with them. You should do this in a finally block. NOTE: You can use returnAll(Collection) for this.
> In other words: the method requests qty leases. If not enough leases are available, it blocks until either the maximum number of leases is raised sufficiently or other clients/processes release enough leases, but it will wait at most the given time. If the time expires before all the leases are acquired, the leases that were already acquired are automatically closed.
> The client must release the leases when it is done with them, ideally in a finally block; the returnAll(Collection) method can be used for this.
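The following sketch illustrates that contract for a multi-lease acquisition (quantities, timeout, and path are illustrative; the client is assumed to be started):

```java
import java.util.Collection;
import java.util.concurrent.TimeUnit;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.recipes.locks.InterProcessSemaphoreV2;
import org.apache.curator.framework.recipes.locks.Lease;

public class ReturnAllSketch {
    static void acquireSeveral(CuratorFramework client) throws Exception {
        InterProcessSemaphoreV2 semaphore =
                new InterProcessSemaphoreV2(client, "/examples/semaphore", 10);

        // Try to take 3 of the 10 leases, waiting at most 5 seconds in total.
        Collection<Lease> leases = semaphore.acquire(3, 5, TimeUnit.SECONDS);
        if (leases == null) {
            return; // timed out; partially acquired leases were already closed
        }
        try {
            // work while holding the leases
        } finally {
            semaphore.returnAll(leases); // always give the leases back
        }
    }
}
```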
Back to the source code, let's see how this is implemented. Several local variables are defined first:

- startMs: the start time
- hasWait: whether a wait timeout was given
- waitMs: the wait time converted to milliseconds according to the unit parameter
- A com.google.common.collect.ImmutableList.Builder is created to collect the acquired leases
- The acquisition loop repeats until qty leases have been acquired
    - Each round (each single-lease acquisition) initializes a few variables:
        - retryCount: the number of retries
        - startMillis: the start time of this round
        - isDone: whether this round is finished
    - While the round is not done, keep trying
        - A small state machine is used
        - The next action is decided by the status returned from internalAcquire1Lease:
            - CONTINUE
                - This round succeeded and the next round can start
            - RETURN_NULL
                - Acquisition failed
                - Return null
            - RETRY_DUE_TO_MISSING_NODE
                - The lease node could not be found, e.g. because the connection dropped or the session expired
                - If the retry policy still allows it, try again
                - If it does not, throw KeeperException.NoNodeException
- If not everything succeeded
    - The finally block returns (cleans up) the leases that were already acquired
So the acquisition of a single lease is actually done by internalAcquire1Lease(builder, startMs, hasWait, waitMs); let's see what this method does:
```java
private InternalAcquireResult internalAcquire1Lease(ImmutableList.Builder<Lease> builder, long startMs, boolean hasWait, long waitMs) throws Exception
{
    if ( client.getState() != CuratorFrameworkState.STARTED )
    {
        return InternalAcquireResult.RETURN_NULL;
    }

    if ( hasWait )
    {
        long thisWaitMs = getThisWaitMs(startMs, waitMs);
        if ( !lock.acquire(thisWaitMs, TimeUnit.MILLISECONDS) )
        {
            return InternalAcquireResult.RETURN_NULL;
        }
    }
    else
    {
        lock.acquire();
    }

    Lease lease = null;

    try
    {
        PathAndBytesable<String> createBuilder = client.create().creatingParentContainersIfNeeded().withProtection().withMode(CreateMode.EPHEMERAL_SEQUENTIAL);
        String path = (nodeData != null) ? createBuilder.forPath(ZKPaths.makePath(leasesPath, LEASE_BASE_NAME), nodeData) : createBuilder.forPath(ZKPaths.makePath(leasesPath, LEASE_BASE_NAME));
        String nodeName = ZKPaths.getNodeFromPath(path);
        lease = makeLease(path);

        if ( debugAcquireLatch != null )
        {
            debugAcquireLatch.await();
        }

        synchronized(this)
        {
            for(;;)
            {
                List<String> children;
                try
                {
                    children = client.getChildren().usingWatcher(watcher).forPath(leasesPath);
                }
                catch ( Exception e )
                {
                    if ( debugFailedGetChildrenLatch != null )
                    {
                        debugFailedGetChildrenLatch.countDown();
                    }
                    returnLease(lease); // otherwise the just created ZNode will be orphaned causing a dead lock
                    throw e;
                }
                if ( !children.contains(nodeName) )
                {
                    log.error("Sequential path not found: " + path);
                    returnLease(lease);
                    return InternalAcquireResult.RETRY_DUE_TO_MISSING_NODE;
                }

                if ( children.size() <= maxLeases )
                {
                    break;
                }
                if ( hasWait )
                {
                    long thisWaitMs = getThisWaitMs(startMs, waitMs);
                    if ( thisWaitMs <= 0 )
                    {
                        returnLease(lease);
                        return InternalAcquireResult.RETURN_NULL;
                    }
                    wait(thisWaitMs);
                }
                else
                {
                    wait();
                }
            }
        }
    }
    finally
    {
        lock.release();
    }
    builder.add(Preconditions.checkNotNull(lease));
    return InternalAcquireResult.CONTINUE;
}
```
1. First, check the state of the zk client; if it is not STARTED, return RETURN_NULL
    - Acquisition fails
2. Acquire the internal distributed lock
    - Which acquire method is called depends on whether a wait timeout was given
    - If the lock cannot be acquired within the time limit, return RETURN_NULL
        - Acquisition fails
3. Once the lock is held, create an ephemeral sequential node to record the lease
4. Call makeLease to wrap the node path from the previous step into a Lease object
    - From here on, operating on the Lease essentially means operating on the ephemeral sequential node created in step 3
5. Enter a synchronized block
    - Loop until a lease is available:
        - Get the list of lease nodes
        - If the list does not contain the node just created
            - The session may have expired, or the zk node may have been deleted by mistake, etc.
            - Return RETRY_DUE_TO_MISSING_NODE so that the whole acquisition is retried
        - If the number of lease nodes is less than or equal to maxLeases
            - The requested lease is available
            - Exit the loop
        - Otherwise, if a timeout was given, compute the remaining wait time and wait at most that long (returning RETURN_NULL if it has already run out); without a timeout, wait indefinitely
6. The finally block releases the distributed lock acquired in step 2
7. Add the acquired lease to the immutable list builder
8. Return CONTINUE
Here are a few points to explain.
- Locking
    - InterProcessMutex is used internally; as mentioned in Shared Reentrant Lock, that lock is reentrant
        - That is, the thread holding the lock can re-enter it directly
    - So in step 5 above, even after the distributed lock has been acquired, a synchronized block is still needed to control concurrency between local threads
- zk nodes
    - Nodes for the distributed lock: path + /locks
        - Ephemeral sequential nodes used for locking
        - A lock node is removed (deleted) when the lock is released, i.e. as soon as the lease acquisition attempt finishes, whether or not it succeeded
    - Nodes for the leases: path + /leases
        - Ephemeral sequential nodes representing leases
        - The node path is wrapped in the Lease object
        - When the lease is no longer needed, its close method is called, which deletes the lease node
5.4.2 Summary
The locking process can be summarized briefly:
- Locking is controlled by a semaphore with at most 1 lease
    - Competition between processes is resolved by a distributed reentrant lock (InterProcessMutex)
    - Competition between local threads is resolved by a synchronized block
- Since only one lease exists globally, there is only one holder across processes, across threads, and even within the same thread
    - As long as the lock is not released, even the thread holding it cannot acquire it again (see the sketch below)
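A sketch of that last point within a single process (the path and timeout are placeholders; the client is assumed to be started):

```java
import java.util.concurrent.TimeUnit;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.recipes.locks.InterProcessSemaphoreMutex;

public class NonReentrantSketch {
    static void demonstrate(CuratorFramework client) throws Exception {
        InterProcessSemaphoreMutex lock =
                new InterProcessSemaphoreMutex(client, "/examples/shared-lock");
        lock.acquire();
        try {
            // A second acquire on the SAME thread does not re-enter; with only
            // one lease available it can only time out here.
            boolean reacquired = lock.acquire(1, TimeUnit.SECONDS);
            System.out.println("re-acquired while holding: " + reacquired); // expected: false
        } finally {
            lock.release();
        }
    }
}
```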
5.5 Releasing the Lock
```java
public void release() throws Exception
{
    Lease lease = this.lease;
    Preconditions.checkState(lease != null, "Not acquired");
    this.lease = null;
    lease.close();
}
```
You can see that Shared Lock's release logic is simply to close the lease. So let's look at how org.apache.curator.framework.recipes.locks.Lease#close is implemented.
As described in the previous section, the Lease is created in org.apache.curator.framework.recipes.locks.InterProcessSemaphoreV2#makeLease:
```java
private Lease makeLease(final String path)
{
    return new Lease()
    {
        @Override
        public void close() throws IOException
        {
            try
            {
                client.delete().guaranteed().forPath(path);
            }
            catch ( KeeperException.NoNodeException e )
            {
                log.warn("Lease already released", e);
            }
            catch ( Exception e )
            {
                ThreadUtils.checkInterrupted(e);
                throw new IOException(e);
            }
        }

        @Override
        public byte[] getData() throws Exception
        {
            return client.getData().forPath(path);
        }

        @Override
        public String getNodeName()
        {
            return ZKPaths.getNodeFromPath(path);
        }
    };
}
```
You can see that close simply deletes the lease node, thereby releasing the lease.
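One consequence of the null check in release() above: releasing without holding the lock (or releasing twice) fails fast. A sketch (the path is a placeholder; the client is assumed to be started):

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.recipes.locks.InterProcessSemaphoreMutex;

public class ReleaseTwiceSketch {
    static void releaseTwice(CuratorFramework client) throws Exception {
        InterProcessSemaphoreMutex lock =
                new InterProcessSemaphoreMutex(client, "/examples/shared-lock");
        lock.acquire();
        lock.release();          // deletes the lease node via Lease#close
        try {
            lock.release();      // the lease field is already null
        } catch (IllegalStateException expected) {
            // Preconditions.checkState(lease != null, "Not acquired") fails here
        }
    }
}
```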
6. Testing
Since usage is otherwise the same as for Shared Reentrant Lock, this example focuses on re-acquiring the lock within a single thread.
```kotlin
package com.roc.curator.demo.locks

import org.apache.commons.lang3.RandomStringUtils
import org.apache.curator.framework.CuratorFramework
import org.apache.curator.framework.CuratorFrameworkFactory
import org.apache.curator.framework.recipes.locks.InterProcessSemaphoreMutex
import org.apache.curator.retry.ExponentialBackoffRetry
import org.junit.Before
import org.junit.Test
import java.util.*
import java.util.concurrent.TimeUnit

/**
 * Created by roc on 2017/5/30.
 */
class InterProcessSemaphoreMutexTest {

    val LATCH_PATH: String = "/test/locks/ipsm"

    var client: CuratorFramework = CuratorFrameworkFactory.builder()
            .connectString("0.0.0.0:8888")
            .connectionTimeoutMs(5000)
            .retryPolicy(ExponentialBackoffRetry(1000, 10))
            .sessionTimeoutMs(3000)
            .build()

    @Before
    fun init() {
        client.start()
    }

    @Test
    fun runTest() {
        var id: String = RandomStringUtils.randomAlphabetic(10)
        println("id : $id ")
        val time = Date()
        var lock: InterProcessSemaphoreMutex = InterProcessSemaphoreMutex(client, LATCH_PATH)
        while (true) {
            if (lock.acquire(3, TimeUnit.SECONDS)) {
                println("$id Successful Locking $time")
                while (lock.isAcquiredInThisProcess) {
                    println("$id implement $time")
                    TimeUnit.SECONDS.sleep(2)
                    if (Math.random() > 0.5) {
                        if (lock.acquire(3, TimeUnit.SECONDS)) {
                            println("$id Successful lock-up again $time")
                        } else {
                            println("$id Failed to lock again $time")
                        }
                    }
                    if (Math.random() > 0.5) {
                        println("$id Release lock $time")
                        lock.release()
                    }
                }
            } else {
                println("$id Failure to lock $time")
            }
        }
        println("$id End: $time")
    }
}
```
Running it:
```
id : xPZcpRyivX
xPZcpRyivX Successful Locking Tue May 30 16:02:10 CST 2017
xPZcpRyivX implement Tue May 30 16:02:10 CST 2017
xPZcpRyivX Release lock Tue May 30 16:02:10 CST 2017
xPZcpRyivX Successful Locking Tue May 30 16:02:10 CST 2017
xPZcpRyivX implement Tue May 30 16:02:10 CST 2017
xPZcpRyivX implement Tue May 30 16:02:10 CST 2017
xPZcpRyivX Release lock Tue May 30 16:02:10 CST 2017
xPZcpRyivX Successful Locking Tue May 30 16:02:10 CST 2017
xPZcpRyivX implement Tue May 30 16:02:10 CST 2017
xPZcpRyivX Failed to lock again Tue May 30 16:02:10 CST 2017
xPZcpRyivX implement Tue May 30 16:02:10 CST 2017
xPZcpRyivX implement Tue May 30 16:02:10 CST 2017
xPZcpRyivX Release lock Tue May 30 16:02:10 CST 2017
xPZcpRyivX Successful Locking Tue May 30 16:02:10 CST 2017
xPZcpRyivX implement Tue May 30 16:02:10 CST 2017
xPZcpRyivX Failed to lock again Tue May 30 16:02:10 CST 2017
xPZcpRyivX implement Tue May 30 16:02:10 CST 2017
xPZcpRyivX Release lock Tue May 30 16:02:10 CST 2017
xPZcpRyivX Successful Locking Tue May 30 16:02:10 CST 2017
xPZcpRyivX implement Tue May 30 16:02:10 CST 2017
xPZcpRyivX Failed to lock again Tue May 30 16:02:10 CST 2017
```
As you can see, within the same thread, attempting to acquire the lock again while it is still held fails every time.
ZooKeeper nodes:
```
ls /test/locks/ipsm
[leases, locks]

ls /test/locks/ipsm/locks
[_c_cad2ad46-127d-4871-a6f8-7c11c0175f9a-lock-0000000014]

get /test/locks/ipsm/locks/_c_1b35ce47-ff75-46bc-8aea-483117fbf803-lock-0000000020
192.168.60.165
cZxid = 0x1e21a
ctime = Tue May 30 16:03:28 CST 2017
mZxid = 0x1e21a
mtime = Tue May 30 16:03:28 CST 2017
pZxid = 0x1e21a
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x15156529fae07fa
dataLength = 14
numChildren = 0

ls /test/locks/ipsm/leases
[_c_a3198a00-da20-4d43-8b1e-76c2101ce5ef-lease-0000000043, _c_0dc5f8e5-dc77-4a83-a782-6184d584a014-lease-0000000041]

get /test/locks/ipsm/leases/_c_a774be87-1867-4041-9ad7-ef14dcdfa315-lease-0000000049
192.168.60.165
cZxid = 0x1e290
ctime = Tue May 30 16:05:22 CST 2017
mZxid = 0x1e290
mtime = Tue May 30 16:05:22 CST 2017
pZxid = 0x1e290
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x15156529fae07fa
dataLength = 14
numChildren = 0
```
You can see that:

- Under /test/locks/ipsm there are two child nodes, one for the lock (locks) and one for the leases (leases)
- Both contain ephemeral sequential nodes
- Occasionally two lease nodes show up at the same time
    - This does not mean that two locks were granted
    - As the analysis of the locking process showed, the lease node is created first, and only then is the child list checked against maxLeases
    - So briefly seeing two lease nodes does not mean two leases were allocated; it just means the lock was being contended at that moment