lockd: fix races in client GRANTED_MSG wait logic

After the wait for a grant is done (for whatever reason), nlmclnt_block
updates the status of the nlm_rqst with the status of the block. At the
point it does this, however, the block is still queued its status could
change at any time.

This is particularly a problem when the waiting task is signaled during
the wait. We can end up giving up on the lock just before the GRANTED_MSG
callback comes in, and accept it even though the lock request gets back
an error, leaving a dangling lock on the server.

Since the nlm_wait never lives beyond the end of nlmclnt_lock, put it on
the stack and add functions to allow us to enqueue and dequeue the
block. Enqueue it just before the lock/wait loop, and dequeue it
just after we exit the loop instead of waiting until the end of
the function. Also, scrape the status at the time that we dequeue it to
ensure that it's final.

Reported-by: Yongcheng Yang <yoyang@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2063818
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
This commit is contained in:
Jeff Layton
2023-03-03 07:16:00 -05:00
committed by Chuck Lever
parent f0aa4852e6
commit 2005f5b9c3
3 changed files with 44 additions and 34 deletions

View File

@@ -211,9 +211,11 @@ struct nlm_rqst * nlm_alloc_call(struct nlm_host *host);
int nlm_async_call(struct nlm_rqst *, u32, const struct rpc_call_ops *);
int nlm_async_reply(struct nlm_rqst *, u32, const struct rpc_call_ops *);
void nlmclnt_release_call(struct nlm_rqst *);
struct nlm_wait * nlmclnt_prepare_block(struct nlm_host *host, struct file_lock *fl);
void nlmclnt_finish_block(struct nlm_wait *block);
int nlmclnt_block(struct nlm_wait *block, struct nlm_rqst *req, long timeout);
void nlmclnt_prepare_block(struct nlm_wait *block, struct nlm_host *host,
struct file_lock *fl);
void nlmclnt_queue_block(struct nlm_wait *block);
__be32 nlmclnt_dequeue_block(struct nlm_wait *block);
int nlmclnt_wait(struct nlm_wait *block, struct nlm_rqst *req, long timeout);
__be32 nlmclnt_grant(const struct sockaddr *addr,
const struct nlm_lock *lock);
void nlmclnt_recovery(struct nlm_host *);