Why does Rust's RwLock behave unexpectedly when used with fork?

Question

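When I use RwLock together with fork, I see behavior I can't explain. Essentially, the child process reports that the RwLock is still held while the parent does not, even though both run the same code path. My understanding is that the child should receive an independent copy of the parent's memory space, including the lock, so it makes no sense for them to report different results.

The expected behavior is that both the child and the parent report "mutex held: false". Interestingly, when a Mutex is used instead of an RwLock, this works as expected.

Rust Playground link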

use libc::fork;
use std::error::Error;
use std::sync::RwLock;

fn main() -> Result<(), Box<dyn Error>> {
    let lock = RwLock::new(());

    // Take the write lock, fork while it is held, then release it
    // in both the parent and the child.
    let guard = lock.write();
    let res = unsafe { fork() };
    drop(guard);

    match res {
        // fork() returns 0 in the child.
        0 => {
            let held = lock.try_write().is_err();
            println!("CHILD mutex held: {}", held);
        }
        // In the parent it returns the child's PID (or -1 on failure).
        _child_pid => {
            let held = lock.try_write().is_err();
            println!("PARENT mutex held: {}", held);
        }
    }
    Ok(())
}

Output:

PARENT mutex held: false
CHILD mutex held: true

Answer 1

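I'm guessing you are running Linux here. Rust behaves this way because glibc behaves this way: on Linux systems that use glibc, Rust's RwLock is implemented on top of glibc's pthreads.

You can confirm the behavior with an equivalent C program.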

#include <pthread.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;

    /* Take the write lock, then fork while it is held. */
    pthread_rwlock_wrlock(&lock);
    pid_t pid = fork();

    /* Both processes unlock and then try to take the write lock again. */
    int res = pthread_rwlock_unlock(&lock);
    int res2 = pthread_rwlock_trywrlock(&lock);

    printf("%s unlock_errno=%d trywrlock_errno=%d\n", (pid == 0) ? "child" : "parent", res, res2);
    return 0;
}

This prints the following:

parent unlock_errno=0 trywrlock_errno=0
child unlock_errno=0 trywrlock_errno=16

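16 is EBUSY on my system.

The reason this happens in glibc is that POSIX specifies a single unlock function for rwlocks, so glibc stores the ID of the writing thread to tell whether the lock held by the calling thread is a read lock or a write lock. If the current thread ID equals the stored value, the thread holds a write lock; otherwise, it holds a read lock. So you haven't actually unlocked anything in the child, but you have quite possibly corrupted the reader counter in the lock.

As mentioned in the comments, according to POSIX this is undefined behavior in the child, because the thread doing the unlock is not the thread that holds the lock. For this to work, Rust would have to implement its own synchronization primitives the way Go does, and that is generally a major portability nightmare.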
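The mechanism described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not glibc's actual code: `ToyRwLock`, `simulate`, and the thread IDs 1 and 2 are hypothetical names and values, and the fork is simulated simply by unlocking with a different ID than the one that took the write lock (the child's thread has a different thread ID from the parent's).

```rust
/// Toy model of an rwlock with a single unlock function.
/// Hypothetical and heavily simplified; NOT glibc's real implementation.
#[derive(Debug, Clone, PartialEq)]
struct ToyRwLock {
    writer: Option<u32>, // ID of the thread holding the write lock, if any
    readers: u32,        // number of read locks currently held
}

impl ToyRwLock {
    fn new() -> Self {
        ToyRwLock { writer: None, readers: 0 }
    }

    /// Non-blocking write lock: succeeds only if nobody holds the lock.
    fn try_write_lock(&mut self, tid: u32) -> bool {
        if self.writer.is_some() || self.readers > 0 {
            return false;
        }
        self.writer = Some(tid);
        true
    }

    /// Single unlock entry point: the stored writer ID decides whether
    /// this call releases a write lock or a read lock.
    fn unlock(&mut self, tid: u32) {
        if self.writer == Some(tid) {
            self.writer = None; // caller is the writer: release the write lock
        } else {
            // Caller is assumed to be a reader: the counter is decremented
            // even though no read lock was taken, which corrupts it.
            self.readers = self.readers.wrapping_sub(1);
        }
    }
}

/// Thread 1 takes the write lock, then `unlocker_tid` unlocks and retries.
fn simulate(unlocker_tid: u32) -> (bool, ToyRwLock) {
    let mut lock = ToyRwLock::new();
    lock.try_write_lock(1);
    lock.unlock(unlocker_tid);
    let acquired = lock.try_write_lock(unlocker_tid);
    (acquired, lock)
}

fn main() {
    let (parent_ok, _) = simulate(1); // "parent": same thread ID as the locker
    let (child_ok, child_state) = simulate(2); // "child": different thread ID
    println!("parent reacquired: {}", parent_ok);
    println!("child reacquired: {} (state: {:?})", child_ok, child_state);
}
```

In this model the "child" leaves the writer field untouched and wraps the reader counter, which matches the shape of the EBUSY result above, although the real glibc data layout is more involved.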
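To make the idea of a self-implemented primitive concrete, here is a minimal sketch of a lock whose state is a single atomic flag with no owner ID stored. `FlagLock` and `unlock_from_other_thread` are hypothetical names introduced for illustration; the sketch has no blocking and no read mode, so it is nowhere near a replacement for RwLock, and it says nothing about how Go or the Rust standard library actually implement their locks.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

/// Hypothetical owner-agnostic lock: one flag, no thread ID recorded.
struct FlagLock {
    locked: AtomicBool,
}

impl FlagLock {
    const fn new() -> Self {
        FlagLock { locked: AtomicBool::new(false) }
    }

    /// Non-blocking acquire: flips the flag from false to true if possible.
    fn try_lock(&self) -> bool {
        self.locked
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    /// Release: clears the flag regardless of which thread calls it.
    fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}

/// Locks on the calling thread, unlocks on another thread, then retries.
fn unlock_from_other_thread() -> bool {
    let lock = FlagLock::new();
    assert!(lock.try_lock()); // first acquire succeeds
    assert!(!lock.try_lock()); // lock is now held
    // A different thread releases the lock; the scope joins it on exit.
    thread::scope(|s| {
        s.spawn(|| lock.unlock());
    });
    lock.try_lock() // succeeds because the flag was simply cleared
}

fn main() {
    println!("reacquired after foreign unlock: {}", unlock_from_other_thread());
}
```

Because the unlock path never consults the caller's identity, releasing from a different thread behaves the same as releasing from the original one. The cost is that every feature a real lock needs (fair blocking, reader counting, poisoning) has to be built and ported by hand.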
