c ++ 11:具有互斥量的线程看到原子变量的值发生了变化,尽管这是唯一可以更改它的代码

问题描述 投票:0回答:2

一个原子变量(在这种情况下为128位结构)正在更新,这令仅有的一个能够对其进行更新的线程感到惊讶。怎么样?

这是一个最小的示例,因此它没有做任何有意义的事情,但是:alloc()函数返回一个malloc的缓冲区100次,然后分配一个新的缓冲区,它将返回100次,依此类推,甚至面对被多个线程调用的情况。

我有一个原子变量,它是一个带有指针,32位int和另一个32位计数器的结构,旨在避免ABA问题。

我有一个包含两个部分的函数。第一部分,如果返回计数为非零,将CAS结构减小返回计数(并增加ABA计数器),然后返回指针。否则,第二部分将获得一个互斥体,为新的指针分配内存,并且CAS将使用新的指针,一个新的非零返回计数器以及一个ABA计数器的增量完全使用小结构。

简而言之,当计数器大于零时,每个线程都可以更新此结构。但是,一旦它为零,第一个获取互斥锁的线程我认为

将是唯一可以再次CAS更新此结构的线程。

除非有时此CAS失败!我的问题是“它怎么会失败”。

这里是一个正在运行的示例。可以使用g++ lockchange.cxx -o lockchange -latomic -pthread进行编译。它在Fedora 31的gcc version 9.2.1 20190827 (Red Hat 9.2.1-1) (GCC)上运行。

#include <algorithm>
#include <atomic>
#include <chrono>
#include <cassert>
#include <cstring>
#include <mutex>
#include <thread>
#include <vector>

using namespace std;


struct MyPair { /* Hungarian: pair */

    char*    pc;         /* a buffer to be used n times */
    int32_t  iRemaining; /* number of times left to use pc */
    uint32_t iUpdates;   /* to avoid ABA problem */
};



const int iThreads{ 200 };
const int iThreadIterations{ 1000000 };
const int iSizeItem{ 128 };

mutex mux;

atomic<MyPair> pairNext;



char* alloc() {

 TRY_AGAIN:
  MyPair pairCur = pairNext.load();

  // CASE 1: We can use the existing buffer?

  while ( pairCur.iRemaining ) {
      char* pcRV = pairCur.pc;

      MyPair pairNew = { pairCur.pc,
                         pairCur.iRemaining - 1,
                         pairCur.iUpdates + 1 };

      if ( pairNext.compare_exchange_weak( pairCur, pairNew ) )
          return pcRV;

      // Otherwise, pairNext was changed out from under us and pairCur
      // will have been updated.  Try again, as long as iRemaining
      // non-zero.
  }



  // CASE 2: We've used pc as many times as allowed, so allocate a new pc.

  // Get a mutex as we'll be changing too many fields to do atomically.
  lock_guard<mutex> guard( mux );

  // If multiple threads saw iRemaining = 0, they all will
  // have tried for the mutex; only one will have gotten it, so
  // there's a good chance that by the time we get the mutex, a
  // sibling thread will have allocated a new pc and placed it at
  // pairNext, so we don't need to allocate after all.

  if ( pairNext.load().iRemaining ) // <=============================== it's as if this line isn't seeing the update made by the line below in real time.
      goto TRY_AGAIN;

  // Get a new buffer.
  char* pcNew = (char*) malloc( iSizeItem );

  MyPair pairNew = { pcNew, 100, pairCur.iUpdates + 1 };

  if ( pairNext.compare_exchange_strong( pairCur, pairNew ) ) { //<===== the update that's not being seen above in real time
      // *** other stuff with pcNew that needs mutex protection ***;
      return pcNew;

  } else {

      // CASE 2c: after allocating a new page, we find that
      // another thread has beaten us to it.  I CAN'T FIGURE OUT
      // HOW THAT'S POSSIBLE THOUGH.  Our response should be safe
      // enough: put our allocation back, and start all over again
      // because who knows what else we missed.  I see this error
      // like 813 times out of 40 BILLION allocations in the
      // hammer test, ranging from 1 to 200 threads.

      printf( "unexpected: had lock but pairNext changed when iRemaining=0\n" );
      // In fact the following free and goto should and seem to
      // recover fine, but to be clear my question is how we can
      // possibly end up here in the first place.
      abort();
      free( pcNew );
      goto TRY_AGAIN;
  }
}



void Test( int iThreadNumber ) {

  for ( int i = 0; i < iThreadIterations; i++ )
      alloc();
}



int main( int nArg, char* apszArg[] ) {

  vector<thread> athr;

  for ( int i = 0; i < iThreads; i++ )
      athr.emplace_back( Test, i );

  for ( auto& thr: athr )
      thr.join();
}

一个原子变量(在这种情况下为128位结构)正在更新,这令唯一具有更新能力的线程感到惊讶。为何如此?这是一个最小的示例,所以它不起作用...

c++ mutex memory-barriers stdatomic lock-guard
2个回答
1
投票

该问题被称为“ ABA问题”,我可以总结为检查无锁多线程编码中的变量,并认为它没有改变,但是已经改变了。


0
投票
请注意, if ( pairNext.load().iUpdates != pairCur.iUpdates ) // <=============================== it's as if this line isn't seeing the update made by the line below in real time. 会解锁互斥锁,因为您要跳回到构造goto TRY_AGAIN;之前。通常,人们将lock_guard<mutex>放在示波器的顶部,并把锁的位置放在顶部,以使情况更清楚(并控制何时发生解锁)。我没有检查ISO C ++规则以查看是否需要这样做,但是至少在G ++和clang ++实现它的方式上,{}确实可以解锁。 (将RAII锁定与goto混合似乎是较差的设计)。
© www.soinside.com 2019 - 2024. All rights reserved.