c ++ 11：具有互斥量的线程看到原子变量的值发生了变化，尽管这是唯一可以更改它的代码

Question

一个原子变量（在这种情况下为128位结构）正在更新，这令仅有的一个能够对其进行更新的线程感到惊讶。怎么样？

这是一个最小的示例，因此它没有做任何有意义的事情，但是：alloc（）函数返回一个malloc的缓冲区100次，然后分配一个新的缓冲区，它将返回100次，依此类推，甚至面对被多个线程调用的情况。

我有一个原子变量，它是一个带有指针，32位int和另一个32位计数器的结构，旨在避免ABA问题。

我有一个包含两个部分的函数。第一部分，如果返回计数为非零，将CAS结构减小返回计数（并增加ABA计数器），然后返回指针。否则，第二部分将获得一个互斥体，为新的指针分配内存，并且CAS将使用新的指针，一个新的非零返回计数器以及一个ABA计数器的增量完全使用小结构。

简而言之，当计数器大于零时，每个线程都可以更新此结构。但是，一旦它为零，第一个获取互斥锁的线程我认为

将是唯一可以再次CAS更新此结构的线程。

除非有时此CAS失败！我的问题是“它怎么会失败”。

这里是一个正在运行的示例。可以使用g++ lockchange.cxx -o lockchange -latomic -pthread进行编译。它在Fedora 31的gcc version 9.2.1 20190827 (Red Hat 9.2.1-1) (GCC)上运行。

#include <algorithm>
#include <atomic>
#include <chrono>
#include <cassert>
#include <cstring>
#include <mutex>
#include <thread>
#include <vector>

using namespace std;


struct MyPair { /* Hungarian: pair */

    char*    pc;         /* a buffer to be used n times */
    int32_t  iRemaining; /* number of times left to use pc */
    uint32_t iUpdates;   /* to avoid ABA problem */
};



const int iThreads{ 200 };
const int iThreadIterations{ 1000000 };
const int iSizeItem{ 128 };

mutex mux;

atomic<MyPair> pairNext;



char* alloc() {

 TRY_AGAIN:
  MyPair pairCur = pairNext.load();

  // CASE 1: We can use the existing buffer?

  while ( pairCur.iRemaining ) {
      char* pcRV = pairCur.pc;

      MyPair pairNew = { pairCur.pc,
                         pairCur.iRemaining - 1,
                         pairCur.iUpdates + 1 };

      if ( pairNext.compare_exchange_weak( pairCur, pairNew ) )
          return pcRV;

      // Otherwise, pairNext was changed out from under us and pairCur
      // will have been updated.  Try again, as long as iRemaining
      // non-zero.
  }



  // CASE 2: We've used pc as many times as allowed, so allocate a new pc.

  // Get a mutex as we'll be changing too many fields to do atomically.
  lock_guard<mutex> guard( mux );

  // If multiple threads saw iRemaining = 0, they all will
  // have tried for the mutex; only one will have gotten it, so
  // there's a good chance that by the time we get the mutex, a
  // sibling thread will have allocated a new pc and placed it at
  // pairNext, so we don't need to allocate after all.

  if ( pairNext.load().iRemaining ) // <=============================== it's as if this line isn't seeing the update made by the line below in real time.
      goto TRY_AGAIN;

  // Get a new buffer.
  char* pcNew = (char*) malloc( iSizeItem );

  MyPair pairNew = { pcNew, 100, pairCur.iUpdates + 1 };

  if ( pairNext.compare_exchange_strong( pairCur, pairNew ) ) { //<===== the update that's not being seen above in real time
      // *** other stuff with pcNew that needs mutex protection ***;
      return pcNew;

  } else {

      // CASE 2c: after allocating a new page, we find that
      // another thread has beaten us to it.  I CAN'T FIGURE OUT
      // HOW THAT'S POSSIBLE THOUGH.  Our response should be safe
      // enough: put our allocation back, and start all over again
      // because who knows what else we missed.  I see this error
      // like 813 times out of 40 BILLION allocations in the
      // hammer test, ranging from 1 to 200 threads.

      printf( "unexpected: had lock but pairNext changed when iRemaining=0\n" );
      // In fact the following free and goto should and seem to
      // recover fine, but to be clear my question is how we can
      // possibly end up here in the first place.
      abort();
      free( pcNew );
      goto TRY_AGAIN;
  }
}



void Test( int iThreadNumber ) {

  for ( int i = 0; i < iThreadIterations; i++ )
      alloc();
}



int main( int nArg, char* apszArg[] ) {

  vector<thread> athr;

  for ( int i = 0; i < iThreads; i++ )
      athr.emplace_back( Test, i );

  for ( auto& thr: athr )
      thr.join();
}

一个原子变量（在这种情况下为128位结构）正在更新，这令唯一具有更新能力的线程感到惊讶。为何如此？这是一个最小的示例，所以它不起作用...

Answer 1

该问题被称为“ ABA问题”，我可以总结为检查无锁多线程编码中的变量，并认为它没有改变，但是已经改变了。

Answer 2

请注意，

  if ( pairNext.load().iUpdates != pairCur.iUpdates ) // <=============================== it's as if this line isn't seeing the update made by the line below in real time.

会解锁互斥锁，因为您要跳回到构造goto TRY_AGAIN;之前。通常，人们将lock_guard<mutex>放在示波器的顶部，并把锁的位置放在顶部，以使情况更清楚（并控制何时发生解锁）。我没有检查ISO C ++规则以查看是否需要这样做，但是至少在G ++和clang ++实现它的方式上，{}确实可以解锁。（将RAII锁定与goto混合似乎是较差的设计）。

c ++ 11：具有互斥量的线程看到原子变量的值发生了变化，尽管这是唯一可以更改它的代码

问题描述投票：0回答：2

2个回答

最新问题

c ++ 11：具有互斥量的线程看到原子变量的值发生了变化，尽管这是唯一可以更改它的代码

问题描述 投票：0回答：2

2个回答

最新问题

问题描述投票：0回答：2