PETERSON の相互排除アルゴリズム |セット 2 (CPU サイクルとメモリフェンス)

問題： 2 つのプロセス i と j があるとすると、追加のハードウェアサポートなしで 2 つのプロセス間の相互排他を保証できるプログラムを作成する必要があります。

CPUクロックサイクルの無駄遣い

平たく言えば、スレッドが順番を待っているときに、1 秒あたり何百万回も条件をテストする長い while ループが終了し、不必要な計算が実行されます。もっと良い待機方法があります。 '収率' 。

これが何を行うのかを理解するには、Linux でプロセススケジューラがどのように動作するかを深く掘り下げる必要があります。ここで述べたアイデアはスケジューラの簡略化されたバージョンですが、実際の実装には多くの複雑な点があります。

次の例を考えてみましょう
P1、P2、P3 の 3 つのプロセスがあります。プロセス P3 には、あまり有用な計算を実行しないコード内のループと同様の while ループがあり、P2 が実行を終了したときにのみループから存在します。スケジューラはそれらすべてをラウンドロビンキューに入れます。ここで、プロセッサのクロック速度が 1000000/秒で、各反復で各プロセスに 100 クロックが割り当てられるとします。次に、最初の P1 が 100 クロック (0.0001 秒) 実行され、次に P2 (0.0001 秒)、続いて P3 (0.0001 秒) が実行されます。これは、これ以上プロセスがないため、このサイクルは P2 が終了するまで繰り返され、続いて P3 が実行され、最終的に終了します。

これは 100 CPU クロックサイクルの完全な無駄です。これを回避するために、CPU タイムスライス、つまりイールドを相互に放棄し、実質的にこのタイムスライスを終了し、スケジューラは実行する次のプロセスを選択します。ここで、条件を一度テストしてから、CPU を放棄します。テストに 25 クロックサイクルかかることを考慮すると、タイムスライスでの計算の 75% が節約されます。これを図示すると

Peterson の相互排除アルゴリズム |セット 2 (CPU サイクルとメモリフェンス)

プロセッサのクロック速度が 1MHz であることを考えると、これは大幅な節約になります。
ディストリビューションが異なれば、この機能を実現するための機能も異なります。 Linux が提供する sched_yield() 。

    void     lock  (  int     self  )   {      flag  [  self  ]     =     1  ;      turn     =     1  -  self  ;      while     (  flag  [  1  -  self  ]     ==     1     &&      turn     ==     1  -  self  )          // Only change is the addition of      // sched_yield() call      sched_yield  ();   }

思い出の柵。

前のチュートリアルのコードはほとんどのシステムで動作する可能性がありますが、100% 正しいわけではありません。ロジックは完璧でしたが、最新の CPU のほとんどはパフォーマンスの最適化を採用しており、その結果、実行順序が狂ってしまう可能性があります。このメモリ操作 (ロードとストア) の順序変更は、通常、単一スレッドの実行内では気付かれませんが、並行プログラムでは予期しない動作を引き起こす可能性があります。
この例を考えてみましょう

       while     (  f     ==     0  );          // Memory fence required here      print     x  ;

上記の例では、コンパイラーは 2 つのステートメントが互いに独立していると見なすため、それらのステートメントを再順序付けしてコード効率を向上させようとしますが、これが並行プログラムの問題を引き起こす可能性があります。これを回避するために、メモリフェンスを配置して、バリアを越えたステートメント間の考えられる関係についてコンパイラにヒントを与えます。

したがって、ステートメントの順序は

フラグ[自分] = 1;
ターン = 1-自分;
while (ターン状態チェック)
収率（）;

ロックが機能するには、まったく同じである必要があります。そうでない場合は、デッドロック状態になります。

このコンパイラが、この障壁を越えてステートメントの順序付けを防止する命令を提供できるようにするためです。 gccの場合は、 __sync_synchronize() 。
したがって、変更されたコードは次のようになります
C での完全な実装:

C++

    // Filename: peterson_yieldlock_memoryfence.cpp   // Use below command to compile:   // g++ -pthread peterson_yieldlock_memoryfence.cpp -o peterson_yieldlock_memoryfence   #include       #include      #include       std  ::  atomic   <  int  >     flag  [  2  ];   std  ::  atomic   <  int  >     turn  ;   const     int     MAX     =     1e9  ;   int     ans     =     0  ;   void     lock_init  ()   {      // Initialize lock by resetting the desire of      // both the threads to acquire the locks.      // And giving turn to one of them.      flag  [  0  ]     =     flag  [  1  ]     =     0  ;      turn     =     0  ;   }   // Executed before entering critical section   void     lock  (  int     self  )   {      // Set flag[self] = 1 saying you want      // to acquire lock      flag  [  self  ]  =  1  ;      // But first give the other thread the      // chance to acquire lock      turn     =     1  -  self  ;      // Memory fence to prevent the reordering      // of instructions beyond this barrier.      std  ::  atomic_thread_fence  (  std  ::  memory_order_seq_cst  );      // Wait until the other thread loses the      // desire to acquire lock or it is your      // turn to get the lock.      while     (  flag  [  1  -  self  ]  ==  1     &&     turn  ==  1  -  self  )      // Yield to avoid wastage of resources.      std  ::  this_thread  ::  yield  ();   }   // Executed after leaving critical section   void     unlock  (  int     self  )   {      // You do not desire to acquire lock in future.      // This will allow the other thread to acquire      // the lock.      flag  [  self  ]  =  0  ;   }   // A Sample function run by two threads created   // in main()   void     func  (  int     s  )   {      int     i     =     0  ;      int     self     =     s  ;      std  ::  cout      < <     'Thread Entered: '      < <     self      < <     std  ::  endl  ;      lock  (  self  );      // Critical section (Only one thread      // can enter here at a time)      for     (  i  =  0  ;     i   <  MAX  ;     i  ++  )      ans  ++  ;      unlock  (  self  );   }   // Driver code   int     main  ()   {         // Initialize the lock       lock_init  ();      // Create two threads (both run func)      std  ::  thread     t1  (  func       0  );      std  ::  thread     t2  (  func       1  );      // Wait for the threads to end.      t1  .  join  ();      t2  .  join  ();      std  ::  cout      < <     'Actual Count: '      < <     ans      < <     ' | Expected Count: '      < <     MAX  *  2      < <     std  ::  endl  ;      return     0  ;   }

    // Filename: peterson_yieldlock_memoryfence.c   // Use below command to compile:   // gcc -pthread peterson_yieldlock_memoryfence.c -o peterson_yieldlock_memoryfence   #include      #include      #include     'mythreads.h'   int     flag  [  2  ];   int     turn  ;   const     int     MAX     =     1e9  ;   int     ans     =     0  ;   void     lock_init  ()   {      // Initialize lock by resetting the desire of      // both the threads to acquire the locks.      // And giving turn to one of them.      flag  [  0  ]     =     flag  [  1  ]     =     0  ;      turn     =     0  ;   }   // Executed before entering critical section   void     lock  (  int     self  )   {      // Set flag[self] = 1 saying you want      // to acquire lock      flag  [  self  ]  =  1  ;      // But first give the other thread the      // chance to acquire lock      turn     =     1  -  self  ;      // Memory fence to prevent the reordering      // of instructions beyond this barrier.      __sync_synchronize  ();      // Wait until the other thread loses the      // desire to acquire lock or it is your      // turn to get the lock.      while     (  flag  [  1  -  self  ]  ==  1     &&     turn  ==  1  -  self  )      // Yield to avoid wastage of resources.      sched_yield  ();   }   // Executed after leaving critical section   void     unlock  (  int     self  )   {      // You do not desire to acquire lock in future.      // This will allow the other thread to acquire      // the lock.      flag  [  self  ]  =  0  ;   }   // A Sample function run by two threads created   // in main()   void  *     func  (  void     *  s  )   {      int     i     =     0  ;      int     self     =     (  int     *  )  s  ;      printf  (  'Thread Entered: %d  n  '    self  );      lock  (  self  );      // Critical section (Only one thread      // can enter here at a time)      for     (  i  =  0  ;     i   <  MAX  ;     i  ++  )      ans  ++  ;      unlock  (  self  );   }   // Driver code   int     main  ()   {         pthread_t     p1       p2  ;      // Initialize the lock       lock_init  ();      // Create two threads (both run func)      Pthread_create  (  &  p1       NULL       func       (  void  *  )  0  );      Pthread_create  (  &  p2       NULL       func       (  void  *  )  1  );      // Wait for the threads to end.      Pthread_join  (  p1       NULL  );      Pthread_join  (  p2       NULL  );      printf  (  'Actual Count: %d | Expected Count:'      ' %d  n  '    ans    MAX  *  2  );      return     0  ;   }

Java

    import     java.util.concurrent.atomic.AtomicInteger  ;   public     class   PetersonYieldLockMemoryFence     {      static     AtomicInteger  []     flag     =     new     AtomicInteger  [  2  ]  ;      static     AtomicInteger     turn     =     new     AtomicInteger  ();      static     final     int     MAX     =     1000000000  ;      static     int     ans     =     0  ;      static     void     lockInit  ()     {      flag  [  0  ]     =     new     AtomicInteger  ();      flag  [  1  ]     =     new     AtomicInteger  ();      flag  [  0  ]  .  set  (  0  );      flag  [  1  ]  .  set  (  0  );      turn  .  set  (  0  );      }      static     void     lock  (  int     self  )     {      flag  [  self  ]  .  set  (  1  );      turn  .  set  (  1     -     self  );      // Memory fence to prevent the reordering of instructions beyond this barrier.      // In Java volatile variables provide this guarantee implicitly.      // No direct equivalent to atomic_thread_fence is needed.      while     (  flag  [  1     -     self  ]  .  get  ()     ==     1     &&     turn  .  get  ()     ==     1     -     self  )      Thread  .  yield  ();      }      static     void     unlock  (  int     self  )     {      flag  [  self  ]  .  set  (  0  );      }      static     void     func  (  int     s  )     {      int     i     =     0  ;      int     self     =     s  ;      System  .  out  .  println  (  'Thread Entered: '     +     self  );      lock  (  self  );      // Critical section (Only one thread can enter here at a time)      for     (  i     =     0  ;     i      <     MAX  ;     i  ++  )      ans  ++  ;      unlock  (  self  );      }      public     static     void     main  (  String  []     args  )     {      // Initialize the lock      lockInit  ();      // Create two threads (both run func)      Thread     t1     =     new     Thread  (()     ->     func  (  0  ));      Thread     t2     =     new     Thread  (()     ->     func  (  1  ));      // Start the threads      t1  .  start  ();      t2  .  start  ();      try     {      // Wait for the threads to end.      t1  .  join  ();      t2  .  join  ();      }     catch     (  InterruptedException     e  )     {      e  .  printStackTrace  ();      }      System  .  out  .  println  (  'Actual Count: '     +     ans     +     ' | Expected Count: '     +     MAX     *     2  );      }   }

Python

    import   threading   flag   =   [  0     0  ]   turn   =   0   MAX   =   10  **  9   ans   =   0   def   lock_init  ():   # This function initializes the lock by resetting the flags and turn.   global   flag     turn   flag   =   [  0     0  ]   turn   =   0   def   lock  (  self  ):   # This function is executed before entering the critical section. It sets the flag for the current thread and gives the turn to the other thread.   global   flag     turn   flag  [  self  ]   =   1   turn   =   1   -   self   while   flag  [  1  -  self  ]   ==   1   and   turn   ==   1  -  self  :   pass   def   unlock  (  self  ):   # This function is executed after leaving the critical section. It resets the flag for the current thread.   global   flag   flag  [  self  ]   =   0   def   func  (  s  ):   # This function is executed by each thread. It locks the critical section increments the shared variable and then unlocks the critical section.   global   ans   self   =   s   print  (  f  'Thread Entered:   {  self  }  '  )   lock  (  self  )   for   _   in   range  (  MAX  ):   ans   +=   1   unlock  (  self  )   def   main  ():   # This is the main function where the threads are created and started.   lock_init  ()   t1   =   threading  .  Thread  (  target  =  func     args  =  (  0  ))   t2   =   threading  .  Thread  (  target  =  func     args  =  (  1  ))   t1  .  start  ()   t2  .  start  ()   t1  .  join  ()   t2  .  join  ()   print  (  f  'Actual Count:   {  ans  }   | Expected Count:   {  MAX  *  2  }  '  )   if   __name__   ==   '__main__'  :   main  ()

JavaScript

    class     PetersonYieldLockMemoryFence     {      static     flag     =     [  0       0  ];      static     turn     =     0  ;      static     MAX     =     1000000000  ;      static     ans     =     0  ;      // Function to acquire the lock      static     async     lock  (  self  )     {      PetersonYieldLockMemoryFence  .  flag  [  self  ]     =     1  ;      PetersonYieldLockMemoryFence  .  turn     =     1     -     self  ;      // Asynchronous loop with a small delay to yield      while     (  PetersonYieldLockMemoryFence  .  flag  [  1     -     self  ]     ==     1     &&      PetersonYieldLockMemoryFence  .  turn     ==     1     -     self  )     {      await     new     Promise  (  resolve     =>     setTimeout  (  resolve       0  ));      }      }      // Function to release the lock      static     unlock  (  self  )     {      PetersonYieldLockMemoryFence  .  flag  [  self  ]     =     0  ;      }      // Function representing the critical section      static     func  (  s  )     {      let     i     =     0  ;      let     self     =     s  ;      console  .  log  (  'Thread Entered: '     +     self  );          // Lock the critical section      PetersonYieldLockMemoryFence  .  lock  (  self  ).  then  (()     =>     {      // Critical section (Only one thread can enter here at a time)      for     (  i     =     0  ;     i      <     PetersonYieldLockMemoryFence  .  MAX  ;     i  ++  )     {      PetersonYieldLockMemoryFence  .  ans  ++  ;      }          // Release the lock      PetersonYieldLockMemoryFence  .  unlock  (  self  );      });      }      // Main function      static     main  ()     {      // Create two threads (both run func)      const     t1     =     new     Thread  (()     =>     PetersonYieldLockMemoryFence  .  func  (  0  ));      const     t2     =     new     Thread  (()     =>     PetersonYieldLockMemoryFence  .  func  (  1  ));      // Start the threads      t1  .  start  ();      t2  .  start  ();      // Wait for the threads to end.      setTimeout  (()     =>     {      console  .  log  (  'Actual Count: '     +     PetersonYieldLockMemoryFence  .  ans     +     ' | Expected Count: '     +     PetersonYieldLockMemoryFence  .  MAX     *     2  );      }     1000  );     // Delay for a while to ensure threads finish      }   }   // Define a simple Thread class for simulation   class     Thread     {      constructor  (  func  )     {      this  .  func     =     func  ;      }      start  ()     {      this  .  func  ();      }   }   // Run the main function   PetersonYieldLockMemoryFence  .  main  ();

C++

    // mythread.h (A wrapper header file with assert statements)   #ifndef __MYTHREADS_h__   #define __MYTHREADS_h__   #include         #include         #include         // Function to lock a pthread mutex   void     Pthread_mutex_lock  (  pthread_mutex_t     *  m  )   {      int     rc     =     pthread_mutex_lock  (  m  );      assert  (  rc     ==     0  );     // Assert that the mutex was locked successfully   }       // Function to unlock a pthread mutex   void     Pthread_mutex_unlock  (  pthread_mutex_t     *  m  )   {      int     rc     =     pthread_mutex_unlock  (  m  );      assert  (  rc     ==     0  );     // Assert that the mutex was unlocked successfully   }       // Function to create a pthread   void     Pthread_create  (  pthread_t     *  thread       const     pthread_attr_t     *  attr           void     *  (  *  start_routine  )(  void  *  )     void     *  arg  )   {      int     rc     =     pthread_create  (  thread       attr       start_routine       arg  );      assert  (  rc     ==     0  );     // Assert that the thread was created successfully   }   // Function to join a pthread   void     Pthread_join  (  pthread_t     thread       void     **  value_ptr  )   {      int     rc     =     pthread_join  (  thread       value_ptr  );      assert  (  rc     ==     0  );     // Assert that the thread was joined successfully   }   #endif   // __MYTHREADS_h__

    // mythread.h (A wrapper header file with assert   // statements)   #ifndef __MYTHREADS_h__   #define __MYTHREADS_h__   #include         #include          #include         void     Pthread_mutex_lock  (  pthread_mutex_t     *  m  )   {      int     rc     =     pthread_mutex_lock  (  m  );      assert  (  rc     ==     0  );   }       void     Pthread_mutex_unlock  (  pthread_mutex_t     *  m  )   {      int     rc     =     pthread_mutex_unlock  (  m  );      assert  (  rc     ==     0  );   }       void     Pthread_create  (  pthread_t     *  thread       const     pthread_attr_t     *  attr           void     *  (  *  start_routine  )(  void  *  )     void     *  arg  )   {      int     rc     =     pthread_create  (  thread       attr       start_routine       arg  );      assert  (  rc     ==     0  );   }   void     Pthread_join  (  pthread_t     thread       void     **  value_ptr  )   {      int     rc     =     pthread_join  (  thread       value_ptr  );      assert  (  rc     ==     0  );   }   #endif   // __MYTHREADS_h__

Python

    import   threading   import   ctypes   # Function to lock a thread lock   def   Thread_lock  (  lock  ):   lock  .  acquire  ()   # Acquire the lock   # No need for assert in Python acquire will raise an exception if it fails   # Function to unlock a thread lock   def   Thread_unlock  (  lock  ):   lock  .  release  ()   # Release the lock   # No need for assert in Python release will raise an exception if it fails   # Function to create a thread   def   Thread_create  (  target     args  =  ()):   thread   =   threading  .  Thread  (  target  =  target     args  =  args  )   thread  .  start  ()   # Start the thread   # No need for assert in Python thread.start() will raise an exception if it fails   # Function to join a thread   def   Thread_join  (  thread  ):   thread  .  join  ()   # Wait for the thread to finish   # No need for assert in Python thread.join() will raise an exception if it fails