Parallelism Execution

Sep 4, 2024
  • Parallelism Idea:
    • Try to execute transactions that are not dependent on each other in multiple threads
      • For example: We have Block 1 consist 10s txs with following dependencies:
        • 💡
          Tx1 → Tx3 → Tx 5 → Tx10 Tx2 → Tx4 Tx6 → Tx8 Tx7 Tx9
      • If we already know the flow of dependencies and the chain has 4 threads, the process to execute this block would look like:
        • 💡
          Step #1: Run in Parallel in 4 threads for Tx1, Tx2, Tx6, Tx7 Step #2: Run in Parallel in 4 threads for Tx3, Tx4, Tx8, Tx9 Step #3: Run in Parallel for Tx5 Step #4: Run in Parallel for Tx10
           
    • But problem here, how we know this dependencies in advance
      • We using Strawman Algorithm, the idea will be like:
        • 💡
          Step #1 : Random run parallel for all txs Step #2..N: Keep re-run for fail txs
      • With above example, process will be look like:
        • 💡
          Step #1:
          1. Run in Parallel in 4 threads for Tx1,Tx2,Tx3,Tx4
          1. Run in Parallel in 4 threads for Tx5,Tx6,Tx7,Tx8
          1. Run in Parallel in 2 threads for Tx9,Tx10
          Step #2: After Step 1, we got [Tx3,Tx4,Tx8] Fail, re-execute
  • Aptos:
    • Block STM: https://arxiv.org/pdf/2203.06871
    • Current EVM design run in sequential execution
      • Block STM is a parallel execution engine
    • Block-STM improve from above solutions by finding a path from simple dependency tracking.
      • We consider an example with dependencies like:
        • 💡
          Tx1 → Tx2 → Tx3 → Tx4 Tx3 → Tx6 Tx3 → Tx9
      • With this sample, above solution need 7-8 step ( unit time) to finish. Meanwhile, best execution will be like:
        • 💡
          Step#1: Run random 1st time Tx1, Tx2, Tx3, Tx4 Step#2: Run Tx2, Tx5, Tx7, Tx8 Step#3: Run Tx3, Tx10 Step#4: Run Tx4, Tx6, Tx9
    • So, Block-STM , adding 2 variables nextPrelimExecution (initially 1),  nextValidation (initially n+1)
      • // per thread main loop: repeat { task := "NA" // if available, steal next read-set validation task atomic { if nextValidation < nextPrelimExecution (task, j) := ("validate", nextValidation) nextValidation.increment(); } if task = "validate" validate TXj // if available, steal next execution task else atomic { if nextPrelimExecution <= n (task, j) := ("execute", nextPrelimExecution) nextPrelimExecution.increment(); } if task = "execute" execute TXj validate TXj } until nextPrelimExecution > n, nextValidation > n, and no task is still running read-set validation of TXj { re-read TXj read-set if read-set differs from original read-set of the latest TXj execution mark the TXj write-set ABORTED atomic { nextValidation := min(nextValidation, j+1) } re-execute TXj } execution of TXj { (re-)process TXj, generating a read-set and write-set resume any TX waiting for TXj's ABORTED write-set if the new TXj write-set contains locations not marked ABORTED atomic { nextValidation := min(nextValidation, j+1) } }
         
      • The flow of above example will be like
          1. Run Tx1, Tx2, Tx3, Tx4.
              • Tx2, Tx3, Tx4 Fail.
              • Marked Tx3 (Tx2 write-set); Tx4, Tx6, Tx9 (Tx3 write-set) as ABORTED
          1. Run Tx2, Tx5, Tx7, Tx8
              • resume Tx3 ( when Tx2 is executed)
          1. Run Tx3, Tx10
              • resume all Tx4, Tx6, Tx9
          1. Run Tx4, Tx6, Tx9