FASTMMW

Updated 464 days ago
  • ID: 11866514/127
If we apply the definition, we have an M27 algorithm; if we apply a fast algorithm, we have M23. The matrices are now smaller and we have to handle matrix operations of sizes . If we have a single GPU, we save , which is better than before. If we have 2 GPUs, we need as temporary space , which is slightly more than we need for M7 with one GPU. However, with 2 GPUs M23 requires 12 steps and M27 requires 14: thus we gain . Still ahead, but not as good as . With 3 GPUs, M27 requires 9 steps and M23 requires 8. We come out ahead by . The relative gain is worse than M7 with one GPU and we need more temporary space as well. With 4 GPUs, M23 requires 6 steps (4+4+4+4+4+3) and M27 requires 7; we are back to saving . I do not have more GPUs …... This year I could finally show that 3x3x3 can be used in between M/2 and M. Thus, we have a hierarchical algorithm that changes strategy as a function of the problem size and of the architecture. To show this practical performance advantage and the existence..
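The step counts quoted above are consistent with running the products in batches of one per GPU, so that each algorithm needs ceil(products / GPUs) steps. The sketch below reproduces those counts under that assumption. The function names `steps` and `relative_gain` are hypothetical, and measuring the gain as the fraction of M27 steps saved is an assumption made for illustration, not a formula taken from the original text.

```python
from math import ceil

# Number of sub-matrix products per algorithm (from the names M27 and M23).
PRODUCTS = {"M27": 27, "M23": 23}


def steps(products: int, gpus: int) -> int:
    """Steps needed if each GPU computes one product per step (assumed model)."""
    return ceil(products / gpus)


def relative_gain(gpus: int) -> float:
    """Fraction of M27 steps saved by M23 (assumed definition of the gain)."""
    s27 = steps(PRODUCTS["M27"], gpus)
    s23 = steps(PRODUCTS["M23"], gpus)
    return (s27 - s23) / s27


if __name__ == "__main__":
    for gpus in (1, 2, 3, 4):
        s27 = steps(PRODUCTS["M27"], gpus)
        s23 = steps(PRODUCTS["M23"], gpus)
        print(f"{gpus} GPU(s): M27 = {s27} steps, M23 = {s23} steps, "
              f"gain = {relative_gain(gpus):.1%}")
```

Under this model the gain does not grow steadily with the GPU count: it depends on how evenly 23 and 27 divide across the available devices, which matches the observation that 3 GPUs do worse than 2 or 4.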
Interest Score: 1
HIT Score: 0.00
Domain: fastmmw.com
Actual: www.fastmmw.com
IP: 65.254.227.224
Status: OK
Category: Other