Next: More Operations and Information Up: Matrix-Matrix Operations Previous: Performance results

Querying Algorithmic Blocking Size

In the matrix-matrix multiplication implementations presented earlier, an algorithmic blocking size is passed as a parameter to the parallel routines. A natural question is what value this parameter should take. Notice that the matrix-matrix multiplication examples all used one of the following basic operations: panel-panel update (rank-k update), matrix-panel multiply, or panel-matrix multiply. Whatever blocking size makes these operations perform best can therefore be expected to yield fast implementations of matrix-matrix multiply. In general, all level-3 BLAS can be implemented using these basic operations and their equivalents that operate only on the upper or lower triangular portion of the matrix: symmetric panel-panel update (symmetric rank-k update), triangular matrix-panel multiply, and panel-triangular matrix multiply. We therefore provide an environment inquiry routine that, given which of these operations underlies the algorithm being implemented, returns a suggested algorithmic blocking size.
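To make the role of the blocking size concrete, the following is a minimal sequential sketch of matrix-matrix multiply built from panel-panel (rank-nb) updates. It is not the parallel implementation discussed in this guide: the function name gemm_panel_panel, the row-major storage, and the helper min_size are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of any library. */
static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/*
 * Illustrative sketch: C += A * B computed as a sequence of
 * panel-panel (rank-nb) updates.
 *   A is m x k, B is k x n, C is m x n, all stored row-major.
 *   nb is the algorithmic blocking size (must be positive).
 */
void gemm_panel_panel(size_t m, size_t n, size_t k, size_t nb,
                      const double *A, const double *B, double *C)
{
    assert(nb > 0);
    for (size_t p = 0; p < k; p += nb) {
        /* Width of the current panel; the last panel may be narrower. */
        size_t b = min_size(nb, k - p);

        /* Rank-b update: C += A(:, p:p+b-1) * B(p:p+b-1, :). */
        for (size_t i = 0; i < m; ++i) {
            for (size_t q = p; q < p + b; ++q) {
                double a_iq = A[i * k + q];
                for (size_t j = 0; j < n; ++j) {
                    C[i * n + j] += a_iq * B[q * n + j];
                }
            }
        }
    }
}
```

In this sketch nb only changes how the columns of A and rows of B are grouped into panels, so the computed product is the same for any positive nb. What nb does affect is performance, which is why a suggested value is worth querying.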


Currently, the input parameter operation of the inquiry routine can take on the following values:

tabular10538
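As a rough illustration of the idea, an inquiry routine of this kind can be sketched as a lookup keyed on the underlying basic operation. Everything named below is hypothetical: the enum basic_op, its constants, the function query_blocking_size, and the placeholder value 64 are assumptions for this sketch, not the library's actual interface or tuned values.

```c
#include <assert.h>

/*
 * Hypothetical operation tags, one per basic operation named in the text.
 * These are NOT the library's actual constants.
 */
typedef enum {
    OP_PANEL_PANEL,       /* panel-panel update (rank-k update)      */
    OP_MATRIX_PANEL,      /* matrix-panel multiply                   */
    OP_PANEL_MATRIX,      /* panel-matrix multiply                   */
    OP_SYM_PANEL_PANEL,   /* symmetric panel-panel update            */
    OP_TRI_MATRIX_PANEL,  /* triangular matrix-panel multiply        */
    OP_PANEL_TRI_MATRIX,  /* panel-triangular matrix multiply        */
    OP_COUNT              /* number of tags; not itself an operation */
} basic_op;

/*
 * Placeholder table. Real values would be determined per machine,
 * and could differ from one operation to the next.
 */
static const int suggested_nb[OP_COUNT] = { 64, 64, 64, 64, 64, 64 };

/* Return the suggested algorithmic blocking size for the given operation. */
int query_blocking_size(basic_op op)
{
    assert((int)op >= 0 && (int)op < (int)OP_COUNT);
    return suggested_nb[op];
}
```

A caller would query once, for example with query_blocking_size(OP_PANEL_PANEL) for an algorithm built on rank-k updates, and pass the result as the blocking size parameter of the multiplication routine.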



rvdg@cs.utexas.edu