
Level-3 BLAS implementation

To derive a level-3 BLAS left-looking variant for computing the factorization, consider the partitioning

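\begin{eqnarray*}
\left( \begin{array}{ccc}
A_{00} & \star & \star \\
A_{10} & A_{11} & \star \\
A_{20} & A_{21} & A_{22}
\end{array} \right)
& = &
\left( \begin{array}{ccc}
\mathbf{L}_{00} & 0 & 0 \\
\mathbf{L}_{10} & L_{11} & 0 \\
\mathbf{L}_{20} & L_{21} & L_{22}
\end{array} \right)
\left( \begin{array}{ccc}
\mathbf{L}_{00}^T & \mathbf{L}_{10}^T & \mathbf{L}_{20}^T \\
0 & L_{11}^T & L_{21}^T \\
0 & 0 & L_{22}^T
\end{array} \right)
\end{eqnarray*}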

The assumption is that the bold-face parts of the lower triangular matrix L have already been computed and have overwritten the corresponding parts of A. The rest of the matrix has not been updated at all. The object of the next step is to compute the next parts of the lower triangular matrix, $L_{11}$ and $L_{21}$, overwriting the corresponding parts of A. From the above equation, we derive

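\begin{eqnarray*}
A_{11} & = & L_{10} L_{10}^T + L_{11} L_{11}^T \\
A_{21} & = & L_{20} L_{10}^T + L_{21} L_{11}^T
\end{eqnarray*}

or

\begin{eqnarray*}
L_{11} L_{11}^T & = & A_{11} - L_{10} L_{10}^T \\
L_{21} & = & ( A_{21} - L_{20} L_{10}^T ) L_{11}^{-T}
\end{eqnarray*}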
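
As a small numerical check (an illustrative example that is not part of the original text), take every block to be $1 \times 1$, so that all transposes and inverses become scalar operations, and assume the first column of L has already been computed. The matrix and its values below are chosen purely for illustration.

\begin{eqnarray*}
A & = & \left( \begin{array}{ccc} 4 & \star & \star \\ 2 & 5 & \star \\ 2 & 3 & 6 \end{array} \right) ,
\qquad L_{00} = 2 , \quad L_{10} = 1 , \quad L_{20} = 1 \\
L_{11}^2 & = & A_{11} - L_{10}^2 = 5 - 1 = 4 \quad \Rightarrow \quad L_{11} = 2 \\
L_{21} & = & ( A_{21} - L_{20} L_{10} ) / L_{11} = ( 3 - 1 ) / 2 = 1
\end{eqnarray*}

Repeating the step on the last column gives $L_{22}^2 = 6 - 1 - 1 = 4$, so $L_{22} = 2$, and multiplying L by its transpose reproduces A.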

The algorithm for the left-looking version of the Cholesky factorization can then be given as follows, using the above equations:

  1. Update the current panel: $A_{11} \leftarrow A_{11} - L_{10} L_{10}^T$ and $A_{21} \leftarrow A_{21} - L_{20} L_{10}^T$.
  2. Perform a Cholesky factorization of the updated $A_{11}$, so that $A_{11} = L_{11} L_{11}^T$.
  3. Now that $L_{11}$ is known, compute $L_{21} = A_{21} L_{11}^{-T}$ from the updated $A_{21}$.
  4. Continue recursively by repartitioning the matrix.

The PLAPACK implementation using global level-3 BLAS is given in Figure 8.4. This time the bulk of the computation is in the update $A_{21} \leftarrow A_{21} - L_{20} L_{10}^T$, which is a matrix-panel operation. Thus, the algorithmic blocking size b is determined by the call
PLA_Environ_nb_alg( PLA_OP_MAT_PAN, template, &nb_alg );
Notice how the code reflects the algorithm described above, which could have been taken straight from a number of textbooks (e.g., []).



rvdg@cs.utexas.edu