HPL_spreadT - Spread row panel U and forward current column
  panel.
#include "hpl.h"
void HPL_spreadT( HPL_T_panel * PBCST,
    int * IFLAG, HPL_T_panel * PANEL, const enum
    HPL_SIDE SIDE, const int N, double *
    U, const int LDU, const int SRCDIST,
    const int * IPLEN, const int * IPMAP, const
    int * IPMAPM1 );
HPL_spreadT spreads the local array containing local pieces
    of U, so that on exit to this function, a piece of U is contained in every
    process row. The array IPLEN contains the number of columns of U, that
    should be spread on any given process row. This function also probes for the
    presence of the column panel PBCST. If available, this panel will be
    forwarded. If PBCST is NULL on input, this probing mechanism will be
    disabled.
  - PBCST (local input/output)
    HPL_T_panel *
- On entry, PBCST points to the data structure containing the panel (to be
      broadcast) information.
- IFLAG (local input/output)
    int *
- On entry, IFLAG indicates whether or not the broadcast has already been
      completed. If not, probing will occur, and the outcome will be contained
      in IFLAG on exit.
- PANEL (local input/output)
    HPL_T_panel *
- On entry, PANEL points to the data structure containing the panel (to be
      spread) information.
- SIDE (global input) const enum
    HPL_SIDE
- On entry, SIDE specifies whether the local piece of U located in process
      IPMAP[SRCDIST] should be spread to the right or to the left. This feature
      is used by the equilibration process.
- N (global input) const int
- On entry, N specifies the local number of rows of U. N must be at least
      zero.
- U (local input/output) double
    *
- On entry, U is an array of dimension (LDU,*) containing the local pieces
      of U.
- LDU (local input) const
    int
- On entry, LDU specifies the local leading dimension of U. LDU should be at
      least MAX(1,N).
- SRCDIST (local input)
    const int
- On entry, SRCDIST specifies the source process that spreads its piece of
      U.
- IPLEN (global input) const
    int *
- On entry, IPLEN is an array of dimension NPROW+1. This array is such that
      IPLEN[i+1] - IPLEN[i] is the number of rows of U in each process before
      process IPMAP[i], with the convention that IPLEN[nprow] is the total
      number of rows. In other words IPLEN[i+1] - IPLEN[i] is the local number
      of rows of U that should be moved to process IPMAP[i].
- IPMAP (global input) const
    int *
- On entry, IPMAP is an array of dimension NPROW. This array contains the
      logarithmic mapping of the processes. In other words, IPMAP[myrow] is the
      absolute coordinate of the sorted process.
- IPMAPM1 (global input)
    const int *
- On entry, IPMAPM1 is an array of dimension NPROW. This array contains the
      inverse of the logarithmic mapping contained in IPMAP: For i in [0..
      NPROW) IPMAPM1[IPMAP[i]] = i.