The first call to LOCALIZE in the parallel version corresponds
to the X(IA(I)) terms in the sequential loop. It translates the
I_BLK_COUNT global subscripts in IA to local subscripts,
which are returned in the array LOCAL_IA. It also builds a
communication schedule, which is returned in
SCHEDULE_IA. Setting up the communication schedule
involves resolving the access requests, sending lists of accessed
elements to the owner processors, detecting accumulations,
eliminating redundant accesses, and so on. The result is a list of
messages that records the local sources and destinations of the data.
Another input argument of LOCALIZE is the descriptor of the
data array, DAD_X. The second call works in the same way with
respect to Y(IB(I)).
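The subscript translation described above can be illustrated with a minimal Python sketch. It is not the PARTI implementation: the function name `localize`, its parameters, the 0-based indexing, and the simple block distribution are all assumptions made for illustration. Locally owned elements keep their offset within the block, and each distinct off-processor element is given one slot in a ghost region that starts right after the block.

```python
def localize(global_idx, my_rank, blk_size):
    """Hypothetical sketch: translate global subscripts to local ones.

    Assumes a block distribution in which process `my_rank` owns global
    subscripts my_rank*blk_size .. (my_rank+1)*blk_size - 1 (0-based).
    Returns (local_idx, schedule); schedule[owner] is a list of
    (offset_on_owner, ghost_slot) pairs, one per distinct remote element.
    """
    ghost_slot = {}   # global subscript -> ghost slot, so duplicates share a slot
    schedule = {}     # owner rank -> list of (offset_on_owner, ghost_slot)
    local_idx = []
    for g in global_idx:
        owner, offset = divmod(g, blk_size)
        if owner == my_rank:
            # Locally owned: the local subscript is the offset in the block.
            local_idx.append(offset)
            continue
        if g not in ghost_slot:
            # First request for this remote element: allocate a ghost slot
            # and record the transfer in the schedule.
            ghost_slot[g] = blk_size + len(ghost_slot)
            schedule.setdefault(owner, []).append((offset, ghost_slot[g]))
        local_idx.append(ghost_slot[g])
    return local_idx, schedule
```

For example, on process 0 with a block size of 4, `localize([1, 5, 5, 9, 2], 0, 4)` returns the local subscripts `[1, 4, 4, 5, 2]` and the schedule `{1: [(1, 4)], 2: [(1, 5)]}`: the repeated request for global element 5 is served by a single ghost slot.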
That completes the inspector phase for the loop. Next comes the
executor phase, where the actual computations and communications of
data elements occur.
A collective call, GATHER, fetches the necessary data elements from
Y into the target ghost region, which begins at
Y(Y_BLK_SIZE + 1). The argument SCHEDULE_IB contains the
communication schedule. The next call, ZERO_OUT_BUFFER, sets
all elements of the ghost region of X to zero.
In the main loop, the results for locally owned X(IA) elements
are accumulated directly into the local segment of X, while the
results for non-locally owned elements are accumulated into the ghost
region of X.
The final call, SCATTER_ADD, sends the values in the ghost
region of X to their owners, where they are added
into the physical region of the segment.
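The executor steps described above can be tried out in a small single-process Python simulation. This is a sketch under stated assumptions, not the PARTI library: all function names and signatures are hypothetical, indexing is 0-based, the processes are modelled as a list of per-process segments, and both the data arrays and the iteration space are assumed to be block-distributed with sizes divisible by the number of processes.

```python
def localize(global_idx, my_rank, blk_size):
    """Inspector (hypothetical): local subscripts plus a per-owner schedule."""
    ghost_slot, schedule, local_idx = {}, {}, []
    for g in global_idx:
        owner, offset = divmod(g, blk_size)
        if owner == my_rank:
            local_idx.append(offset)
            continue
        if g not in ghost_slot:
            ghost_slot[g] = blk_size + len(ghost_slot)
            schedule.setdefault(owner, []).append((offset, ghost_slot[g]))
        local_idx.append(ghost_slot[g])
    return local_idx, schedule


def gather(segs, schedules):
    """Copy owned values from their owners into the requesters' ghost slots."""
    for rank, sched in enumerate(schedules):
        for owner, pairs in sched.items():
            for offset, slot in pairs:
                segs[rank][slot] = segs[owner][offset]


def scatter_add(segs, schedules):
    """Send ghost values back to their owners and add them to owned elements."""
    for rank, sched in enumerate(schedules):
        for owner, pairs in sched.items():
            for offset, slot in pairs:
                segs[owner][offset] += segs[rank][slot]


def parallel_update(x, y, ia, ib, nprocs):
    """Simulate x[ia[i]] += y[ib[i]] over `nprocs` simulated processes."""
    blk = len(x) // nprocs     # block size of the data arrays
    iblk = len(ia) // nprocs   # iterations handled by each process

    # Inspector phase: one localize call per index array and per process.
    insp_a = [localize(ia[r * iblk:(r + 1) * iblk], r, blk) for r in range(nprocs)]
    insp_b = [localize(ib[r * iblk:(r + 1) * iblk], r, blk) for r in range(nprocs)]
    sched_a = [sched for _, sched in insp_a]
    sched_b = [sched for _, sched in insp_b]

    def segment(data, r, sched):
        # Owned block followed by a ghost region sized from the schedule.
        n_ghost = sum(len(pairs) for pairs in sched.values())
        return list(data[r * blk:(r + 1) * blk]) + [0] * n_ghost

    xs = [segment(x, r, sched_a[r]) for r in range(nprocs)]
    ys = [segment(y, r, sched_b[r]) for r in range(nprocs)]

    # Executor phase.
    gather(ys, sched_b)                      # fill the ghost regions of y
    for seg in xs:                           # zero the ghost regions of x
        seg[blk:] = [0] * (len(seg) - blk)
    for r in range(nprocs):                  # main loop on local subscripts
        for a, b in zip(insp_a[r][0], insp_b[r][0]):
            xs[r][a] += ys[r][b]
    scatter_add(xs, sched_a)                 # return ghost sums to the owners

    # Reassemble the global array from the owned blocks only.
    return [v for seg in xs for v in seg[:blk]]
```

With `x = [0] * 8`, `y = list(range(8))`, `ia = [0, 5, 5, 2, 7, 1, 4, 4]` and `ib = [3, 6, 1, 1, 0, 7, 2, 5]`, the call `parallel_update(x, y, ia, ib, 2)` returns `[3, 7, 1, 0, 7, 7, 0, 0]`, the same result as the sequential loop. Integer data is used so that the different order of additions cannot change the result.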
We have seen the inspector-executor model of PARTI.
An important lesson from the model is that construction of communication
schedules must be isolated from execution of those schedules.
The immediate benefit of this separation arises in the common situation
where the form of the inner loop is constant over many iterations of
some outer loop. The same communication schedule can then be reused many
times, because the inspector phase can be moved out of the outer loop. This
pattern is supported by the Adlib library used in HPJava.