For many data parallel jobs, we often need to perform computations on arrays. A non-distributed array is one in which the whole array resides in a worker’s memory space while a distributed array is a single array with its parts distributed across workers in a specific fashion. For example, you may choose to distribute a two-dimensional array distinctly along its rows or columns; as a result, an array distributed across workers saves memory over replicating it on every worker. For modest sized problems, non-distributed arrays may be more convenient at the expense of memory usage. For problems with large arrays, distributing the arrays may be necessary. Furthermore, a distributed array may also enhance the computational efficiency (due to for instance improved cache usage) and communication efficiency (due to less frequent communication).
-
Non-distributed arrays
In the previous sections, we have dealt exclusively with non-distributed arrays. An array that has been defined on the MATLAB client, when referred to in the worker space, is automatically copied into the respective worker space as one of the following three types: replicated array, variant array, or private array.

A non-distributed array shares a common name across all workers and is typed in the MATLAB space as a composite array. It may either be generated directly with the composite command or indirectly through the context of array creation. Here is an example:
>> matlabpool open local 4 >> a = Composite(); % create composite array with 4 entries >> a{2} = magic(3); % use cell notation to create an array on worker 2 >> a{1} = randi(3); a{3} = rand(4); a{4} = ones(2); % defines the rest >> spmd, d = magic(3); end % d is replicated on all workers >> matlabpool close -
Distributed arrays
Arrays may be distributed in two different ways depending on the application. The corresponding utilities for these are
- distributed
This type of array is generated and accessible in the MATLAB client space. The arrays must be distributed along the last dimension. For example, for a two-dimensional array, the array is distributed along the second dimension, or the columns>> c = distributed.rand(200); % NOT in spmd; distributed along columns - codistributed
This is the most general form of distribution. Distribution may be along any of the dimensions>> spmd >> d1 = codistributed.rand(1000, codistributor1d(1)); % codistributed along rows >> d2 = codistributed.ones(500, codistributor1d(2)); % codistributed along columns >> end % end spmd
- distributed
The whos command may be used to ascertain an array’s data type
>> whos
Name Size Bytes Class Attributes
a 1x4 1145 Composite
b 1x4 1145 Composite
c 200x200 1181 distributed
d1 1000x1000 1181 distributed
d2 500x500 1181 distributed
The byte sizes above only reflect the memory overhead for the arrays in the MATLAB space.
Ways to build distributed arrays
- Build by partitioning a larger array

- Build from smaller local arrays

- Build with MATLAB constructor functions

Tips
- Sometimes, you may distribute an array in a two-step process:
>> A = rand(200); >> d = codistributed(A);However, when possible, it is more memory-efficient to generate
ddirectly in one step.>> d = codistributed.rand(200); - Partly due to backward compatibility considerations, there are multiple alternatives to accomplishing the task of array distribution.
|
|
|
| Previous | Home | Next |



