[mpich-discuss] Optimal implementation of an interesting social network model --open for discussion

Jierui XIE jierui.xie at gmail.com
Wed Oct 26 17:03:24 CDT 2011


Dear All,

Here is the simplified model for a social network application with N
nodes (millions of nodes).  Each node has a memory M (the capacity is
T ). This model is also a iterative algorithm. The number of
iterations is T.

At each time step t, each node i does the following two things:
   Job 1: First,it generates a label x based on its memory, i.e., x=gen1(M).
          It then sends c to all its neighbors nbs(i).
   Job 2: Second, it receives labels from all its neighbors (one label
for one neighbor).
          It then computes another label y based on received labels,
i.e., y=gen2(M) and then
          adds y to its memory M.

After all nodes perform the above two jobs, let t=t+1.
The entire algorithm stops when T is reached.
----------------------------------------------------------------
One challenge I see is, it is not easy to have an optimal partition of
nodes, such that the communication cost is lower.
For example, suppose that we partition N nodes to total G groups (each
goes to one process), each group receives and sends labels to ONLY
their neighbors. It seems that in this way, the number of 'labels'
exchanged is minimum. However, each group may has many neighbors in
many other groups (processes), the communications are very likely to
involve many processes (groups).

One plausible solution is to ask a master node to collect all the
labels from Job 1 of ALL nodes, then broadcast ALL these labels to ALL
the processes to perform Job 2. However, since the network is very
large, the storage capacity (for all labels) on each node and also the
cost of broadcasting may become new issues.

Hope I make it clear.
----------------------------------------------------------------
Do you see any other problems in implementing this algorithm or any
other better solutions?

    	

Jerry


More information about the mpich-discuss mailing list