[petsc-users] Gather and Broadcast Parallel Vectors in k-means algorithm
Mills, Richard Tran
rtmills at anl.gov
Mon Apr 6 17:51:48 CDT 2020
Hi Eda,

I think that you probably want to use VecScatter routines, as Junchao
has suggested, instead of the lower-level star forest for this. I
believe that VecScatterCreateToZero() (which gathers onto rank 0) or
VecScatterCreateToAll() (which gives every process a full copy) is what
you want for the broadcast problem you describe in the second part of
your question. I'm not sure
what you are trying to do in the first part. Taking a parallel vector
and then copying its entire contents to a sequential vector residing on
each process is not scalable, and much of PETSc's design is meant to
spare users from ever needing to do things like that. Can you please
tell us what you intend to do with these sequential vectors?

I'm also wondering why, later in your message, you say that you get
cluster assignments from Matlab, and then "to cluster row vectors
according to this information, all processors need to have all of the
row vectors". Do you mean you want to get all of the row vectors copied
onto all of the processors so that you can compute the cluster
centroids? If so, computing the cluster centroids can be done without
copying the row vectors onto all processors if you use a communication
operation like MPI_Allreduce().
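
A minimal sketch of that idea, assuming each process owns a contiguous
block of row vectors plus their cluster labels: every process accumulates
per-cluster sums and counts over its own rows, the buffers are summed
across processes, and each process then divides locally. The function
names (accumulate_local, reduce_sum, finalize_centroids) are hypothetical
and not part of PETSc or MPI. reduce_sum is only a serial stand-in so the
example runs without MPI; in an MPI program that step would be a call such
as MPI_Allreduce(MPI_IN_PLACE, sums, k*dim, MPI_DOUBLE, MPI_SUM, comm).

```c
#include <stddef.h>

/* Add this process's rows into per-cluster sums and counts.
 * rows:   n_local x dim, row-major
 * labels: labels[i] in [0, k) is the cluster of local row i
 * sums:   k x dim, row-major, must be zeroed by the caller
 * counts: length k, must be zeroed by the caller (kept as double so it
 *         can be reduced with the same datatype as sums) */
void accumulate_local(const double *rows, const int *labels, size_t n_local,
                      size_t dim, double *sums, double *counts)
{
    for (size_t i = 0; i < n_local; i++) {
        size_t c = (size_t)labels[i];
        for (size_t j = 0; j < dim; j++) {
            sums[c * dim + j] += rows[i * dim + j];
        }
        counts[c] += 1.0;
    }
}

/* Serial stand-in for the global reduction: dst[i] += src[i].
 * Assumption: with MPI this is replaced by an MPI_Allreduce with MPI_SUM,
 * after which every process holds the same global sums and counts. */
void reduce_sum(double *dst, const double *src, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        dst[i] += src[i];
    }
}

/* Turn globally reduced sums into centroids, in place.
 * Empty clusters (count 0) are left untouched. */
void finalize_centroids(double *sums, const double *counts, size_t k,
                        size_t dim)
{
    for (size_t c = 0; c < k; c++) {
        if (counts[c] > 0.0) {
            for (size_t j = 0; j < dim; j++) {
                sums[c * dim + j] /= counts[c];
            }
        }
    }
}
```

With this pattern only k*dim + k numbers are communicated per centroid
update, independent of the number of rows, and no process ever needs to
hold row vectors owned by another process.
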
Lastly, let me add that I've done a fair amount of work implementing
clustering algorithms on distributed memory parallel machines, but
outside of PETSc. I was thinking that I should implement some of these
routines using PETSc. I can't get to this immediately, but I'm wondering
if you might care to tell me a bit more about the clustering problems
you need to solve and how having some support for this in PETSc might
(or might not) help.

Best regards,
Richard

On 4/4/20 1:39 AM, Eda Oktay wrote:
> Hi all,
>
> I created a parallel vector UV by using VecDuplicateVecs, since I need
> the row vectors of a matrix. However, I need the whole vector to be on
> all processors, which means I need to gather all of it and broadcast it
> to all processors. To gather, I tried to use VecStrideGatherAll:
>
> Vec UVG;
> VecStrideGatherAll(UV,UVG,INSERT_VALUES);
> VecView(UVG,PETSC_VIEWER_STDOUT_WORLD);
>
> However, when I try to view the vector, I get the following error:
>
> [3]PETSC ERROR: Invalid argument
> [3]PETSC ERROR: Wrong type of object: Parameter # 1
> [3]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [3]PETSC ERROR: Petsc Release Version 3.11.1, Apr, 12, 2019
> [3]PETSC ERROR: ./clustering_son_final_edgecut_without_parmetis on a
> arch-linux2-c-debug named localhost.localdomain by edaoktay Sat Apr 4
> 11:22:54 2020
> [3]PETSC ERROR: Wrong type of object: Parameter # 1
> [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.11.1, Apr, 12, 2019
> [0]PETSC ERROR: ./clustering_son_final_edgecut_without_parmetis on a
> arch-linux2-c-debug named localhost.localdomain by edaoktay Sat Apr 4
> 11:22:54 2020
> [0]PETSC ERROR: Configure options --download-mpich --download-openblas
> --download-slepc --download-metis --download-parmetis --download-chaco
> --with-X=1
> [0]PETSC ERROR: #1 VecStrideGatherAll() line 646 in
> /home/edaoktay/petsc-3.11.1/src/vec/vec/utils/vinv.c
> ./clustering_son_final_edgecut_without_parmetis on a
> arch-linux2-c-debug named localhost.localdomain by edaoktay Sat Apr 4
> 11:22:54 2020
> [1]PETSC ERROR: Configure options --download-mpich --download-openblas
> --download-slepc --download-metis --download-parmetis --download-chaco
> --with-X=1
> [1]PETSC ERROR: #1 VecStrideGatherAll() line 646 in
> /home/edaoktay/petsc-3.11.1/src/vec/vec/utils/vinv.c
> Configure options --download-mpich --download-openblas
> --download-slepc --download-metis --download-parmetis --download-chaco
> --with-X=1
> [3]PETSC ERROR: #1 VecStrideGatherAll() line 646 in
> /home/edaoktay/petsc-3.11.1/src/vec/vec/utils/vinv.c
>
> I couldn't understand why I am getting this error. Is this because of
> UV being created by VecDuplicateVecs? How can I solve this problem?
>
> The other question is about broadcasting. After gathering all elements
> of the vector UV, I need to broadcast them to all processors. I found
> PetscSFBcastBegin. However, I couldn't understand the PetscSF concept
> properly, and I couldn't map my problem onto the star forest concept.
>
> My problem is: If I have 4 processors, I create a matrix whose columns
> are the 4 smallest eigenvectors, say of size 72. Then, by defining each
> row of this matrix as a vector, I cluster them by using the k-means
> clustering algorithm. For now, I cluster them by using MATLAB and I
> obtain a vector showing which row vector is in which cluster. After
> getting this vector, to cluster row vectors according to this
> information, all processors need to have all of the row vectors.
>
> According to this problem, how can I use the star forest concept?
>
> I would be glad if you could help me with this problem, since I don't
> have enough knowledge about graph theory. And if you have any idea
> about how I can use the k-means algorithm in a more practical way,
> please let me know.
>
> Thanks!
>
> Eda