[ExM Users] scaling Turbine on Vesta

Ketan Maheshwari ketan at mcs.anl.gov
Tue Apr 15 09:22:37 CDT 2014


Hi Tim,

I think I found the issue and got past it. My C code opened a file and never
closed it. The new version closes the file after reading it, and it seems to
be scaling well. So far, on Vesta, I have been able to scale to 10K
processes on 625 nodes without any issue.
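
For reference, here is a minimal sketch of that kind of fix, assuming the leaf function reads its input through stdio. The helper name read_first_line and its arguments are hypothetical and are not taken from the actual leaf_main code. A leak like this makes fopen fail once the per-process limit on open files is reached, so the important part is the fclose on every path that opened the file.

```c
#include <stdio.h>

/* Hypothetical helper (not the actual leaf_main code): read the first
 * line of `path` into `buf`. Returns 0 on success, -1 on failure. */
static int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;  /* fopen fails, e.g. when no descriptors are left */

    /* fgets stores at most len-1 characters plus a terminating NUL. */
    int rc = (fgets(buf, (int)len, fp) != NULL) ? 0 : -1;

    /* The step that was missing: without it, each call leaks one
     * file descriptor for the lifetime of the worker process. */
    fclose(fp);
    return rc;
}
```

With the fclose in place, the helper can be called any number of times from a long-running worker without using up descriptors.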

Thanks,
Ketan


On Mon, Apr 14, 2014 at 8:45 PM, Tim Armstrong <tim.g.armstrong at gmail.com> wrote:

> It's hard to narrow it down from this information; that script seems fairly
> unlikely to cause problems.
>
> What optimisation level? STC/Turbine version? How many processes?  How
> many ADLB servers?  Is it every time you run or just intermittently?
>
> Can you confirm that it's not just getting stuck in the leaf function as
> well?  E.g. log when it enters and exits.
>
> There is a rare race condition that can deadlock things that I'm just
> working on now, but it seems unlikely that you would be encountering that
> with that script.
>
>  - Tim
>
>
> On Mon, Apr 14, 2014 at 6:09 PM, Ketan Maheshwari <ketan at mcs.anl.gov> wrote:
>
>> Hi,
>>
>> I am trying to scale up a simple leaf function on Vesta. The leaf function
>> runs at most 259 times; beyond that it either returns no results or
>> crashes, and I do not see any error messages or other indications of what
>> went wrong.
>>
>>  On Vesta, an example is
>> at /home/ketan/turbine-output/2014/04/14/23/04/06
>>
>>  Any clue on this?
>>
>>  The Swift source looks as follows:
>>
>>  import io;
>>
>>  @dispatch=WORKER
>> (int v) leaf_main(string A[]) "leaf_main" "0.0" "leaf_main_wrap";
>> main
>> {
>>   int rc[];
>>   foreach i in [0:9999:1]{
>>     rc[i] = leaf_main([fromint(i)]);
>>   }
>> }
>>
>>
>>  Thanks,
>> Ketan