[Swift-user] Deep recursion on subroutine "main::stageout" at /home/ketan/work/worker.pl line 1349

Michael Wilde wilde at mcs.anl.gov
Mon May 21 14:35:26 CDT 2012


Ketan, as far as I can tell, that message, coming from worker.pl, is just a warning.

Programming Perl, sec. 33, Diagnostic Messages: "Deep recursion on subroutine "%s"

(W recursion) This subroutine has called itself (directly or indirectly) 100 times more than it has returned. This probably indicates an infinite recursion, unless you're writing strange benchmark programs, in which case it indicates something else."

The stageout code in worker.pl is indeed recursive, and the warning could be suppressed:

"Try placing

  no warnings 'recursion';

within the same scope as that code ..."
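As a rough illustration of what "within the same scope" means, here is a minimal sketch. It is not the actual code from worker.pl: stageout_sketch and the out.N file names are made up, and the only assumption carried over is that stageout handles one file per call and recurses for the rest.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a recursive stage-out routine (not the real
# worker.pl code): it "stages" one file per call and recurses for the rest.
sub stageout_sketch {
    my ($files, $index) = @_;
    return 0 if $index >= @$files;

    # Lexically scoped: silences only the 'recursion' warning category,
    # and only for calls made from inside this subroutine body.
    no warnings 'recursion';
    return 1 + stageout_sketch($files, $index + 1);
}

# About 120 output files, as in the failing run, so the call depth passes 100.
my @files = map { "out.$_" } 1 .. 120;

# Capture any warnings so we can see whether Perl still emits one.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

my $count = stageout_sketch(\@files, 0);
print "staged $count files, warnings captured: ", scalar(@warnings), "\n";
```

With the pragma in place the run should report 120 files and no captured warnings; removing the "no warnings" line should bring the "Deep recursion" message back once the depth passes 100. Either way the recursion itself still runs to completion, which is why the warning alone does not explain a failure.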

Can you try a simple modification to catsn, using your ext mapper, to see whether it is indeed failing because of the deeply recursive stageout?

If you could dig a bit deeper into this and see whether it's really failing because it stages back so many files, or for some other (possibly related) reason, that would be great.

Thanks,

- Mike

----- Original Message -----
> From: "Ketan Maheshwari" <ketancmaheshwari at gmail.com>
> To: "Swift User" <swift-user at ci.uchicago.edu>
> Sent: Monday, May 21, 2012 1:54:34 PM
> Subject: [Swift-user] Deep recursion on subroutine "main::stageout" at /home/ketan/work/worker.pl line 1349
> Hi,
> 
> 
> I am trying to run the GE mars script on a bag of workstations. I
> tested the script for a sufficient number of tasks and seems to be
> working fine on localhost.
> 
> 
> However, it fails in this setup. I get the error message as follows
> after seemingly right invocation:
> 
> 
> 
> 
> Find: keepalive(120), reconnect - http://128.84.97.46:41287
> Progress: time: Mon, 21 May 2012 14:43:18 -0400 Stage in:7 Submitted:3
> Progress: time: Mon, 21 May 2012 14:43:19 -0400 Stage in:8 Active:2
> Deep recursion on subroutine "main::stageout" at /home/ketan/work/
> worker.pl line 1349.
> Deep recursion on subroutine "main::stageout" at /home/ketan/work/
> worker.pl line 1349.
> Progress: time: Mon, 21 May 2012 14:43:20 -0400 Active:3 Stage out:7
> 
> 
> Obviously the staging out of results fails and seems that the number
> of files in the stageout stage is causing the error. The application
> needs to stage out about 120 files.
> 
> 
> One solution I could quickly think of is to wrap the app in a shell
> and zip the outputs making it just one staged out file.
> 
> 
> However, the current setup would still be useful since we are trying
> to compare the existing Hadoop solution with the Swift one.
> 
> 
> Is there any possible workaround, some env setting or so that I could
> try and get the stageout going?
> 
> 
> The logs are:
> http://www.mcs.anl.gov/~ketan/mars-20120521-1443-d6q9lr0a.log
> and http://www.mcs.anl.gov/~ketan/workerlogs.tgz
> 
> 
> 
> 
> Regards, --
> Ketan
> 
> 
> 
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



