From rmcgibbo at gmail.com Mon Jun 3 19:16:31 2013
From: rmcgibbo at gmail.com (Robert McGibbon)
Date: Mon, 3 Jun 2013 17:16:31 -0700
Subject: [Swift-user] Setting up Swift at Stanford
Message-ID:

Hey,

We just heard about the swift project from some colleagues at U Chicago, and we're interested in trying it out with some of our compute resources at Stanford to run parallel molecular dynamics and x-ray scattering simulations. Currently, I'm most interested in setting up the environment so that I can submit my swift script on a local workstation, with execution on a few different clusters. The head nodes of our local clusters are accessible via ssh, and job execution is then scheduled with pbs.

When I run swift, it can't seem to find qsub on the cluster.

rmcgibbo at Roberts-MacBook-Pro-2 ~/projects/swift
$ swift -sites.file sites.xml hello.swift -tc.file tc.data
Swift 0.94 swift-r6492 cog-r3658

RunID: 20130603-1704-5xii8svc
Progress: time: Mon, 03 Jun 2013 17:04:10 -0700
2013-06-03 17:04:10.735 java[77051:1f07] Loading Maximizer into bundle: com.apple.javajdk16.cmd
2013-06-03 17:04:11.410 java[77051:1f07] Maximizer: Unsupported window created of class: CocoaAppWindow
Progress: time: Mon, 03 Jun 2013 17:04:13 -0700 Stage in:1
Execution failed:
Exception in uname:
Arguments: [-a]
Host: vsp-compute
Directory: hello-20130603-1704-5xii8svc/jobs/y/uname-ydyn5fal
Caused by:
Cannot submit job: Cannot run program "qsub": error=2, No such file or directory
uname, hello.swift, line 8

When I switch the execution provider from pbs to ssh, the job runs successfully, but only on the head node of the vsp-compute cluster. I'd like to run using the cluster's pbs queue instead. Any help would be greatly appreciated.

-Robert
Graduate Student, Pande Lab
Stanford University, Department of Chemistry

p.s.
My sites.xml file is
```
750
1
default
file

/scratch/rmcgibbo/swiftwork
```

My SwiftScript is
```
#hello.swift
type file;

app (file o) uname() {
  uname "-a" stdout=@o;
}
file outfile <"uname.txt">;

outfile = uname();
```

From wilde at mcs.anl.gov Mon Jun 3 22:27:45 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 3 Jun 2013 22:27:45 -0500 (CDT)
Subject: [Swift-user] Setting up Swift at Stanford
In-Reply-To:
Message-ID: <396534425.4398.1370316465619.JavaMail.root@mcs.anl.gov>

Hi Robert,

To run swift from a workstation that can ssh to one or more cluster head nodes, use a sites file like this:

  1
  100
  100
  3600
  00:05:00
  default
  5
  1
  1
  1.00
  10000
  /scratch/rmcgibbo/swiftwork

This specifies that Swift should:

- use the "coaster" provider, which enables Swift to ssh to another system and qsub from there:

- run up to 100 Swift app() tasks in parallel on the remote system:

  1.00
  10000

- app() tasks should be limited to 5 minutes walltime:

  00:05:00

- app() tasks will be run within PBS coaster "pilot" jobs. Each PBS job should have a walltime of 750 seconds:

  100
  100
  750

- Up to 5 concurrent PBS coaster jobs each asking for 1 node will be submitted to the default queue:

  default
  5
  1
  1

- Swift should run only one app() task at a time within each PBS job slot:

  1

- On the remote PBS cluster, create per-run directories under this work directory:

  /scratch/rmcgibbo/swiftwork

- And stage data to the site by using local copy operations:

You can make the sites.xml entry more user-independent using, e.g.:

  /scratch/{env.USER}/swiftwork

The overall sites entry above assumes:

- That /scratch/rmcgibbo is mounted on both the Swift run host and on the remote PBS system.

If there is no common shared filesystem, Swift can use a data transport technique called "coaster provider staging" to move the data for you. This is specified in the swift.properties file.
In many cases, with a shared filesystem between the Swift client host and the execution cluster, it's desirable to turn off staging altogether. This is done using a mode called "direct" data management (see http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_collective_data_management). This is being simplified for future releases.

- That each PBS job is given one CPU core, not one full node.

The PBS ppn attribute can be specified to request a specific number of cores (processors) per node:

  16

...and then that each coaster pilot job should run up to 16 Swift app() tasks at once:

  16

For more info on coasters, see:
http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_coasters
and: http://www.ci.uchicago.edu/swift/papers/UCC-coasters.pdf

For more examples on site configurations, see:
http://www.ci.uchicago.edu/swift/guides/trunk/siteguide/siteguide.html

And lastly, note that in your initial sites.xml:

- Omitting the filesystem provider tag is typically only done when "use.provider.staging" is specified in the swift.properties config file

- The stagingMethod tag only applies to provider staging.

We're working hard to document all this better and provide a better set of illustrated examples and templates for common site configurations. In the meantime, we'll help you create a set of useful configurations for your site(s).

Regards,

- Mike

From wilde at mcs.anl.gov Mon Jun 3 22:38:55 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 3 Jun 2013 22:38:55 -0500 (CDT)
Subject: [Swift-user] Setting up Swift at Stanford
In-Reply-To: <396534425.4398.1370316465619.JavaMail.root@mcs.anl.gov>
Message-ID: <258082345.4505.1370317135629.JavaMail.root@mcs.anl.gov>

I forgot to also mention: the example in my previous message with the "ssh-cl" ("ssh command line") provider also assumes that you can do a password-less ssh command from your workstation to your PBS head node. I.e., that you have ssh keys in place on the head node and that you're using an ssh agent.

The standard Swift ssh provider (e.g. using provider=coaster jobmanager=ssh:pbs) uses a file called $HOME/.ssh/auth.defaults to specify ssh passwords or passphrases, or for better security swift will prompt for these.

We tend to use and recommend the newer ssh-cl for both security and convenience.
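
A quick way to check the password-less ssh assumption by hand, and to see whether qsub is visible on the head node, is sketched below. This is only an illustration and not part of Swift: the function name check_head_node is made up, and it assumes OpenSSH on the workstation (BatchMode=yes makes ssh fail instead of prompting) and that a non-interactive ssh session on the head node sees roughly the same PATH that a job submission would.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name, not part of Swift).
# Succeeds only if key-based ssh to the head node works without a prompt,
# then prints where qsub lives on that host.
check_head_node() {
    host="$1"

    # BatchMode=yes disables password/passphrase prompts, so this fails
    # fast when no usable key or agent is available.
    if ! ssh -o BatchMode=yes -o ConnectTimeout=10 "$host" true >/dev/null; then
        echo "password-less ssh to $host is not working" >&2
        return 1
    fi

    # 'command -v' prints the path of qsub if the remote shell can find it.
    if ! ssh -o BatchMode=yes "$host" 'command -v qsub'; then
        echo "qsub was not found on the PATH of $host" >&2
        return 1
    fi
}
```

For example, after sourcing the function: check_head_node vsp-compute-01.stanford.edu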

- Mike

From rmcgibbo at gmail.com Mon Jun 3 22:59:02 2013
From: rmcgibbo at gmail.com (Robert McGibbon)
Date: Mon, 3 Jun 2013 20:59:02 -0700
Subject: [Swift-user] Setting up Swift at Stanford
In-Reply-To: <258082345.4505.1370317135629.JavaMail.root@mcs.anl.gov>
References: <258082345.4505.1370317135629.JavaMail.root@mcs.anl.gov>
Message-ID:

Thanks for your help, Michael! With your suggestions, I got swift working. Here's my config: http://rmcgibbo.github.io/blog/2013/06/03/setting-up-swift/

One thing I couldn't get working is . When I have that in my sites.xml, I get

$ swift uname.swift
Swift 0.94 swift-r6492 cog-r3658

RunID: 20130603-2056-7octkf3a
Progress: time: Mon, 03 Jun 2013 20:56:13 -0700
Execution failed:
Could not initialize shared directory on vsp-compute
Caused by:
org.globus.cog.abstraction.impl.file.FileResourceException: Failed to create directory: /home/rmcgibbo/.swiftwork/uname-20130603-2056-7octkf3a/shared
uname, uname.swift, line 10

But with , it seems to work just fine.

-Robert

From tjlane at stanford.edu Mon Jun 3 23:28:07 2013
From: tjlane at stanford.edu (TJ Lane)
Date: Mon, 3 Jun 2013 21:28:07 -0700
Subject: [Swift-user] Setting up Swift at Stanford
In-Reply-To: <258082345.4505.1370317135629.JavaMail.root@mcs.anl.gov>
References: <396534425.4398.1370316465619.JavaMail.root@mcs.anl.gov> <258082345.4505.1370317135629.JavaMail.root@mcs.anl.gov>
Message-ID:

Mike,

Is there support for other schedulers? Specifically, I have a cluster running LSF that I'd also like to farm jobs out to. Maybe this is documented somewhere? I haven't been able to locate it yet.

Thanks a lot for your help! Looking forward to playing more w/Swift.

TJ

From wilde at mcs.anl.gov Tue Jun 4 08:26:00 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 4 Jun 2013 08:26:00 -0500 (CDT)
Subject: [Swift-user] Setting up Swift at Stanford
In-Reply-To:
Message-ID: <1201301305.16568.1370352360113.JavaMail.root@mcs.anl.gov>

Hi Robert,

> One thing I couldn't get working is .
> When I have that in my sites.xml, I get
> ...
> Could not initialize shared directory on vsp-compute
> Caused by:
> org.globus.cog.abstraction.impl.file.FileResourceException: Failed
> to create directory:
> /home/rmcgibbo/.swiftwork/uname-20130603-2056-7octkf3a/shared

Is /home/rmcgibbo accessible to the local host on which you are running the "swift" command? This error suggests that it is not. That is the likely problem. The swift command is trying to initialize it locally, and can't get to it.
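
One way to test this from the submit host is sketched below. It is only an illustration: check_workdir is a made-up name, not a Swift tool, and the path you pass should be whatever your sites.xml work directory is. It simply tries to create and remove a directory there, which is roughly what the swift command needs to be able to do locally in this configuration.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name, not part of Swift).
# Tries to create and then remove a probe directory under the given work
# directory, from the host where the swift command runs.
# Note: mkdir -p also creates the work directory itself if it is missing.
check_workdir() {
    workdir="$1"
    probe="$workdir/probe-$$"

    if mkdir -p "$probe" 2>/dev/null; then
        rmdir "$probe"
        echo "ok: can create directories under $workdir"
    else
        echo "cannot create directories under $workdir from this host" >&2
        return 1
    fi
}
```

For example: check_workdir /home/rmcgibbo/.swiftwork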

As I mentioned in the commented sites file, "filesystem provider=local" assumes "...- That /scratch/rmcgibbo is mounted on both the Swift run host and on the remote PBS system."

Can you tell us what the filesystem configuration and sharing arrangement is among your multiple clusters? And the typical input and output file sizes and counts of the app() functions you expect to run? Then we can suggest various data management configuration strategies for you.

For modest file sizes, say under 20MB, coaster provider staging is a good choice.

To use provider staging, do this:

In a local swift.properties file passed with -config (called "cf" here), set:

  use.provider.staging=true
  provider.staging.pin.swiftfiles=true
  status.mode=provider

Also, for debugging set:

  wrapperlog.always.transfer=true
  sitedir.keep=true
  execution.retries=0
  lazy.errors=false

Then in your sites file, omit the filesystem tag, and specify a workdirectory on a fast compute node filesystem capable of handling the transient data volume for the number of jobs you expect the node to process concurrently. A typical choice, if /tmp or /scratch has sufficient space, is:

  /tmp/{env.USER}/swiftwork

Then specify on your swift command:

  swift -config cf -tc.file apps -sites.file sites.xml myscript.swift -myarg=etc ...

With this configuration, Swift will stage your data from local filesystems on the submit host to the compute node hosts, and back. After each job runs successfully on a compute node, its locally staged data will be removed. If the job fails, its data will be left there (e.g. for debugging) if you specify sitedir.keep=true.

Also, your github blog page on Swift looks great! One thing to note there: it's almost always unnecessary, and often causes trouble, to explicitly set SWIFT_HOME. All you typically need to do is put the Swift release's bin/ dir in your PATH.
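
For reference, the provider-staging and debugging settings listed earlier in this message could be written out like this. It is a sketch only: write_staging_config is a made-up helper name, while the property names and the file name cf are the ones given above.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): writes the provider-staging and
# debugging properties listed in this message to the file named by $1.
write_staging_config() {
    cat > "$1" <<'EOF'
use.provider.staging=true
provider.staging.pin.swiftfiles=true
status.mode=provider
wrapperlog.always.transfer=true
sitedir.keep=true
execution.retries=0
lazy.errors=false
EOF
}
```

Running write_staging_config cf creates the file, after which the swift command shown above (swift -config cf -tc.file apps -sites.file sites.xml myscript.swift) picks it up. The last four properties are the debugging aids and can be dropped once things work.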
Regards, - Mike From wilde at mcs.anl.gov Tue Jun 4 08:33:27 2013 From: wilde at mcs.anl.gov (Michael Wilde) Date: Tue, 4 Jun 2013 08:33:27 -0500 (CDT) Subject: [Swift-user] Setting up Swift at Stanford In-Reply-To: Message-ID: <2075838649.17216.1370352807300.JavaMail.root@mcs.anl.gov> TJ, I think you can specify execution provider=lsf instead of provider=pbs. LSF support was added in 0.94. I don't know if it has any differences from the PBS provider, but David Kelly who added it can provide information. Swift 0.94 has execution providers for PBS (et al), SGE, Condor, SLURM, LSF, Cobalt, and Globus 2.* and 5.*, ssh (built-in) and ssh command line. All these should also work remotely using coasters, and provide a lot of flexibility. The Swift Site Config guide has information on some of these: http://www.ci.uchicago.edu/swift/guides/trunk/siteguide/siteguide.html - Mike ----- Original Message ----- > From: "TJ Lane" > To: "Michael Wilde" > Cc: "Robert McGibbon" , swift-user at ci.uchicago.edu > Sent: Monday, June 3, 2013 11:28:07 PM > Subject: Re: [Swift-user] Setting up Swift at Stanford > > > > Mike, > > > Is there support for other schedulers? Specifically I have a cluster > running LSF I'd like to also farm jobs out to. Maybe this is > documented somewhere? Haven't been able to locate it yet. > > Thanks a lot for your help! Looking forward to playing more w/Swift. > > TJ From hategan at mcs.anl.gov Tue Jun 4 14:56:50 2013 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 04 Jun 2013 12:56:50 -0700 Subject: [Swift-user] Setting up Swift at Stanford In-Reply-To: References: Message-ID: <1370375810.15402.4.camel@echo> Unfortunately the pbs provider only works locally and the ssh provider only runs remote ssh commands through bash (not queuing things through PBS). I can see the benefit of tunneling what the pbs provider does through ssh, but that's not there now. 
Your choices are to either install Globus on your clusters or use coasters which install their own service in the background on the cluster. For the latter you need to say: Mihael On Mon, 2013-06-03 at 17:16 -0700, Robert McGibbon wrote: > Hey, > > We just heard about the swift project from some colleagues at U Chicago, and we're interested in trying it out with some of our compute resources at Stanford to run parallel molecular dynamics and x-ray scatting simulations. Currently, I'm most interested in setting up the environment such that I can submit my swift script on a local workstation, with execution on a few different clusters. The head nodes of our local clusters are accessible via ssh, and then job execution is scheduled with pbs. > > When I run swift, it can't seem to find qsub on the cluster. > > rmcgibbo at Roberts-MacBook-Pro-2 ~/projects/swift > $ swift -sites.file sites.xml hello.swift -tc.file tc.data > Swift 0.94 swift-r6492 cog-r3658 > > RunID: 20130603-1704-5xii8svc > Progress: time: Mon, 03 Jun 2013 17:04:10 -0700 > 2013-06-03 17:04:10.735 java[77051:1f07] Loading Maximizer into bundle: com.apple.javajdk16.cmd > 2013-06-03 17:04:11.410 java[77051:1f07] Maximizer: Unsupported window created of class: CocoaAppWindow > Progress: time: Mon, 03 Jun 2013 17:04:13 -0700 Stage in:1 > Execution failed: > Exception in uname: > Arguments: [-a] > Host: vsp-compute > Directory: hello-20130603-1704-5xii8svc/jobs/y/uname-ydyn5fal > Caused by: > Cannot submit job: Cannot run program "qsub": error=2, No such file or directory > uname, hello.swift, line 8 > > When I switch the execution provider from pbs to ssh, the hob runs successfully, but only on the head node of the vsp-compute cluster. I'd like to run instead using the cluster's pbs queue. Any help would be greatly appreciated. > > -Robert > Graduate Student, Pande Lab > Stanford University, Department of Chemistry > > p.s. 
> > My sitess.xml file is > ``` > > > > > > 750 > 1 > default > file > > /scratch/rmcgibbo/swiftwork > > > > > ``` > > My SwiftScript is > ``` > #hello.swift > type file; > > app (file o) uname() { > uname "-a" stdout=@o; > } > file outfile <"uname.txt">; > > outfile = uname(); > ``` > Hey, > > We just heard about the swift project from some colleagues at U > Chicago, and we're interested in trying it out with some of our > compute resources at Stanford to run parallel molecular dynamics and > x-ray scatting simulations. Currently, I'm most interested in setting > up the environment such that I can submit my swift script on a local > workstation, with execution on a few different clusters. The head > nodes of our local clusters are accessible via ssh, and then job > execution is scheduled with pbs. > > When I run swift, it can't seem to find qsub on the cluster. > > rmcgibbo at Roberts-MacBook-Pro-2 ~/projects/swift > $ swift -sites.file sites.xml hello.swift -tc.file tc.data > Swift 0.94 swift-r6492 cog-r3658 > > RunID: 20130603-1704-5xii8svc > Progress: time: Mon, 03 Jun 2013 17:04:10 -0700 > 2013-06-03 17:04:10.735 java[77051:1f07] Loading Maximizer into > bundle: com.apple.javajdk16.cmd > 2013-06-03 17:04:11.410 java[77051:1f07] Maximizer: Unsupported window > created of class: CocoaAppWindow > Progress: time: Mon, 03 Jun 2013 17:04:13 -0700 Stage in:1 > Execution failed: > Exception in uname: > Arguments: [-a] > Host: vsp-compute > Directory: hello-20130603-1704-5xii8svc/jobs/y/uname-ydyn5fal > Caused by: > Cannot submit job: Cannot run program "qsub": error=2, No such file or > directory > uname, hello.swift, line 8 > > When I switch the execution provider from pbs to ssh, the hob runs > successfully, but only on the head node of the vsp-compute cluster. > I'd like to run instead using the cluster's pbs queue. Any help would > be greatly appreciated. > > -Robert > Graduate Student, Pande Lab > Stanford University, Department of Chemistry > > p.s. 
> > My sitess.xml file is > ``` > > > > url="vsp-compute-01.stanford.edu"/> > > 750 > 1 > default > file > > /scratch/rmcgibbo/swiftwork > > > > > ``` > > My SwiftScript is > ``` > #hello.swift > type file; > > app (file o) uname() { > uname "-a" stdout=@o; > } > file outfile <"uname.txt">; > > outfile = uname(); > ``` > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user From iraicu at cs.iit.edu Sat Jun 8 08:30:40 2013 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Sat, 08 Jun 2013 08:30:40 -0500 Subject: [Swift-user] CFP: ACM Many-Task Computing on Clouds, Grids, and Supercomputers (MTAGS) 2013 @ IEEE/ACM Supercomputing/SC 2013 Message-ID: <51B33200.6030000@cs.iit.edu> CALL FOR PAPERS 6th Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers (MTAGS) 2013 http://datasys.cs.iit.edu/events/MTAGS13/ Co-located with IEEE/ACM Supercomputing/SC 2013 Denver Colorado -- November 17th, 2013 Overview ------------------------------------------------------------------------------ The 6th workshop on Many-Task Computing on Grids and Supercomputers (MTAGS) will provide the scientific community a dedicated forum for presenting new research, development, and deployment efforts of large-scale many-task computing (MTC) applications on large scale clusters, Grids, Supercomputers, and Cloud Computing infrastructure. MTC, the theme of the workshop encompasses loosely coupled applications, which are generally composed of many tasks (both independent and dependent tasks) to achieve some larger application goal. 
This workshop will cover challenges that can hamper efficiency and utilization in running applications on large-scale systems, such as local resource manager scalability and granularity, efficient utilization of raw hardware, parallel file system contention and scalability, data management, I/O management, reliability at scale, and application scalability. We welcome paper submissions on all theoretical, simulations, and systems topics related to MTC, but we give special consideration to papers addressing petascale to exascale challenges. Papers will be peer-reviewed, and accepted papers will be published in the workshop proceedings as part of the ACM digital library (pending approval). The workshop will be co-located with the IEEE/ACM Supercomputing 2013 Conference in Denver Colorado on November 18th, 2013. For more information, please see http://datasys.cs.iit.edu/events/MTAGS13/. For more information on past workshops, please see MTAGS12, MTAGS11, MTAGS10, MTAGS09, and MTAGS08. We also ran a Special Issue on Many-Task Computing in the IEEE Transactions on Parallel and Distributed Systems (TPDS) which has appeared in June 2011; the proceedings can be found online at http://www.computer.org/portal/web/csdl/abs/trans/td/2011/06/ttd201106toc.htm. We, the workshop organizers, also published a highly relevant paper that defines Many-Task Computing which was published in MTAGS08, titled Many-Task Computing for Grids and Supercomputers; we encourage potential authors to read this paper, and to clearly articulate in your paper submissions how your papers are related to Many-Task Computing. Topics ------------------------------------------------------------------------------ We invite the submission of original work that is related to the topics below. The papers should be 6 pages, including all figures and references. 
We aim to cover topics related to Many-Task Computing on each of the three major distributed systems paradigms, Cloud Computing, Grid Computing and Supercomputing. Topics of interest include: Compute Resource Management Scheduling Job execution frameworks Local resource manager extensions Performance evaluation of resource managers in use on large scale systems Dynamic resource provisioning Techniques to manage many-core resources and/or GPUs Challenges and opportunities in running many-task workloads on HPC systems Challenges and opportunities in running many-task workloads on Cloud infrastructure Storage architectures and implementations Distributed file systems Parallel file systems Distributed metadata management Content distribution systems for large data Data caching frameworks and techniques Data management within and across data centers Data-aware scheduling Data-intensive computing applications Eventual-consistency storage usage and management Programming models and tools MapReduce and its generalizations Many-task computing middleware and applications Parallel programming frameworks Ensemble MPI techniques and frameworks Service-oriented science applications Large-Scale Workflow Systems Workflow system performance and scalability analysis Scalability of workflow systems Workflow infrastructure and e-Science middleware Programming paradigms and models Large-Scale Many-Task Applications High-throughput computing (HTC) applications Data-intensive applications Quasi-supercomputing applications, deployments, and experiences Performance Evaluation Performance evaluation Real systems Simulations Reliability of large systems How MTC Addresses Challenges of Petascale and Exascale Computing Concurrency & Programmability I/O & Memory Energy Resilience Heterogeneity Important Dates ------------------------------------------------------------------------------ Paper submission: September 1, 2013 Acceptance notification: October 13, 2013 Final papers due: November 
10th, 2013

Paper Submission
------------------------------------------------------------------------------
Authors are invited to submit papers with unpublished, original work of not more than 6 pages of double column text using single spaced 10 point size on 8.5 x 11 inch pages, as per ACM 8.5 x 11 manuscript guidelines; document templates can be found at http://www.acm.org/sigs/publications/proceedings-templates. The final 6 page papers in PDF format must be submitted online at https://cmt.research.microsoft.com/MTAGS2013/ before the deadline of September 1st, 2013 at 11:59PM PST. Papers will be peer-reviewed, and accepted papers will be published in the workshop proceedings as part of the ACM digital library (pending approval). Notifications of the paper decisions will be sent out by October 13th, 2013. Selected excellent work may be eligible for additional post-conference publication as journal articles. Submission implies the willingness of at least one of the authors to register and present the paper.
For more information, please see http://datasys.cs.iit.edu/events/MTAGS13 Organization ------------------------------------------------------------------------------ General Chairs Ioan Raicu, Illinois Institute of Technology & Argonne National Laboratory, USA Ian Foster, University of Chicago & Argonne National Laboratory, USA Yong Zhao, University of Electronic Science and Technology of China, China Justin Wozniak, Argonne National Laboratory, USA Steering Committee David Abramson, Monash University, Australia Jack Dongarra, University of Tennessee, USA Geoffrey Fox, Indiana University, USA Manish Parashar, Rutgers University, USA Marc Snir, Argonne National Laboratory & University of Illinois at Urbana Champaign, USA Xian-He Sun, Illinois Institute of Technology, USA Weimin Zheng, Tsinghua University, China Program Committee Samer Al-Kiswany (University of British Columbia) Mihai Budiu (Microsoft Research) Kyle Chard (University of Chicago) Yong Chen (Texas Tech University) Evangelinos Constantinos (Massachusetts Institute of Technology) Catalin Dumitrescu (Fermi National Labs) Alexandru Iosup (Delft University of Technology - Netherlands) Florin Isaila (Universidad Carlos III de Madrid ) Kamil Iskra (Argonne National Laboratory) Hui Jin (Oracle Corporation) Daniel Katz (University of Chicago) Zhiling Lan (Illinois Institute of Technology) Mike Lang (Los Alamos National Laboratory) Christopher Moretti (Princeton University) Bogdan Nicolae (IBM Research) David O'Hallaron (Carnegie Mellon University & Intel Laboratory) Marlon Pierce (Indiana University) Judy Qiu (Indiana University) Wei Tang (Argonne National Laboratory) Edward Walker (Whitworth University) Matthew Woitaszek (Walmart Labs) Ken Yocum (University of California at San Diego) Zhifeng Yun (Louisiana State University) Zhao Zhang (University of Chicago) Ziming Zheng (Illinois Institute of Technology) -- ================================================================= Ioan Raicu, Ph.D. 
Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================================= Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================================= Editor: IEEE TCC, Springer JoCCASA Chair: IEEE/ACM MTAGS, ACM ScienceCloud, IEEE/ACM DataCloud ================================================================= Cel: 1-847-722-0876 Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ LinkedIn: http://www.linkedin.com/in/ioanraicu Google: http://scholar.google.com/citations?user=jE73HYAAAAAJ ================================================================= ================================================================= From tjlane at stanford.edu Mon Jun 10 20:15:17 2013 From: tjlane at stanford.edu (TJ Lane) Date: Mon, 10 Jun 2013 18:15:17 -0700 Subject: [Swift-user] Advice on mapping input arguments Message-ID: Swift Users, I am wondering if I could get some advice on the best way to do the following in Swift: I want to run a series of simulations performing a parameter scan, for each parameter combination farming the work out to clusters I have access to here at Stanford, and collect the results back onto my desktop. I've gotten some minimal working examples of swift up and running, but hit a roadblock on something quite simple: what's the best way to pass a large number of parameters into a swift script? I have a big list of parameter combinations I'd like to run, and am searching for a sane way to pass all of these into my swift app call. Originally, I thought I'd be able to use the CSV mapper to pass a bunch of arguments from a CSV file into swift -- it seemed perfect! As a bonus, I hoped the CSV file would act as a record of my work, namely what parameters were used to generate what file. 
But it seems that the CSV mapper automatically maps the entries in the CSV file to swift "mapper" objects -- i.e., it expects my CSV data fields are all files, whereas I want some to be ints or floats that get passed directly to the arguments of my command-line script on the slave machine(s).

For concreteness, here is a test CSV I was working with:

coords,qvals,numphi,numshots,nummolec,photons,parallel,output_filename
gold_5nm.coor,gold_qvals.txt,3600,10,1,0.25,12,out0.ring
gold_5nm.coor,gold_qvals.txt,3600,10,2,0.5,12,out1.ring
gold_5nm.coor,gold_qvals.txt,3600,10,4,1.0,12,out2.ring
gold_5nm.coor,gold_qvals.txt,3600,10,8,2.0,12,out3.ring
gold_5nm.coor,gold_qvals.txt,3600,10,16,4.0,12,out4.ring

and my (non-functional) swift script, which will show what I was trying to do:

# shoot.swift

type messagefile;
type pdbfile;
type shotfile;

type shootargs{
  pdbfile coords;
  messagefile qvals;
  int numphi;
  int numshots;
  int nummolec;
  int photons;
  int parallel;
  string output_filename;
}

app (shotfile outputfile) shootsim (shootargs args) {
  polshoot "-s" @args.coords "-f" @args.qvals "-x" args.numphi "-n" args.numshots "-m" args.nummolec "-g" args.photons "-p" args.parallel "-o" @outputfile;
}

shootargs myargs[] ;

foreach a in myargs {
  shotfile o; // this could be something like myargs.output_filename
  o = shootsim(a);
}

I'm wondering if someone who's worked a bit with swift can give me a recommendation on how to proceed. Right now I'm playing with just writing a huge number of flat text files, each one containing the parameter flags that will then get cat'd into the arguments of my command-line script "polshoot" on the slave end. This is inelegant for obvious reasons, since I'll have a huge number of input files and no easy way to keep track of which input matches what output...

If anyone has advice, I'm all ears!

Thanks,

TJ
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From wilde at mcs.anl.gov Mon Jun 10 22:58:55 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 10 Jun 2013 22:58:55 -0500 (CDT)
Subject: [Swift-user] Advice on mapping input arguments
In-Reply-To:
Message-ID: <1138174311.2297621.1370923135254.JavaMail.root@mcs.anl.gov>

Hi TJ,

Here's a quick initial thought. There might be better ways to do this.

The problem, as you mention, is that there is currently no mapper or function that lets you easily handle a mixture of scalars and files in a struct. I think that readData and/or mappers should/could handle this, but I've been outvoted in past discussions of this.

So for now, assuming you have more scalar params than file params, you can just leave the filenames as strings in the parameter struct, then map the smaller number of files from these strings, and call your app() using the param struct to pass all the scalar params, and pass the mapped files as additional input and output parameters.

This retains, I think, the benefit of having a parameter file that provides a handy record of the run's parameters.

Here's an initial example that I *think* is in the spirit of the longer one you supplied below.

If this sounds like a reasonable approach, I think we can use the same technique to run your example.

I provide, I think, all the files you need to test this.

I'll look more carefully at your example now to see if this is indeed what you're trying to do.
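
One practical detail: TJ's test file is comma-separated, while the parameter files read with readData in this thread's examples are whitespace-separated with a header row. A minimal sketch of converting one layout to the other follows. It assumes no field contains an embedded comma or space, and the file names params.csv and params.dat are placeholders chosen for illustration, not names from the original messages. Only the first rows of TJ's data are reproduced.

```shell
# A shortened copy of TJ's CSV (header plus two data rows).
cat > params.csv <<'EOF'
coords,qvals,numphi,numshots,nummolec,photons,parallel,output_filename
gold_5nm.coor,gold_qvals.txt,3600,10,1,0.25,12,out0.ring
gold_5nm.coor,gold_qvals.txt,3600,10,2,0.5,12,out1.ring
EOF

# Replace every comma with a space to get whitespace-separated columns.
# Assumption: fields never contain commas or spaces themselves.
tr ',' ' ' < params.csv > params.dat

# Show the converted table.
cat params.dat
```

Keeping the CSV as the master copy and regenerating the whitespace-separated file from it preserves the single record of which parameters produced which output file.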
- Mike

mid$ cat params.txt
p1 p2 fn1 fn2 ofn
86 99 f1.dat f2.dat row1.out
87 98 f3.dat f4.dat row2.out

mid$ cat structs.swift

type file;

type params {
  int p1;
  int p2;
  string fn1;
  string fn2;
  string ofn;
};

app (file out) echo (params p, file f1, file f2)
{
  echo stdout=@out "params:" p.p1 p.p2 "files:" @f1 @f2;
}

params plist[] = readData("params.txt");

foreach p, i in plist {
  file f1 ;
  file f2 ;
  file of ;
  of = echo(p,f1,f2);
}

mid$ cat ~/.swift/swift.properties

sites.file=sites.xml
tc.file=apps

status.mode=provider
use.provider.staging=false
use.wrapper.staging=false
wrapperlog.always.transfer=true
execution.retries=0
lazy.errors=false
provider.staging.pin.swiftfiles=false
sitedir.keep=true
file.gc.enabled=false
#tcp.port.range=50000,51000

mid$ cat apps
localhost echo echo

mid$ cat sites.xml
/scratch/midway/{env.USER}/swiftwork

mid$ ls -l f?.dat
-rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f1.dat
-rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f2.dat
-rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f3.dat
-rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f4.dat

mid$ swift structs.swift
Swift 0.94 swift-r6414 (swift modified locally) cog-r3648

RunID: 20130611-0343-so454ho1
Progress: time: Tue, 11 Jun 2013 03:43:56 +0000
Final status: Tue, 11 Jun 2013 03:43:56 +0000 Finished successfully:2

mid$ more row?.out
::::::::::::::
row1.out
::::::::::::::
params: 86 99 files: f1.dat f2.dat
::::::::::::::
row2.out
::::::::::::::
params: 87 98 files: f3.dat f4.dat
mid$

----- Original Message -----
> From: "TJ Lane"
> To: swift-user at ci.uchicago.edu
> Sent: Monday, June 10, 2013 8:15:17 PM
> Subject: [Swift-user] Advice on mapping input arguments
>
>
>
>
> Swift Users,
>
> I am wondering if I could get some advice on the best way to do the
> following in Swift: I want to run a series of simulations performing
> a parameter scan, for each parameter combination farming the work
> out to clusters I have access to here at Stanford, and collect the
> results back onto my desktop.
> > I've gotten some minimal working examples of swift up and running, > but hit a roadblock on something quite simple: what's the best way > to pass a large number of parameters into a swift script? I have a > big list of parameter combinations I'd like to run, and am searching > for a sane way to pass all of these into my swift app call. > > Originally, I thought I'd be able to use the CSV mapper to pass a > bunch of arguments from a CSV file into swift -- it seemed perfect! > As a bonus, I hoped the CSV file would act as a record of my work, > namely what parameters were used to generate what file. But it seems > that the CSV mapper automatically maps the entries in the CSV file > to swift "mapper" objects -- i.e., it expects my CSV data fields are > all files, where as I want some to be ints or floats that get passed > directly to the arguments of my command-line script on the slave > machine(s). > > > For concreteness, here is a test CSV I was working with: > > > coords,qvals,numphi,numshots,nummolec,photons,parallel,output_filename > gold_5nm.coor,gold_qvals.txt,3600,10,1,0.25,12,out0.ring > gold_5nm.coor,gold_qvals.txt,3600,10,2,0.5,12,out1.ring > gold_5nm.coor,gold_qvals.txt,3600,10,4,1.0,12,out2.ring > gold_5nm.coor,gold_qvals.txt,3600,10,8,2.0,12,out3.ring > gold_5nm.coor,gold_qvals.txt,3600,10,16,4.0,12,out4.ring > > > and my (non-functional) swift script, which will show what I was > trying to do: > > # shoot.swift > > type messagefile; > type pdbfile; > type shotfile; > > type shootargs{ > pdbfile coords; > messagefile qvals; > int numphi; > int numshots; > int nummolec; > int photons; > int parallel; > string output_filename; > } > > app (shotfile outputfile) shootsim (shootargs args) { > polshoot "-s" @args.coords "-f" @args.qvals "-x" args.numphi "-n" > args.numshots "-m" args.nummolec "-g" args.photons "-p" > args.parallel "-o" @outputfile; > } > > > shootargs myargs[] ; > > foreach a in myargs { > shotfile o; // this could be something like 
myargs.output_filename > o = shootsim(a); > } > > > I'm wondering if someone who's worked a bit with swift can give me a > recommendation on how to proceed. Right now I'm playing with just > writing a huge number of flat text files, each one containing the > parameter flags that will then get cat'd into the arguments of my > command-line script "polshoot" on the slave end. This is inelegant > for obvious reasons, since I'll have a huge number of input files > and no easy way to keep track of which input matches what output... > > > If anyone has advice, I'm all ears! > > > Thanks, > > TJ > > > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user From wilde at mcs.anl.gov Mon Jun 10 23:26:17 2013 From: wilde at mcs.anl.gov (Michael Wilde) Date: Mon, 10 Jun 2013 23:26:17 -0500 (CDT) Subject: [Swift-user] Advice on mapping input arguments In-Reply-To: <1138174311.2297621.1370923135254.JavaMail.root@mcs.anl.gov> Message-ID: <523093867.2298181.1370924777038.JavaMail.root@mcs.anl.gov> TJ, For the specific example you posted, the technique shown in the prior message works well. Here's shoot.swift converted to use that approach. 
- Mike

mid$ cat shoot.swift

type qvalfile;
type coordfile;
type shotfile;

type shootargs{
  string coords;
  string qvals;
  int numphi;
  int numshots;
  int nummolec;
  float photons;
  int parallel;
  string output_filename;
}

app (shotfile outfile) shootsim (shootargs args, coordfile coords, qvalfile qvals) {
  echo "polshoot" "-s" @coords "-f" @qvals "-x" args.numphi "-n" args.numshots
       "-m" args.nummolec "-g" args.photons "-p" args.parallel "-o" @outfile
       stdout=@outfile; # Note double use of outfile, just for testing/demo
}

shootargs myargs[] = readData("particles.dat");

foreach a in myargs {
  shotfile output ;
  coordfile coords ;
  qvalfile qvals ;
  output = shootsim(a, coords, qvals);
}

mid$ cat particles.dat
coords qvals numphi numshots nummolec photons parallel output_filename
gold_5nm.coor gold_qvals.txt 3600 10 1 0.25 12 out0.ring
gold_5nm.coor gold_qvals.txt 3600 10 2 0.5 12 out1.ring
gold_5nm.coor gold_qvals.txt 3600 10 4 1.0 12 out2.ring
gold_5nm.coor gold_qvals.txt 3600 10 8 2.0 12 out3.ring
gold_5nm.coor gold_qvals.txt 3600 10 16 4.0 12 out4.ring

mid$ swift shoot.swift
Swift 0.94 swift-r6414 (swift modified locally) cog-r3648

RunID: 20130611-0423-gfp7pohe
Progress: time: Tue, 11 Jun 2013 04:23:18 +0000
Progress: time: Tue, 11 Jun 2013 04:23:19 +0000 Stage in:1 Finished successfully:4
Final status: Tue, 11 Jun 2013 04:23:19 +0000 Finished successfully:5

mid$ more out?.ring
::::::::::::::
out0.ring
::::::::::::::
polshoot -s gold_5nm.coor -f gold_qvals.txt -x 3600 -n 10 -m 1 -g 0.25 -p 12 -o out0.ring
::::::::::::::
out1.ring
::::::::::::::
polshoot -s gold_5nm.coor -f gold_qvals.txt -x 3600 -n 10 -m 2 -g 0.5 -p 12 -o out1.ring
::::::::::::::
out2.ring
::::::::::::::
polshoot -s gold_5nm.coor -f gold_qvals.txt -x 3600 -n 10 -m 4 -g 1 -p 12 -o out2.ring
::::::::::::::
out3.ring
::::::::::::::
polshoot -s gold_5nm.coor -f gold_qvals.txt -x 3600 -n 10 -m 8 -g 2 -p 12 -o out3.ring
::::::::::::::
out4.ring
::::::::::::::
polshoot -s gold_5nm.coor -f gold_qvals.txt -x 3600 -n 10
-m 16 -g 4 -p 12 -o out4.ring mid$ ----- Original Message ----- > From: "Michael Wilde" > To: "TJ Lane" > Cc: swift-user at ci.uchicago.edu > Sent: Monday, June 10, 2013 10:58:55 PM > Subject: Re: [Swift-user] Advice on mapping input arguments > > Hi TJ, > > Here's a quick initial thought. There might be better ways to do > this. > > The problem as you mention is that there is currently no mapper or > function that lets you easily handle a mixture of scalars and files > in a struct. I think that readData and/or mappers should/could > handle this, but Ive been outvoted in past discussions of this. > > So for now, assuming you have more scalar params than file params, > you can just leave the filenames as strings in the parameter struct, > then map the smaller number of files from these strings, and call > your app() using the param struct to pass all the scalar params, and > pass the mapped files as additional input and output parameters. > > This retains, I think, the benefit of having a parameter file that > provides a handy record of the run's parameters. > > Here's an initial example that I *think* is in the spirit of the > longer one you supplied below. > > If this sounds like a reasonable approach I think we can use the same > technique to run your example. > > I provide I think all the files you need to test this. > > I'll look more carefully at your example now to see if this is indeed > what you're trying to do. 
> > - Mike > > mid$ cat params.txt > p1 p2 fn1 fn2 ofn > 86 99 f1.dat f2.dat row1.out > 87 98 f3.dat f4.dat row2.out > > mid$ cat structs.swift > > type file; > > type params { > int p1; > int p2; > string fn1; > string fn2; > string ofn; > }; > > app (file out) echo (params p, file f1, file f2) > { > echo stdout=@out "params:" p.p1 p.p2 "files:" @f1 @f2; > } > > params plist[] = readData("params.txt"); > > foreach p, i in plist { > file f1 ; > file f2 ; > file of ; > of = echo(p,f1,f2); > } > > mid$ cat ~/.swift/swift.properties > > sites.file=sites.xml > tc.file=apps > > status.mode=provider > use.provider.staging=false > use.wrapper.staging=false > wrapperlog.always.transfer=true > execution.retries=0 > lazy.errors=false > provider.staging.pin.swiftfiles=false > sitedir.keep=true > file.gc.enabled=false > #tcp.port.range=50000,51000 > > mid$ cat apps > localhost echo echo > > mid$ cat sites.xml > > > > > /scratch/midway/{env.USER}/swiftwork > > > > mid$ ls -l f?.dat > -rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f1.dat > -rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f2.dat > -rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f3.dat > -rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f4.dat > > mid$ swift structs.swift > Swift 0.94 swift-r6414 (swift modified locally) cog-r3648 > > RunID: 20130611-0343-so454ho1 > Progress: time: Tue, 11 Jun 2013 03:43:56 +0000 > Final status: Tue, 11 Jun 2013 03:43:56 +0000 Finished > successfully:2 > > mid$ more row?.out > :::::::::::::: > row1.out > :::::::::::::: > params: 86 99 files: f1.dat f2.dat > :::::::::::::: > row2.out > :::::::::::::: > params: 87 98 files: f3.dat f4.dat > mid$ > > > ----- Original Message ----- > > From: "TJ Lane" > > To: swift-user at ci.uchicago.edu > > Sent: Monday, June 10, 2013 8:15:17 PM > > Subject: [Swift-user] Advice on mapping input arguments > > > > > > > > > > Swift Users, > > > > I am wondering if I could get some advice on the best way to do the > > following in Swift: I want to run a series of simulations > > 
performing > > a parameter scan, for each parameter combination farming the work > > out to clusters I have access to here at Stanford, and collect the > > results back onto my desktop. > > > > I've gotten some minimal working examples of swift up and running, > > but hit a roadblock on something quite simple: what's the best way > > to pass a large number of parameters into a swift script? I have a > > big list of parameter combinations I'd like to run, and am > > searching > > for a sane way to pass all of these into my swift app call. > > > > Originally, I thought I'd be able to use the CSV mapper to pass a > > bunch of arguments from a CSV file into swift -- it seemed perfect! > > As a bonus, I hoped the CSV file would act as a record of my work, > > namely what parameters were used to generate what file. But it > > seems > > that the CSV mapper automatically maps the entries in the CSV file > > to swift "mapper" objects -- i.e., it expects my CSV data fields > > are > > all files, where as I want some to be ints or floats that get > > passed > > directly to the arguments of my command-line script on the slave > > machine(s). 
> > > > > > For concreteness, here is a test CSV I was working with: > > > > > > coords,qvals,numphi,numshots,nummolec,photons,parallel,output_filename > > gold_5nm.coor,gold_qvals.txt,3600,10,1,0.25,12,out0.ring > > gold_5nm.coor,gold_qvals.txt,3600,10,2,0.5,12,out1.ring > > gold_5nm.coor,gold_qvals.txt,3600,10,4,1.0,12,out2.ring > > gold_5nm.coor,gold_qvals.txt,3600,10,8,2.0,12,out3.ring > > gold_5nm.coor,gold_qvals.txt,3600,10,16,4.0,12,out4.ring > > > > > > and my (non-functional) swift script, which will show what I was > > trying to do: > > > > # shoot.swift > > > > type messagefile; > > type pdbfile; > > type shotfile; > > > > type shootargs{ > > pdbfile coords; > > messagefile qvals; > > int numphi; > > int numshots; > > int nummolec; > > int photons; > > int parallel; > > string output_filename; > > } > > > > app (shotfile outputfile) shootsim (shootargs args) { > > polshoot "-s" @args.coords "-f" @args.qvals "-x" args.numphi "-n" > > args.numshots "-m" args.nummolec "-g" args.photons "-p" > > args.parallel "-o" @outputfile; > > } > > > > > > shootargs myargs[] ; > > > > foreach a in myargs { > > shotfile o; // this could be something like myargs.output_filename > > o = shootsim(a); > > } > > > > > > I'm wondering if someone who's worked a bit with swift can give me > > a > > recommendation on how to proceed. Right now I'm playing with just > > writing a huge number of flat text files, each one containing the > > parameter flags that will then get cat'd into the arguments of my > > command-line script "polshoot" on the slave end. This is inelegant > > for obvious reasons, since I'll have a huge number of input files > > and no easy way to keep track of which input matches what output... > > > > > > If anyone has advice, I'm all ears! 
> >
> > Thanks,
> >
> > TJ
> >
> > _______________________________________________
> > Swift-user mailing list
> > Swift-user at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user

From tjlane at stanford.edu Tue Jun 11 13:50:18 2013
From: tjlane at stanford.edu (TJ Lane)
Date: Tue, 11 Jun 2013 11:50:18 -0700
Subject: [Swift-user] Advice on mapping input arguments
In-Reply-To: <523093867.2298181.1370924777038.JavaMail.root@mcs.anl.gov>
References: <1138174311.2297621.1370923135254.JavaMail.root@mcs.anl.gov> <523093867.2298181.1370924777038.JavaMail.root@mcs.anl.gov>
Message-ID:

Mike,

Awesome, thanks for your help! Seems to be working like a charm.

TJ

On Mon, Jun 10, 2013 at 9:26 PM, Michael Wilde wrote:

> TJ,
>
> For the specific example you posted, the technique shown in the prior
> message works well.
>
> Here's shoot.swift converted to use that approach.
>
> - Mike
>
> mid$ cat shoot.swift
>
> type qvalfile;
> type coordfile;
> type shotfile;
>
> type shootargs{
>   string coords;
>   string qvals;
>   int numphi;
>   int numshots;
>   int nummolec;
>   float photons;
>   int parallel;
>   string output_filename;
> }
>
> app (shotfile outfile) shootsim (shootargs args, coordfile coords, qvalfile qvals) {
>   echo "polshoot" "-s" @coords "-f" @qvals "-x" args.numphi "-n" args.numshots
>     "-m" args.nummolec "-g" args.photons "-p" args.parallel "-o" @outfile
>     stdout=@outfile; # Note double use of outfile, just for testing/demo
> }
>
> shootargs myargs[] = readData("particles.dat");
>
> foreach a in myargs {
>   shotfile output ;
>   coordfile coords ;
>   qvalfile qvals ;
>   output = shootsim(a, coords, qvals);
> }
>
> mid$ cat particles.dat
> coords qvals numphi numshots nummolec photons parallel output_filename
> gold_5nm.coor gold_qvals.txt 3600 10 1 0.25 12 out0.ring
> gold_5nm.coor gold_qvals.txt 3600 10 2 0.5 12 out1.ring
> gold_5nm.coor gold_qvals.txt 3600 10 4 1.0 12 out2.ring
> gold_5nm.coor gold_qvals.txt 3600 10 8 2.0 12 out3.ring
> gold_5nm.coor gold_qvals.txt 3600 10 16 4.0 12 out4.ring
>
> mid$ swift shoot.swift
> Swift 0.94 swift-r6414 (swift modified locally) cog-r3648
>
> RunID: 20130611-0423-gfp7pohe
> Progress: time: Tue, 11 Jun 2013 04:23:18 +0000
> Progress: time: Tue, 11 Jun 2013 04:23:19 +0000 Stage in:1 Finished successfully:4
> Final status: Tue, 11 Jun 2013 04:23:19 +0000 Finished successfully:5
>
> mid$ more out?.ring
> ::::::::::::::
> out0.ring
> ::::::::::::::
> polshoot -s gold_5nm.coor -f gold_qvals.txt -x 3600 -n 10 -m 1 -g 0.25 -p 12 -o out0.ring
> ::::::::::::::
> out1.ring
> ::::::::::::::
> polshoot -s gold_5nm.coor -f gold_qvals.txt -x 3600 -n 10 -m 2 -g 0.5 -p 12 -o out1.ring
> ::::::::::::::
> out2.ring
> ::::::::::::::
> polshoot -s gold_5nm.coor -f gold_qvals.txt -x 3600 -n 10 -m 4 -g 1 -p 12 -o out2.ring
> ::::::::::::::
> out3.ring
> ::::::::::::::
> polshoot -s gold_5nm.coor -f gold_qvals.txt -x 3600 -n 10 -m 8 -g 2 -p 12 -o out3.ring
> ::::::::::::::
> out4.ring
> ::::::::::::::
> polshoot -s gold_5nm.coor -f gold_qvals.txt -x 3600 -n 10 -m 16 -g 4 -p 12 -o out4.ring
> mid$
>
> ----- Original Message -----
> > From: "Michael Wilde"
> > To: "TJ Lane"
> > Cc: swift-user at ci.uchicago.edu
> > Sent: Monday, June 10, 2013 10:58:55 PM
> > Subject: Re: [Swift-user] Advice on mapping input arguments
> >
> > Hi TJ,
> >
> > Here's a quick initial thought. There might be better ways to do this.
> >
> > The problem as you mention is that there is currently no mapper or
> > function that lets you easily handle a mixture of scalars and files
> > in a struct. I think that readData and/or mappers should/could
> > handle this, but Ive been outvoted in past discussions of this.
> >
> > So for now, assuming you have more scalar params than file params,
> > you can just leave the filenames as strings in the parameter struct,
> > then map the smaller number of files from these strings, and call
> > your app() using the param struct to pass all the scalar params, and
> > pass the mapped files as additional input and output parameters.
> >
> > This retains, I think, the benefit of having a parameter file that
> > provides a handy record of the run's parameters.
> >
> > Here's an initial example that I *think* is in the spirit of the
> > longer one you supplied below.
> >
> > If this sounds like a reasonable approach I think we can use the same
> > technique to run your example.
> >
> > I provide I think all the files you need to test this.
> >
> > I'll look more carefully at your example now to see if this is indeed
> > what you're trying to do.
> >
> > - Mike
> >
> > mid$ cat params.txt
> > p1 p2 fn1 fn2 ofn
> > 86 99 f1.dat f2.dat row1.out
> > 87 98 f3.dat f4.dat row2.out
> >
> > mid$ cat structs.swift
> >
> > type file;
> >
> > type params {
> >   int p1;
> >   int p2;
> >   string fn1;
> >   string fn2;
> >   string ofn;
> > };
> >
> > app (file out) echo (params p, file f1, file f2)
> > {
> >   echo stdout=@out "params:" p.p1 p.p2 "files:" @f1 @f2;
> > }
> >
> > params plist[] = readData("params.txt");
> >
> > foreach p, i in plist {
> >   file f1 ;
> >   file f2 ;
> >   file of ;
> >   of = echo(p,f1,f2);
> > }
> >
> > mid$ cat ~/.swift/swift.properties
> >
> > sites.file=sites.xml
> > tc.file=apps
> >
> > status.mode=provider
> > use.provider.staging=false
> > use.wrapper.staging=false
> > wrapperlog.always.transfer=true
> > execution.retries=0
> > lazy.errors=false
> > provider.staging.pin.swiftfiles=false
> > sitedir.keep=true
> > file.gc.enabled=false
> > #tcp.port.range=50000,51000
> >
> > mid$ cat apps
> > localhost echo echo
> >
> > mid$ cat sites.xml
> >
> > /scratch/midway/{env.USER}/swiftwork
> >
> > mid$ ls -l f?.dat
> > -rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f1.dat
> > -rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f2.dat
> > -rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f3.dat
> > -rw-rw-r-- 1 wilde wilde 4 Jun 10 22:40 f4.dat
> >
> > mid$ swift structs.swift
> > Swift 0.94 swift-r6414 (swift modified locally) cog-r3648
> >
> > RunID: 20130611-0343-so454ho1
> > Progress: time: Tue, 11 Jun 2013 03:43:56 +0000
> > Final status: Tue, 11 Jun 2013 03:43:56 +0000 Finished successfully:2
> >
> > mid$ more row?.out
> > ::::::::::::::
> > row1.out
> > ::::::::::::::
> > params: 86 99 files: f1.dat f2.dat
> > ::::::::::::::
> > row2.out
> > ::::::::::::::
> > params: 87 98 files: f3.dat f4.dat
> > mid$

From tjlane at stanford.edu Sat Jun 15 17:15:04 2013
From: tjlane at stanford.edu (TJ Lane)
Date: Sat, 15 Jun 2013 15:15:04 -0700
Subject: [Swift-user] Debugging Swift Coaster ServiceManager
Message-ID:

Swift Users,

Finally back to trying out swift after a delay -- thanks for all your
help so far.

I've got a functional swift script up and running, and am now trying to
configure my sites.xml to get it running on 4 remote clusters. I've
gotten it working on 2, so 2 more to go!

Let's focus on one first. This cluster is running PBS and I'm trying to
access it using coasters, via provider="ssh-cl:pbs". Unfortunately, it
seems like swift can't boot up the coaster service for some reason,
which I haven't been able to figure out. Maybe someone can help me debug
this, or at least know where to start poking around!

Here's the site xml entry:

key="maxWalltime">00:30:00
key="lowOverAllocation">100
key="highOverAllocation">100
3600
batch
10
1
1
1
1.0
10000
/home/tjlane/swiftwork

and here's what gets printed when I try and run a very basic "hello
cluster" swift script:

tjlane at vspm42 ~/swift_hello
$ swift -sites.file ~/opt/swift-0.94/etc/sites.xml -tc.file ~/opt/swift-0.94/etc/tc.data -config swift.properties uname.swift
Swift started
Swift 0.94 swift-r6492 cog-r3658

RunID: 20130615-1512-h2fskgme
Progress: time: Sat, 15 Jun 2013 15:12:32 -0700
Progress: time: Sat, 15 Jun 2013 15:12:34 -0700 Submitted:1
Execution failed:
Exception in uname:
Arguments: [-a]
Host: biox3
Directory: uname-20130615-1512-h2fskgme/jobs/a/uname-aan4rzal

Caused by:
Could not submit job
Caused by:
Could not start coaster service
Caused by:
Task ended before registration was received.

Failed to start coaster service
java.lang.NullPointerException
    at java.net.URI.compareTo(libgcj.so.10)
    at java.net.URI.compareTo(libgcj.so.10)
    at java.util.TreeMap.compare(libgcj.so.10)
    at java.util.TreeMap.put(libgcj.so.10)
    at java.util.TreeSet.addAll(libgcj.so.10)
    at org.globus.cog.abstraction.coaster.service.job.manager.Settings.setCallbackURIs(Settings.java:403)
    at org.globus.cog.abstraction.coaster.service.job.manager.JobQueue.<init>(JobQueue.java:41)
    at org.globus.cog.abstraction.coaster.service.CoasterService.start(CoasterService.java:148)
    at org.globus.cog.abstraction.coaster.service.CoasterService.main(CoasterService.java:382)
java.lang.NullPointerException
    at java.net.URI.compareTo(libgcj.so.10)
    at java.net.URI.compareTo(libgcj.so.10)
    at java.util.TreeMap.compare(libgcj.so.10)
    at java.util.TreeMap.put(libgcj.so.10)
    at java.util.TreeSet.addAll(libgcj.so.10)
    at org.globus.cog.abstraction.coaster.service.job.manager.Settings.setCallbackURIs(Settings.java:403)
    at org.globus.cog.abstraction.coaster.service.job.manager.JobQueue.<init>(JobQueue.java:41)
    at org.globus.cog.abstraction.coaster.service.CoasterService.start(CoasterService.java:148)
    at org.globus.cog.abstraction.coaster.service.CoasterService.main(CoasterService.java:382)

uname, uname.swift, line 12

Finally, here's part of what gets dumped to my log file:

2013-06-15 14:54:22,350-0700 INFO BootstrapService [/171.67.106.68:39309] GET /coaster-bootstrap.jar HTTP/1.0
2013-06-15 14:54:22,713-0700 INFO ServiceManager Service task Task(type=JOB_SUBMISSION, identity=urn:cog-1371333260175) terminated. Removing service.
2013-06-15 14:54:22,713-0700 INFO ServiceManager Service does not appear to be registered with this manager
2013-06-15 14:54:22,713-0700 INFO ServiceManager Coaster service ended. Reason: null
stdout:
stderr: Failed to start coaster service
java.lang.NullPointerException
    at java.net.URI.compareTo(libgcj.so.10)
    at java.net.URI.compareTo(libgcj.so.10)
    at java.util.TreeMap.compare(libgcj.so.10)
    at java.util.TreeMap.put(libgcj.so.10)
    at java.util.TreeSet.addAll(libgcj.so.10)
    at org.globus.cog.abstraction.coaster.service.job.manager.Settings.setCallbackURIs(Settings.java:403)
    at org.globus.cog.abstraction.coaster.service.job.manager.JobQueue.<init>(JobQueue.java:41)
    at org.globus.cog.abstraction.coaster.service.CoasterService.start(CoasterService.java:148)
    at org.globus.cog.abstraction.coaster.service.CoasterService.main(CoasterService.java:382)
java.lang.NullPointerException
    at java.net.URI.compareTo(libgcj.so.10)
    at java.net.URI.compareTo(libgcj.so.10)
    at java.util.TreeMap.compare(libgcj.so.10)
    at java.util.TreeMap.put(libgcj.so.10)
    at java.util.TreeSet.addAll(libgcj.so.10)
    at org.globus.cog.abstraction.coaster.service.job.manager.Settings.setCallbackURIs(Settings.java:403)
    at org.globus.cog.abstraction.coaster.service.job.manager.JobQueue.<init>(JobQueue.java:41)
    at org.globus.cog.abstraction.coaster.service.CoasterService.start(CoasterService.java:148)
    at org.globus.cog.abstraction.coaster.service.CoasterService.main(CoasterService.java:382)

2013-06-15 14:54:22,714-0700 INFO NotificationManager biox3.stanford.edu
2013-06-15 14:54:22,771-0700 INFO RuntimeStats$ProgressTicker Submitted:1
2013-06-15 14:54:22,775-0700 DEBUG swift APPLICATION_EXCEPTION jobid=uname-d77eqzal - Application exception: Caused by: Could not submit job
Caused by: Could not start coaster service
Caused by: Task ended before registration was received.

Failed to start coaster service
java.lang.NullPointerException
    at java.net.URI.compareTo(libgcj.so.10)
    at java.net.URI.compareTo(libgcj.so.10)
    at java.util.TreeMap.compare(libgcj.so.10)
    at java.util.TreeMap.put(libgcj.so.10)
    at java.util.TreeSet.addAll(libgcj.so.10)
    at org.globus.cog.abstraction.coaster.service.job.manager.Settings.setCallbackURIs(Settings.java:403)
    at org.globus.cog.abstraction.coaster.service.job.manager.JobQueue.<init>(JobQueue.java:41)
    at org.globus.cog.abstraction.coaster.service.CoasterService.start(CoasterService.java:148)
    at org.globus.cog.abstraction.coaster.service.CoasterService.main(CoasterService.java:382)
java.lang.NullPointerException
    at java.net.URI.compareTo(libgcj.so.10)
    at java.net.URI.compareTo(libgcj.so.10)
    at java.util.TreeMap.compare(libgcj.so.10)
    at java.util.TreeMap.put(libgcj.so.10)
    at java.util.TreeSet.addAll(libgcj.so.10)
    at org.globus.cog.abstraction.coaster.service.job.manager.Settings.setCallbackURIs(Settings.java:403)
    at org.globus.cog.abstraction.coaster.service.job.manager.JobQueue.<init>(JobQueue.java:41)
    at org.globus.cog.abstraction.coaster.service.CoasterService.start(CoasterService.java:148)
    at org.globus.cog.abstraction.coaster.service.CoasterService.main(CoasterService.java:382)

Any help or advice on how to resolve this issue, much much appreciated!

Thanks,

TJ

From wilde at mcs.anl.gov Sat Jun 15 18:52:14 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sat, 15 Jun 2013 18:52:14 -0500 (CDT)
Subject: [Swift-user] Debugging Swift Coaster ServiceManager
In-Reply-To:
Message-ID: <1201844867.3625431.1371340334599.JavaMail.root@mcs.anl.gov>

Hi TJ,

Mihael would be able to interpret these errors better than I, but
here are some things to check:

- Does the host on which you are running Swift (presumably a cluster
login host?) have multiple network interfaces? If they are not all
reachable from biox3.stanford.edu, then set GLOBUS_HOSTNAME=hostname,
where hostname is either a DNS name or an IP address of the swift host
that is reachable from biox3.

- If your username on biox3 is different from your username on the
swift host, set it in $HOME/.ssh/config:

Host biox3.stanford.edu
    Hostname biox3.stanford.edu
    User TJsOtherUsername

- Also set the workdirectory accordingly for the remote host (i.e., if
your username there is not tjlane).

- If biox3 cannot connect back to the swift host on any anonymous
port (e.g., due to firewall rules), set the valid port range:

export GLOBUS_TCP_PORT_RANGE=50000,51000 # for example
export GLOBUS_TCP_SOURCE_RANGE=50000,51000

- Make sure java is in your path or available on biox3 by default. I
*think* that swift coaster bootstrap causes your login shell's
profile/rc to run. Need to check that. Make sure it's a reasonable
Java: Sun/Oracle 1.6 or later.
I see some traces of GCJ in the traceback; that's a possible problem:
("libgcj.so.10")

- Try ssh'ing a simple test to biox3 and make sure, for example, that
your login there can write to scratch directories.

- Make sure that biox3 has a queue named "batch" and that it accepts
one-hour jobs (maxtime setting).

Several of these problems (e.g., the last one) won't stop the coaster
service from starting, but would prevent coaster workers from
starting.
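
The Java item in the checklist above could be checked with something like the following. This is only a rough sketch under stated assumptions, not a tested recipe for biox3: `is_gcj` and `check_remote_java` are hypothetical helper names (not part of Swift), and the match patterns are guesses based on the "libgcj" frames in the traceback and the "gij" launcher name.

```shell
#!/bin/sh
# Sketch only: decide whether `java -version` text comes from GCJ/gij.
# The patterns are an assumption, chosen to match the "libgcj" frames
# seen in the traceback and the "gij" launcher name.
is_gcj() {
    case "$1" in
        *libgcj*|*gij*|*gcj*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper (not part of Swift): run it on the Swift host.
# It uses a non-interactive ssh, so it reports the java that a remotely
# launched command finds, which may differ from an interactive login.
check_remote_java() {
    remote="${1:-biox3.stanford.edu}"
    # java -version writes to stderr, hence 2>&1
    if ! out=$(ssh "$remote" 'java -version 2>&1'); then
        echo "could not run java on $remote"
        return 2
    fi
    if is_gcj "$out"; then
        echo "GCJ found on $remote; put a Sun/Oracle Java 1.6+ first in PATH"
        return 1
    fi
    echo "java on $remote does not look like GCJ:"
    echo "$out"
}
```

After sourcing the file, `check_remote_java biox3.stanford.edu` would print the remote Java banner or a warning. Whether the coaster bootstrap sees the same PATH as this ssh command is exactly the open question raised above, so a clean result here is a hint rather than proof.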

Mihael can likely diagnose which of these or other routes are most
likely the cause, and what evidence to look for on biox3.

Lastly, I see that both the IP listed for your swift host and biox3
are pingable on the public net, so it's not likely the first problem
(IP reachability).

- Mike

From wilde at mcs.anl.gov Sat Jun 15 19:11:39 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sat, 15 Jun 2013 19:11:39 -0500 (CDT)
Subject: [Swift-user] Debugging Swift Coaster ServiceManager
In-Reply-To: <1201844867.3625431.1371340334599.JavaMail.root@mcs.anl.gov>
Message-ID: <1630644704.3625706.1371341499376.JavaMail.root@mcs.anl.gov>

The more I look at the error the more it looks like the coaster service
on biox3 is getting a null pointer exception "at
java.net.URI.compareTo(libgcj.so.10)" and that may be the root cause.

We don't, to my knowledge, test on open source Javas yet, although we
have discussed this. We used to see lots of incompatibilities with Swift
on them, but these have been diminishing.

Can you test again after making sure that Java 1.6 or higher is in your
PATH on biox3?

Thanks,

- Mike
> >
> > Thanks,
> >
> > TJ
> >
> > _______________________________________________
> > Swift-user mailing list
> > Swift-user at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user

From hategan at mcs.anl.gov Sat Jun 15 22:06:01 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sat, 15 Jun 2013 20:06:01 -0700
Subject: [Swift-user] Debugging Swift Coaster ServiceManager
In-Reply-To: <1630644704.3625706.1371341499376.JavaMail.root@mcs.anl.gov>
References: <1630644704.3625706.1371341499376.JavaMail.root@mcs.anl.gov>
Message-ID: <1371351961.20565.15.camel@echo>

On Sat, 2013-06-15 at 19:11 -0500, Michael Wilde wrote:
> The more I look at the error, the more it looks like the coaster service on biox3 is getting a null pointer exception "at java.net.URI.compareTo(libgcj.so.10)" and that may be the root cause.
>
> We don't, to my knowledge, test on open source Javas yet, although we have discussed this.
>
> We used to see lots of incompatibilities with Swift on them, but these have been diminishing.
>
> Can you test again after making sure that Java 1.6 or higher is in your PATH on biox3?

Right. We don't support the GNU java thing. Things should, however, work on OpenJDK or IcedTea.

Mihael

From tjlane at stanford.edu Sat Jun 15 23:23:06 2013
From: tjlane at stanford.edu (TJ Lane)
Date: Sat, 15 Jun 2013 21:23:06 -0700
Subject: [Swift-user] Debugging Swift Coaster ServiceManager
In-Reply-To: <1371351961.20565.15.camel@echo>
References: <1630644704.3625706.1371341499376.JavaMail.root@mcs.anl.gov> <1371351961.20565.15.camel@echo>
Message-ID:

Awesome, thanks guys -- makes sense. I'll see if I can get our sysadmin to install it and get back to you if it doesn't work.
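The diagnosis above (a stack trace through libgcj, and GNU Java not being supported) can be checked before rerunning Swift. The following is only a rough shell sketch, not part of Swift: the function names `is_gnu_java` and `check_java` are made up for illustration, and it assumes that GNU Java identifies itself with the strings "gij" or "gcj" in its `java -version` output.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given `java -version` text looks like
# GNU Java. Assumption: GNU Java prints "gij", "gcj" or "libgcj" somewhere.
is_gnu_java() {
    printf '%s\n' "$1" | grep -qiE 'gij|gcj'
}

# Hypothetical helper: inspect whichever java comes first in PATH.
# `java -version` writes to stderr, so fold stderr into stdout to capture it.
check_java() {
    version_text=$(java -version 2>&1) || {
        echo "no usable java found in PATH"
        return 1
    }
    if is_gnu_java "$version_text"; then
        echo "GNU Java found: put an OpenJDK or Oracle JDK earlier in PATH"
        return 1
    fi
    echo "java does not look like GNU Java"
}
```

Running `check_java` in a non-interactive login on the cluster head node would show what the coaster service is likely to pick up, provided the shell startup files set PATH for non-interactive sessions as well.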
On Sat, Jun 15, 2013 at 8:06 PM, Mihael Hategan wrote:

> On Sat, 2013-06-15 at 19:11 -0500, Michael Wilde wrote:
> > The more I look at the error, the more it looks like the coaster service
> > on biox3 is getting a null pointer exception "at
> > java.net.URI.compareTo(libgcj.so.10)" and that may be the root cause.
> >
> > We don't, to my knowledge, test on open source Javas yet, although we
> > have discussed this.
> >
> > We used to see lots of incompatibilities with Swift on them, but these
> > have been diminishing.
> >
> > Can you test again after making sure that Java 1.6 or higher is in your
> > PATH on biox3?
>
> Right. We don't support the GNU java thing. Things should, however, work
> on OpenJDK or IcedTea.
>
> Mihael

From wilde at mcs.anl.gov Sat Jun 15 23:37:35 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sat, 15 Jun 2013 23:37:35 -0500 (CDT)
Subject: [Swift-user] Debugging Swift Coaster ServiceManager
In-Reply-To:
Message-ID: <897752982.3636618.1371357455960.JavaMail.root@mcs.anl.gov>

TJ, it's pretty easy to just install it yourself and add your local copy to your PATH via your shell startup files.

Grab the JDK SE release that's appropriate for biox3 from here:

http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html

probably this one: jdk-7u21-linux-x64.tar.gz

Just untar it and run the installer. Test it with bin/java -version. Then add its bin/ dir to your PATH.

Test whether swift/coasters will see the right version by doing:

  ssh biox3.stanford.edu "which java; java -version"

- Mike

----- Original Message -----
> From: "TJ Lane"
> To: "Mihael Hategan"
> Cc: "Michael Wilde" , swift-user at ci.uchicago.edu
> Sent: Saturday, June 15, 2013 11:23:06 PM
> Subject: Re: [Swift-user] Debugging Swift Coaster ServiceManager
>
> Awesome, thanks guys -- makes sense. I'll see if I can get our
> sysadmin to install it and get back to you if it doesn't work.
>
> On Sat, Jun 15, 2013 at 8:06 PM, Mihael Hategan < hategan at mcs.anl.gov > wrote:
>
> On Sat, 2013-06-15 at 19:11 -0500, Michael Wilde wrote:
> > The more I look at the error, the more it looks like the coaster
> > service on biox3 is getting a null pointer exception "at
> > java.net.URI.compareTo(libgcj.so.10)" and that may be the root
> > cause.
> >
> > We don't, to my knowledge, test on open source Javas yet, although we
> > have discussed this.
> >
> > We used to see lots of incompatibilities with Swift on them, but
> > these have been diminishing.
> >
> > Can you test again after making sure that Java 1.6 or higher is in
> > your PATH on biox3?
>
> Right. We don't support the GNU java thing. Things should, however, work
> on OpenJDK or IcedTea.
>
> Mihael

From wilde at mcs.anl.gov Sat Jun 15 23:43:23 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sat, 15 Jun 2013 23:43:23 -0500 (CDT)
Subject: [Swift-user] Debugging Swift Coaster ServiceManager
In-Reply-To: <897752982.3636618.1371357455960.JavaMail.root@mcs.anl.gov>
Message-ID: <29183034.3636732.1371357803763.JavaMail.root@mcs.anl.gov>

> Just untar it and run the installer. Test it with bin/java -version.

Correction: there is no installer. Just untar it:

  cd $HOME/something
  tar zxf jdk-7u21-linux-x64.tar.gz
  PATH=$PWD/jdk1.7.0_21/bin:$PATH   # <== in .bashrc, .bash_profile, as appropriate

- Mike

From tjlane at stanford.edu Sun Jun 16 00:14:23 2013
From: tjlane at stanford.edu (TJ Lane)
Date: Sat, 15 Jun 2013 22:14:23 -0700
Subject: [Swift-user] Debugging Swift Coaster ServiceManager
In-Reply-To: <29183034.3636732.1371357803763.JavaMail.root@mcs.anl.gov>
References: <897752982.3636618.1371357455960.JavaMail.root@mcs.anl.gov> <29183034.3636732.1371357803763.JavaMail.root@mcs.anl.gov>
Message-ID:

Mike,

Yep, that was a lot easier than building from source, which is what I was trying to do earlier lol. Java appears to be working now (yay) but I think I have a problem somewhere else.
Let me poke around and if I am still stuck I'll ping you guys again!

Thanks for all the help so far!

TJ

On Sat, Jun 15, 2013 at 9:43 PM, Michael Wilde wrote:

> > Just untar it and run the installer. Test it with bin/java -version.
>
> Correction: there is no installer. Just untar it:
>
> cd $HOME/something
> tar zxf jdk-7u21-linux-x64.tar.gz
> PATH=$PWD/jdk1.7.0_21/bin:$PATH # <== in .bashrc, .bash_profile, as appropriate
>
> - Mike

From wilde at mcs.anl.gov Thu Jun 27 10:39:46 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 27 Jun 2013 10:39:46 -0500 (CDT)
Subject: [Swift-user] Linux.com posting on Swift
Message-ID: <593994191.6397677.1372347586768.JavaMail.root@mcs.anl.gov>

A perspective on Swift and parallel programming from the Linux Foundation:

https://www.linux.com/news/featured-blogs/200-libby-clark/725638-swift-the-easy-scripting-language-for-parallel-computing

From davidk at ci.uchicago.edu Fri Jun 28 13:55:53 2013
From: davidk at ci.uchicago.edu (David Kelly)
Date: Fri, 28 Jun 2013 13:55:53 -0500 (CDT)
Subject: [Swift-user] Mapper padding
In-Reply-To: <194311737.8720317.1372444324272.JavaMail.root@ci.uchicago.edu>
Message-ID: <2028061335.8726828.1372445753964.JavaMail.root@ci.uchicago.edu>

Hello,

I'm writing a Swift script where I'm trying to use the value of an integer to find and map a file. The filename I'm trying to map has the number in it, but is always padded with 0s to make it exactly 9 digits long. The filename for 64 would be 000000064.csv, for example. My number can be any value from 0 to 999999999.

Is there a way to easily build the filename string using any of the existing string libraries? As far as I know we don't have a printf-like way to format this at the moment. Is there a way to do this with mapper padding?

I can do it with an ext mapper, just wondering if there's a better way.
Thanks,
David

From davidk at ci.uchicago.edu Fri Jun 28 15:53:01 2013
From: davidk at ci.uchicago.edu (David Kelly)
Date: Fri, 28 Jun 2013 15:53:01 -0500 (CDT)
Subject: [Swift-user] Mapper padding
In-Reply-To: <2028061335.8726828.1372445753964.JavaMail.root@ci.uchicago.edu>
Message-ID: <404785118.8751812.1372452781194.JavaMail.root@ci.uchicago.edu>

Mike suggested I try using regular expressions. This works by prepending eight 0s to my number (so that even a one-digit number ends up at least 9 characters long), then using a regular expression to get the last 9 digits.

  string longname = @strcat("00000000", i);
  string filename = @strcut(longname, "([0-9]........$)");

I tried to use [0-9]{8}$, but the parser got confused and thought I was trying to reference a variable called 8, so I had to use the dots.

----- Original Message -----
> From: "David Kelly"
> To: "swift user"
> Sent: Friday, June 28, 2013 1:55:53 PM
> Subject: [Swift-user] Mapper padding
>
> Hello,
>
> I'm writing a swift script where I'm trying to use the value of an
> integer to find and map a file. The filename I'm trying to map has
> the number in it, but is always padded with 0s to make it exactly 9
> digits long. The filename for 64 would be 000000064.csv, for
> example. My number can be any value from 0 to 999999999.
>
> Is there a way to easily build the filename string using any of the
> existing string libraries? As far as I know we don't have a
> printf-like way to format this at the moment.
>
> Is there a way to do this with mapper padding?
>
> I can do it with an ext mapper, just wondering if there's a better
> way.
>
> Thanks,
> David
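The padding trick described in this thread (prepend zeros, then keep only the tail with a regular expression) can be tried outside Swift. The sketch below is plain shell rather than SwiftScript, and `pad9` is a made-up name used only for illustration. It assumes the input is a non-negative integer of at most 9 digits, as in the original question.

```shell
#!/bin/sh
# Hypothetical helper: left-pad a non-negative integer to exactly 9 digits.
pad9() {
    # Prepend eight zeros so that even a one-digit input reaches nine
    # characters, then let sed keep only the last nine digits.
    printf '00000000%s\n' "$1" | sed -E 's/^.*([0-9]{9})$/\1/'
}

# Build the file name from the question: 64 -> 000000064.csv
echo "$(pad9 64).csv"   # prints 000000064.csv
```

Eight zeros are the minimum prefix that covers the whole range 0 to 999999999: a shorter prefix leaves one-digit inputs with fewer than nine characters, so the nine-character pattern has nothing to match.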