[opa-nightly-tests] jam OPA_Daily_Tests_0404Mon_FAILED!!!

Dave Goodell goodell at mcs.anl.gov
Mon Apr 4 14:25:54 CDT 2011


Neil, any idea what's up here?  These look like real failures in the queue test, which could be caused by one or more of the following things:

1) bad queue code

2) bad queue test code

3) bad build system / #ifdef code that ends up selecting the wrong primitives implementation

4) an actual bug in the x86_64 primitives that hasn't been flushed out yet by other tests

5) hardware issues

This looks like a new machine in the testing lineup.  How much confidence do you have in amani?
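
One way to separate items 4 and 5 from items 1 and 2 might be a stress test that does not touch the queue code at all. The sketch below is only an illustration and is not part of the OPA test suite. It uses C11 <stdatomic.h> and POSIX threads as stand-ins for the OPA primitives, and the names run_counter_stress, worker, and worker_arg_t are hypothetical. Each thread atomically increments a shared counter a fixed number of times, and the final value should equal threads times iterations.

```c
/* Standalone atomic-increment stress check, independent of OPA.
 * Assumption: C11 atomics and pthreads stand in for the OPA
 * primitives; no OPA function is called here. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct {
    atomic_long *counter;  /* shared counter incremented by every thread */
    long iters;            /* number of increments each thread performs */
} worker_arg_t;

static void *worker(void *p)
{
    worker_arg_t *arg = p;
    for (long i = 0; i < arg->iters; i++)
        atomic_fetch_add(arg->counter, 1);
    return NULL;
}

/* Returns the final counter value, or -1 if setup failed.
 * On a healthy machine the result is exactly nthreads * iters. */
long run_counter_stress(int nthreads, long iters)
{
    assert(nthreads > 0 && iters >= 0);

    atomic_long counter;
    atomic_init(&counter, 0);
    worker_arg_t arg = { &counter, iters };

    pthread_t *tids = malloc((size_t)nthreads * sizeof *tids);
    if (tids == NULL)
        return -1;

    /* Start as many threads as possible, then join exactly those. */
    int started = 0;
    while (started < nthreads &&
           pthread_create(&tids[started], NULL, worker, &arg) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    free(tids);

    if (started != nthreads)
        return -1;
    return atomic_load(&counter);
}
```

Calling run_counter_stress(10, 100000) repeatedly on amani and comparing the result against 1000000 would exercise only the compiler's atomics and the hardware. The thread and iteration counts are arbitrary. A mismatch there would point away from the queue and queue test code, while a clean run would not by itself rule out a bug in the OPA x86_64 primitives, since this sketch does not use them.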

-Dave

On Apr 4, 2011, at 12:22 AM CDT, HDF Tester wrote:

> *** OPA Tests on 0404Mon ***
> =============================
>   Tests Summary
> =============================
> ****FAILED amani: standard****
> 
> PASSED duty: standard
> PASSED heiwa: standard
> PASSED jam: standard
> PASSED linew: standard
> 
> =============================
>   Tests Time Summary
> =============================
> jam: Ran 1(1/0/0) tests, Grand total test time =  1m 22s
> amani: Ran 1(0/1/0) tests, Grand total test time =  2m 16s
> heiwa: Ran 1(1/0/0) tests, Grand total test time =  5m 37s
> duty: Ran 1(1/0/0) tests, Grand total test time =  5m 51s
> linew: Ran 1(1/0/0) tests, Grand total test time =  11m 40s
> jam: Ran 6(0/0/0) hosts, Grand total test time =  12m 37s
> 
> 
> =============================
>   Timekeeper log
> =============================
> Timekeeper started at Mon Apr  4 00:10:29 CDT 2011
> Timekeeper sleeping for 1800 seconds
> 
> 
> =============================
>   Tests Failures
> =============================
> =========================
> Dumping logfile of amani: standard
> Last 50 lines of /mnt/scr1/SnapTest/snapshots-opa/log/amani_0404Mon_0010
> =========================
>    LL/SC not available
> Testing pointer LL/SC stack                                            -SKIP-
>    LL/SC not available
> All primitives tests passed.
> PASS: test_primitives
> Testing memory barrier sanity                                          PASSED
> Testing memory barriers with linear array with 2 threads               PASSED
> Testing memory barriers with local variables with 2 threads            PASSED
> Testing memory barriers with scattered array with 2 threads            PASSED
> Testing memory barriers with linear array with 4 threads               PASSED
> Testing memory barriers with local variables with 4 threads            PASSED
> Testing memory barriers with scattered array with 4 threads            PASSED
> Testing memory barriers with linear array with 10 threads              PASSED
> Testing memory barriers with local variables with 10 threads           PASSED
> Testing memory barriers with scattered array with 10 threads           PASSED
> Testing memory barriers with linear array with 100 threads             PASSED
> Testing memory barriers with local variables with 100 threads          PASSED
> Testing memory barriers with scattered array with 100 threads          PASSED
> All barriers tests passed.
> PASS: test_barriers
> Testing queue sanity                                                   PASSED
> Testing multithreaded queue with 2 threads                             PASSED
> Testing multithreaded queue (empty queue) with 2 threads               PASSED
> Testing multithreaded queue (full queue) with 2 threads                PASSED
> Testing multithreaded queue with 4 threads                             PASSED
> Testing multithreaded queue (empty queue) with 4 threads               PASSED
> Testing multithreaded queue (full queue) with 4 threads                PASSED
> Testing multithreaded queue with 10 threads                            PASSED
> Testing multithreaded queue (empty queue) with 10 threads                 Incorrect number of elements dequeued: 4031909 Expected: 4500000
> *FAILED*
>        at /home/hdftest/snapshots-opa/current/test/test_queue.c:399 in test_queue_threaded()...
>    Unexpected return from 1 thread
> Testing multithreaded queue (full queue) with 10 threads               PASSED
> Testing multithreaded queue with 100 threads                           PASSED
> Testing multithreaded queue (empty queue) with 100 threads             PASSED
> Testing multithreaded queue (full queue) with 100 threads              PASSED
> ***** 1 QUEUE TEST FAILED! *****
> FAIL: test_queue
> ===================================================================
> 1 of 4 tests failed
> Please report to https://trac.mcs.anl.gov/projects/openpa/newticket
> ===================================================================
> gmake[2]: *** [check-TESTS] Error 1
> gmake[2]: Leaving directory `/scr/hdftest/snapshots-opa/TestDir/amani/test'
> gmake[1]: *** [check-am] Error 2
> gmake[1]: Leaving directory `/scr/hdftest/snapshots-opa/TestDir/amani/test'
> gmake: *** [check-recursive] Error 1
> Failed running make check
> ===== Exit bin/snapshot with status=2: Mon Apr  4 00:13:06 CDT 2011 =====
> Mon Apr  4 00:13:06 CDT 2011
> =========================
> Dumping done
> =========================
> 
> Runtest did not exit normally.
> 
> =============================
>   Watchers List
> =============================
> OPA Daily test features/platforms watchers and procedure
> ---------------------------------------------------------
> 
> Procedure:
> The watcher will investigate and report the cause of failure by 11am.
> The developer who checked in the error code may report so by then too.
> The watcher or the developer should get the failure fixed and report it
> by 3pm.
> 
> 
> Watcher for OPA:	 	Neil
> 
> 
> ---
> updated: 2009/05/05
> 
> =============================
>   Tests Details
> =============================
> 00:10:09 up 59 days, 14:42, 74 users,  load average: 1.90, 1.98, 2.20
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sda3             31738420   4065196  26034996  14% /
> /dev/sda1               101086     22885     72982  24% /boot
> /dev/sda2             31738420    209024  29891168   1% /tmp
> /dev/sda6             31738392   5959784  24140384  20% /var
> /dev/sda7             31738392  30271700         0 100% /usr
> /dev/sda8            124991068   4814828 113724540   5% /var/tmp
> /dev/mapper/VolGroup00-home
>                     198351840  15132624 172980856   9% /home
> /dev/sde1            565688764 380484688 156005140  71% /scr
> /dev/sdc1            961432072 818852580  93741492  90% /mnt/scr1
> /dev/sdd1            961432072 885174524  27419548  97% /mnt/hdf
> tmpfs                  8313848        16   8313832   1% /dev/shm
> gumund:/data/ftp     480719072 285632320 170667552  63% /mnt/ftp
> gumund:/data/web     480719072 285632320 170667552  63% /mnt/web
> amani:/mnt/rw-src    288451232 179573408  94225344  66% /mnt/ro-src
> amani:/mnt/rw-src    288451232 179573408  94225344  66% /mnt/rw-src
> STANDARD_OPT=op-configure --prefix=${PWD}/opainstall
> TEST_TYPES=standard
> 
> Running source repository checkout with output saved in
>   /mnt/scr1/SnapTest/snapshots-opa/log/REPO_LOG_0404Mon
> Checking MANIFEST file ...
> cat: /mnt/scr1/SnapTest/snapshots-opa/log/#runtest.0404Mon.8656: No such file or directory
> rm: cannot remove `/mnt/scr1/SnapTest/snapshots-opa/log/#runtest.0404Mon.8656': No such file or directory
> 
> Mon Apr  4 00:10:29 CDT 2011
> *** launching tests from jam ***
> 
> TESTHOST is linew
> jam
> amani
> heiwa
> duty
> liberty
>    Fork off timekeeper 30
> cannot remote command with liberty
> ==============
> Testing linew
> ==============
> ssh linew -n cd /home/hdftest/snapshots-opa;/mnt/scr1/SnapTest/snapshots-opa/bin/runtest -nodiff -norepo -configname linew
> 12:10am  up 23 day(s),  6:07,  4 users,  load average: 4.88, 4.15, 3.59
> /                  (/dev/dsk/c1t0d0s0 ):71487500 blocks  8273143 files
> /devices           (/devices          ):       0 blocks        0 files
> /system/contract   (ctfs              ):       0 blocks 2147483605 files
> /proc              (proc              ):       0 blocks    29893 files
> /etc/mnttab        (mnttab            ):       0 blocks        0 files
> /etc/svc/volatile  (swap              ):20458480 blocks  1182196 files
> /system/object     (objfs             ):       0 blocks 2147483493 files
> /etc/dfs/sharetab  (sharefs           ):       0 blocks 2147483646 files
> /platform/sun4u-us3/lib/libc_psr.so.1(/platform/sun4u-us3/lib/libc_psr/libc_psr_hwcap1.so.1):71487500 blocks  8273143 files
> /platform/sun4u-us3/lib/sparcv9/libc_psr.so.1(/platform/sun4u-us3/lib/sparcv9/libc_psr/libc_psr_hwcap1.so.1):71487500 blocks  8273143 files
> /dev/fd            (fd                ):       0 blocks        0 files
> /tmp               (swap              ):20458480 blocks  1182196 files
> /var/run           (swap              ):20458480 blocks  1182196 files
> /scr               (/dev/dsk/c1t1d0s0 ):896681508 blocks 226811321 files
> /home              (jam:/home         ):366438432 blocks 50913130 files
> /mnt/hdf           (jam:/mnt/hdf      ):152515096 blocks 114002233 files
> /mnt/scr1          (jam:/mnt/scr1     ):285157600 blocks 116085930 files
> /mnt/web           (gumund:/mnt/web   ):390173488 blocks 60893585 files
> /mnt/ftp           (gumund:/mnt/ftp   ):390173488 blocks 60893585 files
> STANDARD_OPT=op-configure --prefix=${PWD}/opainstall
> TEST_TYPES=standard
> 
> Mon Apr  4 00:11:01 CDT 2011
> *** starting standard tests in linew ***
> Uname -a: SunOS linew 5.10 Generic_144488-07 sun4u sparc SUNW,A70
> Running snapshot with output saved in
>   /mnt/scr1/SnapTest/snapshots-opa/log/linew_0404Mon_0011
> PASSED linew: standard
> *** finished standard tests for linew ***
> Mon Apr  4 00:22:23 CDT 2011
> Total time = 11m 24s
> 
> *** finished tests in linew ***
> Mon Apr  4 00:22:25 CDT 2011
> linew: Ran 1(1/0/0) tests, Grand total test time =  11m 40s
> 
> 
> ==============
> Testing jam
> ==============
> ssh jam -n cd /home/hdftest/snapshots-opa;/mnt/scr1/SnapTest/snapshots-opa/bin/runtest -nodiff -norepo -configname jam
> 00:10:37 up 59 days, 14:42, 74 users,  load average: 2.20, 2.04, 2.21
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sda3             31738420   4065196  26034996  14% /
> /dev/sda1               101086     22885     72982  24% /boot
> /dev/sda2             31738420    209144  29891048   1% /tmp
> /dev/sda6             31738392   5959792  24140376  20% /var
> /dev/sda7             31738392  30271700         0 100% /usr
> /dev/sda8            124991068   4814828 113724540   5% /var/tmp
> /dev/mapper/VolGroup00-home
>                     198351840  15132624 172980856   9% /home
> /dev/sde1            565688764 380484688 156005140  71% /scr
> /dev/sdc1            961432072 818853280  93740792  90% /mnt/scr1
> /dev/sdd1            961432072 885174524  27419548  97% /mnt/hdf
> tmpfs                  8313848        16   8313832   1% /dev/shm
> gumund:/data/ftp     480719072 285632320 170667552  63% /mnt/ftp
> gumund:/data/web     480719072 285632320 170667552  63% /mnt/web
> amani:/mnt/rw-src    288451232 179573408  94225344  66% /mnt/ro-src
> amani:/mnt/rw-src    288451232 179573408  94225344  66% /mnt/rw-src
> STANDARD_OPT=op-configure --prefix=${PWD}/opainstall
> TEST_TYPES=standard
> 
> Mon Apr  4 00:10:47 CDT 2011
> *** starting standard tests in jam ***
> Uname -a: Linux jam 2.6.18-194.3.1.el5PAE #1 SMP Thu May 13 13:48:44 EDT 2010 i686 i686 i386 GNU/Linux
> Running snapshot with output saved in
>   /mnt/scr1/SnapTest/snapshots-opa/log/jam_0404Mon_0010
> PASSED jam: standard
> *** finished standard tests for jam ***
> Mon Apr  4 00:12:09 CDT 2011
> Total time = 1m 22s
> 
> *** finished tests in jam ***
> Mon Apr  4 00:12:09 CDT 2011
> jam: Ran 1(1/0/0) tests, Grand total test time =  1m 22s
> 
> 
> ==============
> Testing amani
> ==============
> ssh amani -n cd /home/hdftest/snapshots-opa;/mnt/scr1/SnapTest/snapshots-opa/bin/runtest -nodiff -norepo -configname amani
> 00:10:40 up 48 days, 10:40,  3 users,  load average: 1.02, 1.15, 1.92
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/hda6             39021196  29788760   7218292  81% /
> /dev/hda3             12192636   4142712   7420580  36% /var
> /dev/hda2             12192636    162848  11400444   2% /tmp
> /dev/hda1               101086     23784     72083  25% /boot
> tmpfs                  2057868         0   2057868   0% /dev/shm
> jam:/home            198351840  15132608 172980864   9% /home
> jam:/mnt/hdf         961432096 885174528  27419552  97% /mnt/hdf
> jam:/mnt/scr1        961432096 818853408  93740672  90% /mnt/scr1
> smirom:/scr          267601440  37497024 216291712  15% /mnt/tmp
> gumund:/data/ftp     480719072 285632320 170667552  63% /mnt/ftp
> gumund:/data/web     480719072 285632320 170667552  63% /mnt/web
> /dev/sdb1            144221592  75309328  61586224  56% /scr
> /dev/sda1            288451232 179573436  94225316  66% /mnt/rw-src
> STANDARD_OPT=op-configure --prefix=${PWD}/opainstall
> TEST_TYPES=standard
> 
> Mon Apr  4 00:10:50 CDT 2011
> *** starting standard tests in amani ***
> Uname -a: Linux amani 2.6.18-194.32.1.el5 #1 SMP Wed Jan 5 17:52:25 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
> Running snapshot with output saved in
>   /mnt/scr1/SnapTest/snapshots-opa/log/amani_0404Mon_0010
> 	*************************************
> 	Mon Apr  4 00:13:06 CDT 2011
> 	****FAILED amani: standard****
> 	*************************************
> *** finished standard tests for amani ***
> Mon Apr  4 00:13:06 CDT 2011
> Total time = 2m 16s
> 
> *** finished tests in amani ***
> Mon Apr  4 00:13:06 CDT 2011
> amani: Ran 1(0/1/0) tests, Grand total test time =  2m 16s
> 
> ****SYSTEM ERROR amani: Abnormal exit from runtest ****
> 
> 	*************************************
> 	Mon Apr  4 00:22:56 CDT 2011
> 	****SYSTEM ERROR amani: runtest command failed ****
> 	*************************************
> 
> ==============
> Testing heiwa
> ==============
> ssh heiwa -n cd /home/hdftest/snapshots-opa;/mnt/scr1/SnapTest/snapshots-opa/bin/runtest -nodiff -norepo -configname heiwa
> 00:10:36 up 30 days, 11:41,  5 users,  load average: 0.97, 1.03, 1.06
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sdb8             35435232   2727740  30907452   9% /
> tmpfs                  2008144         0   2008144   0% /dev/shm
> /dev/sdb4               198337    106660     81437  57% /boot
> /dev/sdb6             12385456   2122628   9633684  19% /tmp
> /dev/sdb7             12385456   3940944   7815368  34% /var
> /dev/sda6             82573108    686812  77691728   1% /scr
> jam:/mnt/hdf         961432096 885174528  27419552  97% /mnt/hdf
> jam:/mnt/scr1        961432096 818853600  93740480  90% /mnt/scr1
> jam:/home            198351840  15132608 172980864   9% /home
> STANDARD_OPT=op-configure --prefix=${PWD}/opainstall
> TEST_TYPES=standard
> 
> Mon Apr  4 00:10:46 CDT 2011
> *** starting standard tests in heiwa ***
> Uname -a: Linux heiwa 2.6.32.16-150.fc12.ppc64 #1 SMP Sat Jul 24 05:19:27 UTC 2010 ppc64 ppc64 ppc64 GNU/Linux
> Running snapshot with output saved in
>   /mnt/scr1/SnapTest/snapshots-opa/log/heiwa_0404Mon_0010
> PASSED heiwa: standard
> *** finished standard tests for heiwa ***
> Mon Apr  4 00:16:23 CDT 2011
> Total time = 5m 37s
> 
> *** finished tests in heiwa ***
> Mon Apr  4 00:16:23 CDT 2011
> heiwa: Ran 1(1/0/0) tests, Grand total test time =  5m 37s
> 
> 
> ==============
> Testing duty
> ==============
> ssh duty -n cd /home/hdftest/snapshots-opa;/mnt/scr1/SnapTest/snapshots-opa/bin/runtest -nodiff -norepo -configname duty
> 12:10AM  up 95 days,  8:25, 21 users, load averages: 0.08, 0.03, 0.08
> Filesystem       1K-blocks      Used     Avail Capacity  Mounted on
> /dev/aacd0s1a       507630     60986    406034    13%    /
> devfs                    1         1         0   100%    /dev
> /dev/aacd0s1h    188433016 123604522  49753854    71%    /data
> /dev/aacd0s1g     32494668  24598546   5296550    82%    /local_home
> /dev/aacd0s1f     12186190   4125638   7085658    37%    /usr
> /dev/aacd0s1d       507630     50350    416670    11%    /var
> /dev/aacd0s1e       253678       178    233206     0%    /var/tmp
> procfs                   4         4         0   100%    /proc
> jam:/home        198351840  15132624 172980856     8%    /home
> jam:/mnt/hdf     961432072 885174524  27419548    97%    /mnt/hdf
> jam:/mnt/scr1    961432072 818853640  93740432    90%    /mnt/scr1
> gumund:/data/web 480719056 285632312 170667544    63%    /mnt/web
> gumund:/data/ftp 480719056 285632312 170667544    63%    /mnt/ftp
> devfs                    1         1         0   100%    /var/named/dev
> /dev/md0            126702        98    116468     0%    /tmp
> STANDARD_OPT=op-configure --prefix=${PWD}/opainstall
> TEST_TYPES=standard
> 
> Mon Apr  4 00:10:54 CDT 2011
> *** starting standard tests in duty ***
> Uname -a: FreeBSD duty.hdfgroup.uiuc.edu 6.3-STABLE FreeBSD 6.3-STABLE #1: Fri Jul 25 17:10:59 CDT 2008     sukoziol at duty.hdfgroup.uiuc.edu:/usr/obj/usr/src/sys/DUTY  i386
> Running snapshot with output saved in
>   /mnt/scr1/SnapTest/snapshots-opa/log/duty_0404Mon_0010
> PASSED duty: standard
> *** finished standard tests for duty ***
> Mon Apr  4 00:16:45 CDT 2011
> Total time = 5m 51s
> 
> *** finished tests in duty ***
> Mon Apr  4 00:16:45 CDT 2011
> duty: Ran 1(1/0/0) tests, Grand total test time =  5m 51s
> 
> 
> ==============
> Testing liberty
> ==============
> liberty does not accept Remote Command (Mon Apr  4 00:11:18 CDT 2011)
> 	*************************************
> 	Mon Apr  4 00:11:18 CDT 2011
> 	****SYSTEM ERROR: liberty does not accept Remote Command (Mon Apr  4 00:11:18 CDT 2011)
> 	*************************************
> 	*************************************
> 	Mon Apr  4 00:22:56 CDT 2011
> 	****INCOMPLETE liberty: snaptest did not complete****
> 	*************************************
> 
> *** finished tests in jam ***
> Mon Apr  4 00:22:56 CDT 2011
> jam: Ran 6(0/0/0) hosts, Grand total test time =  12m 37s
> 


