Hello,

  I've been using DOMjudge since the old 3.x days, and over the last few years I've been applying a couple of changes to my new installations to work around some problems that, IMHO, have arisen with the newer versions. I want to share them in case they are useful to others, so that they can be considered for inclusion in the mainstream version, or to get feedback in case I've been doing something wrong.

  The first one concerns CPU time vs. wall time limits. When runguard is invoked from testcase_run.sh, the script passes it the parameters --cputime and --walltime. Years ago the walltime was set to 2*${TIMELIMIT}. But at some point the format of ${TIMELIMIT} was changed to "<soft>:<hard>" and that expression stopped working. The script was updated, and now --walltime is exactly the same as --cputime.

  I ran into trouble because of that some time ago: runguard reported Time Limit because the walltime was exceeded, even when the CPU limit was far from being reached. This occurred when the server was under high load, or the first time a big testcase had to be retrieved from the database.

  My solution has been to change that script (testcase_run.sh) so the walltime is scaled up. The official script is:

runcheck ./run testdata.in program.out \
        $GAINROOT "$RUNGUARD" ${DEBUG:+-v} $CPUSET_OPT \
        ${USE_CHROOT:+-r "$PWD/.."} \
        --nproc=$PROCLIMIT \
        --no-core --streamsize=$FILELIMIT \
        --user="$RUNUSER" --group="$RUNGROUP" \
        --walltime=$TIMELIMIT --cputime=$TIMELIMIT \
        --memsize=$MEMLIMIT --filesize=$FILELIMIT \
        --stderr=program.err --outmeta=program.meta -- \
        "$PREFIX/$PROGRAM" 2>runguard.err

  and my change modifies the --walltime line:

...
        --walltime=$((4*${TIMELIMIT%:*})) --cputime=$TIMELIMIT \
...
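
  To illustrate what that expression evaluates to, here is a small sketch. The function name walltime_int is made up for this example and is not part of DOMjudge, and the sample values are arbitrary:

```shell
#!/bin/sh
# Hypothetical helper, only for illustration (not part of DOMjudge).
walltime_int() {
    TIMELIMIT=$1
    # ${TIMELIMIT%:*} removes the shortest trailing ":<something>",
    # so "<soft>:<hard>" becomes "<soft>"; a plain value is left untouched.
    echo "$((4*${TIMELIMIT%:*}))"
}

walltime_int 5:6   # prints 20
walltime_int 5     # prints 20
```

  So the walltime ends up being four times the soft limit, whether or not a hard limit is present.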


  Note that I'm using a factor of 4 instead of the factor of 2 that DOMjudge used long ago. This change works for versions up to 5.1.x. Since DOMjudge 5.2, time limits can be float values, and my expression no longer works because shell arithmetic only handles integers. Now bc is needed in order to use float arithmetic:

...
        --walltime=$(echo "scale=2;4*${TIMELIMIT%:*}" | bc) --cputime=$TIMELIMIT \
...



  This last patch is due to my colleague Joan Rodríguez, and it adds a new package dependency on the judgehosts: the bc program.
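
  If the extra dependency is a concern, the same scaling could probably be done with awk instead of bc. This is only a sketch that I have not tried inside testcase_run.sh: the function name walltime_awk is made up for the example, and it assumes that runguard accepts a value printed with two decimals:

```shell
#!/bin/sh
# Hypothetical helper, only for illustration (not part of DOMjudge).
walltime_awk() {
    TIMELIMIT=$1
    # LC_ALL=C keeps the decimal separator a dot regardless of the locale.
    LC_ALL=C awk -v t="${TIMELIMIT%:*}" 'BEGIN { printf "%.2f\n", 4*t }'
}

walltime_awk 1.5:2.5   # prints 6.00
walltime_awk 2         # prints 8.00
```

  In the script this would amount to replacing the bc pipeline inside $( ... ) with the awk call.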


  The second change that I've been applying to DOMjudge is in runguard itself, and it is also related to CPU time, although I'm still unsure about the actual cause of my problems. Sometimes, when evaluating Java submissions, we got a Run Error verdict instead of Time Limit. We solved this by increasing the real CPU-time hard limit in runguard. The official runguard adds one second to the CPU time limit, and for some reason that seems to be insufficient in some cases, so we changed it to add 2 seconds. In the runguard.c source code:

if ( use_cputime ) {
    /* The CPU-time resource limit can only be specified in
       seconds, so round up: we can measure actual CPU time used
       more accurately. Also set the real hard limit one second
       higher: at the soft limit the kernel will send SIGXCPU, at
       the hard limit a SIGKILL. The SIGXCPU can be caught, but is
       not by default and gives us a reliable way to detect if the
       CPU-time limit was reached. */
    rlim_t cputime_limit = (rlim_t)ceil(cputime[1]);
    /* CHANGED: the message now says (+2) */
    verbose("setting hard CPU-time limit to %d(+2) seconds",(int)cputime_limit);
    lim.rlim_cur = cputime_limit;
    lim.rlim_max = cputime_limit+2;   /* CHANGED: the official code adds 1 */
    setlim(CPU);
}


  This change avoids the false RUN ERROR verdicts we sporadically suffered. Unfortunately, I cannot provide a way to reproduce the error because it is quite erratic. We use virtual machines as judgehosts, in case that is relevant.
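
  For anyone who wants to see the soft/hard mechanism from the comment outside of runguard, this sketch uses the shell's ulimit builtin. It has nothing to do with Java or with our erratic failures, and the limits of 1 and 3 seconds are arbitrary; it only shows a process being stopped at the soft limit by a signal:

```shell
#!/bin/sh
# Illustration only: a busy loop under a 1 s soft / 3 s hard CPU-time limit.
# The limits are set inside a child shell so the calling shell is unaffected.
status=0
sh -c 'ulimit -S -t 1; ulimit -H -t 3; while :; do :; done' || status=$?
# A status above 128 means the child was terminated by a signal
# (128 + signal number); here that is expected to be SIGXCPU.
echo "child exit status: $status"
```

  On Linux I would expect a status of 152 (128 + 24, SIGXCPU), and 137 (SIGKILL) only if the process survived the soft limit and then reached the hard one.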

  Thank you for your great DOMjudge.

  Best regards,
    Pedro Pablo