Hi all.
I’m having a hard time with a service I need for the cluster we are building, which is based on SUSE 11.0.
Straight to the problem: the Torque queueing system (which, by the way, would be a great addition to the distribution) provides two system service scripts, suse.pbs_server and suse.pbs_mom (http://www.clusterresources.com/torquedocs21/1.1installation.shtml). I have both installed, and they are recognized as such in yast>system>system services. Unfortunately, both fail with an error at startup (thankfully without affecting the rest of the boot), and they also give an error when I try to start them in the yast>system>system expert mode. On the other hand, I know the programs and the queue as a whole work, because I can start the daemons by hand on the head node and the slaves and then submit a job to the queue properly.
Does anybody have an idea why this is happening? So far I have only given the scripts a brief look, and I will go through them more carefully during the week. If there is a simple error that someone using Torque here, or someone very familiar with these scripts, could spot more quickly, that would be very helpful.
Thanks a lot in advance. The code for suse.pbs_server is attached below; suse.pbs_mom is quite similar.
#!/bin/sh
#
# pbs_server This script will start and stop the PBS Server
#
### BEGIN INIT INFO
# Provides: pbs_server
# Required-Start: $syslog $remote_fs
# Should-Start:
# Required-Stop: $syslog $remote_fs
# Should-Stop: $null
# Default-Start: 2 3 5
# Default-Stop:
# Short-Description: Torque server
# Description: Torque is a versatile batch system for SMPs and clusters.
#  Starts the PBS batch server, which operates as batch server
#  on the local host.
### END INIT INFO
PBS_DAEMON=/usr/sbin/pbs_server
PBS_HOME=/var/spool/torque
PIDFILE=$PBS_HOME/server_priv/server.lock
export PBS_DAEMON PBS_HOME PIDFILE
# Source the library functions
. /etc/rc.status
rc_reset
[ -f /etc/sysconfig/pbs_server ] && . /etc/sysconfig/pbs_server
[ -x $PBS_DAEMON ] || exit
# Let's see how we were called
case "$1" in
    start)
        echo -n "Starting TORQUE Server: "
        if [ -r $PBS_HOME/server_priv/serverdb ]
        then
            startproc $PBS_DAEMON $SERVER_ARGS
        else
            startproc $PBS_DAEMON -t create $DAEMON_ARGS
        fi
        rc_status -v
        ;;
    stop)
        echo -n "Shutting down TORQUE Server: "
        killproc -p $PIDFILE $PBS_DAEMON
        rc_status -v
        ;;
    status)
        echo -n "Checking TORQUE Server: "
        checkproc -p $PIDFILE pbs_server
        rc_status -v
        ;;
    restart)
        $0 stop
        $0 start
        rc_status
        ;;
    try-restart)
        $0 status >/dev/null && $0 restart
        rc_status
        ;;
    reload|force-reload)
        echo -n "Reloading TORQUE Server: "
        killproc -p $PIDFILE pbs_server -HUP
        rc_status -v
        ;;
    *)
        echo "Usage: pbs_server {start|stop|status|try-restart|restart|force-reload|reload}"
        exit 1
esac
rc_exit
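
Since the daemons start fine by hand, one thing that might help narrow this down is checking the preconditions the script tests before it ever reaches startproc. The following is only a rough sketch: check_pbs_prereqs is a hypothetical helper of my own, not part of Torque or SUSE, and it simply repeats the three file tests from the script (the sourced rc library, the daemon binary, and serverdb) with whatever paths are passed in.

```shell
#!/bin/sh
# Hypothetical helper (not part of Torque or SUSE): reports which of the
# init script's preconditions hold. All paths are passed as arguments.
check_pbs_prereqs() {
    daemon=$1      # the PBS_DAEMON value used by the init script
    pbs_home=$2    # the PBS_HOME value used by the init script
    rc_lib=$3      # the rc function library the init script sources

    # The init script sources this file unconditionally.
    if [ ! -r "$rc_lib" ]; then
        echo "rc library not readable: $rc_lib"
        return 1
    fi

    # The init script exits silently if the daemon is not executable.
    if [ ! -x "$daemon" ]; then
        echo "daemon not executable: $daemon"
        return 2
    fi

    # This test decides which startproc branch the script takes.
    if [ -r "$pbs_home/server_priv/serverdb" ]; then
        echo "ok: serverdb found, plain start branch would run"
    else
        echo "ok: no readable serverdb, '-t create' branch would run"
    fi
    return 0
}
```

With the values from the script above, the call would be `check_pbs_prereqs /usr/sbin/pbs_server /var/spool/torque /etc/rc.status`. Running the installed init script itself under `sh -x` should additionally show which command returns the error.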