Monday, July 30, 2007

Start/Stop Samba

Now I've added Samba as a subsystem:

# mkssys -s smbd -p /opt/pware/samba/3.0.24/sbin/smbd -a "-D" -u 0

By default it should start automatically during system boot. If it isn't started, first try:

# startsrc -s smbd

If that is unsuccessful, try next:

# /opt/pware/samba/3.0.24/sbin/smbd -D

Another option is to use the rc files:
/etc/rc.d/rc2.d/Ssmbd
/etc/rc.d/rc2.d/Ksmbd
Currently they are kept in /home/akha103.

AVOIDING Disk Crashes

1. Rule One: don't let this stop your system
* RAID5 or mirror everything
2. Rule Two: monitor error logs
* Make sure you know when a disk has failed
3. Rule Three: call hardware support
* That is what they are for
4. Rule Four: Don't meddle
* Only try if you know what you are doing
5. Rule Five: Read and practise
* Get the Redbooks and try it safely

Formatted list of active Volume groups

for i in $(lsvg); do lsvg $i; done | awk '
BEGIN { printf("%10s\t%10s\t%10s\t%10s\t%10s\n", "VG", "Total(MB)", "Free", "USED", "Disks") }
/VOLUME GROUP:/ { printf("%10s\t", $3) }
/TOTAL PP/ { B = index($0, "(") + 1; E = index($0, " megaby"); printf("%10s\t", substr($0, B, E - B)) }
/FREE PP/ { B = index($0, "(") + 1; E = index($0, " megaby"); printf("%10s\t", substr($0, B, E - B)) }
/USED PP/ { B = index($0, "(") + 1; E = index($0, " megaby"); printf("%10s\t", substr($0, B, E - B)) }
/ACTIVE PV/ { printf("%10s\t\n", $3) }'

Friday, July 27, 2007

Emails from NGData have root@bestgrid30.math.auckland.ac.nz in From:

All emails from NGData have root@bestgrid30.math.auckland.ac.nz in the From: field.

The reason is that I created the CNAME ngdata.auckland.ac.nz for data.bestgrid.org (bestgrid30.math.auckland.ac.nz) a long time ago, so on reverse lookup the mail server finds bestgrid30 as the sender domain.

I've applied to move ngdata.auckland.ac.nz from bg30 to bg4.

UPD: Fixed on 27.07.2007

Use correct CPU number

#!/bin/bash
#PBS -u user_name
#PBS -l nodes=1:ppn=8
#PBS -o $PBS_JOBNAME.out
#PBS -e $PBS_JOBNAME.err

#How many procs do I have?
NP=$(wc -l $PBS_NODEFILE | awk '{print $1}')

#cd into the directory where I typed qsub
cd $PBS_O_WORKDIR

#run executable
mpiexec -np $NP executable
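
The NP line works because Torque writes one line to $PBS_NODEFILE for every processor slot it allocates, so nodes=1:ppn=8 gives eight lines. A minimal sketch of the same counting, wrapped in two helper functions (count_procs and count_nodes are names made up for this example):

```shell
#!/bin/bash
# count_procs FILE: print the number of processor slots listed in a PBS node file.
# One line per allocated slot, so the line count is the value to pass to mpiexec -np.
count_procs() {
    wc -l < "$1" | tr -d ' '
}

# count_nodes FILE: print the number of distinct hosts listed in the same file.
count_nodes() {
    sort -u "$1" | wc -l | tr -d ' '
}
```

Inside a job script this would be used as NP=$(count_procs $PBS_NODEFILE).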

Thursday, July 26, 2007

Pre-install of any APAC packages.

Edit the repo file to enable the CentOS repo:
# nano /etc/yum.repos.d/CentOS-Base.repo

# rpm --import http://mirror.centos.org/centos/4/os/i386/RPM-GPG-KEY-centos4
# cd /etc/yum.repos.d && wget http://www.grid.apac.edu.au/repository/dist/APAC-Grid.repo
# yum install Gpulse Gbuild

Check hostcert.*
openssl x509 -in hostcert.pem -noout -modulus
openssl rsa -in hostkey.pem -noout -modulus
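
The two openssl commands above print the modulus of the certificate and of the key; the pair belongs together only if both outputs are identical. A small sketch that does the comparison automatically (cert_matches_key is a hypothetical helper name, and it assumes an unencrypted RSA host key):

```shell
#!/bin/bash
# cert_matches_key CERT KEY: succeed (exit 0) if the certificate and the RSA key
# share the same modulus, fail with 1 if they differ, 2 if openssl cannot read a file.
cert_matches_key() {
    cert_mod=$(openssl x509 -in "$1" -noout -modulus) || return 2
    key_mod=$(openssl rsa -in "$2" -noout -modulus) || return 2
    [ "$cert_mod" = "$key_mod" ]
}
```

Usage: cert_matches_key hostcert.pem hostkey.pem && echo "certificate and key match"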

Build NGData
/usr/local/sbin/BuildNgdataVdt161.sh

Wednesday, July 25, 2007

NGData ToDo

- set yum repos to VPAC
- install certificate
- install Gbuild
- install NGData


Current Disk Space Allocation on GateWay

XENHOST

Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda1   9.7G  1.2G   8.0G   14%  /
/dev/sda3    56G  5.1G    48G   10%  /home


VMHOST


VG             #PV  #LV  #SN  Attr    VSize    VFree
VolumeGroup00    1   29    0  wz--n-  409.84G  326.43G

LV           VG             Attr    LSize
GridSphereR  VolumeGroup00  -wi-ao    2.00G
GridSphereS  VolumeGroup00  -wi-ao  512.00M
LanguageR    VolumeGroup00  -wi-ao    2.00G
LanguageS    VolumeGroup00  -wi-ao  512.00M
NG2Root      VolumeGroup00  -wi-ao   16.00G
NG2Swap      VolumeGroup00  -wi-ao  512.00M
OpenIdpR     VolumeGroup00  -wi-ao    2.00G
OpenIdpS     VolumeGroup00  -wi-ao  512.00M
SRSData      VolumeGroup00  -wi-ao   20.00G
SRSRoot      VolumeGroup00  -wi-ao    1.50G
SRSSwap      VolumeGroup00  -wi-ao  512.00M
SakaiR       VolumeGroup00  -wi-ao    8.00G
SakaiS       VolumeGroup00  -wi-ao  512.00M
SakaitR      VolumeGroup00  -wi-a-    8.00G
SakaitS      VolumeGroup00  -wi-a-  512.00M
ServicesR    VolumeGroup00  -wi-ao    4.00G
ServicesS    VolumeGroup00  -wi-ao  512.00M
SolverR      VolumeGroup00  -wi-ao    1.49G
SolverS      VolumeGroup00  -wi-ao  512.00M
SolverU      VolumeGroup00  -wi-ao    1.50G
VreR         VolumeGroup00  -wi-ao    2.00G
VreS         VolumeGroup00  -wi-ao  512.00M
VreU         VolumeGroup00  -wi-ao    1.50G
WayfR        VolumeGroup00  -wi-ao    2.00G
WayfS        VolumeGroup00  -wi-ao  512.00M
WikiProdR    VolumeGroup00  -wi-ao    2.93G
WikiProdS    VolumeGroup00  -wi-ao  512.00M
WikiR        VolumeGroup00  -wi-ao    2.00G
WikiS        VolumeGroup00  -wi-ao  512.00M

Tuesday, July 24, 2007

Auckland is in APAC!!!

ng2.auckland.ac.nz is part of the APAC structure now!!!

It is in the GOC database: http://goc.grid.apac.edu.au/

The AUCKLAND site is in the APAC MDS/MIP database:
http://www.sapac.edu.au/webmds/webmds?info=indexinfo&xsl=apacgluexsl
Search for 'AUCKLAND' on the page.

ng2.auckland.ac.nz is also visible in the gcc GUI, and it is even possible to submit jobs
for MrBayes calculations, but they don't run properly yet. I'm working on this.

Thursday, July 19, 2007

MrBayes Installation

To provide a service on the computational grid I decided to install MrBayes.

The home of MrBayes is http://www.mrbayes.net/

The latest version is 3.1.2. In a Linux environment this package has to be compiled. The first attempt to compile failed with these errors:

bayes.c:45:31: readline/readline.h: No such file or directory

bayes.c:46:30: readline/history.h: No such file or directory

MrBayes manual says:
Depending on which platform/distribution the compilation is being performed on, it may be necessary to install relevent development libraries to enable compilation. Users of Ubuntu 5.10 (Breezy Badger), for instance, may need to install the either the libreadline4-dev or libreadline5-dev package to provide linking during compilation to GNU readline functionality.
The first attempt to install libreadline5-dev asked me to run 'apt-get install -f' without any package name to fix up the system. After that command I could install libreadline5-dev with:

'apt-get install -f libreadline5-dev'

Then the compilation of MrBayes went well. Now I think I need to install the lam-mpi package to make MrBayes MPI-capable. Currently no MPI packages are installed on my head node.
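
The two compile errors above mean the readline development headers are absent. A quick way to check for them before running the build could look like this sketch (missing_headers is a made-up helper name, and /usr/include as the default include directory is an assumption that may not hold on every distribution):

```shell
#!/bin/bash
# missing_headers [INCLUDE_DIR]: print each readline header needed by bayes.c
# that is not present under INCLUDE_DIR (default: /usr/include).
missing_headers() {
    inc=${1:-/usr/include}
    for h in readline/readline.h readline/history.h; do
        [ -f "$inc/$h" ] || echo "$h"
    done
}
```

If missing_headers prints nothing, both headers were found and the readline errors should not appear.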

Tuesday, July 10, 2007

Yahoo! Test2 is running!

The final step was connecting ng2:pbs-logmaker to bestgrid-02:pbs-telltail.
Because bestgrid-02 is a Debian box, I had to modify the startup file quite heavily. Modified lines are bolded:

#!/bin/sh
# pbs-telltail Starts/stops pbs-telltail daemon.
# Graham Jenkins Nov. 2005. Modified: 20051220
#
# chkconfig: 2345 99 05
# description: pbs-telltail startup script

# Adjust as appropriate
REMOTES="ng2.auckland.ac.nz:2812"
[ -z "$PBS_HOME" ] && PBS_HOME=/opt/torque
[ -z "$TELLTAIL_HOME" ] && TELLTAIL_HOME=/usr/local/pbs-telltail

#. /etc/rc.d/init.d/functions
RETVAL=0
case "$1" in
  start ) for Remote in $REMOTES ; do
            Host=`echo $Remote | awk -F: '{print $1}'`
            Port=`echo $Remote | awk -F: '{print $2}'`
            echo -n "Starting pbs-telltail on host: $Host .. port: $Port .. "
            $TELLTAIL_HOME/pbs-telltail $PBS_HOME/server_logs $Host $Port
            RETVAL=$?; echo; [ $RETVAL -ne 0 ] && break
          done
          [ $RETVAL -eq 0 ] && touch /var/lock/subsys/pbs-telltail ;;
  stop  ) echo -n "Shutting down pbs-telltail .. "
          killproc pbs-telltail
          RETVAL=$?; echo
          [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/pbs-telltail ;;
  status) status pbs-telltail
          RETVAL=$? ;;
  *     ) echo "Usage: $0 {start|stop|status}"; exit 1 ;;
esac
exit $RETVAL

The stop and status options don't work. I will investigate this later.
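
A likely cause (an assumption, not yet verified): killproc and status are helper functions from the Red Hat file /etc/rc.d/init.d/functions, and with that include commented out nothing defines them on Debian. A minimal sketch of stand-ins built on pgrep and pkill, which could be pasted into the script above the case statement:

```shell
#!/bin/sh
# Stand-ins for the Red Hat helpers of the same name (simplified, not the originals).

# killproc NAME: send TERM to every process whose name is exactly NAME.
killproc() {
    pkill -x "$1"
}

# status NAME: report whether NAME is running.
# Returns 0 when at least one process matches, 3 when none does.
status() {
    if pgrep -x "$1" >/dev/null; then
        echo "$1 is running"
        return 0
    fi
    echo "$1 is stopped"
    return 3
}
```

The -x option matches on the process name only. If pbs-telltail turns out to be an interpreted script, the process name is the interpreter, and matching the full command line with -f instead of -x would be needed.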

I copied /usr/local/pbs-telltail/* from ng2 to bestgrid-02,
copied pbs-telltail.RH to /etc/init.d/pbs-telltail
and modified it as shown above. Then I started the script.

Before this, a PBS job got stuck and displayed the message "Job Unsubmitted". Now it goes further and displays the environment variables as specified in the test2.rsl file (the job description).

Submit PBS job from ng2

In my configuration there are three boxes: ng2, bestgrid-02 (pbs-server) and bestgrid-01 (node). Submitting PBS jobs requires the following:
  1. All three boxes must share a common home folder for the account used as the grid user, in my case grid-user. I've exported /home/grid-user from bestgrid-02 and mounted it at the same path on ng2 and bestgrid-01;
  2. Ideally the whole /home folder should be exported from another box (e.g. data.bestgrid.org) and mounted on all boxes in the chain ng2->pbs-server->nodes;
  3. grid-user must have passwordless access between all boxes in the chain;
  4. /home/grid-user/.ssh/known_hosts should contain records for all boxes, under both short and long names;
  5. All boxes should be listed in /etc/hosts on each box.
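
Items 4 and 5 can be checked with a small script. The sketch below reports which host names are absent from a file such as /etc/hosts; missing_from is a made-up helper name, and for known_hosts it only works if the entries are stored unhashed (an assumption, since some distributions hash them by default):

```shell
#!/bin/bash
# missing_from FILE NAME...: print every NAME that does not occur in FILE
# as a complete field (fields are separated by spaces, tabs or commas).
missing_from() {
    file=$1; shift
    for name in "$@"; do
        awk -v n="$name" -F'[ \t,]+' '
            { for (i = 1; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$file" || echo "$name"
    done
}
```

Usage: missing_from /etc/hosts ng2 ng2.auckland.ac.nz bestgrid-02 bestgrid-01
Empty output means every listed name was found.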

Friday, July 6, 2007

One-way passwordless SSH

For a user who needs passwordless SSH between two hosts, all files in the .ssh folder on both hosts must have the same permission attributes:
-rw-r--r-- 1 grid-user grid-user 867 Jul 6 13:51 authorized_keys
-rw------- 1 grid-user grid-user 668 Jun 27 15:27 id_dsa
-rw-r--r-- 1 grid-user grid-user 618 Jun 27 15:27 id_dsa.pub
-rw------- 1 grid-user grid-user 883 Jun 27 15:27 id_rsa
-rw-r--r-- 1 grid-user grid-user 238 Jun 27 15:27 id_rsa.pub
-rw------- 1 grid-user grid-user 554 Jun 27 15:27 identity
-rw-r--r-- 1 grid-user grid-user 358 Jun 27 15:27 identity.pub
-rw-r--r-- 1 grid-user grid-user 918 Jun 27 15:02 known_hosts
In my case I could ssh from ng2 to bestgrid-02 but not back. Following Anton's suggestion I found that on ng2 the authorized_keys file had -rw-rw-r-- permissions. After changing them to -rw-r--r-- I could ssh from bestgrid-02 to ng2 without a password.
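
The underlying reason is most likely sshd's StrictModes option (on by default), which makes the server ignore an authorized_keys file that is writable by group or others. A sketch that sets the permissions shown in the listing above (tighten_ssh_perms is a hypothetical helper name; mode 700 on the directory itself is my assumption, the listing does not show it):

```shell
#!/bin/bash
# tighten_ssh_perms [DIR]: set the permissions from the listing on the files in DIR
# (default: $HOME/.ssh). Files that do not exist are skipped.
tighten_ssh_perms() {
    dir=${1:-$HOME/.ssh}
    chmod 700 "$dir"
    # authorized_keys must not be writable by group or others
    [ -f "$dir/authorized_keys" ] && chmod 644 "$dir/authorized_keys"
    # private keys are readable and writable by the owner only
    for key in "$dir/id_dsa" "$dir/id_rsa" "$dir/identity"; do
        [ -f "$key" ] && chmod 600 "$key"
    done
    return 0
}
```

Run it as grid-user on each box in the chain: tighten_ssh_perms /home/grid-user/.ssh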