A new script has been added to the startup (S98pinguid in rc3.d) to set the SUID bit for the ping command using chmod u+s. Although the script will run at every startup, the time taken to run the script is negligible. Because the script will set the SUID every boot, there will be no problem if the filesystem is changed/updated.
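For reference, a startup script along these lines would do the job. This is a minimal sketch rather than the script itself: the /bin/ping path and the set_suid helper name are assumptions.

```shell
#!/bin/sh
# Sketch of an rc3.d startup script such as S98pinguid.
# Assumption: ping lives at /bin/ping; override PING_BIN if it does not.
PING_BIN="${PING_BIN:-/bin/ping}"

# Set the SUID bit on the given file (hypothetical helper name).
set_suid() {
    chmod u+s "$1"
}

# SysV init calls the S* scripts with "start" at boot.
if [ "${1:-}" = "start" ]; then
    set_suid "$PING_BIN"
fi
```

Setting a bit that is already set is harmless, which is why running this on every boot costs nothing.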

Scripts running under PBS can now successfully use ping to check license servers and the like.

Kernel 3.7.4

Today the slave nodes were updated to a custom build of the vanilla Linux kernel 3.7.4. The build is known internally as beowulf-slave-node-v13. The process for managing kernel versions on the TFTP server (iserver2) has also been automated to streamline updates and the distribution of new kernels.

The new process is:

Build the kernel.

Run kernelupdate (/root/utils/kernelupdate) to copy the bzImage to the TFTP directory under the appropriate file name. kernelupdate also checks that the config file is not newer than the built kernel. I will post the script when I am happy with it.

Run dhcp2pxecfg (/root/utils/dhcp2pxecfg) to change all the pxeboot config files to point to the newest (or a user-specified) kernel.
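Until the real scripts are posted, the two helper steps can be pictured roughly as follows. This is only a sketch under stated assumptions: the function names, the x86 build-tree layout, and the pxelinux-style KERNEL line are guesses, not the contents of kernelupdate or dhcp2pxecfg.

```shell
#!/bin/bash
# Hypothetical sketch of the two helper steps; not the actual scripts.

# Copy the built bzImage into the TFTP directory under a versioned name.
# Refuses to copy when .config is newer than the image, which suggests
# the kernel was not rebuilt after the last configuration change.
install_kernel() {
    local build_dir=$1 tftp_dir=$2 name=$3
    local image="$build_dir/arch/x86/boot/bzImage"   # assumed x86 build tree
    local config="$build_dir/.config"

    [ -f "$image" ] || { echo "no bzImage found in $build_dir" >&2; return 1; }
    if [ "$config" -nt "$image" ]; then
        echo ".config is newer than bzImage; rebuild first" >&2
        return 1
    fi
    cp "$image" "$tftp_dir/$name"
}

# Rewrite the KERNEL line of every PXE config file to name the given kernel.
# Assumes pxelinux-style files and a kernel name without '|' or '&'.
point_pxe_configs() {
    local cfg_dir=$1 kernel=$2 f
    for f in "$cfg_dir"/*; do
        [ -f "$f" ] || continue
        sed -i "s|^\([[:space:]]*KERNEL[[:space:]][[:space:]]*\).*|\1$kernel|" "$f"
    done
}
```

With made-up paths, a run might look like `install_kernel /usr/src/linux-3.7.4 /tftpboot beowulf-slave-node-v13 && point_pxe_configs /tftpboot/pxelinux.cfg beowulf-slave-node-v13`, so the configs are only repointed once the copy has succeeded.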


After a trip to Africa, I am back to resurrect Beo for 2013.

When I logged on to the forum, there were heaps of comments on posts that were obviously spam. If anyone knows how to block spammers from posting comments via WordPress, please comment!

New Logo

Check it out!!!

The idea of converting ASCII art back into an image seems counterintuitive though...

Because the speed of SFTP is quite limited due to the encryption requirement, I have decided to open limited FTP access to the server. All registered users can now access the server using their chosen FTP client (FileZilla is recommended on Windows). The speedup on transfers inside the Monash network is almost twofold.

Users must log in using their own username and password. All users are jailed inside their home directory (/home/$USER) and can only upload files there. Once a file is on the server, the user can move it to the /short directory through a shell.

Please do not leave files in your home directory for long. The home directories have only 90GB of space shared between all users, while /short has 900GB.
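Moving an upload out of the home directory from a shell could look something like this. It is a small sketch: the move_to_short name and the per-user /short/$USER layout are assumptions, so adjust the destination to wherever you keep your files on /short.

```shell
#!/bin/bash
# Move uploaded files out of the home directory into a scratch directory.
# Assumption: each user has a writable directory such as /short/$USER.
move_to_short() {
    local dest=$1
    shift
    mkdir -p "$dest" || return 1
    mv -- "$@" "$dest"/
}
```

For example, `move_to_short "/short/$USER" ~/mesh.tar.gz` (the file name is just an illustration).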

NFS help needed

Home directories are shared amongst the nodes via NFS; however, I have not managed to configure any sort of username management to fix the problems with permissions and ownership. As a result, files are displayed as owned by “nobody” on the nodes. Because SSH requires strict permissions on the authentication keys, I cannot get passwordless SSH to work between nodes, as I cannot figure out how to set the permissions/ownership so that every node recognises the true owner.
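To make the symptom easier to compare between machines, a diagnostic along these lines can be run on the headnode and on a slave node. It is a sketch, not a fix: the function names are made up, and it relies on GNU stat. If the numeric uid matches on both machines but the owner name differs, that may point at name mapping rather than at the files themselves.

```shell
#!/bin/bash
# Diagnostic sketch: report how a node sees the ownership and mode of SSH files.
# Uses GNU stat; function names are hypothetical.

# Succeeds when neither group nor others can write to the path.
not_group_or_world_writable() {
    local mode
    mode=$(stat -c '%a' "$1") || return 1
    # Octal 022 covers the group-write and other-write bits.
    [ $(( 0$mode & 022 )) -eq 0 ]
}

# Print owner details for the .ssh directory and authorized_keys, and
# return non-zero if either is writable by group or others.
check_ssh_dir() {
    local ssh_dir=$1 status=0 p
    for p in "$ssh_dir" "$ssh_dir/authorized_keys"; do
        [ -e "$p" ] || continue
        echo "$p: uid=$(stat -c '%u' "$p") owner=$(stat -c '%U' "$p") mode=$(stat -c '%a' "$p")"
        not_group_or_world_writable "$p" || { echo "too permissive: $p" >&2; status=1; }
    done
    return "$status"
}
```

Running `check_ssh_dir ~/.ssh` on each machine and comparing the output line by line shows where the nodes disagree.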

I have also tried configuring host-based authentication, but could not get the hosts to authenticate due to the permissions on ssh-keysign. I added the SUID bit using chmod u+s; however, the file ownerships are still wrong, I believe because of the NFS mounting. All mounts use no_root_squash, but I am at a loss as to how to make this work. We need internode SSH so that PBS can upload the stderr and stdout files back to the headnode at the end of each simulation.

If anyone has any ideas, I would appreciate some guidance.



Licensing Issues

Unfortunately, bc247 is not yet equipped with the ability to probe license servers before starting jobs. It is hoped that a system similar to the License Shadowing Daemon used on the NCI clusters can be implemented. In the meantime, users must first query the license server and manually check that there are enough licenses. If fewer licenses are available than they would like, users can either wait or reduce the number of nodes.
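The manual check can be scripted to some extent. The sketch below assumes the license server reports FlexNet-style lmstat summary lines; the function names and the anshpc feature name are only examples, so substitute whatever your software actually checks out.

```shell
#!/bin/bash
# Sketch: decide whether enough licenses are free before submitting a job.
# Assumes FlexNet-style "lmstat" summary lines on stdin, e.g.
#   Users of anshpc:  (Total of 128 licenses issued;  Total of 100 licenses in use)
# The feature name "anshpc" is only an example.

# Print the number of free licenses for a feature; fail if it is not listed.
free_licenses() {
    awk -v feat="$1" '
        $1 == "Users" && $2 == "of" && $3 == (feat ":") {
            print $6 - $11      # licenses issued minus licenses in use
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    '
}

# Succeed when at least "needed" licenses are free.
enough_licenses() {
    local feature=$1 needed=$2 free
    free=$(free_licenses "$feature") || return 2
    [ "$free" -ge "$needed" ]
}
```

A job script could then pipe the server's status output into it, for example `lmutil lmstat -a -c port@host | enough_licenses anshpc 16`, where port@host stands in for the real license server address.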

Because bc247 only runs at night, many of the HPC licenses are already in use by other researchers who start their jobs on other clusters (Sun Grid/NCI) before leaving for the day. This leaves only the scraps for bc247 users. There are usually not enough licenses available, but that doesn’t stop me from getting my simulations running, as can be seen in the output from the license server!

