The cluster is just another computer
Open a terminal and play with these commands:
ls
pwd
mkdir new_directory
touch new_file
cd new_directory
cd ..
ls -l
cp new_file new_directory/
ls -l new_directory
rm -r new_directory new_file
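If you would rather not try these commands among your own files, they can be run inside a throwaway directory. This is a minimal sketch, assuming `mktemp` is available (it is on most Linux and macOS systems). The variable name `sandbox` is only illustrative.

```shell
# Work inside a temporary directory so none of your own files are touched
sandbox=$(mktemp -d)
cd "$sandbox"

mkdir new_directory            # create a directory
touch new_file                 # create an empty file
cp new_file new_directory/     # copy the file into the directory
ls -l new_directory            # the copy is listed here

rm -r new_directory new_file   # remove both again
ls -A                          # lists what is left in the sandbox
```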
Try exploring your own computer using the terminal. For example, navigate to your last project’s directory.
Log into your account
ssh usr123@abacus
Create a new directory in your home directory. It works just the same as on your computer.
mkdir ~/workshop
Change your password
passwd
To exit the cluster, press Ctrl + D
Start by making an R script by typing the following:
Sys.sleep(20)
message("my first job in the cluster")
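One way to create this file without opening an editor is a shell here-document. This is a minimal sketch, assuming you run it from the directory where you want the script to live. The file name r_script_name.R is the placeholder name used in the rest of this section.

```shell
# Write the two-line R script to a file (the quoted 'EOF' keeps the text literal)
cat > r_script_name.R <<'EOF'
Sys.sleep(20)
message("my first job in the cluster")
EOF

# Show what was written
cat r_script_name.R
```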
Now let’s move it to the cluster
In your computer’s terminal type:
scp ~/path_to_script/r_script_name.R usr123@abacus:~/workshop
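The scp command has three parts: the local file, the account on the cluster, and the destination directory there. The sketch below spells them out as variables and only prints the command, so you can check it before copying anything. All values are placeholders to replace with your own.

```shell
# Placeholder values: replace with your own user name and script location
user="usr123"
host="abacus"
script="$HOME/path_to_script/r_script_name.R"

# Destination in the form user@host:path (the ~ is expanded on the cluster side)
dest="${user}@${host}:~/workshop"

# Print the command first to check it; remove "echo" to actually copy the file
echo scp "$script" "$dest"
```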
We can also modify the R script in Abacus
In the Abacus terminal type:
cd ~/workshop
nano r_script_name.R
Besides nano, you can also use other text editors like vim or vi in case you need or want to modify files directly in the cluster
Visit the GitHub repository at http://github.com/efcaguab/sge-workshop
We can download the materials from GitHub to the cluster.
But first we need to enable internet in Abacus.
telnet ienabler 259
On the Abacus terminal:
cd ~/
git clone git@github.com:efcaguab/sge-workshop.git
Now you have all the workshop materials in Abacus. Try to modify script-2.R from Abacus using nano
A high performance computer (HPC) is not a faster computer. It’s several computers connected together.
Each of these computers is called a node. Each node can have multiple cores
Every core can run one job
One node acts as a master which distributes jobs across the other nodes
To distribute those jobs, the master node uses queues. We’ll look into that in Part 2
When you log in, as we did before, you access the master node
Abacus has 22 nodes in total. 15 of them are available for computing at the School of Biology
The available nodes have between 8 and 48 cores each, for a total of 432 cores
You can explore this yourself by going to http://abacus/ganglia/ in your browser
Have a look at the file structure
ls /
When you log in you’re actually in /home/usr123
Usually you store your scripts and raw data in your home directory
The output files, especially if they’re heavy, should go to /data/people/usr123
Create your own directory in /data/people
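A minimal sketch of that step, assuming your directory should be named after your cluster user name, as in /data/people/usr123. The function name `make_output_dir` is only illustrative. The optional argument lets you try it against another base path first.

```shell
# Create a personal output directory under a base path and print its location.
# Assumption: the directory is named after your user name, as in /data/people/usr123.
make_output_dir() {
  base="${1:-/data/people}"      # default base path taken from the text above
  outdir="$base/$(id -un)"       # id -un prints the current user name
  mkdir -p "$outdir" && echo "$outdir"
}
```

On Abacus, calling make_output_dir with no argument would create the directory under /data/people. The -p flag means it does not complain if the directory already exists.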
On your computer you would run an R script using
Rscript script.R
☢ Never do that in the cluster ☢
because you could break it…
Make sure that your script runs well on your computer before sending it to the cluster.
Troubleshooting in the cluster can be a nightmare.
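One way to make the local check a habit is a small helper that runs the script on your own computer and reports whether it finished without an error. This is a sketch, assuming Rscript is installed locally. The function name `check_script` is only illustrative.

```shell
# Succeeds only if the given R script finishes without an error on this machine
check_script() {
  if Rscript "$1"; then
    echo "OK: $1 ran locally"
  else
    echo "FAILED: fix $1 before copying it to the cluster" >&2
    return 1
  fi
}
```

For example, run check_script r_script_name.R on your own computer and copy the script over with scp only once it reports OK.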
Abacus is a shared resource. Be nice to other people. Don’t use it all unless you really need it.
Don’t be afraid to use it though. It’s there for us.