CMSDAS Pre-Exercise 2: Using the cmslpc cluster
Overview
Teaching: 0 min
Exercises: 10 min
Questions
Learn how to use the CMSLPC cluster
Objectives
Learn how to use the CMSLPC cluster
Questions
For this lesson, please submit your answers using Google Form 2.
Introduction
In the previous lesson, you learned how to use Unix on your personal computer. However, the CMS detector produces more than 500 terabytes of data per second, so there’s only so much you can do on your laptop. In this lesson, we will learn how to use the cmslpc computing cluster, which you will use extensively during CMSDAS and possibly for your actual analysis work as well. Information for lxplus users is also provided, where appropriate.
The basics
The cmslpc cluster consists of a large number of “interactive nodes,” which users log in to via SSH, and an even larger number of “worker nodes,” which are used for large-scale “batch” computing.
We will use the AlmaLinux 8 operating system (OS), a community-supported OS that is binary-compatible with Red Hat Enterprise Linux (RHEL).
AlmaLinux 8 was chosen for CMS Run 3 data processing (AlmaLinux 9 is also available and works for most user software; see these slides for the gory details).
CMS users are allocated storage space in a few places: (1) a small home directory (2 GB) at /uscms/home/username; (2) a medium storage directory (200 GB, not backed up!) at /uscms_data/d[1-3]/username, which is softlinked in your home directory at /uscms/home/username/nobackup; and (3) a large storage directory on EOS (2 TB) (special filesystem, more info later in this lesson).
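Because the home and nobackup areas have fairly small quotas, it can help to check how much space a directory is using. The following is a minimal sketch using the standard du utility; the dir_usage_kb helper and the throwaway demo directory are hypothetical names used for illustration only, and on cmslpc you would pass a real path such as ~/nobackup instead.

```shell
# Hypothetical helper: print the disk usage of a directory in kilobytes.
dir_usage_kb() {
    # -s: one summary line; -k: report in 1 KB blocks; cut keeps only the number
    du -sk "$1" | cut -f1
}

# Demo on a throwaway directory (an assumption for illustration only).
demo_dir=$(mktemp -d)
head -c 1048576 /dev/zero > "$demo_dir/example.dat"   # write a 1 MB file
echo "usage: $(dir_usage_kb "$demo_dir") KB"

# For a human-readable total of a real area, something like this should work:
#   du -sh ~/nobackup
```

The exact number printed depends on the filesystem's block size, so treat it as an estimate rather than an exact byte count.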
The lxplus cluster is configured similarly, with slightly different paths and quotas allocated to users (note that lxplus.cern.ch is an alias for lxplus9.cern.ch, a login node running AlmaLinux 9; use lxplus8.cern.ch to get an AlmaLinux 8 login node).
Logging in
Let’s try logging in to cmslpc using SSH. SSH is a very widely used program for logging into remote Unix clusters; you can check out the HSF SSH exercise to learn more, but for now you can just follow the commands in this exercise. The authentication for cmslpc uses Kerberos (your university cluster may allow simple password login or certificate login, which are not covered here).
First, if you have not configured SSH and Kerberos on your own computer, please follow these directions. Once you have the cmslpc-specific SSH and Kerberos configuration, execute the following command in the terminal on your own computer:
kinit <YourUsername>@FNAL.GOV
# kinit <YourUsername>@CERN.CH for lxplus users
Enter the Kerberos password for your Fermilab account. You now have login access to cmslpc for 24 hours. If you get an error message, double-check that you configured Kerberos properly, or head to Mattermost for help.
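Before trying to connect, it can be useful to confirm that kinit actually left you with a valid ticket. The sketch below wraps klist -s, which (in MIT Kerberos) produces no output and exits with status 0 only when a readable, unexpired credential cache exists. The check_ticket helper name and its messages are hypothetical, chosen for illustration.

```shell
# Hypothetical helper: report whether a usable Kerberos ticket is present.
check_ticket() {
    # Bail out early if the Kerberos client tools are not installed.
    if ! command -v klist >/dev/null 2>&1; then
        echo "klist not found: install and configure Kerberos first"
        return 2
    fi
    # klist -s is silent; only its exit status matters here.
    if klist -s; then
        echo "valid ticket: ready to ssh"
    else
        echo "no valid ticket: run kinit <YourUsername>@FNAL.GOV"
        return 1
    fi
}

# "|| true" keeps a non-zero status from aborting a script run with set -e.
check_ticket || true
```

Running plain klist (without -s) lists the tickets and their expiry times, which is handy when the 24-hour window is about to run out.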
Next, execute the following to log in:
ssh -Y <YourUsername>@cmslpc-el8.fnal.gov
# ssh -Y <YourUsername>@lxplus8.cern.ch for lxplus users
If you see a welcome message followed by a command prompt, congratulations, you have successfully logged in! The commands you enter into the command prompt will now be executed on the cmslpc interactive node. If you see an error message instead, something has probably gone wrong with the authentication; please head to Mattermost and post your error message, and an instructor can help you out.
Running simple commands on cmslpc
Note: this exercise will only work on cmslpc-el8.
In this exercise, we will run a few simple commands on cmslpc. At the end, you will type an answer into the Google form (experienced users should feel free to breeze through, but please do submit your answer so we can follow everyone’s progress).
First, make a folder for your CMSDAS work, and cd into that directory:
mkdir ~/nobackup/cmsdas
cd ~/nobackup/cmsdas
Use the following command to copy the runThisCommand.py script:
cp ~cmsdas/preexercises/runThisCommand.py .
Next, copy and paste the following and then hit return:
python3 runThisCommand.py "asdf;klasdjf;kakjsdf;akjf;aksdljf;a" "sldjfqewradsfafaw4efaefawefzdxffasdfw4ffawefawe4fawasdffadsfef"
The response should be your username followed by an alphanumeric string of characters unique to your username, for example for a user named gbenelli:
success: gbenelli toraryyv
If you execute the command without copy-pasting (i.e. only ./runThisCommand.py without the additional arguments), the command will return:
Error: You must provide the secret key
Alternatively, copying incorrectly (i.e. different arguments) will return:
Error: You didn't paste the correct input string
If you are not running on cmslpc-el8 (for example locally on a laptop), trying to run the command will result in:
bash: ./runThisCommand.py: No such file or directory
or (for example):
Unknown user: gbenelli.
Question 2.1
Copy-and-paste the alphanumeric string of characters unique to your username in the Google form.
Editing files on cmslpc
The purpose of this exercise is to ensure that the user can edit files. We will first copy and then edit the editThisCommand.py script.
Users of cmslpc have several options for editing remote files. Here are a few examples:
- Edit files directly on cmslpc using a terminal-based code editor, such as nano, emacs (opens a GUI by default, which is slow over SSH; use emacs -nw to get a terminal-based editor instead), or vim. emacs and vim, in particular, have lots of features for code editing, but also have a steep learning curve.
- Edit files on your own computer in the terminal (with the same programs), and upload using, e.g., scp myscript.py username@cmslpc-el8.fnal.gov:my/folder.
- Use an application like Visual Studio Code or Sublime Text, either directly on cmslpc (using a remote filesystem plugin, which makes your directory on cmslpc appear as a folder on your computer) or on your own computer (using an SSH or SFTP plugin to automatically upload files to cmslpc). These also have lots of features, and are easier to learn than emacs or vim.
For the sake of this lesson, we will simply edit a file directly on cmslpc, using nano, emacs, or vim. On the cmslpc-el8 cluster, run:
cd ~/nobackup/cmsdas
cp ~cmsdas/preexercises/editThisCommand.py .
Then open editThisCommand.py with your favorite editor (e.g. emacs -nw editThisCommand.py).
The 11th line of this python script throws an error, meaning that python prints an error message and ends the program immediately.
For this exercise, add a # character at the start of the 11th line, which turns the line of code into a comment that is skipped by the python interpreter.
Specifically, change:
# Please comment the line below out by adding a '#' to the front of
# the line.
raise RuntimeError("You need to comment out this line with a #")
to:
# Please comment the line below out by adding a '#' to the front of
# the line.
#raise RuntimeError("You need to comment out this line with a #")
(For vim users: you need to press “i” to insert text.) Save the file (e.g. in emacs, type ctrl+x ctrl+s to save, ctrl+x ctrl+c to quit the editor; in vim, press ESC to exit insert mode, then type “:wq” to save and exit) and execute the command:
./editThisCommand.py
If this is successful, the result will be:
success: gbenelli 0x6D0DB4E0
If the file has not been successfully edited, an error message will result such as:
Traceback (most recent call last):
  File "./editThisCommand.py", line 11, in <module>
    raise RuntimeError("You need to comment out this line with a #")
RuntimeError: You need to comment out this line with a #
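For reference, the same one-character edit can also be made without opening an editor. The sketch below uses sed to prefix line 11 with a # character. It operates on a hypothetical stand-in file built on the spot (ten filler lines followed by the raise statement), not on the real editThisCommand.py, whose other contents are not shown in this lesson.

```shell
# Hypothetical stand-in for the real script: 10 filler lines, then the raise.
demo=$(mktemp)
for i in 1 2 3 4 5 6 7 8 9 10; do echo "# filler line $i"; done > "$demo"
echo 'raise RuntimeError("You need to comment out this line with a #")' >> "$demo"

# Prefix line 11 with '#'. Writing to a temporary file avoids relying on
# sed -i, whose syntax differs between GNU and BSD sed. Copying the contents
# back with cat (rather than mv) keeps the original file's permissions,
# such as the executable bit.
sed '11s/^/#/' "$demo" > "$demo.new" && cat "$demo.new" > "$demo" && rm "$demo.new"

# Show the edited line.
sed -n '11p' "$demo"
```

For the exercise itself, please still practice with an interactive editor, since that is the skill being checked.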
Question 2.2
Copy-and-paste the line beginning with “success”, resulting from the execution of ./editThisCommand.py, into the Google form.
Using the EOS filesystem
Your biggest storage on cmslpc is the EOS filesystem, which behaves differently from a normal Unix folder.
EOS has dedicated commands for interacting with the filesystem, rather than the usual ls, cp, mkdir, etc.
Also, files are addressed using so-called “logical filenames” (LFNs), which you can think of as their location inside EOS, rather than their absolute location (or physical filename, PFN).
The LFNs usually start with /store/... (for example, /store/user/cmsdas/preexercises/).
Click here for the full documentation; here are a few essential commands.
- On cmslpc, the equivalent of ls is eosls: for example, eosls /store/user/cmsdas/preexercises/. This is actually a cmslpc-specific alias for eos root://cmseos.fnal.gov ls; on other clusters, you’ll have to use this full command. A useful variant is eosls -alh, which will print folder and file sizes.
- Similarly to ls, the cmslpc-specific equivalents of mkdir and mv are eosmkdir and eosmv. (You can do alias eosmkdir to see the full command behind the alias.)
- The equivalent of cp is xrdcp: for example, xrdcp root://cmseos.fnal.gov//store/user/cmsdas/preexercises/DYJetsToLL_M50_NANOAOD.root . (the final dot is the destination, i.e. the current directory). The root://cmseos.fnal.gov bit tells xrdcp which EOS instance to use (only one instance for cmslpc users; lxplus has several, e.g., root://eoscms.cern.ch for CMS data and root://eosuser.cern.ch for user data).
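To make the relationship between an LFN and the address passed to xrdcp concrete, here is a minimal sketch that joins an EOS instance and an LFN into a full URL. The lfn_to_url helper is a hypothetical name, not a cmslpc command, and /store/some/file.root is a placeholder path; the default instance is the cmslpc one named above.

```shell
# Hypothetical helper: turn an LFN into a full xrootd URL.
# The second argument is optional and defaults to the cmslpc EOS instance.
lfn_to_url() {
    lfn=$1
    instance=${2:-root://cmseos.fnal.gov}
    # The LFN keeps its own leading slash, which yields the "//" after the host.
    printf '%s/%s\n' "$instance" "$lfn"
}

# URL for the file used in this lesson.
lfn_to_url /store/user/cmsdas/preexercises/DYJetsToLL_M50_NANOAOD.root

# A different instance can be passed explicitly (placeholder path).
lfn_to_url /store/some/file.root root://eoscms.cern.ch
```

The double slash is intentional: the first slash separates the host from the path, and the second is the leading slash of the LFN itself.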
Question 2.3
We will copy a small file from EOS to your nobackup area, containing 10,000 simulated $Z\rightarrow\mu^+\mu^-$ events in the CMS NanoAOD format. We will use this file in later exercises, so make sure not to lose track of it.
Execute the following:
cd ~/nobackup/cmsdas/
xrdcp root://cmseos.fnal.gov//store/user/cmsdas/preexercises/DYJetsToLL_M50_NANOAOD.root .
Using ls -lh DYJetsToLL_M50_NANOAOD.root, how big is this file? (It’s the large number.) Write the answer in the Google form.
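If you are unsure which field of the ls -lh output is the size, the sketch below may help. It creates a small throwaway file of known size (a hypothetical demo file, not the NanoAOD file) and shows the size both in the long listing and as a plain byte count.

```shell
# Create a throwaway file of exactly 2048 bytes (illustration only).
demo=$(mktemp)
head -c 2048 /dev/zero > "$demo"

# Long listing fields: permissions, link count, owner, group, SIZE, date, name.
# With -h the size is shown in human-readable units (K, M, G).
ls -lh "$demo"

# The same size in plain bytes, without having to parse the ls output.
size_bytes=$(wc -c < "$demo" | tr -d ' ')
echo "size in bytes: $size_bytes"
```

The size is the field just before the date in the listing; the smaller number near the start of the line is the link count.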
Key Points
Learn how to use the CMSLPC cluster