From HPC Wiki
Introduction to Linux in HPC/Linux in HPC
[[Category:Tutorials]]
{{DISPLAYTITLE:<span style="position:absolute; top:-9999px;">Introduction to Linux in HPC/</span>Linux in HPC}}
__TOC__

{{Infobox_linux_introduction}}

=== Video === <!--T:5-->

<youtube width="600" height="400" right>IfD9IPixgpo</youtube>

[https://git-ce.rwth-aachen.de/hpc.nrw/ap2/tutorials/linux/-/blob/master/Slides/Linux_HPC/Linux_HPC.pdf Linux in HPC]  Slides 3 - 40 (38 pages)

=== Quiz === <!--T:5-->

{{hidden begin
|title = 1. Which command can you use to make a secure copy from the cluster to your local Linux machine?<br />
Hint: <code>man scp</code>
}}
<quiz display=simple>
{
|type="()"}
+ Click and submit to see the answer
|| <code>scp</code>
|| Example usage: <code>scp your_username@remotehost.edu:foobar.txt /some/local/directory</code>
</quiz>
{{hidden end}}

{{hidden begin
|title = 2. Label the interface elements in the terminal:
}}
<quiz display=simple>
{ [[File:Linux_hpc_quiz.png|frame|500px]]
| type="()" }
- 1. shell command
- 2. current prompt
- 3. previous prompt
- 4. cursor
- 5. login message
- 6. command output
+ Click and submit to see the answer
|| [[File:Linux_quiz_answer.png|frame|500px]]
</quiz>
{{hidden end}}

{{Warning|mode=info|text= '''Integrated in slides'''}}

{{Warning|mode=warn|text= '''Integrated in slides'''}}

=== Exercises for Linux in HPC: GO CP2K GO! === <!--T:5-->

CP2K is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systems. In this exercise we are going to
    1. create CP2K input files with different cutoff values from a template input for the simulation of 32 water molecules in a box using density functional theory (DFT).
    2. analyse the simulation output files and summarize some important results.

'''Create CP2K input files'''

The CP2K template input file for the simulation of 32 water molecules in a box using DFT is <code>template.inp</code> in the <code>Ex_LinuxHPC/01_CreateInput</code> directory.
A placeholder <code>__CUTOFF__</code> is set on line 7 of this file:

      <code>CUTOFF __CUTOFF__</code>

With a smaller cutoff value the DFT calculation runs faster, but the results may be less accurate. With a larger cutoff value the results become more accurate, but the DFT calculation can be slower.

In this exercise we create CP2K input files based on the template (<code>template.inp</code>) for a range of cutoff values, here from 250 to 350 with a step size of 10. Please write a bash script that:
    1. creates an individual subdirectory for each simulation, one per cutoff value.
    2. in each subdirectory, creates the CP2K input file from the template file, replacing the placeholder <code>__CUTOFF__</code> with the corresponding cutoff value. For example, the first input file should contain

          <code>CUTOFF 250</code>

    the second input file should contain

          <code>CUTOFF 260</code>

    and so on until <code>CUTOFF 350</code> in the last CP2K input file.
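
The two steps above could be approached along the following lines. This is only a sketch of one possible solution, not the reference solution from the slides. The directory names (<code>cutoff_250</code>, ...), the output file name <code>run.inp</code>, and the two-line stand-in template are assumptions made so that the example runs on its own; in the exercise you would use the provided <code>template.inp</code> instead.

```shell
#!/usr/bin/env bash
# Sketch: generate one CP2K input file per cutoff value from a template.
set -euo pipefail

# Assumption: work in a scratch directory with a tiny stand-in template.
# In the exercise, use Ex_LinuxHPC/01_CreateInput/template.inp instead.
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' '# stand-in template' '      CUTOFF __CUTOFF__' > template.inp

# Loop over the cutoff values 250, 260, ..., 350.
for ((cutoff = 250; cutoff <= 350; cutoff += 10)); do
    dir="cutoff_${cutoff}"          # hypothetical directory naming scheme
    mkdir -p "$dir"
    # Replace the placeholder with the current cutoff value and save the
    # result as the input file of this subdirectory.
    sed "s/__CUTOFF__/${cutoff}/g" template.inp > "${dir}/run.inp"
done

# Show what was generated.
grep -H 'CUTOFF' cutoff_*/run.inp
```

Using <code>sed</code> with a redirect leaves the template untouched, so the script can be rerun safely with a different range of values.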

Note: Due to time limitations we cannot run all of these CP2K simulations during this exercise. However, example CP2K input (<code>run.inp</code>) and output (<code>run.out</code>) files for different cutoff values can be found in <code>Ex_LinuxHPC/02_AnalyseOutput</code>.

'''Analyse CP2K output files'''

In the CP2K output file, e.g. <code>run.out</code>, the most important information is printed after every simulation step. For example:

<syntaxhighlight lang="bash">
*******************************************************************************
ENSEMBLE TYPE                =                                              NVE
STEP NUMBER                  =                                                1
TIME [fs]                    =                                         0.500000
CONSERVED QUANTITY [hartree] =                              -0.545718508103E+03

                                             INSTANTANEOUS             AVERAGES
CPU TIME [s]                 =                        8.34                 8.34
ENERGY DRIFT PER ATOM [K]    =         -0.172713513639E+02   0.000000000000E+00
POTENTIAL ENERGY[hartree]    =         -0.545966997800E+03  -0.545966997800E+03
KINETIC ENERGY [hartree]     =          0.248489696633E+00   0.248489696633E+00
TEMPERATURE [K]              =                     550.644              550.644
*******************************************************************************
</syntaxhighlight>
  
The block above is for step number 1 (see the <code>STEP NUMBER</code> line). Among these data the most useful results are:
    1. the time step of the simulation, on the line beginning with <code>TIME [fs]</code>.
    2. the potential energy for the step, on the line beginning with <code>POTENTIAL ENERGY[hartree]</code>. Note that the relevant energy value is the one in the <code>INSTANTANEOUS</code> column.
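
To illustrate how these two values can be picked out of the output, here is a minimal <code>awk</code> sketch. It assumes the line layout shown in the sample above, where the time is the last field on the <code>TIME [fs]</code> line and the <code>INSTANTANEOUS</code> value is the second-to-last field on the <code>POTENTIAL ENERGY[hartree]</code> line. The temporary file is a stand-in so that the example runs on its own.

```shell
#!/usr/bin/env bash
# Sketch: extract time and instantaneous potential energy from CP2K output.
set -euo pipefail

# Stand-in output file holding a few lines of the sample block above.
outfile=$(mktemp)
cat > "$outfile" <<'EOF'
 STEP NUMBER                  =                                                1
 TIME [fs]                    =                                         0.500000
 POTENTIAL ENERGY[hartree]    =         -0.545966997800E+03  -0.545966997800E+03
 KINETIC ENERGY [hartree]     =          0.248489696633E+00   0.248489696633E+00
EOF

# Remember the time from each "TIME [fs]" line, then print it together with
# the INSTANTANEOUS potential energy (second-to-last field) of the same step.
summary=$(awk '
    /^ *TIME \[fs\]/      { time = $NF }
    /^ *POTENTIAL ENERGY/ { print time, $(NF-1) }
' "$outfile")

echo "$summary"
```

Counting fields from the end of the line (<code>$NF</code>, <code>$(NF-1)</code>) avoids depending on whether the label contains a space before the unit, as in <code>TIME [fs]</code> versus <code>ENERGY[hartree]</code>.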
  
In this exercise please:
    1. write a script that summarizes and prints the time step and potential energy for a single CP2K output file.
    2. create a bash script that loops through all CP2K output files and prints the time step and potential energy for each one automatically.
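
The second task could be sketched as a loop over all output files, combining <code>find</code>, <code>sort</code>, and a small reporting function. This is one possible approach under stated assumptions, not the solution from the slides: the directory layout (<code>cutoff_250/run.out</code>, ...), the function name <code>report</code>, and the made-up energy values in the stand-in files are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: summarize time step and potential energy for many output files.
set -euo pipefail

# Assumption: each simulation has its own subdirectory containing run.out.
# Two tiny stand-in files with made-up values keep the sketch self-contained.
root=$(mktemp -d)
for cutoff in 250 260; do
    mkdir -p "$root/cutoff_$cutoff"
    cat > "$root/cutoff_$cutoff/run.out" <<'EOF'
 TIME [fs]                    =                                         0.500000
 POTENTIAL ENERGY[hartree]    =         -0.100000000000E+01  -0.100000000000E+01
EOF
done

# Print time step and instantaneous potential energy for one output file.
report() {
    awk '
        /^ *TIME \[fs\]/      { time = $NF }
        /^ *POTENTIAL ENERGY/ { printf "%s  %s\n", time, $(NF-1) }
    ' "$1"
}

# Visit the output files in sorted order and print a labelled summary.
summary=$(
    find "$root" -name run.out | sort | while read -r f; do
        echo "== ${f#"$root"/}"
        report "$f"
    done
)

echo "$summary"
```

Sorting the file list before looping keeps the summary in a stable order, which makes results for different cutoff values easier to compare.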
  
  
=== Slide Layout === <!--T:5-->

    page 3:

        Shell scripts

        Shell utilities
 
        Combine multiple programs via pipeline.
 
    page 4 - 5:  
 
        space is important
 
        string comparison
 
        integer comparison
 
        other operators
 
    page 6:  
 
        for-loop
 
        while-loop
 
    page 7:  
 
        grep
 
        sed
 
        awk
 
    page 8 - 9:
 
        command1 | command2
 
        echo "scale=64; 355.0/113.0" | bc -l
 
        Tips for pipeline
 
    page 10 - 11:  
 
        mature grep
 
        simpler grep
 
    page 12:  
 
        create input files
 
        run simulations
 
        collect results
 
    page 13:
 
        step 1: prepare input template
 
    page 14:
 
        step 2: replace placeholder using sed
 
    page 15:
 
        step 3: loop through all parameters
 
    page 16:
 
        step 4: save input files
 
    page 17:
 
        step 5: run simulations
 
    page 18:
 
        identify useful data in output
 
    page 19:
 
        small steps for writing a script
 
    page 20:
 
        multiple solutions are possible
 
    page 21:
 
        filename for output
 
    page 22:
 
        get boxsize
 
    page 23:
 
        init variables
 
    page 24:
 
        accumulate avgT and count numT
 
    page 25:
 
        calculate avgT
 
    page 26:
 
        print the results (explain printf and format)
 
    page 27:
 
        execute report.sh
 
    page 28:
 
        three parts in awk script
 
    page 29:  
 
        init variables
 
    page 30:
 
        get boxsize
 
    page 31:
 
        accumulate avgT and increment numT
 
    page 32:
 
        print the results
 
    page 33:
 
        execute report.awk
 
    page 34:
 
        How to summarize the final results?
 
    page 35:
 
        step 1: find and sort
 
    page 36:
 
        step 2: loop the list of output
 
    page 37:
 
        step 3: use either report.sh or report.awk
 
    page 38:
 
        show final report
 
    page 39:
 
        Environment modules for HPC users
 
        module cmd
 
    page 40:
 
        Recommendations
 

Latest revision as of 15:34, 3 November 2020




