https://hpc-wiki.info/hpc/api.php?action=feedcontributions&user=Christian-wassermann-e30b%40rwth-aachen.de&feedformat=atomHPC Wiki - User contributions [en]2024-03-28T23:47:09ZUser contributionsMediaWiki 1.35.9https://hpc-wiki.info/hpc/index.php?title=User:Christian-wassermann-e30b@rwth-aachen.de&diff=1025User:Christian-wassermann-e30b@rwth-aachen.de2018-08-29T07:06:30Z<p>Christian-wassermann-e30b@rwth-aachen.de: Created page with "My Page 😊"</p>
<hr />
<div>My Page 😊</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=1006HPC-Dictionary2018-04-23T15:45:20Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== Unix ==<br />
<br />
Unix describes a family of operating systems. Popular representatives include Ubuntu, CentOS and even MacOS, although the latter is not common on HPC systems. Main key features are its shell and file system.<br />
<br />
== File System ==<br />
<br />
The file system describes the directory structure of an operating system. On Unix-based systems the top most directory is <code>/</code>, which is called the root directory. As the name may suggest the file system is organized hierarchically (like a tree) from there on out. Most of the time you will be working in directories starting with <code>/home/<username></code>, which represents the user's home directory. All directories starting with <code>/home/<username></code> can freely be modified to the will of the user.<br />
<br />
== Environment Variable ==<br />
<br />
An environment variable is a dynamic object on a computer, which stores a value. On Unix-based operating systems you can:<br />
<br />
* set the value of a variable with: <code>export <variable-name>=<value></code><br />
* read the value of a variable with: <code>echo $<variable-name></code><br />
<br />
Environment variables can be referenced by software (or the user) to get or set information about the system. Down below are a few examples of environment variables, which might give you an idea for their use and usefulness.<br />
<br />
{| class="wikitable"<br />
|+ Common Environment Variables on Unix Systems<br />
|-<br />
! Environment Variable<br />
! Content<br />
|-<br />
| <code>$USER</code><br />
| your current username<br />
|-<br />
| <code>$PWD</code><br />
| the directory you are currently in<br />
|-<br />
| <code>$HOSTNAME</code><br />
| hostname of the computer you are on<br />
|-<br />
| <code>$HOME</code><br />
| your home directory<br />
|-<br />
| <code>$PATH</code><br />
| list of directories searched for when a command is executed<br />
|}<br />
<br />
== Cluster ==<br />
<br />
A cluster refers to a collection of multiple nodes, which are connected via a network offering high bandwidth with low latency communication. Accessing a cluster is possible by connecting to its specific login nodes.<br />
<br />
== Node ==<br />
<br />
[[File:Hardware_hierarchy.PNG|thumb|500px|Visualization of a typical hardware hierarchy on a cluster]]<br />
<br />
A node is an individual computer consisting of one or more sockets.<br />
<br />
=== Backend Node ===<br />
<br />
Backend nodes are reserved for executing memory demanding and long running applications. They are the most powerful, but also most power consuming part of a cluster as they make up around 98% of it. Since these nodes are not directly accessible by the user, a scheduler manages their access. In order to run on these nodes, a batch job needs to be submitted to the batch system via a scheduler specific command.<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login nodes]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== Socket ==<br />
<br />
A socket is the physical package in which multiple cores are enclosed sharing the same memory.<br />
<br />
== Core ==<br />
<br />
A core has one or more hardware threads and is responsible for executing instructions.<br />
<br />
== Thread ==<br />
<br />
Several threads belong to a single process and share an address space, but each thread has its own stack.<br />
<br />
== Central Processing Unit (CPU) ==<br />
<br />
The word "CPU" is widely used in the field of HPC though not precisely defined. It is mostly used to describe the concrete hardware architecture of a node, but should generally be avoided due to possible misunderstandings and ambiguities.<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
[[File:Memory_hierarchy.PNG|thumb|500px|Visualization of the memory hierarchy a.k.a. the memory pyramid]]<br />
<br />
The RAM is used as working memory for the cores. This is volatile memory meaning, that after a process ends the data in the RAM is no longer available. The RAM is shared between all sockets on a node, though it is physically separated for each socket.<br />
<br />
== Cache ==<br />
<br />
A cache is a relatively small amount of fast memory (compared to RAM), on the CPU chip. A modern CPU has three cache levels: L1 and L2 are specific to each core, while L3 (or Last Level Cache (LLC)) is shared among all cores of a CPU.<br />
<br />
== Scalability ==<br />
<br />
Scalability represents a property of software, that describes how good an application can use an increased number of hardware resources. Good scalability would mean a decrease in runtime when more and more cores are used to solve the problem. Typically applications reach an upper bound regarding a number of cores "beyond they stop scaling", which means that the execution time stops going down (or might even increase again). <br />
<br />
However, good scalability can also imply that the execution time remains the same, when the hardware resources and the problem size increase simultaneously.</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=1005HPC-Dictionary2018-04-23T15:43:52Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== Unix ==<br />
<br />
Unix describes a family of operating systems. Popular representatives include Ubuntu, CentOS and even MacOS, although the latter is not common on HPC systems. Main key features are its shell and file system.<br />
<br />
== File System ==<br />
<br />
The file system describes the directory structure of an operating system. On Unix-based systems the top most directory is <code>/</code>, which is called the root directory. As the name may suggest the file system is organized hierarchically (like a tree) from there on out. Most of the time you will be working in directories starting with <code>/home/<username></code>, which represents the user's home directory. All directories starting with <code>/home/<username></code> can freely be modified to the will of the user.<br />
<br />
== Environment Variable ==<br />
<br />
An environment variable is a dynamic object on a computer, which stores a value. On Unix-based operating systems you can:<br />
<br />
* set the value of a variable with: <code>export <variable-name>=<value></code><br />
* read the value of a variable with: <code>echo $<variable-name></code><br />
<br />
Environment variables can be referenced by software (or the user) to get or set information about the system. Down below are a few examples of environment variables, which might give you an idea for their use and usefulness.<br />
<br />
{| class="wikitable"<br />
|+ Common Environment Variables on Unix Systems<br />
|-<br />
! Environment Variable<br />
! Content<br />
|-<br />
| <code>$USER</code><br />
| your current username<br />
|-<br />
| <code>$PWD</code><br />
| the directory you are currently in<br />
|-<br />
| <code>$HOSTNAME</code><br />
| hostname of the computer you are on<br />
|-<br />
| <code>$HOME</code><br />
| your home directory<br />
|-<br />
| <code>$PATH</code><br />
| list of directories searched for when a command is executed<br />
|}<br />
<br />
== Cluster ==<br />
<br />
A cluster refers to a collection of multiple nodes, which are connected via a network offering high bandwidth with low latency communication. Accessing a cluster is possible by connecting to its specific login nodes.<br />
<br />
== Node ==<br />
<br />
[[File:Hardware_hierarchy.PNG|thumb|500px|Visualization of a typical hardware hierarchy on a cluster]]<br />
<br />
A node is an individual computer consisting of one or more sockets.<br />
<br />
=== Backend Node ===<br />
<br />
Backend nodes are reserved for executing memory demanding and long running applications. They are the most powerful, but also most power consuming part of a cluster as they make up around 98% of it. Since these nodes are not directly accessible by the user, a scheduler manages their access. In order to run on these nodes, a batch job needs to be submitted to the batch system via a scheduler specific command.<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login nodes]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== Socket ==<br />
<br />
A socket is the physical package in which multiple cores are enclosed sharing the same memory.<br />
<br />
== Core ==<br />
<br />
A core has one or more hardware threads and is responsible for executing instructions.<br />
<br />
== Thread ==<br />
<br />
Several threads belong to a single process and share an address space, but each thread has its own stack.<br />
<br />
== Central Processing Unit (CPU) ==<br />
<br />
The word "CPU" is widely used in the field of HPC though not precisely defined. It is mostly used to describe the concrete hardware architecture of a node, but should generally be avoided due to possible misunderstandings and ambiguities.<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
[[File:Memory_hierarchy.PNG|thumb|500px|Visualization of the memory hierarchy a.k.a. memory pyramid]]<br />
<br />
The RAM is used as working memory for the cores. This is volatile memory meaning, that after a process ends the data in the RAM is no longer available. The RAM is shared between all sockets on a node, though it is physically separated for each socket.<br />
<br />
== Cache ==<br />
<br />
A cache is a relatively small amount of fast memory (compared to RAM), on the CPU chip. A modern CPU has three cache levels: L1 and L2 are specific to each core, while L3 (or Last Level Cache (LLC)) is shared among all cores of a CPU.<br />
<br />
== Scalability ==<br />
<br />
Scalability represents a property of software, that describes how good an application can use an increased number of hardware resources. Good scalability would mean a decrease in runtime when more and more cores are used to solve the problem. Typically applications reach an upper bound regarding a number of cores "beyond they stop scaling", which means that the execution time stops going down (or might even increase again). <br />
<br />
However, good scalability can also imply that the execution time remains the same, when the hardware resources and the problem size increase simultaneously.</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=File:Memory_hierarchy.PNG&diff=1004File:Memory hierarchy.PNG2018-04-23T15:35:10Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div></div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=937HPC-Dictionary2018-04-17T15:38:13Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* Scalability */</p>
<hr />
<div>== Unix ==<br />
<br />
Unix describes a family of operating systems. Popular representatives include Ubuntu, CentOS and even MacOS, although the latter is not common on HPC systems. Main key features are its shell and file system.<br />
<br />
== File System ==<br />
<br />
The file system describes the directory structure of an operating system. On Unix-based systems the top most directory is <code>/</code>, which is called the root directory. As the name may suggest the file system is organized hierarchicly (like a tree) from there on out. Most of the time you will be working in directories starting with <code>/home/<username></code>, which represents the user's home directory. All directories starting with <code>/home/<username></code> can freely be modified to the will of the user.<br />
<br />
== Environment Variable ==<br />
<br />
An environment variable is a dynamic object on a computer, which stores a value. Under Unix-based operating systems you can:<br />
<br />
* set the value of a variable with: <code>export <variable-name>=<value></code><br />
* read the value of a variable with: <code>echo $<variable-name></code><br />
<br />
Environment variables can be referenced by software (or the user) to get or set information about the system. Down below are a few examples of environment variables, which might give you an idea for their use and usefulness.<br />
<br />
{| class="wikitable"<br />
|+ Common Environment Variable on Unix Systems<br />
|-<br />
! Environment Variable<br />
! Content<br />
|-<br />
| <code>$USER</code><br />
| your current username<br />
|-<br />
| <code>$PWD</code><br />
| the directory you are currently in<br />
|-<br />
| <code>$HOSTNAME</code><br />
| hostname of the computer you are on<br />
|-<br />
| <code>$HOME</code><br />
| your home directory<br />
|-<br />
| <code>$PATH</code><br />
| list of directories searched for when a command is executed<br />
|}<br />
<br />
== Cluster ==<br />
<br />
A cluster referes to a collection of multiple nodes, which are connected via a network offering high bandwidth with low lateny communication. Accessing a cluster is possible by connecting to its specific login nodes.<br />
<br />
== Node ==<br />
<br />
A node is an individual computer consisting of one or more sockets.<br />
<br />
=== Backend Node ===<br />
<br />
Backend nodes are reserved for executing memory demanding and long running applications. They are the most powerful, but also most power consuming part of a cluster as they make up around 98% of it. Since these nodes are not directly accessable by the user, a scheduler manages their access. In order to run on these nodes, a batch job needs to be submitted to the batch system via a scheduler specific command.<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login node]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== Socket ==<br />
<br />
A socket is the physical package in which multiple cores are enclosed sharing the same memory.<br />
<br />
== Core ==<br />
<br />
A core has one or more hardware threads and is respnsible for executing instructions.<br />
<br />
== Thread ==<br />
<br />
TODO<br />
<br />
== Central Processing Unit (CPU) ==<br />
<br />
The word "CPU" is widely used in the field of HPC though not precisely defined. It is mostly used to describe the concrete hardware architecture of a node, but should generally be avoided due to possible misunderstandings and ambiguities.<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
The RAM is used as working memory for the cores. This is volatile memory meaning, that after a process ends the data in the RAM is no longer available. The RAM is shared between all sockets on a node, though it is physically seperated for each socket.<br />
<br />
== Cache ==<br />
<br />
A cache is a relatively small amount of fast memory (compared to RAM), on the CPU chip. A modern CPU has three cache levels: L1 and L2 are specific to each core, while L3 (or Last Level Cache (LLC)) is shared among all cores of a CPU.<br />
<br />
== Scalability ==<br />
<br />
Scalability represents a property of software, that describes how good an application can use an increased number of hardware resources. Good scalability would mean a decrease in runtime when more and more cores are used to solve the problem. Typically applications reach an upper bound regarding a number of cores "beyond they stop scaling", which means that the execution time stops going down (or might even increase again).</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=936HPC-Dictionary2018-04-17T15:36:11Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* Cache */</p>
<hr />
<div>== Unix ==<br />
<br />
Unix describes a family of operating systems. Popular representatives include Ubuntu, CentOS and even MacOS, although the latter is not common on HPC systems. Main key features are its shell and file system.<br />
<br />
== File System ==<br />
<br />
The file system describes the directory structure of an operating system. On Unix-based systems the top most directory is <code>/</code>, which is called the root directory. As the name may suggest the file system is organized hierarchicly (like a tree) from there on out. Most of the time you will be working in directories starting with <code>/home/<username></code>, which represents the user's home directory. All directories starting with <code>/home/<username></code> can freely be modified to the will of the user.<br />
<br />
== Environment Variable ==<br />
<br />
An environment variable is a dynamic object on a computer, which stores a value. Under Unix-based operating systems you can:<br />
<br />
* set the value of a variable with: <code>export <variable-name>=<value></code><br />
* read the value of a variable with: <code>echo $<variable-name></code><br />
<br />
Environment variables can be referenced by software (or the user) to get or set information about the system. Down below are a few examples of environment variables, which might give you an idea for their use and usefulness.<br />
<br />
{| class="wikitable"<br />
|+ Common Environment Variable on Unix Systems<br />
|-<br />
! Environment Variable<br />
! Content<br />
|-<br />
| <code>$USER</code><br />
| your current username<br />
|-<br />
| <code>$PWD</code><br />
| the directory you are currently in<br />
|-<br />
| <code>$HOSTNAME</code><br />
| hostname of the computer you are on<br />
|-<br />
| <code>$HOME</code><br />
| your home directory<br />
|-<br />
| <code>$PATH</code><br />
| list of directories searched for when a command is executed<br />
|}<br />
<br />
== Cluster ==<br />
<br />
A cluster referes to a collection of multiple nodes, which are connected via a network offering high bandwidth with low lateny communication. Accessing a cluster is possible by connecting to its specific login nodes.<br />
<br />
== Node ==<br />
<br />
A node is an individual computer consisting of one or more sockets.<br />
<br />
=== Backend Node ===<br />
<br />
Backend nodes are reserved for executing memory demanding and long running applications. They are the most powerful, but also most power consuming part of a cluster as they make up around 98% of it. Since these nodes are not directly accessable by the user, a scheduler manages their access. In order to run on these nodes, a batch job needs to be submitted to the batch system via a scheduler specific command.<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login node]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== Socket ==<br />
<br />
A socket is the physical package in which multiple cores are enclosed sharing the same memory.<br />
<br />
== Core ==<br />
<br />
A core has one or more hardware threads and is respnsible for executing instructions.<br />
<br />
== Thread ==<br />
<br />
TODO<br />
<br />
== Central Processing Unit (CPU) ==<br />
<br />
The word "CPU" is widely used in the field of HPC though not precisely defined. It is mostly used to describe the concrete hardware architecture of a node, but should generally be avoided due to possible misunderstandings and ambiguities.<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
The RAM is used as working memory for the cores. This is volatile memory meaning, that after a process ends the data in the RAM is no longer available. The RAM is shared between all sockets on a node, though it is physically seperated for each socket.<br />
<br />
== Cache ==<br />
<br />
A cache is a relatively small amount of fast memory (compared to RAM), on the CPU chip. A modern CPU has three cache levels: L1 and L2 are specific to each core, while L3 (or Last Level Cache (LLC)) is shared among all cores of a CPU.<br />
<br />
== Scalability ==<br />
<br />
Scalability represents a property of software, that describes how good an application can use an increased number of hardware resources. Good scalability would mean a decrease in runtime when more and more cores are used to solve the problem. Typically applications reach an upper bound regarding a number of cores "beyond they stop scaling".</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=935HPC-Dictionary2018-04-17T15:34:51Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== Unix ==<br />
<br />
Unix describes a family of operating systems. Popular representatives include Ubuntu, CentOS and even MacOS, although the latter is not common on HPC systems. Main key features are its shell and file system.<br />
<br />
== File System ==<br />
<br />
The file system describes the directory structure of an operating system. On Unix-based systems the top most directory is <code>/</code>, which is called the root directory. As the name may suggest the file system is organized hierarchicly (like a tree) from there on out. Most of the time you will be working in directories starting with <code>/home/<username></code>, which represents the user's home directory. All directories starting with <code>/home/<username></code> can freely be modified to the will of the user.<br />
<br />
== Environment Variable ==<br />
<br />
An environment variable is a dynamic object on a computer, which stores a value. Under Unix-based operating systems you can:<br />
<br />
* set the value of a variable with: <code>export <variable-name>=<value></code><br />
* read the value of a variable with: <code>echo $<variable-name></code><br />
<br />
Environment variables can be referenced by software (or the user) to get or set information about the system. Down below are a few examples of environment variables, which might give you an idea for their use and usefulness.<br />
<br />
{| class="wikitable"<br />
|+ Common Environment Variable on Unix Systems<br />
|-<br />
! Environment Variable<br />
! Content<br />
|-<br />
| <code>$USER</code><br />
| your current username<br />
|-<br />
| <code>$PWD</code><br />
| the directory you are currently in<br />
|-<br />
| <code>$HOSTNAME</code><br />
| hostname of the computer you are on<br />
|-<br />
| <code>$HOME</code><br />
| your home directory<br />
|-<br />
| <code>$PATH</code><br />
| list of directories searched for when a command is executed<br />
|}<br />
<br />
== Cluster ==<br />
<br />
A cluster referes to a collection of multiple nodes, which are connected via a network offering high bandwidth with low lateny communication. Accessing a cluster is possible by connecting to its specific login nodes.<br />
<br />
== Node ==<br />
<br />
A node is an individual computer consisting of one or more sockets.<br />
<br />
=== Backend Node ===<br />
<br />
Backend nodes are reserved for executing memory demanding and long running applications. They are the most powerful, but also most power consuming part of a cluster as they make up around 98% of it. Since these nodes are not directly accessable by the user, a scheduler manages their access. In order to run on these nodes, a batch job needs to be submitted to the batch system via a scheduler specific command.<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login node]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== Socket ==<br />
<br />
A socket is the physical package in which multiple cores are enclosed sharing the same memory.<br />
<br />
== Core ==<br />
<br />
A core has one or more hardware threads and is respnsible for executing instructions.<br />
<br />
== Thread ==<br />
<br />
TODO<br />
<br />
== Central Processing Unit (CPU) ==<br />
<br />
The word "CPU" is widely used in the field of HPC though not precisely defined. It is mostly used to describe the concrete hardware architecture of a node, but should generally be avoided due to possible misunderstandings and ambiguities.<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
The RAM is used as working memory for the cores. This is volatile memory meaning, that after a process ends the data in the RAM is no longer available. The RAM is shared between all sockets on a node, though it is physically seperated for each socket.<br />
<br />
== Cache ==<br />
<br />
A cache is a relatively small amount of fast memory (compared to RAM), on the CPU chip. A modern CPU has three cache levels: L1 and L2 are specific to each core, while L3 (or Last Level Cache (LLC)) is shared amoung all cores of a CPU.<br />
<br />
== Scalability ==<br />
<br />
Scalability represents a property of software, that describes how good an application can use an increased number of hardware resources. Good scalability would mean a decrease in runtime when more and more cores are used to solve the problem. Typically applications reach an upper bound regarding a number of cores "beyond they stop scaling".</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=934HPC-Dictionary2018-04-17T15:33:56Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* Central Processing Unit (CPU) */</p>
<hr />
<div>== Unix ==<br />
<br />
Unix describes a family of operating systems. Popular representatives include Ubuntu, CentOS and even MacOS, although the latter is not common on HPC systems. Main key features are its shell and file system.<br />
<br />
== File System ==<br />
<br />
The file system describes the directory structure of an operating system. On Unix-based systems the top most directory is <code>/</code>, which is called the root directory. As the name may suggest the file system is organized hierarchicly (like a tree) from there on out. Most of the time you will be working in directories starting with <code>/home/<username></code>, which represents the user's home directory. All directories starting with <code>/home/<username></code> can freely be modified to the will of the user.<br />
<br />
== Environment Variable ==<br />
<br />
An environment variable is a dynamic object on a computer, which stores a value. Under Unix-based operating systems you can:<br />
<br />
* set the value of a variable with: <code>export <variable-name>=<value></code><br />
* read the value of a variable with: <code>echo $<variable-name></code><br />
<br />
Environment variables can be referenced by software (or the user) to get or set information about the system. Down below are a few examples of environment variables, which might give you an idea for their use and usefulness.<br />
<br />
{| class="wikitable"<br />
|+ Common Environment Variable on Unix Systems<br />
|-<br />
! Environment Variable<br />
! Content<br />
|-<br />
| <code>$USER</code><br />
| your current username<br />
|-<br />
| <code>$PWD</code><br />
| the directory you are currently in<br />
|-<br />
| <code>$HOSTNAME</code><br />
| hostname of the computer you are on<br />
|-<br />
| <code>$HOME</code><br />
| your home directory<br />
|-<br />
| <code>$PATH</code><br />
| list of directories searched for when a command is executed<br />
|}<br />
<br />
== Cluster ==<br />
<br />
A cluster referes to a collection of multiple nodes, which are connected via a network offering high bandwidth with low lateny communication. Accessing a cluster is possible by connecting to its specific login nodes.<br />
<br />
== Node ==<br />
<br />
A node is an individual computer consisting of one or more sockets.<br />
<br />
=== Backend Node ===<br />
<br />
Backend nodes are reserved for executing memory demanding and long running applications. They are the most powerful, but also most power consuming part of a cluster as they make up around 98% of it. Since these nodes are not directly accessable by the user, a scheduler manages their access. In order to run on these nodes, a batch job needs to be submitted to the batch system via a scheduler specific command.<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login node]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== Socket ==<br />
<br />
A socket is the physical package in which multiple cores are enclosed sharing the same memory.<br />
<br />
== Core ==<br />
<br />
A core has one or more hardware threads and is respnsible for executing instructions.<br />
<br />
== Central Processing Unit (CPU) ==<br />
<br />
The word "CPU" is widely used in the field of HPC though not precisely defined. It is mostly used to describe the concrete hardware architecture of a node, but should generally be avoided due to possible misunderstandings and ambiguities.<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
The RAM is used as working memory for the cores. This is volatile memory meaning, that after a process ends the data in the RAM is no longer available. The RAM is shared between all sockets on a node, though it is physically seperated for each socket.<br />
<br />
== Cache ==<br />
<br />
A cache is a relatively small amount of fast memory (compared to RAM), on the CPU chip. A modern CPU has three cache levels: L1 and L2 are specific to each core, while L3 (or Last Level Cache (LLC)) is shared amoung all cores of a CPU.<br />
<br />
== Scalability ==<br />
<br />
Scalability represents a property of software, that describes how good an application can use an increased number of hardware resources. Good scalability would mean a decrease in runtime when more and more cores are used to solve the problem. Typically applications reach an upper bound regarding a number of cores "beyond they stop scaling".</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=933HPC-Dictionary2018-04-17T15:31:26Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* Environment Variable */</p>
<hr />
<div>== Unix ==<br />
<br />
Unix describes a family of operating systems. Popular representatives include Ubuntu, CentOS and even MacOS, although the latter is not common on HPC systems. Main key features are its shell and file system.<br />
<br />
== File System ==<br />
<br />
The file system describes the directory structure of an operating system. On Unix-based systems the top most directory is <code>/</code>, which is called the root directory. As the name may suggest the file system is organized hierarchicly (like a tree) from there on out. Most of the time you will be working in directories starting with <code>/home/<username></code>, which represents the user's home directory. All directories starting with <code>/home/<username></code> can freely be modified to the will of the user.<br />
<br />
== Environment Variable ==<br />
<br />
An environment variable is a dynamic object on a computer, which stores a value. Under Unix-based operating systems you can:<br />
<br />
* set the value of a variable with: <code>export <variable-name>=<value></code><br />
* read the value of a variable with: <code>echo $<variable-name></code><br />
<br />
Environment variables can be referenced by software (or the user) to get or set information about the system. Down below are a few examples of environment variables, which might give you an idea for their use and usefulness.<br />
<br />
{| class="wikitable"<br />
|+ Common Environment Variable on Unix Systems<br />
|-<br />
! Environment Variable<br />
! Content<br />
|-<br />
| <code>$USER</code><br />
| your current username<br />
|-<br />
| <code>$PWD</code><br />
| the directory you are currently in<br />
|-<br />
| <code>$HOSTNAME</code><br />
| hostname of the computer you are on<br />
|-<br />
| <code>$HOME</code><br />
| your home directory<br />
|-<br />
| <code>$PATH</code><br />
| list of directories searched for when a command is executed<br />
|}<br />
<br />
== Cluster ==<br />
<br />
A cluster referes to a collection of multiple nodes, which are connected via a network offering high bandwidth with low lateny communication. Accessing a cluster is possible by connecting to its specific login nodes.<br />
<br />
== Node ==<br />
<br />
A node is an individual computer consisting of one or more sockets.<br />
<br />
=== Backend Node ===<br />
<br />
Backend nodes are reserved for executing memory demanding and long running applications. They are the most powerful, but also most power consuming part of a cluster as they make up around 98% of it. Since these nodes are not directly accessable by the user, a scheduler manages their access. In order to run on these nodes, a batch job needs to be submitted to the batch system via a scheduler specific command.<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login node]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== Socket ==<br />
<br />
A socket is the physical package in which multiple cores are enclosed sharing the same memory.<br />
<br />
== Core ==<br />
<br />
A core has one or more hardware threads and is respnsible for executing instructions.<br />
<br />
== Central Processing Unit (CPU) ==<br />
<br />
The word "CPU" is widely used in the field of HPC though not precisely defined. It is mostly used to describe the concrete hardware architecture of a node, but should generally avoided due to possible misunderstandings.<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
The RAM is used as working memory for the cores. This is volatile memory meaning, that after a process ends the data in the RAM is no longer available. The RAM is shared between all sockets on a node, though it is physically seperated for each socket.<br />
<br />
== Cache ==<br />
<br />
A cache is a relatively small amount of fast memory (compared to RAM), on the CPU chip. A modern CPU has three cache levels: L1 and L2 are specific to each core, while L3 (or Last Level Cache (LLC)) is shared amoung all cores of a CPU.<br />
<br />
== Scalability ==<br />
<br />
Scalability represents a property of software, that describes how good an application can use an increased number of hardware resources. Good scalability would mean a decrease in runtime when more and more cores are used to solve the problem. Typically applications reach an upper bound regarding a number of cores "beyond they stop scaling".</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=932HPC-Dictionary2018-04-17T15:30:50Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* Environment Variable */</p>
<hr />
<div>== Unix ==<br />
<br />
Unix describes a family of operating systems. Popular representatives include Ubuntu, CentOS and even MacOS, although the latter is not common on HPC systems. Main key features are its shell and file system.<br />
<br />
== File System ==<br />
<br />
The file system describes the directory structure of an operating system. On Unix-based systems the top most directory is <code>/</code>, which is called the root directory. As the name may suggest the file system is organized hierarchicly (like a tree) from there on out. Most of the time you will be working in directories starting with <code>/home/<username></code>, which represents the user's home directory. All directories starting with <code>/home/<username></code> can freely be modified to the will of the user.<br />
<br />
== Environment Variable ==<br />
<br />
An environment variable is a dynamic object on a computer, which stores a value. You can:<br />
<br />
* set the value of a variable with: <code>export <variable-name>=<value></code><br />
* read the value of a variable with: <code>echo $<variable-name></code><br />
<br />
Environment variables can be referenced by software (or the user) to get or set information about the system. Down below are a few examples of environment variables, which might give you an idea for their use and usefulness.<br />
<br />
{| class="wikitable"<br />
|+ Common Environment Variable on Unix Systems<br />
|-<br />
! Environment Variable<br />
! Content<br />
|-<br />
| <code>$USER</code><br />
| your current username<br />
|-<br />
| <code>$PWD</code><br />
| the directory you are currently in<br />
|-<br />
| <code>$HOSTNAME</code><br />
| hostname of the computer you are on<br />
|-<br />
| <code>$HOME</code><br />
| your home directory<br />
|-<br />
| <code>$PATH</code><br />
| list of directories searched for when a command is executed<br />
|}<br />
<br />
== Cluster ==<br />
<br />
A cluster referes to a collection of multiple nodes, which are connected via a network offering high bandwidth with low lateny communication. Accessing a cluster is possible by connecting to its specific login nodes.<br />
<br />
== Node ==<br />
<br />
A node is an individual computer consisting of one or more sockets.<br />
<br />
=== Backend Node ===<br />
<br />
Backend nodes are reserved for executing memory demanding and long running applications. They are the most powerful, but also most power consuming part of a cluster as they make up around 98% of it. Since these nodes are not directly accessable by the user, a scheduler manages their access. In order to run on these nodes, a batch job needs to be submitted to the batch system via a scheduler specific command.<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login node]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== Socket ==<br />
<br />
A socket is the physical package in which multiple cores are enclosed sharing the same memory.<br />
<br />
== Core ==<br />
<br />
A core has one or more hardware threads and is respnsible for executing instructions.<br />
<br />
== Central Processing Unit (CPU) ==<br />
<br />
The word "CPU" is widely used in the field of HPC though not precisely defined. It is mostly used to describe the concrete hardware architecture of a node, but should generally avoided due to possible misunderstandings.<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
The RAM is used as working memory for the cores. This is volatile memory meaning, that after a process ends the data in the RAM is no longer available. The RAM is shared between all sockets on a node, though it is physically seperated for each socket.<br />
<br />
== Cache ==<br />
<br />
A cache is a relatively small amount of fast memory (compared to RAM), on the CPU chip. A modern CPU has three cache levels: L1 and L2 are specific to each core, while L3 (or Last Level Cache (LLC)) is shared amoung all cores of a CPU.<br />
<br />
== Scalability ==<br />
<br />
Scalability represents a property of software, that describes how good an application can use an increased number of hardware resources. Good scalability would mean a decrease in runtime when more and more cores are used to solve the problem. Typically applications reach an upper bound regarding a number of cores "beyond they stop scaling".</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=931HPC-Dictionary2018-04-17T15:30:21Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* File System */</p>
<hr />
<div>== Unix ==<br />
<br />
Unix describes a family of operating systems. Popular representatives include Ubuntu, CentOS and even MacOS, although the latter is not common on HPC systems. Main key features are its shell and file system.<br />
<br />
== File System ==<br />
<br />
The file system describes the directory structure of an operating system. On Unix-based systems the top most directory is <code>/</code>, which is called the root directory. As the name may suggest the file system is organized hierarchicly (like a tree) from there on out. Most of the time you will be working in directories starting with <code>/home/<username></code>, which represents the user's home directory. All directories starting with <code>/home/<username></code> can freely be modified to the will of the user.<br />
<br />
== Environment Variable ==<br />
<br />
An environment variable is a dynamic object on a compute which stores a value. You can:<br />
<br />
* set the value of a variable with: <code>export <variable-name>=<value></code><br />
* read the value of a variable with: <code>echo $<variable-name></code><br />
<br />
Environment variables can be referenced by software (or the user) to get or set information about the system. Down below are a few examples of environment variables, which might give you an idea for their use and usefulness.<br />
<br />
{| class="wikitable"<br />
|+ Common Environment Variable on Unix Systems<br />
|-<br />
! Environment Variable<br />
! Content<br />
|-<br />
| <code>$USER</code><br />
| your current username<br />
|-<br />
| <code>$PWD</code><br />
| the directory you are currently in<br />
|-<br />
| <code>$HOSTNAME</code><br />
| hostname of the computer you are on<br />
|-<br />
| <code>$HOME</code><br />
| your home directory<br />
|-<br />
| <code>$PATH</code><br />
| list of directories searched for when a command is executed<br />
|}<br />
<br />
== Cluster ==<br />
<br />
A cluster referes to a collection of multiple nodes, which are connected via a network offering high bandwidth with low lateny communication. Accessing a cluster is possible by connecting to its specific login nodes.<br />
<br />
== Node ==<br />
<br />
A node is an individual computer consisting of one or more sockets.<br />
<br />
=== Backend Node ===<br />
<br />
Backend nodes are reserved for executing memory demanding and long running applications. They are the most powerful, but also most power consuming part of a cluster as they make up around 98% of it. Since these nodes are not directly accessable by the user, a scheduler manages their access. In order to run on these nodes, a batch job needs to be submitted to the batch system via a scheduler specific command.<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login node]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== Socket ==<br />
<br />
A socket is the physical package in which multiple cores are enclosed sharing the same memory.<br />
<br />
== Core ==<br />
<br />
A core has one or more hardware threads and is respnsible for executing instructions.<br />
<br />
== Central Processing Unit (CPU) ==<br />
<br />
The word "CPU" is widely used in the field of HPC though not precisely defined. It is mostly used to describe the concrete hardware architecture of a node, but should generally avoided due to possible misunderstandings.<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
The RAM is used as working memory for the cores. This is volatile memory meaning, that after a process ends the data in the RAM is no longer available. The RAM is shared between all sockets on a node, though it is physically seperated for each socket.<br />
<br />
== Cache ==<br />
<br />
A cache is a relatively small amount of fast memory (compared to RAM), on the CPU chip. A modern CPU has three cache levels: L1 and L2 are specific to each core, while L3 (or Last Level Cache (LLC)) is shared amoung all cores of a CPU.<br />
<br />
== Scalability ==<br />
<br />
Scalability represents a property of software, that describes how good an application can use an increased number of hardware resources. Good scalability would mean a decrease in runtime when more and more cores are used to solve the problem. Typically applications reach an upper bound regarding a number of cores "beyond they stop scaling".</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=How-to-Google&diff=930How-to-Google2018-04-17T15:12:24Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== General ==<br />
<br />
* Keep the Google search as simple as possible only being specific where necessary<br />
* Gradually add search terms, if you do not get good results<br />
* Try to search with professional terms, instead of natural speaking language<br />
* Only use important words, instead of full sentences<br />
* Use descriptive words and simply rephrase searches, if no good results show up<br />
* Use quotes <code>"<search>"</code> to tell Google to use exact matching for the text within quotes, asterisks <code>*</code> can be used as wildcards to match any text<br />
* Use hyphens <code><ambiguous search> -<one matching field></code> to explicitly tell Google to exclude words from the search<br />
<br />
== Dealing with Error Messages ==<br />
<br />
* Just copy the error message from the command line to the Google search bar<br />
* Remove system specific details like the system name or paths<br />
* Though you should keep the file name, if it corresponds to a provided or system source code file</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=929HPC-Dictionary2018-04-17T14:25:04Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== Unix ==<br />
<br />
Unix describes a family of operating systems. Popular representatives include Ubuntu, CentOS and even MacOS, although the latter is not common on HPC systems. Main key features are its shell and file system.<br />
<br />
== File System ==<br />
<br />
The file system describes the directory structure of an operating system. On Unix-based systems the top most directory is <code>/</code>, which is called the root directory. As the name may suggest the file system is organized hierarchicly (like a tree) from there on out. Most of the time you will be working in directories like <code>/home/<username></code>, which represents the user's home directory. All directories starting with <code>/home/<username></code> can freely be modified to the will of the user.<br />
<br />
== Environment Variable ==<br />
<br />
An environment variable is a dynamic object on a compute which stores a value. You can:<br />
<br />
* set the value of a variable with: <code>export <variable-name>=<value></code><br />
* read the value of a variable with: <code>echo $<variable-name></code><br />
<br />
Environment variables can be referenced by software (or the user) to get or set information about the system. Down below are a few examples of environment variables, which might give you an idea for their use and usefulness.<br />
<br />
{| class="wikitable"<br />
|+ Common Environment Variable on Unix Systems<br />
|-<br />
! Environment Variable<br />
! Content<br />
|-<br />
| <code>$USER</code><br />
| your current username<br />
|-<br />
| <code>$PWD</code><br />
| the directory you are currently in<br />
|-<br />
| <code>$HOSTNAME</code><br />
| hostname of the computer you are on<br />
|-<br />
| <code>$HOME</code><br />
| your home directory<br />
|-<br />
| <code>$PATH</code><br />
| list of directories searched for when a command is executed<br />
|}<br />
<br />
== Cluster ==<br />
<br />
A cluster referes to a collection of multiple nodes, which are connected via a network offering high bandwidth with low lateny communication. Accessing a cluster is possible by connecting to its specific login nodes.<br />
<br />
== Node ==<br />
<br />
A node is an individual computer consisting of one or more sockets.<br />
<br />
=== Backend Node ===<br />
<br />
Backend nodes are reserved for executing memory demanding and long running applications. They are the most powerful, but also most power consuming part of a cluster as they make up around 98% of it. Since these nodes are not directly accessable by the user, a scheduler manages their access. In order to run on these nodes, a batch job needs to be submitted to the batch system via a scheduler specific command.<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login node]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== Socket ==<br />
<br />
A socket is the physical package in which multiple cores are enclosed sharing the same memory.<br />
<br />
== Core ==<br />
<br />
A core has one or more hardware threads and is respnsible for executing instructions.<br />
<br />
== Central Processing Unit (CPU) ==<br />
<br />
The word "CPU" is widely used in the field of HPC though not precisely defined. It is mostly used to describe the concrete hardware architecture of a node, but should generally avoided due to possible misunderstandings.<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
The RAM is used as working memory for the cores. This is volatile memory meaning, that after a process ends the data in the RAM is no longer available. The RAM is shared between all sockets on a node, though it is physically seperated for each socket.<br />
<br />
== Cache ==<br />
<br />
A cache is a relatively small amount of fast memory (compared to RAM), on the CPU chip. A modern CPU has three cache levels: L1 and L2 are specific to each core, while L3 (or Last Level Cache (LLC)) is shared amoung all cores of a CPU.<br />
<br />
== Scalability ==<br />
<br />
Scalability represents a property of software, that describes how good an application can use an increased number of hardware resources. Good scalability would mean a decrease in runtime when more and more cores are used to solve the problem. Typically applications reach an upper bound regarding a number of cores "beyond they stop scaling".</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=928HPC-Dictionary2018-04-17T14:23:02Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== Unix ==<br />
<br />
Unix describes a family of operating systems. Popular representatives include Ubuntu, CentOS and even MacOS, although the latter is not common on HPC systems. Main key features are its shell and file system.<br />
<br />
== File System ==<br />
<br />
The file system describes the directory structure of an operating system. On Unix-based systems the top most directory is <code>/</code>, which is called the root directory. As the name may suggest the file system is organized hierarchicly (like a tree) from there on out. Most of the time you will be working in directories like <code>/home/<username></code>, which represents the user's home directory. All directories starting with <code>/home/<username></code> can freely be modified to the will of the user.<br />
<br />
== Environment Variable ==<br />
<br />
An environment variable is a dynamic object on a compute which stores a value. You can:<br />
<br />
* set the value of a variable with: <code>export <variable-name>=<value></code><br />
* read the value of a variable with: <code>echo $<variable-name></code><br />
<br />
Environment variables can be referenced by software (or the user) to get or set information about the system. Down below are a few examples of environment variables, which might give you an idea for their use and usefulness.<br />
<br />
{| class="wikitable"<br />
|+ Common Environment Variable on Unix Systems<br />
|-<br />
! Environment Variable<br />
! Content<br />
|-<br />
| <code>$USER</code><br />
| your current username<br />
|-<br />
| <code>$PWD</code><br />
| the directory you are currently in<br />
|-<br />
| <code>$HOSTNAME</code><br />
| hostname of the computer you are on<br />
|-<br />
| <code>$HOME</code><br />
| your home directory<br />
|-<br />
| <code>$PATH</code><br />
| list of directories searched for when a command is executed<br />
|}<br />
<br />
== Node ==<br />
<br />
A node is an individual computer consisting of one or more sockets.<br />
<br />
=== Backend Node ===<br />
<br />
Backend nodes are reserved for executing memory demanding and long running applications. They are the most powerful, but also most power consuming part of a cluster as they make up around 98% of it. Since these nodes are not directly accessable by the user, a scheduler manages their access. In order to run on these nodes, a batch job needs to be submitted to the batch system via a scheduler specific command.<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login node]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== Central Processing Unit (CPU) ==<br />
<br />
The word "CPU" is widely used in the field of HPC though not precisely defined. It is mostly used to describe the concrete hardware architecture of a node, but should generally avoided due to possible misunderstandings.<br />
<br />
== Core ==<br />
<br />
A core has one or more hardware threads and is respnsible for executing instructions.<br />
<br />
== Socket ==<br />
<br />
A socket is the physical package in which multiple cores are enclosed sharing the same memory.<br />
<br />
== Cluster ==<br />
<br />
A cluster referes to a collection of multiple nodes, which are connected via a network offering high bandwidth with low lateny communication. Accessing a cluster is possible by connecting to its specific login nodes.<br />
<br />
== Cache ==<br />
<br />
A cache is a relatively small amount of fast memory (compared to RAM), on the CPU chip. A modern CPU has three cache levels: L1 and L2 are specific to each core, while L3 (or Last Level Cache (LLC)) is shared amoung all cores of a CPU.<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
The RAM is used as working memory for the cores. This is volatile memory meaning, that after a process ends the data in the RAM is no longer available. The RAM is shared between all sockets on a node, though it is physically seperated for each socket.<br />
<br />
== Scalability ==<br />
<br />
Scalability represents a property of software, that describes how good an application can use an increased number of hardware resources. Good scalability would mean a decrease in runtime when more and more cores are used to solve the problem. Typically applications reach an upper bound regarding a number of cores "beyond they stop scaling".</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=927HPC-Dictionary2018-04-17T14:02:10Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== Unix ==<br />
<br />
Unix describes a family of operating systems. Popular representatives include Ubuntu, CentOS and even MacOS, although the latter is not common on HPC systems. Main key features are its shell and file system.<br />
<br />
== File System ==<br />
<br />
The file system describes the directory structure of an operating system. On Unix-based systems the top most directory is <code>/</code>, which is called the root directory. As the name may suggest the file system is organized hierarchicly (like a tree) from there on out. Most of the time you will be working in directories like <code>/home/<username></code>, which represents the user's home directory. All directories starting with <code>/home/<username></code> can freely be modified to the will of the user.<br />
<br />
== Environment Variables ==<br />
<br />
TODO<br />
<br />
== Node ==<br />
<br />
A node is an individual computer consisting of one or more sockets.<br />
<br />
=== Backend Node ===<br />
<br />
Backend nodes are reserved for executing memory demanding and long running applications. They are the most powerful, but also most power consuming part of a cluster as they make up around 98% of it. Since these nodes are not directly accessable by the user, a scheduler manages their access. In order to run on these nodes, a batch job needs to be submitted to the batch system via a scheduler specific command.<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login node]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== Central Processing Unit (CPU) ==<br />
<br />
The word "CPU" is widely used in the field of HPC though not precisely defined. It is mostly used to describe the concrete hardware architecture of a node, but should generally avoided due to possible misunderstandings.<br />
<br />
== Core ==<br />
<br />
A core has one or more hardware threads and is respnsible for executing instructions.<br />
<br />
== Socket ==<br />
<br />
A socket is the physical package in which multiple cores are enclosed sharing the same memory.<br />
<br />
== Cluster ==<br />
<br />
A cluster referes to a collection of multiple nodes, which are connected via a network offering high bandwidth with low lateny communication. Accessing a cluster is possible by connecting to its specific login nodes.<br />
<br />
== Cache ==<br />
<br />
A cache is a relatively small amount of fast memory (compared to RAM), on the CPU chip. A modern CPU has three cache levels: L1 and L2 are specific to each core, while L3 (or Last Level Cache (LLC)) is shared amoung all cores of a CPU.<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
The RAM is used as working memory for the cores. This is volatile memory meaning, that after a process ends the data in the RAM is no longer available. The RAM is shared between all sockets on a node, though it is physically seperated for each socket.<br />
<br />
== Scalability ==<br />
<br />
Scalability represents a property of software, that describes how good an application can use an increased number of hardware resources. Good scalability would mean a decrease in runtime when more and more cores are used to solve the problem. Typically applications reach an upper bound regarding a number of cores "beyond they stop scaling".</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=926HPC-Dictionary2018-04-17T13:37:26Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== Unix ==<br />
<br />
TODO<br />
<br />
== Unix File System ==<br />
<br />
TODO<br />
<br />
== Environment Variables ==<br />
<br />
TODO<br />
<br />
== Node ==<br />
<br />
A node is an individual computer consisting of one or more sockets.<br />
<br />
=== Backend Node ===<br />
<br />
Backend nodes are reserved for executing memory demanding and long running applications. They are the most powerful, but also most power consuming part of a cluster as they make up around 98% of it. Since these nodes are not directly accessable by the user, a scheduler manages their access. In order to run on these nodes, a batch job needs to be submitted to the batch system via a scheduler specific command.<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login node]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== Central Processing Unit (CPU) ==<br />
<br />
The word "CPU" is widely used in the field of HPC though not precisely defined. It is mostly used to describe the concrete hardware architecture of a node, but should generally avoided due to possible misunderstandings.<br />
<br />
== Core ==<br />
<br />
A core has one or more hardware threads and is respnsible for executing instructions.<br />
<br />
== Socket ==<br />
<br />
A socket is the physical package in which multiple cores are enclosed sharing the same memory.<br />
<br />
== Cluster ==<br />
<br />
A cluster referes to a collection of multiple nodes, which are connected via a network offering high bandwidth with low lateny communication. Accessing a cluster is possible by connecting to its specific login nodes.<br />
<br />
== Cache ==<br />
<br />
A cache is a relatively small amount of fast memory (compared to RAM), on the CPU chip. A modern CPU has three cache levels: L1 and L2 are specific to each core, while L3 (or Last Level Cache (LLC)) is shared amoung all cores of a CPU.<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
The RAM is used as working memory for the cores. This is volatile memory meaning, that after a process ends the data in the RAM is no longer available. The RAM is shared between all sockets on a node, though it is physically seperated for each socket.<br />
<br />
== Scalability ==<br />
<br />
Scalability represents a property of software, that describes how good an application can use an increased number of hardware resources. Good scalability would mean a decrease in runtime when more and more cores are used to solve the problem. Typically applications reach an upper bound regarding a number of cores "beyond they stop scaling".</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=925HPC-Dictionary2018-04-17T12:54:20Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== Unix ==<br />
<br />
TODO<br />
<br />
== Unix File System ==<br />
<br />
TODO<br />
<br />
== Environment Variables ==<br />
<br />
TODO<br />
<br />
== Node ==<br />
<br />
A node is an individual computer consisting of one or more sockets.<br />
<br />
=== Backend Node ===<br />
<br />
Backend nodes are reserved for executing memory demanding and long running applications. They are the most powerful, but also most power consuming part of a cluster as they make up around 98% of it. Since these nodes are not directly accessable by the user, a scheduler manages their access. In order to run on these nodes, a batch job needs to be submitted to the batch system via a scheduler specific command.<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login node]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== Central Processing Unit (CPU) ==<br />
<br />
TODO<br />
<br />
== Socket ==<br />
<br />
A socket consists of one or more cores sharing the same memory.<br />
<br />
== Cluster ==<br />
<br />
A cluster referes to a collection of multiple nodes, which are connected via a network offering high bandwidth with low lateny communication. Accessing a cluster is possible by connecting to its specific login nodes.<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
TODO<br />
<br />
== Scalability ==<br />
<br />
TODO</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=924HPC-Dictionary2018-04-17T12:29:35Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== Unix ==<br />
<br />
TODO<br />
<br />
== Unix File System ==<br />
<br />
TODO<br />
<br />
== Environment Variables ==<br />
<br />
TODO<br />
<br />
== Node ==<br />
<br />
TODO<br />
<br />
=== Backend Node ===<br />
<br />
Backend nodes are reserved for executing memory demanding and long running applications. They are the most powerful, but also most power consuming part of a cluster as they make up around 98% of it. Since these nodes are not directly accessable by the user, a scheduler manages their access. In order to run on these nodes, a batch job needs to be submitted to the batch system via a scheduler specific command.<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login node]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== CPU ==<br />
<br />
TODO<br />
<br />
== Socket ==<br />
<br />
TODO<br />
<br />
== Cluster ==<br />
<br />
TODO<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
TODO<br />
<br />
== Scalability ==<br />
<br />
TODO</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=923HPC-Dictionary2018-04-17T11:59:26Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== Unix ==<br />
<br />
TODO<br />
<br />
== Unix File System ==<br />
<br />
TODO<br />
<br />
== Environment Variables ==<br />
<br />
TODO<br />
<br />
== Node ==<br />
<br />
TODO<br />
<br />
=== Backend Node ===<br />
<br />
TODO<br />
<br />
=== Copy Node ===<br />
<br />
Copy nodes are reserved for transfering data to or from a cluster. They usually offer a better connection than other nodes and minimize the disturbance of other users on the system. Depending on the facility, software installed on these nodes may differ from other ones due to their restricted use case, though not every facility chooses to install a designated copy node at all. As an alternative [[#Login Node|login node]] may be used to move data between systems.<br />
<br />
=== Frontend Node ===<br />
<br />
Synonym for [[#Login Node|login node]].<br />
<br />
=== Login Node ===<br />
<br />
Login nodes are reserved for connecting to the cluster of a facility. Most of the time they can also be used for testing and performing interactive tasks (e.g. the analysis of previously collected application profiles). These test runs should generally not exceed execution times of just a few minutes and may only be used to verify that your software is running correctly on the system and its environment before submitting batch jobs to the batch system.<br />
<br />
== CPU ==<br />
<br />
TODO<br />
<br />
== Socket ==<br />
<br />
TODO<br />
<br />
== Cluster ==<br />
<br />
TODO<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
TODO<br />
<br />
== Scalability ==<br />
<br />
TODO</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=How-to-Google&diff=921How-to-Google2018-04-17T11:29:02Z<p>Christian-wassermann-e30b@rwth-aachen.de: Created page with "== General == TODO == Dealing with Error Messages == * Just copy the error message from the command line to the Google search bar * Remove system specific details like the..."</p>
<hr />
<div>== General ==<br />
<br />
TODO<br />
<br />
== Dealing with Error Messages ==<br />
<br />
* Just copy the error message from the command line to the Google search bar<br />
* Remove system specific details like the system name or paths<br />
* Though you should keep the file name, if it corresponds to a provided or system source code file</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=LSF&diff=920LSF2018-04-17T11:10:33Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* Job Submission */</p>
<hr />
<div>== General ==<br />
<br />
LSF (Platform Load Sharing Facility) is a job [[scheduler]]. It is responsible for monitoring and controlling the workload of the batch system of a supercomputer and assigns resources to jobs. This system is aimed at bigger applications that need a lot of resources in terms of memory and execution time and cannot be directly accessed by the user, as opposed to [[Nodes#Login|login nodes]]. Applications to execute have to be specified in a [[jobscript]] that is submitted to the batch system by the user via a scheduler (like LSF).<br />
<br />
<br />
== #BSUB Usage ==<br />
<br />
If you are writing a [[jobscript]] for an LSF batch system, the magic cookie is "#BSUB". To use it, start a new line in your script with "#BSUB". Following that, you can put one of the parameters shown below, where the word written in <...> should be replaced with a value.<br />
<br />
Basic settings:<br />
{| class="wikitable" style="width: 40%;"<br />
| Parameter || Function<br />
|-<br />
| -J <name> || job name<br />
|-<br />
| -o <path> || path to the file where the job output is written<br />
|-<br />
| -e <path> || path to the file for the job error output (if not set, it will be written to output file as well)<br />
|}<br />
<br />
Requesting resources:<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function || Default<br />
|-<br />
| -W <runlimit> || runtime limit in the format [hour:]minute; once the time specified is up, the job will be killed by the [[scheduler]] || 00:15<br />
|-<br />
| -M <memlimit> || memory limit per process in MB || 512<br />
|-<br />
| -S <stacklimit> || limit of stack size per process in MB || 10<br />
|}<br />
<br />
Parallel programming (read more [[Parallel_Programming|here]]):<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function<br />
|-<br />
| -a openmp || start a parallel job for a shared-memory system<br />
|-<br />
| -n <num_threads> || number of threads to execute OpenMP application with<br />
|-<br />
| -a openmpi || start a parallel job for a distributed-memory system<br />
|-<br />
| -n <num_procs> || number of processes to execute MPI application with<br />
|}<br />
<br />
Email notifications:<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function<br />
|-<br />
| -B || send email to the job submitter when the job starts running<br />
|-<br />
| -N || send email to the job submitter when the job has finished<br />
|-<br />
| -u <email_address> || recipient of emails<br />
|}<br />
<br />
== Job Submission ==<br />
<br />
This command submits the job you defined in your [[Jobscript|jobscript]] to the batch system. If the less-than sign <code><</code> is left out, your job will be submitted, but all the resource requests in your jobscript will be ignored.<br />
<br />
$ bsub < jobscript.sh<br />
<br />
Just like any other incoming job, your job will first be queued. Then, the scheduler decides when your job will be run. The more resources your job requires, the longer it may be waiting to execute.<br />
<br />
You can check the current status of your submitted jobs and their job ids with the following shell command. A job can either be pending <code>PEND</code> (waiting for free nodes to run on) or running <code>RUN</code> (the jobscript is currently being executed). If all of your jobs have finished execution, the command will print <code>No unfinished jobs found</code>.<br />
<br />
$ bjobs<br />
<br />
If you are interested in the current status of your job, you can try the utility <code>bpeek</code>. It prints the output which has already been written by your job:<br />
<br />
$ bpeek <job_id><br />
<br />
In case you submitted a job on accident or realised that your job might not be running correctly, you can always remove it from the queue or terminate it when running by typing:<br />
<br />
$ bkill <job_id><br />
<br />
== Jobscript Examples ==<br />
<br />
This serial job will run a given executable, in this case "myapp.exe".<br />
<syntaxhighlight lang="zsh"><br />
#!/usr/bin/env zsh<br />
<br />
### Job name<br />
#BSUB -J MYJOB<br />
<br />
### File where the output should be written<br />
#BSUB -o MYJOB_OUTPUT.txt<br />
<br />
### Time your job needs to execute, e. g. 1 h 20 min<br />
#BSUB -W 1:20<br />
<br />
### Memory your job needs, e. g. 1000 MB <br />
#BSUB -M 1000<br />
<br />
### Stack limit per process, e. g. 20 MB<br />
#BSUB -S 20<br />
<br />
### The last part consists of regular shell commands:<br />
### Change to working directory<br />
cd /home/user/mywork<br />
<br />
### Execute your application<br />
myapp.exe<br />
</syntaxhighlight><br />
<br />
This OpenMP job will start the [[Parallel_Programming|parallel program]] "myapp.exe" with 24 threads.<br />
<syntaxhighlight lang="zsh"><br />
#!/usr/bin/env zsh<br />
<br />
### Job name<br />
#BSUB -J OMPJOB<br />
<br />
### File where the output should be written<br />
#BSUB -o OMPJOB_OUTPUT<br />
<br />
### Time your job needs to execute, e. g. 15 min<br />
#BSUB -W 0:15<br />
<br />
### Memory your job needs, e. g. 1000 MB <br />
#BSUB -M 1000<br />
<br />
### Stack limit per process, e. g. 50 MB<br />
#BSUB -S 50<br />
<br />
### Request 24 compute slots (in this case: threads)<br />
#BSUB -n 24<br />
<br />
### Execute as shared-memory job<br />
#BSUB -a openmp<br />
<br />
### Change to working directory<br />
cd /home/user/mywork<br />
<br />
### Execute your application<br />
myapp.exe<br />
</syntaxhighlight><br />
<br />
This OpenMPI job will start the [[Parallel_Programming|parallel program]] "myapp.exe" with 4 processes.<br />
<syntaxhighlight lang="zsh"><br />
#!/usr/bin/env zsh<br />
<br />
### Job name<br />
#BSUB -J MPIJOB<br />
<br />
### File where the output should be written<br />
#BSUB -o MPIJOB_OUTPUT<br />
<br />
### Time your job needs to execute, e. g. 30 min<br />
#BSUB -W 0:30<br />
<br />
### Memory your job needs, e. g. 1024 MB <br />
#BSUB -M 1024<br />
<br />
### Stack limit per process, e. g. 50 MB<br />
#BSUB -S 50<br />
<br />
### Request 4 compute slots (in this case: processes)<br />
#BSUB -n 4<br />
<br />
### Execute as distributed-memory job with OpenMPI<br />
#BSUB -a openmpi<br />
<br />
### Change to working directory<br />
cd /home/user/mywork<br />
<br />
### Execute your application<br />
myapp.exe<br />
</syntaxhighlight><br />
<br />
== References ==<br />
<br />
[https://doc.itc.rwth-aachen.de/display/CC/Example+scripts More LSF jobscript examples]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Ssh_keys&diff=919Ssh keys2018-04-17T06:30:16Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* How-to-use-it */</p>
<hr />
<div>An ssh key is a way of identifying (authenticating) yourself when connecting to a server via [[ssh]]. A different popular authentication method is entering a password.<br />
<br />
== Why should I use it?==<br />
When you connect to a server, authenticating via a password there are two main problems:<br />
* Someone could bruteforce or guess your password, since many passwords are commonly weak or used for multiple applications.<br />
* Someone could intercept your password, since it has to be send to the server at some point in some form.<br />
<br />
== How-to-use-it ==<br />
<br />
=== Generate a key ===<br />
You should start by generating a key pair:<br />
$ ssh-keygen -b 4096<br />
where you can specify the max length of the key up to 16384 bits.<br />
<br />
You can then optionally protect your key with a passphrase. (Your key is basically just a file sitting on your computer and a passphrase protects your key, if someone happens to steal/copy that file).<br />
<br />
If you did not specify a different file, the key normaly gets generated into the folder <br />
~/.ssh<br />
with the files '''id_rsa''' being your private and '''id_rsa.pub''' being your public key.<br />
<br />
=== Copy the public key to the server ===<br />
<br />
==== Method A ====<br />
<br />
This public key now has to be copied to the server into the <code>~/.ssh/authorized_keys</code> file. This can be done, by opening an [[ssh]] connection via password and then using an editor (e.g. [[vim]]) to paste the key into the file (creating the '''.ssh''' directory beforehand if it does not exist):<br />
$ mkdir -p ~/.ssh<br />
$ vim ~/.ssh/authorized_keys<br />
<br />
==== Method B ====<br />
<br />
Instead of performing the copying of the ssh key to the server manually, you can use the program <code>ssh-copy-id</code> to achieve the same goal:<br />
$ ssh-copy-id <username>@<remote-host><br />
where <code><username></code> is your username on the remote host <remote-host>. You will be prompted for your password and the program will manage the rest.<br />
<br />
Regardless of the method used, the next time you [[ssh]] to the server, it should use the key and instead of prompting for the user's pass'''word''', prompt for the pass'''phrase''' of the key (if you chose to employ one).<br />
<br />
=== Troubleshooting ===<br />
<br />
If it still asks for your password, something went wrong. In that case you should check, whether the '''authorized_keys''' file really contains the key by executing:<br />
$ cat ~/.ssh/authorized_keys<br />
on the '''server''' and <br />
$ cat ~/.ssh/id_rsa.pub<br />
on '''your local machine'''. If the key of your local machine is not contained in the '''authorized_keys''' on the server, repeat the steps of copying the key to the server.<br />
<br />
You should also make sure, that correct file access permission are set. If unsure, execute:<br />
$ chmod 700 ~/.ssh<br />
$ chmod 600 ~/.ssh/id_rsa<br />
$ chmod 640 ~/.ssh/id_rsa.pub<br />
$ chmod 640 ~/.ssh/authorized_keys<br />
$ chmod 640 ~/.ssh/known_hosts<br />
$ chmod 640 ~/.ssh/config<br />
<br />
== How-it-works ==<br />
<br />
The basic principle is that of asymmetrical cryptography being employed by using a public-private-key-pair. A public key is like an indestructible piggy bank: Everybody can put something (data) into it, but nobody can get it back out again. A private key is the key for this. In this way you can distribute all the piggy banks you like and if someone put something in there and sends it back, only you can open it with your private key.<br />
<br />
Now since you gave the server the public key (=piggy bank), it can encrypt something (say a random number), send it back and only you can decrypt this number, since only you have the private key.<br />
<br />
For more detailed information on how this works, head over to the References.<br />
<br />
== References ==<br />
<br />
[https://wiki.archlinux.de/title/SSH-Authentifizierung_mit_Schl%C3%BCsselpaaren SSH keys on the archlinux wiki]<br />
<br />
[https://medium.com/@vrypan/explaining-public-key-cryptography-to-non-geeks-f0994b3c2d5 Public and private keys easily explained]<br />
<br />
[https://www.digitalocean.com/community/tutorials/understanding-the-ssh-encryption-and-connection-process More detailed explanation of the connection and encryption process of ssh]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Binding/Pinning&diff=918Binding/Pinning2018-04-16T15:06:54Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* Basics */</p>
<hr />
<div>== Basics ==<br />
<br />
Pinning threads for shared-memory [[Parallel_Programming|parallelism]] or binding processes for distributed-memory parallelism is an advanced way to control how your system distributes the threads or processes across the available cores. It is important for improving the performance of your application by avoiding costly [[#RemoteMemoryAccess|remote memory accesses]] and keeping the threads or processes close to each other. Threads are "pinned" by setting certain [[OpenMP]]-related environment variables, which you can do with this command:<br />
$ export <env_variable_name>=<value><br />
The terms "thread pinning" and "thread affinity" as well as "process binding" and "process affinity" are used interchangeably.<br />
You can bind processes by specifying additional options when [[How_to_Use_MPI#How_to_Run_an_MPI_Executable|executing]] your [[MPI]] application.<br />
<br />
== How to Pin Threads in OpenMP ==<br />
<br />
[[File:Omp_places.png|thumb|350px|Schematic of how <code>OMP_PLACES={0}:8:2</code> would be interpreted]]<br />
<br />
[[File:Proc_bind_close.PNG|thumb|350px|Schematic of how <code>OMP_PROC_BIND=close</code> would be interpreted on a system comprising 2 nodes with 4 hardware threads each]]<br />
<br />
[[File:Proc_bind_spread.PNG|thumb|350px|<span id="RemoteMemoryAccess"></span>Schematic of <code>OMP_PROC_BIND=spread</code> and an remote memory access of thread 1 accessing the other socket's memory (e.g. thread 0 and thread 1 work on the same data)]]<br />
<br />
<code>OMP_PLACES</code> is employed to specify places on the machine where the threads are put. However, this variable on its own does not determine thread pinning completely, because your system still won't know in what pattern to assign the threads to the given places. Therefore, you also need to set <code>OMP_PROC_BIND</code>.<br />
<br />
<code>OMP_PROC_BIND</code> specifies a binding policy which basically sets criteria by which the threads are distributed.<br />
<br />
If you want to get a schematic overview of your cluster's hardware, e. g. to figure out how many hardware threads there are, type: <code>$ lstopo</code>.<br />
<br />
=== <code>OMP_PLACES</code> ===<br />
<br />
This variable can hold two kinds of values: a name specifying (hardware) places, or a list that marks places.<br />
<br />
{| class="wikitable" style="width:50%;"<br />
| Abstract name || Meaning<br />
|-<br />
| <code>threads</code> || a place is a single hardware thread, i. e. the hyperthreading will be ignored<br />
|-<br />
| <code>cores</code> || a place is a single core with its corresponding amount of hardware threads<br />
|-<br />
| <code>sockets</code> || a place is a single socket<br />
|}<br />
<br />
In order to define specific places by an interval, <code>OMP_PLACES</code> can be set to <code><lowerbound>:<length>:<stride></code>.<br />
All of these three values are non-negative integers and must not exceed your system's bounds. The value of <code><lowerbound></code> can be defined as a list of hardware threads. As an interval, <code><lowerbound></code> has this format: <code>{<starting_point>:<length>}</code> that can be a single place, or a place that holds several hardware threads, which is indicated by <code><length></code>.<br />
<br />
{| class="wikitable" style="width:60%;"<br />
| Example hardware || <code>OMP_PLACES</code> || Places<br />
|-<br />
| 24 cores with one hardware thread each, starting at core 0 and using every 2nd core || <code>{0}:24:2</code> or <code>{0:1}:24:2</code> || <code>{0}, {2}, {4}, {6}, {8}, {10}, {12}, {14}, {16}, {18}, {20}, {22}</code><br />
|-<br />
| 12 cores with two hardware threads each, starting at the first two hardware threads on the first core ({0,1}) and using every 4th core || <code>{0,1}:12:4</code> or <code>{0:2}:12:4</code> || <code>{0,1}, {4,5}, {8,9}, {12,13}, {16,17}, {20,21}</code><br />
|}<br />
<br />
You can also determine these places with a comma-separated list. Say there are 8 cores available with one hardware thread each, and you would like to execute your application on the first four cores, you could define this: <code>$ export OMP_PLACES="{0,1,2,3}"</code><br />
<br />
=== <code>OMP_PROC_BIND</code> ===<br />
<br />
Now that you have set <code>OMP_PROC_BIND</code>, you can now define the order in which the places should be assigned. This is especially useful for NUMA systems (see [[#References|references]] below) because some threads may have to access remote memory, which will slow your application down significantly. If <code>OMP_PROC_BIND</code> is not set, your system will distribute the threads across the nodes and cores randomly.<br />
<br />
{| class="wikitable" style="width:60%;"<br />
| Value || Function<br />
|-<br />
| <code>true</code> || the threads should not be moved<br />
|-<br />
| <code>false</code> || the threads can be moved<br />
|-<br />
| <code>master</code> || worker threads are in the same partition as the master<br />
|-<br />
| <code>close</code> || worker threads are close to the master in contiguous partitions, e. g. if the master is occupying hardware thread 0, worker 1 will be placed on hw thread 1, worker 2 on hw thread 2 and so on<br />
|-<br />
| <code>spread</code> || workers are spread across the available places to maximize the space inbetween two neighbouring threads<br />
|}<br />
<br />
== Options for Binding in Open MPI ==<br />
<br />
Binding processes to certain processors can be done by specifying the options below when executing a program. This is a more advanced way of running an application and also requires knowledge about your system's architecture, e. g. how many cores there are (for an overview of your hardware topology, use <code>$ lstopo</code>). If none of these options are given, default values are set.<br />
By overriding default values with the ones specified, you may be able to improve the performance of your application, if your system distributes them in a suboptimal way per default.<br />
<br />
{| class="wikitable" style="width: 100%;"<br />
| Option || Function || Explanation<br />
|-<br />
| --bind-to <arg> || bind to the processors associated with hardware component; <code><arg></code> can be one of: none, hwthread, core, l1cache, l2cache, l3cache, socket, numa, board; default value: <code>core</code> || e. g.: in case of <code>l3cache</code> the processes will be bound to those processors that share the same L3 cache<br />
|-<br />
| --map-by <arg> || map to the specified hardware component. <code><arg></code> can be one of: slot, hwthread, core, L1cache, L2cache, L3cache, socket, numa, board, node, sequential, distance, and ppr; default value: <code>socket</code> || if <code>--map-by socket</code> with <code>--bind-to core</code> is used and the program is launched with 4 processes on a two socket machine, process 0 is bound to the first core on socket 0, process 1 is bound to the first core on socket 1, process 2 is bound to the second core on socket 0 and process 3 is bound to the second core on socket 1.<br />
|-<br />
| --report-bindings || print any bindings for launched processes to the console || sample output matching the example for <code>--map-by</code>:<br />
<syntaxhighlight lang="bash"><br />
[myhost] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]: [BB/../../../../..][../../../../../..]<br />
[myhost] MCW rank 1 bound to socket 1[core 6[hwt 0-1]]: [../../../../../..][BB/../../../../..]<br />
[myhost] MCW rank 2 bound to socket 0[core 1[hwt 0-1]]: [../BB/../../../..][../../../../../..]<br />
[myhost] MCW rank 3 bound to socket 1[core 7[hwt 0-1]]: [../../../../../..][../BB/../../../..]<br />
</syntaxhighlight><br />
|}<br />
<br />
== References ==<br />
<br />
[http://pages.tacc.utexas.edu/~eijkhout/pcse/html/omp-affinity.html Thread affinity in OpenMP]<br />
<br />
[https://docs.oracle.com/cd/E60778_01/html/E60751/goztg.html More information on <code>OMP_PLACES</code> and <code>OMP_PROC_BIND</code>]<br />
<br />
[https://doc.itc.rwth-aachen.de/download/attachments/35947076/03_OpenMPNumaSimd.pdf Introduction to OpenMP from PPCES (@RWTH Aachen) Part 3: NUMA & SIMD]<br />
<br />
[https://www.open-mpi.org/faq/?category=tuning#using-paffinity-v1.4 FAQ about process affinity in Open MPI]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=How_to_Use_OpenMP&diff=917How to Use OpenMP2018-04-16T15:02:02Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* How to Compile OpenMP Code */</p>
<hr />
<div>== Basics ==<br />
<br />
This page will give you a general overview of how to compile and execute a program that has been [[Parallel_Programming|parallelized]] with [[OpenMP]].<br />
As opposed to [[How_to_Use_MPI|MPI]], you do not have to load any modules to use OpenMP.<br />
<br />
== How to Compile OpenMP Code ==<br />
<br />
Additional compiler flags tell the compiler to enable OpenMP. Otherwise, the OpenMP pragmas in the code will be ignored by the compiler.<br />
<br />
Depending on which compiler you have loaded, use one of the flags below to compile your code.<br />
{| class="wikitable" style="width=60%;"<br />
| Compiler || Flag<br />
|-<br />
| GNU || <code>-fopenmp</code><br />
|-<br />
| Intel || <code>-qopenmp</code><br />
|-<br />
| Clang || <code>-fopenmp</code><br />
|-<br />
| Oracle || <code>-xopenmp</code><br />
|}<br />
<br />
For example: if you plan to use an Intel compiler for your OpenMP code written in C, you have to type this to create an application called <code>omp_code.exe</code>:<br />
$ icc -qopenmp omp_code.c -o omp_code.exe<br />
<br />
== How to Run an OpenMP Application ==<br />
<br />
=== Setting <code>OMP_NUM_THREADS</code> ===<br />
<br />
If you forget to set <code>OMP_NUM_THREADS</code> to any value, the default value of your cluster environment will be used. In most cases, the default is 1, so that your program is executed serially.<br />
<br />
One way to specify the number of threads is by passing an extra argument when running the executable file. In order to start the parallel regions of the example program above with 12 threads, you'd have to type:<br />
$ OMP_NUM_THREADS=12 ./omp_code.exe<br />
This automatically sets the environment variable <code>OMP_NUM_THREADS</code> to 12, but it is reset to its default value after the execution of <code>omp_code.exe</code> finished.<br />
<br />
Another way to set the number of threads is by changing your environment variable. This example will increment it up to 24 threads and override the default value:<br />
$ export OMP_NUM_THREADS=24<br />
If you simply run your application with <code>$ ./omp_code.exe</code> next, this value will be used automatically.<br />
<br />
=== [[Binding/Pinning#How_to_Pin_Threads_in_OpenMP|Thread Pinning]] ===<br />
<br />
The performance of your application may be improved depending on the distribution of threads. Go [[Binding/Pinning|here]] to learn more about thread pinning in order to minimize the execution time.</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=916HPC-Dictionary2018-04-16T14:56:20Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== Unix ==<br />
<br />
TODO<br />
<br />
== Unix File System ==<br />
<br />
TODO<br />
<br />
== Environment Variables ==<br />
<br />
TODO<br />
<br />
== Node ==<br />
<br />
TODO<br />
<br />
=== Login Node ===<br />
<br />
TODO<br />
<br />
=== Copy Node ===<br />
<br />
TODO<br />
<br />
=== Backend Node ===<br />
<br />
TODO<br />
<br />
== CPU ==<br />
<br />
TODO<br />
<br />
== Socket ==<br />
<br />
TODO<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
TODO<br />
<br />
== Scalability ==<br />
<br />
TODO</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=How_to_Use_MPI&diff=915How to Use MPI2018-04-16T14:53:53Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* Basics */</p>
<hr />
<div>== Basics ==<br />
<br />
This page will give you a general overview of how to compile and execute a program that has been [[Parallel_Programming|parallelized]] with [[MPI]]. Many of the options listed below are the same for both Open MPI and Intel MPI, however, be careful and look up if they indeed behave the same way.<br />
<br />
== How to Compile MPI Code ==<br />
<br />
Before continuing, please make sure that the openmpi or intelmpi module is loaded (go [[Modules|here]] to see how to load/switch modules).<br />
<br />
There are several so called MPI "compiler wrappers", e.g. <code>mpicc</code>. These take care of including the correct MPI libraries for the programming language you are using. But they share most command line options. Depending on whether your code is written in C, C++ or Fortran, follow the instructions in one of the tables below. Make sure to replace the arguments inside <code><…></code> with specific values.<br />
<br />
=== Open MPI ===<br />
<br />
Use the following command to specify the program you would like to compile (replace <code><src_file></code> with a path to your code, e. g. <code>./myprog.c</code>). <br />
{| class="wikitable" style="width: 40%;"<br />
| Language || Command<br />
|-<br />
| C || <code>$ mpicc <src_file> -o <name_of_executable></code><br />
|-<br />
| C++ || <code>$ mpicxx <src_file> -o <name_of_executable></code><br />
|-<br />
| Fortran || <code>$ mpifort <src_file> -o <name_of_executable></code><br />
|}<br />
<br />
You can also type the command <code>$ mpicc [options]</code>, <code>$ mpicxx [options]</code> or <code>$ mpifort [options]</code>. There are a few options that come with Open MPI, however, options are more important for running your program. The compiler options might be useful to fetch more information about the Open MPI module you are using. Compile options unknown to the MPI compiler wrapper are simply forwarded to the underlying [[Compiler|compiler]] e.g. <code>icc</code>.<br />
{| class="wikitable" style="width: 40%;"<br />
|Options || Function<br />
|-<br />
| -showme:help || print a short help message about the usage and lists all compiler options<br />
|- <br />
| -showme:version || show Open MPI version<br />
|}<br />
<br />
Instead of typing the compiler wrapper <code>mpicc</code>, <code>mpicxx</code> or <code>mpifort</code> explicitly, on most systems (e.g. the RWTH Compute Cluster) there are environment variables defined, which you can use to call the MPI compiler in a more general manner. Simple use <code>$MPICC</code>, <code>$MPICXX</code> or <code>$MPIFC</code> for the compiler you want to use and let the module system handle the dirty details of using the appropriate command.<br />
<br />
=== Intel MPI ===<br />
<br />
Use the following command to specify the program you would like to compile (replace <code><src_file></code> with a path to your code, e. g. <code>./myprog.c</code>). <br />
{| class="wikitable" style="width: 70%;"<br />
| Compiler Driver || C || C++ || Fortran<br />
|-<br />
| GCC || <code>$ mpicc <src_file> -o <name></code> || <code>$ mpicpc <src_file> -o <name></code> || <code>$ mpifort <src_file> -o <name></code><br />
|-<br />
| Intel || <code>$ mpiicc <src_file> -o <name></code> || <code>$ mpiicpc <src_file> -o <name></code> || <code>$ mpiifort <src_file> -o <name></code><br />
|}<br />
<br />
You can also type the command <code>$ mpicc [options] <src_file> -o <name></code> etc., where <code>[options]</code> can be replaced with one or more of the ones listed below. Intel MPI comes with rather advanced compiler options, that are mainly aimed at optimization and analyzing your code with the help of Intel tools.<br />
{| class="wikitable" style="width: 70%;"<br />
|Options || Function<br />
|-<br />
| -g || enable debugging information<br />
|- <br />
| -OX || enable compiler optimization, where <code>X</code> represents the optimization level and is one of 0, 1, 2, 3<br />
|-<br />
| -v || print the compiler version<br />
|}<br />
<br />
Instead of typing the compiler wrapper <code>mpicc</code>, <code>mpicxx</code> or <code>mpifort</code> explicitly, on most systems (e.g. the RWTH Compute Cluster) there are environment variables defined, which you can use to call the MPI compiler in a more general manner. Simple use <code>$MPICC</code>, <code>$MPICXX</code> or <code>$MPIFC</code> for the compiler you want to use and let the module system handle the dirty details of using the appropriate command.<br />
<br />
== How to Run an MPI Executable ==<br />
<br />
Ensure that the correct MPI module is loaded (go [[Modules|here]] to see how to load/switch modules). Once again, the command line options slightly differ between Intel MPI and Open MPI.<br />
In order to start any MPI program, type the following command where <code><executable></code> specifies the path to your application:<br />
$ mpirun -n <num_procs> [options] <executable><br />
Note that <code>mpiexec</code> and <code>mpirun</code> are synonymous in Open MPI, in Intel MPI it's <code>mpiexec.hydra</code> and <code>mpirun</code>.<br />
<br />
Don’t forget to put the <code>-np</code> or <code>-n</code> option as explained below. All the other options listed below are not mandatory.<br />
<br />
=== Open MPI ===<br />
<br />
{| class="wikitable" style="width: 60%;"<br />
| Option || Function<br />
|-<br />
| -np <num_procs> or -n <num_procs> || number of processes to run<br />
|-<br />
| -npersocket <num_procs> || number of processes per socket<br />
|-<br />
| -npernode <num_procs> || number of processes per node<br />
|-<br />
| -wdir <directory> || change to directory specified before executing the program<br />
|-<br />
| -path <path> || look for executables in the directory specified<br />
|-<br />
| -q or -quiet || suppress helpful messages<br />
|-<br />
| -output-filename <name> || redirect output into the file <name>.<rank><br />
|-<br />
| -x <env_variable> || export the specified environment variable to the remote nodes where the program will be executed<br />
|-<br />
| --help || list all options available with an explanation<br />
|}<br />
<br />
=== Intel MPI ===<br />
<br />
{| class="wikitable" style="width: 60%;"<br />
| Option || Function<br />
|-<br />
| -n <num_procs> || number of processes to run<br />
|-<br />
| -ppn <num_procs> || number of processes per node; for that to work, it may be necessary to set the environment variable <code>I_MPI_JOB_RESPECT_PROCESS_PLACEMENT=off</code><br />
|-<br />
| -wdir <directory> || change to directory specified before executing the program<br />
|-<br />
| -path <path> || look for executables in the directory specified<br />
|-<br />
| -outfile-pattern <name> || redirect stdout to file<br />
|-<br />
| --help || list all options available with an explanation<br />
|}<br />
<br />
=== Process Binding in Open MPI ===<br />
<br />
Binding processes means telling your system how to place the processes onto the architecture. This can be done by adding command-line options when calling <code>mpiexec</code> and may enhance the performance of your application. In order to learn more about that, go [[Binding/Pinning#Options_for_Binding_in_Open_MPI|here]].<br />
<br />
== References ==<br />
<br />
[https://software.intel.com/en-us/mpi-developer-reference-linux-compiler-command-options Intel MPI compiler options]<br />
<br />
[https://www.open-mpi.org/doc/v2.0/man1/mpiexec.1.php Manual page for Open MPI's mpiexec]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=OpenMP&diff=914OpenMP2018-04-16T14:50:09Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>OpenMP is an open standard for Shared Memory [[Parallel_Programming|parallelization]]. Information on how to run an existing OpenMP program can be found in the "[[How to Use OpenMP]]"-Section.<br />
<br />
== General ==<br />
OpenMP programming is mainly done with pragmas:<br />
<syntaxhighlight lang="c"><br />
#include <stdio.h><br />
<br />
int main(int argc, char* argv[])<br />
{<br />
#pragma omp parallel<br />
{<br />
printf("Hallo Welt!\n");<br />
}<br />
<br />
return 0;<br />
}<br />
</syntaxhighlight><br />
<br />
interpreted by a normal compiler as comments, these will only come into effect when a specific [[compiler]] (options) is utilized like detailed [[How_to_Use_OpenMP|here]].<br />
<br />
Please check the more detailed tutorials in the References.<br />
<br />
== References ==<br />
[https://doc.itc.rwth-aachen.de/download/attachments/35947076/01_IntroductionToOpenMP.pdf Introduction to OpenMP from PPCES (@RWTH Aachen) Part 1: Introduction]<br />
<br />
[https://doc.itc.rwth-aachen.de/download/attachments/35947076/02_OpenMPTaskingInDepth.pdf Introduction to OpenMP from PPCES (@RWTH Aachen) Part 2: Tasking in Depth]<br />
<br />
[https://doc.itc.rwth-aachen.de/download/attachments/35947076/03_OpenMPNumaSimd.pdf Introduction to OpenMP from PPCES (@RWTH Aachen) Part 3: NUMA & SIMD]<br />
<br />
[https://doc.itc.rwth-aachen.de/download/attachments/35947076/04_OpenMPSummary.pdf Introduction to OpenMP from PPCES (@RWTH Aachen) Part 4: Summary]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=MPI&diff=913MPI2018-04-16T14:48:14Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* General */</p>
<hr />
<div>MPI is an open standard for Distributed Memory [[Parallel_Programming|parallelization]]. Information on how to run an existing MPI program can be found in the [[How to Use MPI]] Section.<br />
<br />
== General ==<br />
In MPI the most essential operations are:<br />
* <code>MPI_Send</code> for sending a message<br />
<syntaxhighlight lang="c"><br />
int MPI_Send (const void* buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)<br />
</syntaxhighlight><br />
<br />
* <code>MPI_Recv</code> for receiving a message<br />
<syntaxhighlight lang="c"><br />
int MPI_Recv (void* buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status* status)<br />
</syntaxhighlight><br />
<br />
Although there a 100+ MPI functions defined in the standard (e.g. for non-blocking or collective communication, see the [[#References|References]] for more details), you can write meaningful MPI application with less than 20 of those. Programs written with these functions have to be compiled with a specific [[compiler]] (options) and executed with a special startup program like detailed [[How_to_Use_MPI|here]].<br />
<br />
Please check the more detailed tutorials in the References.<br />
<br />
== References ==<br />
[https://doc.itc.rwth-aachen.de/download/attachments/35947076/01_PPCES2018_MPI_Tutorial.pdf Introduction to MPI from PPCES (@RWTH Aachen) Part 1]<br />
<br />
[https://doc.itc.rwth-aachen.de/download/attachments/35947076/02_PPCES2018_MPI_Tutorial.pdf Introduction to MPI from PPCES (@RWTH Aachen) Part 2]<br />
<br />
[https://doc.itc.rwth-aachen.de/download/attachments/35947076/03_PPCES2018_MPI_Tutorial.pdf Introduction to MPI from PPCES (@RWTH Aachen) Part 3]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=911HPC-Dictionary2018-04-16T14:41:24Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== Unix ==<br />
<br />
TODO<br />
<br />
== Unix File System ==<br />
<br />
TODO<br />
<br />
== Node ==<br />
<br />
TODO<br />
<br />
=== Login Node ===<br />
<br />
TODO<br />
<br />
=== Copy Node ===<br />
<br />
TODO<br />
<br />
=== Backend Node ===<br />
<br />
TODO<br />
<br />
== CPU ==<br />
<br />
TODO<br />
<br />
== Socket ==<br />
<br />
TODO<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
TODO<br />
<br />
== Scalability ==<br />
<br />
TODO</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Parallel_Programming&diff=910Parallel Programming2018-04-16T14:41:05Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>In order to solve a problem faster, work can sometimes be executed in parallel, as mentioned in [[Getting_Started#Parallel_Programming_or_.22How-To-Use-More-Than-One-Core.22|Getting Started]]. To achieve this, one usually uses either a [[#Shared_Memory|Shared Memory]] or a [[#Distributed_Memory|Distributed Memory]] programming model. Following is a short description of the basic concept of Shared Memory Systems and Distributed Memory Systems. Information on how to start/run/use an exisiting parallel code can be found in the [[OpenMP]] or [[MPI]] article.<br />
<br />
<br />
== Shared Memory ==<br />
[[File:Shared_Memory.png|thumb|300px|Schematic of shared memory]]<br />
<br />
Shared Memory programming works like the communication of multiple people, who are cleaning a house, via a pin board. There is one shared memory (pin-board in the analogy) where everybody can see what everybody is doing and how far they have gotten or which results (the bathroom is already clean) they got. Similar to the physical world, there are logistical limits on many parallel units (people) can use the memory (pin board) efficiently and how big it can be.<br />
<br />
In the computer this translates to multiple cores having joint access to the same shared memory as depicted. This has the advantage, that there is generally very little communication overhead, since every core can write to every memory location and the communication is therefore implicit. Futhermore parallelising an existing sequential (= not parallel) program is commonly straight forward and very easy to implement, if the underlying problem allows parallelisation. As can be seen in the picture, it is not practical to attach more and more cores to the same memory, because it can only serve a limited number of cores with data efficiently at the same time. Therefore this paradigm is limited by how many cores can fit into one computer (a few hundred is a good estimate).<br />
<br />
For parallelizing applications, which plan on running on these kind of systems, [[OpenMP|Open Memory Programming (OpenMP)]] is commonly used in the HPC community.<br />
<br />
<br />
== Distributed Memory ==<br />
<br />
[[File:Distributed_Memory_sparse.png|thumb|300px|Schematic of a distributed memory system with a sparse network]]<br />
[[File:Distributed_Memory_dense.png|thumb|300px|Schematic of a distributed memory system with a dense network]]<br />
<br />
Distributed Memory is similar to the way how multiple humans interact while solving problems: every process (person) 'works' on it's own and can communicate with the others by sending messages (talking and listening).<br />
<br />
In a computer or a cluster of computers every core works on it's own and has a way (e.g. the [[MPI|Message Passing Interface (MPI)]]) to communicate with the other cores. This messaging can happen within a CPU between multiple cores, utilize a high speed network between the computers (nodes) of a supercomputer, or theoretically even happen over the internet. This sending and receiving of messages is often harder to implement for the developer and sometimes even requires a major rewrite/restructure of existing code. However, it has the advantage, that it can be scaled to more computers (nodes), since every process has it's own memory and can communicate over [[MPI]] with the other processes. The limiting factor here is the speed and characteristics of the physical network, connecting the different nodes.<br />
<br />
The communication pattern is depicted with a sparse and a dense network. In a sparse network, messages have to be forwarded by sometimes multiple cores to reach their destination. The more connections there are, the lower this amount of forwarding gets, which reduces average latency and overhead and increases throughput and scalability.<br />
<br />
Since every communication is explicitly coded, this communication pattern can be designed carefully to exploit the architecture and the available nodes to their fullest extend. It follows, that in theroy the application can scale as high as the underlying problem allows, being only limited by the network connecting the nodes and the overhead for sending/receiving messages.<br />
<br />
== Should I use Distributed Memory or Shared Memory? ==<br />
This really depends on the problem at hand. If the problem is parallelizable, the required computing power is a good indicator. When a few to a hundred cores should suffice, [[OpenMP]] is (for existing codes) commonly the easiest alternative. However, if thousands or even millions of cores are required, there is not really a way around [[MPI]]. To give a better overview, different pros and cons are listed in the table below:<br />
<br />
{| class="wikitable" style="width: 60%;"<br />
!colspan="2" | Shared Memory ([[OpenMP]]) || colspan="2"| Distributed Memory ([[MPI]])<br />
|-<br />
! Pros || Cons || Pros || Cons<br />
|-<br />
| Easy to implement || scales only to 1 node || scales across multiple nodes || harder to implement<br />
|-<br />
| shared variables || inherent data races || no inherent data races || no shared variables<br />
|-<br />
| low overhead || || rowspan="2"| each MPI process can utilize OpenMP,<br />
resulting in a hybrid application<br />
| some overhead<br />
|-<br />
| can be executed/started normally || || needs a library wrapper<br />
|}<br />
<br />
So in sum it really depends and if you are unsure, have a chat with your local HPC division, check out the example in the References or head over to the [[Support]].<br />
<br />
== References ==<br />
[https://doc.itc.rwth-aachen.de/download/attachments/3474050/OpenMP_and_MPI_for_Dummies-C.pdf?version=1&modificationDate=1387402526000&api=v2 Difference between SM und DM in a concrete C example]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Parallel_Programming&diff=909Parallel Programming2018-04-16T14:35:10Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>In order to solve a problem faster, work can sometimes be executed in parallel, as mentioned in [[Getting_Started#Parallel_Programming_or_.22How-To-Use-More-Than-One-Core.22|Getting Started]]. To achieve this, one usually uses either a [[#Shared_Memory|Shared Memory]] or a [[#Distributed_Memory|Distributed Memory]] programming model. Following is a short description of the basic concept of Shared Memory Systems and Distributed Memory Systems. Information on how to start/run/use an exisiting parallel code can be found in the [[OpenMP]] or [[MPI]] article.<br />
<br />
<br />
== Shared Memory ==<br />
[[File:Shared_Memory.png|thumb|300px|Schematic of shared memory]]<br />
<br />
Shared Memory programming works like the communication of multiple people, who are cleaning a house, via a pin board. There is one shared memory (pin-board in the analogy) where everybody can see what everybody is doing and how far they have gotten or which results (the bathroom is already clean) they got. Similar to the physical world, there are logistical limits on many parallel units (people) can use the memory (pin board) efficiently and how big it can be.<br />
<br />
In the computer this translates to multiple cores having joint access to the same shared memory as depicted. This has the advantage, that there is generally very little communication overhead, since every core can write to every memory location and the communication is therefore implicit. Futhermore parallelising an existing sequential (= not parallel) program is commonly straight forward and very easy to implement, if the underlying problem allows parallelisation. As can be seen in the picture, it is not practical to attach more and more cores to the same memory, because it can only serve a limited number of cores with data efficiently at the same time. Therefore this paradigm is limited by how many cores can fit into one computer (a few hundred is a good estimate).<br />
<br />
For parallelizing applications, which plan on running on these kind of systems, [[OpenMP|Open Memory Programming (OpenMP)]] is commonly used in the HPC community.<br />
<br />
<br />
== Distributed Memory ==<br />
<br />
[[File:Distributed_Memory_sparse.png|thumb|300px|Schematic of distributed memory with sparse network]]<br />
[[File:Distributed_Memory_dense.png|thumb|300px|Schematic of distributed memory with dense network]]<br />
<br />
Distributed Memory is similar to the way how multiple humans interact while solving problems: every process (person) 'works' on it's own and can communicate with the others by sending messages (talking and listening).<br />
<br />
In a computer or a cluster of computers every core works on it's own and has a way (e.g. the [[MPI|Message Passing Interface (MPI)]]) to communicate with the other cores. This messaging can happen within a CPU between multiple cores, utilize a high speed network between the computers (nodes) of a supercomputer, or theoretically even happen over the internet. This sending and receiving of messages is often harder to implement for the developer and sometimes even requires a major rewrite/restructure of existing code. However, it has the advantage, that it can be scaled to more computers (nodes), since every process has it's own memory and can communicate over [[MPI]] with the other processes. The limiting factor here is the speed and characteristics of the physical network, connecting the different nodes.<br />
<br />
The communication pattern is depicted with a sparse and a dense network. In a sparse network, messages have to be forwarded by sometimes multiple cores to reach their destination. The more connections there are, the lower this amount of forwarding gets, which reduces average latency and overhead and increases throughput/scalability.<br />
<br />
Since every communication is explicitly coded, this communication pattern can be designed carefully to exploit the architecture and the available nodes to their fullest extend. It follows, that in theroy the application can scale as high as the underlying problem allows, being only limited by the network connecting the nodes and the overhead for sending/receiving messages.<br />
<br />
== Should I use Distributed Memory or Shared Memory? ==<br />
This really depends on the problem at hand. If the problem is parallelizable, the required computing power is a good indicator. When a few to a hundred cores should suffice, [[OpenMP]] is (for existing codes) commonly the easiest alternative. However, if thousands or even millions of cores are required, there is not really a way around [[MPI]]. To give a better overview, different pros and cons are listed in the table below:<br />
<br />
{| class="wikitable" style="width: 60%;"<br />
!colspan="2" | Shared Memory ([[OpenMP]]) || colspan="2"| Distributed Memory ([[MPI]])<br />
|-<br />
! Pros || Cons || Pros || Cons<br />
|-<br />
| Easy to implement || scales only to 1 node || scales across multiple nodes || harder to implement<br />
|-<br />
| shared variables || inherent data races || no inherent data races || no shared variables<br />
|-<br />
| low overhead || || rowspan="2"| each MPI process can utilize OpenMP,<br />
resulting in a hybrid application<br />
| some overhead<br />
|-<br />
| can be executed/started normally || || needs a library wrapper<br />
|}<br />
<br />
So in sum it really depends and if you are unsure, have a chat with your local HPC division, check out the example in the References or head over to the [[Support]].<br />
<br />
== References ==<br />
[https://doc.itc.rwth-aachen.de/download/attachments/3474050/OpenMP_and_MPI_for_Dummies-C.pdf?version=1&modificationDate=1387402526000&api=v2 Difference between SM und DM in a concrete C example]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Parallel_Programming&diff=908Parallel Programming2018-04-16T14:26:52Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>In order to solve a problem faster, work can sometimes be executed in parallel, as mentioned in [[Getting_Started#Parallel_Programming_or_.22How-To-Use-More-Than-One-Core.22|Getting Started]]. To achieve this, one usually uses either a [[#Shared_Memory|Shared Memory]] or a [[#Distributed_Memory|Distributed Memory]] programming model. Following is a short description of the basic concept of Shared Memory Systems and Distributed Memory Systems. Information on how to start/run/use an exisiting parallel code can be found in the [[OpenMP]] or [[MPI]] article.<br />
<br />
<br />
== Shared Memory ==<br />
[[File:Shared_Memory.png|thumb|300px|Schematic of shared memory]]<br />
<br />
Shared Memory programming works like the communication of multiple people, who are cleaning a house, via a pin board. There is one shared memory (pin-board in the analogy) where everybody can see what everybody is doing and how far they have gotten or which results (the bathroom is already clean) they got. Similar to the physical world, there are logistical limits on many parallel units (people) can use the memory (pin board) efficiently and how big it can be.<br />
<br />
In the computer this translates to multiple cores having joint access to the same shared memory as depicted. This has the advantage, that there is generally very little communication overhead, since every core can write to every memory location and the communication is therefore implicit. Futhermore parallelising an existing sequential (= not parallel) program is commonly straight forward and very easy to implement, if the underlying problem allows parallelisation. As can be seen in the picture, it is not practical to attach more and more cores to the memory. Therefore this paradigm is limited by how many cores can fit into one computer (a few hundred is a good estimate).<br />
<br />
For parallelizing applications, which plan on running on these kind of systems, [[OpenMP|Open Memory Programming (OpenMP)]] is commonly used in the HPC community.<br />
<br />
<br />
== Distributed Memory ==<br />
<br />
[[File:Distributed_Memory_sparse.png|thumb|300px|Schematic of distributed memory with sparse network]]<br />
[[File:Distributed_Memory_dense.png|thumb|300px|Schematic of distributed memory with dense network]]<br />
<br />
Distributed Memory is similar to the way how multiple humans interact while solving problems: every process (person) 'works' on it's own and can communicate with the others by sending messages (talking and listening).<br />
<br />
In a computer or a cluster of computers every core works on it's own and has a way (e.g. the [[MPI|Message Passing Interface (MPI)]]) to communicate with the other cores. This messaging can happen within a CPU between multiple cores, utilize a high speed network between the computers (nodes) of a supercomputer, or theoretically even happen over the internet. This sending and receiving of messages is often harder to implement for the developer and sometimes even requires a major rewrite/restructure of existing code. However, it has the advantage, that it can be scaled to more computers (nodes), since every process has it's own memory and can communicate over [[MPI]] with the other processes. The limiting factor here is the speed and characteristics of the physical network, connecting the different nodes.<br />
<br />
The communication pattern is depicted with a sparse and a dense network. In a sparse network, messages have to be forwarded by sometimes multiple cores to reach their destination. The more connections there are, the lower this amount of forwarding gets, which reduces average latency and overhead and increases throughput/scalability.<br />
<br />
Since every communication is explicitly coded, this communication pattern can be designed carefully to exploit the architecture and the available nodes to their fullest extend. It follows, that in theroy the application can scale as high as the underlying problem allows, being only limited by the network connecting the nodes and the overhead for sending/receiving messages.<br />
<br />
== Should I use Distributed Memory or Shared Memory? ==<br />
This really depends on the problem at hand. If the problem is parallelizable, the required computing power is a good indicator. When a few to a hundred cores should suffice, [[OpenMP]] is (for existing codes) commonly the easiest alternative. However, if thousands or even millions of cores are required, there is not really a way around [[MPI]]. To give a better overview, different pros and cons are listed in the table below:<br />
<br />
{| class="wikitable" style="width: 60%;"<br />
!colspan="2" | Shared Memory ([[OpenMP]]) || colspan="2"| Distributed Memory ([[MPI]])<br />
|-<br />
! Pros || Cons || Pros || Cons<br />
|-<br />
| Easy to implement || scales only to 1 node || scales across multiple nodes || harder to implement<br />
|-<br />
| shared variables || inherent data races || no inherent data races || no shared variables<br />
|-<br />
| low overhead || || rowspan="2"| each MPI process can utilize OpenMP,<br />
resulting in a hybrid application<br />
| some overhead<br />
|-<br />
| can be executed/started normally || || needs a library wrapper<br />
|}<br />
<br />
So in sum it really depends and if you are unsure, have a chat with your local HPC division, check out the example in the References or head over to the [[Support]].<br />
<br />
== References ==<br />
[https://doc.itc.rwth-aachen.de/download/attachments/3474050/OpenMP_and_MPI_for_Dummies-C.pdf?version=1&modificationDate=1387402526000&api=v2 Difference between SM und DM in a concrete C example]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Getting_Started&diff=907Getting Started2018-04-16T14:12:36Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== [[Access|Access]] or "How-to-be-allowed-onto-the-supercomputer" ==<br />
Depending on the specific supercomputer, one has to either register to get a user account or write a project proposal and apply for computing resources that way. The respective pages are linked in [[Access|this overview]].<br />
<br />
After this is done and login credentials are supplied, one can proceed to [[ Getting_Started#Login_or_.22How-to-now-actually-connect-to-the-supercomputer.22 | login ]].<br />
<br />
== [[Nodes#Login|Login]] or "How-to-now-actually-connect-to-the-supercomputer" ==<br />
Most HPC Systems are unix-based environments with [[shell]] (commandline) access.<br />
<br />
To log in, one usually uses [[ssh]] to reach the respective [[Nodes#Login|Login Nodes]] (computers reserved for people just like you that want to connect to the supercomputer). Sometimes this access is restricted, so you can only connect, when you are within the university/facility and its network. To still access the Login Nodes externally, one can 'pretend to be inside the network' by using a [[VPN|Virtual Private Network (VPN)]].<br />
<br />
Once there, the user can interact with the system and run (small) programs to generally test the system/software.<br />
<br />
== [[File_Transfer|File Transfer]] or "How-to-get-your-data-onto-or-off-the-supercomputer" ==<br />
To get your data (files) onto the supercomputer or back to your local machine, there are usually different ways. Sometimes there are computers specifically reserved for this purpose called [[Nodes#Copy|copy nodes]].<br />
<br />
If available to you, it is recommened to use these copy nodes to move data to or from the supercomputer, since this will result in a better connection and disturb other users less. Additionally the tools mentioned below might only work on these nodes. If there are no dedicated copy nodes, you can usually use the [[Nodes#Login|Login Nodes]] for this purpose.<br />
<br />
Commonly used and widely supported copying tools are [[rsync]] which mirrors directories (folders) between the supercomputer and your local machine. [[scp]] which is useful for a few single files or specified file-lists, and lastly the commonly used [[ftp]] or the encrypted version sftp (or ftps).<br />
A little bit more information can be found in the [[File_Transfer|File Transfer]] article.<br />
<br />
== [[Scheduler|Schedulers]] or "How-To-Run-Applications-on-a-supercomputer" ==<br />
To run any significant program or workload on a supercomputer, generally [[Scheduler|schedulers]] are employed. Alongside the above-mentioned Login Nodes there are usually far more Backend Nodes in the system (computers exclusively reserved for computing, to which you cannot connect directly, also referred to as "batch system"). A program called scheduler decides who gets how many of those compute resources for which amount of time. Please use the Backend Nodes for everything which is not a simple small test and only runs for a few minutes., otherwise you will block the Login Nodes for everybody when you run your calculations there. These Backend Nodes make up more than 98% of a supercomputer and can only be accessed via the scheduler.<br />
<br />
When you log into a supercomputer, you can run commands on the Login Nodes interactively. You type, you hit return, the command gets executed. Schedulers work differently. You submit a series of commands (in form of a file) and tell it, how much resources it will approximately need in terms of:<br />
<br />
* time: If the specified time runs out, before your application finishes and exits, it will be terminated by the scheduler.<br />
* compute resources: how many cpus ('calculation thingies'), sockets ('cpu-houses') and nodes ('computers')<br />
* memory resources: how much RAM ('very fast memory, similar to the few books you have at home')<br />
<br />
This combination of specified commands and required resources is commonly referred to as a "(batch) job".<br />
<br />
If later compute resources become free, which match the requirements of your application, the scheduler will run your specified commands on the requested hardware. This is usually delayed (sometimes you have to wait a day or two) and not instant, because other users are currently using the compute resources and you have to wait until their program runs finish. Furthermore you cannot change the series of commands after submitting, but just terminate the job and submit a new one in case of an error.<br />
<br />
The file specifying this series of commands and the required resources is called a [[jobscript]]. Its format and syntax depends on the installed scheduler. When you have this jobscript ready with the help of [[jobscript-examples]], colleagues or your local [[support]], you can submit it to the respective [[Schedulers|scheduler of your facility]]. The scheduler then waits until a set of nodes (computers) are free and later allocates those to execute your job as soon as possible. Sometimes there is (an optional) email notification, which is send when your job starts execution/finished running.<br />
<br />
Be aware that your specified requirements have to fit within the boundaries of the system of your facility. If you ask for more than there is, chances are, the scheduler will accept your job and wait until missing hardware is bought and installed - although this will not happen in 99.9% of cases. Information over the available hardware can be found in the [https://gauss-allianz.de/de/hpc-ecosystem overview of the Gauss Allianz] or the [[Site-specific_documentation|documentation of the different sites]]. You can find more information about [[Getting_Started#Parallel_Programming_or_.22How-To-Use-More-Than-One-Core.22|parallelizing programs here]]. Also there is an [[Schedulers|overview of the schedulers used at the different sites]].<br />
<br />
== [[Modules|Modules]] or "How-To-Use-Software-Without-installing-everything-yourself" ==<br />
Since a lot of applications rely on 3rd party software, there is a program on most supercomputers, called the [[Modules|Module system]]. With this system, other software, like compilers or special math libraries, are easily loadable and usable. Depending on the institution, different modules might be available, but there are usually common ones like the [[Compiler#Intel_Compiler|Intel]] or [[Compiler#Gnu_Compiler_Collection|GCC]] [[Compiler|Compilers]].<br />
<br />
A few common commands, to enter into the supercomputer commandline and talk to the module system, are <br />
{| class="wikitable" style="width: 40%;"<br />
| module list || lists loaded modules<br />
|-<br />
| module avail || lists available (loadable) modules<br />
|-<br />
| module load/unload x || loads/unloads module x<br />
|-<br />
| module switch x y || switches out module x for module y<br />
|}<br />
<br />
If you recurrently need lots of modules, this loading can be automated with an [[sh-file]], so that you just have to execute the file once and it loads all modules, you need.<br />
<br />
== [[Parallel_Programming|Parallel Programming]] or "How-To-Use-More-Than-One-Core" ==<br />
Currently development of computers is at a point, where you cannot just make a processor run faster (e.g. by increasing its clock frequency), because limits of physics have been reached in semiconductor development. Therefore the current approach is to split the work into multiple, ideally independent parts, which are then executed in parallel. Similar to cleaning your house, where everybody takes care of a few rooms, on a supercomputer this is usually done with parallel programming paradigms like [[OpenMP|Open Multi-Processing (OpenMP)]] or [[MPI|Message Passing Interface (MPI)]]. However like the fact that you only have one vacuum cleaner in the whole house which not everybody can use at the same time, there are limits on how fast you can get, even with a big number of processing units/cpus/cores (analogous to people in the metaphor) working on your problem (cleaning the house) in parallel.</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Compiler&diff=906Compiler2018-04-16T14:05:38Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>A Compiler is a computer program translating code from one language to another.<br />
<br />
== General ==<br />
When people write applications, they usually employ a text editor and a high level language like C/C++ or Fortran to produce code that looks somewhat like this:<br />
<br />
[[File:Compiler_Shematic.png|thumb|1000px|Schematic of the compile process]]<br />
<br />
<syntaxhighlight lang="c"><br />
#include <stdio.h><br />
<br />
int main()<br />
{<br />
printf("Hello, World!\n");<br />
return 0;<br />
}<br />
<br />
</syntaxhighlight><br />
<br />
<br />
This is easy to write, understand and maintain for humans. However, since a computer only understands 0s and 1s, this can not be executed directly. A Compiler tranlates this code into a binary file, which can be executed.<br />
<br />
With the emergence of higher level programming languages, the entry barrier into programming is significantly lowered. This facillitates the creation of more complex programs which cannot (easily) be written just in terms of 0s and 1s by humans.<br />
<br />
== Basic Usage ==<br />
You usually use a compiler by calling it from the [[shell]]:<br />
$ cc hello_world.c -o hello_world.o<br />
where you feed it the file hello_world.c and let it create the binary output file hello_world.o, which you can then execute by calling <br />
$ ./hello_world.o<br />
producing the desired output <br />
Hello, World!<br />
<br />
In most compilers there are optimization flags like <code>-O2</code> (commonly ranging from 0 to 3), where the compiler tries to figure out, what your program is doing and whether there is more efficient way of doing that. When the development process is near completion and you begin to use your program productively, optimization should be turned on to ensure that the software runs as fast as possible.<br />
<br />
When compiling an application (target) from multiple files, one might need to use another program called the linker to bind the different parts together. A handy tool to automate the process of compiling and linking is a build system like [[Make]] or [[CMake]], which might become necessary when dealing with more complex projects.<br />
<br />
== Intel Compiler ==<br />
The Intel Compiler (icc) is a compiler written by Intel and optimized to utilize the features of their microprocessors to their fullest extend, sometimes resulting in a significantly higher performance compared to other compiler alternatives. It is usually called with (maybe you have to load the corresponding [[Modules|module]] beforehand):<br />
$ icc [files] [options]<br />
<br />
== Gnu Compiler Collection ==<br />
The Gnu Compiler Collection (gcc) is a free collection of compilers, originally written for the GNU operating system and now available on all major platforms.<br />
It is usually called with (maybe you have to load the corresponding [[Modules|module]] beforehand):<br />
$ gcc [files] [options]<br />
<br />
== LLVM ==<br />
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. It is usually called with (maybe you have to load the corresponding [[Modules|module]] beforehand):<br />
$ clang [files] [options]<br />
<br />
== References ==<br />
[https://www.youtube.com/watch?v=QXjU9qTsYCc Video Explaining the Basic Idea of a Compiler]<br />
<br />
[https://software.intel.com/en-us/intel-compilers Intel Compilers]<br />
<br />
[https://gcc.gnu.org/ Gnu Compiler Collection (gcc)]<br />
<br />
[https://llvm.org/ LLVM Compiler Collection]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Torque&diff=905Torque2018-04-16T13:59:13Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== General ==<br />
<br />
Torque is a job [[scheduler]]. It is responsible for monitoring and controlling the workload of the batch system of a supercomputer and assigns resources to jobs. This system is aimed at bigger applications that need a lot of resources in terms of memory and execution time and cannot be directly accessed by the user, as opposed to [[Nodes#Login|login nodes]]. Applications to execute have to be specified in a [[jobscript]] that is submitted to the batch system by the user via a scheduler (like Torque).<br />
<br />
== Job Submission ==<br />
<br />
This command submits the job you defined in your [[Jobscript|jobscript]] to the batch system:<br />
<br />
$ qsub jobscript.sh<br />
<br />
Just like any other incoming job, your job will first be queued. Then, the scheduler decides when your job will be run. The more resources your job requires, the longer it may be waiting to execute.<br />
<br />
You can check the current status of your submitted jobs and their job ids with the following shell command. The most common states for a job are queued <code>Q</code> (job waits for free nodes), running <code>R</code> (the jobscript is currently being executed) or on hold <code>H</code> (job is currently stopped, but does not wait for resources). The command also shows the elapsed time since your job has started running and the time limit.<br />
<br />
$ qstat -u <user_id><br />
<br />
In case you submitted a job on accident or realised that your job might not be running correctly, you can always remove it from the queue or terminate it when running by typing:<br />
<br />
$ qdel <job_id><br />
<br />
== #PBS Usage ==<br />
<br />
TODO<br />
<br />
== Jobscript Examples ==<br />
<br />
TODO<br />
<br />
== References ==<br />
<br />
[http://www.democritos.it/activities/IT-MC/documentation/newinterface/pages/runningcodes.html Overview of how to write a jobscript for Torque]<br />
<br />
[http://docs.adaptivecomputing.com/torque/3-0-5/2.1jobsubmission.php Job submission on Torque]<br />
<br />
[http://www.arc.ox.ac.uk/content/torque-job-scheduler Guide to the Torque scheduler]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=LSF&diff=904LSF2018-04-16T13:59:12Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== General ==<br />
<br />
LSF (Platform Load Sharing Facility) is a job [[scheduler]]. It is responsible for monitoring and controlling the workload of the batch system of a supercomputer and assigns resources to jobs. This system is aimed at bigger applications that need a lot of resources in terms of memory and execution time and cannot be directly accessed by the user, as opposed to [[Nodes#Login|login nodes]]. Applications to execute have to be specified in a [[jobscript]] that is submitted to the batch system by the user via a scheduler (like LSF).<br />
<br />
<br />
== #BSUB Usage ==<br />
<br />
If you are writing a [[jobscript]] for an LSF batch system, the magic cookie is "#BSUB". To use it, start a new line in your script with "#BSUB". Following that, you can put one of the parameters shown below, where the word written in <...> should be replaced with a value.<br />
<br />
Basic settings:<br />
{| class="wikitable" style="width: 40%;"<br />
| Parameter || Function<br />
|-<br />
| -J <name> || job name<br />
|-<br />
| -o <path> || path to the file where the job output is written<br />
|-<br />
| -e <path> || path to the file for the job error output (if not set, it will be written to output file as well)<br />
|}<br />
<br />
Requesting resources:<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function || Default<br />
|-<br />
| -W <runlimit> || runtime limit in the format [hour:]minute; once the time specified is up, the job will be killed by the [[scheduler]] || 00:15<br />
|-<br />
| -M <memlimit> || memory limit per process in MB || 512<br />
|-<br />
| -S <stacklimit> || limit of stack size per process in MB || 10<br />
|}<br />
<br />
Parallel programming (read more [[Parallel_Programming|here]]):<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function<br />
|-<br />
| -a openmp || start a parallel job for a shared-memory system<br />
|-<br />
| -n <num_threads> || number of threads to execute OpenMP application with<br />
|-<br />
| -a openmpi || start a parallel job for a distributed-memory system<br />
|-<br />
| -n <num_procs> || number of processes to execute MPI application with<br />
|}<br />
<br />
Email notifications:<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function<br />
|-<br />
| -B || send email to the job submitter when the job starts running<br />
|-<br />
| -N || send email to the job submitter when the job has finished<br />
|-<br />
| -u <email_address> || recipient of emails<br />
|}<br />
<br />
== Job Submission ==<br />
<br />
This command submits the job you defined in your [[Jobscript|jobscript]] to the batch system. If the less-than sign <code><</code> is left out, your job will be submitted, but all the resource requests in your jobscript will be ignored.<br />
<br />
$ bsub < jobscript.sh<br />
<br />
Just like any other incoming job, your job will first be queued. Then, the scheduler decides when your job will be run. The more resources your job requires, the longer it may be waiting to execute.<br />
<br />
You can check the current status of your submitted jobs and their job ids with the following shell command. A job can either be pending <code>PEND</code> (waiting for free nodes to run on) or running <code>RUN</code> (the jobscript is currently being executed). If all of your jobs have finished execution, the command will print <code>No unfinished jobs found</code>.<br />
<br />
$ bjobs<br />
<br />
In case you submitted a job on accident or realised that your job might not be running correctly, you can always remove it from the queue or terminate it when running by typing:<br />
<br />
$ bkill <job_id><br />
<br />
== Jobscript Examples ==<br />
<br />
This serial job will run a given executable, in this case "myapp.exe".<br />
<syntaxhighlight lang="zsh"><br />
#!/usr/bin/env zsh<br />
<br />
### Job name<br />
#BSUB -J MYJOB<br />
<br />
### File where the output should be written<br />
#BSUB -o MYJOB_OUTPUT.txt<br />
<br />
### Time your job needs to execute, e. g. 1 h 20 min<br />
#BSUB -W 1:20<br />
<br />
### Memory your job needs, e. g. 1000 MB <br />
#BSUB -M 1000<br />
<br />
### Stack limit per process, e. g. 20 MB<br />
#BSUB -S 20<br />
<br />
### The last part consists of regular shell commands:<br />
### Change to working directory<br />
cd /home/user/mywork<br />
<br />
### Execute your application<br />
myapp.exe<br />
</syntaxhighlight><br />
<br />
This OpenMP job will start the [[Parallel_Programming|parallel program]] "myapp.exe" with 24 threads.<br />
<syntaxhighlight lang="zsh"><br />
#!/usr/bin/env zsh<br />
<br />
### Job name<br />
#BSUB -J OMPJOB<br />
<br />
### File where the output should be written<br />
#BSUB -o OMPJOB_OUTPUT<br />
<br />
### Time your job needs to execute, e. g. 15 min<br />
#BSUB -W 0:15<br />
<br />
### Memory your job needs, e. g. 1000 MB <br />
#BSUB -M 1000<br />
<br />
### Stack limit per process, e. g. 50 MB<br />
#BSUB -S 50<br />
<br />
### Request 24 compute slots (in this case: threads)<br />
#BSUB -n 24<br />
<br />
### Execute as shared-memory job<br />
#BSUB -a openmp<br />
<br />
### Change to working directory<br />
cd /home/user/mywork<br />
<br />
### Execute your application<br />
myapp.exe<br />
</syntaxhighlight><br />
<br />
This OpenMPI job will start the [[Parallel_Programming|parallel program]] "myapp.exe" with 4 processes.<br />
<syntaxhighlight lang="zsh"><br />
#!/usr/bin/env zsh<br />
<br />
### Job name<br />
#BSUB -J MPIJOB<br />
<br />
### File where the output should be written<br />
#BSUB -o MPIJOB_OUTPUT<br />
<br />
### Time your job needs to execute, e. g. 30 min<br />
#BSUB -W 0:30<br />
<br />
### Memory your job needs, e. g. 1024 MB <br />
#BSUB -M 1024<br />
<br />
### Stack limit per process, e. g. 50 MB<br />
#BSUB -S 50<br />
<br />
### Request 4 compute slots (in this case: processes)<br />
#BSUB -n 4<br />
<br />
### Execute as distributed-memory job with OpenMPI<br />
#BSUB -a openmpi<br />
<br />
### Change to working directory<br />
cd /home/user/mywork<br />
<br />
### Execute your application<br />
myapp.exe<br />
</syntaxhighlight><br />
<br />
== References ==<br />
<br />
[https://doc.itc.rwth-aachen.de/display/CC/Example+scripts More LSF jobscript examples]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=SLURM&diff=903SLURM2018-04-16T13:59:10Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== General ==<br />
<br />
SLURM is a job [[scheduler]]. It is responsible for monitoring and controlling the workload of the batch system of a supercomputer and assigns resources to jobs. This system is aimed at bigger applications that need a lot of resources in terms of memory and execution time and cannot be directly accessed by the user, as opposed to [[Nodes#Login|login nodes]]. Applications to execute have to be specified in a [[jobscript]] that is submitted to the batch system by the user via a scheduler (like SLURM).<br />
<br />
== #SBATCH Usage ==<br />
<br />
If you are writing a [[jobscript]] for a SLURM batch system, the magic cookie is "#SBATCH". To use it, start a new line in your script with "#SBATCH". Following that, you can put one of the parameters shown below, where the word written in <...> should be replaced with a value.<br />
<br />
Basic settings:<br />
{| class="wikitable" style="width: 40%;"<br />
| Parameter || Function<br />
|-<br />
| --job-name=<name> || job name<br />
|-<br />
| --output=<path> || path to the file where the job (error) output is written to<br />
|}<br />
<br />
Requesting resources:<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function<br />
|-<br />
| --time=<runlimit> || runtime limit in the format hours:min:sec; once the time specified is up, the job will be killed by the [[scheduler]]<br />
|-<br />
| --mem=<memlimit> || job memory request per node, usually an integer followed by a prefix for the unit (e. g. --mem=1G for 1 GB)<br />
|}<br />
<br />
Parallel programming (read more [[Parallel_Programming|here]]):<br />
<br />
Settings for OpenMP:<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function<br />
|-<br />
| --nodes=1 || start a parallel job for a shared-memory system on only one node<br />
|-<br />
| --cpus-per-task=<num_threads> || number of threads to execute OpenMP application with<br />
|-<br />
| --ntasks-per-core=<num_hyperthreads> || number of hyperthreads per core; i. e. any value greater than 1 will turn on hyperthreading (the possible maximum depends on your CPU)<br />
|-<br />
| --ntasks-per-node=1 || for OpenMP, use one task per node only<br />
|}<br />
<br />
Settings for MPI:<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function<br />
|-<br />
| --nodes=<num_nodes> || start a parallel job for a distributed-memory system on several nodes<br />
|-<br />
| --cpus-per-task=1 || for MPI, use one task per CPU<br />
|-<br />
| --ntasks-per-core=1 || disable hyperthreading<br />
|-<br />
| --ntasks-per-node=<num_procs> || number of processes per node (the possible maximum depends on your nodes)<br />
|}<br />
<br />
Email notifications:<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function<br />
|-<br />
| --mail-type=<type> || type can be one of BEGIN, END, FAIL, REQUEUE or ALL (where a mail will be sent each time the status of your process changes)<br />
|-<br />
| --mail-user=<email_address> || email address to send notifications to<br />
|}<br />
<br />
== Job Submission ==<br />
<br />
This command submits the job you defined in your [[Jobscript|jobscript]] to the batch system:<br />
<br />
$ sbatch jobscript.sh<br />
<br />
Just like any other incoming job, your job will first be queued. Then, the scheduler decides when your job will be run. The more resources your job requires, the longer it may be waiting to execute.<br />
<br />
You can check the current status of your submitted jobs and their job ids with the following shell command. A job can either be pending <code>PD</code> (waiting for free nodes to run on) or running <code>R</code> (the jobscript is currently being executed). This command will also print the time (hours:min:sec) that your job has been running for.<br />
<br />
$ squeue -u <user_id><br />
<br />
In case you submitted a job on accident or realised that your job might not be running correctly, you can always remove it from the queue or terminate it when running by typing:<br />
<br />
$ scancel <job_id><br />
<br />
== Jobscript Examples ==<br />
<br />
This serial job will run a given executable, in this case "myapp.exe".<br />
<syntaxhighlight lang="bash"><br />
#!/bin/bash<br />
<br />
### Job name<br />
#SBATCH --job-name=MYJOB<br />
<br />
### File for the output<br />
#SBATCH --output=MYJOB_OUTPUT<br />
<br />
### Time your job needs to execute, e. g. 15 min 30 sec<br />
#SBATCH --time=00:15:30<br />
<br />
### Memory your job needs per node, e. g. 1 GB<br />
#SBATCH --mem=1G<br />
<br />
### The last part consists of regular shell commands:<br />
### Change to working directory<br />
cd /home/usr/workingdirectory<br />
<br />
### Execute your application<br />
myapp.exe<br />
</syntaxhighlight><br />
<br />
If you'd like to run a parallel job on a cluster that is managed by SLURM, you have to clarify that. Therefore, use the command "srun <my_executable>" in your jobscript.<br />
<br />
This OpenMP job will start the [[Parallel_Programming|parallel program]] "myapp.exe" with 24 threads.<br />
<syntaxhighlight lang="bash"><br />
#!/bin/bash<br />
<br />
### Job name<br />
#SBATCH --job-name=OMPJOB<br />
<br />
### File for the output<br />
#SBATCH --output=OMPJOB_OUTPUT<br />
<br />
### Time your job needs to execute, e. g. 30 min<br />
#SBATCH --time=00:30:00<br />
<br />
### Memory your job needs per node, e. g. 500 MB<br />
#SBATCH --mem=500M<br />
<br />
### Use one node for parallel jobs on shared-memory systems<br />
#SBATCH --nodes=1<br />
<br />
### Number of threads to use, e. g. 24<br />
#SBATCH --cpus-per-task=24<br />
<br />
### Number of hyperthreads per core<br />
#SBATCH --ntasks-per-core=1<br />
<br />
### Tasks per node (for shared-memory parallelisation, use 1)<br />
#SBATCH --ntasks-per-node=1<br />
<br />
### The last part consists of regular shell commands:<br />
### Set the number of threads in your cluster environment to the value specified above<br />
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK<br />
<br />
### Change to working directory<br />
cd /home/usr/workingdirectory<br />
<br />
### Run your parallel application<br />
srun myapp.exe<br />
</syntaxhighlight><br />
<br />
This MPI job will start the [[Parallel_Programming|parallel program]] "myapp.exe" with 12 processes.<br />
<syntaxhighlight lang="bash"><br />
#!/bin/bash<br />
<br />
### Job name<br />
#SBATCH --job-name=MPIJOB<br />
<br />
### File for the output<br />
#SBATCH --output=MPIJOB_OUTPUT<br />
<br />
### Time your job needs to execute, e. g. 50 min<br />
#SBATCH --time=00:50:00<br />
<br />
### Memory your job needs per node, e. g. 250 MB<br />
#SBATCH --mem=250M<br />
<br />
### Use more than one node for parallel jobs on distributed-memory systems, e. g. 2<br />
#SBATCH --nodes=2<br />
<br />
### Number of CPUS per task (for distributed-memory parallelisation, use 1)<br />
#SBATCH --cpus-per-task=1<br />
<br />
### Disable hyperthreading by setting the tasks per core to 1<br />
#SBATCH --ntasks-per-core=1<br />
<br />
### Number of processes per node, e. g. 6 (6 processes on 2 nodes = 12 processes in total)<br />
#SBATCH --ntasks-per-node=6<br />
<br />
### The last part consists of regular shell commands:<br />
### Set the number of threads in your cluster environment to 1, as specified above<br />
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK<br />
<br />
### Change to working directory<br />
cd /home/usr/workingdirectory<br />
<br />
### Run your parallel application<br />
srun myapp.exe<br />
</syntaxhighlight><br />
<br />
== References ==<br />
<br />
[https://www.lrz.de/services/compute/linux-cluster/batch_parallel/example_jobs/ Advanced SLURM jobscript examples]<br />
<br />
[http://www.nersc.gov/users/computational-systems/cori/running-jobs/example-batch-scripts/ Detailled guide to more advanced scripts]<br />
<br />
[https://slurm.schedmd.com/sbatch.html SBATCH documentation]<br />
<br />
[https://user.cscs.ch/getting_started/running_jobs/jobscript_generator/#slurm-jobscript-generator SLURM jobscript generator]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=LSF&diff=902LSF2018-04-16T13:50:42Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== General ==<br />
<br />
LSF is a job [[Scheduler|scheduler]]. The abbreviation stands for "Platform Load Sharing Facility", which is used to monitor and control the workload of the batch system of a supercomputer and to assign resources to jobs. This system targets applications that utilize a lot of resources regarding memory and computation time. It cannot be directly accessed by the user, as opposed to the [[Nodes#Login|login nodes]]. Applications to execute have to be specified in a [[Jobscript|jobscript]] that is sent to the batch system by the user.<br />
<br />
== #BSUB Usage ==<br />
<br />
If you are writing a [[jobscript]] for an LSF batch system, the magic cookie is "#BSUB". To use it, start a new line in your script with "#BSUB". Following that, you can put one of the parameters shown below, where the word written in <...> should be replaced with a value.<br />
<br />
Basic settings:<br />
{| class="wikitable" style="width: 40%;"<br />
| Parameter || Function<br />
|-<br />
| -J <name> || job name<br />
|-<br />
| -o <path> || path to the file where the job output is written<br />
|-<br />
| -e <path> || path to the file for the job error output (if not set, it will be written to output file as well)<br />
|}<br />
<br />
Requesting resources:<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function || Default<br />
|-<br />
| -W <runlimit> || runtime limit in the format [hour:]minute; once the time specified is up, the job will be killed by the [[scheduler]] || 00:15<br />
|-<br />
| -M <memlimit> || memory limit per process in MB || 512<br />
|-<br />
| -S <stacklimit> || limit of stack size per process in MB || 10<br />
|}<br />
<br />
Parallel programming (read more [[Parallel_Programming|here]]):<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function<br />
|-<br />
| -a openmp || start a parallel job for a shared-memory system<br />
|-<br />
| -n <num_threads> || number of threads to execute OpenMP application with<br />
|-<br />
| -a openmpi || start a parallel job for a distributed-memory system<br />
|-<br />
| -n <num_procs> || number of processes to execute MPI application with<br />
|}<br />
<br />
Email notifications:<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function<br />
|-<br />
| -B || send email to the job submitter when the job starts running<br />
|-<br />
| -N || send email to the job submitter when the job has finished<br />
|-<br />
| -u <email_address> || recipient of emails<br />
|}<br />
<br />
== Job Submission ==<br />
<br />
This command submits the job you defined in your [[Jobscript|jobscript]] to the batch system. If the less-than sign <code><</code> is left out, your job will be submitted, but all the resource requests in your jobscript will be ignored.<br />
<br />
$ bsub < jobscript.sh<br />
<br />
Just like any other incoming job, your job will first be queued. Then, the scheduler decides when your job will be run. The more resources your job requires, the longer it may be waiting to execute.<br />
<br />
You can check the current status of your submitted jobs and their job ids with the following shell command. A job can either be pending <code>PEND</code> (waiting for free nodes to run on) or running <code>RUN</code> (the jobscript is currently being executed). If all of your jobs have finished execution, the command will print <code>No unfinished jobs found</code>.<br />
<br />
$ bjobs<br />
<br />
In case you submitted a job on accident or realised that your job might not be running correctly, you can always remove it from the queue or terminate it when running by typing:<br />
<br />
$ bkill <job_id><br />
<br />
== Jobscript Examples ==<br />
<br />
This serial job will run a given executable, in this case "myapp.exe".<br />
<syntaxhighlight lang="zsh"><br />
#!/usr/bin/env zsh<br />
<br />
### Job name<br />
#BSUB -J MYJOB<br />
<br />
### File where the output should be written<br />
#BSUB -o MYJOB_OUTPUT.txt<br />
<br />
### Time your job needs to execute, e. g. 1 h 20 min<br />
#BSUB -W 1:20<br />
<br />
### Memory your job needs, e. g. 1000 MB <br />
#BSUB -M 1000<br />
<br />
### Stack limit per process, e. g. 20 MB<br />
#BSUB -S 20<br />
<br />
### The last part consists of regular shell commands:<br />
### Change to working directory<br />
cd /home/user/mywork<br />
<br />
### Execute your application<br />
myapp.exe<br />
</syntaxhighlight><br />
<br />
This OpenMP job will start the [[Parallel_Programming|parallel program]] "myapp.exe" with 24 threads.<br />
<syntaxhighlight lang="zsh"><br />
#!/usr/bin/env zsh<br />
<br />
### Job name<br />
#BSUB -J OMPJOB<br />
<br />
### File where the output should be written<br />
#BSUB -o OMPJOB_OUTPUT<br />
<br />
### Time your job needs to execute, e. g. 15 min<br />
#BSUB -W 0:15<br />
<br />
### Memory your job needs, e. g. 1000 MB <br />
#BSUB -M 1000<br />
<br />
### Stack limit per process, e. g. 50 MB<br />
#BSUB -S 50<br />
<br />
### Request 24 compute slots (in this case: threads)<br />
#BSUB -n 24<br />
<br />
### Execute as shared-memory job<br />
#BSUB -a openmp<br />
<br />
### Change to working directory<br />
cd /home/user/mywork<br />
<br />
### Execute your application<br />
myapp.exe<br />
</syntaxhighlight><br />
<br />
This OpenMPI job will start the [[Parallel_Programming|parallel program]] "myapp.exe" with 4 processes.<br />
<syntaxhighlight lang="zsh"><br />
#!/usr/bin/env zsh<br />
<br />
### Job name<br />
#BSUB -J MPIJOB<br />
<br />
### File where the output should be written<br />
#BSUB -o MPIJOB_OUTPUT<br />
<br />
### Time your job needs to execute, e. g. 30 min<br />
#BSUB -W 0:30<br />
<br />
### Memory your job needs, e. g. 1024 MB <br />
#BSUB -M 1024<br />
<br />
### Stack limit per process, e. g. 50 MB<br />
#BSUB -S 50<br />
<br />
### Request 4 compute slots (in this case: processes)<br />
#BSUB -n 4<br />
<br />
### Execute as distributed-memory job with OpenMPI<br />
#BSUB -a openmpi<br />
<br />
### Change to working directory<br />
cd /home/user/mywork<br />
<br />
### Execute your application<br />
myapp.exe<br />
</syntaxhighlight><br />
<br />
== References ==<br />
<br />
[https://doc.itc.rwth-aachen.de/display/CC/Example+scripts More LSF jobscript examples]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Torque&diff=901Torque2018-04-16T13:50:27Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== General ==<br />
<br />
Torque is a job [[Scheduler|scheduler]]. It is used to monitor and control the workload of the batch system of a supercomputer and assigns resources to jobs. This system targets applications that utilize a lot of resources and it cannot be directly accessed by the user, as opposed to the [[Nodes#Login|login nodes]]. Applications to execute have to be specified in a [[Jobscript|jobscript]] that is sent to the batch system by the user.<br />
<br />
== Job Submission ==<br />
<br />
This command submits the job you defined in your [[Jobscript|jobscript]] to the batch system:<br />
<br />
$ qsub jobscript.sh<br />
<br />
Just like any other incoming job, your job will first be queued. Then, the scheduler decides when your job will be run. The more resources your job requires, the longer it may be waiting to execute.<br />
<br />
You can check the current status of your submitted jobs and their job ids with the following shell command. The most common states for a job are queued <code>Q</code> (job waits for free nodes), running <code>R</code> (the jobscript is currently being executed) or on hold <code>H</code> (job is currently stopped, but does not wait for resources). The command also shows the elapsed time since your job has started running and the time limit.<br />
<br />
$ qstat -u <user_id><br />
<br />
In case you submitted a job on accident or realised that your job might not be running correctly, you can always remove it from the queue or terminate it when running by typing:<br />
<br />
$ qdel <job_id><br />
<br />
== #PBS Usage ==<br />
<br />
TODO<br />
<br />
== Jobscript Examples ==<br />
<br />
TODO<br />
<br />
== References ==<br />
<br />
[http://www.democritos.it/activities/IT-MC/documentation/newinterface/pages/runningcodes.html Overview of how to write a jobscript for Torque]<br />
<br />
[http://docs.adaptivecomputing.com/torque/3-0-5/2.1jobsubmission.php Job submission on Torque]<br />
<br />
[http://www.arc.ox.ac.uk/content/torque-job-scheduler Guide to the Torque scheduler]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=SLURM&diff=900SLURM2018-04-16T13:50:13Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== General ==<br />
<br />
SLURM is a job [[Scheduler|scheduler]]. It is responsible for monitoring and controlling the workload of the batch system of a supercomputer and assigns resources to jobs. This system is aimed at bigger applications that need a lot of resources in terms of memory and execution time and cannot be directly accessed by the user, as opposed to [[Nodes#Login|login nodes]]. Applications to execute have to be specified in a [[Jobscript|jobscript]] that is sent to the batch system by the user.<br />
<br />
== #SBATCH Usage ==<br />
<br />
If you are writing a [[jobscript]] for a SLURM batch system, the magic cookie is "#SBATCH". To use it, start a new line in your script with "#SBATCH". Following that, you can put one of the parameters shown below, where the word written in <...> should be replaced with a value.<br />
<br />
Basic settings:<br />
{| class="wikitable" style="width: 40%;"<br />
| Parameter || Function<br />
|-<br />
| --job-name=<name> || job name<br />
|-<br />
| --output=<path> || path to the file where the job (error) output is written to<br />
|}<br />
<br />
Requesting resources:<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function<br />
|-<br />
| --time=<runlimit> || runtime limit in the format hours:min:sec; once the time specified is up, the job will be killed by the [[scheduler]]<br />
|-<br />
| --mem=<memlimit> || job memory request per node, usually an integer followed by a prefix for the unit (e. g. --mem=1G for 1 GB)<br />
|}<br />
<br />
Parallel programming (read more [[Parallel_Programming|here]]):<br />
<br />
Settings for OpenMP:<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function<br />
|-<br />
| --nodes=1 || start a parallel job for a shared-memory system on only one node<br />
|-<br />
| --cpus-per-task=<num_threads> || number of threads to execute OpenMP application with<br />
|-<br />
| --ntasks-per-core=<num_hyperthreads> || number of hyperthreads per core; i. e. any value greater than 1 will turn on hyperthreading (the possible maximum depends on your CPU)<br />
|-<br />
| --ntasks-per-node=1 || for OpenMP, use one task per node only<br />
|}<br />
<br />
Settings for MPI:<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function<br />
|-<br />
| --nodes=<num_nodes> || start a parallel job for a distributed-memory system on several nodes<br />
|-<br />
| --cpus-per-task=1 || for MPI, use one task per CPU<br />
|-<br />
| --ntasks-per-core=1 || disable hyperthreading<br />
|-<br />
| --ntasks-per-node=<num_procs> || number of processes per node (the possible maximum depends on your nodes)<br />
|}<br />
<br />
Email notifications:<br />
{| class="wikitable" style="width: 60%;"<br />
| Parameter || Function<br />
|-<br />
| --mail-type=<type> || type can be one of BEGIN, END, FAIL, REQUEUE or ALL (where a mail will be sent each time the status of your process changes)<br />
|-<br />
| --mail-user=<email_address> || email address to send notifications to<br />
|}<br />
<br />
== Job Submission ==<br />
<br />
This command submits the job you defined in your [[Jobscript|jobscript]] to the batch system:<br />
<br />
$ sbatch jobscript.sh<br />
<br />
Just like any other incoming job, your job will first be queued. Then, the scheduler decides when your job will be run. The more resources your job requires, the longer it may be waiting to execute.<br />
<br />
You can check the current status of your submitted jobs and their job ids with the following shell command. A job can either be pending <code>PD</code> (waiting for free nodes to run on) or running <code>R</code> (the jobscript is currently being executed). This command will also print the time (hours:min:sec) that your job has been running for.<br />
<br />
$ squeue -u <user_id><br />
<br />
In case you submitted a job on accident or realised that your job might not be running correctly, you can always remove it from the queue or terminate it when running by typing:<br />
<br />
$ scancel <job_id><br />
<br />
== Jobscript Examples ==<br />
<br />
This serial job will run a given executable, in this case "myapp.exe".<br />
<syntaxhighlight lang="bash"><br />
#!/bin/bash<br />
<br />
### Job name<br />
#SBATCH --job-name=MYJOB<br />
<br />
### File for the output<br />
#SBATCH --output=MYJOB_OUTPUT<br />
<br />
### Time your job needs to execute, e. g. 15 min 30 sec<br />
#SBATCH --time=00:15:30<br />
<br />
### Memory your job needs per node, e. g. 1 GB<br />
#SBATCH --mem=1G<br />
<br />
### The last part consists of regular shell commands:<br />
### Change to working directory<br />
cd /home/usr/workingdirectory<br />
<br />
### Execute your application<br />
myapp.exe<br />
</syntaxhighlight><br />
<br />
If you'd like to run a parallel job on a cluster that is managed by SLURM, you have to clarify that. Therefore, use the command "srun <my_executable>" in your jobscript.<br />
<br />
This OpenMP job will start the [[Parallel_Programming|parallel program]] "myapp.exe" with 24 threads.<br />
<syntaxhighlight lang="bash"><br />
#!/bin/bash<br />
<br />
### Job name<br />
#SBATCH --job-name=OMPJOB<br />
<br />
### File for the output<br />
#SBATCH --output=OMPJOB_OUTPUT<br />
<br />
### Time your job needs to execute, e. g. 30 min<br />
#SBATCH --time=00:30:00<br />
<br />
### Memory your job needs per node, e. g. 500 MB<br />
#SBATCH --mem=500M<br />
<br />
### Use one node for parallel jobs on shared-memory systems<br />
#SBATCH --nodes=1<br />
<br />
### Number of threads to use, e. g. 24<br />
#SBATCH --cpus-per-task=24<br />
<br />
### Number of hyperthreads per core<br />
#SBATCH --ntasks-per-core=1<br />
<br />
### Tasks per node (for shared-memory parallelisation, use 1)<br />
#SBATCH --ntasks-per-node=1<br />
<br />
### The last part consists of regular shell commands:<br />
### Set the number of threads in your cluster environment to the value specified above<br />
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK<br />
<br />
### Change to working directory<br />
cd /home/usr/workingdirectory<br />
<br />
### Run your parallel application<br />
srun myapp.exe<br />
</syntaxhighlight><br />
<br />
This MPI job will start the [[Parallel_Programming|parallel program]] "myapp.exe" with 12 processes.<br />
<syntaxhighlight lang="bash"><br />
#!/bin/bash<br />
<br />
### Job name<br />
#SBATCH --job-name=MPIJOB<br />
<br />
### File for the output<br />
#SBATCH --output=MPIJOB_OUTPUT<br />
<br />
### Time your job needs to execute, e. g. 50 min<br />
#SBATCH --time=00:50:00<br />
<br />
### Memory your job needs per node, e. g. 250 MB<br />
#SBATCH --mem=250M<br />
<br />
### Use more than one node for parallel jobs on distributed-memory systems, e. g. 2<br />
#SBATCH --nodes=2<br />
<br />
### Number of CPUS per task (for distributed-memory parallelisation, use 1)<br />
#SBATCH --cpus-per-task=1<br />
<br />
### Disable hyperthreading by setting the tasks per core to 1<br />
#SBATCH --ntasks-per-core=1<br />
<br />
### Number of processes per node, e. g. 6 (6 processes on 2 nodes = 12 processes in total)<br />
#SBATCH --ntasks-per-node=6<br />
<br />
### The last part consists of regular shell commands:<br />
### Set the number of threads in your cluster environment to 1, as specified above<br />
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK<br />
<br />
### Change to working directory<br />
cd /home/usr/workingdirectory<br />
<br />
### Run your parallel application<br />
srun myapp.exe<br />
</syntaxhighlight><br />
<br />
== References ==<br />
<br />
[https://www.lrz.de/services/compute/linux-cluster/batch_parallel/example_jobs/ Advanced SLURM jobscript examples]<br />
<br />
[http://www.nersc.gov/users/computational-systems/cori/running-jobs/example-batch-scripts/ Detailled guide to more advanced scripts]<br />
<br />
[https://slurm.schedmd.com/sbatch.html SBATCH documentation]<br />
<br />
[https://user.cscs.ch/getting_started/running_jobs/jobscript_generator/#slurm-jobscript-generator SLURM jobscript generator]</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Getting_Started&diff=899Getting Started2018-04-16T13:49:02Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* Schedulers or "How-To-Run-Applications-on-a-supercomputer" */</p>
<hr />
<div>== [[Access|Access]] or "How-to-be-allowed-onto-the-supercomputer" ==<br />
Depending on the specific supercomputer, one has to either register to get a user account or write a project proposal and apply for computing resources that way. The respective pages are linked in [[Access|this overview]].<br />
<br />
After this is done and login credentials are supplied, one can proceed to [[ Getting_Started#Login_or_.22How-to-now-actually-connect-to-the-supercomputer.22 | login ]].<br />
<br />
== [[Nodes#Login|Login]] or "How-to-now-actually-connect-to-the-supercomputer" ==<br />
Most HPC Systems are unix-based environments with [[shell]] (commandline) access.<br />
<br />
To log in, one usually uses [[ssh]] to reach the respective [[Nodes#Login|Login Nodes]] (computers reserved for people just like you that want to connect to the supercomputer). Sometimes this access is restricted, so you can only connect, when you are within the university/facility and its network. To still access the Login Nodes externally, one can 'pretend to be inside the network' by using a [[VPN|Virtual Private Network (VPN)]].<br />
<br />
Once there, the user can interact with the system and run (small) programs to generally test the system/software.<br />
<br />
== [[File_Transfer|File Transfer]] or "How-to-get-your-data-onto-or-off-the-supercomputer" ==<br />
To get your data (files) onto the supercomputer or back to your local machine, there are usually different ways. Sometimes there are computers specifically reserved for this purpose called [[Nodes#Copy|copy nodes]].<br />
<br />
If available to you, it is recommened to use these copy nodes to move data to or from the supercomputer, since this will result in a better connection and disturb other users less. Additionally the tools mentioned below might only work on these nodes. If there are no dedicated copy nodes, you can usually use the [[Nodes#Login|Login Nodes]] for this purpose.<br />
<br />
Commonly used and widely supported copying tools are [[rsync]] which mirrors directories (folders) between the supercomputer and your local machine. [[scp]] which is useful for a few single files or specified file-lists, and lastly the commonly used [[ftp]] or the encrypted version sftp (or ftps).<br />
A little bit more information can be found in the [[File_Transfer|File Transfer]] article.<br />
<br />
== [[Scheduler|Schedulers]] or "How-To-Run-Applications-on-a-supercomputer" ==<br />
To run any significant program or workload on a supercomputer, generally [[Scheduler|schedulers]] are employed. Alongside the above-mentioned Login Nodes there are usually far more Backend Nodes in the system (computers exclusively reserved for computing, to which you cannot connect directly, also referred to as "batch system"). A program called scheduler decides who gets how many of those compute resources for which amount of time. Please use the Backend Nodes for everything which is not a simple small test and only runs for a few minutes., otherwise you will block the Login Nodes for everybody when you run your calculations there. These Backend Nodes make up more than 98% of a supercomputer and can only be accessed via the scheduler.<br />
<br />
When you log into a supercomputer, you can run commands on the Login Nodes interactively. You type, you hit return, the command gets executed. Schedulers work differently. You submit a series of commands (in form of a file) and tell it, how much resources it will approximately need in terms of:<br />
<br />
* time: If the specified time runs out, before your application finishes and exits, it will be terminated by the scheduler.<br />
* compute resources: how many cpus ('calculation thingies'), sockets ('cpu-houses') and nodes ('computers')<br />
* memory resources: how much RAM ('very fast memory, similar to the few books you have at home')<br />
<br />
This combination of specified commands and required resources is commonly referred to as a "(batch) job".<br />
<br />
If later compute resources become free, which match the requirements of your application, the scheduler will run your specified commands on the requested hardware. This is usually delayed (sometimes you have to wait a day or two) and not instant, because other users are currently using the compute resources and you have to wait until their program runs finish. Furthermore you cannot change the series of commands after submitting, but just terminate the job and submit a new one in case of an error.<br />
<br />
The file specifying this series of commands and the required resources is called a [[jobscript]]. Its format and syntax depends on the installed scheduler. When you have this jobscript ready with the help of [[jobscript-examples]], colleagues or your local [[support]], you can submit it to the respective [[Schedulers|scheduler of your facility]]. The scheduler then waits until a set of nodes (computers) are free and later allocates those to execute your job as soon as possible. Sometimes there is (an optional) email notification, which is send when your job starts execution/finished running.<br />
<br />
Be aware that your specified requirements have to fit within the boundaries of the system of your facility. If you ask for more than there is, chances are, the scheduler will accept your job and wait until missing hardware is bought and installed - although this will not happen in 99.9% of cases. Information over the available hardware can be found in the [https://gauss-allianz.de/de/hpc-ecosystem overview of the Gauss Allianz] or the [[Site-specific_documentation|documentation of the different sites]]. You can find more information about [[Getting_Started#Parallel_Programming_or_.22How-To-Use-More-Than-One-Core.22|parallelizing programs here]]. Also there is an [[Schedulers|overview of the schedulers used at the different sites]].<br />
<br />
== [[Modules|Modules]] or "How-To-Use-Software-Without-installing-everything-yourself" ==<br />
Since a lot of applications rely on 3rd party software, there is a program on most supercomputers, called the [[Modules|Module system]]. With this system, other software, like compilers or special math libraries, are easily loadable and usable. Depending on the institution, different modules might be available, but there are usually common ones like the [[Compiler#Intel_Compiler|Intel]] or [[Compiler#Gnu_Compiler_Collection|GCC]] [[Compiler|Compilers]].<br />
<br />
A few common commands, to enter into the supercomputer commandline and talk to the module system, are <br />
{| class="wikitable" style="width: 40%;"<br />
| module list || lists loaded modules<br />
|-<br />
| module avail || lists available (loadable) modules<br />
|-<br />
| module load/unload x || loads/unloads module x<br />
|-<br />
| module switch x y || switches out module x for module y<br />
|}<br />
<br />
If you recurrently need lots of modules, this loading can be automated with an [[sh-file]], so that you just have to execute the file once and it loads all modules, you need.<br />
<br />
== [[Parallel_Programming|Parallel Programming]] or "How-To-Use-More-Than-One-Core" ==<br />
Currently development of computers is at a point, where you cannot just make a processor run faster, because the semiconductor physics simply do not work that way. Therefore the current approach is to split the work into multiple, ideally independent parts, which are then executed in parallel. Similar to cleaning your house, where everybody takes care of a few rooms, on a supercomputer this is usually done with parallel programming paradigms like [[OpenMP|Open Multi-Processing (OpenMP)]] or [[MPI|Message Passing Interface (MPI)]]. However like the fact that you only have one vacuum cleaner in the whole house which not everybody can use at the same time, there are limits on how fast you can get, even with a big number of processing units/cpus/cores (analogous to people in the metaphor) working on your problem (cleaning the house) in parallel.</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Site-specific_documentation&diff=898Site-specific documentation2018-04-16T13:42:32Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>Please find the different documentations of the participating facilities here:<br />
<br />
{| class="wikitable" style="width: 40%;"<br />
| IT Center - RWTH Aachen || [https://doc.itc.rwth-aachen.de/ Doc_site][http://itc.rwth-aachen.de/ IT_Center] <br />
|-<br />
| RRZE - FAU Erlangen || [https://www.anleitungen.rrze.fau.de/hpc/ HPC_documentation]<br />
|-<br />
| ZIH - TU Dresden || [https://doc.zih.tu-dresden.de/hpc-wiki/bin/view/Compendium/WebHome HPC Compendium]<br />
|}</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Getting_Started&diff=897Getting Started2018-04-16T13:42:09Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* Schedulers or "How-To-Run-Applications-on-a-supercomputer" */</p>
<hr />
<div>== [[Access|Access]] or "How-to-be-allowed-onto-the-supercomputer" ==<br />
Depending on the specific supercomputer, one has to either register to get a user account or write a project proposal and apply for computing resources that way. The respective pages are linked in [[Access|this overview]].<br />
<br />
After this is done and login credentials are supplied, one can proceed to [[ Getting_Started#Login_or_.22How-to-now-actually-connect-to-the-supercomputer.22 | login ]].<br />
<br />
== [[Nodes#Login|Login]] or "How-to-now-actually-connect-to-the-supercomputer" ==<br />
Most HPC Systems are unix-based environments with [[shell]] (commandline) access.<br />
<br />
To log in, one usually uses [[ssh]] to reach the respective [[Nodes#Login|Login Nodes]] (computers reserved for people just like you that want to connect to the supercomputer). Sometimes this access is restricted, so you can only connect, when you are within the university/facility and its network. To still access the Login Nodes externally, one can 'pretend to be inside the network' by using a [[VPN|Virtual Private Network (VPN)]].<br />
<br />
Once there, the user can interact with the system and run (small) programs to generally test the system/software.<br />
<br />
== [[File_Transfer|File Transfer]] or "How-to-get-your-data-onto-or-off-the-supercomputer" ==<br />
To get your data (files) onto the supercomputer or back to your local machine, there are usually different ways. Sometimes there are computers specifically reserved for this purpose called [[Nodes#Copy|copy nodes]].<br />
<br />
If available to you, it is recommened to use these copy nodes to move data to or from the supercomputer, since this will result in a better connection and disturb other users less. Additionally the tools mentioned below might only work on these nodes. If there are no dedicated copy nodes, you can usually use the [[Nodes#Login|Login Nodes]] for this purpose.<br />
<br />
Commonly used and widely supported copying tools are [[rsync]] which mirrors directories (folders) between the supercomputer and your local machine. [[scp]] which is useful for a few single files or specified file-lists, and lastly the commonly used [[ftp]] or the encrypted version sftp (or ftps).<br />
A little bit more information can be found in the [[File_Transfer|File Transfer]] article.<br />
<br />
== [[Scheduler|Schedulers]] or "How-To-Run-Applications-on-a-supercomputer" ==<br />
To run any significant program or workload on a supercomputer, generally [[Scheduler|schedulers]] are employed. Alongside the above-mentioned Login Nodes there are usually far more Backend Nodes in the system (computers exclusively reserved for computing, to which you cannot connect directly). A program called scheduler decides who gets how many of those compute resources for which amount of time. Please use the Backend Nodes for everything which is not a simple small test and only runs for a few minutes., otherwise you will block the Login Nodes for everybody when you run your calculations there. These Backend Nodes make up more than 98% of a supercomputer and can only be accessed via the scheduler.<br />
<br />
When you log into a supercomputer, you can run commands on the Login Nodes interactively. You type, you hit return, the command gets executed. Schedulers work differently. You submit a series of commands (in form of a file) and tell it, how much resources it will approximately need in terms of:<br />
<br />
* time: If the specified time runs out, before your application finishes and exits, it will be terminated by the scheduler.<br />
* compute resources: how many cpus ('calculation thingies'), sockets ('cpu-houses') and nodes ('computers')<br />
* memory resources: how much RAM ('very fast memory, similar to the few books you have at home')<br />
<br />
This combination of specified commands and required resources is commonly referred to as a "(batch) job".<br />
<br />
If later compute resources become free, which match the requirements of your application, the scheduler will run your specified commands on the requested hardware. This is usually delayed (sometimes you have to wait a day or two) and not instant, because other users are currently using the compute resources and you have to wait until their program runs finish. Furthermore you cannot change the series of commands after submitting, but just terminate the job and submit a new one in case of an error.<br />
<br />
The file specifying this series of commands and the required resources is called a [[jobscript]]. Its format and syntax depends on the installed scheduler. When you have this jobscript ready with the help of [[jobscript-examples]], colleagues or your local [[support]], you can submit it to the respective [[Schedulers|scheduler of your facility]]. The scheduler then waits until a set of nodes (computers) are free and later allocates those to execute your job as soon as possible. Sometimes there is (an optional) email notification, which is send when your job starts execution/finished running.<br />
<br />
Be aware that your specified requirements have to fit within the boundaries of the system of your facility. If you ask for more than there is, chances are, the scheduler will accept your job and wait until missing hardware is bought and installed - although this will not happen in 99.9% of cases. Information over the available hardware can be found in the [https://gauss-allianz.de/de/hpc-ecosystem overview of the Gauss Allianz] or the [[Site-specific_documentation|documentation of the different sites]]. You can find more information about [[Getting_Started#Parallel_Programming_or_.22How-To-Use-More-Than-One-Core.22|parallelizing programs here]]. Also there is an [[Schedulers|overview of the schedulers used at the different sites]].<br />
<br />
== [[Modules|Modules]] or "How-To-Use-Software-Without-installing-everything-yourself" ==<br />
Since a lot of applications rely on 3rd party software, there is a program on most supercomputers, called the [[Modules|Module system]]. With this system, other software, like compilers or special math libraries, are easily loadable and usable. Depending on the institution, different modules might be available, but there are usually common ones like the [[Compiler#Intel_Compiler|Intel]] or [[Compiler#Gnu_Compiler_Collection|GCC]] [[Compiler|Compilers]].<br />
<br />
A few common commands, to enter into the supercomputer commandline and talk to the module system, are <br />
{| class="wikitable" style="width: 40%;"<br />
| module list || lists loaded modules<br />
|-<br />
| module avail || lists available (loadable) modules<br />
|-<br />
| module load/unload x || loads/unloads module x<br />
|-<br />
| module switch x y || switches out module x for module y<br />
|}<br />
<br />
If you recurrently need lots of modules, this loading can be automated with an [[sh-file]], so that you just have to execute the file once and it loads all modules, you need.<br />
<br />
== [[Parallel_Programming|Parallel Programming]] or "How-To-Use-More-Than-One-Core" ==<br />
Currently development of computers is at a point, where you cannot just make a processor run faster, because the semiconductor physics simply do not work that way. Therefore the current approach is to split the work into multiple, ideally independent parts, which are then executed in parallel. Similar to cleaning your house, where everybody takes care of a few rooms, on a supercomputer this is usually done with parallel programming paradigms like [[OpenMP|Open Multi-Processing (OpenMP)]] or [[MPI|Message Passing Interface (MPI)]]. However like the fact that you only have one vacuum cleaner in the whole house which not everybody can use at the same time, there are limits on how fast you can get, even with a big number of processing units/cpus/cores (analogous to people in the metaphor) working on your problem (cleaning the house) in parallel.</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Getting_Started&diff=896Getting Started2018-04-16T13:38:39Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* Schedulers or "How-To-Run-Applications-on-a-supercomputer" */</p>
<hr />
<div>== [[Access|Access]] or "How-to-be-allowed-onto-the-supercomputer" ==<br />
Depending on the specific supercomputer, one has to either register to get a user account or write a project proposal and apply for computing resources that way. The respective pages are linked in [[Access|this overview]].<br />
<br />
After this is done and login credentials are supplied, one can proceed to [[ Getting_Started#Login_or_.22How-to-now-actually-connect-to-the-supercomputer.22 | login ]].<br />
<br />
== [[Nodes#Login|Login]] or "How-to-now-actually-connect-to-the-supercomputer" ==<br />
Most HPC Systems are unix-based environments with [[shell]] (commandline) access.<br />
<br />
To log in, one usually uses [[ssh]] to reach the respective [[Nodes#Login|Login Nodes]] (computers reserved for people just like you that want to connect to the supercomputer). Sometimes this access is restricted, so you can only connect, when you are within the university/facility and its network. To still access the Login Nodes externally, one can 'pretend to be inside the network' by using a [[VPN|Virtual Private Network (VPN)]].<br />
<br />
Once there, the user can interact with the system and run (small) programs to generally test the system/software.<br />
<br />
== [[File_Transfer|File Transfer]] or "How-to-get-your-data-onto-or-off-the-supercomputer" ==<br />
To get your data (files) onto the supercomputer or back to your local machine, there are usually different ways. Sometimes there are computers specifically reserved for this purpose called [[Nodes#Copy|copy nodes]].<br />
<br />
If available to you, it is recommened to use these copy nodes to move data to or from the supercomputer, since this will result in a better connection and disturb other users less. Additionally the tools mentioned below might only work on these nodes. If there are no dedicated copy nodes, you can usually use the [[Nodes#Login|Login Nodes]] for this purpose.<br />
<br />
Commonly used and widely supported copying tools are [[rsync]] which mirrors directories (folders) between the supercomputer and your local machine. [[scp]] which is useful for a few single files or specified file-lists, and lastly the commonly used [[ftp]] or the encrypted version sftp (or ftps).<br />
A little bit more information can be found in the [[File_Transfer|File Transfer]] article.<br />
<br />
== [[Scheduler|Schedulers]] or "How-To-Run-Applications-on-a-supercomputer" ==<br />
To run any significant program or workload on a supercomputer, generally [[Scheduler|schedulers]] are employed. Alongside the above-mentioned Login Nodes there are usually far more Backend Nodes in the system (computers exclusively reserved for computing, to which you cannot connect directly). A program called scheduler decides who gets how many of those compute resources for which amount of time. Please use the Backend Nodes for everything which is not a simple small test and only runs for a few minutes., otherwise you will block the Login Nodes for everybody when you run your calculations there. These Backend Nodes make up more than 98% of a supercomputer and can only be accessed via the scheduler.<br />
<br />
When you log into a supercomputer, you can run commands on the Login Nodes interactively. You type, you hit return, the command gets executed. Schedulers work differently. You submit a series of commands (in form of a file) and tell it, how much resources it will approximately need in terms of:<br />
<br />
* time: If the specified time runs out, before your application finishes and exits, it will be terminated by the scheduler.<br />
* compute resources: how many cpus ('calculation thingies'), sockets ('cpu-houses') and nodes ('computers')<br />
* memory resources: how much RAM ('very fast memory, similar to the few books you have at home')<br />
<br />
This combination of specified commands and required resources is commonly referred to as a "(batch) job".<br />
<br />
If later compute resources become free, which match the requirements of your application, the scheduler will run your specified commands on the requested hardware. This is usually delayed (sometimes you have to wait a day or two) and not instant, because other users are currently using the compute resources and you have to wait until their program runs finish. Furthermore you cannot change the series of commands after submitting, but just terminate the job and submit a new one in case of an error.<br />
<br />
The file specifying this series of commands and the required resources is called a [[jobscript]]. Its format and syntax depends on the installed scheduler. When you have this jobscript ready with the help of [[jobscript-examples]], colleagues or your local [[Support]], you can submit it to the respective [[Schedulers|scheduler of your facility]]. The Scheduler then waits until a set of nodes (computers) are free and later allocates those to execute your job as soon as possible. Sometimes there is (an optional) email notification, which is send when your job starts execution/finished running.<br />
<br />
Be aware that your specified requirements have to fit within the boundaries of the system of your facility. If you ask for more than there is, chances are, the scheduler will accept your job and wait until missing hardware is bought and installed - although this will not happen in 99.9% of cases. Information over the available hardware can be found in the [https://gauss-allianz.de/de/hpc-ecosystem Overview of the Gauss Allianz] or the [[Site-specific_documentation|documentation of the different sites]]. You can find more information about [[Getting_Started#Parallel_Programming_or_.22How-To-Use-More-Than-One-Core.22|parallelizing programs here]]. Also there is an [[Schedulers|overview of the schedulers used at the different sites]].<br />
<br />
== [[Modules|Modules]] or "How-To-Use-Software-Without-installing-everything-yourself" ==<br />
Since a lot of applications rely on 3rd party software, there is a program on most supercomputers, called the [[Modules|Module system]]. With this system, other software, like compilers or special math libraries, are easily loadable and usable. Depending on the institution, different modules might be available, but there are usually common ones like the [[Compiler#Intel_Compiler|Intel]] or [[Compiler#Gnu_Compiler_Collection|GCC]] [[Compiler|Compilers]].<br />
<br />
A few common commands, to enter into the supercomputer commandline and talk to the module system, are <br />
{| class="wikitable" style="width: 40%;"<br />
| module list || lists loaded modules<br />
|-<br />
| module avail || lists available (loadable) modules<br />
|-<br />
| module load/unload x || loads/unloads module x<br />
|-<br />
| module switch x y || switches out module x for module y<br />
|}<br />
<br />
If you recurrently need lots of modules, this loading can be automated with an [[sh-file]], so that you just have to execute the file once and it loads all modules, you need.<br />
<br />
== [[Parallel_Programming|Parallel Programming]] or "How-To-Use-More-Than-One-Core" ==<br />
Currently development of computers is at a point, where you cannot just make a processor run faster, because the semiconductor physics simply do not work that way. Therefore the current approach is to split the work into multiple, ideally independent parts, which are then executed in parallel. Similar to cleaning your house, where everybody takes care of a few rooms, on a supercomputer this is usually done with parallel programming paradigms like [[OpenMP|Open Multi-Processing (OpenMP)]] or [[MPI|Message Passing Interface (MPI)]]. However like the fact that you only have one vacuum cleaner in the whole house which not everybody can use at the same time, there are limits on how fast you can get, even with a big number of processing units/cpus/cores (analogous to people in the metaphor) working on your problem (cleaning the house) in parallel.</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=895HPC-Dictionary2018-04-16T13:28:10Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== Unix ==<br />
<br />
TODO<br />
<br />
== Unix File System ==<br />
<br />
TODO<br />
<br />
== Node ==<br />
<br />
TODO<br />
<br />
=== Login Node ===<br />
<br />
TODO<br />
<br />
=== Copy Node ===<br />
<br />
TODO<br />
<br />
=== Backend Node ===<br />
<br />
TODO<br />
<br />
== CPU ==<br />
<br />
TODO<br />
<br />
== Socket ==<br />
<br />
TODO<br />
<br />
== Random Access Memory (RAM) ==<br />
<br />
TODO</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Getting_Started&diff=894Getting Started2018-04-16T13:26:41Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* Schedulers or "How-To-Run-Applications-on-a-supercomputer" */</p>
<hr />
<div>== [[Access|Access]] or "How-to-be-allowed-onto-the-supercomputer" ==<br />
Depending on the specific supercomputer, one has to either register to get a user account or write a project proposal and apply for computing resources that way. The respective pages are linked in [[Access|this overview]].<br />
<br />
After this is done and login credentials are supplied, one can proceed to [[ Getting_Started#Login_or_.22How-to-now-actually-connect-to-the-supercomputer.22 | login ]].<br />
<br />
== [[Nodes#Login|Login]] or "How-to-now-actually-connect-to-the-supercomputer" ==<br />
Most HPC Systems are unix-based environments with [[shell]] (commandline) access.<br />
<br />
To log in, one usually uses [[ssh]] to reach the respective [[Nodes#Login|Login Nodes]] (computers reserved for people just like you that want to connect to the supercomputer). Sometimes this access is restricted, so you can only connect, when you are within the university/facility and its network. To still access the Login Nodes externally, one can 'pretend to be inside the network' by using a [[VPN|Virtual Private Network (VPN)]].<br />
<br />
Once there, the user can interact with the system and run (small) programs to generally test the system/software.<br />
<br />
== [[File_Transfer|File Transfer]] or "How-to-get-your-data-onto-or-off-the-supercomputer" ==<br />
To get your data (files) onto the supercomputer or back to your local machine, there are usually different ways. Sometimes there are computers specifically reserved for this purpose called [[Nodes#Copy|copy nodes]].<br />
<br />
If available to you, it is recommened to use these copy nodes to move data to or from the supercomputer, since this will result in a better connection and disturb other users less. Additionally the tools mentioned below might only work on these nodes. If there are no dedicated copy nodes, you can usually use the [[Nodes#Login|Login Nodes]] for this purpose.<br />
<br />
Commonly used and widely supported copying tools are [[rsync]] which mirrors directories (folders) between the supercomputer and your local machine. [[scp]] which is useful for a few single files or specified file-lists, and lastly the commonly used [[ftp]] or the encrypted version sftp (or ftps).<br />
A little bit more information can be found in the [[File_Transfer|File Transfer]] article.<br />
<br />
== [[Scheduler|Schedulers]] or "How-To-Run-Applications-on-a-supercomputer" ==<br />
To run any significant program or workload on a supercomputer, generally [[Scheduler|schedulers]] are employed. Alongside the above-mentioned Login Nodes there are usually far more Backend Nodes in the system (computers exclusively reserved for computing, to which you cannot connect directly). A program called scheduler decides who gets how many of those compute resources for which amount of time. Please use the Backend Nodes for everything which is not a simple small test and only runs for a few minutes., otherwise you will block the Login Nodes for everybody when you run your calculations there. These Backend Nodes make up more than 98% of a supercomputer and can only be accessed via the scheduler.<br />
<br />
When you log into a supercomputer, you can run commands on the Login Nodes interactively. You type, you hit return, the command gets executed. Schedulers work differently: You submit a series of commands (in form of a file) and tell it, how long you think these commands will take (also in the file) and how much resources it will approximately need in terms of:<br />
<br />
* time: If the specified time runs out, before your application finishes and exits, it will be terminated by the scheduler.<br />
* compute resources: how many cpus ('calculation thingies'), sockets ('cpu-houses') and nodes ('computers')<br />
* memory resources: how much RAM ('very fast memory, similar to the few books you have at home')<br />
<br />
If later compute resources become free, which match the requirements of your application, the scheduler will run your specified commands on the requested hardware. This is usually delayed (sometimes you have to wait a day or two) and not instant, because other users are currently using the compute resources and you have to wait until their program runs finish. Furthermore you cannot change the series of commands after submitting, but just terminate the 'job' and submit a new one in case of an error.<br />
<br />
The file specifying this series of commands and the required resources is called a [[jobscript]]. Its format and syntax depends on the installed scheduler. When you have this jobscript ready with the help of [[jobscript-examples]], colleagues or your local [[Support]], you can submit it to the respective [[Schedulers|scheduler of your facility]]. The Scheduler then waits until a set of nodes (computers) are free and later allocates those to execute your 'job' as soon as possible. Sometimes there is (an optional) email notification, which is send when your job starts execution/finished running.<br />
<br />
Be aware that your specified requirements have to fit within the boundaries of the system of your facility. If you ask for more than there is, chances are, the scheduler will accept your job and wait until missing hardware is bought and installed - although this will not happen in 99.9% of cases. Information over the available hardware can be found in the [https://gauss-allianz.de/de/hpc-ecosystem Overview of the Gauss Allianz] or the [[Site-specific_documentation|documentation of the different sites]]. You can find more information about [[Getting_Started#Parallel_Programming_or_.22How-To-Use-More-Than-One-Core.22|parallelizing programs here]]. Also there is an [[Schedulers|overview of the schedulers used at the different sites]].<br />
<br />
== [[Modules|Modules]] or "How-To-Use-Software-Without-installing-everything-yourself" ==<br />
Since a lot of applications rely on 3rd party software, there is a program on most supercomputers, called the [[Modules|Module system]]. With this system, other software, like compilers or special math libraries, are easily loadable and usable. Depending on the institution, different modules might be available, but there are usually common ones like the [[Compiler#Intel_Compiler|Intel]] or [[Compiler#Gnu_Compiler_Collection|GCC]] [[Compiler|Compilers]].<br />
<br />
A few common commands, to enter into the supercomputer commandline and talk to the module system, are <br />
{| class="wikitable" style="width: 40%;"<br />
| module list || lists loaded modules<br />
|-<br />
| module avail || lists available (loadable) modules<br />
|-<br />
| module load/unload x || loads/unloads module x<br />
|-<br />
| module switch x y || switches out module x for module y<br />
|}<br />
<br />
If you recurrently need lots of modules, this loading can be automated with an [[sh-file]], so that you just have to execute the file once and it loads all modules, you need.<br />
<br />
== [[Parallel_Programming|Parallel Programming]] or "How-To-Use-More-Than-One-Core" ==<br />
Currently development of computers is at a point, where you cannot just make a processor run faster, because the semiconductor physics simply do not work that way. Therefore the current approach is to split the work into multiple, ideally independent parts, which are then executed in parallel. Similar to cleaning your house, where everybody takes care of a few rooms, on a supercomputer this is usually done with parallel programming paradigms like [[OpenMP|Open Multi-Processing (OpenMP)]] or [[MPI|Message Passing Interface (MPI)]]. However like the fact that you only have one vacuum cleaner in the whole house which not everybody can use at the same time, there are limits on how fast you can get, even with a big number of processing units/cpus/cores (analogous to people in the metaphor) working on your problem (cleaning the house) in parallel.</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Getting_Started&diff=893Getting Started2018-04-16T13:16:33Z<p>Christian-wassermann-e30b@rwth-aachen.de: /* Schedulers or "How-To-Run-Applications-on-a-supercomputer" */</p>
<hr />
<div>== [[Access|Access]] or "How-to-be-allowed-onto-the-supercomputer" ==<br />
Depending on the specific supercomputer, one has to either register to get a user account or write a project proposal and apply for computing resources that way. The respective pages are linked in [[Access|this overview]].<br />
<br />
After this is done and login credentials are supplied, one can proceed to [[ Getting_Started#Login_or_.22How-to-now-actually-connect-to-the-supercomputer.22 | login ]].<br />
<br />
== [[Nodes#Login|Login]] or "How-to-now-actually-connect-to-the-supercomputer" ==<br />
Most HPC Systems are unix-based environments with [[shell]] (commandline) access.<br />
<br />
To log in, one usually uses [[ssh]] to reach the respective [[Nodes#Login|Login Nodes]] (computers reserved for people just like you that want to connect to the supercomputer). Sometimes this access is restricted, so you can only connect, when you are within the university/facility and its network. To still access the Login Nodes externally, one can 'pretend to be inside the network' by using a [[VPN|Virtual Private Network (VPN)]].<br />
<br />
Once there, the user can interact with the system and run (small) programs to generally test the system/software.<br />
<br />
== [[File_Transfer|File Transfer]] or "How-to-get-your-data-onto-or-off-the-supercomputer" ==<br />
To get your data (files) onto the supercomputer or back to your local machine, there are usually different ways. Sometimes there are computers specifically reserved for this purpose called [[Nodes#Copy|copy nodes]].<br />
<br />
If available to you, it is recommened to use these copy nodes to move data to or from the supercomputer, since this will result in a better connection and disturb other users less. Additionally the tools mentioned below might only work on these nodes. If there are no dedicated copy nodes, you can usually use the [[Nodes#Login|Login Nodes]] for this purpose.<br />
<br />
Commonly used and widely supported copying tools are [[rsync]] which mirrors directories (folders) between the supercomputer and your local machine. [[scp]] which is useful for a few single files or specified file-lists, and lastly the commonly used [[ftp]] or the encrypted version sftp (or ftps).<br />
A little bit more information can be found in the [[File_Transfer|File Transfer]] article.<br />
<br />
== [[Scheduler|Schedulers]] or "How-To-Run-Applications-on-a-supercomputer" ==<br />
To run any significant program or workload on a supercomputer, generally [[Scheduler|schedulers]] are employed. Alongside the above-mentioned Login Nodes there are usually far more Backend Nodes in the system (computers exclusively reserved for computing, to which you cannot connect directly). A program called [[Scheduler|scheduler]] decides who gets how many of those compute resources for which amount of time. Please use the [[Scheduler|scheduler]] for everything which is not a simple small test and only runs for a few minutes. More than 98% of the power of a supercomputer can only be accessed via the [[Scheduler|scheduler]] and you will block the Login Nodes for everybody when you run your calculations there :/<br />
<br />
When you log into a supercomputer, you can run commands on the Login Nodes interactively. You type, you hit return, the command gets executed. [[Scheduler|Schedulers]] work differently: You submit a series of commands (in form of a file) and tell it, how long you think these commands will take (also in the file) and how much resources it will approximately need in terms of:<br />
<br />
* time: If the specified time runs out, before your application finishes and exits, it will be terminated by the scheduler.<br />
* compute resources: how many cpus ('calculation thingies'), sockets ('cpu-houses') and nodes ('computers')<br />
* memory resources: how much RAM ('very fast memory, similar to the few books you have at home')<br />
<br />
If later compute resources become free, which match the requirements of your application, the [[Scheduler|scheduler]] will run your specified commands on the requested hardware. This is usually delayed (sometimes you have to wait a day or two) and not instant, because other users are currently using the compute resources and you have to wait until their program runs finish. Furthermore you cannot change the series of commands after submitting, but just terminate the 'job' and submit a new one in case of an error.<br />
<br />
The file specifying this series of commands and the required resources is called a [[Jobscript]]. Its format and syntax depends on the installed scheduler. When you have this [[Jobscript|jobscript]] ready with the help of [[jobscript-examples]], colleagues or your local [[Support]], you can submit it to the respective [[Schedulers|scheduler of your facility]]. The Scheduler then waits until a set of nodes (computers) are free and later allocates those to execute your 'job' as soon as possible. Sometimes there is (an optional) email notification, which is send when your job starts execution/finished running.<br />
<br />
Be aware that your specified requirements have to fit within the boundaries of the system of your facility. If you ask for more than there is, chances are, the scheduler will accept your job and wait until missing hardware is bought and installed - although this will not happen in 99.9% of cases. Information over the available hardware can be found in the [https://gauss-allianz.de/de/hpc-ecosystem Overview of the Gauss Allianz] or the [[Site-specific_documentation|documentation of the different sites]]. You can find more information about [[Getting_Started#Parallel_Programming_or_.22How-To-Use-More-Than-One-Core.22|parallelizing programs here]]. Also there is an [[Schedulers|Overview of the schedulers of the different sites]]<br />
<br />
== [[Modules|Modules]] or "How-To-Use-Software-Without-installing-everything-yourself" ==<br />
Since a lot of applications rely on 3rd party software, there is a program on most supercomputers, called the [[Modules|Module system]]. With this system, other software, like compilers or special math libraries, are easily loadable and usable. Depending on the institution, different modules might be available, but there are usually common ones like the [[Compiler#Intel_Compiler|Intel]] or [[Compiler#Gnu_Compiler_Collection|GCC]] [[Compiler|Compilers]].<br />
<br />
A few common commands, to enter into the supercomputer commandline and talk to the module system, are <br />
{| class="wikitable" style="width: 40%;"<br />
| module list || lists loaded modules<br />
|-<br />
| module avail || lists available (loadable) modules<br />
|-<br />
| module load/unload x || loads/unloads module x<br />
|-<br />
| module switch x y || switches out module x for module y<br />
|}<br />
<br />
If you recurrently need lots of modules, this loading can be automated with an [[sh-file]], so that you just have to execute the file once and it loads all modules, you need.<br />
<br />
== [[Parallel_Programming|Parallel Programming]] or "How-To-Use-More-Than-One-Core" ==<br />
Currently development of computers is at a point, where you cannot just make a processor run faster, because the semiconductor physics simply do not work that way. Therefore the current approach is to split the work into multiple, ideally independent parts, which are then executed in parallel. Similar to cleaning your house, where everybody takes care of a few rooms, on a supercomputer this is usually done with parallel programming paradigms like [[OpenMP|Open Multi-Processing (OpenMP)]] or [[MPI|Message Passing Interface (MPI)]]. However like the fact that you only have one vacuum cleaner in the whole house which not everybody can use at the same time, there are limits on how fast you can get, even with a big number of processing units/cpus/cores (analogous to people in the metaphor) working on your problem (cleaning the house) in parallel.</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=HPC-Dictionary&diff=892HPC-Dictionary2018-04-16T12:30:28Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== Unix ==<br />
<br />
TODO<br />
<br />
== Unix File System ==<br />
<br />
TODO<br />
<br />
== Node ==<br />
<br />
TODO<br />
<br />
=== Login Node ===<br />
<br />
TODO<br />
<br />
=== Copy Node ===<br />
<br />
TODO<br />
<br />
=== Backend Node ===<br />
<br />
TODO</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Getting_Started&diff=891Getting Started2018-04-16T12:28:31Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>== [[Access|Access]] or "How-to-be-allowed-onto-the-supercomputer" ==<br />
Depending on the specific supercomputer, one has to either register to get a user account or write a project proposal and apply for computing resources that way. The respective pages are linked in [[Access|this overview]].<br />
<br />
After this is done and login credentials are supplied, one can proceed to [[ Getting_Started#Login_or_.22How-to-now-actually-connect-to-the-supercomputer.22 | login ]].<br />
<br />
== [[Nodes#Login|Login]] or "How-to-now-actually-connect-to-the-supercomputer" ==<br />
Most HPC Systems are unix-based environments with [[shell]] (commandline) access.<br />
<br />
To log in, one usually uses [[ssh]] to reach the respective [[Nodes#Login|Login Nodes]] (computers reserved for people just like you that want to connect to the supercomputer). Sometimes this access is restricted, so you can only connect, when you are within the university/facility and its network. To still access the Login Nodes externally, one can 'pretend to be inside the network' by using a [[VPN|Virtual Private Network (VPN)]].<br />
<br />
Once there, the user can interact with the system and run (small) programs to generally test the system/software.<br />
<br />
== [[File_Transfer|File Transfer]] or "How-to-get-your-data-onto-or-off-the-supercomputer" ==<br />
To get your data (files) onto the supercomputer or back to your local machine, there are usually different ways. Sometimes there are computers specifically reserved for this purpose called [[Nodes#Copy|copy nodes]].<br />
<br />
If available to you, it is recommened to use these copy nodes to move data to or from the supercomputer, since this will result in a better connection and disturb other users less. Additionally the tools mentioned below might only work on these nodes. If there are no dedicated copy nodes, you can usually use the [[Nodes#Login|Login Nodes]] for this purpose.<br />
<br />
Commonly used and widely supported copying tools are [[rsync]] which mirrors directories (folders) between the supercomputer and your local machine. [[scp]] which is useful for a few single files or specified file-lists, and lastly the commonly used [[ftp]] or the encrypted version sftp (or ftps).<br />
A little bit more information can be found in the [[File_Transfer|File Transfer]] article.<br />
<br />
== [[Scheduler|Schedulers]] or "How-To-Run-Applications-on-a-supercomputer" ==<br />
To run any significant program or workload on a supercomputer, generally [[Scheduler|schedulers]] are employed. Alongside the above-mentioned Login Nodes there are usually far more Backend Nodes in the system (computers exclusively reserved for computing, to which you cannot connect directly). A programm called [[Scheduler|scheduler]] decides who gets how many of those compute resources for which amount of time. Please use the [[Scheduler|scheduler]] for everything which is not a simple small test and only runs for a few minutes. More than 98% of the power of a supercomputer can only be accessed via the [[Scheduler|scheduler]] and you will block the Login Nodes for everybody when you run your calculations there :/<br />
<br />
When you log into a supercomputer, you can run commands on the Login Nodes interactively. You type, you hit return, the command gets executed. [[Scheduler|Schedulers]] work differently: You submit a series of commands (in form of a file) and tell it, how long you think these commands will take (also in the file) and how much resources it will approximately need in terms of:<br />
<br />
* time: If the specified time runs out, before your application finishes and exits, it will be terminated by the scheduler.<br />
* compute resources: how many cpus ('calculation thingies'), sockets ('cpu-houses') and nodes ('computers')<br />
* memory resources: how much RAM ('very fast memory, similar to the few books you have at home')<br />
<br />
If later compute resources become free, which match the requirements of your application, the [[Scheduler|scheduler]] will run your specified commands on the requested hardware. This is usually delayed (sometimes you have to wait a day or two) and not instant, because other users are currently using the compute resources and you have to wait until their program runs finish. Furthermore you cannot change the series of commands after submitting, but just terminate the 'job' and submit a new one in case of an error.<br />
<br />
The file specifying this series of commands and the required resources is called a [[Jobscript]]. Its format and syntax depends on the installed scheduler. When you have this [[Jobscript|jobscript]] ready with the help of [[jobscript-examples]], colleagues or your local [[Support]], you can submit it to the respective [[Schedulers|scheduler of your facility]]. The Scheduler then waits until a set of nodes (computers) are free and later allocates those to execute your 'job' as soon as possible. Sometimes there is (an optional) email notification, which is send when your job starts execution/finished running.<br />
<br />
Be aware that your specified requirements have to fit within the boundaries of the system of your facility. If you ask for more than there is, chances are, the scheduler will accept your job and wait until missing hardware is bought and installed - although this will not happen in 99.9% of cases. Information over the available hardware can be found in the [https://gauss-allianz.de/de/hpc-ecosystem Overview of the Gauss Allianz] or the [[Site-specific_documentation|documentation of the different sites]]. You can find more information about [[Getting_Started#Parallel_Programming_or_.22How-To-Use-More-Than-One-Core.22|parallelizing programs here]]. Also there is an [[Schedulers|Overview of the schedulers of the different sites]]<br />
<br />
== [[Modules|Modules]] or "How-To-Use-Software-Without-installing-everything-yourself" ==<br />
Since a lot of applications rely on 3rd party software, there is a program on most supercomputers, called the [[Modules|Module system]]. With this system, other software, like compilers or special math libraries, are easily loadable and usable. Depending on the institution, different modules might be available, but there are usually common ones like the [[Compiler#Intel_Compiler|Intel]] or [[Compiler#Gnu_Compiler_Collection|GCC]] [[Compiler|Compilers]].<br />
<br />
A few common commands, to enter into the supercomputer commandline and talk to the module system, are <br />
{| class="wikitable" style="width: 40%;"<br />
| module list || lists loaded modules<br />
|-<br />
| module avail || lists available (loadable) modules<br />
|-<br />
| module load/unload x || loads/unloads module x<br />
|-<br />
| module switch x y || switches out module x for module y<br />
|}<br />
<br />
If you recurrently need lots of modules, this loading can be automated with an [[sh-file]], so that you just have to execute the file once and it loads all modules, you need.<br />
<br />
== [[Parallel_Programming|Parallel Programming]] or "How-To-Use-More-Than-One-Core" ==<br />
Currently development of computers is at a point, where you cannot just make a processor run faster, because the semiconductor physics simply do not work that way. Therefore the current approach is to split the work into multiple, ideally independent parts, which are then executed in parallel. Similar to cleaning your house, where everybody takes care of a few rooms, on a supercomputer this is usually done with parallel programming paradigms like [[OpenMP|Open Multi-Processing (OpenMP)]] or [[MPI|Message Passing Interface (MPI)]]. However like the fact that you only have one vacuum cleaner in the whole house which not everybody can use at the same time, there are limits on how fast you can get, even with a big number of processing units/cpus/cores (analogous to people in the metaphor) working on your problem (cleaning the house) in parallel.</div>Christian-wassermann-e30b@rwth-aachen.dehttps://hpc-wiki.info/hpc/index.php?title=Ssh_keys&diff=890Ssh keys2018-04-16T12:24:29Z<p>Christian-wassermann-e30b@rwth-aachen.de: </p>
<hr />
<div>An ssh key is a way of identifying (authenticating) yourself when connecting to a server via [[ssh]]. A different popular authentication method is entering a password.<br />
<br />
== Why should I use it?==<br />
When you connect to a server, authenticating via a password there are two main problems:<br />
* Someone could bruteforce or guess your password, since many passwords are commonly weak or used for multiple applications.<br />
* Someone could intercept your password, since it has to be send to the server at some point in some form.<br />
<br />
== How-to-use-it ==<br />
<br />
=== Generate a key ===<br />
You should start by generating a key pair:<br />
$ ssh-keygen -b 4096<br />
where you can specify the max length of the key up to 16384 bits.<br />
<br />
You can then optionally protect your key with a passphrase. (Your key is basically just a file sitting on your computer and a passphrase protects your key, if someone happens to steal/copy that file).<br />
<br />
If you did not specify a different file, the key normaly gets generated into the folder <br />
~/.ssh<br />
with the files '''id_rsa''' being your private and '''id_rsa.pub''' being your public key.<br />
<br />
=== Copy the public key to the server ===<br />
<br />
==== Method A ====<br />
<br />
This public key now has to be copied to the server into the <code>~/.ssh/authorized_keys</code> file. This can be done, by opening an [[ssh]] connection via password and then using an editor (e.g. [[vim]]) to paste the key into the file (creating the '''.ssh''' directory beforehand if it does not exist):<br />
$ mkdir -p ~/.ssh<br />
$ vim ~/.ssh/authorized_keys<br />
<br />
==== Method B ====<br />
<br />
Instead of performing the copying of the ssh key to the server manually, you can use the program <code>ssh-copy-id</code> to achieve the same goal:<br />
$ ssh-copy-id <username>@<remote-host><br />
where <code><username></code> is your username on the remote host <remote-host>. You will be prompted for your password and the program will manage the rest.<br />
<br />
Regardless of the method used, the next time you [[ssh]] to the server, it should use the key and instead of prompting for the user's pass'''word''', prompt for the pass'''phrase''' of the key (if you chose to employ one).<br />
<br />
=== Troubleshooting ===<br />
<br />
If it still asks for your password, something went wrong. In that case you should check, whether the '''authorized_keys''' file really contains the key by executing:<br />
$ cat ~/.ssh/authorized_keys<br />
on the '''server''' and <br />
$ cat ~/.ssh/id_rsa.pub<br />
on '''your local machine'''. If the key of your local machine is not contained in the '''authorized_keys''' on the server, repeat the steps of copying the key to the server.<br />
<br />
You should also make sure, that correct file access permission are set. If unsure, execute:<br />
$ chmod 600 ~/.ssh/id_rsa<br />
$ chmod 640 ~/.ssh/id_rsa.pub<br />
<br />
== How-it-works ==<br />
<br />
The basic principle is that of asymmetrical cryptography being employed by using a public-private-key-pair. A public key is like an indestructible piggy bank: Everybody can put something (data) into it, but nobody can get it back out again. A private key is the key for this. In this way you can distribute all the piggy banks you like and if someone put something in there and sends it back, only you can open it with your private key.<br />
<br />
Now since you gave the server the public key (=piggy bank), it can encrypt something (say a random number), send it back and only you can decrypt this number, since only you have the private key.<br />
<br />
For more detailed information on how this works, head over to the References.<br />
<br />
== References ==<br />
<br />
[https://wiki.archlinux.de/title/SSH-Authentifizierung_mit_Schl%C3%BCsselpaaren SSH keys on the archlinux wiki]<br />
<br />
[https://medium.com/@vrypan/explaining-public-key-cryptography-to-non-geeks-f0994b3c2d5 Public and private keys easily explained]<br />
<br />
[https://www.digitalocean.com/community/tutorials/understanding-the-ssh-encryption-and-connection-process More detailed explanation of the connection and encryption process of ssh]</div>Christian-wassermann-e30b@rwth-aachen.de