Difference between revisions of "Admin Guide Data Transfer with Globus Online"
(Author: Marcel Rodekamp)
Revision as of 15:15, 2 November 2020
This article shows how to handle Globus as a tool to transfer large amounts of data between servers. It is described here with the example of the Bielefeld cluster system.
GridFTP (Globus Toolkit)
Globus is a tool for transferring large amounts of data from server to server without having to stay logged in and monitor the transfer. To use Globus, create an account at https://globus.org and connect the Bielefeld cluster's endpoint to it. To find the Bielefeld cluster endpoint, search for
phyadmin#influx1
and connect via your local Bielefeld username/password combination.
Once the Bielefeld cluster and a second endpoint are set up, you can start a transfer between them either via the web interface or from the command line:
globus transfer [OPTIONS] SOURCE_ENDPOINT_ID[:SOURCE_PATH] DEST_ENDPOINT_ID[:DEST_PATH]
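As a minimal sketch of how the synopsis above is filled in, the following helper assembles a transfer command and prints it for inspection instead of running it. The endpoint IDs, the paths, and the label `bielefeld-to-juelich` are hypothetical placeholders, and the helper name `build_transfer_cmd` is not part of the Globus CLI. The `--recursive` option is needed when the source is a directory.

```shell
# Hypothetical placeholders: replace with real endpoint UUIDs and paths.
SRC_EP="<source-endpoint-id>"
DST_EP="<dest-endpoint-id>"

# Print the transfer command so it can be checked before it is run.
# $1 = source endpoint, $2 = source path, $3 = dest endpoint, $4 = dest path
build_transfer_cmd() {
    printf 'globus transfer --recursive --label bielefeld-to-juelich %s:%s %s:%s\n' \
        "$1" "$2" "$3" "$4"
}

build_transfer_cmd "$SRC_EP" "/path/to/data/" "$DST_EP" "/path/to/target/"
```

Once the printed command looks right, it can be typed or pasted into a shell where `globus login` has already been completed.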
You can also connect your local machine to Globus. To do so, download and extract globusconnectpersonal:
wget https://downloads.globus.org/globus-connect-personal/linux/stable/globusconnectpersonal-latest.tgz
tar -xzf globusconnectpersonal-latest.tgz
and install the Globus CLI (command line interface, requires Python):
pip install --upgrade --user globus-cli
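Because `pip install --user` places executables in the user base `bin` directory, the `globus` command may not be found if that directory is not on your `PATH`. The following sketch checks for it; the function name `check_globus_cli` is made up for illustration, and `~/.local/bin` is only the usual Linux default, so your location may differ.

```shell
# Report whether the globus executable can be found on PATH.
check_globus_cli() {
    if command -v globus >/dev/null 2>&1; then
        echo "globus CLI found at $(command -v globus)"
    else
        echo "globus CLI not found on PATH"
        return 1
    fi
}

# Assumption: ~/.local/bin is the user base bin directory on this machine.
check_globus_cli || echo "hint: add the user base bin directory (often ~/.local/bin) to PATH"
```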
The CLI needs to be linked to your account before most commands can be used, so log in with
globus login
and follow the instructions. Using the CLI, you can then create a local endpoint:
$ globus endpoint create --personal my-linux-laptop
Message: Endpoint created successfully
Endpoint ID: <endpoint-id>
Setup Key: <setup-key>
and add it to your account in the web interface using the <endpoint-id>. Then set up and start your new endpoint from the extracted globusconnectpersonal directory:
./globusconnectpersonal -setup <setup-key>
./globusconnectpersonal -start &
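To confirm that the endpoint is running after the start command, a status query can help. The sketch below assumes that `globusconnectpersonal` accepts a `-status` option and that `GCP_DIR` points at the extracted directory; the default path used here is a guess, since the real directory name includes a version number. The wrapper `gcp_status` is a hypothetical helper, not part of Globus.

```shell
# Assumption: GCP_DIR is the directory created by extracting the tarball.
GCP_DIR="${GCP_DIR:-$HOME/globusconnectpersonal}"

# Query the endpoint status, or report that the executable is missing.
# $1 = directory containing the globusconnectpersonal executable
gcp_status() {
    if [ -x "$1/globusconnectpersonal" ]; then
        "$1/globusconnectpersonal" -status
    else
        echo "globusconnectpersonal not found in $1"
        return 1
    fi
}

gcp_status "$GCP_DIR" || true
```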
Globus example: transfer files between Bielefeld and Jülich
Although judac is already a GridFTP server, the easiest way to connect it to Globus is via a personal endpoint if you do not have the required Grid certificates.
To do so, download globusconnectpersonal to judac just as you did above for your local machine. You can then create an endpoint via the Globus web interface (recommended) or via the Globus CLI (discouraged), for which you would first need to install pip:
wget https://bootstrap.pypa.io/get-pip.py
python get-pip.py --user
GLOBUS_CLI_INSTALL_DIR="$(python -c 'import site; print(site.USER_BASE)')/bin"
echo "GLOBUS_CLI_INSTALL_DIR=$GLOBUS_CLI_INSTALL_DIR"
export PATH="$GLOBUS_CLI_INSTALL_DIR:$PATH"
echo 'export PATH="'"$GLOBUS_CLI_INSTALL_DIR"':$PATH"' >> "$HOME/.bashrc"
pip install --upgrade --user setuptools
pip install --upgrade --user globus-cli
You will probably want to make more directories accessible than just your /home, which is the only one available by default. To do so, edit .globusonline/lta/config-paths in your home directory to list all the directories you want to connect. For example, for Jülich:
~/,0,1
/p/project/chbi18,0,1
/p/scratch/chbi18,0,1
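The same file can also be generated from a list of directories, as in the sketch below. It assumes that each line has the form `<path>,<sharing flag>,<read/write flag>`, so `0,1` means sharing off and writing allowed. The helper `write_config_paths` and the scratch file name `config-paths.example` are hypothetical; the output goes to a scratch file first so that an existing configuration is not overwritten by accident.

```shell
# Assumption: line format is "<path>,<sharing flag>,<read/write flag>".
# Written to a scratch file; copy it to ~/.globusonline/lta/config-paths
# once the content looks right.
CONFIG="${CONFIG:-./config-paths.example}"

# $1 = output file, remaining arguments = directories to expose read/write
write_config_paths() {
    out="$1"
    shift
    : > "$out"                       # start from an empty file
    for dir in "$@"; do
        printf '%s,0,1\n' "$dir" >> "$out"
    done
}

write_config_paths "$CONFIG" '~/' /p/project/chbi18 /p/scratch/chbi18
cat "$CONFIG"
```

If the endpoint is already running, it will probably need to be stopped and started again before the new paths are picked up.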
You can then set up and start an endpoint just as above and submit a transfer via the web interface.
Further information can be found here: