Directories and filesystems

  • The fermi machines provide a wide set of filesystems so that users can access their data and applications.
  • The filesystems available on the login machines are:

# df -h
Filesystem              Size  Used Avail Use% Mounted on
nfs01:/exports/exper-sw 2.8T  2.0T 767G  73% /exper-sw                  ---> NFS fs for local software
se16:/ams             200G   33M   200G   1% /home/ams                  ---> NFS fs LIP homes and data
se16:/comp            600G  6.8G   594G   2% /home/comp                 ---> NFS fs LIP homes and data
se16:/cosmo           600G  8.8G   591G   2% /home/cosmo                ---> NFS fs LIP homes and data
se16:/feno            200G   33M   200G   1% /home/feno                 ---> NFS fs LIP homes and data
se16:/nucria          600G   33M   600G   1% /home/nucria               ---> NFS fs LIP homes and data
se16:/pet             200G   33M   200G   1% /home/pet                  ---> NFS fs LIP homes and data
se16:/sno             200G  8.2G   192G   5% /home/sno                  ---> NFS fs LIP homes and data
se16:/t3atlas         200G  151G   49G   76% /home/t3atlas              ---> NFS fs LIP homes and data
se16:/t3cms           200G   29G   172G  15% /home/t3cms                ---> NFS fs LIP homes and data
se17:/x               4.6T  4.4T   165G  97% /x                         ---> NFS fs LIP ATLAS users
mdt02@tcp:/t3atlas     61T   24T   34T   42% /gstore/t3atlas            ---> Lustre FS  Tier-3 LIP ATLAS users 
mdt02@tcp:/T3CMS       81T   11T   66T   15% /gstore/t3cms              ---> Lustre FS  Tier-3 LIP CMS   users
mdt03@tcp:/calo        72T   62T   6.6T  91% /lstore/calo               ---> Lustre FS for ATLAS users
mdt02@tcp:/cmslocal   6.4T  5.6T   541G  92% /lstore/cms                ---> Lustre FS for CMS   users
mdt03@tcp:/comp        41T  9.0T   30T   24% /lstore/comp               ---> Lustre FS for COMP  users
mdt03@tcp:/sno         10T  6.4T   3.2T  67% /lstore/sno                ---> Lustre FS for SNO   users
se27:/ams              11T  5.9T   4.2T  59% /z/ams                     ---> NFS    FS for AMS   users
se27:/comp             11T  3.8T   6.3T  38% /z/comp                    ---> NFS    FS for COMP  users
se16:/csys            200G   33M   200G   1% /home/csys                 ---> NFS fs LIP homes and data
mdt04@tcp:/auger       36T   22T   12T   65% /lstore/auger              ---> Lustre FS for AUGER users
mdt04@tcp:/hpclip     6.4T  546M   6.0T   1% /lstore/hpclip             ---> Lustre FS for HPC   users
mdt04@tcp:/lattes     5.9T   86G   5.5T   2% /lstore/lattes             ---> Lustre FS for LATTES users
mdt04@tcp:/pet        461G  115G   323G  27% /lstore/pet                ---> Lustre FS for PET   users
cvmfs2                 20G   17G   3.3G  84% /cvmfs/cms.cern.ch         ---> CVMFS  FS for CMS   users
cvmfs2                 20G  6.1G   14G   32% /cvmfs/atlas.cern.ch       ---> CVMFS  FS for ATLAS users

Data Management

  • At LIP-Lisbon, the home filesystem is not shared between the submission hosts and the execution hosts. As a result, it is the user's responsibility to transfer data and applications to/from the execution machines.
  • There are several ways to manage data in the LIP-Lisbon farm:
    1. Automatic transfers via scp
    2. Data access via /hometmp (NFS)
    3. Data access via /lustre

Automatic transfers via scp

  • SCOPE: This is the most appropriate method to transfer a small number of small files.

  • The automatic transfer of data and applications via scp is triggered by declaring the files (or directories) to transfer in dedicated environment variables set in the submission script.
    • SGEIN{1...N}: Define one variable for each file or directory to transfer from the submission machine to the execution machine

    • SGEOUT{1...N}: Define one variable for each file or directory to transfer from the execution machine to the submission machine

# Transfer input file (MyMacro.c) to the execution machine
#$ -v SGEIN1=MyMacro.c

# Transfer output file (graph_with_law.pdf) from the execution machine
#$ -v SGEOUT1=graph_with_law.pdf
  • The full syntax for scp automatic transfers is described hereafter. Keep in mind that all paths should be relative to the current working directory (where you are submitting the job):

# My input file is called input_file1.txt and it will keep the same name on the execution host
#$ -v SGEIN1=input_file1.txt

# My input file is called input_file2.txt but it will be called inputfile2.txt on the execution host
#$ -v SGEIN2=input_file2.txt:inputfile2.txt

# My input is a full directory (the directory INPUT3 must exist on the submission host)
#$ -v SGEIN3=INPUT3

# My input is the file INPUT4/input_file4.txt, and it will exist on the execution host as INPUT4/inputfile4.txt
#$ -v SGEIN4=INPUT4/input_file4.txt:INPUT4/inputfile4.txt

# My input is the directory INPUT5 and it will be called INPUT_AT_WORKERNODE1 on the execution host
#$ -v SGEIN5=INPUT5:INPUT_AT_WORKERNODE1

# My input is the file INPUT6/input_file6.txt, and it will exist on the execution host as INPUT_AT_WORKERNODE2/inputfile6.txt
#$ -v SGEIN6=INPUT6/input_file6.txt:INPUT_AT_WORKERNODE2/inputfile6.txt

# My input is the directory INPUT7, which will be transferred to the execution host as the directory tree
#    INPUT_AT_WORKERNODE3/INPUT_AT_WORKERNODE4
#$ -v SGEIN7=INPUT7:INPUT_AT_WORKERNODE3/INPUT_AT_WORKERNODE4

# My input is the file INPUT8/input_file8.txt, which will be transferred to the execution host
#    as INPUT_AT_WORKERNODE5/INPUT_AT_WORKERNODE6/inputfile8.txt
#$ -v SGEIN8=INPUT8/input_file8.txt:INPUT_AT_WORKERNODE5/INPUT_AT_WORKERNODE6/inputfile8.txt
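
  • Putting it all together, the sketch below shows what a complete submission script could look like when combining SGEIN/SGEOUT transfers with the job itself (submitted, for example, with qsub myjob.sh). The macro name MyMacro.c, the output name graph_with_law.pdf and the root command are illustrative assumptions only:

#!/bin/bash
# Illustrative sketch: ship MyMacro.c to the execution host, run it there,
# and bring graph_with_law.pdf back to the submission host
#$ -v SGEIN1=MyMacro.c
#$ -v SGEOUT1=graph_with_law.pdf

# MyMacro.c has already been copied to the job's working directory on the
# execution host by the SGEIN mechanism (assumes ROOT is available there,
# e.g. via /exper-sw or /cvmfs)
root -l -b -q MyMacro.c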

Data access via /hometmp (NFS)

  • SCOPE: The same input files and applications are used by multiple jobs.

  • If the same input files serve multiple jobs, users should store those files under the /hometmp directory, which is shared between the submission hosts and the execution hosts. This is more efficient than copying the same files over and over again.

  • Users can also use /hometmp to monitor the status of running jobs through, for example, dedicated log files. Check the following example:

#!/bin/bash

# Example /hometmp area (replace with your own group/user directory)
MY_HOMETMP=/hometmp/csys/goncalo

INPUT_FILE=input_file1.txt
OUTPUT_FILE=output_file1.txt
MyLOG=mylog.txt

# Write a progress log to /hometmp so it can be followed from the submission host
echo "Starting second test on `date`" > $MY_HOMETMP/$MyLOG

# Read the input from /hometmp, write the output locally, then move it to /hometmp
tr -s 'a-z' 'A-Z' < $MY_HOMETMP/$INPUT_FILE >> $OUTPUT_FILE
mv -f $OUTPUT_FILE $MY_HOMETMP/$OUTPUT_FILE

echo "Finishing second test on `date`" >> $MY_HOMETMP/$MyLOG
  • While the job is running, the user can check its status by consulting the mylog.txt log in /hometmp, as shown below.
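
  • For instance, from a submission host the log can be followed in real time (the path below matches the hypothetical example above):

tail -f /hometmp/csys/goncalo/mylog.txt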

Important Disclaimer
  • Users should be aware of the following issues:
    1. Be careful that files are not overwritten when writing to /hometmp, especially when submitting arrays of jobs (use distinct file names per task).
    2. Do not write OUTPUT results directly to /hometmp (its lock management mechanisms cause performance degradation). Write OUTPUT results to the local disk of the execution host instead, and copy them to /hometmp at the end of the job (see the sketch after this list).
    3. Data in /hometmp will be deleted after 30 days.
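
  • A minimal sketch of the pattern recommended in point 2, where all intermediate output stays on the local disk and only the final result is copied to /hometmp. The executable my_analysis and the paths are hypothetical:

#!/bin/bash
# Hypothetical sketch: my_analysis and the paths below are illustrative only
#$ -v SGEIN1=my_analysis

MY_HOMETMP=/hometmp/csys/goncalo   # replace with your own /hometmp area

# Keep all output on the local disk of the execution host while running
chmod +x my_analysis
./my_analysis > result.txt

# Copy the result to /hometmp only once, at the end of the job
cp -f result.txt $MY_HOMETMP/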

Data access via /lustre

  • SCOPE: Store and access large data files.

  • /lustre is a shared filesystem (mounted on both the execution hosts and the submission hosts) dedicated to the storage of large files; a usage sketch is given after the list below. The following directories are accessible to the local LIP groups:

    1. /lustre/lip.pt/data/calo
    2. /lustre/lip.pt/data/cosmo
    3. /lustre/lip.pt/data/pet
    4. /lustre/lip.pt/data/sno
  • Groups involved in WLCG transfer data to the following locations using grid technologies:
    1. ATLAS: /lustre/lip.pt/data/atlas/atlaslocalgroupdisk (calo group has read access to this filesystem)
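
  • A minimal usage sketch for a job that reads a large input directly from /lustre and stores its result back there. The sno subdirectory, the file names, the executable my_analysis and its options are illustrative assumptions only:

#!/bin/bash
# Hypothetical sketch: the executable is shipped with the scp mechanism
# described above; paths and options are illustrative only
#$ -v SGEIN1=my_analysis

LUSTRE_DIR=/lustre/lip.pt/data/sno/goncalo   # replace with your group/user area

# Large input files can be read directly from /lustre on the execution host
chmod +x my_analysis
./my_analysis --input $LUSTRE_DIR/big_input.dat --output result.dat

# Store the (large) result back on /lustre at the end of the job
cp -f result.dat $LUSTRE_DIR/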

Important Disclaimer
  • Manipulating large sets of small files degrades /lustre performance due to lock management overhead. Therefore, you should not:
    • Compile anything under /lustre

    • Store and access databases under /lustre