Differences

This shows you the differences between two versions of the page.

pub:hpc:hellbender [2025/01/31 15:36] – [Getting Started with Globus] bjmfg8
pub:hpc:hellbender [2025/04/14 20:07] (current) – [Storage: Research Data Ecosystem ('RDE')] bjmfg8
Line 119: Line 119:
 | Dell C6525 | 112    | 128        | 490 GB        | 1.6 TB          | 14336  | c001-c112  |
 
-**The 2024 pricing is: $2,702 per node per year.**
+**The 2025 pricing is: $2,702 per node per year.**
  
 ==== GPU Node Lease  ====
Line 129: Line 129:
 | Dell R740xa | 17     | 64         | 238 GB        | A100 | 80 GB      | 4     | 1.6 TB        | 1088
 
-**The 2024 pricing is: $7,692 per node per year.**
+**The 2025 pricing is: $7,692 per node per year.**
  
 ==== Storage: Research Data Ecosystem ('RDE') ====
Line 138: Line 138:
   * Storage lab allocations are protected by associated security groups applied to the share, with group member access administered by the assigned PI or appointed representative.
 
-**What is the Difference between High Performance and General Performance Storage?**
+**What is the Difference between High Performance and General Performance Storage? **
 
-On Pixstor, which is used for standard HPC allocations, general storage is pinned to the SAS disk pool while high performance allocations are pinned to all flash NVME pool.  Meaning writes and recent reads will have lower latency with HPC allocations.
+On Pixstor, which is used for standard HPC allocations, general storage is pinned to the SAS disk pool while high performance allocations are pinned to all flash NVME pool.  Meaning writes and recent reads will have lower latency with High performance allocations.
 
 On VAST, which is used for non HPC and mixed HPC / SMB workloads, the disks are all flash but general storage allocations have a QOS policy attached that limits IOPS to prevent the share from the possibility of saturating the disk pool to the point where high-performance allocations are impacted.  High Performance allocations may also have a QOS policy that allows for much higher IO and IOPS.  RSS reserves the right to move general store allocations to lower tier storage in the future if facing capacity constraints.
Line 153: Line 153:
   * Workloads that require sustained use of low latency read and write IO with multiple GB/s, generally generated from jobs utilizing multiple NFS mounts
 
+
+**Snapshots**
+
+  *VAST default policy retains 7 daily and 4 weekly snapshots for each share
+  *Pixstor default policy is 10 daily snapshots
  
 **__None of the cluster attached storage available to users is backed up in any way by us__**, this means that if you delete something and don't have a copy somewhere else, it is gone. Please note the data stored on cluster attached storage is limited to Data Class 1 and 2 as defined by [[https://www.umsystem.edu/ums/is/infosec/classification-definitions| UM System DCL]]. If you have need to store things in DCL3 or DCL4 please contact us so we may find a solution for you.
 
-**The 2024 pricing is: General Storage: $25/TB/Year, High Performance Storage: $95/TB/Year**
+**The 2025 pricing is: General Storage: $25/TB/Year, High Performance Storage: $95/TB/Year**
 
 To order storage please fill out our [[https://missouri.qualtrics.com/jfe/form/SV_6zkkwGYn0MGvMyO| RSS Services Order Form]]
Line 169: Line 174:
 **Costs**
 
-The cost associated with using the RDE tape archive is $8/TB for short term data kept in inside the tape library for 1-3 years or $140 per tape rounded to the number of tapes for tapes sent offsite for long term retention up to 10 years. We send these tapes off to record management where they are stored in a climate-controlled environment. Each tape from the current generation LTO 9 holds approximately 18TB of data These are flat onetime costs and you have the option to do both a short term in library copy, and a longer-term offsite copy, or one or the other, providing flexibility.
+The cost associated with using the RDE tape archive is $8/TB for short term data kept in inside the tape library for 1-3 years or $140 per tape rounded to the number of tapes for tapes sent offsite for long term retention up to 10 years. We send these tapes off to record management where they are stored in a climate-controlled environment. Each tape from the current generation LTO 9 holds approximately 18TB of data These are flat onetime costs, and you have the option to do both a short term in library copy, and a longer-term offsite copy, or one or the other, providing flexibility.
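
The pricing rules above can be turned into a quick estimate. The following is a minimal sketch, not an official RSS calculator: the three constants come from the text, while the function name, the assumption that offsite tapes are rounded up to whole tapes, and the assumption that the short term rate applies to the exact TB figure are ours.

```python
import math

SHORT_TERM_USD_PER_TB = 8    # from the text: in-library copy kept 1-3 years
OFFSITE_USD_PER_TAPE = 140   # from the text: offsite copy kept up to 10 years
TAPE_CAPACITY_TB = 18        # from the text: approximate LTO 9 capacity


def tape_archive_cost(data_tb, short_term=True, offsite=False):
    """Estimate the one-time archive cost in USD (hypothetical helper)."""
    cost = 0.0
    if short_term:
        # Assumption: billed per TB without rounding.
        cost += data_tb * SHORT_TERM_USD_PER_TB
    if offsite:
        # Assumption: "rounded to the number of tapes" means rounding up.
        tapes = math.ceil(data_tb / TAPE_CAPACITY_TB)
        cost += tapes * OFFSITE_USD_PER_TAPE
    return cost


# 40 TB with both copies: 40 * 8 = 320, plus 3 tapes * 140 = 420
print(tape_archive_cost(40, short_term=True, offsite=True))  # 740.0
```

Please confirm the actual charge with RSS before budgeting; the rounding behavior in particular is an assumption.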
  
 **Request Process**
Line 175: Line 180:
 To utilize the tape archive functionality that RSS has setup, the data to be archived will need to be copied to RDE storage if it does not exist there already. This would require the following steps.
   * Submit a RDE storage request if the data resides locally and a RDE share is not already available to the researcher: [[http://request.itrss.umsystem.edu|RSS Account Request Form]]
-  * Create an archive folder or folders in the relevant RDE storage share to hold the data you would like to archive. The folder(s) can be named to signify the contents, but we ask that the name includes _archive at then end. For example, something akin to: labname_projectx_archive_2024.
+  * Create an archive folder or folders in the relevant RDE storage share to hold the data you would like to archive. The folder(s) can be named to signify the contents, but we ask that the name includes _archive at the end. For example, something akin to: labname_projectx_archive_2024.
   * Copy the contents to be archived to the newly created archive folder(s) within the RDE storage share.
   * Submit a RDE tape Archive request: [[https://missouri.qualtrics.com/jfe/form/SV_5o0NoDafJNzXnRY]]
-  * Once the tape archive jobs are completed ITRSS will notify you and send you Archive job report after which you can delete the contents of the archive folder.
-  * We request that subsequent archive jobs be added to a separate folder or the initial folder renamed to something that signifies the time of archive for easier retrieval *_archive2024, *archive2025, etc.
+  * Once the tape archive jobs are completed ITRSS will notify you and send you an Archive job report after which you can delete the contents of the archive folder.
+  * We request that subsequent archive jobs be added to a separate folder, or the initial folder renamed to something that signifies the time of archive for easier retrieval *_archive2024, *archive2025, etc.
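
The folder creation and copy steps above could be scripted roughly as follows. This is only a sketch: the function name and its parameters are hypothetical, and the naming pattern simply follows the labname_projectx_archive_2024 example from the list. It does not submit the archive request, which still goes through the form.

```python
import shutil
from pathlib import Path


def stage_for_archive(source_dir, share_root, lab, project, year):
    """Copy source_dir into a new *_archive_* folder inside the RDE share.

    All arguments are illustrative; share_root is wherever your RDE
    share is mounted.
    """
    archive_dir = Path(share_root) / f"{lab}_{project}_archive_{year}"
    # copytree raises FileExistsError if the folder already exists, so a
    # later archive job cannot silently mix with an earlier one.
    shutil.copytree(source_dir, archive_dir)
    return archive_dir


# Example call with placeholder paths:
# stage_for_archive("/path/to/results", "/path/to/rde/share",
#                   "labname", "projectx", 2024)
```

Refusing to reuse an existing folder matches the request that each archive job get its own folder or a name that records when it was archived.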
  
 **Recovery**
Line 221: Line 226:
 
   * **[[https://status.missouri.edu| UM System Status Page]]**
-  * **[[https://po.missouri.edu/scripts/wa.exe?SUBED1=RSSHPC-L&A=1| RSS Announcement List: Please Sign Up]]**
+  * **[[https://LISTS.UMSYSTEM.EDU/scripts/wa-UMS.exe?SUBED1=RSSHPC-L&A=1&SUB=1| RSS Announcement List: Please Sign Up]]**
   * **[[https://missouri.qualtrics.com/jfe/form/SV_6zkkwGYn0MGvMyO|RSS Services: Order Form]]**
   * **[[https://request.itrss.umsystem.edu/|Hellbender: Account Request Form]]**
Line 253: Line 258:
 | **Model**  | **Nodes** | **Cores/Node** | **System Memory** | **CPU**                                  | **Local Scratch**   | **Cores** | **Node Names** |
 | Dell C6525 | 112       | 128            | 490 GB            | AMD EPYC 7713 64-Core                    | 1.6 TB              | 14336     | c001-c112      |
-| Dell R640  | 32        | 40             | 192 GB            | Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz |                     | 1280      | c113-c145      |
-| Dell C6420 | 64        | 48             | 384 GB            | Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz | 1 TB                | 3072      | c146-c209      |
-| Dell R6620 | 12        | 256            | 1 TB              | AMD EPYC 9754 128-Core Processor         | 1.5 TB              | 3072      | c210-c221      |
+| Dell R640  | 32        | 40             | 364 GB            | Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz | 100 GB              | 1280      | c113-c145      |
+| Dell C6420 | 64        | 48             | 364 GB            | Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz | 1 TB                | 3072      | c146-c209      |
+| Dell R6620 | 12        | 256            | 994 GB            | AMD EPYC 9754 128-Core Processor         | 1.5 TB              | 3072      | c210-c221      |
 |            |                          |                                                            | Total Cores         | 21760                    |
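
As a sanity check on the compute node table, the Cores column should equal Nodes times Cores/Node, and the rows should sum to the listed total. A small sketch, with the numbers copied from the table and a hypothetical helper name:

```python
# (model, nodes, cores_per_node, listed_cores), copied from the table above
CPU_ROWS = [
    ("Dell C6525", 112, 128, 14336),
    ("Dell R640", 32, 40, 1280),
    ("Dell C6420", 64, 48, 3072),
    ("Dell R6620", 12, 256, 3072),
]


def check_core_counts(rows):
    """Verify each row's core count and return the summed total."""
    for model, nodes, cores_per_node, listed in rows:
        if nodes * cores_per_node != listed:
            raise ValueError(f"{model}: {nodes} x {cores_per_node} != {listed}")
    return sum(listed for _, _, _, listed in rows)


print(check_core_counts(CPU_ROWS))  # 21760
```

The same check can be pointed at the GPU node table whenever rows are edited, since a changed Cores/Node value also has to be carried into the total.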
  
Line 264: Line 269:
 | Dell XE8640 | 2         | 104            | 2002 GB           | H100     | 80 GB          | 4        | 3.2 TB            | 208     | g018-g019      |
 | Dell XE9640 | 1         | 112            | 2002 GB           | H100     | 80 GB          | 8        | 3.2 TB            | 112     | g020           |
-| Dell R730   | 4         | 20             | 128 GB            | V100     | 32 GB          | 1        | 1.6 TB            | 80      | g021-g024      |
-| Dell R7525  | 1         | 48             | 512 GB            | V100S    | 32 GB          | 3        | 480 GB            | 48      | g025           |
-| Dell R740xd | 3         | 44             | 384 GB            | V100     | 32 GB          | 3        | 240 GB            | 132     | g026-g028      |
-|                                      |                            | Total GPU      | 100      | Total Cores       | 1688                   |
+| Dell R730   | 4         | 20             | 113 GB            | V100     | 32 GB          | 1        | 1.6 TB            | 80      | g021-g024      |
+| Dell R7525  | 1         | 96             | 490 GB            | V100S    | 32 GB          | 3        | 480 GB            | 96      | g025           |
+| Dell R740xd | 2         | 40             | 364 GB            | V100     | 32 GB          | 3        | 240 GB            | 80      | g026-g027      |
+| Dell R740xd | 1         | 44             | 364 GB            | V100     | 32 GB          | 3        | 240 GB            | 44      | g028           |
+|                                      |                            | Total GPU      | 100      | Total Cores       | 1708                   |
 
 A specially formatted sinfo command can be run on Hellbender to report live information about the nodes and the hardware/features they have.
Line 353: Line 359:
 ==== Open OnDemand ====
 
-  * https://ondemand.rnet.missouri.edu - Hellbender Open OnDemand (HB OOD)
-  * https://hb-classes.missouri.edu - Hellbender Classes Open OnDemand (HB OOD classes)
+  * https://ondemand.rnet.missouri.edu - Hellbender Open OnDemand (Researcher)
+  * https://hb-classes.missouri.edu - Hellbender Classes Open OnDemand (Classes)
 
 OnDemand provides an integrated, single access point for all of your HPC resources. The following apps are currently available on Hellbender's Open Ondemand.
Line 903: Line 909:
  
 Finally, you need to give Globus permission to use your identity to access information and perform actions (like file transfers) on your behalf.
-{{:pub:hpc:globus_terms_6.png?800|}} {{:pub:hpc:globus_allow_or_deny_6.png?1200|}}
+{{:pub:hpc:globus_terms_6.png?600|}} {{:pub:hpc:globus_allow_or_deny_6.png?800|}}
 
 ==== Tutorial: Globus File Manager ====
 
 After you’ve signed up and logged in to Globus, you’ll begin at the File Manager.
+
+**note:  Symlinks may not be transferred via Globus, preview:
+https://docs.globus.org/faq/transfer-sharing/#how_does_globus_handle_symlinks
+If symlinks need to be copied, consider using the rsync on the DTN with the -l flag**
+
+
+
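
To illustrate the rsync suggestion in the note above, here is a minimal sketch that builds such a command. The helper name and both paths are placeholders, not real Hellbender locations. The `-r` (recurse into directories) and `-l` (copy symlinks as symlinks) options are standard rsync flags; the commonly used `-a` archive mode already implies both.

```python
import shlex


def rsync_with_symlinks(source, destination):
    """Build an rsync command that preserves symlinks (hypothetical helper)."""
    # -r: recurse into directories; -l: copy symlinks as symlinks
    return ["rsync", "-r", "-l", source, destination]


cmd = rsync_with_symlinks("/path/to/source/", "/path/to/destination/")
print(shlex.join(cmd))

# To actually run it on the DTN, something like:
# import subprocess
# subprocess.run(cmd, check=True)
```

Note that a trailing slash on the source path tells rsync to copy the directory's contents rather than the directory itself.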
  
 The first time you use the File Manager, all fields will be blank:
Line 924: Line 937:
  
 **Access A Collection**
-  * Click in the Collection field at the top of the File Manager page and type "globus tutorial endpoint".
+  * Click in the Collection field at the top of the File Manager page and type "globus tutorial collection 1".
   * Globus will list collections with matching names. The collections Globus Tutorial Endpoint 1 and Globus Tutorial Endpoint 2 are collections administered by the Globus team for demonstration purposes and are accessible to all Globus users without further authentication.
 
-{{:pub:hpc:globus_endpoint_tutorial_search.png?1200|}}
+{{:pub:hpc:collection_search.png?800|}}
 
-  * Click on Globus Tutorial Endpoint 1.
-  * Globus will connect to the collection and display the default directory, /~/. (It will be empty.) Click the "Path" field and change it to /share/godata/. Globus will show the files in the new path: three small text files
+  * Click on Globus Tutorial Collection 1.
+  * Globus will connect to the collection and display the default directory, /~/. (It will be empty.) Click the "Path" field and change it to home/share/godata/. Globus will show the files in the new path: three small text files
 
-
-{{:pub:hpc:tutorial_1.png?1200|}}
-
-{{:pub:hpc:tutorial_2.png?1200|}}
+{{:pub:hpc:test_collection_godata1.png?800|}}
+
+
  
 **Request A File Transfer**
Line 941: Line 951:
   * A new collection panel will open, with a "Transfer or Sync to" field at the top of the panel.
 
-{{:pub:hpc:tutorial_3.png?1200|}}
+{{:pub:hpc:transfer_or_sync.png?1200|}}
 
-  * Find the Globus Tutorial Endpoint 2 collection and connect to it as you did with the Globus Tutorial Endpoint 1 above.
+  * Find the "Globus Tutorial Collection 2" collection and connect to it as you did with the Globus Tutorial Endpoint 1 above.
   * The default directory, /~/ will again be empty. Your goal is to transfer the sample files here.
-  * Click on the left collection, Globus Tutorial Endpoint 1, and select all three files. The Start> button at the bottom of the panel will activate.
+  * Click on the left collection, Globus Tutorial Collection 1, and select all three files. The Start> button at the bottom of the panel will activate.
 
-{{:pub:hpc:tutorial_4.png?1200|}}
+{{:pub:hpc:select_files_start.png?800|}}
  
   * Between the two Start buttons at the bottom of the page, the Transfer & Sync Options tab provides access to several options.
Line 954: Line 964:
   * Click the Start> button to transfer the selected files to the collection in the right panel. Globus will display a green notification panel—confirming that the transfer request was submitted—and add a badge to the Activity item in the command menu on the left of the page.
 
-{{:pub:hpc:tutorial_5.png?1200|}}
-
-{{:pub:hpc:tutorial_6.png?1200|}}
+{{:pub:hpc:transfer_success_submitted.png?800|}}
  
 **Confirm Transfer Completion**
Line 964: Line 972:
   * On the Activity page, click the arrow icon on the right to view details about the transfer. You will also receive an email with the transfer details.
 
-{{:pub:hpc:tutorial_7.png?1200|}}
+{{:pub:hpc:transfer_complete.png?1200|}}
 
 ==== Tutorial: Sharing Data - Create a Guest Collection ====