This shows you the differences between two versions of the page.
| pub:hpc:hellbender [2025/06/18 15:55] – [GPU Node Lease] bjmfg8 | pub:hpc:hellbender [2025/11/12 15:21] (current) – [Hardware] epkknd | ||
|---|---|---|---|
| Line 3: | Line 3: | ||
| **Request an Account:** | **Request an Account:** | ||
| You can request an account for access to Hellbender by filling out the form found at: | You can request an account for access to Hellbender by filling out the form found at: | ||
| - | [[https://request.itrss.umsystem.edu/ | + | [[https://tdx.umsystem.edu/ |
| ==== What is Hellbender? ==== | ==== What is Hellbender? ==== | ||
| Line 9: | Line 9: | ||
| **Hellbender** is the latest High Performance Computing (HPC) resource available to researchers and students (with sponsorship by a PI) within the UM-System. | **Hellbender** is the latest High Performance Computing (HPC) resource available to researchers and students (with sponsorship by a PI) within the UM-System. | ||
| - | **Hellbender** consists of 208 mixed x86-64 CPU nodes (112 AMD, 96 Intel) | + | **Hellbender** consists of 222 mixed x86-64 CPU nodes providing |
| ==== Investment Model ==== | ==== Investment Model ==== | ||
| Line 74: | Line 74: | ||
| ^ Service | ^ Service | ||
| |Hellbender CPU Node | $2,702.00 | Per Node/Year | Year to Year | | |Hellbender CPU Node | $2,702.00 | Per Node/Year | Year to Year | | ||
| - | |Hellbender GPU Node* | $7,691.38 | Per Node/Year | Year to Year | | + | |Hellbender |
| + | |Hellbender L40s GPU Node* | $4,785.00 | Per Node/Year | Year to Year | | ||
| + | |Hellbender H100 GPU Node* | $13, | ||
| |RDE Storage: High Performance | $95.00 | Per TB/Year | Year to Year | | |RDE Storage: High Performance | $95.00 | Per TB/Year | Year to Year | | ||
| |RDE Storage: General Performance | $25.00 | Per TB/Year | Year to Year | | |RDE Storage: General Performance | $25.00 | Per TB/Year | Year to Year | | ||
| - | ***Update | + | ***Update |
| Line 89: | Line 91: | ||
| * When running on the ' | * When running on the ' | ||
| * When running on the ' | * When running on the ' | ||
| - | * To get started please fill out our [[https://request.itrss.umsystem.edu/ | + | * To get started please fill out our [[https://tdx.umsystem.edu/ |
| - **Paid access (Investor) tier compute**: | - **Paid access (Investor) tier compute**: | ||
| Line 99: | Line 101: | ||
| * All accounts are given 50GB of storage in /home/$USER as well as 500GB in / | * All accounts are given 50GB of storage in /home/$USER as well as 500GB in / | ||
| * MU PIs are eligible for one free 5TB group storage allocation in our RDE environment | * MU PIs are eligible for one free 5TB group storage allocation in our RDE environment | ||
| - | * To get started please fill out our general [[https://request.itrss.umsystem.edu/ | + | * To get started please fill out our general [[https://tdx.umsystem.edu/ |
| - **Paid access (Investor) tier storage**: | - **Paid access (Investor) tier storage**: | ||
| Line 126: | Line 128: | ||
| The investment structure for GPU nodes is the same as for CPU nodes: per node, per year. If you have funds available and would like to pay for multiple years up front, we can accommodate that. Once 50% of the total GPU nodes in the cluster are investor-owned, we will restrict additional leases until more nodes become available, either through purchase or through surrender by other PIs. The GPU nodes available for investment comprise the following: | The investment structure for GPU nodes is the same as for CPU nodes: per node, per year. If you have funds available and would like to pay for multiple years up front, we can accommodate that. Once 50% of the total GPU nodes in the cluster are investor-owned, we will restrict additional leases until more nodes become available, either through purchase or through surrender by other PIs. The GPU nodes available for investment comprise the following: | ||
| - | | Model | # Nodes | Cores/Node | System Memory | GPU | GPU Memory | # GPU | Local Scratch | # Core | + | | Model | # Nodes | Cores/Node | System Memory | GPU | GPU Memory | # GPU/Node | Local Scratch | # Cores |
| - | | Dell R740xa | 17 | 64 | + | | Dell R740xa | 17 | 64 |
| + | | Dell R740xa | 6 | 64 | 490 GB | H100 | 94 GB | 2 | 1.8 TB | 384 | ||
| + | | Dell R760 | 6 | 64 | 490 GB | L40S | 45 GB | 2 | 3.5 TB | 384 | ||
| - | *Update 06/2025: Additional GPU priority partitions cannot be allocated at this time as GPU investment has reached beyond the 50% threshold. If you require capacity beyond the general pool we are able to plan and work with your grant submissions to add additional capacity to Hellbender | + | |
| - | + | * H100 Node: $13,123.00 Per Node/Year | |
| - | **The 2025 pricing is: $7,692 per node per year.** | + | |
| ==== Storage: Research Data Ecosystem (' | ==== Storage: Research Data Ecosystem (' | ||
| Line 176: | Line 180: | ||
| **Costs** | **Costs** | ||
| - | The cost associated with using the RDE tape archive is $8/TB for short-term data kept inside the tape library for 1-3 years, or $140 per tape (rounded to the number of tapes) for tapes sent offsite for long-term retention of up to 10 years. We send these tapes off to record management, where they are stored in a climate-controlled environment. Each tape from the current generation, LTO 9, holds approximately 18TB of data. These are flat one-time costs, and you have the option to do both a short-term in-library copy and a longer-term offsite copy, or one or the other, providing flexibility. | + | The cost associated with using the RDE tape archive is $8/TB for short-term data kept inside the tape library for 1-3 years, or $144 per tape (rounded to the number of tapes) for tapes sent offsite for long-term retention of up to 10 years. We send these tapes off to record management, where they are stored in a climate-controlled environment. Each tape from the current generation, LTO 9, holds approximately 18TB of data. These are flat one-time costs, and you have the option to do both a short-term in-library copy and a longer-term offsite copy, or one or the other, providing flexibility. |
| **Request Process** | **Request Process** | ||
| To use the tape archive functionality that RSS has set up, the data to be archived must first be copied to RDE storage if it does not exist there already. This requires the following steps. | To use the tape archive functionality that RSS has set up, the data to be archived must first be copied to RDE storage if it does not exist there already. This requires the following steps. | ||
| - | * Submit an RDE storage request if the data resides locally and an RDE share is not already available to the researcher: [[http://request.itrss.umsystem.edu|RSS | + | * Submit an RDE storage request if the data resides locally and an RDE share is not already available to the researcher: [[https://tdx.umsystem.edu/ |
| * Create an archive folder or folders in the relevant RDE storage share to hold the data you would like to archive. The folder(s) can be named to signify the contents, but we ask that the name includes _archive at the end. For example, something akin to: labname_projectx_archive_2024. | * Create an archive folder or folders in the relevant RDE storage share to hold the data you would like to archive. The folder(s) can be named to signify the contents, but we ask that the name includes _archive at the end. For example, something akin to: labname_projectx_archive_2024. | ||
| * Copy the contents to be archived to the newly created archive folder(s) within the RDE storage share. | * Copy the contents to be archived to the newly created archive folder(s) within the RDE storage share. | ||
| - | * Submit an RDE tape archive request: [[https://missouri.qualtrics.com/ | + | * Submit an RDE tape archive request: [[https://archiverequest.itrss.umsystem.edu]] |
| * Once the tape archive jobs are completed ITRSS will notify you and send you an Archive job report after which you can delete the contents of the archive folder. | * Once the tape archive jobs are completed ITRSS will notify you and send you an Archive job report after which you can delete the contents of the archive folder. | ||
| * We request that subsequent archive jobs be added to a separate folder, or the initial folder renamed to something that signifies the time of archive for easier retrieval *_archive2024, | * We request that subsequent archive jobs be added to a separate folder, or the initial folder renamed to something that signifies the time of archive for easier retrieval *_archive2024, | ||
| Line 230: | Line 234: | ||
| * **[[https:// | * **[[https:// | ||
| * **[[https:// | * **[[https:// | ||
| - | * **[[https:// | + | * **[[https:// |
| * **[[https:// | * **[[https:// | ||
| * **[[https:// | * **[[https:// | ||
| Line 256: | Line 260: | ||
| Dell C6420: 0.5 unit server containing dual 24 core Intel Xeon Gold 6252 CPUs with a base clock of 2.1 GHz. Each C6420 node contains 384 GB DDR4 system memory. | Dell C6420: 0.5 unit server containing dual 24 core Intel Xeon Gold 6252 CPUs with a base clock of 2.1 GHz. Each C6420 node contains 384 GB DDR4 system memory. | ||
| - | Dell R6620: 1 unit server containing dual 128 core AMD EPYC 9754 CPUs with a base clock of 2.25 GHz. Each R6620 node contains 1 TB DDR5 system memory. | + | Dell R6625: 1 unit server containing dual 128 core AMD EPYC 9754 CPUs with a base clock of 2.25 GHz. Each R6625 node contains 1 TB DDR5 system memory. |
| + | |||
| + | Dell R6625: 1 unit server containing dual 128 core AMD EPYC 9754 CPUs with a base clock of 2.25 GHz. Each R6625 node contains 6 TB DDR5 system memory. | ||
| | **Model** | | **Model** | ||
| Line 263: | Line 269: | ||
| | Dell C6420 | 64 | 48 | 364 GB | Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz | 1 TB | 3072 | c146-c209 | | Dell C6420 | 64 | 48 | 364 GB | Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz | 1 TB | 3072 | c146-c209 | ||
| | Dell R6625 | 12 | 256 | 994 GB | AMD EPYC 9754 128-Core Processor | | Dell R6625 | 12 | 256 | 994 GB | AMD EPYC 9754 128-Core Processor | ||
| - | | Dell R6625 | 2 | 256 | 6034 GB | AMD EPYC 9754 128-Core Processor | + | | Dell R6625 | 2 | 256 | 6034 GB | AMD EPYC 9754 128-Core Processor |
| - | | | | + | | | |
| === GPU nodes === | === GPU nodes === | ||
| - | | **Model** | + | | **Model** |
| - | | Dell R740xa | + | | Dell R750xa |
| | Dell XE8640 | 2 | 104 | 2002 GB | H100 | 80 GB | 4 | 3.2 TB | 208 | g018-g019 | | Dell XE8640 | 2 | 104 | 2002 GB | H100 | 80 GB | 4 | 3.2 TB | 208 | g018-g019 | ||
| | Dell XE9640 | 1 | 112 | 2002 GB | H100 | 80 GB | 8 | 3.2 TB | 112 | g020 | | | Dell XE9640 | 1 | 112 | 2002 GB | H100 | 80 GB | 8 | 3.2 TB | 112 | g020 | | ||
| Line 276: | Line 282: | ||
| | Dell R740xd | 2 | 40 | 364 GB | V100 | 32 GB | 3 | 240 GB | 80 | g026-g027 | | Dell R740xd | 2 | 40 | 364 GB | V100 | 32 GB | 3 | 240 GB | 80 | g026-g027 | ||
| | Dell R740xd | 1 | 44 | 364 GB | V100 | 32 GB | 3 | 240 GB | 44 | g028 | | | Dell R740xd | 1 | 44 | 364 GB | V100 | 32 GB | 3 | 240 GB | 44 | g028 | | ||
| - | | | + | | Dell R760xa | 6 | 64 |
| + | | Dell R760 | 6 | 64 | 490 GB | L40S | 45 GB | 2 | 3.5 TB | 384 | g035-g040 | ||
| + | | | ||
| A specially formatted sinfo command can be run on Hellbender to report live information about the nodes and the hardware/ | A specially formatted sinfo command can be run on Hellbender to report live information about the nodes and the hardware/ | ||
| Line 613: | Line 621: | ||
| Below is the process for setting up a class on the OOD portal. | Below is the process for setting up a class on the OOD portal. | ||
| - | - Send the class name, the list of students and TAs, and any shared storage requirements to itrss-support@umsystem.edu. | + | - Send the class name, the list of students and TAs, and any shared storage requirements to itrss-support@umsystem.edu. |
| - We will add the students to the group allowing them access to OOD. | - We will add the students to the group allowing them access to OOD. | ||
| - If the student does not have a Hellbender account yet, they will be presented with a link to a form to fill out requesting a Hellbender account. | - If the student does not have a Hellbender account yet, they will be presented with a link to a form to fill out requesting a Hellbender account. | ||
| Line 804: | Line 812: | ||
| **Documentation**: | **Documentation**: | ||
| + | |||
| + | ==== RStudio ==== | ||
| + | |||
| + | [[https:// | ||
| ==== Visual Studio Code ==== | ==== Visual Studio Code ==== | ||