This shows you the differences between two versions of the page.
| pub:hpc:hellbender [2025/11/12 14:38] – [Storage: Research Data Ecosystem ('RDE')] bjmfg8 | pub:hpc:hellbender [2026/04/21 16:15] (current) – [What is Hellbender?] redmonp | ||
|---|---|---|---|
| Line 9: | Line 9: | ||
| **Hellbender** is the latest High Performance Computing (HPC) resource available to researchers and students (with sponsorship by a PI) within the UM-System. | **Hellbender** is the latest High Performance Computing (HPC) resource available to researchers and students (with sponsorship by a PI) within the UM-System. | ||
| - | **Hellbender** consists of 222 mixed x86-64 CPU nodes providing | + | **Hellbender** consists of 263 mixed x86-64 CPU nodes providing |
| ==== Investment Model ==== | ==== Investment Model ==== | ||
| Line 25: | Line 25: | ||
| General access will be open to any research or teaching faculty, staff, and students from any UM System campus. General access is defined as open access to all resources available to users of the cluster at an equal fairshare value. This means that all users will have the same level of access to the general resource. | General access will be open to any research or teaching faculty, staff, and students from any UM System campus. General access is defined as open access to all resources available to users of the cluster at an equal fairshare value. This means that all users will have the same level of access to the general resource. | ||
| - | Research users of the general access portion of the cluster will be given the RDE Standard Allocation to operate from. Larger storage allocations | + | Research users of the general access portion of the cluster will be given the RDE Standard Allocation to operate from. Larger storage allocations |
| - | + | ||
| - | === Hellbender Advanced: Priority Access === | + | |
| - | + | ||
| - | When researcher needs are not being met at the general access level, researchers may request an advanced allocation on Hellbender to gain priority access. Priority access will give research groups a limited set of resources that will be available to them without competition from general access users. Priority access will be provided to a specific set of hardware through a priority partition which contains these resources. This partition will be created for, and limited to use by, the user and their associated group. These resources will also be in an overlapping pool of resources available to general access users. This pool will be administered such that if a priority access user submits jobs to their priority access partition, any jobs running on those resources from the overlapping partition will be requeued and begin execution again on another resource in that partition if available, or return to wait in the queue for resources. Priority access users will retain general access status, and fairshare will still play a part in moderating their access to the general resource. Fairshare inside a priority partition determines which user’s jobs are selected for execution next inside this partition. The jobs running inside this priority partition will also affect a user’s fairshare calculations even for resources in the general access partition, meaning that running a large number of jobs inside a priority partition will lower a user’s priority for the general resources as well. | + | |
| - | + | ||
| - | === Traditional Investment === | + | |
| - | + | ||
| - | Hellbender Advanced Allocation requests that are not approved for DRII Priority Designation may be treated as traditional investments with the researcher paying for the resources used to create the Advanced Allocation at the defined rate. These rates are subject to change based on the determination of DRII, and hardware costs. | + | |
| === Resource Management === | === Resource Management === | ||
| Line 42: | Line 34: | ||
| Priority access resources will generally be made available from existing hardware in the general access pool and the funds will be retained for a future time to allow a larger pool of funds to accumulate for expansion of the resource. This will allow the greatest return on investment over time. If the general availability resources are less than 50% of the overall resource, an expansion cycle will be initiated to ensure all users will still have access to a significant amount of resources. If a researcher or research group is contributing a large amount of funding, it may trigger an expansion cycle if that is determined to be advantageous at the time of the contribution. | Priority access resources will generally be made available from existing hardware in the general access pool and the funds will be retained for a future time to allow a larger pool of funds to accumulate for expansion of the resource. This will allow the greatest return on investment over time. If the general availability resources are less than 50% of the overall resource, an expansion cycle will be initiated to ensure all users will still have access to a significant amount of resources. If a researcher or research group is contributing a large amount of funding, it may trigger an expansion cycle if that is determined to be advantageous at the time of the contribution. | ||
| + | |||
| + | === Hellbender Advanced: Priority Access - Investment === | ||
| + | |||
| + | When researcher needs are not being met at the general access level, researchers may request an advanced allocation on Hellbender to gain priority access via investment. Priority access will give research groups a limited set of resources that will be available to them without competition from general access users. Priority access will be provided to a specific set of hardware through a priority partition which contains these resources. This partition will be created for, and limited to use by, the user and their associated group. These resources will also be in an overlapping pool of resources available to general access users. This pool will be administered such that if a priority access user submits jobs to their priority access partition, any jobs running on those resources from the overlapping partition will be requeued and begin execution again on another resource in that partition if available, or return to wait in the queue for resources. Priority access users will retain general access status, and fairshare will still play a part in moderating their access to the general resource. Fairshare inside a priority partition determines which user’s jobs are selected for execution next inside this partition. The jobs running inside this priority partition will also affect a user’s fairshare calculations even for resources in the general access partition, meaning that running a large number of jobs inside a priority partition will lower a user’s priority for the general resources as well. | ||
| === Benefits of Investing === | === Benefits of Investing === | ||
| Line 70: | Line 66: | ||
| ==== How Much Does Investing Cost? ==== | ==== How Much Does Investing Cost? ==== | ||
| - | See our rates for FY 2024-2025: | + | See our rates for FY 2025-2026: |
| ^ Service | ^ Service | ||
| Line 169: | Line 165: | ||
| **The 2025 pricing is: General Storage: $25/ | **The 2025 pricing is: General Storage: $25/ | ||
| - | To order storage please fill out our [[https://missouri.qualtrics.com/jfe/form/SV_6zkkwGYn0MGvMyO| RSS Services Order Form]] | + | To order storage please fill out our [[https://tdx.umsystem.edu/TDClient/36/DoIT/ |
| - | === Research Data Archive === | + | ==== Research Data Archive |
| **Use Case** | **Use Case** | ||
| Line 235: | Line 231: | ||
| * **[[https:// | * **[[https:// | ||
| * **[[https:// | * **[[https:// | ||
| - | * **[[https:// | + | * **[[https:// |
| - | * **[[https:// | + | * **[[https:// |
| Line 274: | Line 270: | ||
| === GPU nodes === | === GPU nodes === | ||
| - | | **Model** | + | | **Model** |
| | Dell R750xa | 17 | 64 | 490 GB | A100 | 80 GB | 4 | 1.6 TB | 1088 | g001-g017 | | Dell R750xa | 17 | 64 | 490 GB | A100 | 80 GB | 4 | 1.6 TB | 1088 | g001-g017 | ||
| | Dell XE8640 | 2 | 104 | 2002 GB | H100 | 80 GB | 4 | 3.2 TB | 208 | g018-g019 | | Dell XE8640 | 2 | 104 | 2002 GB | H100 | 80 GB | 4 | 3.2 TB | 208 | g018-g019 | ||
| Line 282: | Line 278: | ||
| | Dell R740xd | 2 | 40 | 364 GB | V100 | 32 GB | 3 | 240 GB | 80 | g026-g027 | | Dell R740xd | 2 | 40 | 364 GB | V100 | 32 GB | 3 | 240 GB | 80 | g026-g027 | ||
| | Dell R740xd | 1 | 44 | 364 GB | V100 | 32 GB | 3 | 240 GB | 44 | g028 | | | Dell R740xd | 1 | 44 | 364 GB | V100 | 32 GB | 3 | 240 GB | 44 | g028 | | ||
| - | | Dell R760xa | 6 | 64 | 490 GB | H100 | 94 GB | 2 | 1.8 TB | 384 | g029-g034* | | + | | Dell R760xa | 6 | 64 | 490 GB | H100 | 94 GB | 2 | 1.8 TB | 384 | g029-g034 |
| - | | Dell R760 | 6 | 64 | 490 GB | L40S | 45 GB | 2 | 3.5 TB | 384 | g035-g040* | | + | | Dell R760 | 6 | 64 | 490 GB | L40S | 45 GB | 2 | 3.5 TB | 384 | g035-g040 |
| - | | * = Available Oct 14 | + | | Dell XE9680 | 1 | 96 | 2048 GB | H200 | 141 GB | 8 | 28 TB | 96 | g041 | |
| + | | | ||
| A specially formatted sinfo command can be run on Hellbender to report live information about the nodes and the hardware/ | A specially formatted sinfo command can be run on Hellbender to report live information about the nodes and the hardware/ | ||
| Line 553: | Line 550: | ||
| ==== Moving Data ==== | ==== Moving Data ==== | ||
| + | |||
| + | ** Use one of the following options to move data. Do not move data on the login node.** | ||
| === Globus === | === Globus === | ||
| Line 559: | Line 558: | ||
| * Hellbender Collection Name: U MO ITRSS RDE | * Hellbender Collection Name: U MO ITRSS RDE | ||
| - | * Lewis Collection Name: MU RCSS Lewis Home Directories | ||
| * Mill Collection Name: Missouri S&T Mill | * Mill Collection Name: Missouri S&T Mill | ||
| - | * Foundry Collection Name: Missouri S&T HPC Storage | ||
| More detailed information on how to use Globus is at [[https:// | More detailed information on how to use Globus is at [[https:// | ||
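
The GPU node rows shown in the diff above can be sanity-checked with a few lines of Python. This is only a sketch. The column header is truncated in the diff, so the reading used here (node count, cores per node, GPU type, GPUs per node, total cores) is an assumption inferred from the values. Only rows visible in this diff are included, so the per-GPU totals are partial and do not describe the whole cluster. The names `GPU_ROWS`, `inconsistent_rows`, and `gpus_by_type` are hypothetical.

```python
# Rows copied from the GPU nodes table as it appears in this diff.
# ASSUMED column meaning (the header row is truncated in the diff):
# (model, nodes, cores_per_node, gpu_type, gpus_per_node, listed_total_cores)
GPU_ROWS = [
    ("Dell R750xa", 17, 64, "A100", 4, 1088),
    ("Dell XE8640", 2, 104, "H100", 4, 208),
    ("Dell R740xd", 2, 40, "V100", 3, 80),
    ("Dell R740xd", 1, 44, "V100", 3, 44),
    ("Dell R760xa", 6, 64, "H100", 2, 384),
    ("Dell R760", 6, 64, "L40S", 2, 384),
    ("Dell XE9680", 1, 96, "H200", 8, 96),
]


def inconsistent_rows(rows):
    """Return rows whose listed core total is not nodes * cores_per_node."""
    return [row for row in rows if row[1] * row[2] != row[5]]


def gpus_by_type(rows):
    """Sum nodes * gpus_per_node for each GPU type in the given rows."""
    totals = {}
    for _model, nodes, _cores, gpu_type, gpus_per_node, _listed in rows:
        totals[gpu_type] = totals.get(gpu_type, 0) + nodes * gpus_per_node
    return totals


if __name__ == "__main__":
    print("Rows with mismatched core totals:", inconsistent_rows(GPU_ROWS))
    print("GPUs per type (visible rows only):", gpus_by_type(GPU_ROWS))
```

Under the assumed column meaning, an empty list from `inconsistent_rows` indicates that each listed core total matches nodes times cores per node. For live numbers, the sinfo command mentioned in the wiki text remains the authoritative source.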