System specifications

What are the foundational components of DAIC?

    At present, DAIC and DelftBlue have different software stacks. This pertains to the operating systems (CentOS 7 and Red Hat Enterprise Linux 8, respectively) and, consequently, to the available software. Please refer to the DelftBlue modules and the Software section before starting your experiments.

    DAIC partitions and access/usage best practices

    Operating System

    DAIC runs the Red Hat Enterprise Linux 7 distribution, which provides the general Linux software environment. Most common software, including programming languages, libraries and development files for compiling your own software, is installed on the nodes (see Available software). However, a less common program that you need might not be installed. Similarly, if your research requires a state-of-the-art program that is not (yet) available as a package for Red Hat 7, it will not be available on the cluster by default. See Installing software for more information.

    Login Nodes

    The login nodes are the gateway to the DAIC HPC cluster and are specifically designed for lightweight tasks such as job submission, file management, and compiling code (on certain nodes). These nodes are not intended for running resource-intensive jobs, which should be submitted to the Compute Nodes.

    Specifications and usage notes

    Hostname | CPU (Sockets x Model) | Total Cores | Total RAM | Operating System | GPU Type | GPU Count | Usage Notes
    login1 | 1 x Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz | 8 | 15.39 GB | OpenShift Enterprise | Quadro K2200 | 1 | For file transfers, job submission, and lightweight tasks.
    login2 | 1 x Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz | 1 | 3.70 GB | OpenShift Enterprise | N/A | N/A | Virtual server, for non-intensive tasks. No compilation.
    login3 | 2 x Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 32 | 503.60 GB | RHEV | Quadro K2200 | 1 | For large compilation and interactive sessions.
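
    For illustration, a couple of typical lightweight commands on a login node (a minimal sketch, assuming the standard Slurm client tools are available there):

        # show the partitions and the state of the compute nodes
        sinfo

        # list your own pending and running jobs
        # (assumes your cluster username equals your NetID)
        squeue -u "$USER"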

    Compute Nodes

    DAIC compute nodes are all multi-CPU servers with large memories, and some have GPUs. The nodes in the cluster are heterogeneous, i.e. they have different types of hardware (processors, memory, GPUs), different functionality (some more advanced than others) and different performance characteristics. If a program requires specific features, you need to explicitly request them for that job (see Submitting jobs and the sketch below).
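
    As a minimal sketch of how such a request could look (the exact feature names and options used on DAIC are documented under Submitting jobs; "avx2" and job.sbatch below are hypothetical placeholders):

        # inspect the resources and features advertised by a specific node,
        # e.g. gpu01 from the table below
        scontrol show node gpu01

        # request nodes with a particular hardware feature for a batch job
        sbatch --constraint=avx2 job.sbatch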

    List of all nodes

    The following table gives an overview of current nodes and their characteristics:

    Hostname | CPU (Sockets x Model) | Cores per Socket | Total Cores | CPU Speed (MHz) | Total RAM | GPU Type | GPU Count
    100plus | 2 x Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 16 | 32 | 2097.488 | 755.585 GB | - | -
    3dgi1 | 1 x AMD EPYC 7502P 32-Core Processor | 32 | 32 | 2500 | 251.41 GB | - | -
    3dgi2 | 1 x AMD EPYC 7502P 32-Core Processor | 32 | 32 | 2500 | 251.41 GB | - | -
    awi01 | 2 x Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz | 18 | 36 | 2996.569 | 376.384 GB | Tesla V100 PCIe 32GB | 1
    awi02 | 2 x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz | 14 | 28 | 2900.683 | 503.619 GB | Tesla V100 SXM2 16GB | 2
    awi03 | 2 x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz | 14 | 28 | 2899.951 | 503.625 GB | - | -
    awi04 | 2 x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz | 14 | 28 | 3231.884 | 503.625 GB | - | -
    awi05 | 2 x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz | 14 | 28 | 3258.984 | 503.625 GB | - | -
    awi07 | 2 x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz | 14 | 28 | 2899.951 | 503.625 GB | - | -
    awi08 | 2 x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz | 14 | 28 | 2899.951 | 503.625 GB | - | -
    awi09 | 2 x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz | 14 | 28 | 2899.951 | 503.625 GB | - | -
    awi10 | 2 x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz | 14 | 28 | 2899.951 | 503.625 GB | - | -
    awi11 | 2 x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz | 14 | 28 | 2899.951 | 503.625 GB | - | -
    awi12 | 2 x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz | 14 | 28 | 2899.951 | 503.625 GB | - | -
    awi19 | 2 x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz | 14 | 28 | 2899.951 | 251.641 GB | - | -
    awi20 | 2 x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz | 14 | 28 | 2899.951 | 251.641 GB | - | -
    awi21 | 2 x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz | 14 | 28 | 2899.951 | 251.641 GB | - | -
    awi22 | 2 x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz | 14 | 28 | 2899.951 | 251.641 GB | - | -
    awi23 | 2 x Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz | 18 | 36 | 3221.038 | 376.385 GB | - | -
    awi24 | 2 x Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz | 18 | 36 | 2580.2 | 376.385 GB | - | -
    awi25 | 2 x Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz | 18 | 36 | 3399.884 | 376.385 GB | - | -
    awi26 | 2 x Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz | 18 | 36 | 3442.7 | 376.385 GB | - | -
    cor1 | 2 x Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz | 16 | 64 | 3599.975 | 1510.33 GB | Tesla V100 SXM2 32GB | 8
    gpu01 | 2 x AMD EPYC 7413 24-Core Processor | 24 | 48 | 2650 | 503.402 GB | NVIDIA A40 | 3
    gpu02 | 2 x AMD EPYC 7413 24-Core Processor | 24 | 48 | 2650 | 503.402 GB | NVIDIA A40 | 3
    gpu03 | 2 x AMD EPYC 7413 24-Core Processor | 24 | 48 | 2650 | 503.402 GB | NVIDIA A40 | 3
    gpu04 | 2 x AMD EPYC 7413 24-Core Processor | 24 | 48 | 2650 | 503.402 GB | NVIDIA A40 | 3
    gpu05 | 2 x AMD EPYC 7413 24-Core Processor | 24 | 48 | 2650 | 503.402 GB | NVIDIA A40 | 3
    gpu06 | 2 x AMD EPYC 7413 24-Core Processor | 24 | 48 | 2650 | 503.402 GB | NVIDIA A40 | 3
    gpu07 | 2 x AMD EPYC 7413 24-Core Processor | 24 | 48 | 2650 | 503.402 GB | NVIDIA A40 | 3
    gpu08 | 2 x AMD EPYC 7413 24-Core Processor | 24 | 48 | 2650 | 503.402 GB | NVIDIA A40 | 3
    gpu09 | 2 x AMD EPYC 7413 24-Core Processor | 24 | 48 | 2650 | 503.402 GB | NVIDIA A40 | 3
    gpu10 | 2 x AMD EPYC 7413 24-Core Processor | 24 | 48 | 2650 | 503.402 GB | NVIDIA A40 | 3
    gpu11 | 2 x AMD EPYC 7413 24-Core Processor | 24 | 48 | 2650 | 503.402 GB | NVIDIA A40 | 3
    gpu14 | 2 x AMD EPYC 7543 32-Core Processor | 32 | 64 | 2794.613 | 503.275 GB | NVIDIA A40 | 3
    gpu15 | 2 x AMD EPYC 7543 32-Core Processor | 32 | 64 | 2794.938 | 503.275 GB | NVIDIA A40 | 3
    gpu16 | 2 x AMD EPYC 7543 32-Core Processor | 32 | 64 | 2794.604 | 503.275 GB | NVIDIA A40 | 3
    gpu17 | 2 x AMD EPYC 7543 32-Core Processor | 32 | 64 | 2794.878 | 503.275 GB | NVIDIA A40 | 3
    gpu18 | 2 x AMD EPYC 7543 32-Core Processor | 32 | 64 | 2794.57 | 503.275 GB | NVIDIA A40 | 3
    gpu19 | 2 x AMD EPYC 7543 32-Core Processor | 32 | 64 | 2794.682 | 503.275 GB | NVIDIA A40 | 3
    gpu20 | 2 x AMD EPYC 7543 32-Core Processor | 32 | 64 | 2794.651 | 1007.24 GB | NVIDIA A40 | 3
    gpu21 | 2 x AMD EPYC 7543 32-Core Processor | 32 | 64 | 2794.646 | 1007.24 GB | NVIDIA A40 | 3
    gpu22 | 2 x AMD EPYC 7543 32-Core Processor | 32 | 64 | 2794.963 | 1007.24 GB | NVIDIA A40 | 3
    gpu23 | 2 x AMD EPYC 7543 32-Core Processor | 32 | 64 | 2794.658 | 1007.24 GB | NVIDIA A40 | 3
    gpu24 | 2 x AMD EPYC 7543 32-Core Processor | 32 | 64 | 2794.664 | 1007.24 GB | NVIDIA A40 | 3
    grs1 | 2 x Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz | 8 | 16 | 3499.804 | 251.633 GB | - | -
    grs2 | 2 x Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz | 8 | 16 | 3577.734 | 251.633 GB | - | -
    grs3 | 2 x Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz | 8 | 16 | 3499.804 | 251.633 GB | - | -
    grs4 | 2 x Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz | 8 | 16 | 3499.804 | 251.633 GB | - | -
    influ1 | 2 x Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz | 16 | 32 | 2955.816 | 376.391 GB | GeForce RTX 2080 Ti | 8
    influ2 | 2 x Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz | 16 | 32 | 2300 | 187.232 GB | GeForce RTX 2080 Ti | 4
    influ3 | 2 x Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz | 16 | 32 | 2300 | 187.232 GB | GeForce RTX 2080 Ti | 4
    influ4 | 2 x AMD EPYC 7452 32-Core Processor | 32 | 64 | 1500 | 251.626 GB | - | -
    influ5 | 2 x AMD EPYC 7452 32-Core Processor | 32 | 64 | 2350 | 503.611 GB | - | -
    influ6 | 2 x AMD EPYC 7452 32-Core Processor | 32 | 64 | 1500 | 503.61 GB | - | -
    insy15 | 2 x Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz | 16 | 32 | 2300 | 754.33 GB | GeForce RTX 2080 Ti Rev. A | 4
    insy16 | 2 x Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz | 16 | 32 | 2300 | 754.33 GB | GeForce RTX 2080 Ti Rev. A | 4
    Total | | 1206 | 2380 | | 28 TB | | 101

    CPUs

    All nodes have multiple Central Processing Units (CPUs) that perform the operations. Each CPU can process one thread (i.e. a separate stream of instructions) at a time. A computer program consists of one or more threads, and thus needs one or more CPUs simultaneously to do its computations (see Wikipedia's CPU page).

    The number of threads running simultaneously determines the load of a server. If the number of running threads is equal to the number of available CPUs, the server is loaded 100% (or 1.00). When the number of threads that want to run exceeds the number of available CPUs, the load rises above 100%. For example, 64 runnable threads on a 32-CPU node produce a load of 200% (2.00).

    The CPU functionality is provided by the hardware cores in the processor chips of the machines. Traditionally, one physical core contained one logical CPU, so the CPUs operated completely independently. Most current chips feature hyper-threading: one core contains two (or more) logical CPUs. These CPUs share parts of the core and the cache, so one CPU may have to wait when a shared resource is in use by the other CPU. Therefore, the job scheduler always allocates these CPUs in pairs.
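
    As a minimal sketch (standard Slurm options; see Submitting jobs for DAIC-specific guidance), a multi-threaded job can request its logical CPUs explicitly:

        # request 4 logical CPUs for a single multi-threaded task;
        # since logical CPUs are allocated in pairs, request an even number
        # (job.sbatch is a placeholder for your own batch script)
        sbatch --ntasks=1 --cpus-per-task=4 job.sbatch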

    GPUs

    A few types of GPUs are available in some of the DAIC nodes, as shown in table 1. The total number of GPUs per type and their technical specifications are shown in table 2. See Using graphic cards for requesting GPUs for a computational job; a minimal request sketch follows the table legend below.

    Table 2: Counts and specifications of DAIC GPUs
    GPU (slurm) type | Count | Model | Architecture | Compute Capability | CUDA cores | Memory
    a40 | 66 | NVIDIA A40 | Ampere | 8.6 | 10752 | 46068 MiB
    turing | 24 | NVIDIA GeForce RTX 2080 Ti | Turing | 7.5 | 4352 | 11264 MiB
    v100 | 11 | Tesla V100-SXM2-32GB | Volta | 7.0 | 5120 | 32768 MiB

    In table 2, the headers denote:

    • Model: The official product name of the GPU.
    • Architecture: The hardware design used, and thus the hardware features and performance characteristics of the GPU. Each new architecture brings forward a new generation of GPUs.
    • Compute capability: Determines the general functionality, available features and CUDA support of the GPU. A GPU with a higher compute capability supports more advanced functionality.
    • CUDA cores: The number of cores that perform the computations: the more cores, the more work can be done in parallel (provided the algorithm can exploit this parallelism).
    • Memory: The total installed GPU memory. GPUs provide their own internal (fixed-size) memory for storing data for GPU computations. All required data needs to fit in this internal memory, or your computations will suffer a big performance penalty.
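
    As a minimal sketch (see Using graphic cards for the authoritative instructions), a GPU is requested through Slurm's generic resources, optionally with one of the Slurm types from table 2 (job.sbatch is a placeholder for your own batch script):

        # request any single GPU
        sbatch --gres=gpu:1 job.sbatch

        # request one NVIDIA A40 specifically, using the Slurm type from table 2
        sbatch --gres=gpu:a40:1 job.sbatch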

    Memory

    All machines have large main memories for performing computations on big data sets. A job cannot use more than its allocated amount of memory; if it needs more, it will fail or be killed. It is not possible to combine the memory of multiple nodes for a single task. Note that 32-bit programs can only address (use) up to 3 GB of memory. See Submitting jobs for setting resources for batch jobs.
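
    As a minimal sketch (standard Slurm options; job.sbatch is a placeholder), the memory allocation is set when submitting the job:

        # request 16 GB of memory for the whole job ...
        sbatch --mem=16G job.sbatch

        # ... or 4 GB per allocated CPU
        sbatch --mem-per-cpu=4G job.sbatch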

    Storage

    DAIC compute nodes have direct access to the TU Delft home, group and project storage. You can use your TU Delft installed machine or an SCP or SFTP client to transfer files to and from these storage areas and others (see Data transfer), as demonstrated throughout this page.
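
    As a minimal sketch of such a transfer (the login hostname below is a placeholder; see SSH access and Data transfer for the actual hostnames):

        # copy a local file to your project storage via a login node
        scp results.tar.gz <YourNetID>@<daic-login-node>:/tudelft.net/staff-umbrella/<project>/

        # copy a file from your Linux home folder back to your local machine
        scp <YourNetID>@<daic-login-node>:/home/nfs/<YourNetID>/results.tar.gz .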

    File System Overview

    Unlike TU Delft's DelftBlue, DAIC does not have a dedicated storage filesystem. This means there is no /scratch space for storing temporary files (see DelftBlue's Storage description and Disk quota and scratch space). Instead, DAIC relies on a direct connection to the TU Delft network storage filesystem (see Overview data storage) from all its nodes, and offers the following types of storage areas:

    Personal storage (aka home folder)

    The Personal Storage is private and is meant to store personal files (program settings, bookmarks). A backup service protects your home files against both hardware failures and user error (you can restore previous versions of files from up to two weeks ago). The available space is limited by a quota (since this space is not meant for research data).

    You have two (separate) home folders: one for Linux and one for Windows (because Linux and Windows store program settings differently). You can access these home folders from a machine running Linux or Windows using a command line interface, or via a browser through TU Delft's webdata. For example, your Windows home contains a My Documents folder, which can be found on a Linux machine under /winhome/<YourNetID>/My Documents.

    Home directory | Access from | Storage location
    Linux home folder | Linux | /home/nfs/<YourNetID>
    Linux home folder | Windows | only accessible using an scp/sftp client (see SSH access)
    Linux home folder | webdata | not available
    Windows home folder | Linux | /winhome/<YourNetID>
    Windows home folder | Windows | H: or \\tudelft.net\staff-homes\[a-z]\<YourNetID>
    Windows home folder | webdata | https://webdata.tudelft.nl/staff-homes/[a-z]/<YourNetID>
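
    For example, to reach your Windows "My Documents" folder from a Linux node (a minimal sketch, assuming your Linux username equals your NetID):

        # list the contents of the Windows "My Documents" folder from Linux
        ls "/winhome/$USER/My Documents"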

    It’s possible to access the backups yourself. In Linux the backups are located under the (hidden, read-only) ~/.snapshot/ folder. In Windows you can right-click the H: drive and choose Restore previous versions.
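
    As a minimal sketch of restoring a file from a backup snapshot on Linux (the snapshot directory names depend on the backup schedule; <snapshot-name> and myfile.txt are placeholders):

        # list the available snapshots
        ls ~/.snapshot/

        # copy an older version of a file back into your home folder
        cp ~/.snapshot/<snapshot-name>/myfile.txt ~/myfile.txt.restored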

    Group storage

    The Group Storage is meant for sharing files (documents, educational and research data) with department/group members. The whole department or group has access to this storage, so it is not suitable for confidential or project data. A backup service protects the files, with previous versions of files from up to two weeks ago. A Fair-Use policy applies to the used space.

    Destination | Access from | Storage location
    Group Storage | Linux | /tudelft.net/staff-groups/<faculty>/<department>/<group> or /tudelft.net/staff-bulk/<faculty>/<department>/<group>/<NetID>
    Group Storage | Windows | M: or \\tudelft.net\staff-groups\<faculty>\<department>\<group> or L: or \\tudelft.net\staff-bulk\ewi\insy\<group>\<NetID>
    Group Storage | webdata | https://webdata.tudelft.nl/staff-groups/<faculty>/<department>/<group>/

    Project Storage

    The Project Storage is meant for storing (research) data (datasets, generated results, downloaded files and programs, …) for projects. Only the project members (including external persons) can access the data, so it is suitable for confidential data (though you may want to use encryption for highly sensitive confidential data). A backup service and a Fair-Use policy for the used space apply.

    Project leaders (or supervisors) can request a Project Storage location via the Self-Service Portal or the Service Desk.

    Destination | Access from | Storage location
    Project Storage | Linux | /tudelft.net/staff-umbrella/<project>
    Project Storage | Windows | U: or \\tudelft.net\staff-umbrella\<project>
    Project Storage | webdata | https://webdata.tudelft.nl/staff-umbrella/<project> or https://webdata.tudelft.nl/staff-bulk/<faculty>/<department>/<group>/<NetID>

    Local Storage

    Local storage is meant for temporary storage of (large amounts of) data with fast access on a single computer. You can create your own personal folder inside the local storage. Unlike the network storage above, local storage is only accessible on that computer, not on other computers or through network file servers or webdata. There is no backup service and no quota. The available space is large but fixed, so leave enough space for other users. Files under /tmp that have not been accessed for 10 days are automatically removed.

    Destination | Access from | Storage location
    Local storage | Linux | /tmp/<NetID>
    Local storage | Windows | not available
    Local storage | webdata | not available
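
    As a minimal sketch (assuming your Linux username equals your NetID; dataset.tar.gz and <project> are placeholders):

        # create your personal folder on the local disk of the current node
        mkdir -p "/tmp/$USER"

        # stage a dataset there for fast local access during a job
        cp /tudelft.net/staff-umbrella/<project>/dataset.tar.gz "/tmp/$USER/"

    The same pattern applies to the memory storage under /dev/shm/<NetID> described below.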

    Memory Storage

    Memory storage is meant for short-term storage of limited amounts of data with very fast access on a single computer. You can create your own personal folder inside the memory storage location. Memory storage is only accessible on that computer, and there is no backup service and no quota. The available space is limited and shared with running programs, so leave enough free space (the computer will likely crash if you don't!). Files that have not been accessed for 1 day are automatically removed.

    Destination | Access from | Storage location
    Memory storage | Linux | /dev/shm/<NetID>
    Memory storage | Windows | not available
    Memory storage | webdata | not available

    Workload scheduler

    DAIC uses the Slurm scheduler to efficiently manage workloads. All jobs for the cluster have to be submitted as batch jobs into a queue. The scheduler then manages and prioritizes the jobs in the queue, allocates resources (CPUs, memory) for the jobs, executes the jobs and enforces the resource allocations. See the job submission pages for more information.

    A Slurm-based cluster is composed of a set of login nodes that are used to access the cluster and submit computational jobs. A central manager orchestrates computational demands across a set of compute nodes. These nodes are organized logically into groups called partitions, which define job limits or access rights. The central manager provides fault-tolerant hierarchical communications to ensure optimal and fair use of the available compute resources by eligible users, and to make it easier to run and schedule complex jobs across compute resources (multiple nodes).
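
    As a minimal sketch of a batch job (a generic Slurm script; DAIC-specific settings such as partitions, accounts and quality-of-service are covered in the job submission pages and omitted here):

        #!/bin/sh
        #SBATCH --job-name=example
        #SBATCH --ntasks=1
        #SBATCH --cpus-per-task=2
        #SBATCH --mem=4G
        #SBATCH --time=00:10:00

        # the actual work: replace with your own program
        srun echo "Hello from $(hostname)"

    Save this as, for example, job.sbatch, submit it with sbatch job.sbatch, and monitor it with squeue -u "$USER".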