# Taiwania3 $\alpha$ test note
## Hardware comparison
| | Taiwania 1 (192GB nodes) | Taiwania 1 (384GB nodes) | Taiwania 2 | Taiwania 3 (CPU nodes) | Taiwania 3 (GPU nodes) |
| :--------: | :--------: | :--------: | :--------: | :--------: | :--------: |
| CPU | Intel Xeon Gold 6148 (2.4GHz) | Intel Xeon Gold 6148 (2.4GHz) | Intel Xeon Gold 6154 (3.0GHz) | Intel Xeon Platinum 8280 (2.7GHz) | Intel Xeon Platinum 8280 (2.7GHz) |
| GPU | NA | NA | NVIDIA Tesla V100-SXM2-32GB | NA | NVIDIA Tesla V100-32GB (only 96 GPUs) |
| Memory | 192GB | 384GB | 768GB | 192GB | 192GB |
| Cores/node | 40 | 40 | 32 (36) | 56 | 56 |
| NVMe | NA | 480GB | NA | 3.2TB | 3.2TB |
| Network | Intel Omni-Path (100 Gb/s) | Intel Omni-Path (100 Gb/s) | InfiniBand EDR (100 Gb/s) | InfiniBand HDR100 (100 Gb/s) | InfiniBand HDR100 (100 Gb/s) |
| Nodes | 562 | 188 | 252 | 900 | 12 |
| Cores (GPUs) | 22480 | 7520 | 9072 (2016) | 50400 | 672 (96) |
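
The core and GPU totals in the last row can be cross-checked against the node counts. The sketch below is only an arithmetic check; the helper name `check_total` is made up for this note, and the figure of 8 GPUs per GPU node is taken from the `nvidia-smi` listings further down. Note that the Taiwania 2 total agrees with 36 cores per node (the value in parentheses), not 32.

```shell
# check_total NODES PER_NODE EXPECTED_TOTAL
# Prints "ok" when NODES * PER_NODE equals the total quoted in the table.
check_total() {
    total=$(( $1 * $2 ))
    if [ "$total" -eq "$3" ]; then
        echo "ok: $1 x $2 = $total"
    else
        echo "mismatch: $1 x $2 = $total, table says $3"
    fi
}

check_total 562 40 22480   # Taiwania 1, 192GB nodes
check_total 188 40 7520    # Taiwania 1, 384GB nodes
check_total 252 36 9072    # Taiwania 2, cores (36 per node)
check_total 252 8 2016     # Taiwania 2, GPUs (8 per node)
check_total 900 56 50400   # Taiwania 3, CPU nodes
check_total 12 56 672      # Taiwania 3, GPU nodes, cores
check_total 12 8 96        # Taiwania 3, GPU nodes, GPUs
```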
### lstopo
Login node:
![Login node](https://i.imgur.com/sFhH6Ez.png)
CPU computing node:
![CPU computing node](https://i.imgur.com/DEim4GG.png)
GPU computing node:
![GPU computing node](https://i.imgur.com/VTvXC8Q.png)
### lscpu
Taiwania3(Login/CPU Computing/GPU Computing node)
```
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 112
On-line CPU(s) list: 0-111
Thread(s) per core: 2
Core(s) per socket: 28
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz
Stepping: 7
CPU MHz: 2700.000
BogoMIPS: 5400.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 39424K
NUMA node0 CPU(s): 0-27,56-83
NUMA node1 CPU(s): 28-55,84-111
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 invpcid_single intel_ppin intel_pt ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp_epp pku ospke avx512_vnni md_clear spec_ctrl intel_stibp flush_l1d arch_capabilities
```
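
The `lscpu` fields above determine the core counts used elsewhere in this note. As a rough sketch (the function name `cpu_summary` is hypothetical), the physical and logical core counts can be derived from the socket, core, and thread fields:

```shell
# cpu_summary: read `lscpu` output on stdin and print the derived core counts.
cpu_summary() {
    awk -F: '
        /^Thread\(s\) per core/ { threads = $2 + 0 }
        /^Core\(s\) per socket/ { cores = $2 + 0 }
        /^Socket\(s\)/          { sockets = $2 + 0 }
        END {
            printf "physical=%d logical=%d\n", sockets * cores, sockets * cores * threads
        }'
}

# Sample input copied from the listing above.
# On a live node the equivalent would be: lscpu | cpu_summary
cpu_summary <<'EOF'
Thread(s) per core: 2
Core(s) per socket: 28
Socket(s): 2
EOF
```

For the values above this gives 56 physical cores (the cores/node figure in the hardware table) and 112 logical CPUs (the `CPU(s)` field).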
### numactl
Taiwania3(Login/CPU Computing/GPU Computing node)
```
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83
node 0 size: 191894 MB
node 0 free: 176448 MB
node 1 cpus: 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111
node 1 size: 193494 MB
node 1 free: 177814 MB
node distances:
node 0 1
0: 10 21
1: 21 10
```
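
To get the memory per node from this output, the per-NUMA-node sizes can be summed. A minimal sketch, assuming the `numactl --hardware` line format shown above (the function name `numa_total_mb` is made up for this note):

```shell
# numa_total_mb: read `numactl --hardware` output on stdin,
# print the sum of all "node N size:" values in MB.
numa_total_mb() {
    awk '$1 == "node" && $3 == "size:" { total += $4 } END { print total + 0 }'
}

# Sample input copied from the listing above.
# On a live node the equivalent would be: numactl --hardware | numa_total_mb
numa_total_mb <<'EOF'
node 0 size: 191894 MB
node 0 free: 176448 MB
node 1 size: 193494 MB
node 1 free: 177814 MB
EOF
```

For the listing above the two NUMA nodes add up to 385388 MB.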
### nvidia-smi
Taiwania3(GPU Computing node)
```
Tue Jan 12 17:40:26 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.05 Driver Version: 450.51.05 CUDA Version: 11.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla V100-SXM2... On | 00000000:1B:00.0 Off | 0 |
| N/A 27C P0 38W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 Tesla V100-SXM2... On | 00000000:1C:00.0 Off | 0 |
| N/A 27C P0 39W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 Tesla V100-SXM2... On | 00000000:3D:00.0 Off | 0 |
| N/A 28C P0 39W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 3 Tesla V100-SXM2... On | 00000000:3E:00.0 Off | 0 |
| N/A 27C P0 41W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 4 Tesla V100-SXM2... On | 00000000:B1:00.0 Off | 0 |
| N/A 26C P0 39W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 5 Tesla V100-SXM2... On | 00000000:B2:00.0 Off | 0 |
| N/A 26C P0 39W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 6 Tesla V100-SXM2... On | 00000000:DB:00.0 Off | 0 |
| N/A 27C P0 41W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 7 Tesla V100-SXM2... On | 00000000:DC:00.0 Off | 0 |
| N/A 28C P0 39W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
```
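
For scripted checks, the table view above is awkward to parse; `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader` prints one `name, <N> MiB` line per GPU instead. The sketch below summarises such lines. The function name `gpu_summary` is hypothetical, and the sample lines are reconstructed from the listing (the table truncates the device name, so the full name used here is an assumption).

```shell
# gpu_summary: read "name, <N> MiB" lines on stdin,
# print the number of GPUs and their combined memory.
gpu_summary() {
    awk -F', ' '
        { count++; mem += $2 + 0 }
        END { printf "gpus=%d total_mib=%d\n", count, mem }'
}

# Eight sample lines standing in for one GPU node.
# On a live node the equivalent would be:
#   nvidia-smi --query-gpu=name,memory.total --format=csv,noheader | gpu_summary
for i in 0 1 2 3 4 5 6 7; do
    echo "Tesla V100-SXM2-32GB, 32510 MiB"
done | gpu_summary
```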
### Network Topology
![](https://cos.twcc.ai/SYS-MANUAL/uploads/upload_978c56bb0ccf1e74301116e54b3b6432.jpeg)
#### Spine-Leaf Architecture
![](https://cos.twcc.ai/SYS-MANUAL/uploads/upload_81eb0cd554a7b4b174642287823d5fe2.jpeg)
Adapted from [Aruba](https://www.arubanetworks.com/spine-leaf-architecture/). See also [A chat about what Spine/Leaf is (in Chinese)](https://showipprotocols-tw.blogspot.com/2016/08/spine-leaf.html).
## Software
### OpenHPC software stack
### Queuing system
#### PBS to Slurm Conversion Cheat Sheet
User Commands
| User Commands | PBS | Slurm |
| :--------: | :--------: | :--------: |
|Job submission|qsub [script_file]|sbatch [script_file]|
|Job deletion|qdel [job_id]|scancel [job_id]|
|Job status (by job)|qstat [job_id]|squeue -j [job_id]|
|Job status (by user)|qstat -u [user_name]|squeue -u [user_name]|
|Job hold|qhold [job_id]|scontrol hold [job_id]|
|Job release|qrls [job_id]|scontrol release [job_id]|
|Queue list|qstat -Q|squeue|
|Node list|pbsnodes -l|sinfo -N OR scontrol show nodes|
|Cluster status|qstat -a|sinfo|
Environment
| Environment | PBS | Slurm |
| :--------: | :--------: | :--------: |
|Job ID|$PBS_JOBID|$SLURM_JOBID|
|Job NAME|$PBS_JOBNAME|$SLURM_JOB_NAME|
|Submit Directory|$PBS_O_WORKDIR|$SLURM_SUBMIT_DIR|
|Submit Host|$PBS_O_HOST|$SLURM_SUBMIT_HOST|
|Node List|$PBS_NODEFILE|$SLURM_JOB_NODELIST|
|Job Array Index|$PBS_ARRAYID|$SLURM_ARRAY_TASK_ID|
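
While scripts are being moved from Taiwania1 (PBS) to Taiwania3 (Slurm), it can help to read these variables with fallbacks so the same script runs under either scheduler. The following is a sketch only; the function name `job_context` and the `none` default are choices made for this example.

```shell
# job_context: print the job ID and submit directory,
# preferring the Slurm variables and falling back to the PBS ones.
job_context() {
    job_id=${SLURM_JOB_ID:-${SLURM_JOBID:-${PBS_JOBID:-none}}}
    submit_dir=${SLURM_SUBMIT_DIR:-${PBS_O_WORKDIR:-$PWD}}
    echo "job_id=$job_id submit_dir=$submit_dir"
}

# Outside a job this prints job_id=none and the current directory.
job_context
```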
Job Specifications
| Job Specification | PBS | Slurm |
| :--------: | :--------: | :--------: |
|Script directive|#PBS|#SBATCH|
|Queue/Partition|-q [name]|-p [name] *Best to let Slurm pick the optimal partition|
|Node Count|-l nodes=[count]|-N [min[-max]] *Autocalculates this if just task # is given|
|Total Task Count|-l ppn=[count] OR -l mppwidth=[PE_count]|-n [count] OR --ntasks=[count]|
|Wall Clock Limit|-l walltime=[hh:mm:ss]|-t [min] OR -t [days-hh:mm:ss]|
|Standard Output File|-o [file_name]|-o [file_name]|
|Standard Error File|-e [file_name]|-e [file_name]|
|Combine stdout/err|-j oe (both to stdout) OR -j eo (both to stderr)|(use -o without -e)|
|Copy Environment|-V|--export=[ALL \| NONE \| variables]|
|Event Notification|-m abe|--mail-type=[events]|
|Email Address|-M [address]|--mail-user=[address]|
|Job Name|-N [name]|--job-name=[name]|
|Job Restart|-r [y \| n]|--requeue OR --no-requeue|
|Resource Sharing|-l naccesspolicy=singlejob|--exclusive OR --oversubscribe|
|Memory Size|-l mem=[MB]|--mem=[mem][M / G / T] OR --mem-per-cpu=[mem][M / G / T]|
|Accounts to charge|-A OR -W group_list=[account]|--account=[account] OR -A|
|Tasks Per Node|-l mppnppn [PEs_per_node]|--ntasks-per-node=[count]|
|CPUs Per Task| |--cpus-per-task=[count]|
|Job Dependency|-d [job_id]|--dependency=[state:job_id]|
|Quality of Service|-l qos=[name]|--qos=[normal \| high]|
|Job Arrays|-t [array_spec]|--array=[array_spec]|
|Generic Resources|-l other=[resource_spec]|--gres=[resource_spec]|
|Job Enqueue Time|-a "YYYY-MM-DD HH:MM:SS"|--begin=YYYY-MM-DD[THH:MM[:SS]]|
|Never requeue| |--no-requeue|
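
As a starting point for porting job scripts, the one-to-one rows of the cheat sheet can be applied mechanically. The sketch below is not a complete converter: `pbs_to_slurm` is a hypothetical helper that rewrites only a handful of simple directives and flags every other `#PBS` line (for example the `-l select=...` resource request or `-j oe`) for manual conversion.

```shell
# pbs_to_slurm: read a PBS job script on stdin, write a partially converted
# Slurm script on stdout. Only directives with a direct equivalent are rewritten.
pbs_to_slurm() {
    sed -E \
        -e 's/^#PBS -N (.+)$/#SBATCH --job-name=\1/' \
        -e 's/^#PBS -q (.+)$/#SBATCH --partition=\1/' \
        -e 's/^#PBS -l walltime=(.+)$/#SBATCH --time=\1/' \
        -e 's/^#PBS -o (.+)$/#SBATCH --output=\1/' \
        -e 's/^#PBS -e (.+)$/#SBATCH --error=\1/' \
        -e 's/^#PBS -M (.+)$/#SBATCH --mail-user=\1/' \
        -e 's/^#PBS (.*)$/# TODO(manual): #PBS \1/'
}

# Example with a few directives in the style of the Taiwania1 script.
pbs_to_slurm <<'EOF'
#PBS -N mpijob
#PBS -q ctest
#PBS -l walltime=00:30:00
#PBS -j oe
EOF
```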
Taiwania1 submission script example
```
#!/bin/bash
###############################################
# Intel MPI job script example #
###############################################
#PBS -l select=2:ncpus=40:mpiprocs=40
#PBS -N mpijob
#PBS -q ctest
#PBS -P TRI654321
#PBS -j oe
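# Run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"
module load intel/2018_u1
export I_MPI_HYDRA_BRANCH_COUNT=-1
# Launch the MPI program over the PSM2 (Omni-Path) fabric
mpiexec.hydra -PSM2 ./cpi.exe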
```
Taiwania3 submission script example
```
#!/bin/bash
# Detailed version
#SBATCH --account=ENT108161 # (-A) Account/project number
#SBATCH --job-name=hello_world # (-J) Job name
#SBATCH --partition=test # (-p) Specific slurm partition
#SBATCH --mail-type=END,FAIL # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=user@mybox.mail # Where to send mail. Set this to your email address
#SBATCH --ntasks=24 # (-n) Number of MPI tasks (i.e. processes)
#SBATCH --cpus-per-task=1 # (-c) Number of cores per MPI task
#SBATCH --nodes=2 # (-N) Number of nodes to be allocated
#SBATCH --ntasks-per-node=12 # Maximum number of tasks on each node
#SBATCH --ntasks-per-socket=6 # Maximum number of tasks on each socket
#SBATCH --distribution=cyclic:cyclic # (-m) Distribute tasks cyclically first among nodes and then among sockets within a node
#SBATCH --mem-per-cpu=600mb # Memory (i.e. RAM) per processor
#SBATCH --time=00:05:00 # (-t) Wall time limit (days-hrs:min:sec)
#SBATCH --output=%j.log # (-o) Path to the standard output file, relative to the working directory (%j is the job ID)
#SBATCH --error=%j.err # (-e) Path to the standard error file
#SBATCH --nodelist=cpn[3001-3002] # (-w) specific list of nodes
module load compiler/intel/2020u4 IntelMPI/2020
mpiexec.hydra -bootstrap slurm -n 24 /home/user/bin/intel-hello
```
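
The task layout requested above (24 tasks on 2 nodes, at most 12 per node and 6 per socket) can be sanity-checked before submission. The sketch below is an illustration only: `check_geometry` is a hypothetical helper, and the value of 2 sockets per node comes from the `lscpu` output earlier in this note.

```shell
# check_geometry NTASKS NODES TASKS_PER_NODE SOCKETS_PER_NODE TASKS_PER_SOCKET
# Checks that the per-node and per-socket limits leave room for all tasks.
check_geometry() {
    per_node=$3
    by_socket=$(( $4 * $5 ))
    # The effective per-node limit is the smaller of the two caps.
    if [ "$by_socket" -lt "$per_node" ]; then
        per_node=$by_socket
    fi
    capacity=$(( $2 * per_node ))
    if [ "$capacity" -ge "$1" ]; then
        echo "ok: capacity $capacity >= ntasks $1"
    else
        echo "too small: capacity $capacity < ntasks $1"
        return 1
    fi
}

# Values from the script above: --ntasks=24 --nodes=2
# --ntasks-per-node=12 --ntasks-per-socket=6, with 2 sockets per node.
check_geometry 24 2 12 2 6
```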
| PARTITION | NODES | NODELIST |
| :--------: | :--------: | :--------: |
| cpu | 900 | cpn[3001-3900] |
| test | 64 | cpn[3001-3064] |
| bgm | 4 | bgm[3001-3004] |
| gpu | 12 | gpn[3001-3012] |
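
The NODES column follows directly from the bracketed ranges in NODELIST. On the cluster itself, `scontrol show hostnames 'cpn[3001-3002]'` expands such an expression into one hostname per line. Away from the cluster, a small sketch like the one below can count the hosts; `count_nodes` is a hypothetical helper that handles only the single `prefix[first-last]` form used in this table, with no leading zeros.

```shell
# count_nodes 'prefix[first-last]': print the number of hosts in the range.
count_nodes() {
    range=${1#*\[}       # drop everything up to "[", leaving "first-last]"
    range=${range%\]}    # drop the trailing "]"
    first=${range%-*}
    last=${range#*-}
    echo $(( last - first + 1 ))
}

count_nodes 'cpn[3001-3900]'   # cpu partition
count_nodes 'cpn[3001-3064]'   # test partition
count_nodes 'bgm[3001-3004]'   # bgm partition
count_nodes 'gpn[3001-3012]'   # gpu partition
```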
## Reference
### lstopo
Taiwania1
Taiwania2(Login node)
![Login node](https://cos.twcc.ai/SYS-MANUAL/uploads/upload_8a07674d89dea76a7f1ad6190e48d14f.png)
Taiwania2(Computing node)
![Computing node](https://cos.twcc.ai/SYS-MANUAL/uploads/upload_8d2c08eed741406a23a16244921928dc.png)
### lscpu
Taiwania1(Login/Computing node)
```
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 40
On-line CPU(s) list: 0-39
Thread(s) per core: 1
Core(s) per socket: 20
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
Stepping: 4
CPU MHz: 2400.000
BogoMIPS: 4800.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 28160K
NUMA node0 CPU(s): 0-19
NUMA node1 CPU(s): 20-39
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch arat epb pln pts dtherm intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local
```
Taiwania2(Login/Computing node)
```
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 72
On-line CPU(s) list: 0-71
Thread(s) per core: 2
Core(s) per socket: 18
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz
Stepping: 4
CPU MHz: 3000.000
BogoMIPS: 6000.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 25344K
NUMA node0 CPU(s): 0-17,36-53
NUMA node1 CPU(s): 18-35,54-71
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 intel_ppin intel_pt mba tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local ibpb ibrs stibp dtherm ida arat pln pts hwp_epp pku ospke spec_ctrl intel_stibp
```
### numactl
Taiwania2(Login/Computing node)
```
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53
node 0 size: 391828 MB
node 0 free: 333111 MB
node 1 cpus: 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
node 1 size: 393216 MB
node 1 free: 318520 MB
node distances:
node 0 1
0: 10 21
1: 21 10
```
### nvidia-smi
Taiwania2(Login/Computing node)
```
Tue Jan 12 17:34:37 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.80.02 Driver Version: 450.80.02 CUDA Version: 11.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla V100-SXM2... On | 00000000:1B:00.0 Off | 0 |
| N/A 27C P0 42W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 Tesla V100-SXM2... On | 00000000:1C:00.0 Off | 0 |
| N/A 24C P0 42W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 Tesla V100-SXM2... On | 00000000:3D:00.0 Off | 0 |
| N/A 26C P0 42W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 3 Tesla V100-SXM2... On | 00000000:3E:00.0 Off | 0 |
| N/A 28C P0 40W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 4 Tesla V100-SXM2... On | 00000000:B1:00.0 Off | 0 |
| N/A 27C P0 41W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 5 Tesla V100-SXM2... On | 00000000:B2:00.0 Off | 0 |
| N/A 29C P0 42W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 6 Tesla V100-SXM2... On | 00000000:DB:00.0 Off | 0 |
| N/A 28C P0 42W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 7 Tesla V100-SXM2... On | 00000000:DC:00.0 Off | 0 |
| N/A 26C P0 41W / 300W | 0MiB / 32510MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
```