Computing Facilities

SmartData@PoliTO operates Politecnico’s BigData@Polito Cluster, built in 2015 as a pilot project to support the center’s research goals. The original cluster consisted of 36 nodes with a maximum capacity of 220 TB of data (around 650 TB of raw disk space). In the context of the HPC4AI project, financed by the Piemonte Region, the BigData@Polito Cluster will be upgraded in 2020.

The complete infrastructure will have more than 1,400 CPU cores, 15 TB of RAM and 8 PB of raw storage available to users.

Today this infrastructure supports several academic activities: in particular, it is the reference platform for a Master’s Degree course dedicated to Big Data with more than 300 students, and it is used daily by researchers from various departments.

The cluster architecture has been designed to be horizontally scalable. Scaling out means that, when the current resources become insufficient for the requested tasks, new hardware can be installed alongside the existing one to increase overall performance.

2020 BigData@Polito Cluster Upgrade

To draw up the fundamental requirements of the new architecture, a survey was conducted among all SmartData@PoliTO members. Based on the researchers’ feedback, the BigData@Polito Cluster will be extended with the following new capabilities:

  • 33 storage workers, each equipped with:
    • 216 TB of raw disk storage
    • 384 GB of RAM
    • Two CPUs with 18 cores/36 threads each
    • Two 25 GbE network interfaces
  • Introduction of 2 nodes equipped with 4 GPUs for experimentation
  • More than 50 GB/s of aggregate data reading and processing throughput
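
As a rough plausibility check of the 50 GB/s figure above (the per-node disk count and per-disk speed below are assumptions made for illustration, not published specifications): if each of the 33 new storage workers holds 12 × 18 TB disks (216 TB raw) and each disk sustains about 150 MB/s of sequential reads, the aggregate read bandwidth comes out close to the stated target.

    # Back-of-the-envelope estimate; disk count and per-disk speed are assumptions.
    workers = 33
    disks_per_worker = 12      # assumed: 12 x 18 TB = 216 TB raw per worker
    mb_per_s_per_disk = 150    # assumed sequential read speed of a single HDD

    aggregate_gb_s = workers * disks_per_worker * mb_per_s_per_disk / 1000
    print(f"~{aggregate_gb_s:.0f} GB/s aggregate read bandwidth")  # prints ~59 GB/s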

BigData@Polito Cluster Technical Features

The BigData@Polito Cluster consists of 85 servers and switches, organized into four racks across two data centers:

  • 33 PowerEdge R740xd (Storage Workers)
  • 18 PowerEdge R720xd (Storage Workers)
  • 2 PowerEdge R740 (GPU Workers)
  • 5 PowerEdge R640 (Master Nodes)
  • 3 PowerEdge R620 (Management Nodes)
  • 1 Synology – 120 TB RAID-6 (Master Backup Node)
  • 12 Supermicro 6027R-TRF (development and staging nodes)
  • 1 Dell Networking Z9100-ON (100 GbE Switch)
  • 6 Dell Networking S3048-ON (1 GbE Switches)

The rack cooling is based on a closed-architecture solution, an energy-efficient approach in which the cooling unit is integrated into the side of the rack without affecting the temperature of the room where the two racks are located. With this approach the air circulates only inside the rack and not throughout the whole room. This is an advantage for the cluster, because it is easier to maintain the desired temperature and energy waste is reduced.

Detailed Node Characteristics

3 Master Nodes: Dell PowerEdge R620

  • Processors: 2 × Intel E5-2630v2, 6 cores, 2.6 GHz
  • RAM: 128 GB DDR3 ECC, 1600 MHz
  • Disk space: 3 × 600 GB hard disks, hot-plug SAS 6 Gb/s, 2.5”, 10,000 rpm
  • Network interfaces: 5 × GbE ports (4 for the internal network + 1 for management)

18 Worker Nodes: Dell PowerEdge R720xd

  • Processors: 2 × Intel E5-2620v2, 6 cores, 2.6 GHz
  • RAM: 96 GB DDR3 ECC, 1600 MHz
  • Disk space: 12 × 3 TB hard disks, hot-plug SAS 6 Gb/s, 3.5”, 7,200 rpm
  • Network interfaces: 5 × GbE ports (4 for the internal network + 1 for management)

2 Worker Nodes: SuperMicro (from the DET cluster)

  • Processor: 1 × Intel Xeon, 6 cores, 2.5 GHz
  • RAM: 64 GB DDR3 ECC, 1600 MHz
  • Disk space: 5 × 2 TB hard disks, hot-plug SATA, 3.5”, 7,200 rpm
  • Network interfaces: 3 × GbE ports (2 for the internal network + 1 for management)

10 Worker Nodes: SuperMicro (from the DET cluster)

  • Processor: 1 × Intel Xeon, 6 cores, 2.5 GHz
  • RAM: 32 GB DDR3 ECC, 1600 MHz
  • Disk space: 5 × 2 TB hard disks, hot-plug SATA, 3.5”, 7,200 rpm
  • Network interfaces: 3 × GbE ports (2 for the internal network + 1 for management)

The Hadoop framework provides an implementation of the MapReduce programming paradigm, which broadly consists of splitting huge volumes of data into small chunks and processing each of them in parallel, distributing the computation across many machines.
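
As an illustration of the paradigm (not part of the cluster’s own codebase), the classic word count can be written as two small Python scripts driven by Hadoop Streaming, which feeds input splits to a mapper and the sorted intermediate pairs to a reducer:

    #!/usr/bin/env python3
    # mapper.py -- emit one "<word><TAB>1" pair for every word read from stdin
    import sys

    for line in sys.stdin:
        for word in line.strip().split():
            print(f"{word}\t1")

    #!/usr/bin/env python3
    # reducer.py -- sum the counts for each word (Hadoop sorts mapper output by key)
    import sys
    from itertools import groupby

    pairs = (line.rstrip("\n").split("\t", 1) for line in sys.stdin)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        print(f"{word}\t{sum(int(count) for _, count in group)}")

The same pair of scripts can be tested locally with "cat input.txt | ./mapper.py | sort | ./reducer.py" before being submitted as a Hadoop Streaming job on the cluster.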

The BigData@Polito Cluster is optimized for processing large quantities of data, taking into account high availability and continuity of service. To achieve that, the Hadoop framework is used to store and maintain data across the cluster: HDFS (the Hadoop Distributed File System) replicates each block of data across several different machines (e.g., 3 copies), making the data resilient and fault-tolerant.
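
For reference, the replication factor of a given HDFS path can be inspected and changed with the standard HDFS command-line tools; the snippet below is only an illustrative wrapper around them, and the path /data/example is hypothetical.

    # Illustrative only: assumes the "hdfs" client is on PATH and the path exists.
    import subprocess

    HDFS_PATH = "/data/example"  # hypothetical HDFS path

    # Ask HDFS to keep 3 replicas of every block under this path (-w waits for completion).
    subprocess.run(["hdfs", "dfs", "-setrep", "-w", "3", HDFS_PATH], check=True)

    # Report files, blocks and the datanodes holding each replica.
    report = subprocess.run(
        ["hdfs", "fsck", HDFS_PATH, "-files", "-blocks", "-locations"],
        capture_output=True, text=True, check=True,
    )
    print(report.stdout)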

One of the main drawbacks of Hadoop MapReduce is that it relies heavily on disk operations (each chunk of data is read from and written back to disk), resulting in slow computation times. The solution is to move to an in-memory processing approach, such as the one implemented by Spark. This strongly enhances performance, with improvements of up to 100 times in computation speed.
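
As a minimal sketch of the in-memory approach (assuming PySpark is available on the cluster; the application name and HDFS path below are hypothetical), a dataset can be cached in executor memory once and then reused by several actions without re-reading it from disk:

    # Minimal PySpark sketch; path and application name are illustrative.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("cache-demo").getOrCreate()

    # Read a (hypothetical) text file from HDFS and keep it in memory.
    lines = spark.read.text("hdfs:///data/sample.txt").cache()

    # The first action materializes the cache; later actions reuse the in-memory
    # data instead of going back to disk, which is where the speed-up comes from.
    total = lines.count()
    errors = lines.filter(lines.value.contains("ERROR")).count()

    print(f"{total} lines in total, {errors} containing 'ERROR'")
    spark.stop()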