
GTC16 Newsletter

GTC16 Wrap-Up, OpenPOWER Foundation and Deep Learning  
Microway recently exhibited at the NVIDIA GPU Technology Conference in San Jose, California. It was an exciting week with a focus on GPU-accelerated applications, a GPU appliance built on the NVIDIA Tesla “Pascal” architecture, and technical talks from many of the world’s leading researchers. GPUs are taking over deep learning, data analytics, cloud computing workloads, virtual reality, and visualization. Fields being transformed by GPUs include the automotive and aerospace industries, biotech, computational finance, and many of the sciences.
 
At our booth we showed our “Day Dreams of a Machine” demos. These demos use Google’s Deep Dream algorithm to amplify the features a trained image-classification neural network has learned, turning seed images into surreal, dream-like pictures. See for yourself: machines day dream on clouds in the sky, and then dreamily contemplate a row of DIMMs.
Have you ever asked your computer what it sees in the sky?
DGX-1 Deep Learning System Announced at GTC
The most significant announcement during the keynote by NVIDIA’s CEO, Jen-Hsun Huang, was the DGX-1 Deep Learning System. This product is the world’s most advanced dedicated Deep Learning supercomputer for training neural networks. A training process that could take 150 hours on a non-GPU-accelerated cluster can now be completed in as little as two hours on the DGX-1.
 
The DGX-1 Deep Learning appliance contains eight NVIDIA Tesla® P100 “Pascal” accelerators incorporating NVLink™ technology. These provide 170 teraflops of half-precision processing power, potentially replacing up to 250 non-GPU-accelerated nodes. Employing Ubuntu® Linux and pre-built Docker™ containers with GPU-accelerated Deep Learning frameworks pre-installed, the 3U system arrives ready to run. It is built for data-intensive workloads with 512GB of system memory, 128GB of high-bandwidth GPU memory, and 7.68TB of high-speed SSD-based cache.
 
For more information, please visit our DGX-1 product page. Microway is taking orders now for deliveries starting in June. Also, be sure to check out Jen-Hsun’s GTC16 keynote address at ustream.tv; it could easily be subtitled “The Future is NOW.”
"Pascal" is now public!
The DGX-1 system is built with NVIDIA’s new Tesla P100 GPUs, utilizing the NVIDIA “Pascal” GPU architecture. NVLink technology connects the GPUs together and makes it possible to deliver significant speedups on HPC applications. Tesla P100 delivers 5.3 TFLOPS double-precision, 10.6 TFLOPS single-precision, and 21.2 TFLOPS half-precision performance, with 16GB of high-bandwidth ECC memory providing 720 GB/s of bandwidth.
 
The NVLink high-speed GPU interconnect is bidirectional and delivers 5x higher bandwidth than PCI-Express, which provides only about 12GB/s in practice. In the hybrid cube mesh NVLink topology, up to eight “Pascal” GPUs communicate directly with one another. These GPUs can also communicate directly with OpenPOWER CPUs, enabling both GPUs and CPUs to share and directly access each other’s memory.
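
To give a feel for what this direct GPU-to-GPU path looks like to a programmer, here is a minimal sketch using the standard CUDA peer-to-peer calls. This is illustrative code of our own, not part of the DGX-1 software stack; it assumes two peer-capable GPUs, the 64MB buffer size is arbitrary, and error checking is omitted for brevity:

    #include <cstdio>
    #include <cuda_runtime.h>

    // Copy a buffer directly from GPU 0 to GPU 1. With peer access enabled on
    // NVLink-connected "Pascal" GPUs, the transfer bypasses PCI-Express.
    int main() {
        const size_t bytes = 64 << 20;            // arbitrary 64 MB test buffer
        int canAccess = 0;
        cudaDeviceCanAccessPeer(&canAccess, 1, 0);
        if (!canAccess) { printf("GPU 1 cannot directly access GPU 0\n"); return 1; }

        void *buf0 = nullptr, *buf1 = nullptr;
        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);         // let device 0 reach device 1
        cudaMalloc(&buf0, bytes);
        cudaSetDevice(1);
        cudaDeviceEnablePeerAccess(0, 0);         // and vice versa
        cudaMalloc(&buf1, bytes);

        cudaMemcpyPeer(buf1, 1, buf0, 0, bytes);  // direct device-to-device copy
        cudaDeviceSynchronize();

        cudaFree(buf1);
        cudaSetDevice(0);
        cudaFree(buf0);
        return 0;
    }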
 
Compared to the previous “Maxwell” microarchitecture, the Tesla P100 delivers 3x the memory performance, thanks to the Chip-on-Wafer-on-Substrate (CoWoS) process with HBM2 stacked memory. For data-intensive applications, this improved memory bandwidth realizes significant performance gains.
 
Programming “Pascal” GPUs has been made easier with the introduction of the Page Migration Engine. Data can be loaded into memory (either on the CPUs or on the GPUs) and seamlessly shared between all devices. This allows software engineers to focus on computing performance instead of the intricacies of data transfers. It also enables applications to work on data sets larger than the GPU's physical memory. CUDA version 8 is required to leverage these new features and to program the new Pascal architecture.
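
To illustrate the programming model, here is a minimal sketch of CUDA 8 managed (unified) memory. The kernel, array size, and launch configuration are illustrative choices of our own, not code shipped with any NVIDIA product, and error checking is omitted:

    #include <cstdio>
    #include <cuda_runtime.h>

    // Toy kernel: scale a vector in place on the GPU.
    __global__ void scale(float *x, float a, size_t n) {
        size_t i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= a;
    }

    int main() {
        const size_t n = 1 << 24;
        float *x = nullptr;

        // One allocation, visible to both CPU and GPU; the Page Migration
        // Engine migrates pages to whichever processor touches them.
        cudaMallocManaged(&x, n * sizeof(float));

        for (size_t i = 0; i < n; ++i) x[i] = 1.0f;   // initialized on the CPU
        scale<<<(n + 255) / 256, 256>>>(x, 2.0f, n);  // processed on the GPU
        cudaDeviceSynchronize();                      // wait before reading back

        printf("x[0] = %f\n", x[0]);                  // CPU reads the result directly
        cudaFree(x);
        return 0;
    }

Note that there are no explicit cudaMemcpy calls: the same pointer is valid on the host and the device, and pages migrate on demand.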
 
The New Tesla M40 GPU, Enhanced with 24GB of GDDR5 Memory
As GTC opened, NVIDIA announced the release of its enhanced 24GB Tesla M40 Deep Learning GPU. Like the 12GB Tesla M40, the enhanced M40 provides an exceptional 7.0 TFLOPS of single-precision compute performance. The GDDR5 memory on the 24GB M40 comes with ECC enabled, although users who prefer maximum memory capacity and bandwidth over error protection can disable it.
 
The Tesla M40 24GB is ideal for cost-effective clusters running Deep Learning workloads. It provides more memory than any other compute GPU on the market. More detailed information is available on Microway’s HPC Tech Tips blog.

 
Intel Releases New Dual-Socket Xeon E5-2600v4 Processors
On March 31, Intel announced its new Xeon E5-2600v4 series processors. They provide more CPU cores, more cache, faster memory access and more efficient operation. These are based upon the Intel microarchitecture code-named “Broadwell” – we expect them to be the HPC processors of choice in the future.

The highest core count now reaches 22 cores per CPU, compared to 18 cores for the previous generation. The DDR4 memory controllers operate at 2400MHz, compared to 2133MHz for “Haswell”. Almost all new Microway HPC Cluster and WhisperStation builds now incorporate the new CPUs, although we still offer the previous generations.
 
While the HPC community awaits further benchmark results, performance gains are estimated to range from 10% to over 40%, depending on the application. For a more in-depth review of the Xeon “Broadwell” CPUs, see Microway’s HPC Tech Tips blog post “Xeon E5-2600v4 Broadwell Review”. If you are interested in taking a test drive with the newest Broadwell CPUs on Microway’s benchmark cluster, please sign up here.
More Software Applications Now GPU Accelerated
NVIDIA has updated its ever-growing list of major software applications that have been accelerated on NVIDIA GPUs. For a full list of ported applications, visit NVIDIA’s applications catalog.
 
The growing list of GPU-accelerated Deep Learning frameworks includes: Caffe, Chainer, DL4J, Julia, Keras, MatConvNet, Microsoft CNTK, Minerva, mxnet, Nervana Systems Neon, OpenDeep, Purine, Pylearn2, Google TensorFlow, Theano, Torch, and Trakomatic OSense and OTrack.
 
To take a test drive on Microway’s GPU cluster, visit our test drive registration page.

 
Benchmark Results for DDR4 Memory on Xeon CPUs with 3 DIMMs Per Channel
Microway engineering recently used STREAM to test the memory bandwidth of Xeon CPUs configured with one, two, and three DIMMs per channel. The goal of this benchmarking study was to determine how different numbers of DIMMs affect memory bandwidth performance. See more benchmarking results on our HPC Tech Tips blog.
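
For readers unfamiliar with STREAM, the bandwidth number it reports comes from timing simple vector kernels. Below is a simplified, single-threaded host-code sketch of its “triad” kernel; the real benchmark is multi-threaded, repeats each kernel many times, and is more careful about timing, and the array size here is an arbitrary illustrative choice:

    #include <chrono>
    #include <cstdio>
    #include <vector>

    int main() {
        const size_t n = 1 << 25;                     // ~32M doubles per array
        std::vector<double> a(n), b(n, 1.0), c(n, 2.0);
        const double s = 3.0;

        auto t0 = std::chrono::steady_clock::now();
        for (size_t i = 0; i < n; ++i) a[i] = b[i] + s * c[i];   // STREAM "triad"
        auto t1 = std::chrono::steady_clock::now();

        double secs   = std::chrono::duration<double>(t1 - t0).count();
        double gbytes = 3.0 * n * sizeof(double) / 1e9;          // two loads + one store
        printf("Triad bandwidth: %.1f GB/s\n", gbytes / secs);
        return 0;
    }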
Microway Joins the OpenPOWER Foundation

We’re excited to announce that Microway has joined the OpenPOWER Foundation as a Silver member. We will be offering server systems and HPC clusters with OpenPOWER technologies. We are also offering our HPC software tools on OpenPOWER. Our experts ensure that Microway systems “just work”, so expect nothing less from our OpenPOWER offerings.

We are currently offering two OpenPOWER servers based on the IBM Power® Systems products. The first, designed for compute, provides dual POWER8™ CPUs and dual Tesla® GPUs. The second, designed for Hadoop, Spark, and other data analytics workloads, provides a single POWER8 CPU along with a large storage capacity. These products are described further on our POWER Technology page.


Those who’d like to see the performance for themselves can request access to a bare-metal Test Drive system with 10-core POWER8 CPUs and NVIDIA Tesla K80 GPUs. Access is limited, but we want to foster an open atmosphere – both to demonstrate the capabilities of OpenPOWER and to support as many applications as possible.

Please follow the links we've provided for more detailed information on the exciting progress and products in the HPC world.

As always, if you have any questions, Microway's experts are happy to offer advice and share our technical expertise. 
Feel free to call us when you design your next cluster or WhisperStation.
Contact Us
sales@microway.com
(508) 746-7341
GSA Schedule GS-35F-0431N
Eliot Eshelman (508) 732-5534
Ed Hinkel (508) 732-5523
John Murphy (508) 732-5542
Samantha Wheeler (508) 732-5526
Copyright © 2016 Microway Inc., All rights reserved.

