The Intel® Xeon Phi™ processor is a bootable host processor that delivers massive parallelism and vectorization to support the most demanding high-performance computing applications. The integrated and power-efficient architecture delivers significantly more compute per unit of energy consumed versus comparable platforms to give you an improved total cost of ownership.¹ The integration of memory and fabric topples the memory wall and reduces cost to help you solve your biggest challenges faster.
With up to 72 out-of-order cores, the new Intel® Xeon Phi™ processor delivers over 3 teraFLOPS (floating-point operations per second) of peak double-precision performance while providing 3.5 times higher performance per watt than the previous generation.² ³ As a bootable CPU with an integrated architecture, the Intel® Xeon Phi™ processor eliminates PCIe* bottlenecks, includes on-package high-bandwidth memory, and offers optional integrated Intel® Omni-Path Fabric (Intel® OP Fabric) architecture to deliver fast, low-latency performance.
The Intel® Xeon Phi™ processor simplifies code modernization and reduces programming costs by sharing code and a developer base with Intel® Xeon® processors. Standardizing on a unified Intel® architecture means you can use a single programming model for all your code, which lowers operational expenses as well.
Take advantage of the Intel® Xeon Phi™ processor’s common x86 architecture to get amazing utilization across any workload. The broad ecosystem of partners and robust roadmap you get by building on Intel® architecture allows for scalability, easy flexibility, and long-term support in compute, memory/storage, I/O, and software.
Explore how Intel® solutions for Lustre* software can unleash the performance and scalability of the Lustre* parallel file system and how the Intel® Solid State Drive Data Center Family can take storage performance to new heights.
Based on comparison with a 2-socket Intel® Xeon® processor E5-2697 v4 system running DGEMM. The Intel® Xeon Phi™ processor 7250 was measured at 2070 GFLOPS at 215 W, vs. 1054 GFLOPS at 290 W on the E5-2697 v4 system. Source: Intel measured or estimated as of March 2016.
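The comparison above can be reduced to a GFLOPS-per-watt figure for each system. The following is a minimal sketch of that arithmetic, assuming each pair quoted in the footnote is measured DGEMM GFLOPS over watts; the variable and function names are illustrative and do not come from Intel.

```python
# Measured DGEMM figures quoted in the footnote: (GFLOPS, watts).
# Reading each pair as "GFLOPS over watts" is an assumption.
measured = {
    "Xeon Phi 7250": (2070.0, 215.0),
    "2 x Xeon E5-2697 v4": (1054.0, 290.0),
}


def gflops_per_watt(gflops, watts):
    """Energy efficiency as sustained GFLOPS divided by power draw in watts."""
    return gflops / watts


efficiency = {name: gflops_per_watt(g, w) for name, (g, w) in measured.items()}
ratio = efficiency["Xeon Phi 7250"] / efficiency["2 x Xeon E5-2697 v4"]

for name, value in efficiency.items():
    print(f"{name}: {value:.2f} GFLOPS/W")
print(f"efficiency ratio: {ratio:.2f}x")
```

Under these assumptions the Intel® Xeon Phi™ processor 7250 comes out at roughly 9.6 GFLOPS/W against roughly 3.6 GFLOPS/W for the 2-socket system.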
Intel® Xeon® E5-2697 v4 Configuration Parameters:
1-Node, 2 x Intel® Xeon® processor E5-2697 v4 on Grantley-EP (Wellsburg) with 128 GB Total Memory on Red Hat Enterprise Linux* 7.1 kernel 3.10.0-229 using stream_omp v5.4 with Intel compiler 126.96.36.199 with following command: icc stream_omp.c -O3 -openmp -o stream_omp -static -freestanding -o stream_omp_v5.4_IC188.8.131.52_80M.
Intel® Xeon Phi™ Processor Configuration Parameters:
Platform Used Inside Intel for Testing: Intel Adams Pass Product Concept Board (ADP PC), 96 GB DDR4 (6 x 16GB @ 2133 MHz)
BIOS: CRB BIOS 08.R00.RC085
Processors used for this edition:
OS: RHEL* 7
Kernel options: noreplace-paravirt idle=halt mce=on
Environment Variable(s): See how each individual workload was executed for specific environment variables
KNL Self Boot Software Package MPSP 1.2.2
MICPERF 1.3.0 Early Release
ComposerXE 2016 or equivalent redistributable package installed
MKL-based HPL Package 11.3.2.009
Intel MPI version 5.1.2-150
DGEMM: 20000 x 20000 or 26000 x 26000
SGEMM: 30000 x 30000
LINPACK Problem Size: 100000
Claim based on calculated theoretical peak double-precision performance capability for a single coprocessor: 16 DP FLOPS/clock/core * 61 cores * 1.238 GHz = 1.208 teraFLOPS. For the Intel® Xeon Phi™ processor: 32 DP FLOPS/clock/core * 68 cores * 1.4 GHz = 3.046 teraFLOPS.
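The peak figures above are a product of three factors. As a minimal sketch, assuming the per-core factor means double-precision floating-point operations per clock per core (the helper name peak_tflops is hypothetical):

```python
def peak_tflops(flops_per_clock_per_core, cores, clock_ghz):
    """Theoretical peak in teraFLOPS.

    FLOPs per clock per core x cores x GHz gives GFLOPS; divide by 1000 for TFLOPS.
    """
    return flops_per_clock_per_core * cores * clock_ghz / 1000.0


# Inputs are the values quoted in the footnote.
coprocessor_peak = peak_tflops(16, 61, 1.238)  # previous-generation coprocessor
processor_peak = peak_tflops(32, 68, 1.4)      # Intel Xeon Phi processor

print(f"coprocessor: {coprocessor_peak:.3f} TFLOPS")
print(f"processor:   {processor_peak:.3f} TFLOPS")
```

This reproduces the quoted 1.208 and 3.046 teraFLOPS. These are calculated ceilings, not measured results.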
Benchmark results were obtained prior to implementation of recent software patches and firmware updates intended to address exploits referred to as "Spectre" and "Meltdown". Implementation of these updates may make these results inapplicable to your device or system.
Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit https://www.intel.sg/benchmarks.
The current Intel® Xeon Phi™ coprocessor achieves 1.208 TFLOPS at 300 W (0.004027 TF/W) vs. the Intel® Xeon Phi™ processor 7250, which achieves 3.046 TFLOPS at 215 W (0.014167 TF/W). 0.014167/0.004027 = 3.5.
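The 3.5 times figure follows from dividing the two performance-per-watt values. A minimal sketch of that calculation, using only the numbers quoted above (the function and variable names are illustrative):

```python
def tflops_per_watt(tflops, watts):
    """Peak teraFLOPS divided by the power figure in watts."""
    return tflops / watts


# Theoretical peak and power figures as quoted in the footnote.
coprocessor_eff = tflops_per_watt(1.208, 300.0)
processor_eff = tflops_per_watt(3.046, 215.0)
improvement = processor_eff / coprocessor_eff

print(f"coprocessor: {coprocessor_eff:.6f} TF/W")
print(f"processor:   {processor_eff:.6f} TF/W")
print(f"improvement: {improvement:.1f}x")
```

Note that this ratio is built from theoretical peak values, so it describes a ceiling on the efficiency gain, not a measured workload result.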