The Intel® Xeon Phi™ processor is a bootable host processor that delivers massive parallelism and vectorization to support the most demanding high-performance computing applications. The integrated and power-efficient architecture delivers significantly more compute per unit of energy consumed versus comparable platforms to give you an improved total cost of ownership.¹ The integration of memory and fabric topples the memory wall and reduces cost to help you solve your biggest challenges faster.

Features and Benefits

Solve Challenges Faster

With up to 72 out-of-order cores, the new Intel® Xeon Phi™ processor delivers more than 3 teraFLOPS (trillion floating-point operations per second) of peak double-precision performance while providing 3.5 times higher performance per watt than the previous generation.² ³ As a bootable CPU with an integrated architecture, the Intel® Xeon Phi™ processor eliminates PCIe* bottlenecks, includes on-package high-bandwidth memory, and is available with integrated Intel® Omni-Path Fabric (Intel® OP Fabric) architecture to deliver fast, low-latency performance.

Realize Unmatched Value

The Intel® Xeon Phi™ processor simplifies code modernization and reduces programming costs by sharing code and a developer base with Intel® Xeon® processors. Standardizing on a unified Intel® architecture lets you use a single programming model for all your code, which lowers both operational and programming expenses.

Maximize Future Potential

Take advantage of the Intel® Xeon Phi™ processor’s common x86 architecture to get high utilization across your workloads. Building on Intel® architecture gives you a broad ecosystem of partners and a robust roadmap, which allow for scalability, flexibility, and long-term support in compute, memory/storage, I/O, and software.

Benchmarks for Intel® Xeon Phi™ Processors

See complete speed, performance, and configuration specs.

Elements of Intel® Scalable System Framework (Intel® SSF)

Fuel your insight with a well-balanced, scalable, and flexible system design capable of supporting both compute- and data-intensive workloads.


Intel® Xeon® processors and Intel® Xeon Phi™ processors include a variety of cutting-edge technologies that help to improve parallel throughput and overall performance, while reducing energy consumption.

Intel® Xeon Phi™ product family

Intel® Xeon® processor E5 family

Memory and Storage

Explore how Intel® solutions for Lustre* software can unleash the performance and scalability of the Lustre* parallel file system and how the Intel® Solid State Drive Data Center Family can take storage performance to new heights.

Intel® solutions for Lustre* software

Intel® Solid State Drive Data Center Family


Intel® Omni-Path Architecture (Intel® OPA) offers scalability and cost advantages that will increase with every new Intel® platform generation.

Intel® Omni-Path Architecture (Intel® OPA)


Speed your time to insight with Intel® HPC Orchestrator and achieve efficiencies through reliable HPC system management software. Boost application performance with Intel's software developer tools.

Intel® HPC Orchestrator

Intel® Software tools for HPC

Product and Performance Information


Based on comparison with a system with a 2-socket E5-2697 v4 running DGEMM. Intel® Xeon Phi™ 7250 was measured as 2070/215 (GFLOP/Watt) vs. 1054/290 (GFLOP/Watt) on the E5-2697 v4. Source: Intel measured or estimated as of March 2016. 
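The comparison above can be reproduced with a minimal Python sketch. It assumes each quoted pair is measured DGEMM throughput in GFLOPS over power in watts, which the footnote does not spell out; the function and variable names are illustrative only.

```python
def gflops_per_watt(gflops: float, watts: float) -> float:
    """Return energy efficiency as GFLOPS delivered per watt."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return gflops / watts


# Values quoted in the footnote, read here as (GFLOPS, watts); that reading is an assumption.
phi_7250 = gflops_per_watt(2070, 215)
e5_2697_v4 = gflops_per_watt(1054, 290)
efficiency_ratio = phi_7250 / e5_2697_v4

print(f"Xeon Phi 7250:       {phi_7250:.2f} GFLOPS/W")
print(f"2-socket E5-2697 v4: {e5_2697_v4:.2f} GFLOPS/W")
print(f"Ratio:               {efficiency_ratio:.2f}x")
```

Under that reading, the figures work out to roughly 9.6 and 3.6 GFLOPS per watt, a ratio of about 2.6.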

Configuration Details:

Intel® Xeon® E5-2697 v4 Configuration Parameters:

1-Node, 2 x Intel® Xeon® processor E5-2697 v4 on Grantley-EP (Wellsburg) with 128 GB Total Memory on Red Hat Enterprise Linux* 7.1 kernel 3.10.0-229 using stream_omp v5.4 with Intel compiler with following command: icc stream_omp.c -O3 -openmp -o stream_omp -static -freestanding -o stream_omp_v5.4_IC16.0.3.174_80M.

Intel® Xeon Phi™ Processor Configuration Parameters:

Platform Used Inside Intel for Testing: Intel Adams Pass Product Concept Board (ADP PC), 96 GB DDR4 (6 x 16GB @ 2133 MHz)


BIOS Settings:

  • Load Default Settings (Turbo is On)
  • Set Cluster Mode to Quad
  • Set DDR Memory Speed to 2133 or Auto
  • MCDRAM Memory Mode varies between Flat and Cache

Processors used for this edition:

  • KNL B0 tQS (Bin3) Processor 7210 QDF# QKTA:
    • 32 Tiles / 64 Cores, 16 GB MCDRAM
    • 1.5 GHz (single core turbo), 1.4 GHz (all core turbo), 1.1 GHz (AVX-P1), 1.3 GHz (non-AVX-P1)
    • 1.6 GHz mesh, 6.4 GT/s OPIO
  • KNL B0 tQS (Bin2) Processor 7230 QDF# QKTB:
    • 32 Tiles / 64 Cores, 16 GB MCDRAM
    • 1.5 GHz (single core turbo), 1.4 GHz (all core turbo), 1.1 GHz (AVX-P1), 1.3 GHz (non-AVX-P1)
    • 1.7 GHz mesh, 7.2 GT/s OPIO
  • KNL B0 tQS (Bin1) Processor 7250:
    • 34 Tiles / 68 Cores, 16 GB MCDRAM
    • 1.6 GHz (single core turbo), 1.5 GHz (all core turbo), default P ratios
    • 1.7 GHz mesh, 7.2 GT/s OPIO


Kernel options: noreplace-paravirt idle=halt mce=on

Environment Variable(s): See how each individual workload was executed for specific environment variables

KNL Self Boot Software Package MPSP 1.2.2

MICPERF 1.3.0 Early Release

ComposerXE 2016 or equivalent redistributable package installed

MKL-based HPL Package

Intel MPI version 5.1.2-150

Matrix Sizes:

DGEMM: 20000 x 20000 or 26000 x 26000

SGEMM: 30000 x 30000

LINPACK Problem Size: 100000


Claim based on calculated theoretical peak double-precision performance capability for a single coprocessor: 16 DP FLOPS/clock/core * 61 cores * 1.238 GHz = 1.208 teraFLOPS. For the Intel® Xeon Phi™ processor: 32 DP FLOPS/clock/core * 68 cores * 1.4 GHz = 3.046 teraFLOPS.
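The peak figures above are the product of three factors. A minimal Python sketch of that arithmetic follows. It assumes the leading factors (16 and 32) are double-precision floating-point operations per clock cycle per core; the function name is hypothetical.

```python
def peak_tflops(flops_per_cycle_per_core: int, cores: int, clock_ghz: float) -> float:
    """Theoretical peak in teraFLOPS: per-core FLOPs per cycle x cores x clock (GHz)."""
    gflops = flops_per_cycle_per_core * cores * clock_ghz
    return gflops / 1000.0  # convert GFLOPS to TFLOPS


# Inputs taken from the footnote; the interpretation of 16 and 32 is an assumption.
coprocessor_peak = peak_tflops(16, 61, 1.238)  # previous-generation coprocessor
processor_peak = peak_tflops(32, 68, 1.4)      # Intel Xeon Phi processor

print(f"Coprocessor peak: {coprocessor_peak:.3f} TFLOPS")
print(f"Processor peak:   {processor_peak:.3f} TFLOPS")
```

These are calculated upper bounds, not measured results; sustained throughput on real workloads such as DGEMM is lower.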


Benchmark results were obtained prior to implementation of recent software patches and firmware updates intended to address exploits referred to as "Spectre" and "Meltdown". Implementation of these updates may make these results inapplicable to your device or system.

Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit

The current Intel® Xeon Phi™ coprocessor achieves 1.208 TFLOPS at 300 W (0.004027 TFLOPS/W) vs. the Intel® Xeon Phi™ processor 7250, which achieves 3.046 TFLOPS at 215 W (0.014167 TFLOPS/W). 0.014167 / 0.004027 = 3.5.
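The 3.5x performance-per-watt figure follows directly from the numbers in this footnote. A short Python sketch of the calculation, using only the quoted peak throughput and power values (variable names are illustrative):

```python
def tflops_per_watt(tflops: float, watts: float) -> float:
    """Return peak efficiency in TFLOPS per watt."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tflops / watts


coprocessor_eff = tflops_per_watt(1.208, 300)  # previous-generation coprocessor
processor_eff = tflops_per_watt(3.046, 215)    # Intel Xeon Phi processor 7250
improvement = processor_eff / coprocessor_eff

print(f"Coprocessor: {coprocessor_eff:.6f} TFLOPS/W")
print(f"Processor:   {processor_eff:.6f} TFLOPS/W")
print(f"Improvement: {improvement:.1f}x")
```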