Intel launches a new generation of processors every year. While the last processor generation introduced a new low-power 22-nm process technology, the novelty of the recently launched 4th Generation of Core i processors, previously codenamed “Haswell”, lies in the advanced architecture.
New features include extended registers, a vector compute unit with expanded functionality and performance, more powerful graphics, standard hardware support for AES encryption for all models, revised power management, more configuration options for turbo mode, plus comprehensive performance management for adapting to the chosen cooling solution.
In addition, the second component of the chipset, the Platform Controller Hub (PCH), is now manufactured on the low-power 32 nm process. This enables the first single-chip solutions, similar to the already existing 4th Generation processors for Ultrabooks.
Performance made to measure
In the long term, the improvements to the existing microarchitecture will enable efficiency gains of up to 10%, although they are currently only used to increase performance.
New on-chip voltage regulators reduce the footprint and overall power draw of the system; the TDP (thermal design power) of the individual chips, however, increases only marginally. The TDP of the i7-4700EQ is 47 W, 2 W more than that of the previous i7-3615QE model. Thanks to a new “configurable TDP down” function, the maximum TDP can be lowered from 47 W to 37 W. As a rule, this is implemented by reducing the processor’s base clock rate from 2.4 GHz to 1.7 GHz, which results in lower performance under base load.
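The trade-off behind the “configurable TDP down” function can be put into numbers. The following Python sketch uses only the figures quoted above (47 W and 37 W, 2.4 GHz and 1.7 GHz). The function name relative_drop is made up for illustration, and the result says nothing about real application performance, which also depends on turbo behaviour and workload.

```python
def relative_drop(nominal: float, reduced: float) -> float:
    """Return the reduction from nominal to reduced as a fraction of nominal."""
    return (nominal - reduced) / nominal

# Figures quoted in the text for the i7-4700EQ
tdp_drop = relative_drop(47.0, 37.0)    # TDP in watts
clock_drop = relative_drop(2.4, 1.7)    # base clock in GHz

print(f"TDP reduction:        {tdp_drop:.1%}")
print(f"Base clock reduction: {clock_drop:.1%}")
```

On these figures, the base clock falls by roughly 29% for a TDP reduction of roughly 21%.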
With the introduction of individually configurable turbo boost and the ability to configure graphics performance independently from processor performance, it is now possible to define specific thermal performance profiles for each application and optimize them for the available cooling solutions. Intel calls the resulting individual thermal performance Scenario Design Power (SDP).
Depending on the application, SDP can be significantly lower than the maximum TDP, often allowing passive cooling and hermetic encapsulation where this was previously impossible in this class. Costs for maintenance and cleaning/disinfection can therefore be significantly reduced.
Thanks to a new design and improved power management, power draw in idle mode is also greatly reduced. Despite an increase in overall system performance, the stand-by time of battery-powered devices is significantly longer, and the system can cool down faster during load breaks and build up higher thermal reserves. When computing power is required, the processor cores and graphics unit can then operate longer in fast turbo mode before overheating triggers downclocking.
Unrivalled: AVX2
The most impressive performance increase of the 4th Generation of Core i processors stems from the new AVX2 vector unit. When introducing AVX (Advanced Vector Extensions) with an earlier processor generation, Intel had already significantly increased the floating-point performance of the SSE unit by extending the vector width from 128 bits to 256 bits and providing more powerful buffers, in particular a larger reorder buffer. With 4th Generation processors, the buffer was enlarged yet again and the execution unit extended by another integer ALU and a second branch unit.
In addition, FMA (Fused Multiply-Add) and TSX (Transactional Synchronization Extensions) were added to the instruction set, making individual operations run up to 4 times faster. In practice, the computing power of the vector unit is on average doubled for large fixed-point and floating-point operations compared to the previous model. This is a particular advantage for demanding scientific calculations, especially when they are based on algorithms that cannot yet be parallelized in a meaningful way. It is also a plus when running computationally intensive operations or processing images with large amounts of data such as CT, MRI and ultrasound, as well as for highly complex calculations performed by analytical and diagnostic devices.
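What “fused” means in practice: a fused multiply-add computes a*b + c with a single rounding at the end instead of rounding the product first. The Python sketch below only emulates this behaviour with exact rational arithmetic; it does not execute an FMA instruction, and the helper names mul_then_add and fma_emulated are hypothetical.

```python
from fractions import Fraction

def mul_then_add(a: float, b: float, c: float) -> float:
    """Separate multiply and add: the product is rounded before the addition."""
    return a * b + c

def fma_emulated(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add: exact a*b + c, rounded once at the end."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

print(mul_then_add(a, b, c))   # the product rounds to 1.0, so the small term is lost
print(fma_emulated(a, b, c))   # the small term -2**-60 survives
```

Besides this accuracy benefit, issuing the multiplication and the addition as one instruction lets a vector unit do more floating-point work per instruction.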
The HD 4600 standard graphics used in the i7-4700EQ features 20 execution units instead of the 16 found in the HD 4000 graphics of the previous model. At the same time, the base clock rate of the graphics unit was reduced from 650 MHz to 400 MHz and a new, more efficient multi-format codec is used. In applications where the highest graphics performance is not required at all times, it has therefore been possible to reduce the overall power draw despite up to 30% higher peak performance.
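A rough back-of-the-envelope comparison illustrates why the lower base clock can reduce power draw even with more execution units. The sketch assumes, as a deliberate simplification not taken from Intel documentation, that throughput at base clock scales with the number of execution units times the clock rate; the function name base_throughput is illustrative.

```python
def base_throughput(execution_units: int, base_clock_mhz: float) -> float:
    """Crude proxy: execution units x base clock (arbitrary units)."""
    return execution_units * base_clock_mhz

hd4000 = base_throughput(16, 650)   # previous model
hd4600 = base_throughput(20, 400)   # i7-4700EQ

print(f"HD 4600 relative to HD 4000 at base clock: {hd4600 / hd4000:.2f}")
```

Under this assumption the HD 4600 works at about three quarters of the HD 4000 figure at base clock. The higher peak performance cited above would then have to come from turbo clocks and architectural improvements, which this proxy ignores.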
The HD 5000 and Iris families – which Intel recently introduced for Ultrabook processors and which feature up to twice the graphics performance, 40 execution units and optional eDRAM based on GT3/GT3e – are not currently available for the embedded processors of the ISG (Intelligent Systems Group).
Perfect data protection
While hardware support for AES encryption was previously reserved for top models, the Intel® Advanced Encryption Standard New Instructions (Intel® AES-NI) are now also available for budget and mainstream models. This makes it possible to run the highly compute-intensive packaging and encryption routines of the well-known AES (Advanced Encryption Standard) cryptographic algorithm quickly and safely in hardware. Sensitive patient data can be reliably and fully encrypted with long keys and certificates in real time, with little additional load on the processor cores and the applications running on them. AES-NI is supported by standard operating systems such as Windows and Linux as well as many well-known applications.
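On x86 Linux systems, whether a processor exposes AES-NI can be checked in the flags line of /proc/cpuinfo, where the feature is listed as aes. The following Python sketch parses text in that format. The function name cpu_flags and the sample string are illustrative, and other operating systems need a different mechanism.

```python
def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect the feature flags listed in /proc/cpuinfo-style text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags

# Shortened, made-up sample; a real flags line is much longer.
sample = "model name\t: example CPU\nflags\t\t: fpu sse2 aes avx avx2 fma\n"
print("aes" in cpu_flags(sample))   # True for this sample

# On a Linux system:
# with open("/proc/cpuinfo") as f:
#     print("aes" in cpu_flags(f.read()))
```

The same check works for the avx2 and fma flags of the vector unit described earlier.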
conga-TS87 makes new technology immediately available
The use of Computer-on-Modules (COMs) in a range of proven standards has the advantage that developers can concentrate on their core competencies and the specific peripherals, while being able to integrate the latest technology, ready to use, into their own designs. This can save a lot of time and money, especially for small and medium production runs up to a few thousand per year. The conga-TS87 from congatec is a low-power COM Express module featuring long-term availability, a “Basic” footprint and the Type 6 pinout.
The module is currently available with the embedded quad-core Intel® Core™ i7-4700EQ processor, which comes with 6 MB of L3 cache, runs its four cores at a base clock of 2.4 GHz and has a TDP of 47 W. With the new configurable turbo boost mode, the clock speed of individual cores can be increased up to 3.3 GHz. The “configurable TDP” function can be used to limit the TDP to 37 W; as a result, the processor cores run at a base frequency of 1.7 GHz with reduced performance.
The COM is equipped with the low-power Mobile Intel® QM87 Express Chipset, which is manufactured in 32 nm technology. It can also be upgraded with future, compatible i3, i5 and i7 dual-core and quad-core processors as they become available. The module supports up to 16 GB of fast dual-channel DDR3 memory at 1600 MT/s, including the most energy-efficient DDR3L versions with 1.35 V supply voltage.
The integrated graphics is about 25% more powerful than in previous models and supports Intel® Flexible Display Interface (FDI), DirectX 11.1, OpenGL 4, OpenCL 1.2 and high-performance, flexible hardware decoding. It can therefore encode and decode several high-resolution full HD video streams in parallel and at different rates. 4K2K resolutions of up to 3840 x 2160 pixels with DisplayPort and 4096 x 2304 pixels with HDMI are natively supported. It is also possible to connect up to three independent display interfaces via DVI or LVDS and VGA. The module’s native USB 3.0 support provides fast data transmission with low power consumption. A total of eight USB ports are provided, four of them capable of supporting SuperSpeed USB (USB 3.0).
Seven PCI Express 2.0 lanes, a PCI Express 3.0 graphics (PEG) x16 port for high-performance external graphics cards, four SATA ports with up to 6 Gb/s and RAID support, plus a Gigabit Ethernet interface make fast and flexible system extensions possible. Additional functions include fan control, an LPC bus for easy integration of legacy I/O interfaces and Intel® High Definition Audio.
Summary and outlook
In the initial launch phase of the new Haswell architecture, Intel’s focus has been on providing ultimate performance for graphics and complex calculations. This primarily benefits the users of power-hungry applications such as CT, MRI and image analysis.
Things will become of wider interest for many classic embedded applications when Intel releases power-optimized quad-core and dual-core models in a second launch phase. It will then be even easier to realize battery-powered, passively cooled mobile and ultramobile devices. Typical applications will include mobile ultrasound, analytical and diagnostic devices.
The long-term availability of processors and modules guarantees a future-proof solution with low maintenance and certification costs. Thanks to the high-performance integrated graphics, external graphics cards, which are expensive and rarely available over the long term, are only required in exceptional cases. One of the greatest advantages of modular COM Express technology is that any technology improvements to processors and chipsets can be integrated quickly and easily. With the conga-TS87, congatec provides a powerful, ready-to-use COM module based on established firmware. In combination with congatec’s proven driver support, customers can take early, yet risk-free advantage of design-ins.
Zeljko Loncaric, Marketing Engineer, congatec AG