Intel has confirmed a bug in some Skylake CPUs could cause them to lock up under “complex workload conditions” but noted that a fix is on the way.
The bug initially was reported only with the Core i7-6700K desktop CPU with Hyper-Threading enabled, but Intel’s confirmation seems to indicate it impacts more CPUs in the lineup.
In a post on Intel’s forums, an Intel community manager wrote: “Intel has identified an issue that potentially affects the 6th Gen Intel Core family of products. This issue only occurs under certain complex workload conditions, like those that may be encountered when running applications like Prime95. In those cases, the processor may hang or cause unpredictable system behavior. Intel has identified and released a fix and is working with external business partners to get the fix deployed through BIOS.”
The bug has apparently been stewing for weeks on forums at hardwareluxx.de and then at Mersenne.org, which created Prime95, the software that is used to induce the bug. Prime95 is used to find prime numbers and is also very popular with the performance and overclocking community as a stress and performance test.
Besides the community post, Intel also confirmed to PCWorld the existence of the bug but placed an emphasis on the word “might” because there’s no guarantee you’ll hit the bug.
“Under some complex workload conditions, like those encountered when running applications such as Prime95, the processor may hang or cause unpredictable system behavior. Intel has released a fix that resolves the issue and we are working with external business partners to deploy this fix through BIOS updates,” an Intel official told PCWorld.
This may seem like Intel’s sugar-coating it, but the bug is truly sporadic. Some people have run into it, while others can’t reproduce it. The hang sometimes occurs after minutes and sometimes after hours, and some users never experience the lockup at all.
In my own testing, I used a Skylake-based laptop with a quad-core Core i7-6820HK and didn’t experience the hangup in a few hours of testing. Many of the people who’ve reported hitting the bug are doing so with the desktop Core i7-6700K chip, though.
Why this matters: In the grand scheme of CPU “errata” or bugs, this one is fairly esoteric and really not worthy of being mentioned in the same paragraph as Intel’s infamous 1994 FDIV bug. The FDIV bug would manifest itself in Microsoft Excel spreadsheets and was serious enough that Intel recalled the CPUs to the tune of millions of dollars. The bigger bummer for PC users was when Intel couldn’t resolve a bug in its new TSX instructions, so it simply switched them off for many people. The Skylake bug can apparently be patched with a microcode fix.
How to test for the Skylake bug
Chasing down a capricious bug sounds like a good way to avoid doing something else, so here’s how to do it. Download Prime95 version 28.7 from Mersenne.org and decompress the file into a folder.
If you use the current 28.7 version, you will need to create a text file in the folder using Notepad. You can do this by right-clicking in the Prime95 folder and selecting New > Text Document. Name the text document local.txt. Once the file is created, double-click it to open it with Notepad and type the following line:
CpuSupportsFMA3=0
Save the file in the same folder.
You have to do this because, according to the bug finders, by default the newer version of Prime95 will use AVX2 and the error appears to only occur with AVX.
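If you’d rather not fiddle with Notepad, the same file can be created with a few lines of Python. This is only a rough sketch: the helper name write_local_txt and the folder path in the usage note are made up for illustration, and the script assumes Prime95 reads local.txt from its own folder, as described above.

```python
from pathlib import Path


def write_local_txt(prime95_dir: str) -> Path:
    """Make sure local.txt in the Prime95 folder contains CpuSupportsFMA3=0.

    prime95_dir is a hypothetical argument: the folder where you
    decompressed Prime95. Other lines already in local.txt are kept.
    """
    target = Path(prime95_dir) / "local.txt"
    setting = "CpuSupportsFMA3=0"

    # Read whatever is already there, if the file exists.
    lines = target.read_text().splitlines() if target.exists() else []

    # Drop any earlier CpuSupportsFMA3 line so the setting appears only once.
    lines = [ln for ln in lines if not ln.strip().startswith("CpuSupportsFMA3=")]
    lines.append(setting)

    target.write_text("\n".join(lines) + "\n")
    return target
```

For example, calling write_local_txt(r"C:\Prime95") would write the file into a folder named C:\Prime95, assuming that is where you unpacked the download. Run it before starting Prime95 so the setting is in place when the program launches.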
Start Prime95 by double-clicking Prime95.exe. Dismiss the dialog by clicking Just Stress Testing.
A Run a Torture Test dialog box will appear. Select Custom and change the Min FFT size (in K) to 768, and change the Max FFT size (in K) to 768. Select Run FFTs in-place and also set the run time to 120 minutes or longer. Clicking OK will start the Torture Test.
Now just wait and see if it locks up. Most of the problems seem to occur with the top-end desktop Core i7-6700K, but Intel seems to be implying it could occur on other CPUs.
Before you run this test, you should be aware that Prime95 puts a heavy load on CPUs. Systems that are marginal on cooling or overclocked may crash on their own, so it’s probably best to run this test on a PC with stock settings to make sure it isn’t just an unstable overclock.