Monday, April 29th 2024

Intel Statement on Stability Issues: "Motherboard Makers to Blame"

A couple of weeks ago, we reported on NVIDIA directing users of Intel's 13th Generation Raptor Lake and 14th Generation Raptor Lake Refresh CPUs to consult Intel for any issues with system stability. Motherboard makers, by default, often run the CPU outside of Intel's recommended specifications, overvolting the CPU through modifying voltage curves, automatic overclocks, and removing power limits.

Today, we learned that Igor's Lab has obtained a statement that Intel prepared for motherboard OEMs regarding the issues multiple users have reported. Intel CPUs come pre-programmed with a stock voltage curve. When motherboard makers remove power limits and automatically adjust voltage curves and frequency targets, the CPU can be pushed outside its safe operating range, possibly causing system instability. Intel has set up a dedicated website where users can report their issues and get support. Manufacturers like GIGABYTE have already issued new BIOS updates intended to give users maximum stability, although recent user reports indicate these are still outside Intel spec, setting PL2 to 188 W, loadlines to 1.7/1.7, and the current limit to 249 A. MSI has provided a blog post tutorial for stability, and ASUS has published updated BIOS versions for its motherboards to reflect the Intel baseline spec as well. Surprisingly, not all of the revised BIOS values in these updates from different vendors match the Intel Baseline Profile spec. You can read the statement from Intel in the quote below.
Intel has observed that this issue may be related to out of specification operating conditions resulting in sustained high voltage and frequency during periods of elevated heat.

Analysis of affected processors shows some parts experience shifts in minimum operating voltages which may be related to operation outside of Intel specified operating conditions.

While the root cause has not yet been identified, Intel has observed the majority of reports of this issue are from users with unlocked/overclock capable motherboards.

Intel has observed 600/700 Series chipset boards often set BIOS defaults to disable thermal and power delivery safeguards designed to limit processor exposure to sustained periods of high voltage and frequency, for example:
  • Disabling Current Excursion Protection (CEP)
  • Enabling the IccMax Unlimited bit
  • Disabling Thermal Velocity Boost (TVB) and/or Enhanced Thermal Velocity Boost (eTVB)
Additional settings which may increase the risk of system instability:
  • Disabling C-states
  • Using Windows Ultimate Performance mode
  • Increasing PL1 and PL2 beyond Intel recommended limits
Intel requests system and motherboard manufacturers to provide end users with a default BIOS profile that matches Intel recommended settings.

Intel strongly recommends customer's default BIOS settings should ensure operation within Intel's recommended settings.

In addition, Intel strongly recommends motherboard manufacturers to implement warnings for end users alerting them to any unlocked or overclocking feature usage.

Intel is continuing to actively investigate this issue to determine the root cause and will provide additional updates as relevant information becomes available.

Intel will be publishing a public statement regarding issue status and Intel recommended BIOS setting recommendations targeted for May 2024.
Source: Igor's Lab
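As a rough illustration of the statement above, the following Python sketch checks a BIOS profile for the deviations Intel lists. The `BiosProfile` fields and the `audit_profile` function are hypothetical names invented for this example, not a real firmware API. The wattages in the usage lines are the 125/253 figures quoted for the 14900K in the comments below, used here as an assumption rather than as verified Intel limits. Windows Ultimate Performance mode is an operating system setting and is left out.

```python
from dataclasses import dataclass


@dataclass
class BiosProfile:
    """Hypothetical snapshot of the settings named in Intel's statement."""
    cep_enabled: bool        # Current Excursion Protection
    iccmax_unlimited: bool   # IccMax Unlimited bit
    tvb_enabled: bool        # Thermal Velocity Boost
    etvb_enabled: bool       # Enhanced Thermal Velocity Boost
    c_states_enabled: bool
    pl1_watts: int
    pl2_watts: int


def audit_profile(profile, recommended_pl1, recommended_pl2):
    """Return a list of deviations of the kind called out in the statement."""
    findings = []
    if not profile.cep_enabled:
        findings.append("CEP disabled")
    if profile.iccmax_unlimited:
        findings.append("IccMax Unlimited bit enabled")
    if not profile.tvb_enabled:
        findings.append("TVB disabled")
    if not profile.etvb_enabled:
        findings.append("eTVB disabled")
    if not profile.c_states_enabled:
        findings.append("C-states disabled")
    if profile.pl1_watts > recommended_pl1:
        findings.append(f"PL1 {profile.pl1_watts} W exceeds {recommended_pl1} W")
    if profile.pl2_watts > recommended_pl2:
        findings.append(f"PL2 {profile.pl2_watts} W exceeds {recommended_pl2} W")
    return findings


# Made-up board default for illustration; not taken from any real BIOS.
board_default = BiosProfile(
    cep_enabled=False, iccmax_unlimited=True, tvb_enabled=True,
    etvb_enabled=True, c_states_enabled=True, pl1_watts=253, pl2_watts=253,
)

# The 125/253 limits are the thread's 14900K figures, assumed for the example.
for finding in audit_profile(board_default, recommended_pl1=125, recommended_pl2=253):
    print(finding)
```

An empty list from `audit_profile` would mean the profile shows none of the listed deviations under the limits supplied by the caller.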

272 Comments on Intel Statement on Stability Issues: "Motherboard Makers to Blame"

#101
Crackong
dgianstefani: How about the one Intel sets? The one you're asking me to pick, or that which motherboard manufacturers hallucinate is irrelevant.

Depending on the Intel spec you are looking at, i.e. baseline, extreme, default, values change. This isn't that complicated.

I suggest reading the Intel datasheet if you want to learn more. Myself and others have posted links.
125/188
150/320
320/320

All these numbers were mentioned in your posted materials /datasheet/whatever

Just pick one, or give us a number that you somehow 'understand' from all the convoluted Intel spec.
We will see if it is right, or just your 'speculation'.
Posted on Reply
#102
Heiro78
Crackong: I don't know either.
Intel did not post it on their performance index.
Are you saying that Intel put out a statement with a misspelled word (tweat) and you're reusing it as a gag towards them? When I read the word in a few comments, I wasn't sure if it was meant as tweaked or treated.
Posted on Reply
#103
Crackong
Heiro78: Are you saying that Intel put out a statement with a misspelled word (tweat) and you're reusing it as a gag towards them? When I read the word in a few comments, I wasn't sure if it was meant as tweaked or treated.
Oh, I am sorry for my typo.
Is that all you wanted?
Posted on Reply
#104
Onasi
Crackong: I will make it simple.
What should be the 'baseline' setting, for 14900KS?

125/188 ?
150/320 ?
320/320 ?

Pick one.
Technically, the least idiotic thing to do based on the spec sheet would just be to have two presets in the UEFI - Normal mode (which, for KS would be the 150/320) and an Extreme one which would be PL1=PL2. Normal would be the default OOB and the one that gets set via Recommended Defaults. And both would pertain only to the Power Limit and nothing else. Any other setting change should trip the OC bit flag and be clearly labeled as such.
Could also include an Eco mode à la Ryzen with a, say, 125W overall limit.
Posted on Reply
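The preset scheme proposed in #104 can be sketched roughly as below. This is only an illustration of that suggestion in Python: the names `PRESETS`, `apply_preset`, and `trips_oc_flag` are hypothetical, the 150/320 and 320/320 figures are the 14900KS values quoted in this thread, and the 125 W Eco figure is the poster's own example. None of it represents actual UEFI code or a confirmed Intel profile.

```python
# (PL1, PL2) in watts for a 14900KS, using the numbers quoted in the thread.
PRESETS = {
    "normal": (150, 320),   # proposed out-of-the-box default
    "extreme": (320, 320),  # PL1 = PL2
    "eco": (125, 125),      # suggested 125 W overall limit
}
DEFAULT_PRESET = "normal"


def apply_preset(name=DEFAULT_PRESET):
    """Return the settings a preset controls: power limits and nothing else."""
    pl1, pl2 = PRESETS[name]
    return {"pl1_watts": pl1, "pl2_watts": pl2}


def trips_oc_flag(active_settings, preset_name):
    """Treat any difference from the chosen preset as an overclock."""
    return active_settings != apply_preset(preset_name)


stock = apply_preset()
print(stock, trips_oc_flag(stock, "normal"))

# Changing anything beyond the preset (here a made-up extra key) trips the flag.
tweaked = dict(stock, iccmax_unlimited=True)
print(trips_oc_flag(tweaked, "normal"))
```

Keeping each preset to the two power limits mirrors the suggestion that presets should touch nothing else, so every other change is easy to label as overclocking.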
#105
Heiro78
Crackong: Oh, I am sorry for my typo.
Is that all you wanted?
Yea, I wasn't sure if it was intended since I saw it like 3 times the same way. Thanks for clarifying
Posted on Reply
#106
dgianstefani
TPU Proofreader
Do people find it confusing that AMD has an "ECO" mode? This essentially does the same thing as "Intel baseline spec".

There's nothing inherently wrong with having multiple sets of values for different performance/efficiency targets.

The issue is board partners not using these sets of values, doing their own thing, then Intel picking up the tab when there's instability.
Posted on Reply
#107
Crackong
Onasi: Technically, the least idiotic thing to do based on the spec sheet would just be to have two presets in the UEFI - Normal mode (which, for KS would be the 150/320) and an Extreme one which would be PL1=PL2. And both would pertain only to the Power Limit and nothing else. Any other setting change should trip the OC bit flag and be clearly labeled as such.
Could also include an Eco mode à la Ryzen with a, say, 125W overall limit.
If it were that easy.
It seems that even the motherboard manufacturers have a hard time figuring these out, so Asus would have PL1 = PL2 as baseline, while Gigabyte had a 188W baseline with 1.7v voltage loadline calibration.
Heiro78: Yea, I wasn't sure if it was intended since I saw it like 3 times the same way. Thanks for clarifying
It was my mistake. Thanks for pointing it out.
Posted on Reply
#108
dgianstefani
TPU Proofreader
Onasi: Technically, the least idiotic thing to do based on the spec sheet would just be to have two presets in the UEFI - Normal mode (which, for KS would be the 150/320) and an Extreme one which would be PL1=PL2. Normal would be the default OOB and the one that gets set via Recommended Defaults. And both would pertain only to the Power Limit and nothing else. Any other setting change should trip the OC bit flag and be clearly labeled as such.
Could also include an Eco mode à la Ryzen with a, say, 125W overall limit.
From what I understand, many of the out of box profiles from manufacturers already trip the OC bit flag "IccMax Unlimited bit".

I agree that the best thing motherboard manufacturers could do would simply be to have a few default profiles directly using the values off the Intel Datasheet, which has various baseline, normal and extreme presets already dialled in (and validated), despite some people seeming to think that this endeavour would be too difficult for manufacturers to figure out.

The extra "AI OC" or whatever marketing wants to call fiddling with settings and overclocks that Intel hasn't validated for every CPU bin of the SKU should still be an option, but not the default, and with a UI warning as Intel is suggesting in their memo.
Posted on Reply
#109
close
dgianstefani: Anandtech is one of the other review sites that also tests using Intel Spec, not the motherboard defaults.


I wonder if board makers are taking this seriously since again, they're not the ones who have to deal with returns, most of the time, unless people realise it's the motherboard.
They do go by the book for a lot of things, AMD too even if subjectively it may blunt the value of some reviews (like JEDEC even when nobody uses those timings), but I can see the value in going "standard". My point was that even a big outlet like that is known to have very publicly given in to Intel's strong-arming or hand greasing (whichever was the case). Most other reviewers will cave and not go against Intel or get on their bad side lest they start doing reviews on store-bought parts long after everyone else has published their day-1 review.

We'll see how many reviewers publish and advertise an update to all the reviews made when the parts were launched, at least to flag the situation even if the numbers aren't corrected. Not really seeing this news on too many front pages today but it's also a good litmus test for my personal future reading preferences. Will keep an eye out even if I'm generally very behind the times so I almost never buy current generation. But I was still burned by super optimistic day-1 reviews which were never updated to account for the real life performance losses as the day-1 "optimizations" aimed at getting flashy numbers had to be turned off in real life.

P.S. And I'm still not entirely sure this is just a matter of "staying standard", I'm fairly certain there are a lot more hidden changes under the hood that contribute to this situation, beyond just the one power topic.
Posted on Reply
#110
Crackong
dgianstefani: Do people find it confusing that AMD has an "ECO" mode? This essentially does the same thing as "Intel baseline spec".
Hmm...No?
I think AMD's ECO mode isn't for maximum stability, but for energy efficiency.

On the other hand, 'Intel Baseline Spec' is advertised to be the 'Safest & most Stable' profile, not for energy efficiency.
Posted on Reply
#111
Daven
dgianstefani: Speculation.

This news post comments about facts.
Any and all comments to any news story should be treated as 100% speculation including yours and mine. I don’t know you and you don’t know me but we can have an open and honest discussion using our own opinions and interpretation of facts.

I don’t accept your interpretation that Intel did not approve of these default settings. Intel says nothing about past compliance in their statement. They just give guidance going forward using words like requests and recommends (all present tense). So you speculated that Intel told these manufacturers NOT to do this in the past and they disregarded. This is not based on any facts and is just your opinion of how a company like Intel ought to act.
Posted on Reply
#112
dgianstefani
TPU Proofreader
Crackong: Hmm...No?
I think AMD's ECO mode isn't for maximum stability, but for energy efficiency.

On the other hand, 'Intel Baseline Spec' is advertised to be the 'Safest & most Stable' profile, not for energy efficiency.
I haven't seen any advertising for the "Intel Baseline Spec" from Intel, could you link?

From ASUS' patch notes.

Interesting that ASUS is referring only to the Intel Baseline Profile spec as the factory default, when there are several other standard and "extreme" profiles too.

Daven: Any and all comments to any news story should be treated as 100% speculation including yours and mine. I don’t know you and you don’t know me but we can have an open and honest discussion using our own opinions and interpretation of facts.

I don’t accept your interpretation that Intel did not approve of these default settings. Intel says nothing about past compliance in their statement. They just give guidance going forward using words like requests and recommends (all present tense). So you speculated that Intel told these manufacturers NOT to do this in the past and they disregarded. This is not based on any facts and is just your opinion of how a company like Intel ought to act.
When did I do this?
Posted on Reply
#113
londiste
Crackong: What should be the 'baseline' setting, for 14900KS?

125/188 ?
150/320 ?
320/320 ?

Pick one.
There is no "should be". It is 253/253.
Posted on Reply
#114
Random_User
zmeul: The Intel baseline should've been the factory defaults, not the optional

Most users at home don't update the BIOS and also won't know where these options are if they do update it
Not only Intel, but AMD too. They both should have enforced the safe recommended specs as default upon those shady motherboard manufacturers, like it used to be for years (for those who like to tinker, it doesn't require much effort to make a couple of clicks in the BIOS/UEFI), because it damages the Intel and AMD brands and their reputation. The CPU is perhaps the most stable and sturdy component in the entire PC, but the MB vendors managed to screw up so royally that both CPU brands get hurt out of nowhere, while being used at default state. Just because the partners want to sell "gamur" "xxx edition, mega ultra OC" "for those who dare", while knowing full well that, with the complexity of modern CPUs and their ability to self boost, OC is mostly dead nowadays.
Once upon a time, the Intel-branded motherboards (Foxconn OEM), while lacking the "bells and whistles" of the other MB vendors, worked out of the box and were the definition of stability. Now both AMD and Intel have made the supervision so loose, and the QA of MB manufacturers is so bad, that they both have to pay with reputational damage.
Crackong: So Intel is admitting they don't have a default profile and rely on motherboard manufacturers to make their own 'Default'.

And also it is Intel themselves using PL1 = 253W in their own CPU performance index,


Maybe every review site should honor Intel's decision and re-do 12/13/14 gen benchmark with PL1&PL2 = 125W, I bet the results will be fascinating.

They're just trying to sell the snake oil ASAP, and at all costs. How else could they get money, if the rival CPUs do the same job at half the energy usage?
Eventually, the Core i is an established brand, and the Core Ultra may introduce some uncertainty. So, that's why they are so desperate.
Just thoughts aloud.
Posted on Reply
#115
Crackong
dgianstefani: I haven't seen any advertising for the "Intel Baseline Spec" from Intel, could you link?
Nothing from Intel right now.
All we've got now is Gigabyte and Asus
Posted on Reply
#116
dgianstefani
TPU Proofreader
zmeul: The Intel baseline should've been the factory defaults, not the optional

Most users at home don't update the BIOS and also won't know where these options are if they do update it
Mostly agreed. For the K/KS chips it's not unreasonable to expect the default to be one of the higher Intel Datasheet specs, there's a few profiles with different performance targets. The issue is wild levels of "tuning" out of the box, that does not conform to any of the profiles Intel provides.

The Intel Baseline Profile is just one of several options. None of them seem to be used by default out of the box; even after the "Intel baseline profile" BIOS updates vendors have made, there are still deviations and made-up numbers.
Posted on Reply
#117
Crackong
londiste: There is no "should be". It is 253/253.
Are you sure?

Since the 14900KS has a PL1/PL2 = 150/320, which differs from the regular 14900K's 125/253,
if they had the same baseline profile, it would render them basically the same SKU.
Posted on Reply
#118
Assimilator
dgianstefani: Intel needs to be firmer with enforcing their spec, and dictating how deviations should be presented. That's the issue.
The fact that Intel's CPU power delivery specification is so convoluted, with so many knobs and dials, would reasonably suggest a pressing need for Intel to carefully validate any firmware that board partners release, in order to prevent blown up CPUs. If this is true, then it would appear that there is little possibility that Intel could not have known about its board partners' deviations until now. The most rational explanation, therefore, is that Intel knows exactly what has been going on but chose to turn a blind eye because its board partners' practice of de facto overclocking its CPUs from the factory had a material benefit for Intel.
Posted on Reply
#119
dgianstefani
TPU Proofreader
Assimilator: The fact that Intel's CPU power delivery specification is so convoluted, with so many knobs and dials, would reasonably suggest a pressing need for Intel to carefully validate any firmware that board partners release, in order to prevent blown up CPUs. If this is true, then it would appear that there is little possibility that Intel could not have known about its board partners' deviations until now. The most rational explanation, therefore, is that Intel knows exactly what has been going on but chose to turn a blind eye because its board partners' practice of de facto overclocking its CPUs from the factory had a material benefit for Intel.
No CPUs have been blown up that we know of, this isn't the "Meltdown" fiasco, there has been some instability in certain workloads. While I agree with your sentiment that partner BIOS values should line up with Intel specifications, I don't think they're overly convoluted.

It's the job of the managers, systems engineers/coders etc. at these companies to understand these things. Skimming through the datasheet provided by Intel, it's not that difficult for an end user to plug in these values to their BIOS, so why is it difficult for a huge international company to copy and paste values? We've seen that even with the "Intel baseline profile" BIOS updates, values still do not line up with the first party Intel specification, which is nicely summarized in a few tables. Explicitly explained with references and full details in a comprehensive document. What more do partners need to adhere to spec?

I still think there is a lot of jumping on this topic to attack Intel, and not enough people criticising the fact that board partners who should know better are possibly comically incompetent to the point of not being able to copy and paste several numbers from a datasheet, or potentially still trying to gain competitive advantage by using the wrong values.
Posted on Reply
#120
chrcoluk
I don't see why some are acting confused, PL2 is 125, PL1 changes between perf and baseline as either 253 or 188.

After reading the documentation posted here, next time I reboot I am changing PL2 to 125 W.

Buildzoid checked the specs in his video and concluded sustained power was 125 W as well. He is also not convinced the baseline mode on his Gigabyte board came from Intel rather than being something Gigabyte whipped together, and thinks we will probably never know.
Posted on Reply
#121
Vya Domus
Another thing is that it's simply difficult to believe any of these companies would do anything without Intel's seal of approval.
Posted on Reply
#122
Dr. Dro
stimpy88: The whole concept of the 14th gen was and is a complete failure and serves only as a cash grab.
I mean, "14th Gen" chips don't have as much as a new stepping. It's just a repackage and re-release of their existing chips to satisfy shareholders, because they didn't have Arrow Lake available on time and Meteor Lake isn't suitable to replace Raptor as a high-performance desktop processor (maxing out at around i5 level). Still, Intel did manage to improve yield to the point that the 14900K is now a mass-produced 13900KS, and the 14900KS pushed that even further, even if it's by only a few MHz or so. I didn't believe they'd be able to pull a 14900KS at all, and they indeed haven't with the "6.5 GHz" claims from early rumors, still, 6.2 with 5.9 all-core is not too bad. It's 300 MHz up from the 13900KS's average, which means they're excellent bins.
Sabotaged_Enigma: Intel's "Bulldozer" moment... They can't admit it's their fault despite the fact it is.
Engineering that's no matter how powerful doesn't beat laws of Physics anyway...
The difference is that Bulldozer sucked, and Raptor Lake doesn't.
AleXXX666: make a "stable" MB BIOS from scratch - NO WAY
make a "patch" BIOS every later then - SURE, GOOD IDEA! :roll:
Everyone does this. Remember how Zen 4 launched with completely broken memory training (it'd take minutes to boot), how it had a clock ceiling on the memory that wasn't because of hardware but because AGESA was flat out broken, how the Ryzen chips actually caught fire because the AGESA-level current control wasn't functional, etc.

Basically: if you want a stable platform nowadays, just don't buy latest-generation gear. "Settle" for like, a Zen 3 or Rocket Lake platform with a fully updated BIOS.
londiste: There is no "should be". It is 253/253.
The 320 W setting is considered to be an "Extreme Power Profile" that is exclusive to the Core i9-12900KS, 13900KS and 14900KS SKUs, iirc. Otherwise you're correct.
Posted on Reply
#123
chrcoluk
Vya Domus: Another thing is that it's simply difficult to believe any of these companies would do anything without Intel's seal of approval.
My view is the opposite, when you consider some of the things I have witnessed: out of spec SA voltage, 120C tjmax.

My view is it's unlikely they run things by Intel.

We also currently have a baseline mode on Asus that keeps 253w set. I think that wasn't run by Intel.

Asus was also setting voltages that were blowing up AMD chips; I don't think that was run by AMD.

Most board vendors have just 1 or 2 BIOS devs, as revealed by the guy who used to work for EVGA. It's not a large professional operation.
Posted on Reply
#124
dgianstefani
TPU Proofreader
chrcoluk: My view is the opposite, when you consider some of the things I have witnessed: out of spec SA voltage, 120C tjmax.

My view is it's unlikely they run things by Intel.

We also currently have a baseline mode on Asus that keeps 253w set. I think that wasn't run by Intel.

Asus was also setting voltages that were blowing up AMD chips; I don't think that was run by AMD.

Most board vendors have just 1 or 2 BIOS devs, as revealed by the guy who used to work for EVGA. It's not a large professional operation.
This was an AGESA problem, shown by how other manufacturers also had issues with failing chips.

ASUS just happened to have aggressive enough tuning that the problem was further exacerbated on some of their boards.

IIRC the problem was automatic voltage algorithms in the AGESA that linked memory voltage and memory controller voltage. So engaging EXPO would push internal chip voltages past safe limits.

This was an issue particularly with X3D chips due to lower voltage tolerances, but also impacted standard Zen 4 chips.
Posted on Reply
#125
Crackong
dgianstefani: No CPUs have been blown up that we know of, this isn't the "Meltdown" fiasco, there has been some instability in certain workloads.
If CPUs were blowing up, people would get RMAs and the case would move a lot quicker.
What leaves a bitter taste in everyone's mouth is that the CPUs aren't blowing up, so there is no RMA, but there is confirmed decreased performance, and no clear solution has been provided yet.
dgianstefani: I still think there is a lot of jumping on this topic to attack Intel, and not enough people criticising the fact that board partners who should know better are possibly comically incompetent to the point of not being able to copy and paste several numbers from a datasheet, or potentially still trying to gain competitive advantage by using the wrong values.
It is Intel's CPU; it is their job to make sure the motherboard vendors have a correct 'Default' profile so it works 100% of the time.
This lack of communication alone is a big issue and is one of Intel's biggest faults.

And, if your CPUs are this fragile, measures should be taken to 'prevent' the partners from messing it up further.
Like AMD, with their X3D voltage issue: they forced new voltage settings very quickly and RMA'd every affected case.
Like Nvidia: Nvidia does a great job of making sure the AIBs cannot mess up their GPUs and, if something's up like the 12VHPWR issue, Nvidia takes responsibility and takes care of every affected case.

Please note that in the above-mentioned cases,
although customers do blame the AIB partners,
AMD/Nvidia themselves didn't actively place the blame on their partners.
They just went in, solved the problem, and got out ASAP.

If they can do it, why not Intel?
Posted on Reply