Voice of EED

Wednesday, 9 February 2005

Dodgy Intel: The Pentium M fiasco part 1 [Lurks]

For a company with so many resources to hand, it is quite remarkable how Intel have dropped the ball. In mid-July last year the problems were so great that Intel CEO Barrett issued a company-wide memo in which he was direct about the failings to that point. "I believe, as you do, that this is not the Intel we all know and that it is not acceptable", he said.
He was talking about a range of delays and failures but I want to concentrate on one in particular. Potentially the biggest one Intel has ever made, and one which is mired in internal politics and a single monumentally bad decision from the marketing arm, which dictated the entire direction of product development in their largest business unit. We're talking about the Pentium 4 here.
You might recall that the Pentium III was topping out at around 1.2GHz or so. Some place inside Intel, a meeting went on where what was discussed was the need to deliver CPUs that were clocked faster. Not that performed faster, but were clocked faster. It turns out that the team working on a new processor, the Pentium 4, were going to crazy lengths to get the speed up. It would have a 20-stage pipeline rather than the 10 stages of the Pentium III. The reason for a boosted number of stages is that at faster speeds, it's progressively harder to coordinate clock signals. It's a little bit like taking smaller, faster steps because you can be more sure of your footing.
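To put rough numbers on the "smaller, faster steps" idea, here is a minimal sketch using the usual textbook framing rather than anything from Intel: the total logic delay is split evenly across the stages, and each stage pays a fixed latch and clock-skew overhead. The function name and every figure in it (8ns of logic, 0.1ns of overhead) are made up for illustration.

```python
# Toy model: all numbers are invented for illustration, not Intel figures.

def max_clock_ghz(total_logic_ns: float, stages: int, latch_overhead_ns: float) -> float:
    """Highest clock a pipeline could run at if the logic splits evenly across stages."""
    # Each cycle must fit one stage's share of the logic plus the fixed overhead.
    cycle_ns = total_logic_ns / stages + latch_overhead_ns
    return 1.0 / cycle_ns  # 1 / nanoseconds = GHz


if __name__ == "__main__":
    for stages in (10, 20, 31):
        print(stages, "stages:", round(max_clock_ghz(8.0, stages, 0.1), 2), "GHz")
```

Under these assumptions, doubling the stage count gives less than double the clock, because the fixed per-stage overhead does not shrink. More stages buy MHz, not work done.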
Meanwhile in Israel, Intel's design team had been working on a notebook CPU for el-cheapo notebooks called Timna. They concluded that the Pentium 4 NetBurst architecture (the crazy 20-stage long one) was hopeless for power conservation. They wanted to get more work done per clock, not less. Timna was also to have a built-in memory controller and graphics functionality. The project was cancelled however, and some people say this experience is why Intel was never again open to an on-board memory controller such as that found on AMD's Athlon 64 range. Another bad decision.
The Israeli team were then tasked to develop a new processor dedicated to mobiles. They knew the Pentium III architecture very well and so worked with that to reduce power and improve performance. The pipeline was stretched a little to around 12 stages (no one actually knows outside of Intel) and an improved branch prediction unit installed. This component minimizes the effect of a mispredicted branch - a problem that gets worse the longer a pipeline is. Another analogy: if you've guessed at your turn-off and you've driven for 20 miles instead of 10 before you realised, your journey time is hurt more. You can minimize that by getting a smarter navigator.
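The wrong-turn analogy can be sketched with the standard back-of-envelope formula for average cycles per instruction. This is an illustration only: the function name is hypothetical, the penalty is assumed to be roughly equal to the pipeline depth, and the branch fraction and mispredict rates are invented, not measured on any real chip.

```python
# Toy model: branch fraction, mispredict rates and penalties are assumptions.

def effective_cpi(base_cpi: float, branch_fraction: float,
                  mispredict_rate: float, penalty_cycles: int) -> float:
    """Average cycles per instruction once mispredicted branches are paid for."""
    # Only branches can mispredict, and each miss costs a pipeline refill.
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


short_pipe = effective_cpi(1.0, 0.2, 0.05, 10)         # 10 miles down the wrong road
long_pipe = effective_cpi(1.0, 0.2, 0.05, 20)          # 20 miles, same navigator
long_pipe_smart = effective_cpi(1.0, 0.2, 0.025, 20)   # 20 miles, smarter navigator

if __name__ == "__main__":
    print(round(short_pipe, 3), round(long_pipe, 3), round(long_pipe_smart, 3))
```

In this sketch the longer pipeline needs a predictor that is wrong half as often just to break even with the shorter one, which is the trade-off the navigator analogy describes.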
What the Israeli team were working on came to be known as the Banias core, aka the Pentium M processor. The Centrino branded notebooks have Pentium M processors in them and it's clear how well they've done - they've all but kicked AMD out of the market in thin and light notebooks.
At the same time, Intel was ramping up the clock speed on the Pentium 4 yet further. Up to 3GHz and beyond. The AMD Athlon 64 series has a 12-stage pipeline and is clocked at much lower speeds to get the same work done, and hence they introduced the numbering scheme again. For example, the Athlon 64 4000+ processor actually runs at 2400MHz. Obviously consumers aren't savvy enough to know the difference between these clock speed apples and oranges, so the marketing departments have rolled up a number that makes more sense.
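As a quick bit of arithmetic on that example: if you read the model number as a rough "equivalent MHz" figure (which is my assumption here, not an official AMD definition), you can work out how much credit each real MHz is being given. The function name is made up for the illustration.

```python
# Assumption: the model number is treated as a rough "equivalent MHz" rating.

def rating_per_mhz(model_number: int, actual_mhz: int) -> float:
    """How many 'rated' MHz the marketing number claims per real MHz of clock."""
    return model_number / actual_mhz


athlon_64_4000 = rating_per_mhz(4000, 2400)  # figures quoted in the text above

if __name__ == "__main__":
    print(round(athlon_64_4000, 2))
```

On that reading, each Athlon 64 clock tick is being billed as roughly one and two thirds of a competing tick, which is the apples and oranges problem in a single number.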
The trouble is that, about the time of Barrett's announcement to the company, the Pentium 4 'Northwood' core was out of steam and the replacement core, the 'Prescott', was delayed. This would have an absolutely insane 31-stage pipeline, offset by a branch prediction unit virtually self-aware in its complexity, and to compensate for reduced performance across the board, Intel spent the spare die space (smaller transistors) on a big whack of cache - 2MB of it. It was, just like all other Intel cores in this lunatic escapade, slower at the same clock speed than the previous unit. Also, despite the fact it's fabricated on the bleeding-edge 90nm production process (very, very small transistors, which normally make things go faster and use less power), Prescott in fact used lots more power and was slower. But hey, you can buy it in faster MHz so all is good, right?
At this stage, people in Intel must really have been feeling the heat - literally in Prescott's case. AMD wasn't showing signs of slowing down, the firm had managed to get Microsoft onboard their 64-bit x86 instructions, and the mighty AMD Athlon FX series CPUs were basically faster processors in everything except for media encoding type applications. The FX series was notably a lot faster for gaming than Intel CPUs, and that's a bad thing because hard core gamers are generally the consumers of the highest end, most expensive chips.
Desperation began to show. Back in the Northwood era Intel fabricated a ridiculous CPU, essentially a Pentium 4 Xeon with 2MB of cache, called it the 'Extreme Edition' and charged more money for this CPU than you'd pay to go down PC World and pick up an entire computer system. Just to be known as king of the hill once more. It didn't work; it still wasn't faster than AMD. Then things got so bad that Intel couldn't shore things up easily any more.
4GHz. AMD had Athlon 64 4000+ on the roadmap. In fact you can buy those now and the bigger brother, the FX-55 CPU. Intel was forced to admit that even Prescott couldn't do 4GHz without heroic measures. What heroic measures were those? Well, basically redesigning the entire ATX PC form factor to place a hoover over the CPU and blow air directly over the CPU from outside of the case. The industry largely greeted this development with howls of derision and eventually Intel cancelled the 4GHz Pentium 4.
My God, now we have an Intel which has no answer to AMD's finest and certainly no answer to AMD's next generation goodies, the 4200+ and the FX-57, due out shortly. What the hell were they going to do?
During this end-game, Intel went back on the whole marketing philosophy which had led them down this road. Their branch prediction unit had failed in the most spectacular way and sent the bulk of their desktop CPU development down a dead end path. Like their outrageous Prescott CPU, they had to lumber back a long way to come up with an alternative.
Suddenly clock speeds disappeared from Pentium 4 chips when the move to the LGA socket 775 happened. On the market now, the Pentium 4 530 is a 3.0GHz unit and the 540 is a 3.2GHz CPU. You've really got to ask why they moved to another core designed to ramp MHz up and then promptly removed the MHz ratings. Crazy stuff indeed.
What are they going to do in the future? Firstly, it's widely known that the next desktop processor, Smithfield, is going to be dual core. Some people had speculated, myself included, that this was going to be two Pentium M cores on a single die. That's not correct. In fact Smithfield is a fairly typical Intel stopgap measure. It appears most likely to be two Pentium 4 Prescott cores in a single package (the physical thing with all the pins) wired together internally. Heh, that means it's going to be even hotter than Prescott is, and the initial figure of 130W of power appears to confirm that - being about 15% more than the current Prescott. These will most likely be clocked slower than the single Prescott CPUs available today.
So ironically once again, Intel will launch a CPU which will be slower than the one it replaces. Only this time you'll have to re-write your software to be multi-threaded, or run multiple programs, to gain the performance boost. In fairness, that's not so bad because this change was coming anyway and pretty much everyone designing CPUs is looking at multi-cores for the future.
Intel will really only finally get back on track when they bring out Yonah, a good year away according to the current roadmap. This is a dual core processor based on Dothan-like Pentium M cores (Dothan being the latest Pentium M, more in part 2). This is the culmination of Intel cancelling an entire architecture, dumping the marketing reliance on clock rates and admitting that the Israeli team have produced a significantly superior architecture. One which you can actually buy today in laptops but no one thought to ask how it might run on the desktop.
Or did they? Intel would have thought about it, but imagine letting those pesky Israelis show up their entire North American engineering operations? Unthinkable. It's just as well that no one would get to compare the processors side by side. Just as well no one would benchmark the Pentium 4 Prescott versus the Pentium M Dothan, right?
Then the unthinkable happened; some Taiwanese bastards went and made a desktop motherboard that took a Pentium M CPU.
Stay tuned for part 2.

4 comments:

  1. Irrespective of performance, the price delta between a 770 Pentium CPU and a Pentium 4 3.4GHz 550 is still $200-300 AND you have to overclock it to hit the benchmark numbers. Smithfield will, as you say, be not too successful performance wise but what else are they supposed to do? From my point of view you are getting two Pentium 4 processors for the price of one, as the L2 cache is split 50/50 between the two. All it takes is for someone to tweak Doom3/HL2 to support it and there'll be a significant performance boost, but without the price that goes with it. As for Dothan - it will only ramp so far... as far as I can see they are topping out at 780 (2.2GHz) before Yonah in Q1 next year, which will only have 1MB L2 cache per core instead of Dothan's 2MB.

  2. Hmm I can counter a lot, well pretty much all, of what you've said. First of all, it's true there is a big price difference between those CPUs but that has very little to do with the actual cost of producing those cores - purely that high end Pentium Ms go in more expensive products. It's a price the market will bear. The Dothan core is 84 square mm versus the 112 square mm core of Prescott. That means that Prescott is much more expensive to make.
    What are Intel supposed to do instead of the ridiculous Smithfield core? Well how about manufacturing a Dothan-like core with a built-in memory controller and a modern chipset with PCI-Express etc to support it. This would be remarkably faster than anything Prescott could do.
    'All it takes' is a tweak to convert games to being multi-threaded? I think you've got a perilously shaky grasp of software engineering. Id Software has tried for years to make their games SMP compatible and it's never worked. Valve have yet to even try. Furthermore, these games aren't really CPU limited anyway.
    I'm not entirely sure I buy Dothan topping out at 2.2GHz. Why am I not entirely sure? Because my fucking lounge server is a Dothan running at 2.5GHz, that's why. The heatsink is barely warm to touch and it's absolutely annihilating my P4 3.6GHz Prescott in pretty much every test other than memory bandwidth. A limitation which we know has nothing to do with the core and everything to do with the architecture around it. 2.2GHz is before the shift to 65nm too.
    It's precisely because the core is so small, and because with the move to 65nm it will be even smaller, that putting two on a single die is feasible. Then Intel will have something really kick arse on the table.
    Finally, all of your comments so far seem to be defending the use of the Pentium 4 in practical terms today - on the desktop. That is not the thrust of my argument here. My argument is that Intel have had a higher performance, highly efficient core in production which makes the Prescott look deeply silly.
    To say nothing of the fact that Intel have virtually created an industry in CPU cooling which did not, when you examine the engineering, actually need to exist. If you need further proof, consider this. The Prescott CPU I bought for around about £160 needs this to cool it silently while the Dothan CPU I bought for £143 only needs this to cool it silently (please note the parallel socket as a size comparison). What's more, overclocking my Prescott as far as it will go ends up encoding a song in LAME in 49 seconds, and on the Dothan it takes 42 seconds. That's on lousy slow single channel memory too.
    It doesn't take a genius to realise we've been had.
