Nuvia: don’t hold your breath

Cmaier

Site Master
Staff Member
Site Donor
Top Poster Of Month
Posts
5,502
Reaction score
8,933
I just asked the author, and he confirmed that the X Elite allows you to target the NPU on Windows, whereas the M3 doesn’t. So the results for the M3 are NPU plus CPU, which I would imagine accounts for the energy difference.

macOS really needs an API to target the NPU exclusively.
They tend to avoid that when they think their own solution is a stopgap until Arm adopts equivalent functionality.
 

Artemis

Power User
Posts
102
Reaction score
50
I still say it does. Whether you choose to take advantage of it is another matter. Depending on your package strategy, the RAM is located either right above the CPU or adjacent to it. Call it 10 mm away. That means the time of flight is 60 ps. Put the memory in DIMMs nearby on the motherboard instead. Call that 10 cm away. That’s 600 ps. So let’s call it a 500 ps difference. Assume your CPU runs at 4 GHz. That means your CPU cycle time is 250 ps. So it takes 2 extra cycles to address the RAM, and 2 extra cycles to read the RAM. Boo.

That’s assuming you optimize your design to take advantage of all that.
Yeah, all else equal I would guess it does, by a few nanoseconds.
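
For anyone who wants to plug in their own distances and clock speeds, here is a minimal sketch of the time-of-flight arithmetic quoted above. The 6 ps/mm propagation figure is an assumption taken from the "10 mm is 60 ps" number in the quote, and the function names are made up for illustration. It only models wire flight time, not controller or DRAM latency.

```python
# Sketch of the time-of-flight arithmetic from the quoted post.
# Assumption: ~6 ps per mm of propagation delay, implied by "10 mm = 60 ps".
PS_PER_MM = 6.0


def flight_time_ps(distance_mm: float) -> float:
    """One-way signal flight time in picoseconds."""
    return distance_mm * PS_PER_MM


def cycle_time_ps(clock_ghz: float) -> float:
    """Clock period in picoseconds (1 GHz -> 1000 ps)."""
    return 1000.0 / clock_ghz


def extra_cycles(near_mm: float, far_mm: float, clock_ghz: float) -> float:
    """Extra one-way latency, in CPU cycles, of the farther memory placement."""
    delta_ps = flight_time_ps(far_mm) - flight_time_ps(near_mm)
    return delta_ps / cycle_time_ps(clock_ghz)


if __name__ == "__main__":
    # On-package RAM (~10 mm) vs. DIMMs on the motherboard (~100 mm) at 4 GHz.
    one_way = extra_cycles(near_mm=10, far_mm=100, clock_ghz=4.0)
    print(f"extra cycles each way: {one_way:.2f}")
```

With the numbers from the quote, the unrounded difference is 540 ps, a little over 2 cycles each way at 4 GHz, which matches the rounded "2 extra cycles to address, 2 extra cycles to read" figure.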
 

Jimmyjames

Site Champ
Posts
785
Reaction score
881
LOL

Andrei used powermetrics (the old one, which was better than what we have today and still included DRAM) to show that it sucked. So weird you keep coming back to “he used it”. Dude, he used it and thinks it sucks, and that was why he published about it.



And he did that before they removed DRAM measurements too, so even if it were accurate it’d still have that issue. Mind you, the accuracy issue cuts both ways.


Geekerwan only recently started using Apple’s internal modeling tool, which may also include other things PowerMetrics modifies.

And regardless, Andrei also called out those same 3.6W/7W A17/M4 measurements you posted here or elsewhere as “nonsense” (or bullshit, I can’t remember) in the Chips n Cheese discord and he would actually know. I happen to agree, the A17 especially is very unlikely.

The 11W M4? He says that’s reasonable.

Btw, Notebookcheck does a skimmed powermetrics, but their main measurements are taken at the wall, with an external monitor these days.

And second of all, that’s fine RE: the MS-funded research, but I’d rather have it from the VRMs there too and compare to Apple. You can get directional ideas, such as that an NPU might be more efficient than a CPU. I’m just not a huge fan of using first-party APIs and firmware tools to compare two SoCs on power when we can do it other ways and when the horse races are close.

Battery rundown, quite frankly, is probably one of the best ways to do things, ideally using multiple suites.
You keep stating this and yet show zero proof.

Where did Andrei say it sucked? Show it. Where did he say 11 watts for the CPU alone is reasonable?

You are obsessed with DRAM power usage when we are discussing the CPU usage alone.

I’m sure you know better than MS. Maybe they said powermetrics sucked too!

All methods have their pros and cons; pretending that powermetrics is worse than wall power for a measurement like CPU is nonsense.
 

Cmaier

Site Master
Staff Member
Site Donor
Top Poster Of Month
Posts
5,502
Reaction score
8,933

Yeah, all else equal I would guess it does, by a few nanoseconds.

When I was busy shaving time off of Opteron critical paths, a nanosecond was more than my entire cycle time :) If Apple had the manpower to spend on it, they’d take advantage of that and gain the tiny latency advantage. But they don’t. (At AMD we would have. We were pretty relentless back then. Though, to be fair, keeping track of every one of millions of transistors in your part of the chip did tend to make us a little bit insane.)

But I got better.

I swear.
 

Artemis

Power User
Posts
102
Reaction score
50
You keep stating this and yet show zero proof.

Where did Andrei say it sucked? Show it. Where did he say 11 watts for the CPU alone is reasonable?

You are obsessed with DRAM power usage when we are discussing the CPU usage alone.

I’m sure you know better than MS. Maybe they said powermetrics sucked too!

All methods have their pros and cons; pretending that powermetrics is worse than wall power for a measurement like CPU is nonsense.
The Chips n Cheese discord has his comments. I’m not obsessed with DRAM, but DRAM is part of the power draw; that’s also how Apple keeps power down, and it is part of the system. If you want to know the CPU power, don’t be surprised when AMD and Intel guys quote implausibly low numbers relative to what you expect, and then you’ll balk and I’ll laugh and say “told you” because crappy PD and smaller cache on the CPU influence that.

And RE: software: other tools are more accurate than Apple’s.

IMG_2504.jpeg

IMG_2503.jpeg

RE: Apple and powermetrics:

IMG_2506.jpeg

IMG_2507.jpeg

I’m going to take Andrei on this, lmao.
 

Artemis

Power User
Posts
102
Reaction score
50
When I was busy shaving time off of Opteron critical paths, a nanosecond was more than my entire cycle time :) If Apple had the manpower to spend on it, they’d take advantage of that and gain the tiny latency advantage. But they don’t. (At AMD we would have. We were pretty relentless back then. Though, to be fair, keeping track of every one of millions of transistors in your part of the chip did tend to make us a little bit insane.)

But I got better.

I swear.
Haha. Cliff, it’s been a while, but I remember you talking about adiabatic logic. Did you ever work with any of that? This is random, but I feel like I remember you mentioning this.
 

Cmaier

Site Master
Staff Member
Site Donor
Top Poster Of Month
Posts
5,502
Reaction score
8,933
Haha. Cliff, it’s been a while, but I remember you talking about adiabatic logic. Did you ever work with any of that? This is random, but I feel like I remember you mentioning this.

No :) No sane person actually did that outside a university. Except maybe Intel. I seem to recall they had some ALU they published about in JSSC, and the press went crazy, and it never came to anything. I did do other weird stuff, like fully-differential current mode logic with bipolar transistors (which I even did again at Exponential). But nothing as crazy as that stuff.
 

Artemis

Power User
Posts
102
Reaction score
50
No :) No sane person actually did that outside a university. Except maybe Intel. I seem to recall they had some ALU they published about in JSSC, and the press went crazy, and it never came to anything. I did do other weird stuff, like fully-differential current mode logic with bipolar transistors (which I even did again at Exponential). But nothing as crazy as that stuff.

Ahh it might’ve been bipolar transistors I was thinking about 😁. Yeah, reversible computing is more out there.

I need to go back and read some of your dissertation again, which is above my head but still enjoyable. Caching is just an incredibly fun topic. Everything’s a tradeoff and there is no free lunch, but you might find a cheap date here and there, is how I see it.
 

Jimmyjames

Site Champ
Posts
785
Reaction score
881
The Chips n Cheese discord has his comments. I’m not obsessed with DRAM, but DRAM is part of the power draw; that’s also how Apple keeps power down, and it is part of the system. If you want to know the CPU power, don’t be surprised when AMD and Intel guys quote implausibly low numbers relative to what you expect, and then you’ll balk and I’ll laugh and say “told you” because crappy PD and smaller cache on the CPU influence that.

And RE: software: other tools are more accurate than Apple’s.

View attachment 29537
View attachment 29538

RE: Apple and powermetrics:
View attachment 29539

View attachment 29540
I’m going to take Andrei on this, lmao.
When it suits you, I’m sure you do cherry-pick. All methods have errors involved. If he said it, then fair enough. I only know his Twitter account. We’ll see how accurate he is when the M4 reaches the Mac.

It’s worth remembering he compiled the initial data which quoted the 3200+ Geekbench score, a number which was used despite it being due to a lack of fan control on Linux. Before he worked for Apple’s competitor, he used powermetrics; now that he works for Qualcomm, he doubts it’s accurate. That is not to say he is wrong, but as with everything else, it has to be proved. Especially when battery life seems better on the M4 iPad Pro vs the M2.

To be clear, we’re expected to believe that the M3 used about 15 watts multi-core, according to the terrible chart they show here,

and that the M4 is now using 66% of that on single-core performance, without losing battery life?
 