Microsoft’s Project Scorpio: More Hardware Details Revealed

This news piece contains speculation, and suggests silicon implementation based on released products and roadmaps. The only elements confirmed for Project Scorpio are the eight x86 cores, >6 TFLOPs, 320 GB/s, it’s built by AMD, and it is coming in 2017. If anyone wants to officially correct any speculation, please get in touch. 

One of the critical points of contention with consoles, especially when viewed through the lens of the PC enthusiast, is the hardware specifications. Consoles have long development processes, and are thus already behind the curve at launch. With a console life-cycle of anywhere from five to seven years, the gap to high-end components then widens rapidly. The trade-off is usually that the console is an optimized platform, particularly for software: performance is regular and it is much easier to optimize for.

For six months or so now, Microsoft has been teasing its next generation console. Aside from launching the Xbox One S as a minor mid-season revision to the Xbox One, the next-generation ‘Project Scorpio’ aims to be the most powerful console available. While this is a commendable aspiration (one that would look odd if it wasn’t achieved), the meat and potatoes of the hardware discussion has still been relatively unknown. Well, some of the details have come to the surface through a PR reveal with Eurogamer’s Digital Foundry.

We know the aim with Project Scorpio is to support 4K playback (4K UHD Blu-Ray), as well as a substantial part of 4K gaming. With recent introductions in the PC space of ‘VR’ capable hardware coming down in price, Microsoft is able to carefully navigate what hardware it can source. It is expected that this generation will still rely on AMD’s semi-custom foundry business, given that high-end consoles are now on x86 technologies and Intel’s custom foundry business is still in the process of being enabled (Intel’s custom foundry is also expected to be expensive). Of course, pairing an AMD CPU and AMD GPU would be the sensible choice here, with AMD launching a new GPU architecture last year in Polaris.

Here is a table of what was revealed:

Microsoft Console Specification Comparison

| | Xbox 360 | Xbox One | Project Scorpio |
|---|---|---|---|
| CPU Cores/Threads | 3/6 | 8/8 | 8 / ? |
| CPU Frequency | 3.2 GHz | 1.75 GHz | 2.3 GHz |
| CPU µArch | IBM PowerPC | AMD Jaguar | AMD Jaguar |
| Shared L2 Cache | 1MB | 2 x 2MB | 2 x 2MB ? |
| GPU Cores | | 16 CUs, 768 SPs, 853 MHz | 40 CUs, 2560 SPs ?, 1172 MHz |
| Peak Shader Throughput | 0.24 TFLOPS | 1.23 TFLOPS | >6 TFLOPs |
| Embedded Memory | 10MB eDRAM | 32MB eSRAM | None |
| Embedded Memory Bandwidth | 32GB/s | 102-204 GB/s | None |
| System Memory | 512MB GDDR3-1400 | 8GB DDR3-2133 | 12GB GDDR5-1700 (6.8 Gbps) |
| System Memory Bus | 128-bits | 256-bits | 384-bit |
| System Memory Bandwidth | 22.4 GB/s | 68.3 GB/s | 326GB/s |
| Manufacturing Process | | 28nm | 16nm TSMC |

At the high level, we have eight ‘custom’ x86 cores running at 2.3 GHz for the CPU and 40 compute units at 1172 MHz for the GPU. The GPU will be paired with 12GB of GDDR5, to give 326GB/s of bandwidth. Storage is via a 1TB HDD, and the optical drive supports 4K UHD Blu-Ray.
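The bandwidth figures in the table follow from the bus width and the per-pin data rate. As a rough sketch in Python (the function name and the dictionary layout are illustrative, not from the source), peak bandwidth is the bus width in bits multiplied by the data rate in Gbps, divided by eight to convert bits to bytes:

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: bus width (bits) * per-pin rate (Gbps) / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, per-pin data rate in Gbps), taken from the table above.
consoles = {
    "Xbox 360": (128, 1.4),          # GDDR3-1400
    "Xbox One": (256, 2.133),        # DDR3-2133
    "Project Scorpio": (384, 6.8),   # GDDR5 at 6.8 Gbps
}

bandwidths = {name: memory_bandwidth_gbs(*spec) for name, spec in consoles.items()}

for name, bw in bandwidths.items():
    print(f"{name}: {bw:.1f} GB/s")
```

Under these inputs the Scorpio figure comes out at about 326 GB/s, which is the number quoted to Digital Foundry, slightly above the 320 GB/s previously confirmed.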

Let’s break this down with some explanation and predictions.

Eight Custom CPU Cores: But They’re Still Jaguar (or almost)

The Xbox One uses AMD’s Jaguar cores – these are low powered and simpler cores, aimed at a low-performance profile and optimized for cost and power. In non-custom designs, we saw these CPUs hit above 2 GHz, but these were limited to 1.75 GHz in the Xbox One.

AMD technically has several cores potentially available for Scorpio: Excavator (Bulldozer-based, as seen on 28nm), Jaguar-based (also from 28nm) or Zen-based (seen on 14nm GF). While the latter is a design that has returned AMD to the high-end of x86 performance computing, offering high performance for reasonable power, a Zen design would be a relatively quick turnaround from a consumer launch a month ago. Eight Zen cores would fit in with a standard Zeppelin silicon design, and AMD has been manufacturing them hand-over-fist since the launch of desktop-based Zen CPUs for PCs in March. One of the arguments against Zen inside Scorpio is the fact that it was only launched recently, and arguably the desktop PC market is more financially lucrative for AMD than the semi-custom business. Because of the time frame, even if Microsoft could go for Zen in the Scorpio, this would increase the base cost of the console by requiring the cores to be redesigned for 16nm TSMC. However, if Microsoft were going for a premium console ($700+), this might make sense.

In the Digital Foundry piece, Microsoft stated that the CPU portion of Scorpio has a 31% performance gain over the Xbox One. This isn’t IPC, this is just raw performance. Moving from Jaguar to Zen would be more than 60%, and the frequency difference between the 2.3 GHz in Scorpio and 1.75 GHz in Xbox One is almost exactly 31%. So we are dealing with a Jaguar-style core (although perhaps modified).
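The clock-speed argument is simple arithmetic. A minimal check in Python, using only the two frequencies quoted in this article (the variable names are illustrative):

```python
xbox_one_ghz = 1.75  # Xbox One CPU clock
scorpio_ghz = 2.3    # Project Scorpio CPU clock

# Relative gain from frequency alone, assuming identical per-clock performance.
clock_gain = scorpio_ghz / xbox_one_ghz - 1

print(f"Clock-only gain: {clock_gain:.1%}")  # prints "Clock-only gain: 31.4%"
```

If the quoted 31% gain is entirely explained by frequency, there is no room left for a per-clock improvement, which is why the number points away from Zen.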

(A note on Zen power and frequency – 2.3 GHz is a low frequency for a Zen CPU based on what we have seen in desktop PCs. Some work done internally on the power consumption of Zen CPUs has shown that the design requires a lot of power to move between 3.5 GHz and 4.0 GHz, perhaps suggesting that 2.3 GHz is so far down the DVFS curve that the power consumption is relatively low. Also, we’re under the impression that getting a super high frequency on Zen is a tough restriction when it comes to binning chips – offering a low-frequency bin would mean that all the silicon that doesn’t make it to desktop retail due to an inability to go up the DVFS curve could end up in devices like the Scorpio. The spec list doesn’t have a turbo frequency, which remains an unknown (if present).)

That being said, this is a ‘custom’ x86 core. Microsoft could have requested specific IP blocks and features not present in the original Jaguar CPUs but present in things such as Zen, such as power management techniques. Typically a console shares DRAM between the CPU and GPU, so it might be something as simple as the CPU memory controller supporting GDDR5. So instead of seeing Zen coming to consoles, we’re seeing another crack at using Jaguar (or Jaguar+) but revised for a smaller process node to keep overall costs down – and given that the main focus on a console is the GPU, that’s entirely possible.

40 Customized Compute Units

AMD launched Polaris 10 last year in the RX series. This is AMD’s latest compute architecture, released on a 14nm Global Foundries process, and it gives substantial power efficiency gains over previous 28nm designs. The first consumer GPUs were aimed at the $200-$230 market and below, which is something that would be of interest to console manufacturers.

Setting aside AMD’s Fiji GPUs, which use silicon interposers and high-bandwidth memory, AMD’s latest design is the RX 480. The RX 480 is a 36 compute unit design, using 4GB or 8GB of 256-bit GDDR5 memory, giving 256GB/s of total memory bandwidth. According to the information given to Digital Foundry, Scorpio will have 40 compute units, 12 GB of GDDR5, and will be good for 320 GB/s of memory bandwidth. Technically the RX 480 is a fully enabled design, and only offers 36 compute units in total, suggesting that Scorpio is either using a new silicon spin of this design (with a potentially lop-sided memory configuration), or is moving on to a Vega-based design. The fact that the spec list has a 384-bit memory bus listed, and Polaris designs so far are limited to 256-bit, suggests that we might be dealing with Vega despite the Polaris-like configuration. That being said, it’s also a cost issue again: Vega is expected to cost a pretty penny, whereas consoles are often low-cost designs. So while the memory bus is wide, this is most likely a Polaris implementation, especially as we already know that Scorpio will be > 6 TFLOPs on 40 CUs, and the RX 480 is ~5 TFLOPs on 36 CUs.
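To see how the >6 TFLOPs figure relates to 40 CUs, here is a back-of-the-envelope sketch in Python. It assumes 64 stream processors per CU (consistent with the 2560 SPs listed in the table for 40 CUs) and two floating-point operations per SP per clock (one fused multiply-add); both constants and the function names are assumptions for illustration rather than figures confirmed by Microsoft:

```python
SPS_PER_CU = 64              # assumption: 2560 SPs / 40 CUs, as in the table
FLOPS_PER_SP_PER_CLOCK = 2   # assumption: one fused multiply-add per clock


def peak_tflops(compute_units: int, clock_mhz: float) -> float:
    """Peak single-precision throughput in TFLOPS."""
    flops = compute_units * SPS_PER_CU * FLOPS_PER_SP_PER_CLOCK * clock_mhz * 1e6
    return flops / 1e12


def clock_for_tflops(compute_units: int, tflops: float) -> float:
    """Clock in MHz needed to reach a given peak throughput."""
    flops_per_clock = compute_units * SPS_PER_CU * FLOPS_PER_SP_PER_CLOCK
    return tflops * 1e12 / flops_per_clock / 1e6


scorpio_tflops = peak_tflops(40, 1172)
min_clock_mhz = clock_for_tflops(40, 6.0)

print(f"Scorpio at 1172 MHz: {scorpio_tflops:.3f} TFLOPS")
print(f"Clock needed for 6 TFLOPS on 40 CUs: {min_clock_mhz:.1f} MHz")
```

Under these assumptions, 1172 MHz looks like the lowest round clock that clears 6 TFLOPS on 40 CUs, which would explain the otherwise odd frequency.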

The Eurogamer article quotes Andrew Goossen, Technical Fellow for Graphics at Microsoft:

“Those are the big ticket items, but there’s a lot of other configuration that we had to do as well,” says Goossen, pointing to a layout of the Scorpio Engine processor. “As you can see, we doubled the amount of shader engines. That has the effect of improvement of boosting our triangle and vertex rate by 2.7x when you include the clock boost as well. We doubled the number of render back-ends, which has the effect of increasing our fill-rate by 2.7x. We quadrupled the GPU L2 cache size, again for targeting the 4K performance.”
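The 2.7x figure in the quote can be approximated from numbers already in this article: a doubling of the units combined with the GPU clock increase from 853 MHz to 1172 MHz. A short Python sketch, assuming throughput scales linearly with both unit count and clock (a simplification, and the variable names are illustrative):

```python
xbox_one_gpu_mhz = 853   # Xbox One GPU clock from the table
scorpio_gpu_mhz = 1172   # Project Scorpio GPU clock from the table
unit_multiplier = 2      # shader engines / render back-ends doubled

# Assumes rate scales linearly with unit count and clock speed.
rate_scale = unit_multiplier * scorpio_gpu_mhz / xbox_one_gpu_mhz

print(f"Estimated rate increase: {rate_scale:.2f}x")
```

The result is a little under 2.75x, in line with the 2.7x that Goossen quotes.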

The memory bus is listed as a 384-bit interface, which means 12 32-bit channels. We are told that the GDDR5 modules are running at 6.8 Gbps. The 12GB of GDDR5 is split with 4GB available for the system and 8GB available for developers. There is no ESRAM, the stated reason being that the bandwidth of the GDDR5 is sufficient. The trade-off is a slightly higher latency, which Microsoft expects developers to hide when pushing higher resolutions.

Ideally, I want to get Ryan’s thoughts on this and will do so when he signs in for the day, but his analysis on some of the specifications back in June 2016 still stands:

The memory bandwidth of Project Scorpio, 320 GB/s, is also relatively interesting given the current rates of the RX 480 topping out at 256 GB/s. The 320 GB/s number seems round enough to be a GPU only figure, but […] how much is impossible to say at this point.

Additional: On 4K support, the latest AMD media block supports 4K60 with HEVC, as well as HDMI 2.0. When rendering 4K content to a 1080p screen, Microsoft has mandated to all developers that Ultra-HD rendering should super-sample down to 1080p. Microsoft also confirms full DX12 support, making use of new features to push draw calls and better multi-threaded capabilities.

Designing for 16nm at TSMC

If we move forward with a Jaguar plus Polaris prediction, it means that both designs will have to be reconfigured for TSMC’s 16nm process. For the Jaguar-based CPU, it would result in much lower power than 28/32nm, and also a much lower die area. Compared to the GPU, an 8-core Jaguar design might be 10-15% of the entire silicon. The GPU will likely be on similar terms, although with a larger memory bus and more CUs (44 in the design, 40 in use).

AMD recently incurred additional quarterly costs for using foundries other than Global Foundries (as per its renegotiated wafer agreement), which a number of analysts chalked up to future server designs being made elsewhere. A few of us postulated it’s more to do with AMD’s semi-custom business, and either way it points to Zen silicon being redesigned for 16nm TSMC.

Digital Foundry reported that the total die size for the combination chip is 360mm2, with seven billion transistors (including CPU and GPU), and four shader engines each containing 11 compute units (one is disabled per block). It was also mentioned that the floor plan of the silicon, aside from four groups of 11 CUs, also had two clusters of four CPU cores.

For anyone familiar with Zen, this would initially suggest we might be dealing with CCX units, given each is four cores, but the performance metrics still point to Jaguar. This means that the CPUs are a tiny chunk of the die area on the silicon, probably under one fifth of the chip. We don’t know the size of the GPU, but 36 CUs of Polaris 10 on GloFo 14nm is 232mm2 at 5.7 billion transistors. Scaled up to 40 CUs, this is around 257mm2, leaving roughly 100mm2 for cores, a memory controller, and other IO.
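The die area estimate above is a linear scaling of the Polaris 10 die by CU count. A small Python sketch of that arithmetic follows; it assumes area scales directly with the number of CUs and ignores differences between GloFo 14nm and TSMC 16nm as well as the wider memory bus, so it is a rough guide only (variable names are illustrative):

```python
polaris10_area_mm2 = 232   # Polaris 10 die area at 36 CUs (GloFo 14nm)
polaris10_cus = 36
scorpio_cus = 40           # enabled CUs quoted for Scorpio
scorpio_die_mm2 = 360      # total die size reported by Digital Foundry

# Assumption: GPU area scales linearly with CU count.
scaled_gpu_mm2 = polaris10_area_mm2 * scorpio_cus / polaris10_cus
remaining_mm2 = scorpio_die_mm2 - scaled_gpu_mm2

print(f"Scaled GPU area: {scaled_gpu_mm2:.0f} mm2")
print(f"Left for CPU, memory controller and IO: {remaining_mm2:.0f} mm2")
```

Swapping in 44 for `scorpio_cus` (the physical CU count including the disabled ones) would raise the GPU estimate and shrink the remainder accordingly.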

Microsoft also states that the power supply with the unit is rated at up to 245W. If we assume a low-frequency Jaguar CPU inside, that could be around 25W max, leaving 150-220W for the GPU. A full sized RX 480 comes in at 150W, and given this GPU is a little more than that, it is perhaps nearer 170W (or tuned to 100-150W, depending on low-frequency ranges). The power supply, in a Jaguar + Polaris configuration, seems to have a good 20-25% power budget in hand.
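The headroom estimate can be reproduced from the figures in the paragraph above. In this Python sketch the 25W CPU and 170W GPU values are this article's guesses rather than measured numbers, and the function name is illustrative:

```python
def psu_headroom(psu_watts: float, cpu_watts: float, gpu_watts: float) -> float:
    """Fraction of the power supply rating left after CPU and GPU draw."""
    return (psu_watts - cpu_watts - gpu_watts) / psu_watts


psu_watts = 245   # power supply rating stated by Microsoft
cpu_watts = 25    # speculative: low-frequency Jaguar cores
gpu_watts = 170   # speculative: a little above a 150W RX 480

headroom = psu_headroom(psu_watts, cpu_watts, gpu_watts)

print(f"Headroom: {headroom:.1%}")
```

With a 170W GPU this leaves about a fifth of the rating spare; a lower GPU estimate increases the margin.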


Source: Digital Foundry

Based on some of the discussion from the source, it would seem that AMD is implementing a good number of its power saving features from Excavator and Zen, particularly related to unique DVFS profiles per silicon die as it comes off the production line, rather than a one-size fits all approach. The silicon will also be paired with a vapor chamber cooler, using a custom centrifugal fan.

What We Don’t Know

Hardware aside, the launch titles will be an interesting story in itself, especially with recent closures of dedicated MS studios such as Lionhead.

Project Scorpio is due out in Fall / Q3 2017.

 

This article originally predicted a Zen + Polaris configuration, but due to a secondary analysis, is now a Jaguar + Polaris prediction.


Author: AnandTech
