Data center power in 2019


Before I begin discussing data center power management solutions for 2019, I would like to briefly review data centers in general. In 2016, I wrote about power supply solutions for improved efficiency; much has changed since then.

Figure 1 Data Center Frontier discussed trends in data centers for 2019 (Image of a data hall at a LinkedIn facility in Hillsboro, Oregon, courtesy of LinkedIn)

More than 175 zettabytes of data are expected to exist by 2025. Data center construction and deployment, along with upgrades to existing facilities, are booming with the advent of 5G, which starts in earnest at the 2020 Olympics in Japan (6G is already being discussed for future development), and with the growth of artificial intelligence (AI) and machine learning (ML).


Numerous connected technologies will want to use the advantages of 5G. Their challenge is processing high volumes of data at high speeds. Enter edge computing. This may be the next major technology trend after the cloud; edge computing describes an environment where data processing takes place as close as possible to the data source. This ensures speed and low latency, helping to meet 5G’s performance goals. However, there will still be a need for central data centers to handle applications’ less latency-critical needs.

What we will see is the development of next generation central offices (NGCOs). These are edge cloud data centers that can support both fixed and mobile traffic. Serving on average 35,000 subscribers per central office, compared with approximately 5,000 today, they will be located between the radio access network (RAN) and the central core.

Wherever data is stored or processed – be it on the edge, in regional centers known as metros, or centrally – there will be a growing demand for capacity. This is set to increase significantly from late 2019, and service providers will need to refine or transform their architecture to support 5G. Much of the data that will move through 5G networks will exist in the cloud, underlining the vital role that data centers must play.

Preventing disruption to the systems in the data center building is critical; downtime means dollars lost and unhappy customers. The operator can rely on uninterruptible power supply (UPS) systems and power distribution units that safely and reliably control the flow of electricity to sensitive equipment. Small to mid-sized businesses and residential buildings with back-up power generation may also be candidates for load management programs. Racks surpassing 10 kW are becoming the norm, which will make in-rack power protection less viable; the use of end-of-row UPS systems is coming.

Intelligent power management (IPM) is a combination of hardware and software that optimizes the distribution and use of electrical power in computer systems and data centers. While the installation of IPM involves up-front cost and ongoing maintenance, the technology can save money in the long term through reduced electric bills, reduced downtime, and prolonged hardware life.

Most effective IPM solutions incorporate temperature monitoring and regulation, voltage regulation, current limiting, and load distribution. Advanced IPM technology deploys branch circuit protection (each group of outlets has its own breaker or fuse) and centralized/integrated management that lets administrators monitor all data center hardware so they can isolate and resolve problems quickly. Smart load shedding is also used to methodically shut down non-essential devices under specified conditions, along the lines of the sketch below.
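As a purely illustrative sketch of that load-shedding idea (the device names, priorities, and the 8 kW limit are invented, not from any vendor's product), the logic amounts to dropping the least essential loads first until the measured draw is back under the limit:

```python
# Hypothetical smart load-shedding sketch: shut down non-essential loads,
# least essential first, until the measured draw falls back under a limit.
# Device names, priorities, draws, and the 8 kW limit are illustrative only.

RACK_LIMIT_W = 8_000

# (name, priority, measured draw in watts); priority 0 = most essential
loads = [
    ("core-switch",   0, 400),
    ("storage-array", 1, 1800),
    ("build-servers", 2, 3500),
    ("test-rigs",     3, 2900),
    ("lab-displays",  4, 600),
]

def shed_loads(loads, limit_w):
    """Return the load names to shut down, least essential first."""
    total = sum(draw for _, _, draw in loads)
    to_shed = []
    # Walk from the highest (least essential) priority number downward.
    for name, _, draw in sorted(loads, key=lambda l: l[1], reverse=True):
        if total <= limit_w:
            break
        to_shed.append(name)
        total -= draw
    return to_shed, total

if __name__ == "__main__":
    shed, remaining = shed_loads(loads, RACK_LIMIT_W)
    print(f"Shed {shed}; remaining draw {remaining} W")
```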

The use of three-phase power will typically balance power loads and maximize the available current for each load. Zoned cooling can prevent isolated overheating incidents with a minimum of wasted energy. System redundancy will ensure uninterrupted operation in cases of localized hardware or software failure, and coordinated management of power-supply hardware from multiple vendors can keep data centers up and running continuously.

New cooling techniques such as direct liquid cooling are beginning to get noticed. The market is exploring multiple options for this technology, with everything from direct water-to-chip to fully submerged servers on the table (see Submerge your power supply, and other options).

Schneider Electric’s data center division is looking at direct liquid cooling as its next big growth area. Hyperscale data center operators, the cloud platforms, should drive most of that demand. Even direct-to-chip and full immersion are being discussed.

Efficient Power Conversion’s eGaN thermal enhancements will be covered later in this article.

Lithium-ion batteries are now taking over for lead-acid batteries in data center UPS systems for battery backup.

Higher-voltage distribution of power reduces I²R losses. Moving from a 12V to a 48V power distribution scheme reduces distribution power loss by a factor of 16 for the same delivered power.
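Here is a quick back-of-the-envelope check of that 16× figure; the 1 kW load and 5 mΩ distribution resistance below are arbitrary placeholders, and the ratio is independent of them:

```python
# Back-of-the-envelope I²R comparison for 12 V vs. 48 V distribution.
# The 1 kW load and 5 mΩ distribution resistance are placeholder values.

P_LOAD_W = 1000.0    # power delivered to the load
R_DIST_OHM = 0.005   # total distribution resistance

for v_bus in (12.0, 48.0):
    i_bus = P_LOAD_W / v_bus           # current drawn at this bus voltage
    p_loss = i_bus ** 2 * R_DIST_OHM   # I²R loss in the distribution path
    print(f"{v_bus:>4.0f} V bus: {i_bus:6.1f} A, {p_loss:6.2f} W lost")

# The loss ratio is (48/12)² = 16, regardless of the placeholder values.
```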

Now let’s look at what some of the top power IC companies offer as solutions, ranging from 48V conversion down to the extreme low-voltage/high-current needs of the graphics processing unit (GPU). I love these choices because, looking through the eyes of a power designer, they present such great options for creating a power supply architecture in the increasingly challenging data server realm.

One of the key things that I learned during my 40 years as a circuit design engineer is that not all circuit architectures are created equal. One architecture may fit perfectly in a particular design, while a different architecture might fit better in a different scenario. Keep an open mind with the following solutions; your particular project will have specific power management needs, as there is no ‘cookie cutter’ design for power. Pay special attention to your customer’s ‘care abouts’ and then use your power design expertise to guide and advise the final customer toward what you feel is the best design. Communication and discussion are imperative when choosing the best power solution for any project.

That said, I really, really like solutions with GaN as the power element; that’s my personal first preference based on my design experience. The particular power architectures in which GaN can be employed are many and varied; choose wisely.

In Reference 5, Alex Lidow of EPC examines the cases for isolation vs. no isolation and regulation vs. no regulation architectures. The idea is that the first stage may not need to be regulated while operating as a DC transformer (DCX). Lidow's article goes on to look at four different solutions with intermediate buses.

It was determined that the best efficiency was achieved with either the dual inductor hybrid converter (DIHC) or the LLC with a 6 VOUT. The DIHC topology, however, is relatively new and has yet to be widely adopted. The 48 VIN–6 VOUT LLC, coupled with a 6 VIN–1 VOUT buck converter, is being quickly adopted in new AI and gaming applications due to its high efficiency, high power density, and low cost.

In all the topologies with 48 VIN, the highest efficiency was achieved with GaN devices. This is due to their lower capacitance and smaller size.

It makes so much sense to me that GaN should be the power transistor of choice in data center power architectures where size, efficiency, and speed are critical. The EPC ‘gurus,’ led by Alex Lidow, have come up with a really neat design architecture.

I particularly like EPC’s chip-scale packaging that allows six-sided cooling (Reference 10) for their eGaN devices in the data center. With the ever-increasing, enormous power appetites of GPUs, some data centers are going toward submersible liquid cooling. eGaN can help delay that cost and effort better than any other power element.

To isolate or not to isolate, that is the question (Reference 5)

At the front end of the power design, EPC has a hard-switched, transformer-isolated, regulated 48V to 12V, 500W, 1/8th-brick converter with a 12V, 42A output; their EPC9115 achieves 96 to 97% efficiency. High-frequency GaN switching shrinks the magnetics, and hence the board footprint is smaller than that of conventional silicon solutions.

For a non-isolated front end, there is a 48V to 5-12V synchronous buck converter at 25A. Again, this design uses GaN power elements with high-frequency switching for a small board footprint that silicon power elements cannot match. This design yields a 97% peak efficiency at a 15A load and 96.5% at a 25A load.

48V step-down to 12V LLC DC transformer front end

This design is able to maintain a high efficiency over a wide operating range when it is operated as a DC transformer (DCX) with a fixed conversion ratio. EPC has a 48V to 12V demo board at 900W while exceeding 98% efficiency.

Figure 2 Power architecture schematic of an N:1 LLC converter configured with a center-tapped rectifier (a) and photo of the 1 MHz, 900 W capable, 48 V to 12 V LLC converter with dimensions (b) (Reference 5)

48V to 6V LLC converter with 8:1 ratio transformer (Reference 5)

This design can handle 900W at 1 MHz. This transformer design uses a 14-layer PCB and has a magnetizing inductance of 2.2 µH and, of course, GaN power elements.

Figure 3 A 1 MHz, 900W, 48V to 6V LLC converter design (Reference 5)

There seems to be a larger trend toward 4V-to-the-load architectures right now. It depends upon the output transistors and some other components, but you really do not lose that much by going from a 6V to a 4V output. The transformer would become a bit bigger at 900W because it scales with that ratio, since it is a matrix transformer (all of the original matrix transformer patents have expired, and that IP is now in the public domain). Efficiency-wise, this architecture would be around 98%.

Alex Lidow said 6, 5, 4, and even 3.3V architectures are all viable. His piece on powering graphics processors from a 48V bus (Reference 5) analyzes several intermediate stage architectures going down to the point of load (POL).

48V to 1V hybrid converter (References 5, 8, 9)

This design uses a dual inductor hybrid converter (DIHC) (Reference 8) based on the Dickson switched-capacitor converter (Reference 9). The DIHC architecture uses two interleaved inductors at the output and does not need the two large synchronous switches that the hybrid Dickson converter needed (S9 and S10 in Figure 4).

Figure 4 The 8:1 Dickson converter (Reference 9)

This allows the DIHC to have an almost 2× lower DC output impedance contribution from the conduction of the switches and flying capacitors, leading to 2× smaller conduction losses than the hybrid Dickson converter. See the DIHC efficiency curves in Figure 5.

Figure 5 The DIHC's measured efficiency with a 48V input and various low-voltage outputs (Reference 5)

Higher operating frequencies, with bandwidths in the region of 2 MHz, improve the design's response time.

The highest efficiency was achieved with either the DIHC or the LLC with a 6 VOUT. EPC acknowledges that the DIHC topology is relatively new and has yet to be widely adopted. I see good things for this kind of architecture as we move forward with GaN content in such designs.

The 48 VIN–6 VOUT LLC, coupled with a 6 VIN–1 VOUT buck converter, is a winner in efficiency, power density, and cost, and is a good fit for new AI and gaming applications.

In all the topologies with 48 VIN, GaN devices exhibit the highest efficiency. This is due to their lower capacitance and smaller size.

I am very impressed with STMicroelectronics’ strategy and offerings for data center power architectures. First of all, they are a part of the Power Stamp Alliance. I like the fact that both designers and procurement people in companies developing cloud data center power get the option of multi-sourced power solutions that are form-, fit-, and function-compatible across multiple power vendors. The data center people are happy with that as well.

Figure 6 STMicroelectronics has a really nice array of data center power solutions (Image courtesy of STMicroelectronics)

I spoke with Paolo Sandri, Product Marketing Manager for power ICs at STMicroelectronics, about their data center power management strategy. The bottom line is that ST is offering a variety of options that are based upon their customer needs. I like that because one size does not fit all.

48V to 12V intermediate unregulated bus converter (IBC)

This design has a 40V to 60V input capability with a 4:1 conversion ratio and up to 1kW thermal design power (TDP). Intel defines TDP (Reference 11) as “A specification of the processor. OEMs must design thermal solutions that meet or exceed the TDP as specified by the processor’s datasheet.” ST understands the processor needs!

Figure 7 A switched tank converter (STC), an unregulated IBC (Image courtesy of STMicroelectronics)

This design has zero-current switching (ZCS) for all the MOSFETs, uses off-the-shelf components, and has a low profile height of less than 5mm. The size is a 1/8th brick with 98% efficiency at 360W.

The STBuck (stacked buck), 48V to 12V regulated IBC

Figure 8 A regulated intermediate bus scalable architecture as high as 3.2 kW with an 800W TDP (Image courtesy of STMicroelectronics)

This converter architecture has a 36V to 60V input with a 12V output that is adjustable via the PMBus (I like digital power control!). The form factor is a 1/8th brick, and the design is scalable up to four cells.

The efficiency curve peaks at 98.4% at 150W output (Figure 9).

Figure 9 The STC efficiency curve (Image courtesy of STMicroelectronics)

48V to POL direct conversion

This design has a 40V to 60V input range with a VOUT for an Intel VR13.HC processor at a typical 1.8V, with full compliance to Intel’s test plan. It can also power DDR memory at a typical 1.2V.

Figure 10 Direct conversion from 48V to the POL, which can be isolated or non-isolated (Image courtesy of STMicroelectronics)

It also has a 205W TDP with a 413W maximum. The maximum IOUT is 228A at a switching frequency of 570 kHz. The power density is 100 W/in² with a board footprint of 1.6×2.6 in.
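As a quick sanity check on those numbers (using only the 413W maximum and the 1.6×2.6 in. footprint quoted above), the power density works out to roughly the stated 100 W/in²:

```python
# Sanity check of the quoted power density, using the figures above.
P_MAX_W = 413.0          # maximum output power
AREA_IN2 = 1.6 * 2.6     # board footprint in square inches
print(f"{P_MAX_W / AREA_IN2:.0f} W/in^2")   # ≈ 99 W/in², i.e. ~100 W/in²
```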

I give STMicroelectronics an “A” rating for excellent data center power options for designers.

I spoke to Robert Gendron, Corporate Vice President, Product Marketing & Technical Resources at Vicor regarding their data center strategy. I first asked about the use of GaN in their architectures; Vicor has evaluated that technology along with other FETs.

Their non-isolated bus module (NBM), a bi-directional, fixed-ratio architecture, converts 48V to 12V at 97.9% peak efficiency, and soon Vicor will have a next-generation version pushing 98.5% efficiency. Given the high density of the NBM (providing 800W of continuous output power in a 22.8×17.3×7.4 mm package), they have several designs where the NBM is placed on the motherboard in front of a 12V multiphase voltage regulator (VR).
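To put that package in perspective, a quick conversion of the quoted figures (800W continuous in a 22.8×17.3×7.4 mm package) gives a volumetric power density in the neighborhood of 4,500 W/in³:

```python
# Volumetric power density of the NBM from the figures quoted above.
MM3_PER_IN3 = 25.4 ** 3

volume_mm3 = 22.8 * 17.3 * 7.4           # package volume in mm³
volume_in3 = volume_mm3 / MM3_PER_IN3    # ≈ 0.18 in³
print(f"{800.0 / volume_in3:.0f} W/in^3 continuous")   # ≈ 4,500 W/in³
```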

Figure 11 Vicor’s VR can get close to the processor (Image courtesy of Vicor)

Their DC-DC converter module (DCM), a 48V to 12V, non-isolated converter, supports 48V high-performance GPUs in data centers that are still relying on legacy 12V power distribution.

Vicor has also built out an entire portfolio of devices to support 380V, AC, and legacy 12V loads feeding a 48V hub. The “hub” enables 48V or safety extra low voltage (SELV) power distribution within the rack, minimizing losses. From the 48V, they can go directly to the CPU with their factorized power solutions, including lateral and vertical power delivery schemes. They can also support conventional multiphase voltage regulators at 12V with their NBM and DCM products providing 48V to 12V conversion.

Figure 12 More Vicor solutions (Image courtesy of Vicor)

Designers can adapt 12V motherboards for 48V rack usage with very little design modification, even though these solutions were originally designed for high-performance computing (HPC) use.

With GPUs drawing currents of 500A and moving to 1,000A in the future, plus AI and deep learning coming into data center processing, we need innovative design architecture thinking to properly serve those applications.

Processor designers are building many cores into their architectures nowadays to reduce latency for applications such as 5G, smart homes, and smart factories. There are 500W processors now. The goal is to get everything onto the same die to reduce latency, and they also want optocouplers right adjacent to the processor for the same reason.

Vicor recognizes that board losses are significant; they feel this outweighs an extra 1% increase in efficiency in the power solution. They look at power loss in 400 µΩ traces, and at 200A there is a loss of 10% into the board itself. Even with 99% converter efficiency, you can lose a great deal of power in those board-trace I²R losses at currents of 200A. The quick calculation below shows where the 10% figure comes from.
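The 400 µΩ and 200A numbers are Vicor's; the ~0.8V core-rail voltage is my own assumption for illustration. With those values, 200A through 400 µΩ dissipates 16W, about a tenth of the power actually delivered to a sub-1V processor rail:

```python
# I²R loss in the board traces, using Vicor's 400 µΩ / 200 A figures.
# The 0.8 V core-rail voltage is an assumption for illustration.

I_LOAD_A = 200.0
R_TRACE_OHM = 400e-6
V_CORE_V = 0.8

p_trace = I_LOAD_A ** 2 * R_TRACE_OHM   # 16 W dissipated in the copper
p_load = V_CORE_V * I_LOAD_A            # 160 W delivered to the processor
print(f"trace loss {p_trace:.1f} W = {100 * p_trace / p_load:.0f}% of load power")
```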

Vicor’s current multiplier solution is another excellent choice for a designer’s power architecture solution to power data center processors.

Power-on-package technology enables current multiplication, allowing for higher efficiency, density, and bandwidth. Providing current multiplication within the package can reduce interconnect losses by up to 90%, while allowing processor package pins, typically required for high current delivery, to be reclaimed for expanded I/O functionality.

There are ‘off limits’ areas near the processor, since noise can be a problem for processor accuracy and cause errors. The vertical power module (VPM) has a very low-noise topology. Intel did a study about three years ago in which colors in an infrared image represented noise, showing acceptable noise levels near the processor.

Vicor says that one of the problems found in multiphase designs is that the inductors/magnetics are very noisy. Vicor does not use an inductor in their output stage, so the noise levels are low, and this enables them to get right up close and personal to the processor and I/O lines. They have never had a noise issue in these architectures or tests.

Another issue is that most AI processors and other very high-speed GPUs need access on all four sides of the processor chip. Data centers and other electronic architectures have always wanted power to be virtually invisible. These restrictions thus bring a challenge for power designers in these kinds of architectures.

The geared current multiplier architecture

Figure 13 The Vicor geared current multiplier (GCM) (Image courtesy of Vicor)

Vicor’s largest effort is with vertical power delivery (VPD). They charted customer requests for peak current delivery requirements in CPUs, GPUs, and ASICs. There is a clear change, starting around the 2010 timeframe, of peak currents dramatically increasing. This is due to expanding AI processing capability and the smaller processor fabrication nodes of 14, 12, 10, and now 7 nm.

Figure 14 Ever-increasing peak currents over time (Image courtesy of Vicor)

More power being delivered on the board means a greater focus on converter efficiency, along with the losses in the distribution or delivery of that power. One problem is that as power increases, the size of the converters typically increases, meaning they cannot be placed near the processor. This placement problem then creates additional losses in the board.

The Nvidia SXM3 is a great example of an AI processor that utilizes Vicor's factorized power solution. This type of solution can provide over 1,000A of peak current. Given the size and low height of these devices, they can be placed closer to the processor than conventional VRs.

These I²R losses create inefficiency and also generate heat. Vicor plotted the losses for 400 µΩ of board resistance; at just 200A you lose 10% efficiency into the board.

Their factorized power solutions typically reduce board losses by 90% when the devices are used in a lateral configuration much like in the Nvidia solution. However, as currents increase further, there is a need to reduce the delivery losses even more in addition to reducing the size of the VR. The size of the VR is a very important consideration because as processors consume more current, and increase their computational capability, there is a need for increased I/Os and faster I/Os. A VR placed laterally to a processor (anyone’s VR) can block valuable gateways to and from the processor.

This is what led Vicor to VPD. It eliminates almost all power delivery losses and enables unencumbered access to the processor on all four sides. They were able to adapt the same factorized power architecture that they have used for 10 years into a VPD VR. Their current multiplier device, which has evolved from the VTM, to the MCM, and now the GCM, is placed directly under the processor. In addition to providing the current delivery, their GCM device also contains the critical bypass capacitance that is found directly underneath a processor.

Vicor has also recently announced a collaboration with Kyocera in which Vicor provides the VR solution and Kyocera provides the processor substrate (or package) design to customers. This collaboration enables designers to quickly adopt the VPD VR solution.

The switch to a 48V topology is being driven by increasing power levels in the rack. Nvidia made the switch to 48V entirely last year, when they launched their DGX-2 platform. The cutoff at which it makes sense to go to a 48V solution is between 10 and 20 kW; above that, losses have to be minimized as much as possible given such a high-GPU-current-draw architecture. Infineon has many solutions, including the use of the GaN power element in some designs. An example is a power stage that converts AC power directly to 48V with the use of GaN. The power level is about 500W per CPU/GPU, and all of the processor circuitry for a single processor typically sits in a 10×10 cm area on the board.

At the card level, Infineon has GaN-based half bridges as well as traditional silicon solutions depending upon what tradeoffs and considerations the customer needs. Infineon has another good solution for designers to consider: GaN transistors (Reference 7).

Figure 16 Server supply comprising a totem pole AC-DC rectifier with two interleaved high frequency bridge legs and an LLC DC-DC converter with center-tapped transformer (Image courtesy of Infineon)

A 48V system in this topology (Figure 16) might be implemented with full-bridge rectification. As computational architectures move toward more parallel processing with GPUs, power consumption per rack is tripling to 20 kW or greater. For this use case, distribution losses using 12V supplies are too great. It’s typically the accelerator cards, with GPUs and high-end FPGAs and ASICs running customized AI workloads, that drive this. Thus, on-rack 48V supplies are growing in popularity, with an emphasis on efficiency.

Further improvements can be achieved in the DC-DC stage by populating the primary side half-bridge with 35 mΩ GaN devices. Taking advantage of the order of magnitude lower Qoss charge, with associated adjustments in resonant frequency and magnetizing inductance, increases system efficiency by about 0.3%. Changing the transformer setup to a matrix structure, with series connected primary windings and parallel connected secondaries, yields another 0.3% improvement.

The combination of improvements in both the PFC and LLC stages raises peak efficiency up to 98.5% at a power density of 30 to 35 W/in³. Infineon is also working with all the modular power manufacturers to license their solution. The company still thinks that 12V servers will continue to be a significant piece of the server market for a while.

For the future, they see the intermediate 12V bus going down to 7, 6, or 5V. In these cases, the size of the solution can shrink as you go to higher frequencies, and the power supply can get closer to the processor. Infineon’s typical power stages today come in 5×6 mm and 3×5 mm packages within that same power envelope.

Analog Devices has a two-stage data center power management architecture: the LTC7821 converts 48V to 12V, and then the LTM4700 µModule takes 12V down to sub-1V levels at 100A and more (the modules operate multiphase).

The first step is taking 48V down to 12V with the LTC7821, which has an input range of 40V to 60V. A small (0.75×0.73 in) 2µH inductor such as the SER2011-202ML may be used, since the switching frequency is up to 400 kHz and the inductor only sees half of VIN at the switching node.
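A rough ripple-current estimate shows why such a small inductor suffices. This sketch treats the LTC7821's output stage as a buck running from VIN/2, consistent with the "inductor only sees half of VIN" behavior noted above; the ~50% effective duty cycle for a 48V to 12V conversion is my assumption:

```python
# Rough inductor ripple estimate for the LTC7821 stage described above.
# Treats the output stage as a buck fed from VIN/2 (the switching node
# only sees half of VIN); the 50% effective duty cycle is an assumption.

V_IN = 48.0
V_SW = V_IN / 2          # effective input to the buck stage, 24 V
V_OUT = 12.0
L_H = 2e-6               # 2 µH inductor (e.g., the SER2011-202ML mentioned)
F_SW = 400e3             # 400 kHz switching frequency

duty = V_OUT / V_SW                            # ≈ 0.5
ripple = (V_SW - V_OUT) * duty / (L_H * F_SW)  # ≈ 7.5 A peak-to-peak
print(f"inductor ripple ≈ {ripple:.1f} A p-p on a 25 A output")
```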

Figure 17 The LTC7821 takes 48V down to 12V/25A while switching at 400 kHz. The board footprint is 1.45 in × 0.77 in, giving a power density of 640 W/in³.

One possible layout for the bus converter, using the top and bottom of the PC board, has a footprint of 2.7 cm² (Figure 18). The efficiency is close to 98% at around 15A.

Figure 18 LTC7821 layout for Figure 17

The next step is going from 12V down to the processor levels needed; the LTM4700 µModule is used for that. The converter module can output 0.9V while switching at frequencies between 325 kHz and 425 kHz. Multiple modules can be paralleled for higher currents, such as 8-phase operation with four LTM4700 modules at 400A, as in the quick check below.
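A quick check of that paralleling arithmetic, assuming ideal current sharing and that each LTM4700 contributes two 50A channels (one per phase):

```python
# Paralleling arithmetic for the 8-phase, four-module example above,
# assuming ideal current sharing and two 50 A channels per LTM4700.

I_TARGET_A = 400.0
MODULES = 4
PHASES_PER_MODULE = 2

phases = MODULES * PHASES_PER_MODULE   # 8 phases total
i_per_phase = I_TARGET_A / phases      # 50 A per phase
i_per_module = I_TARGET_A / MODULES    # 100 A per module
print(f"{phases} phases at {i_per_phase:.0f} A each, "
      f"{i_per_module:.0f} A per module")
```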

At a 12V input with a 1V output at 100A and 200 LFM airflow, the device can operate without heat sinking at about 73.4ºC. I²C, SMBus, or PMBus interfaces can be used.

Designers have a wide field of power management options from which to choose for this booming data server arena. It is the designer’s role to work with the customer to create an optimum solution for their particular needs. There will be compromises; however, our design expertise, coupled with the power IC companies’ diverse and creative solutions and their experts’ experience, will help render a successful final solution that fulfills the customer’s needs.

Steve Taranovich is a senior technical editor at EDN with 45 years of experience in the electronics industry.
