One of the things that I am diving much deeper (pun intended) into lately is deep learning. And with deep learning, one of the things that can help you the most when it comes to having the right hardware locally is a beefy Graphics Processing Unit (GPU). As it turns out, offloading tasks that involve matrix multiplication to the GPU yields massive performance benefits compared to doing the same thing on the Central Processing Unit (CPU), especially because those tasks are highly parallelizable. We’re talking tens of times faster on the GPU. Couple that with very large data sets that need to be manipulated, and you’ve got yourself a problem that the CPU (designed for minimizing latency) is not really the best at solving, but that the GPU (designed for higher throughput) shines at.
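To get a rough feel for that gap on your own machine, here is a minimal sketch that times the same matrix multiplication on the CPU and, if one is present, on the first CUDA GPU. The helper name `time_matmul`, the matrix size, and the repeat count are my own illustrative choices, and the numbers you get will vary a lot with hardware, so treat it as a quick sanity check rather than a proper benchmark.

```python
import time

import torch


def time_matmul(device: str, n: int = 2048, repeats: int = 5) -> float:
    """Return the average seconds per n x n matrix multiplication on `device`."""
    dev = torch.device(device)
    a = torch.rand(n, n, device=dev)
    b = torch.rand(n, n, device=dev)

    # Warm-up run so one-time setup costs (CUDA context, kernel loading) are not timed.
    torch.matmul(a, b)
    if dev.type == "cuda":
        torch.cuda.synchronize(dev)  # GPU kernels run asynchronously; wait for them.

    start = time.perf_counter()
    for _ in range(repeats):
        torch.matmul(a, b)
    if dev.type == "cuda":
        torch.cuda.synchronize(dev)
    return (time.perf_counter() - start) / repeats


cpu_seconds = time_matmul("cpu")
print(f"CPU: {cpu_seconds * 1000:.1f} ms per multiplication")

if torch.cuda.is_available():
    gpu_seconds = time_matmul("cuda:0")
    print(f"GPU: {gpu_seconds * 1000:.1f} ms per multiplication")
    print(f"Speedup: {cpu_seconds / gpu_seconds:.1f}x")
else:
    print("No CUDA device found, so only the CPU was timed.")
```

The explicit `torch.cuda.synchronize` calls matter: without them the timer would mostly measure how fast work is queued, not how fast the GPU finishes it.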
Back in late 2020 I assembled my personal desktop PC that did its job pretty well, and even had a nifty RTX 2080 in it that is an absolutely stellar video card. However, when it came to a range of deep learning tasks, that 8GB of VRAM really ate into the potential performance gains that I wanted. So, my next step was figuring out how to build a different rig with a bit more GPU firepower than what I had available.
Given the GPU shortage, I was not very hopeful that I could find the right cards quickly, so I was aware that it might take me some time to build a nice, dual-card box. That’s right, I wanted to go outside my comfort zone this time and build something that leveraged two GPUs at once, because why not. I’ve briefly dabbled with cross-GPU task parallelization, and this seemed like the right opportunity to build on that interest.
For the task, and given that I was building with at least a bit of a future-looking aspect to my choices, I thought that I would go with two NVIDIA RTX 3090 cards. Because they are a bit on the pricier side, finding them was also easier compared to other cards, such as the RTX 3080 Ti or the 3070, although at least in Canada the stock seems to be recovering fairly fast. On multiple occasions in the past couple of months I’d walk into a store and see shelves stacked with RTX cards. Hopefully this trend continues, and if you join any of the local Discord groups you’ll have zero issues finding the right GPU relatively quickly. Do not buy from scalpers at inflated prices - the cards are coming back in stock, and while we might be some time away from the manufacturer’s suggested retail price (MSRP) being a reality, there is really no reason to pay a 100%+ premium right now.
With a bit of digging in the past couple of months, I ended up acquiring two RTX 3090 cards:
With the cards, I also knew that I needed to do some reshuffling in the PC components that I am using. NVIDIA lovingly refers to the 3090 as a BFGPU (as in - Big “Ferocious” GPU, for all you Doom fans out there), and the EVGA cards just so happen to be BFGPU-ish big - they take up a massive 3 PCI-E slots, so you better have some space on your motherboard.
To the surprise of no one, it seems that this generation of motherboards (X570 for AMD is what I am using) is woefully unprepared for a dual GPU setup of this kind. That is - the PCI-E slot layout on the board is too cramped, so if you have two 3-slot cards, they will literally be touching each other, with no gap for air. You should never do this unless you want to cook breakfast on your GPU backplate. Short of having a blower card, that’s not a realistic option. Most motherboards have a 3-slot gap, which means that there is not nearly enough space, and only a few support a 4-slot gap affording the right air circulation space:
- MSI MEG X570 Godlike, which is not available anywhere.
- EVGA X570 Dark, which is a board designed for overclockers and only has two RAM slots.
- EVGA X570 FTW, which is way overpriced for the current generation (X570 is on the way out).
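The spacing problem is easy to put into numbers. Below is a small sketch, assuming the standard 20.32 mm (0.8 inch) pitch between adjacent expansion slots and a card that occupies exactly three slots. The function name `air_gap_mm` is made up for illustration, and real coolers are often slightly thinner or thicker than their nominal slot count.

```python
SLOT_PITCH_MM = 20.32  # Standard spacing between adjacent expansion slots (0.8 inch).


def air_gap_mm(slot_spacing: int, card_thickness_slots: float) -> float:
    """Clearance left below the top card when the two GPU slots are
    `slot_spacing` slot positions apart."""
    return (slot_spacing - card_thickness_slots) * SLOT_PITCH_MM


for spacing in (3, 4):
    gap = air_gap_mm(spacing, card_thickness_slots=3)
    print(f"{spacing}-slot spacing with 3-slot cards: {gap:.2f} mm of clearance")
```

With 3-slot spacing the clearance works out to zero, which is the "touching each other" case. A 4-slot layout leaves roughly one slot width of air for the top card's fans.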
OK, so - clearly not a lot of options, and I didn’t want to go out of my way to find the right motherboard for this setup since that’s going to be a challenge. What’s the alternative? Well, I could mount one of the cards vertically, and the other one in the PCI-E slot on the board. That is - the cards won’t really be near each other, giving plenty of breathing room and minimal overheating risk. However, that introduced another set of challenges:
- You need to have a case that supports vertical cards while giving the option to still keep one card in horizontal position on the board. Some cases have a vertical card mounting bracket mod available, but that puts you into an “either/or” position - you either mount your card vertically or horizontally. What about the second one?
- If you mount the card vertically, you need a PCI-E riser cable which is constrained by length, and most of them are PCI-E 3.0 versions. If you have a PCI-E 4.0 slot that you want to use (and you do, if you care about squeezing the most out of the performance available), your options dwindle.
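To see why the riser generation matters, here is a back-of-the-envelope calculation of theoretical PCI-E link bandwidth. It uses the published per-lane rates (8 GT/s for 3.0, 16 GT/s for 4.0) and the 128b/130b encoding both generations share. The function name is my own, and real-world throughput will be lower because of protocol overhead.

```python
# Per-lane transfer rates in GT/s for PCI-E 3.0 and 4.0.
TRANSFER_RATE_GT_S = {"3.0": 8.0, "4.0": 16.0}
# Both generations use 128b/130b encoding: 128 payload bits per 130 bits sent.
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(generation: str, lanes: int = 16) -> float:
    """Theoretical one-direction bandwidth in GB/s, ignoring protocol overhead."""
    gigabits_per_lane = TRANSFER_RATE_GT_S[generation] * ENCODING_EFFICIENCY
    return gigabits_per_lane / 8 * lanes


for generation in ("3.0", "4.0"):
    for lanes in (8, 16):
        bandwidth = pcie_bandwidth_gb_s(generation, lanes)
        print(f"PCI-E {generation} x{lanes}: {bandwidth:.2f} GB/s")
```

The x8 rows are there because, depending on the board, the second GPU slot may run with fewer lanes than the first. In that case a 4.0 riser at x8 lands at about the same bandwidth as a 3.0 link at x16.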
Given this conundrum, I’ve spent some time researching, and landed on two things that will help:
- Phanteks Enthoo 719. This is a fantastic case that is built with two systems in mind. That is - its intended purpose is to serve folks who want to have an ATX and ITX setup in one case. My ITX setup lives separately, so all I needed was just a place to put the second card. The bottom part of the case fits perfectly for this task as it has a built-in vertical mount (but no PCI-E riser).
- LINKUP Extreme 4+ PCI-E 4.0 Riser (45cm). This is a PCI-E riser cable that supports PCI-E 4.0 (a rarity) and has a good enough length to provide plenty of flex for the card positioning. Be careful if you are using riser cables because most of them are too short for the Phanteks case, and you also want to have a 90 degree mount that is going to be holding the card.
With all said and done, I’ve mounted the riser in the bottom section of the case, connected it to the bottom PCI-E slot on my motherboard, and mounted one card vertically and another one horizontally.
I was a bit nervous turning this BFC (Big “Ferocious” Computer) on, given the slight potential of me not really connecting something the right way, but it all lit up like a Christmas tree when I pressed the power button:
And before you call it out - yes, I need to work on my cable management inside the case, but for now I just wanted to see it run.
Zero configuration in the BIOS, zero configuration on the OS side, and the EVGA Precision X1 software instantly identified the cards as separate entities.
Running a snippet of PyTorch code, I could also see that I now had two GPUs to use for my explorations:
    import torch

    print(torch.cuda.is_available())
    print(torch.cuda.current_device())
    print(torch.cuda.device(1))
    print(torch.cuda.device_count())
    print(torch.cuda.get_device_name(0))
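If you want a slightly more detailed check, the sketch below lists every CUDA device PyTorch can see, along with its VRAM, and then places a small tensor on each one. The helper name `describe_gpus` is mine, and on a machine without CUDA it simply prints nothing.

```python
import torch


def describe_gpus() -> list[str]:
    """Return one summary line per CUDA device visible to PyTorch."""
    lines = []
    for index in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(index)
        vram_gib = props.total_memory / 1024**3
        lines.append(f"cuda:{index}: {props.name}, {vram_gib:.1f} GiB VRAM")
    return lines


for line in describe_gpus():
    print(line)

# Place a tensor on each card to confirm both accept work independently.
for index in range(torch.cuda.device_count()):
    x = torch.ones(1024, 1024, device=f"cuda:{index}")
    print(f"cuda:{index} holds a tensor whose sum is {x.sum().item():.0f}")
```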
There is one caveat here that I want to call out for folks who care about an often neglected aspect of some dual GPU setups - NVLink. The RTX 3090 GPUs are the only ones in the 30 series that support this functionality, and it’s something to consider if you are constrained by the PCI-E bandwidth. In certain circumstances, NVLink lets you pool VRAM (not something supported by most games and apps) and enables cross-GPU communication through its own bridge. It’s a little gadget like this:
If you do intend to get the performance benefits of NVLink with the 3090 cards, be very aware of the bridge gap requirements and your motherboard layout. There are no flexible bridges for this technology. For example, for EVGA cards the 4-slot gap is standard, so you need a motherboard like the ones I listed above. That also means that you need a case that can fit a thick card in the bottom PCI-E slot on your motherboard - often you’ll have the Power Supply Unit (PSU) in the way. Also, NVLink bridges are vendor-specific, so one you’d get for an EVGA card might not work for a Gigabyte card because of NVLink slot positioning.
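Whether or not you have a bridge installed, you can ask PyTorch if the GPUs are able to access each other's memory directly. This is a minimal sketch, with `peer_access_matrix` being a name I made up. Note that peer access can also be reported over plain PCI-E, so a `True` here does not by itself prove that NVLink is active. On systems with the NVIDIA driver tools installed, `nvidia-smi nvlink --status` is the more direct check.

```python
import torch


def peer_access_matrix() -> dict[tuple[int, int], bool]:
    """Map each ordered GPU pair (src, dst) to whether src can directly
    access memory on dst."""
    count = torch.cuda.device_count()
    return {
        (src, dst): torch.cuda.can_device_access_peer(src, dst)
        for src in range(count)
        for dst in range(count)
        if src != dst
    }


matrix = peer_access_matrix()
if not matrix:
    print("Fewer than two CUDA devices visible, nothing to compare.")
for (src, dst), ok in matrix.items():
    print(f"cuda:{src} -> cuda:{dst}: peer access {'available' if ok else 'unavailable'}")
```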
With all this, I am now a happy camper, pushing the limits of my own deep learning work.