I was looking through my office window at the data closet and (due to angle, objects, and field of view) could only see one server's light cluster out of the six full racks. And I thought it would be nice to scale everything down to 2U. Then I day-dreamed about a future where a warehouse data center was reduced to a single hypercube sitting alone in the vast darkness.
They tend to fill the space. I mean, if you drive by a modern data center, so much grid electrical equipment is just right there. Now, if a hypothetical supermachine uses all that power, then sure, small data center. Unless they have a nuclear reactor, they should (fu felon musk) only rely on grid/solar/renewables.
I sometimes wonder how powerful a computer could be made if we kept the current transistor size but still built the machine to take up an entire room. At what point would the number of transistors and the size of the machine become more of a problem than a solution? 🤔
Isn't the main limiting factor signal integrity? Like, we could do a CPU the size of a room now, but it's pointless, as the stuff at one end wouldn't even be able to talk to the stuff in the middle because the signals just get fucked up on the way?
IIRC, light speed delay (or technically, electricity speed delay) is also a factor, but I can't remember how much of a factor.
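To put a rough number on it, here's a back-of-envelope sketch, assuming signals propagate at about 0.6c through copper traces and the machine runs a 5 GHz clock (both ballpark assumptions):

```python
# Back-of-envelope: signal propagation delay vs. clock period.
# Assumptions (ballpark, not measured): signals travel at ~0.6c in
# copper traces, and the machine runs a 5 GHz clock.

C = 3.0e8                # speed of light in vacuum, m/s
SIGNAL_SPEED = 0.6 * C   # rough propagation speed in copper, m/s
CLOCK_HZ = 5.0e9         # assumed clock frequency
CLOCK_PERIOD_S = 1.0 / CLOCK_HZ

for label, distance_m in [("across a ~3 cm chip", 0.03),
                          ("across a ~10 m room", 10.0)]:
    delay_s = distance_m / SIGNAL_SPEED
    cycles = delay_s / CLOCK_PERIOD_S
    print(f"{label}: {delay_s * 1e9:.2f} ns ≈ {cycles:.1f} clock cycles")

# Roughly ~0.8 cycles across a chip vs. ~280 across a room, so a
# room-sized "single CPU" would spend most of its time waiting on wires.
```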
They look silly now. Many data centers are not scaling up power per rack, so with GPUs there are often only two chassis per rack.
I had this problem with Equinix! They limited our company to something like 10 kVA per rack, and we were installing NVIDIA DGX servers. Depending on the model, we could fit only one or two lol.
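To put rough numbers on it (nameplate maximums from memory, so treat them as illustrative assumptions rather than exact specs):

```python
# Rough rack-power math with a ~10 kVA per-rack cap.
# Wattages are approximate nameplate maximums, used purely for
# illustration; actual draw depends on model and configuration.

RACK_BUDGET_KW = 10.0  # ~10 kVA cap, treating power factor as ~1

dgx_models_kw = {
    "DGX-1":    3.5,   # approx. max system power
    "DGX-2":   10.0,   # approx. max system power
    "DGX A100": 6.5,   # approx. max system power
}

for model, kw in dgx_models_kw.items():
    fits = int(RACK_BUDGET_KW // kw)
    print(f"{model}: ~{kw} kW each -> {fits} per rack under the cap")

# The rack could physically hold 8-10 such chassis, but the power cap
# strands most of that space: you end up with one or two boxes per rack.
```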
We have that problem ourselves. They didn't provision power or cooling for this kind of density, and how do you pipe multiple megawatts into a warehouse in the middle of nowhere?
Only if storage density outpaces storage demand. Eventually, physics will hit a limit.
Physics is already hitting limits. We're already seeing CPUs limited by things like atom size and the speed of light across the width of the chip. Those hard physics limitations are a large part of why quantum computing is being so heavily researched.
You think that if we can scale 6 racks down into one cube, someone wouldn't just buy 6 racks of cubes?
They’ll always hunger for more.
I think what will happen is that we'll just start seeing sub-U servers. First will be 0.5U servers, then 0.25U, and eventually 0.1U. By that point, you'll be racking racks of servers, with ten 0.1U servers slotted into a frame that you mount in an open 1U slot.
Silliness aside, we’re kind of already doing that in some uses, only vertically. Multiple GPUs mounted vertically in an xU harness.
The future is 12 years ago: HP Moonshot 1500
“The HP Moonshot 1500 System chassis is a proprietary 4.3U chassis that is pretty heavy: 180 lbs or 81.6 Kg. The chassis hosts 45 hot-pluggable Atom S1260 based server nodes”
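Funnily enough, the quoted specs already work out to roughly the 0.1U-per-node density mentioned above. Quick arithmetic (the 42U rack is just an assumed comparison point):

```python
# Node density of the Moonshot 1500 chassis, using only the figures
# quoted above: 45 server nodes in a 4.3U chassis.

CHASSIS_U = 4.3
NODES_PER_CHASSIS = 45
RACK_U = 42  # assumed standard full-height rack, for comparison

u_per_node = CHASSIS_U / NODES_PER_CHASSIS
chassis_per_rack = int(RACK_U / CHASSIS_U)
nodes_per_rack = chassis_per_rack * NODES_PER_CHASSIS

print(f"~{u_per_node:.2f}U per node")                                    # ~0.10U
print(f"{chassis_per_rack} chassis per rack -> {nodes_per_rack} nodes")  # 9 -> 405
```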
That did not catch on. I had access to one, and the use case and deployment docs were foggy at best.
It made some sense for job separation before virtualization.
Then Docker/k8s came along and nuked everything from orbit.
VMs were a thing in 2013.
Interestingly, Docker was released in March 2013. So it might have prevented a better company from trying the same thing.
Yes, but they weren't as fast, VT-x and the like were still fairly new, and the VM stacks were kind of shit.
Yeah, Docker is a shame. I wrote a thin stack on LXC, but BSD jails are much nicer, if only they improved their deployment system.
Agreed.
Highlighting how often software usability reduces adoption of good ideas.
The other use case was for hosting companies. They could sell “5 servers” to one customer and “10 servers” to another and have full CPU/memory isolation. I think that use case still exists and we see it used all over the place in public cloud hyperscalers.
Meltdown and Spectre vulnerabilities are a good argument for discrete servers like this. We’ll see if a new generation of CPUs will make this more worth it.
128-192 cores on a single EPYC makes almost nothing else worth it; the scaling is incredible.
Also, I happen to know they're working on even more hardware isolation mechanisms, similar to SR-IOV but more strictly enforced.
“128-192 cores on a single EPYC makes almost nothing else worth it; the scaling is incredible.”
Sure, which is why we haven't seen huge adoption. However, in some cases it isn't so much an issue of total compute power; it's autonomy. If there's a rogue process running on one of those 192 cores and it can end up accessing the memory in your space, it's a problem. There are some regulatory rules I've run into that actually forbid company processes on shared CPU infrastructure.
You’ve reinvented blade servers