Storage

Let's talk about storage.

Let's talk about how there's been no meaningful improvement to the day-to-day act of using storage for many years now. Two decades ago, Sun introduced ZFS, which brought pools, stronger integrity checking and more fluid allocation. A whole lot of noise and an Oracle-fork detour later, it basically just works. You can forget about disks, by which I mean that you're a fool if you forget about the hardware, but it works so well that forgetting is possible, and extending a pool is a handful of invocations.
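To illustrate just how few invocations: a sketch of growing a pool, assuming a pool named `tank` and two fresh disks (all names hypothetical, and the details will vary with your layout):

```shell
# Check the current state of the pool (pool name 'tank' is an assumption).
zpool status tank

# Grow the pool by adding a new mirrored vdev built from two fresh disks.
zpool add tank mirror /dev/sdc /dev/sdd

# The extra capacity is available immediately: no partitioning,
# no filesystem resize, no remount.
zpool list tank
```

That's the whole ceremony, which is roughly the bar the rest of this post wishes networked storage would clear.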

Where are these improvements when it comes to networked storage? Why are we still thinking about individual shares? Or rather, why are we stuck modeling everything as device-based shares, as if that were the summit of the domain and the solution to all our problems?

For years, we have had cloud services where you point to a folder and everything you put into it syncs to a data center. For years, we have had Network Attached Storage devices with built-in RAID and tiering, where some stuff goes on fast SSDs and some on slower mechanical hard drives.

Well, okay. The pieces are in place. Where is the next step: where is the device that you turn on, stuff with storage capacity and connect to the network, and then create a virtual partition, or pool, on your computer whose secondary tier is that network device? If I have a 1 TB drive locally and 20 TB on the network device, and there's 500 GB I refer to sometimes but not always, why do I have to transfer it to the network device manually to free up local space?

(People get upset, rightly, about keep-it-on-the-cloud policies when they default to on and masquerade as regular folders, leading to data loss. That doesn't mean the concept is worthless, especially if the endpoint is local much of the time, and only proxied through the internet when necessary.)

Every time I've looked into anything approaching this, I get stuck in the third half of the 1970's, which as far as I know extends into infinity. It's block-mounted devices, it's shares, it's iSCSI LUNs, it's LVM and software RAID and union filesystems: brilliant technology no doubt, but all moored to the expectation of hardware wizardry. It is up to you to set all these things up, and to maintain a psychic link to the documented shortcomings and incompatibilities as they are discovered in real-life use (or, if they already have been, to inhale all these nuggets and make them second nature by the time you need to evict or extend a disk in five months).
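To make the contrast concrete, here is roughly what the traditional route looks like, sketched with Linux's open-iscsi and LVM tools. The target address, volume group and device names are hypothetical, and every step is yours to get right, in order, and to remember:

```shell
# Discover and log in to an iSCSI target (address is hypothetical).
iscsiadm -m discovery -t sendtargets -p 192.168.1.50
iscsiadm -m node --login

# The LUN shows up as a plain block device; fold it into LVM by hand.
pvcreate /dev/sdb
vgextend myvg /dev/sdb

# Grow the logical volume, and then, separately, the filesystem on it.
lvextend -L +500G /dev/myvg/data
resize2fs /dev/myvg/data
```

None of this is tiering; it's plumbing, and the system has no opinion about which data should live where.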

This overlaps with another frustration. As far as I can tell, VAST Data and Pure Storage have both constructed appliances that they fill with tons of QLC NAND flash, usually considered the crappy kind, and that, simply by spreading out the load much better, are as responsive as many good individual NVMe SSDs (although probably not with as low latency). But this is locked up inside appliances for large companies.

This sounds like the right technology for the kind of network device I'm imagining. But it also sounds like a road towards something I've been expecting for decades now: a form of SSD that sits between "the cheapest crap you can put together" and "a 4 TB drive that has stayed at the same curious price of 'too much' for many years, with only the I/O speed multiplying". (Let's leave current storage prices out of this for the moment; my assumption is that they will return to reality at some point within the next few years.)

Where's the 12 TB drive built on two-or-three-generations-old technology that costs the same as 4 TB of the current generation? You know, in case your needs are sated by a mere 1-2 GB/s of transfer speed. My guess is that capacity is constrained by the M.2 (and U.2) form factors, since higher-capacity SATA SSDs have failed to materialize, but it wouldn't have to be if you put the flash on a PCIe card, and the result wouldn't even have to be that cheap to still beat stacking your way to 12 TB with individual M.2 drives.

The caveat to all of this is that it would be new, it would be untested, and no one would know what to make of it. And that beneath the mere bullet point of making storage easy, without even inventing any new form factors or technologies, lies a graveyard of sunken ventures, firmware bugs lodged in their hulls like harpoons.

The unfortunate truth is also that most actors with the resources to do this are busy either upselling you those horrendously expensive SSDs, or upselling you on the idea that what you really ought to do is stick all your data in us-east-1 instead, what are you, some kind of fossil? But I hold out hope that within the carapace of these corporations, and in the storage community at large, there are people who just think all this is a rather interesting idea, all things considered. Who see the potential good and the end goal, and maybe even an easier time sorting out their own storage. Or who just want to answer fewer support questions.
