Leil wants to revive MAID as a green substitute for tape storage


Leil means “steam” in Estonian – although it’s a bit more involved than that. It’s the steam that rises when water is put on sauna coals, and has implications of rebirth and new life.

Estonian disk storage firm Leil promises to “give storage its second life” and to “make storage pure and simple”.

It also hopes to contribute to green storage with energy-saving features: switching off drives that are not in use, including the parity drives in its erasure coding data protection scheme, and concentrating the most-accessed data onto a subset of drives so the rest can stay idle.
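Leil’s own code is not shown here, but a minimal sketch, assuming a pool of data and parity drives and a simple read log, illustrates the general shape of such a policy: keep the most-read data drives spinning and flag everything else, parity drives included, for spin-down. The drive names and the two-drive “active” set are purely illustrative.

```python
# Illustrative sketch only -- not Leil's implementation. It shows the general
# MAID idea: keep frequently read data on a few "active" drives and mark the
# rest, including erasure-coding parity drives, as candidates for spin-down.
from collections import Counter

def plan_spin_down(drives, parity_drives, read_log, active_count=2):
    """Return the set of drives that can be spun down.

    drives        -- all drive IDs in the pool
    parity_drives -- drives holding only parity (needed for writes/rebuilds)
    read_log      -- list of drive IDs, one entry per recent read
    active_count  -- how many data drives to keep spinning for hot data
    """
    reads = Counter(read_log)
    data_drives = [d for d in drives if d not in parity_drives]
    # Concentrate popular data: keep the most-read data drives spinning.
    hot = {d for d, _ in reads.most_common(active_count) if d in data_drives}
    # Everything else, parity drives included, can idle until needed.
    return set(drives) - hot

if __name__ == "__main__":
    drives = ["d1", "d2", "d3", "d4", "p1", "p2"]
    log = ["d1"] * 50 + ["d2"] * 30 + ["d3"] * 2
    print(plan_spin_down(drives, {"p1", "p2"}, log))
    # -> {'d3', 'd4', 'p1', 'p2'} (set order may vary)
```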

“We are the bread of data storage, not the butter, not the caviar,” said Piotr Modrzyk, principal architect of Leil at a recent IT Press Tour event in Rome.

Core to its offer is the ability to provide spinning disk drives that can be spun down when not in use – a rebirth of the massive array of idle disks (MAID) concept of a decade or two ago – targeted at “nearline, big and long-term” workloads, he said.

Leil offers HM-SMR (host-managed shingled magnetic recording) HDDs that can be spun down when not in use. Spin-down uses pin 3 of the SATA power connector – the power-disable feature – and is otherwise only available in systems used by the hyperscaler cloud providers that build their own infrastructure.

Leil said it can offer this functionality via SaunaFS, its file system software that provides file-access NAS storage. “You can’t build your own systems to use HM-SMR drives by yourself, and if you did, they wouldn’t work,” said Aleksandr Ragel, co-founder of Leil.
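SaunaFS’s internals are not shown here, but for orientation, this is how a stock Linux host – with a recent util-linux and hdparm installed – can identify host-managed SMR drives and ask one to spin down. Note that this uses the ordinary ATA standby command rather than the pin-3 power-disable line mentioned above, and the device names are simply whatever lsblk reports.

```python
# Generic Linux example, not SaunaFS internals: list drives whose zoned model
# is "host-managed" (HM-SMR) and ask an idle one to spin down via ATA standby.
# Requires util-linux (lsblk) and hdparm; run as root on a test system.
import json
import subprocess

def host_managed_disks():
    out = subprocess.run(
        ["lsblk", "--json", "-d", "-o", "NAME,ZONED"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [
        "/dev/" + dev["name"]
        for dev in json.loads(out)["blockdevices"]
        if dev.get("zoned") == "host-managed"
    ]

def spin_down(device):
    # ATA "standby immediate": the drive parks heads and stops spinning until
    # the next I/O wakes it. This is not the pin-3 power-disable mechanism.
    subprocess.run(["hdparm", "-y", device], check=True)

if __name__ == "__main__":
    for disk in host_managed_disks():
        print("host-managed SMR drive:", disk)
        # spin_down(disk)  # uncomment with care on a test system
```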

Backup and archive use cases

The company intends to target tape storage and other long-term and nearline media for backup and archive use cases.

Leil seeks to overcome tape’s limitations: not just its slow read times, but also the limit of around 300 reads over a cartridge’s lifetime.

It also targets green storage, claiming the ability to spin down drives cuts energy use by 18%. Later in 2024, it plans to introduce a “write group” feature it says will cut energy costs by 43%, and in 2025, “popular data concentration” functionality it says will cut power costs by 50%. “No other HM-SMR drive products work on-premise,” said Ragel.
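Those percentages are Leil’s claims and its methodology is not stated. As a back-of-envelope illustration only, with assumed (not measured or vendor-supplied) wattages for a spinning versus a standby drive, figures of that order can arise from plausible idle fractions:

```python
# Back-of-envelope only, with assumed figures: a high-capacity 3.5in HDD is
# taken to draw ~9 W spinning idle and ~1 W in standby. Actual savings depend
# entirely on how many drives hold cold data and how long they stay spun down.
SPINNING_W = 9.0   # assumption, not a measured or vendor figure
STANDBY_W = 1.0    # assumption

def saving_percent(idle_fraction, standby_share_of_time):
    """Fleet-level power saving vs. keeping every drive spinning.

    idle_fraction         -- share of drives that hold cold data
    standby_share_of_time -- share of time those drives stay spun down
    """
    baseline = SPINNING_W
    idle_drive = (standby_share_of_time * STANDBY_W
                  + (1 - standby_share_of_time) * SPINNING_W)
    fleet = idle_fraction * idle_drive + (1 - idle_fraction) * SPINNING_W
    return 100 * (1 - fleet / baseline)

# e.g. 40% of drives spun down half the time -> roughly 18% saving
print(round(saving_percent(0.40, 0.50)))   # 18
# e.g. 60% of drives spun down 80% of the time -> roughly 43% saving
print(round(saving_percent(0.60, 0.80)))   # 43
```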

Use cases targeted are those where organisations need long-term, high-capacity storage with relatively long access times. In particular, Ragel highlighted hospitals that must keep imagery for medical and compliance reasons, and that may wish to access it, including for artificial intelligence processing, but for whom re-equipping with flash storage would be very costly.

Current HM-SMR HDD capacities stand at 28TB, with 48TB likely in 12 to 18 months.

Leil said it has practically no competition in the market. Ceph did support HM-SMR drives, but dropped this in January 2024, with Modrzyk speculating that this is a result of Red Hat parent IBM’s interest in tape storage. He said there is also HM-SMR compatibility in the open source Btrfs file system, but that this doesn’t scale beyond a single node.

Leil said its expertise – developed over six years – insulates it from competition. The technology of working with HM-SMR drives, said Modrzyk, “is so complex only hyperscalers with hundreds or thousands of engineers can use it”.

SMR technology overlaps recording tracks on drive platters, like roof shingles, to increase capacity over conventional hard drives.
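In the host-managed variant, the drive exposes its shingled regions as zones that the host, not the drive firmware, must write sequentially. On Linux, the standard blkzone tool from util-linux can list those zones, as in this generic example (the device path is an assumption, not a reference to any Leil system).

```python
# Generic illustration, not SaunaFS code: a host-managed SMR drive is divided
# into zones that must be written sequentially, and util-linux's blkzone can
# list them. The device path is an example; run as root on a test machine.
import subprocess

def print_zone_report(device="/dev/sdb"):
    # Each output line describes one zone: start sector, length, write pointer,
    # zone type and condition. Sequential-write-required zones may only be
    # written at their current write pointer.
    report = subprocess.run(
        ["blkzone", "report", device],
        capture_output=True, text=True, check=True,
    ).stdout
    # Print just the first few zones to keep the output readable.
    print("\n".join(report.splitlines()[:5]))

if __name__ == "__main__":
    print_zone_report()
```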

A number of suppliers attempted to popularise MAID technology at the end of the 2000s.

MAID arrays provided a disk-based backup target with lots of drives that could be spun down when not in use and so were suited to infrequently-used data.

The key benefit was access times quicker than tape, while avoiding much of the cost of powering and cooling large numbers of hard drives. Copan Systems was a pioneer of this approach, but by the turn of the decade it had been acquired by SGI and MAID faded away.

Since then, Microsoft has also dabbled with MAID-like object storage arrays, built in-house for its Azure datacentres, under the project name Pelican.

And what about failure rates on HDDs that sit idle and are restarted? Modrzyk said that accounts for “less than 0.1% of failures”, according to “engineers”.


