Hammerspace leverages smart metadata handling for AI/ML workloads


Software-defined storage maker Hammerspace claims its Hyperscale NAS functionality offers a global file system built for artificial intelligence/machine learning (AI/ML) workloads and for the high demands of GPU-driven processing. It promises performance usually only available from dedicated high performance computing (HPC) storage products, but for data resident in any on-site or cloud location.

That’s a win for customers that hold data of many different types, potentially spread across multiple datacentres or clouds, which may later form training datasets for AI/ML and analytics workloads.

Hammerspace essentially allows customers to view, access and manage data wherever it is held and whatever storage it is held on.

A core element of Hammerspace’s technology stack is that it separates metadata from file data at an earlier stage than competing products. In other words, the Linux kernel client that Hammerspace is built on splits out the metadata at the client, before data is written to storage.

This lightens the load on storage, and it also means metadata is kept out of the data path when data is transmitted for processing, such as in AI/ML workloads and GPU farms. That capability forms the core of Hammerspace’s Hyperscale NAS offering.
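To illustrate the general pattern described above, the following is a minimal conceptual sketch, in Python, of a pNFS-style split between a metadata service and data nodes: the client asks a lightweight metadata service where a file lives, then moves bulk data directly to and from the storage nodes. This is an assumption-laden illustration of the architectural idea, not Hammerspace code, and every class and name in it is hypothetical.

```python
# Conceptual sketch only (hypothetical names): shows metadata kept on a
# separate control path while bulk data flows directly between client
# and storage nodes.

class MetadataService:
    """Answers 'where does this file live?' without touching file contents."""
    def __init__(self):
        # Map of file path -> list of storage nodes holding the data.
        self.layouts = {"/datasets/train/part-0001": ["node-a", "node-b"]}

    def get_layout(self, path):
        # Only small metadata records travel on this path.
        return self.layouts[path]


class StorageNode:
    """Holds file data and serves reads directly to clients."""
    def __init__(self, name, blocks):
        self.name = name
        self.blocks = blocks

    def read(self, path):
        return self.blocks.get(path, b"")


def client_read(path, mds, nodes):
    # 1. Ask the metadata service for the layout (control path).
    layout = mds.get_layout(path)
    # 2. Read data directly from the storage nodes (data path),
    #    so bulk I/O never funnels through the metadata service.
    return b"".join(nodes[n].read(path) for n in layout)


if __name__ == "__main__":
    mds = MetadataService()
    nodes = {
        "node-a": StorageNode("node-a", {"/datasets/train/part-0001": b"chunk1 "}),
        "node-b": StorageNode("node-b", {"/datasets/train/part-0001": b"chunk2"}),
    }
    print(client_read("/datasets/train/part-0001", mds, nodes))
```

The design point the sketch is meant to convey is simply that the metadata lookup is cheap and separate, so adding more data nodes scales throughput without the metadata service becoming a bottleneck in the I/O path.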

Hammerspace currently handles file data only and competes with scale-out NAS makers such as NetApp, Isilon and Qumulo. With Hyperscale NAS, it aims to provide storage targeted at HPC and AI/ML workloads.

“Historically, HPC file system products, like DDN, have been a difficult sell into enterprises that already run file system products,” said Molly Presley, senior vice-president of marketing at Hammerspace. “We’ve been able to take metadata out of the data path and create a parallel file system that removes this overhead. Traditional file system products don’t do that, and it’s not good in an HPC environment.

“In the AI world, most organisations don’t know which models they will want to use. So, what we offer gives flexibility, with data that resides in datacentres or in the cloud or is unstructured and that they then decide they want to access for AI training.”

Presley cited one customer, the Los Alamos National Laboratory in the US, which she said runs several different file systems and had struggled to distribute data to collaborators.

Another customer cited by Presley runs Isilon scale-out NAS and had hit bottlenecks in how many GPUs it could feed: its 32-node NAS system had only been able to supply a 300-node render processing farm. Using Hyperscale NAS, which takes metadata out of the I/O path, the customer was able to double the number of rendering nodes to 600.

Hammerspace is among a group of products that aim to provide global file access and collaboration with access to the latest version of files from any location. Competitors include Ctera, Nasuni, Panzura and Peer Software.

Hyperscale NAS is available now to all Hammerspace customers at no additional cost. Hammerspace licensing is based on the total amount of data under management.

In May last year, Hammerspace acquired Rozo Systems for its RozoFS, and in particular its advanced erasure coding capabilities that allow for sharding of files across multiple locations.


