Exa-scale storage now available as a service: Check out DESY’s dCache InfiniteSpace
dCache InfiniteSpace, the 32nd Helmholtz Cloud service, rounds off the service portfolio: large-scale storage is now available as a service via the Helmholtz ID to all user groups of the Helmholtz Cloud.
Mass Storage from DESY for Helmholtz
dCache is an open-source project that develops a system for storing and retrieving large amounts of data with worldwide access. It was built and is being further developed by Deutsches Elektronen-Synchrotron (DESY), the Fermi National Accelerator Laboratory (FNAL) and the Nordic e-Infrastructure Collaboration (NeIC).
This made the system a perfect candidate for DESY to provide mass storage Helmholtz-wide via the Helmholtz Cloud. In fact, it was one of the first services connected to the Helmholtz AAI, as a demonstrator in early 2020. It has now become a regular service, provided via the Helmholtz Cloud Portal under the name “dCache InfiniteSpace”.
The service allows Helmholtz-based user groups to store, process and publish research data with practically no storage limits. It combines well with the data processing pipelines (service orchestration) that the HIFIS team is currently setting up in collaboration with research groups. In addition, the dCache software features advanced authorization delegation and provides a user-transparent workflow for buffering and pre-fetching data transfers, which minimizes performance losses due to latency. It also supports integration with large data transfer services such as CERN’s File Transfer Service.
Early Adopters become regular Users
A few Helmholtz user groups were already using this service during its prototype phase. For example, Helmholtz AI’s SeisBench project uses it as Mass Storage for Machine Learning in Seismology.
Our partners from Helmholtz Imaging are using the storage as well. We also showcased our first implementation of a service pipeline for scientific data processing, which will be further extended in collaboration with Helmholtz Imaging Modalities. In this pipeline, multiple services, including the large-scale storage, are chained in a modular fashion to process data in an automatable and reproducible way.
In addition, the storage service is an important element of large-scale data transfer, for example via the HIFIS transfer service.
Get in contact
For dCache InfiniteSpace or anything else HIFIS-related: HIFIS Support
2023-08-14: The service can now be accessed via the Helmholtz Cloud Portal; links in this blog post have been adjusted accordingly.