Self-Hosted File Storage: Understanding Object Storage and Its Alternatives

What is object storage? Why has S3 become a standard? Overview of self-hosted solutions with a focus on Garage, a European alternative.

Introduction

Every business needs to store files: documents, images, backups, data exports, application logs... The question isn't "if" but "where" and "how".

For a long time, the answer was simple: a file server in a closet, or a NAS in the server room. Then the cloud arrived, and with it a new way of storing data: object storage.

This article explains what object storage is, why the S3 protocol has become the standard, and what solutions exist for self-hosting, with a focus on Garage, a European alternative that deserves your attention.

File Storage: What Are We Talking About?

Traditional Storage

Historically, files have been stored in hierarchical file systems: folders, subfolders, files. This is what you see on your computer, on a Windows file server, or on a NAS.

This model works well for office use. But it shows its limits when volumes explode: millions of files, petabytes of data, simultaneous access from dozens of applications.

Object Storage: A Different Approach

Object storage abandons the folder hierarchy. Each file (called an "object") is stored with a unique identifier and metadata. No complex path, just a key to retrieve the object.
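To make the model concrete, here is a minimal in-memory sketch, not a real storage engine. The class and method names (ObjectStore, put, get, delete, list_keys) and the sample keys are hypothetical, chosen only for illustration. It shows that an object is just bytes plus metadata behind a key, and that a "folder" is nothing more than a shared key prefix.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: raw bytes plus free-form metadata."""
    data: bytes
    metadata: dict[str, str] = field(default_factory=dict)


class ObjectStore:
    """Toy in-memory store with a flat keyspace (no real directories)."""

    def __init__(self) -> None:
        self._objects: dict[str, StoredObject] = {}

    def put(self, key: str, data: bytes, **metadata: str) -> None:
        # Writing to an existing key replaces the whole object.
        self._objects[key] = StoredObject(data, dict(metadata))

    def get(self, key: str) -> StoredObject:
        return self._objects[key]

    def delete(self, key: str) -> None:
        self._objects.pop(key, None)

    def list_keys(self, prefix: str = "") -> list[str]:
        # "Folders" are only a naming convention: listing filters keys by prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


store = ObjectStore()
store.put("invoices/2024/0001.pdf", b"%PDF-1.7", content_type="application/pdf")
store.put("invoices/2024/0002.pdf", b"%PDF-1.7", content_type="application/pdf")
store.put("logo.png", b"\x89PNG", content_type="image/png")

print(store.list_keys("invoices/"))
# ['invoices/2024/0001.pdf', 'invoices/2024/0002.pdf']
```

The slashes in the keys have no special meaning to the store. A real system spreads this dictionary across disks and servers, but the interface stays this small.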

The advantages:

  • Scalability: the flat key space spreads across many servers, so you can grow to billions of objects without redesigning the system
  • Simplicity: a simple API (create, read, delete) that integrates easily
  • Resilience: data is automatically replicated across multiple disks or servers
  • Cost: optimized for massive storage of "cold" data
  • Versatility: stores any type of data, structured or not (images, videos, JSON files, CSV, logs, ML models...)

This is the model used by all major cloud services to store photos, videos, backups, and analytical data.

And volumes keep growing. IoT, application logs, media, analytical data: companies generate ever more data that they need to store and leverage. Object storage is designed to absorb this growth without changing architecture.

It has also become the foundational building block of modern Data Lake architectures. Object storage serves as the "Bronze" layer (or raw layer) where raw data lands before transformation. It's the entry point of the Medallion architecture (Bronze / Silver / Gold): data arrives as-is, then gets cleaned and enriched in subsequent layers.
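In practice, the layers are often nothing more than key prefixes inside a bucket. The sketch below illustrates one possible naming convention. The layout (layer/source/year/month/day/file) and the function names bronze_key and promote are assumptions made for this example, not part of any standard.

```python
from datetime import datetime, timezone


def bronze_key(source: str, filename: str, received_at: datetime) -> str:
    """Build a key for a raw file landing in the Bronze layer."""
    # Normalize to UTC so the date in the key does not depend on the server's time zone.
    day = received_at.astimezone(timezone.utc)
    return f"bronze/{source}/{day:%Y/%m/%d}/{filename}"


def promote(key: str, layer: str) -> str:
    """Return the key the same file would get in a later layer."""
    _, rest = key.split("/", 1)
    return f"{layer}/{rest}"


key = bronze_key("crm", "contacts.csv", datetime(2024, 3, 5, 8, 30, tzinfo=timezone.utc))
print(key)                     # bronze/crm/2024/03/05/contacts.csv
print(promote(key, "silver"))  # silver/crm/2024/03/05/contacts.csv
```

Keeping the same suffix across layers makes it easy to trace a cleaned file back to the raw object it came from.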

S3: The De Facto Standard

S3 (Simple Storage Service) is Amazon Web Services' object storage service, launched in 2006. Its API has become the industry standard.

Today, when we talk about "S3 storage", we're not necessarily talking about Amazon. We're talking about a protocol, an interface that everyone understands. Your backup tools, your applications, your data pipelines: they all speak S3.

This universality is what makes object storage so practical. And it's also what allows you to self-host it: if your solution speaks S3, your applications keep working once you point them at the new endpoint and credentials, with no code changes.
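As an illustration, here is roughly what switching targets looks like with boto3, the AWS SDK for Python. The endpoint http://localhost:3900, the region name "garage", and the bucket and key names are assumptions for this sketch, so check them against your own server's configuration. The helper names s3_client_kwargs and make_client are hypothetical. Credentials are left to boto3's usual lookup (environment variables or configuration files).

```python
def s3_client_kwargs(endpoint_url=None, region="us-east-1"):
    """Keyword arguments for boto3.client("s3", ...)."""
    kwargs = {"region_name": region}
    if endpoint_url is not None:
        # The only setting that differs between AWS and a self-hosted server.
        kwargs["endpoint_url"] = endpoint_url
    return kwargs


def make_client(**kwargs):
    import boto3  # third-party dependency: pip install boto3
    return boto3.client("s3", **kwargs)


aws = s3_client_kwargs()
self_hosted = s3_client_kwargs(endpoint_url="http://localhost:3900", region="garage")

# Application code is identical for both targets, for example:
#   client = make_client(**self_hosted)
#   client.put_object(Bucket="backups", Key="db/2024-03-05.sql.gz", Body=b"...")
#   data = client.get_object(Bucket="backups", Key="db/2024-03-05.sql.gz")["Body"].read()
```

The calls to put_object and get_object do not change. Only the client construction does, which is why the endpoint usually lives in configuration rather than in code.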

Self-Hosted Object Storage Solutions

Good news: you don't have to use AWS to benefit from object storage. Several open source solutions let you run your own infrastructure, on your own servers, under your own rules.

Four main solutions stand out. Each has its strengths and positioning.

MinIO

The reference for self-hosted object storage. Created in 2014 by an American company (Palo Alto), MinIO has established itself through its performance and abundant documentation. The project is under AGPLv3 license and is also available in commercial editions with support.

Ceph (RADOS Gateway)

The veteran of distributed storage, governed by the Ceph Foundation, which is hosted by the Linux Foundation. Ceph offers an S3 gateway in addition to block and file storage. It's a complete solution, but its complexity reserves it for large infrastructures with dedicated teams. The official documentation recommends a minimum of 10 nodes for a production deployment.

LGPL 2.1 license, open governance. Oversized for SMBs.

SeaweedFS

Open source project under Apache 2.0 license, with international contributors. SeaweedFS is optimized for small files and low latency, making it a good option for CDNs or web applications with many assets. The architecture is inspired by Facebook Haystack (described in their 2010 publication).
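The core idea of the Haystack design is to pack many small files into a few large volume files and keep an in-memory index of where each one sits, so a read costs one seek instead of a file system lookup per file. The sketch below is a simplified illustration of that idea only, not SeaweedFS's actual on-disk format. The Volume class is hypothetical, and an in-memory buffer stands in for the volume file.

```python
import io


class Volume:
    """Append-only blob container with an in-memory index."""

    def __init__(self) -> None:
        self._blob = io.BytesIO()  # stands in for one large file on disk
        self._index: dict[str, tuple[int, int]] = {}  # key -> (offset, size)

    def write(self, key: str, data: bytes) -> None:
        # Always append at the end; existing bytes are never rewritten.
        offset = self._blob.seek(0, io.SEEK_END)
        self._blob.write(data)
        self._index[key] = (offset, len(data))

    def read(self, key: str) -> bytes:
        # One index lookup, one seek, one read.
        offset, size = self._index[key]
        self._blob.seek(offset)
        return self._blob.read(size)


volume = Volume()
volume.write("icon.png", b"\x89PNG...")
volume.write("style.css", b"body { margin: 0 }")
print(volume.read("style.css"))  # b'body { margin: 0 }'
```

With millions of small assets, the operating system sees a handful of large files rather than millions of tiny ones, which is where the latency gain comes from.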

Garage

European solution developed by the French association Deuxfleurs. Garage stands out for its lightness: a single binary, minimal configuration, and the documentation indicates it runs with 1 GB of RAM. Multi-site replication is native, designed from the start.

AGPLv3 license. Younger project (first stable version in 2022), but in production at Deuxfleurs since 2020.

Focus on Garage

Who's Behind It?

Garage is developed by Deuxfleurs, a French non-profit association specializing in alternative and decentralized hosting.

The association has existed since 2018 and has been developing Garage since 2020. They use it in production for their own services: a cluster of 9 nodes distributed across 3 physical sites in France.

The project benefits from European funding via the NGI POINTER and NLnet programs, which support the development of alternatives to American technologies. Transparent governance, open source code (AGPLv3), and an active community on Matrix.

Why We Like Garage

Lightweight. Garage runs on modest hardware. A Raspberry Pi can do the job for reasonable volumes. No need to invest in powerful servers to get started.

Resilient. Multi-site replication is native. Your data is automatically copied across multiple nodes, potentially in different geographic locations. If one site goes down, the others take over.
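To show what "copied across sites" means in practice, here is a small sketch of zone-aware replica placement. It uses rendezvous hashing, a generic technique chosen for brevity. It is not Garage's actual layout algorithm. The node names, the three city names, and the functions score, place and survivors are all hypothetical.

```python
import hashlib

# Hypothetical layout: 9 nodes spread over 3 sites (node name -> zone).
NODES = {
    "paris-1": "paris", "paris-2": "paris", "paris-3": "paris",
    "lyon-1": "lyon", "lyon-2": "lyon", "lyon-3": "lyon",
    "lille-1": "lille", "lille-2": "lille", "lille-3": "lille",
}


def score(node: str, key: str) -> int:
    """Deterministic pseudo-random weight for a (node, key) pair."""
    digest = hashlib.sha256(f"{node}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(key: str, nodes: dict[str, str], replicas: int = 3) -> list[str]:
    """Pick up to `replicas` nodes for `key`, never two in the same zone."""
    chosen, used_zones = [], set()
    # Walk nodes from highest to lowest score, skipping zones already used.
    for node in sorted(nodes, key=lambda n: score(n, key), reverse=True):
        if nodes[node] not in used_zones:
            chosen.append(node)
            used_zones.add(nodes[node])
        if len(chosen) == replicas:
            break
    return chosen


def survivors(key: str, failed_zone: str) -> list[str]:
    """Replicas still reachable after a whole site goes offline."""
    return [n for n in place(key, NODES) if NODES[n] != failed_zone]


replica_nodes = place("backups/db.sql.gz", NODES)
print(replica_nodes)                            # three nodes, one per site
print(survivors("backups/db.sql.gz", "paris"))  # the two replicas outside Paris
```

Because each object has one copy per site, losing an entire site still leaves two copies available, and every node can compute the placement on its own from the key.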

Simple. A single binary to deploy. No zoo of services to orchestrate, no complex dependencies. The configuration fits in a readable TOML file.

Sovereign. Code developed in France, transparent associative governance, European funding. For organizations concerned about digital sovereignty, this is an argument that counts.

Which Solution to Choose?

There is no "best" solution in absolute terms. The choice depends on your context.

MinIO if you're looking for a versatile, well-documented solution with a large ecosystem. It's the default choice when you have no particular constraints.

Ceph if you have a large infrastructure, expert teams, and need unified storage (block + file + object). Not for small organizations.

SeaweedFS if your use case involves many small files (images, web assets) and latency is critical. Good option for in-house CDNs.

Garage if you're an SMB, have limited hardware resources, want simple multi-site, or European sovereignty is an important criterion.

Conclusion

Self-hosted S3 storage is no longer reserved for web giants. Several mature solutions today allow you to host your object data without depending on AWS or another American hyperscaler.

Garage is a credible European alternative, particularly suited to lightweight and multi-site deployments. It's not the solution for all cases, but it fills its niche well.

At Datakhi, we choose the solution best suited to each project: MinIO for its versatility, Garage for its lightness, or something else according to your needs. The important thing is that your data remains under your control.

Need advice on your storage strategy? Contact us to discuss.
