Bucket Layouts
Apache Ozone offers different Bucket Layouts that define how keys (objects, files) are organized and managed within a bucket. The chosen layout significantly impacts the bucket's behavior, performance characteristics, and compatibility with different access protocols like S3 and Hadoop Compatible File System (HCFS) interfaces (ofs://, o3fs://).
Ozone's flexibility in bucket layouts allows a single cluster to serve as both a high-performance object store (like Amazon S3) and a scalable Hadoop Compatible File System (like HDFS), catering to diverse workloads and enabling multi-protocol access to data.
Available Layouts
Ozone provides the following bucket layouts:
- Object Store (OBS):
  - Namespace: Flat (like S3).
  - Compatibility: Optimized for strict S3 compatibility.
  - Use Case: S3-native applications, cloud-native workloads, unstructured data storage (media, backups).
  - HCFS Access (ofs://): Not supported.
- File System Optimized (FSO):
  - Namespace: Hierarchical (like HDFS).
  - Compatibility: Optimized for HCFS compatibility (ofs://, o3fs://). Supports atomic directory operations via HCFS.
  - Use Case: HDFS replacement, traditional analytics (Spark, Hive, Impala), workloads requiring filesystem semantics.
  - S3 Access: Supported, but directory operations are not atomic via S3.
- Legacy:
  - This was the original layout before OBS and FSO were introduced.
  - It provides a flat namespace.
  - Its behavior, especially regarding S3 access and filesystem path interpretation, can depend on the ozone.om.enable.filesystem.paths configuration setting.
  - Recommendation: For all new buckets and use cases, prefer either OBS or FSO over the Legacy layout.
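To make the difference between a flat and a hierarchical namespace more concrete, the following is a toy Python model of why a directory rename can be a single metadata update in a hierarchical layout but touches every affected key in a flat one. The classes FlatNamespace and HierarchicalNamespace and all of their methods are hypothetical illustrations, not Ozone's actual implementation or API.

```python
# Toy model only: these classes are hypothetical and are not Ozone code.

class FlatNamespace:
    """Keys are full paths; a "directory" is only a shared key prefix."""

    def __init__(self):
        self.keys = {}  # full key name -> data

    def put(self, key, data):
        self.keys[key] = data

    def rename_prefix(self, src, dst):
        """Rewrite every key under src/ and return how many entries changed."""
        src_prefix = src.rstrip("/") + "/"
        dst_prefix = dst.rstrip("/") + "/"
        moved = [k for k in self.keys if k.startswith(src_prefix)]
        for k in moved:
            self.keys[dst_prefix + k[len(src_prefix):]] = self.keys.pop(k)
        return len(moved)


class HierarchicalNamespace:
    """Entries are keyed by (parent_id, name); children point at a parent id."""

    ROOT_ID = 0

    def __init__(self):
        self.dirs = {}   # (parent_id, name) -> directory id
        self.files = {}  # (parent_id, name) -> data
        self._next_id = 1

    def _resolve(self, parts, create=False):
        """Walk directory names from the root and return the last directory id."""
        current = self.ROOT_ID
        for name in parts:
            entry = (current, name)
            if entry not in self.dirs:
                if not create:
                    raise KeyError("/".join(parts))
                self.dirs[entry] = self._next_id
                self._next_id += 1
            current = self.dirs[entry]
        return current

    def put(self, path, data):
        *dir_parts, name = path.strip("/").split("/")
        parent = self._resolve(dir_parts, create=True)
        self.files[(parent, name)] = data

    def get(self, path):
        *dir_parts, name = path.strip("/").split("/")
        return self.files[(self._resolve(dir_parts), name)]

    def rename_dir(self, src, dst):
        """Re-link one directory entry and return how many entries changed."""
        *src_parent, src_name = src.strip("/").split("/")
        *dst_parent, dst_name = dst.strip("/").split("/")
        src_entry = (self._resolve(src_parent), src_name)
        dst_entry = (self._resolve(dst_parent, create=True), dst_name)
        # Children keep their parent id, so none of them needs to be rewritten.
        self.dirs[dst_entry] = self.dirs.pop(src_entry)
        return 1


flat = FlatNamespace()
tree = HierarchicalNamespace()
for i in range(3):
    flat.put(f"logs/2024/part-{i}", b"x")
    tree.put(f"logs/2024/part-{i}", b"x")

flat_rewrites = flat.rename_prefix("logs/2024", "logs/archive")
tree_rewrites = tree.rename_dir("logs/2024", "logs/archive")
print(flat_rewrites, tree_rewrites)  # 3 1
```

In this sketch the flat rename rewrites one entry per key, so a failure partway through would leave some keys under the old prefix. The hierarchical rename changes a single entry, which is the kind of property that makes an atomic directory operation feasible.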
Choosing the Right Layout
Selecting the appropriate bucket layout is crucial for optimal performance and compatibility:
- If your primary requirement is S3 compatibility for cloud-native applications or pure object storage, choose OBS.
- If your primary requirement is HCFS compatibility for analytics workloads or replacing HDFS, choose FSO.
You can specify the layout during bucket creation or set a default layout cluster-wide. See the specific layout pages for details.
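As a rough illustration of the selection rule and of passing a layout at bucket creation, the sketch below maps a primary access protocol to a layout and builds a bucket creation command. It assumes the ozone sh bucket create command accepts a --layout option with the values OBJECT_STORE and FILE_SYSTEM_OPTIMIZED, and that the cluster-wide default is controlled by the ozone.default.bucket.layout property; confirm both against the documentation for your Ozone release. The helper functions and the volume and bucket names are hypothetical.

```python
import shlex

# Assumed layout identifiers; verify them against your Ozone release.
LAYOUTS = {
    "s3": "OBJECT_STORE",             # OBS: flat namespace, S3 compatibility
    "hcfs": "FILE_SYSTEM_OPTIMIZED",  # FSO: hierarchical namespace, HCFS access
}


def choose_layout(primary_protocol):
    """Map the primary access protocol ("s3" or "hcfs") to a bucket layout."""
    try:
        return LAYOUTS[primary_protocol.lower()]
    except KeyError:
        raise ValueError(f"unknown protocol: {primary_protocol!r}") from None


def bucket_create_command(volume, bucket, primary_protocol):
    """Build (but do not run) a bucket creation command as an argument list."""
    layout = choose_layout(primary_protocol)
    # The --layout flag is an assumption about the CLI, as noted above.
    return ["ozone", "sh", "bucket", "create", f"/{volume}/{bucket}",
            "--layout", layout]


# Hypothetical volume and bucket names, for illustration only.
cmd = bucket_create_command("analytics", "warehouse", "hcfs")
print(shlex.join(cmd))
# ozone sh bucket create /analytics/warehouse --layout FILE_SYSTEM_OPTIMIZED
```

The command is returned as a list rather than executed, so it can be reviewed first or passed to subprocess.run on a host where the ozone client is installed.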