S3

Apache Ozone includes a built-in S3 Gateway that provides compatibility with the Amazon S3 REST API. This allows applications and tools designed for S3 to interact with Ozone without modification.

Overview

The S3 Gateway is a stateless service that runs alongside the Ozone cluster (often co-located with Ozone Managers or on dedicated nodes). It receives S3 API requests from clients and translates them into corresponding operations on the Ozone cluster (primarily interacting with the Ozone Manager).

Key Features:

  • S3 API Compatibility: Supports a large subset of the S3 REST API for bucket and object operations (e.g., ListBuckets, CreateBucket, PutObject, GetObject, DeleteObject, ListObjectsV2, HeadObject, Multipart Upload).
  • Tooling Ecosystem: Enables the use of standard S3 tools like AWS CLI, s3cmd, SDKs (Java, Python/Boto3, Go, etc.), and various S3-aware applications.
  • Object Storage Semantics: Best suited for interacting with Ozone buckets using object storage semantics, particularly Object Store (OBS) buckets.
  • Authentication: Supports AWS Signature Version 4 for authenticating requests. User access keys are typically mapped to Ozone users (see S3 Multi-Tenancy documentation for details).
  • Stateless: The gateway itself doesn't store data, making it horizontally scalable. Multiple gateway instances can run behind a load balancer.

Security Model

The S3 Gateway in Ozone implements a security model that maps S3 credentials to Ozone users while utilizing AWS Signature Version 4 for request authentication.
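To make the Signature Version 4 step more concrete, the following is a minimal sketch of the standard SigV4 signing-key derivation that S3 clients perform with the secret key. It illustrates the public AWS algorithm only; it is not taken from Ozone's source, and the function names are hypothetical. In practice the AWS CLI and SDKs do this for you.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of message under key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str,
                       region: str, service: str = "s3") -> bytes:
    """Derive the SigV4 signing key (hypothetical helper name).

    date_stamp is the request date as YYYYMMDD. The chain is:
    date -> region -> service -> the literal "aws4_request".
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature placed in the Authorization header."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()
```

Because the signature depends on the secret key, a server that holds the same secret can recompute it and compare, which is how a SigV4 request can be authenticated without sending the secret itself.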

S3 Credential Management

Ozone maintains AWS-compatible credentials (access key and secret key pairs) for users who need S3 access:

  • Access Key: Typically the username of the Ozone user
  • Secret Key: A generated secure token stored in Ozone Manager's metadata store

Kerberos Authentication Requirements

In secure Ozone clusters with Kerberos enabled, you must authenticate with Kerberos before generating or revoking S3 credentials:

# Authenticate with Kerberos
kinit username

# Now you can generate S3 credentials
ozone s3 getsecret

The S3 credentials commands use your current Kerberos identity to:

  1. Authenticate you to the Ozone Manager
  2. Associate the S3 credentials with your Kerberos principal
  3. Enforce appropriate authorization for credential management

Without a valid Kerberos ticket, the credential management commands will fail in a secure cluster.

Credential Management Commands

S3 credentials are managed through Ozone CLI commands:

# Generate S3 credentials for the current user
ozone s3 getsecret

# Generate S3 credentials for a specific user (requires admin privileges)
ozone s3 getsecret -u username

# Revoke existing S3 credentials (-y skips the interactive confirmation prompt)
ozone s3 revokesecret -y

# Revoke S3 credentials for a specific user (requires admin privileges)
ozone s3 revokesecret -y -u username

The CLI output provides both the access key and secret key needed to configure S3 clients:

awsAccessKey=username
awsSecret=generated_secret_key
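If you script client setup, the two-line output above can be parsed directly. The sketch below assumes the output has exactly the `awsAccessKey=` / `awsSecret=` form shown; the function name is hypothetical, and the returned dictionary keys are chosen to match the AWS credentials file.

```python
def parse_getsecret_output(text: str) -> dict:
    """Parse 'ozone s3 getsecret' output into AWS-style credential fields.

    Assumes lines of the form awsAccessKey=... and awsSecret=...;
    any other lines are ignored.
    """
    values = {}
    for line in text.splitlines():
        # Split on the first '=' only, so '=' inside the value is preserved.
        key, sep, value = line.strip().partition("=")
        if sep and key in ("awsAccessKey", "awsSecret"):
            values[key] = value
    missing = {"awsAccessKey", "awsSecret"} - values.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return {
        "aws_access_key_id": values["awsAccessKey"],
        "aws_secret_access_key": values["awsSecret"],
    }
```

The input text could come from running the CLI with `subprocess.run(["ozone", "s3", "getsecret"], capture_output=True, text=True, check=True).stdout`, assuming the `ozone` command is on the PATH and, in a secure cluster, that a valid Kerberos ticket is present.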

S3 Administrator Role

Ozone defines specific administrator roles for S3 credential management through two configuration parameters:

  • ozone.s3.administrators: List of users with S3 admin privileges
  • ozone.s3.administrators.groups: List of groups with S3 admin privileges

If these specific S3 admin configurations are not set, the system falls back to using general Ozone administrators defined by ozone.administrators and ozone.administrators.groups.
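The fallback can be pictured with a small sketch. This is an illustration of the rule described above, not Ozone's implementation: the configuration is modeled as a plain dictionary, the function name is hypothetical, and it assumes that the user list and the group list fall back independently of each other.

```python
from typing import Dict, Optional, Set, Tuple


def _split(value: Optional[str]) -> Set[str]:
    """Split a comma-separated config value into a set of trimmed names."""
    return {item.strip() for item in (value or "").split(",") if item.strip()}


def effective_s3_admins(conf: Dict[str, str]) -> Tuple[Set[str], Set[str]]:
    """Return (admin users, admin groups) after applying the fallback.

    Assumption: each S3-specific key falls back to its general
    counterpart on its own when it is unset or empty.
    """
    users = (_split(conf.get("ozone.s3.administrators"))
             or _split(conf.get("ozone.administrators")))
    groups = (_split(conf.get("ozone.s3.administrators.groups"))
              or _split(conf.get("ozone.administrators.groups")))
    return users, groups
```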

S3 administrators have privileges to:

  • Generate S3 credentials for any user in the system
  • Revoke S3 credentials from any user in the system
  • Access the S3 credential management REST endpoints

Security design principles that are enforced include:

  • Only administrators can manage credentials for other users
  • Regular users can only manage their own credentials
  • All credential operations are audit-logged for security monitoring
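The first two principles reduce to a short authorization rule, sketched below. The function and parameter names are hypothetical and the check is a simplification for illustration; the actual enforcement happens inside Ozone and is not reproduced here.

```python
from typing import Iterable, Set


def may_manage_secret(requester: str, target: str,
                      requester_groups: Iterable[str],
                      admin_users: Set[str],
                      admin_groups: Set[str]) -> bool:
    """Decide whether requester may generate or revoke target's S3 secret."""
    # Regular users can always manage their own credentials.
    if requester == target:
        return True
    # Managing another user's credentials requires S3 admin status,
    # granted either directly or through group membership.
    return requester in admin_users or bool(set(requester_groups) & admin_groups)
```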

Configuration

Clients need to be configured to point to the Ozone S3 Gateway endpoint instead of the default AWS S3 endpoint.

AWS CLI Example Configuration (~/.aws/config and ~/.aws/credentials):

~/.aws/config:

[default]
# Region can often be arbitrary for Ozone S3.
region = us-east-1
# Under "s3": endpoint_url is the URL of your S3 Gateway;
# addressing_style may be 'virtual' instead, depending on gateway setup.
s3 =
    endpoint_url = http://ozone-s3g.example.com:9878
    signature_version = s3v4
    addressing_style = path

~/.aws/credentials:

[default]
# Access key and secret key obtained from Ozone
aws_access_key_id = your_ozone_access_key
aws_secret_access_key = your_ozone_secret_key

Replace http://ozone-s3g.example.com:9878 with the actual address and port of your S3 Gateway instance(s). The access/secret keys are typically generated via Ozone's S3 credential management commands.

Usage Examples

AWS CLI:

# List buckets (maps to listing the Ozone buckets exposed through the S3 volume)
aws s3 ls --endpoint-url http://ozone-s3g.example.com:9878

# Create a bucket (maps to creating an Ozone bucket in the S3 volume, /s3v by default)
aws s3 mb s3://my-ozone-bucket --endpoint-url http://ozone-s3g.example.com:9878

# Upload a file
aws s3 cp local_file.txt s3://my-ozone-bucket/remote_file.txt --endpoint-url http://ozone-s3g.example.com:9878

# Download a file
aws s3 cp s3://my-ozone-bucket/remote_file.txt local_copy.txt --endpoint-url http://ozone-s3g.example.com:9878

# List objects in a bucket
aws s3 ls s3://my-ozone-bucket/ --endpoint-url http://ozone-s3g.example.com:9878

Python (Boto3):

import boto3

# Client pointed at the Ozone S3 Gateway instead of AWS
s3 = boto3.client(
    's3',
    endpoint_url='http://ozone-s3g.example.com:9878',
    aws_access_key_id='your_ozone_access_key',
    aws_secret_access_key='your_ozone_secret_key',
    region_name='us-east-1',  # Region often arbitrary
)

# List buckets
response = s3.list_buckets()
print(response['Buckets'])

# Upload a file
s3.upload_file('local_file.txt', 'my-ozone-bucket', 'remote_file.txt')
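Listing objects from Python, the counterpart of `aws s3 ls s3://my-ozone-bucket/`, can be sketched with Boto3's `list_objects_v2` paginator. The helper name below is hypothetical; it takes an already configured Boto3 S3 client (such as the one created above) as an argument, so the block itself does not import Boto3.

```python
from typing import Iterator


def iter_object_keys(s3_client, bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield every object key in bucket that starts with prefix.

    s3_client is expected to be a Boto3 S3 client configured with the
    gateway's endpoint_url. The paginator follows continuation tokens,
    so large listings are handled page by page.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages with no matching objects have no 'Contents' entry.
        for obj in page.get("Contents", []):
            yield obj["Key"]
```

For example, `for key in iter_object_keys(s3, 'my-ozone-bucket'): print(key)` would print each key in the bucket used in the examples above.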

Bucket Layout Considerations

  • OBS Buckets: The S3 Gateway works most naturally with Object Store (OBS) buckets, which provide a flat namespace aligned with S3 semantics.
  • FSO Buckets: Accessing File System Optimized (FSO) buckets via the S3 Gateway is possible, but filesystem-specific operations like atomic directory renames are not available through the S3 API. Directory structures are simulated using object key prefixes with / delimiters.
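How a flat key space can look like directories is easiest to see in code. The sketch below mimics, in plain Python, the way an S3-style listing with a prefix and a `/` delimiter separates objects at one level from "common prefixes" (the simulated subdirectories). It is a local illustration with a hypothetical function name, not a call into the gateway.

```python
from typing import Iterable, List, Tuple


def list_level(keys: Iterable[str], prefix: str = "",
               delimiter: str = "/") -> Tuple[List[str], List[str]]:
    """Return (objects, common_prefixes) directly under prefix."""
    objects = []
    common_prefixes = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # A delimiter remains, so the key lives in a deeper "directory".
            common_prefixes.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return sorted(objects), sorted(common_prefixes)
```

With the keys `logs/2024/a.txt`, `logs/2024/b.txt`, `logs/readme.md`, and `data.csv`, calling `list_level(keys)` returns `(['data.csv'], ['logs/'])`, and `list_level(keys, 'logs/')` returns `(['logs/readme.md'], ['logs/2024/'])`. No directory object is involved; the hierarchy exists only in the key names.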

When to Use

Use the S3 Gateway interface when:

  • You need to integrate applications or tools built for the Amazon S3 API.
  • You prefer object storage semantics.
  • You are migrating applications from S3 to Ozone.
  • You need compatibility with a wide range of existing S3 SDKs and tools.

The S3 Gateway provides a powerful bridge between the S3 ecosystem and Ozone's scalable storage.