HttpFS Request Flow
This document describes the journey of a client request through the HttpFS service, detailing how WebHDFS API calls are transformed into Ozone operations.
Request Processing Sequence
A typical request flow through HttpFS follows these steps:
-
Client Request Initiation
- Client sends an HTTP request to the HttpFS endpoint
- Request includes an operation type, path, and optional parameters
-
HTTP Server Processing
- The embedded HTTP server (Jetty) receives the request
- The request is routed to the appropriate servlet based on the path
-
Authentication
- Authentication filter intercepts the request
- Authenticates using one of:
- Kerberos (SPNEGO)
- Delegation token
- Simple authentication (if security is disabled)
-
Request Validation
- Validates request parameters
- Checks that required parameters are present
- Verifies parameter format and values
-
User Resolution
- Resolves the authenticated user's identity
- Sets up the user context for authorization
-
Path Resolution
- Parses the WebHDFS path into Ozone volume, bucket, and key components
- Transforms relative paths to absolute paths if necessary
-
Operation Translation
- Maps WebHDFS operations to corresponding Ozone operations:
- CREATE → Ozone key create
- OPEN → Ozone key read
- MKDIRS → Ozone directory creation
- RENAME → Ozone key rename
- And other similar mappings
-
Authorization Check
- Checks if the user has permission to perform the operation
- Verifies access according to Ozone's permission model
-
Ozone Client Interaction
- Creates the appropriate Ozone client call
- Executes the operation through Ozone client libraries
-
Data Transfer (for read/write operations)
- For read operations: streams data from Ozone to the HTTP response
- For write operations: streams data from the HTTP request to Ozone
-
Response Generation
- Creates an HTTP response with the appropriate status code
- Formats the response body according to WebHDFS specification
- Includes error details if the operation failed
-
Response Transmission
- Sends the response back to the client
- Closes the connection if no more data will be exchanged
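The path resolution and operation translation steps above can be sketched in a few lines of Python. This is an illustration only, not the HttpFS implementation: the names `resolve_path`, `translate_operation`, and `OzonePath` are hypothetical, and the strings on the right-hand side of `OPERATION_MAP` are descriptive labels taken from this document rather than real Ozone API names. The sketch also assumes every path carries at least a volume and a bucket.

```python
from typing import NamedTuple, Tuple
from urllib.parse import parse_qs, urlsplit

WEBHDFS_PREFIX = "/webhdfs/v1"

# The four mappings named in this document; values are labels, not API calls.
OPERATION_MAP = {
    "CREATE": "key create",
    "OPEN": "key read",
    "MKDIRS": "directory creation",
    "RENAME": "key rename",
}


class OzonePath(NamedTuple):
    volume: str
    bucket: str
    key: str


def resolve_path(request_path: str) -> OzonePath:
    """Split a WebHDFS request path into volume, bucket, and key."""
    if not request_path.startswith(WEBHDFS_PREFIX + "/"):
        raise ValueError("not a WebHDFS path: " + request_path)
    parts = [p for p in request_path[len(WEBHDFS_PREFIX):].split("/") if p]
    if len(parts) < 2:
        raise ValueError("path must contain at least a volume and a bucket")
    volume, bucket, *key_parts = parts
    return OzonePath(volume, bucket, "/".join(key_parts))


def translate_operation(url: str) -> Tuple[str, OzonePath]:
    """Return the Ozone operation label and resolved path for a request URL."""
    split = urlsplit(url)
    op = parse_qs(split.query).get("op", [""])[0].upper()
    if op not in OPERATION_MAP:
        raise ValueError("unsupported or missing op: " + repr(op))
    return OPERATION_MAP[op], resolve_path(split.path)
```

For example, `translate_operation("/webhdfs/v1/volume/bucket/path/to/file?op=OPEN")` yields the label `"key read"` together with volume `volume`, bucket `bucket`, and key `path/to/file`.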
Specific Operation Flows
File Read Operation
- Client issues GET /webhdfs/v1/volume/bucket/path/to/file?op=OPEN
- HttpFS authenticates the request
- Translates WebHDFS path to Ozone path: ozone://om-host/volume/bucket/path/to/file
- Opens input stream from Ozone client
- Streams data through HTTP response
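The streaming step can be illustrated with a chunked copy between two file-like objects. This is a sketch under stated assumptions: `stream_copy` and its default chunk size are hypothetical, and in-memory buffers stand in for the Ozone input stream and the HTTP response body.

```python
import io
from typing import BinaryIO


def stream_copy(source: BinaryIO, sink: BinaryIO, chunk_size: int = 64 * 1024) -> int:
    """Copy source to sink in fixed-size chunks and return the bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read signals end of stream
            break
        sink.write(chunk)
        total += len(chunk)
    return total


# Stand-ins: `ozone_stream` plays the Ozone key, `http_body` the response body.
ozone_stream = io.BytesIO(b"example key contents")
http_body = io.BytesIO()
copied = stream_copy(ozone_stream, http_body, chunk_size=8)
```

Copying in bounded chunks keeps memory use independent of the key size. The same loop serves the write direction, with the HTTP request body as the source and the Ozone output stream as the sink.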
File Write Operation
- Client issues PUT /webhdfs/v1/volume/bucket/path/to/file?op=CREATE
- HttpFS authenticates the request
- Translates WebHDFS path to Ozone path: ozone://om-host/volume/bucket/path/to/file
- Creates output stream through Ozone client
- Reads from HTTP request body and writes to the output stream
- Closes stream and confirms completion
Directory Creation
- Client issues PUT /webhdfs/v1/volume/bucket/path/to/dir?op=MKDIRS
- HttpFS authenticates the request
- Translates to Ozone path: ozone://om-host/volume/bucket/path/to/dir
- Creates directory through Ozone client
- Returns success status
Listing Directory Contents
- Client issues GET /webhdfs/v1/volume/bucket/path/to/dir?op=LISTSTATUS
- HttpFS authenticates the request
- Translates to Ozone path: ozone://om-host/volume/bucket/path/to/dir
- Lists contents through Ozone client
- Formats result as JSON according to WebHDFS specification
- Returns formatted listing
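The JSON formatting step can be sketched as follows. The `FileStatuses` / `FileStatus` envelope and the field names in the output follow the WebHDFS specification; the function name `format_liststatus` and the keys of the input dictionaries (`name`, `is_dir`, `length`, and so on) are hypothetical. Only a subset of fields is shown; real responses carry more, such as `accessTime`, `blockSize`, and `replication`.

```python
import json
from typing import Any, Iterable, Mapping


def format_liststatus(entries: Iterable[Mapping[str, Any]]) -> str:
    """Render directory entries as a WebHDFS-style LISTSTATUS JSON body."""
    statuses = []
    for entry in entries:
        is_dir = entry["is_dir"]
        statuses.append({
            "pathSuffix": entry["name"],
            "type": "DIRECTORY" if is_dir else "FILE",
            "length": 0 if is_dir else entry["length"],
            "owner": entry["owner"],
            "group": entry["group"],
            "permission": entry["permission"],
            "modificationTime": entry["mtime_ms"],
        })
    return json.dumps({"FileStatuses": {"FileStatus": statuses}})


# Hypothetical listing of one file and one subdirectory.
body = format_liststatus([
    {"name": "data.csv", "is_dir": False, "length": 1024,
     "owner": "alice", "group": "users", "permission": "644", "mtime_ms": 1700000000000},
    {"name": "logs", "is_dir": True, "length": 0,
     "owner": "alice", "group": "users", "permission": "755", "mtime_ms": 1700000000000},
])
```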
Error Handling
When errors occur during processing:
- HttpFS catches exceptions from the Ozone client
- Maps Ozone-specific exceptions to appropriate HTTP status codes:
- FileNotFoundException → 404 Not Found
- AccessControlException → 403 Forbidden
- InvalidPathException → 400 Bad Request
- Generates a JSON response with error details
- Logs the error with appropriate severity
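The exception mapping above can be sketched as a lookup table. This is an illustration, not the HttpFS code: the Python exception classes are stand-ins for the Java exceptions named in the list, `to_error_response` is a hypothetical helper, and the fallback to 500 for unmapped exceptions is an assumption. The `RemoteException` envelope follows the WebHDFS error format, which also includes a `javaClassName` field that is omitted here.

```python
import json
from typing import Tuple


# Stand-ins for the Java exception types named in this document.
class FileNotFoundException(Exception):
    pass


class AccessControlException(Exception):
    pass


class InvalidPathException(Exception):
    pass


STATUS_BY_EXCEPTION = {
    FileNotFoundException: 404,   # Not Found
    AccessControlException: 403,  # Forbidden
    InvalidPathException: 400,    # Bad Request
}


def to_error_response(exc: Exception) -> Tuple[int, str]:
    """Map an exception to an HTTP status code and a JSON error body."""
    # Assumption: anything not listed is reported as an internal server error.
    status = STATUS_BY_EXCEPTION.get(type(exc), 500)
    body = {
        "RemoteException": {
            "exception": type(exc).__name__,
            "message": str(exc),
        }
    }
    return status, json.dumps(body)


status, payload = to_error_response(
    FileNotFoundException("/volume/bucket/missing does not exist")
)
```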
Security Context Flow
For secure clusters, the security context flows as follows:
- Client authenticates using Kerberos or delegation token
- HttpFS validates the credentials and creates a proxy user that represents the authenticated client
- All Ozone operations are executed as the authenticated user
- Authorization checks are performed against the user's identity and groups
- Audit logs record the original user's actions