RANGER-5482:Create Ranger Audit Server with SOLR and HDFS as audit consumer#847
Pull request overview
This PR introduces a comprehensive Ranger Audit Server architecture that decouples audit collection from Ranger Admin by implementing:
- A centralized Audit Server that receives audits from plugins via REST API and produces them to Kafka
- Separate consumer services for SOLR and HDFS that consume from Kafka and write to their respective destinations
- Shared common infrastructure for Kafka consumer management with support for consumer groups and rebalancing
Changes:
- New audit server infrastructure with 3 microservices (audit-server, consumer-solr, consumer-hdfs) and shared common module
- Integration with existing Ranger plugins (HDFS, Hive) to support HTTP-based audit destination
- Docker support for testing the complete audit pipeline with Kerberos authentication
- Thread-safe audit handling with concurrent counter management
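The last item, thread-safe counter management, can be illustrated with a minimal sketch. The class and method names below (`AuditCounters`, `addTotalCount`, and so on) are hypothetical stand-ins, not the actual code in `agents-audit/core`; the only detail taken from the PR is that the counters use `AtomicLong`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for an audit handler's counters (not the PR's actual class).
public class AuditCounters {
    private final AtomicLong totalCount   = new AtomicLong();
    private final AtomicLong successCount = new AtomicLong();
    private final AtomicLong failedCount  = new AtomicLong();

    // addAndGet is atomic, so concurrent callers never lose an update.
    public long addTotalCount(long n)   { return totalCount.addAndGet(n); }
    public long addSuccessCount(long n) { return successCount.addAndGet(n); }
    public long addFailedCount(long n)  { return failedCount.addAndGet(n); }

    public long getTotalCount()   { return totalCount.get(); }
    public long getSuccessCount() { return successCount.get(); }
    public long getFailedCount()  { return failedCount.get(); }

    public static void main(String[] args) throws InterruptedException {
        AuditCounters   counters = new AuditCounters();
        ExecutorService pool     = Executors.newFixedThreadPool(4);

        // Four threads each record 10,000 events.
        for (int t = 0; t < 4; t++) {
            pool.submit(() -> {
                for (int i = 0; i < 10_000; i++) {
                    counters.addTotalCount(1);
                    counters.addSuccessCount(1);
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);

        System.out.println("total=" + counters.getTotalCount() + ", success=" + counters.getSuccessCount());
    }
}
```

With a plain `long` field and `count++`, the same test could lose increments under contention, which is the kind of problem atomic counters avoid.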
Reviewed changes
Copilot reviewed 110 out of 111 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| ranger-audit-server/scripts/* | Shell scripts for starting/stopping audit services |
| ranger-audit-server/ranger-audit-server-service/* | Core audit server service with REST API and Kafka producer |
| ranger-audit-server/ranger-audit-consumer-solr/* | SOLR consumer service with Kafka-to-SOLR pipeline |
| ranger-audit-server/ranger-audit-consumer-hdfs/* | HDFS consumer service with Kafka-to-HDFS pipeline |
| ranger-audit-server/ranger-audit-common/* | Shared utilities, base classes, and consumer registry |
| agents-audit/dest-auditserver/* | New audit destination for plugins to send to audit server |
| agents-common/src/main/java/org/apache/ranger/plugin/audit/* | Enhanced audit handler with service type in additional info |
| agents-audit/core/src/main/java/org/apache/ranger/audit/provider/* | Thread-safe counters using AtomicLong |
| dev-support/ranger-docker/* | Docker compose and Dockerfiles for audit server services |
| pom.xml, distro/* | Build configuration and assembly descriptors |
…nsumer - fix failing testing
…nsumer - fix pmd issue
…onsumer - Fix audit commit failure propagation and recovery in the consumers
Pull request overview
Copilot reviewed 110 out of 111 changed files in this pull request and generated no new comments.
…nsumer - audit server partition management enhancement
mneethiraj left a comment
@rameeshm - sending comments on the portions I reviewed so far. Will continue reviewing.
...itserver/src/main/java/org/apache/ranger/audit/destination/RangerAuditServerDestination.java
Outdated
...itserver/src/main/java/org/apache/ranger/audit/destination/RangerAuditServerDestination.java
Outdated
...itserver/src/main/java/org/apache/ranger/audit/destination/RangerAuditServerDestination.java
Outdated
...itserver/src/main/java/org/apache/ranger/audit/destination/RangerAuditServerDestination.java
Outdated
agents-audit/core/src/main/java/org/apache/ranger/audit/provider/BaseAuditHandler.java
Outdated
...server/ranger-audit-server-service/src/main/java/org/apache/ranger/audit/rest/AuditREST.java
Outdated
...server/ranger-audit-server-service/src/main/java/org/apache/ranger/audit/rest/AuditREST.java
Outdated
...server/ranger-audit-server-service/src/main/java/org/apache/ranger/audit/rest/AuditREST.java
Outdated
...server/ranger-audit-server-service/src/main/java/org/apache/ranger/audit/rest/AuditREST.java
Outdated
...server/ranger-audit-server-service/src/main/java/org/apache/ranger/audit/rest/AuditREST.java
Outdated
…nsumer - Fix review comments
…nsumer - PojoMappingFeature for AuditEvent Object for serialization
…onsumer - Fix review comments set #2
…nsumer - Audit Batch processing and failure reprocessing improvement
...server/ranger-audit-server-service/src/main/java/org/apache/ranger/audit/rest/AuditREST.java
Outdated
…onsumer - Fix duplicate dependency error in the pom for sl4j
…nsumer - Fix ubuntu audit ranger module war file creation failure
        return ret;
    }

    if (!serviceName.equals(authenticatedUser)) {
Requiring serviceName to be the same as authenticatedUser doesn't seem right. I guess the intention is to verify that the caller is the service account for the specified serviceName. That identity should be obtained from the service configuration retrieved from Ranger admin, for example the service.admin.users configuration. If necessary, we can introduce a new configuration, service.account.user, for this purpose.
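A minimal sketch of the check suggested here, assuming the service configuration is available as a `Map<String, String>` and that user lists are comma-separated. The class `ServiceCallerCheck` and its methods are hypothetical; `service.admin.users` is the existing configuration named above, and `service.account.user` is only the proposed one. Fetching the configuration from Ranger admin and mapping a Kerberos principal to a short user name are out of scope for this sketch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: decides whether the authenticated caller may post audits for a service.
public class ServiceCallerCheck {
    static final String CONFIG_SERVICE_ADMIN_USERS  = "service.admin.users";
    static final String CONFIG_SERVICE_ACCOUNT_USER = "service.account.user"; // proposed, not an existing config

    // Collects the user names listed in either configuration (assumed comma-separated).
    static Set<String> allowedCallers(Map<String, String> serviceConfigs) {
        Set<String> ret = new HashSet<>();

        for (String key : new String[] {CONFIG_SERVICE_ADMIN_USERS, CONFIG_SERVICE_ACCOUNT_USER}) {
            String value = serviceConfigs.get(key);

            if (value == null) {
                continue;
            }

            for (String user : value.split(",")) {
                String trimmed = user.trim();

                if (!trimmed.isEmpty()) {
                    ret.add(trimmed);
                }
            }
        }

        return ret;
    }

    // Replaces the serviceName.equals(authenticatedUser) comparison.
    static boolean isCallerAllowed(String authenticatedUser, Map<String, String> serviceConfigs) {
        return authenticatedUser != null && allowedCallers(serviceConfigs).contains(authenticatedUser);
    }

    public static void main(String[] args) {
        Map<String, String> configs = new HashMap<>();
        configs.put(CONFIG_SERVICE_ADMIN_USERS, "hdfs, nn");

        System.out.println(isCallerAllowed("hdfs", configs));     // caller is listed
        System.out.println(isCallerAllowed("dev_hdfs", configs)); // service name, not a listed user
    }
}
```

The point of the design is that a service named `dev_hdfs` can be served by an account named `hdfs`, which a direct name comparison would reject.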
Pull request overview
Copilot reviewed 117 out of 118 changed files in this pull request and generated 6 comments.
Comments suppressed due to low confidence (7)
- ranger-audit-server/ranger-audit-server-service/src/main/java/javax/ws/rs/core/NoContentException.java:1 (reported twice): Defining a class under the javax.ws.rs.* namespace inside the project is very risky: it can conflict with the JAX-RS API/implementation provided by the container or dependency set, and can lead to classloading/linkage issues that are hard to diagnose. Prefer fixing this by adding/aligning the correct JAX-RS/Jersey dependency that provides javax.ws.rs.core.NoContentException, or by relocating/shading any workaround class into a project-owned package and updating the reference point that needs it.
- hdfs-agent/scripts/install.properties:1: The HDFS agent install properties set XAAUDIT.AUDITSERVER.FILE_SPOOL_DIR to a Hive log path (/var/log/hive/...). This looks like a copy/paste error and will cause HDFS plugin audit spooling to go to the wrong directory. Update it to an HDFS-appropriate directory (consistent with other HDFS audit spool paths in this repo).
- ranger-audit-server/ranger-audit-server-service/src/main/webapp/WEB-INF/security-applicationContext.xml:1 (reported twice): CSRF is explicitly disabled while sessions are forced (create-session="always"). If this service is intended to be a REST-only API (JWT/delegation-token), prefer making it stateless (no forced sessions) and limiting CSRF disabling to only the endpoints/methods that require it; otherwise, leaving CSRF off while maintaining session state increases risk for browser-based clients. Consider switching to stateless session creation and/or enabling CSRF with appropriate exemptions for non-browser clients.
- ranger-audit-server/ranger-audit-server-service/src/main/java/org/apache/ranger/audit/producer/kafka/AuditRecoveryManager.java:1: RecoveryStats.toString() starts with RecoveryStats{ but never appends a closing }. This makes logs harder to read/parse. Append a closing brace before returning the string.
- ranger-audit-server/ranger-audit-consumer-solr/src/main/java/org/apache/ranger/audit/rest/HealthCheckREST.java:1: A new ObjectMapper is created for every health check request. ObjectMapper is designed to be reused and can be expensive to construct. Prefer using a static/singleton ObjectMapper (or injecting one via Spring) to reduce allocation overhead.
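The RecoveryStats.toString() item can be sketched as follows. The class name `RecoveryStatsSketch` and the fields `recovered` and `failed` are assumptions made for illustration; the actual fields of `RecoveryStats` in `AuditRecoveryManager` are not shown in this conversation. The sketch only demonstrates the balanced-brace output format.

```java
// Hypothetical stand-in for RecoveryStats; field names are illustrative only.
public class RecoveryStatsSketch {
    private final long recovered;
    private final long failed;

    public RecoveryStatsSketch(long recovered, long failed) {
        this.recovered = recovered;
        this.failed    = failed;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("RecoveryStats{");

        sb.append("recovered=").append(recovered);
        sb.append(", failed=").append(failed);
        sb.append('}'); // the closing brace the review asks for

        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(new RecoveryStatsSketch(5, 1)); // prints RecoveryStats{recovered=5, failed=1}
    }
}
```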
agents-common/src/test/java/org/apache/ranger/plugin/audit/TestRangerDefaultAuditHandler.java
agents-common/src/main/java/org/apache/ranger/plugin/audit/RangerDefaultAuditHandler.java
agents-common/src/main/java/org/apache/ranger/plugin/audit/RangerDefaultAuditHandler.java
agents-common/src/main/java/org/apache/ranger/plugin/service/RangerBasePlugin.java
    COPY ./scripts/audit-server/service-check-functions.sh ${RANGER_SCRIPTS}/
    COPY ./scripts/audit-server/ranger-audit-server.sh ${RANGER_SCRIPTS}/
The Dockerfile copies scripts to ${RANGER_SCRIPTS}/ but the ENTRYPOINT uses a hard-coded /home/ranger/scripts/... path. If ${RANGER_SCRIPTS} is not exactly /home/ranger/scripts, the container will fail to start. Use a consistent path (either copy to /home/ranger/scripts explicitly, or set ENTRYPOINT to ${RANGER_SCRIPTS}/ranger-audit-server.sh).
    # Start the audit server using the custom startup script
    WORKDIR /opt/ranger-audit-server
    ENTRYPOINT ["/home/ranger/scripts/ranger-audit-server.sh"]
The Dockerfile copies scripts to ${RANGER_SCRIPTS}/ but the ENTRYPOINT uses a hard-coded /home/ranger/scripts/... path. If ${RANGER_SCRIPTS} is not exactly /home/ranger/scripts, the container will fail to start. Use a consistent path (either copy to /home/ranger/scripts explicitly, or set ENTRYPOINT to ${RANGER_SCRIPTS}/ranger-audit-server.sh).
Suggested change:
    - ENTRYPOINT ["/home/ranger/scripts/ranger-audit-server.sh"]
    + ENTRYPOINT ${RANGER_SCRIPTS}/ranger-audit-server.sh
agents-audit/core/src/main/java/org/apache/ranger/audit/model/AuthzAuditEvent.java
agents-audit/core/src/main/java/org/apache/ranger/audit/provider/AuditHandler.java
        }
    }

    private void preAuthenticateKerberos() {
Rename preAuthenticateKerberos() to initKerberos().
    // Add serviceName to query parameters
    if (StringUtils.isNotEmpty(serviceType)) {
        queryParams.put(QUERY_PARAM_SERVICE_NAME, serviceType);
Parameter QUERY_PARAM_SERVICE_NAME is set to serviceType? Perhaps the parameter should be renamed to QUERY_PARAM_SERVICE_TYPE?
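A small sketch of what the rename could look like, so that the constant name matches the value it carries. The class `AuditQueryParams`, the method `build`, and the literal `"serviceType"` are assumptions for illustration; the actual parameter name expected by the audit server's REST endpoint is not shown in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper showing the renamed constant.
public class AuditQueryParams {
    // Assumed value; the constant name now agrees with the value it is paired with.
    static final String QUERY_PARAM_SERVICE_TYPE = "serviceType";

    static Map<String, String> build(String serviceType) {
        Map<String, String> queryParams = new LinkedHashMap<>();

        // Add serviceType to query parameters only when it is set.
        if (serviceType != null && !serviceType.isEmpty()) {
            queryParams.put(QUERY_PARAM_SERVICE_TYPE, serviceType);
        }

        return queryParams;
    }

    public static void main(String[] args) {
        System.out.println(build("hdfs")); // prints {serviceType=hdfs}
        System.out.println(build(""));     // prints {}
    }
}
```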
    public static final String PROP_AUDITSERVER_SSL_CONFIG_FILE = "xasecure.audit.destination.auditserver.ssl.config.file";
    public static final String PROP_AUDITSERVER_MAX_RETRY_ATTEMPTS = "xasecure.audit.destination.auditserver.max.retry.attempts";
    public static final String PROP_AUDITSERVER_RETRY_INTERVAL_MS = "xasecure.audit.destination.auditserver.retry.interval.ms";
    public static final String PROP_SERVICE_TYPE = "ranger.plugin.audit.service.type";
Why are serviceType and appId needed for the audit destination? If they are necessary, how about using the existing fields in AuthzAuditEvent instead: repoType (serviceType), repositoryName (serviceName), agentType (appId)?
This would avoid additional configuration that needs to be set up in the plugin.
    </dependency>
    <dependency>
        <groupId>com.google.code.gson</groupId>
        <artifactId>gson</artifactId>
gson doesn't seem to be used. Please review and remove if unused.
What changes were proposed in this pull request?
The Audit Server should provide a service that uses Kafka as its queue mechanism, with a Kafka producer that relays audits and stores them in a Kafka topic/partition.
The Audit Server should have a SOLR consumer, running as a separate service, that writes audits to a SOLR index.
The Audit Server should have an HDFS consumer, running as a separate service, that writes audits to HDFS.
All of the above services should be available and tested in Docker containers.
How was this patch tested?