UNOMI-139 UNOMI-880 UNOMI-878 UNOMI-884: Multi-tenant platform core #760

Merged: sergehuber merged 6 commits into master from UNOMI-139-platform on Jun 6, 2026
@sergehuber sergehuber commented May 18, 2026


Stacked PR (merge order)

| Role | Branch |
|---|---|
| Base (merge into) | UNOMI-888-import-export-javadoc |
| Head (this PR) | UNOMI-139-platform |

This PR is stacked: it merges into UNOMI-888-import-export-javadoc (PR #756, UNOMI-888) and should land only after that branch has been merged or rebased. Downstream PRs (UNOMI-904 V2 compat, dev shell, additional ITs, manual) will stack on UNOMI-139-platform.

For more information (each PR targets the branch below until the bottom merges): https://github.github.com/gh-stack/introduction/overview/


JIRA

Summary

Delivers the Unomi 3.1 platform core on top of UNOMI-888: multi-tenant execution, unified caching, cluster-aware scheduling, and V3 migration assets. Services, persistence, GraphQL, and REST are wired for V3 tenant resolution (API keys / roles).

Supersedes the platform scope of PR #757 (closed; monolith branch UNOMI-139-UNOMI-880-multitenancy). UNOMI-904 (V2 compatibility), dev shell commands, additional IT classes, and manual updates will follow in stacked PRs on this line.

UNOMI-139 Multi-tenancy

  • Tenant model and APIs (Tenant, TenantService, API keys, security/audit interfaces, quotas, lifecycle hooks) and tenantId on core types.
  • ExecutionContextManager and thread-local tenant context so services, persistence, and REST resolve the active tenant consistently.
  • Tenant-scoped persistence and definitions (Elasticsearch/OpenSearch), including system-tenant vs tenant-specific overrides.
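The thread-local tenant context described above can be pictured with a minimal sketch. `TenantContext`, `executeAs`, and `currentTenant` are hypothetical names chosen for illustration, not the actual `ExecutionContextManager` API; only the `"system"` tenant ID and the `executeAsSystem()` method name appear elsewhere in this PR.

```java
import java.util.function.Supplier;

// Hypothetical sketch of a thread-local tenant context; not Unomi's real API.
final class TenantContext {
    static final String SYSTEM_TENANT = "system";

    // Each thread sees its own active tenant; null means no tenant is set.
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private TenantContext() {}

    static String currentTenant() {
        return CURRENT.get();
    }

    // Runs the work with the given tenant active, then restores the previous one,
    // even if the work throws.
    static <T> T executeAs(String tenantId, Supplier<T> work) {
        String previous = CURRENT.get();
        CURRENT.set(tenantId);
        try {
            return work.get();
        } finally {
            if (previous == null) {
                CURRENT.remove();
            } else {
                CURRENT.set(previous);
            }
        }
    }

    static <T> T executeAsSystem(Supplier<T> work) {
        return executeAs(SYSTEM_TENANT, work);
    }
}
```

Restoring the previous value in `finally` is what lets nested calls (a tenant request that briefly needs system access) unwind to the right tenant.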

UNOMI-880 Unified multi-tenant caching

  • MultiTypeCacheService / CacheableTypeConfig and AbstractMultiTypeCachingService for shared cache lifecycle, predefined item bundling, periodic refresh, and statistics.
  • Migrate DefinitionsServiceImpl and SegmentServiceImpl (and related paths) onto the unified cache; remove duplicated ad-hoc cache listeners where superseded.
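As a rough illustration of a cache shared across item types and tenants, the sketch below keys entries by tenant, type, and ID, and falls back to the system tenant when a tenant has no override. `TenantTypeCache` and its methods are hypothetical; the real `MultiTypeCacheService` API may look quite different, and the fallback rule is an assumption based on the system-tenant inheritance mentioned in this PR.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; MultiTypeCacheService's real API may differ.
final class TenantTypeCache {
    static final String SYSTEM_TENANT = "system";

    // tenantId -> item type -> item id -> cached value
    private final Map<String, Map<Class<?>, Map<String, Object>>> entries = new ConcurrentHashMap<>();

    <T> void put(String tenantId, Class<T> type, String id, T value) {
        entries.computeIfAbsent(tenantId, k -> new ConcurrentHashMap<>())
               .computeIfAbsent(type, k -> new ConcurrentHashMap<>())
               .put(id, value);
    }

    // Tenant-specific entries win; otherwise the system tenant's entry is inherited.
    <T> Optional<T> get(String tenantId, Class<T> type, String id) {
        Optional<T> own = lookup(tenantId, type, id);
        return own.isPresent() ? own : lookup(SYSTEM_TENANT, type, id);
    }

    private <T> Optional<T> lookup(String tenantId, Class<T> type, String id) {
        Map<Class<?>, Map<String, Object>> byType = entries.get(tenantId);
        if (byType == null) {
            return Optional.empty();
        }
        Map<String, Object> byId = byType.get(type);
        if (byId == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(type.cast(byId.get(id)));
    }
}
```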

UNOMI-878 Cluster-aware task scheduling

  • Replace legacy scheduler usage with SchedulerService (scheduled tasks, task executors, persistence-backed vs in-memory tasks, cluster coordination).
  • Integrate periodic work (cache refresh, cluster heartbeat, router scheduling, rule refresh paths) with the new scheduler.
  • Update dependents (e.g. Groovy actions service) to schedule work through the new API.

UNOMI-884 Migration to V3

  • Migration Groovy scripts and request bodies for 3.1.0 (tenant document IDs, system item ID fixes, tenant initialization, legacy queryBuilder updates).
  • Extended MigrationUtils and MigrationUtilsTest for the new steps aligned with multi-tenancy and index updates.

Tests

  • Platform integration tests updated (BaseIT, migration ITs, ScopeIT, and related suites).
  • Full integration test run verified locally with OpenSearch (./build.sh --integration-tests --use-opensearch).

Licence

@sergehuber sergehuber changed the title from "UNOMI-139/880/878/884: Multi-tenant platform core" to "UNOMI-139 UNOMI-880 UNOMI-878 UNOMI-884: Multi-tenant platform core" May 18, 2026
@sergehuber sergehuber requested a review from Copilot May 19, 2026 13:43

Copilot AI left a comment

Copilot wasn't able to review this pull request because it exceeds the maximum number of lines (20,000). Try reducing the number of changed lines and requesting a review from Copilot again.

@sergehuber sergehuber force-pushed the UNOMI-139-platform branch from 4c5828e to d2093d1 June 2, 2026 17:47
@sergehuber sergehuber changed the base branch from UNOMI-888-import-export-javadoc to master June 2, 2026 17:51
@sergehuber sergehuber force-pushed the UNOMI-139-platform branch from d2093d1 to 6f3f768 June 2, 2026 18:29
Log the full event chain once per collapsed stack pattern instead of on
every blocked send(), avoiding log storms during rule recursion loops.
Add no-arg constructor and setUpdated() so clients can deserialize POST
/cxs/eventcollector JSON (aligned with unomi-3-dev, without tracing fields).
@sergehuber sergehuber force-pushed the UNOMI-139-platform branch from 6f3f768 to c36e1fd June 2, 2026 19:44
@sergehuber sergehuber closed this Jun 3, 2026
@sergehuber sergehuber deleted the UNOMI-139-platform branch June 3, 2026 05:46
@sergehuber sergehuber restored the UNOMI-139-platform branch June 3, 2026 05:54
@sergehuber sergehuber reopened this Jun 3, 2026
Increase retry budget from 10 to 20 for the property type removal poll.
ES consistently takes ~11s to propagate the deletion, exceeding the 10s limit.
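The arithmetic behind this change can be sketched as a bounded poll. Assuming one attempt per second (inferred from "10 retries" matching "the 10s limit"), 10 attempts give up just before an ~11 s propagation completes, while 20 leave headroom. `Polling.waitForNull` is a hypothetical helper written for this illustration, not the test suite's actual utility.

```java
import java.util.function.Supplier;

// Hypothetical polling helper; names and signature are illustrative only.
final class Polling {
    private Polling() {}

    // Polls until the supplier returns null or the retry budget is spent.
    // Returns true if null was observed within the budget.
    static <T> boolean waitForNull(Supplier<T> fetch, int maxRetries, long intervalMillis) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            if (fetch.get() == null) {
                return true;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop polling.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

With a value that disappears only after the eleventh check, a budget of 10 reports failure and a budget of 20 succeeds, which mirrors the flaky behaviour described in the commit message.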

@sergehuber sergehuber left a comment

Code Review: UNOMI-139 Multi-tenant platform core

Scope: 292 files, +30 486 / −4 093 lines (multi-tenancy, unified caching, cluster scheduling, V3 migration).

| # | Severity | Finding |
|---|---|---|
| C1 | CRITICAL | hasTenantAccess() ignores tenantId: any tenant key accesses all tenants |
| C2 | CRITICAL | Unknown X-Unomi-Tenant-Id header grants full system context to JAAS users |
| C3 | CRITICAL | Private API key committed in plaintext to migration script (permanently in git history) |
| C4 | CRITICAL | ApiKey.key serialised to ES and returned by REST with no @JsonIgnore |
| H1 | HIGH | executeAsSystem() finally block suppresses original exception on dual failure |
| H2 | HIGH | Migration step key uses itemType: multiple rollover indices silently skipped |
| I1 | IMPORTANT | SecurityService.SYSTEM_TENANT = "SYSTEM_TENANT" conflicts with all other constants ("system") |
| I2 | IMPORTANT | isPublicOperation() throws IndexOutOfBoundsException on empty/fragment-only documents |
| I3 | IMPORTANT | migrate-3.1.0-00-tenantDocumentIds.groovy conflicts with existing 00-fixProfileNbOfVisits on master |
| I4 | IMPORTANT | ScheduledTask implements Serializable without serialVersionUID |
| I5 | IMPORTANT | getCurrentRoles() is dead code calling AccessController, which has been deprecated for removal since Java 17 |
| I6 | IMPORTANT | isOperatingOnSystemTenant() stub always returns false |
| A1 | Advisory | @author tag in SchedulerServiceImpl (ASF policy) |

All findings ≥ 80 % confidence. Inline suggestions attached. C1–C4 should be resolved before merge.
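For finding H1, one common way to avoid losing the original exception when the cleanup in `finally` also fails is to attach the cleanup failure with `Throwable.addSuppressed`. The sketch below is a generic illustration under that assumption; `SafeRestore.runThenRestore` is a hypothetical helper, not the fix applied in this PR.

```java
import java.util.function.Supplier;

// Hypothetical helper showing the addSuppressed pattern; not Unomi code.
final class SafeRestore {
    private SafeRestore() {}

    // Runs work, then restore. If both fail, the work's exception propagates
    // and the restore failure is attached to it as a suppressed exception.
    static <T> T runThenRestore(Supplier<T> work, Runnable restore) {
        RuntimeException primary = null;
        try {
            return work.get();
        } catch (RuntimeException e) {
            primary = e;
            throw e;
        } finally {
            try {
                restore.run();
            } catch (RuntimeException cleanupFailure) {
                if (primary == null) {
                    // Only the cleanup failed, so it is the error to report.
                    throw cleanupFailure;
                }
                primary.addSuppressed(cleanupFailure);
            }
        }
    }
}
```

A caller that catches the primary exception can still inspect `getSuppressed()` to see that context restoration failed as well.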

Comment thread api/src/main/java/org/apache/unomi/api/tenants/ApiKey.java
Comment thread api/src/main/java/org/apache/unomi/api/tasks/ScheduledTask.java
…d log4j2 improvements

Fix several issues raised in PR #760 review:

- [I1] Fix SYSTEM_TENANT constant value ("SYSTEM_TENANT" → "system") so
  system tenant context resolution works correctly across all services

- [I4] Add missing serialVersionUID to ScheduledTask to prevent
  deserialization issues across cluster nodes

- Fix PatchIT.testRemove flaky timeout: polling for PropertyType removal
  must run inside executeAsSystem() context. Because system PropertyTypes
  are inherited by other tenants, polling outside the system tenant context
  would still observe the item through inheritance after deletion.

- Fix cache refresh race condition in AbstractMultiTypeCachingService:
  scheduled and manual refreshes were allowed to run concurrently
  (allowParallelExecution defaults to true in the scheduler). Add
  disallowParallelExecution() to prevent a scheduled refresh from
  re-populating the cache with stale data during a manual removal.

- Add comprehensive unit tests for keepTrying(), waitForNullValue() and
  shouldBeTrueUntilEnd() polling utilities in PollingUtilsTest (558 lines,
  47 tests, 100% coverage)

- Fix infinite loop bug in keepTrying() when predicate is satisfied on a
  null value (do-while pattern ensures predicate is tested after fetch)

- Improve log4j2 configuration:
  - Separate patterns for console (colors) and file (plain text)
  - Configurable message truncation via
    org.apache.unomi.logs.message.max.length (defaults to 500 chars,
    security feature against log injection)
  - Shorter bundle info (B <id> instead of id/name/version)
  - Thread name truncated from beginning (%-16.16t) to preserve
    the significant suffix
  - Disable all SafeExtendedThrowablePatternConverter transformations
    via --disable-log-truncation flag in build.sh for debugging

- Add build.sh flags: --disable-log-truncation, --maven-quiet,
  --search-heap, --karaf-heap
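The cache refresh race fixed above comes down to two refreshes overlapping. As a minimal sketch of the idea, the wrapper below skips a run while another one is in progress, using a compare-and-set flag. This is a swapped-in, simplified technique: `NonOverlappingTask` is hypothetical, and the scheduler's `disallowParallelExecution()` may serialise or queue runs rather than skip them.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical wrapper that prevents overlapping runs of the same task.
final class NonOverlappingTask implements Runnable {
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final Runnable delegate;

    NonOverlappingTask(Runnable delegate) {
        this.delegate = delegate;
    }

    @Override
    public void run() {
        tryRun();
    }

    // Returns false when another run is already in progress and this call was skipped.
    boolean tryRun() {
        if (!running.compareAndSet(false, true)) {
            return false;
        }
        try {
            delegate.run();
            return true;
        } finally {
            // Always release the flag so later runs are not blocked forever.
            running.set(false);
        }
    }
}
```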
…safety issues

- ExecutionContext.setTenant/restorePreviousTenant: recompute isSystem from
  tenantId so a system context cannot persist after switching to a tenant (SEC-3)
- TaskStateManager: retryDelay is in milliseconds, remove erroneous toMillis()
  conversion that produced multi-year retry delays (UNOMI-939)
- TaskExecutionManager: remove "Simulated crash" sentinel that swallowed real
  exceptions and left tasks permanently locked (UNOMI-939)
- TenantQuotaService.checkQuota: null guard for unconfigured quota avoids NPE
  on new tenants (UNOMI-942)
- AuditServiceImpl.getModifiedItems: fill empty if-body so a null
  persistenceService returns early instead of falling through to NPE
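The first fix in this list (SEC-3) is about keeping a derived flag in sync with the tenant ID. A minimal sketch of that invariant follows, assuming a single private method updates both fields. `ExecutionContextSketch` is a hypothetical class; only the method names `setTenant` and `restorePreviousTenant` and the `"system"` ID are taken from this PR.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Objects;

// Hypothetical sketch; the real ExecutionContext has more state than this.
final class ExecutionContextSketch {
    static final String SYSTEM_TENANT = "system";

    private final Deque<String> previousTenants = new ArrayDeque<>();
    private String tenantId;
    private boolean system;

    ExecutionContextSketch(String tenantId) {
        apply(tenantId);
    }

    void setTenant(String newTenantId) {
        previousTenants.push(tenantId);
        apply(newTenantId);
    }

    void restorePreviousTenant() {
        if (!previousTenants.isEmpty()) {
            apply(previousTenants.pop());
        }
    }

    boolean isSystem() {
        return system;
    }

    String getTenantId() {
        return tenantId;
    }

    // Single place where both fields change, so the system flag cannot go stale.
    private void apply(String newTenantId) {
        this.tenantId = Objects.requireNonNull(newTenantId, "tenantId");
        this.system = SYSTEM_TENANT.equals(newTenantId);
    }
}
```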
@sergehuber sergehuber merged commit 02ab3c6 into master Jun 6, 2026
6 checks passed
@sergehuber sergehuber deleted the UNOMI-139-platform branch June 6, 2026 06:22