LILAM

Asynchronous Alerting Architecture

Overview

To maintain high throughput (> 2,500 events per second, EPS), LILAM strictly decouples event analysis from notification dispatch. This prevents external latencies (e.g., SMTP handshakes) from impacting the core processing loop.

Responsiveness (RAM-First): LILAM prioritizes immediate alerting over telemetry persistence. Alerts are triggered directly from the RAM-resident Rules Engine via DBMS_ALERT.
High-Throughput Buffering: To maintain > 2,500 EPS, event persistence is decoupled and buffered. Data is flushed to the MONITOR_TABLES asynchronously with a controlled delay (up to 1.8s), ensuring disk I/O never bottlenecks the real-time analysis.

To ensure no alert is ever lost, LILAM follows a Write-then-Signal pattern. When a rule violation is detected, the server immediately persists the alert metadata to the LILAM_ALERTS table before signaling the asynchronous consumer. This guarantees that the consumer always finds a valid record to process upon wakeup, maintaining high reliability even under heavy load.
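
The ordering matters more than the technology, so the pattern can be illustrated outside Oracle. The following is a minimal sketch under stated assumptions, not LILAM's implementation: an in-memory SQLite table stands in for LILAM_ALERTS, a `queue.Queue` stands in for the DBMS_ALERT signal, and the names `raise_alert`, `consumer`, and the column names are hypothetical.

```python
import queue
import sqlite3
import threading

# Stand-ins (assumptions): SQLite plays LILAM_ALERTS, the queue plays DBMS_ALERT.
db = sqlite3.connect(":memory:", check_same_thread=False)
db_lock = threading.Lock()
db.execute(
    "CREATE TABLE lilam_alerts ("
    "alert_id INTEGER PRIMARY KEY, rule_id TEXT, severity TEXT)"
)

signal = queue.Queue()  # carries the id of each persisted alert
dispatched = []         # records what the consumer would have sent


def raise_alert(rule_id, severity):
    """Server side: write the alert row first, signal second."""
    with db_lock:
        cur = db.execute(
            "INSERT INTO lilam_alerts (rule_id, severity) VALUES (?, ?)",
            (rule_id, severity),
        )
        db.commit()
        alert_id = cur.lastrowid
    signal.put(alert_id)  # the signal is sent only after the commit
    return alert_id


def consumer():
    """Consumer side: wake up on a signal, then read the persisted record."""
    while True:
        alert_id = signal.get()
        if alert_id is None:  # shutdown sentinel for this demo
            break
        with db_lock:
            row = db.execute(
                "SELECT rule_id, severity FROM lilam_alerts WHERE alert_id = ?",
                (alert_id,),
            ).fetchone()
        dispatched.append(row)  # a real consumer would send mail here


worker = threading.Thread(target=consumer)
worker.start()
raise_alert("SEQ-003", "WARN")
signal.put(None)
worker.join()
print(dispatched)  # [('SEQ-003', 'WARN')]
```

Because the row is committed before the signal is queued, the consumer's lookup cannot come back empty. Reversing the two steps would reintroduce exactly the race the pattern is meant to exclude.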

Alert Handshake Workflow

sequenceDiagram
    autonumber
    participant App as Application (Business Logic)
    participant Prox as LILAM API (Proxy Package)
    participant Srv as LILAM Server (Background Process)
    participant DB as DATABASE (Persistence)
    participant Cons as Alert Consumer (Async)

    Note over App, Srv: High-Speed Path (Non-blocking)
    App->>Prox: Call trace_start / trace_stop
    Prox->>Srv: Write to DBMS_PIPE (SGA)
    Prox-->>App: Return Control (instantly)

    Note over Srv: Real-Time Analysis (RAM)
    Srv->>Srv: Rules Engine Evaluation
    
    alt Alert Triggered
        Srv->>DB: Write Metadata to LILAM_ALERTS
        Srv-->>Cons: Signal Breach (DBMS_ALERT)
        Note right of Srv: Server continues processing RAM buffer
    end

    Note over Srv, DB: Buffered Persistence (Telemetry)
    Srv->>DB: Asynchronous Bulk Insert (Performance Logs)

    Note over Cons, DB: Async Notification Path
    Cons->>Cons: Wake up on Signal
    Cons->>DB: Read Metadata from LILAM_ALERTS
    Cons->>Cons: Dispatch Notification (Mail/etc.)

Note: LILAM rules are not limited to error detection. They can also be used to track positive business milestones or validate complex event sequences (e.g., “Event B must follow Event A within X seconds”).
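
To make the sequence case concrete, here is a small sketch of an "Event B must follow Event A within X seconds" check. It is an illustration only: the function name, the `(timestamp, name)` event tuples, and the simplifications noted in the comments are assumptions, not LILAM's rule engine.

```python
def find_sequence_violations(events, first, second, max_gap_s):
    """Return the timestamps of `first` events that were not followed
    by a `second` event within `max_gap_s` seconds.

    `events` is an iterable of (timestamp_seconds, name) in time order.
    Simplifications (assumptions): only one `first` event is tracked at
    a time, a timeout is noticed when the next event arrives, and an
    event still waiting at the end of the stream is not reported.
    """
    violations = []
    pending = None  # timestamp of the `first` event still waiting
    for ts, name in events:
        if pending is not None and ts - pending > max_gap_s:
            violations.append(pending)  # the window expired before `second`
            pending = None
        if name == first:
            pending = ts
        elif name == second:
            pending = None  # sequence completed in time (or nothing pending)
    return violations


events = [(0.0, "A"), (2.0, "B"), (10.0, "A"), (18.0, "B")]
print(find_sequence_violations(events, "A", "B", max_gap_s=5))  # [10.0]
```

A production engine would typically need a timer to detect an expired window even when no further event arrives. This sketch skips that for brevity.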

Configuration

Rules define how LILAM validates incoming events. Each rule shares a common set of parameters that specify which event type to monitor, the evaluation criteria to apply, and the corresponding action to take when a rule is triggered (e.g., notifying on a threshold breach or confirming an expected sequence of events). Rules are organized into Rule Sets, which are stored as JSON objects in the LILAM_RULES table. Within these JSON objects, individual rules are managed as structured arrays for efficient processing. See also the Rule Set Sample.

Rule Set Structure

Hooks

Operators

¹ The trigger_type acts as a filter to determine when a rule is evaluated. It maps to core LILAM API calls, such as starting a transaction (TRACE_START), reaching a milestone (MARK_EVENT), or completing a process (PROCESS_STOP).

² The context field allows you to apply rules more selectively. Use it to differentiate between various instances of the same action. This is particularly useful when different thresholds or SLAs apply to specific locations or segments (e.g., a “Speed Limit” rule that only applies to a specific track section). For example, rule SEQ-003 only monitors travel times for the specific track segment SECTION_400_001, rather than every segment on the line.

Deep Dive: Anomaly Detection with EWMA

The AVG_DEVIATION_PCT operator utilizes an Exponentially Weighted Moving Average (EWMA). Unlike a simple arithmetic mean, the EWMA gives more weight to recent data points, allowing the system to adapt to shifting performance trends in real-time.

What is EWMA?

It is a statistical measure used to model time-series data. In LILAM, it creates a “moving baseline” for your business transactions. If a new event deviates significantly from this baseline, an alert is triggered.

Technical Example: `20|100|0.1`

When using AVG_DEVIATION_PCT with the value 20|100|0.1, the parameters are defined as follows:

Parameter	Value	Description
Tolerance	`20`	Trigger an alert if the deviation is > 20% from the average.
Warm-up	`100`	Minimum number of initial events needed to build a stable baseline before alerting starts.
Smoothing (Alpha)	`0.1`	The weight of the latest event (10%). A lower value makes the average more stable; a higher value makes it more reactive to sudden changes.
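
The three parameters can be seen working together in a short sketch. This is an illustration under assumptions, not LILAM's implementation: the class name `EwmaDeviation` is hypothetical, the update rule is the textbook form `average = alpha * value + (1 - alpha) * average`, and the choices to compare a value against the baseline before folding it in and to keep updating the baseline after a breach are guesses.

```python
class EwmaDeviation:
    """Illustrative AVG_DEVIATION_PCT-style check for a 'tolerance|warmup|alpha' spec."""

    def __init__(self, spec):
        tolerance, warmup, alpha = spec.split("|")
        self.tolerance_pct = float(tolerance)  # allowed deviation in percent
        self.warmup = int(warmup)              # events needed before alerting
        self.alpha = float(alpha)              # weight of the latest event
        self.average = None                    # the moving baseline
        self.count = 0                         # events seen so far

    def observe(self, value):
        """Return True if `value` breaches the tolerance, then update the baseline."""
        breach = False
        if self.average is None:
            self.average = float(value)  # the first event seeds the baseline
        else:
            if self.count >= self.warmup and self.average != 0:
                deviation_pct = abs(value - self.average) / abs(self.average) * 100.0
                breach = deviation_pct > self.tolerance_pct
            # Assumption: the baseline keeps adapting even after a breach.
            self.average = self.alpha * value + (1 - self.alpha) * self.average
        self.count += 1
        return breach


detector = EwmaDeviation("20|100|0.1")
for _ in range(100):            # warm-up: 100 events of 1000 ms each
    detector.observe(1000)
print(detector.observe(1100))   # False: 10% deviation is within tolerance
print(detector.observe(1300))   # True: about 29% above the baseline
```

With alpha at 0.1, the 1100 ms event moves the baseline only to roughly 1010 ms, which is why the following 1300 ms event still stands out clearly.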

    {
      "id": "SEQ-003",
      "_comment": "The trip does not take more than 25 seconds. Something has held up the train.",
      "trigger_type": "TRACE_STOP",
      "action": "TRACK_SECTION",
      "context": "SECTION_400_001",
      "condition": {
        "operator": "MAX_DURATION_MS",
        "value": "25000"
      },
      "alert": { "handler": "MAIL_LOG", "severity": "WARN", "throttle_seconds": 0 }
    }
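
How such a rule might be applied to an incoming event can be sketched as follows. This is a hedged illustration, not LILAM's engine: the event fields (`trigger_type`, `action`, `context`, `duration_ms`) and the functions `rule_applies` and `evaluate` are hypothetical, and treating a missing `context` as "match every context" is an assumption.

```python
import json

RULE_JSON = """
{
  "id": "SEQ-003",
  "trigger_type": "TRACE_STOP",
  "action": "TRACK_SECTION",
  "context": "SECTION_400_001",
  "condition": {"operator": "MAX_DURATION_MS", "value": "25000"},
  "alert": {"handler": "MAIL_LOG", "severity": "WARN", "throttle_seconds": 0}
}
"""


def rule_applies(rule, event):
    """Filter step: trigger_type and action must match, context narrows further."""
    if rule["trigger_type"] != event["trigger_type"]:
        return False
    if rule["action"] != event["action"]:
        return False
    context = rule.get("context")  # assumption: no context means "any context"
    return context is None or context == event.get("context")


def evaluate(rule, event):
    """Return the rule's alert settings if the event violates it, else None."""
    if not rule_applies(rule, event):
        return None
    condition = rule["condition"]
    if condition["operator"] == "MAX_DURATION_MS":
        limit_ms = int(condition["value"])  # the sample stores the value as a string
        return rule["alert"] if event["duration_ms"] > limit_ms else None
    raise ValueError("operator not covered by this sketch: " + condition["operator"])


rule = json.loads(RULE_JSON)
slow_trip = {
    "trigger_type": "TRACE_STOP",
    "action": "TRACK_SECTION",
    "context": "SECTION_400_001",
    "duration_ms": 31000,
}
print(evaluate(rule, slow_trip)["severity"])  # WARN
```

The same 31-second trip on a different track section returns `None`, because the `context` filter excludes it before the duration is ever compared.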

Table: LILAM_RULES

This table serves as the central repository for all rule sets. Each rule set is stored as a single, versioned JSON document, allowing for flexible and dynamic rule management.

Column	Type	Description
SET_NAME	`VARCHAR2(30)`	Primary Key. The unique identifier for the rule set.
VERSION	`NUMBER`	Version number to support testing, staging, and rollbacks.
RULE_SET	`CLOB`	The core configuration: A JSON object containing the header and the array of rule definitions.
CREATED	`TIMESTAMP`	Audit timestamp: When this specific version was created.
AUTHOR	`VARCHAR2(50)`	Attribution: The developer or architect who defined the rule set.

**Implementation Note:** The LILAM servers load the RULE_SET JSON into RAM at startup (or upon manual refresh). This minimizes database I/O during high-speed event processing, as all rule evaluations are performed against the cached in-memory structure.
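
The caching idea can be illustrated with a short sketch. It rests on assumptions: the top-level keys `header` and `rules` are hypothetical names for the header and rule array described above, and `RuleCache` is not part of LILAM.

```python
import json
from collections import defaultdict


class RuleCache:
    """Keeps a parsed rule set in memory, indexed by trigger_type."""

    def __init__(self):
        self._by_trigger = {}

    def refresh(self, rule_set_json):
        """Parse the RULE_SET document once and rebuild the index."""
        document = json.loads(rule_set_json)
        index = defaultdict(list)
        for rule in document["rules"]:  # "rules" is an assumed key name
            index[rule["trigger_type"]].append(rule)
        # Replace the old index in a single assignment so lookups never
        # observe a half-built structure.
        self._by_trigger = dict(index)

    def candidates(self, trigger_type):
        """Return the rules to evaluate for an event, without touching the database."""
        return self._by_trigger.get(trigger_type, [])


SAMPLE = """
{
  "header": {"name": "demo"},
  "rules": [
    {"id": "SEQ-003", "trigger_type": "TRACE_STOP"},
    {"id": "SEQ-004", "trigger_type": "MARK_EVENT"}
  ]
}
"""

cache = RuleCache()
cache.refresh(SAMPLE)
print([rule["id"] for rule in cache.candidates("TRACE_STOP")])  # ['SEQ-003']
```

Indexing by `trigger_type` at load time means each incoming event only visits the rules that could apply to it, rather than scanning the whole array.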

Loading a Rule Set

LILAM servers support dynamic rule set updates at runtime. Active configurations are persisted in the LILAM_SERVER_REGISTRY, ensuring that servers automatically reload the correct rule sets upon restart:

exec LILAM.SERVER_UPDATE_RULES(p_processId => 1202, p_ruleSetName => 'METRO Rules', p_ruleSetVersion => 2);
