Part 10 — Defender Bridge

~ lululufr

CONTENTS

  0  why bridge defender
  1  two files, three hundred lines
  2  evtsubscribe — the api surface
  3  the callback and the lifetime model
  4  rendering the event xml
  5  routing by event id
  6  known limitations

──[ 0. Why Bridge Defender ]──

Windows Defender's antimalware engine ships with the OS, is enabled by default 
on a stock installation, and writes its detection output to a stable Event Log 
channel. Re-implementing detection inside our agent would duplicate work that 
Defender already does well; bridging Defender's output into our event stream is 
a small, well-scoped piece of work that yields a baseline of malware-detection 
telemetry for free.

The plugin has a second purpose: it is the canonical **template** for 
callback-driven plugins. The shape — open the source's subscription API, 
register a callback, forward events through `EventSink`, park the main thread 
until shutdown — is the same shape any push-driven plugin will take, from ETW 
(Event Tracing for Windows) consumers to Win32 API hooks to custom application 
sources. Read this post as a worked example of that pattern as much as a 
Defender-specific bridge.

──[ 1. Two Files, Three Hundred Lines ]──

The whole plugin is in `WazabiEDR_Plugin_DefenderBridge/`:

   src/
   ├── main.rs        ~30 lines — SDK wiring
   └── eventlog.rs    ~250 lines — EvtSubscribe wrapper + XML render

`main.rs` is short because the SDK (Part 8) absorbs the protocol work. Here it 
is whole:

mod eventlog;

use wedr_plugin_sdk::{Result, metadata, run_with};

const PLUGIN_NAME: &str = "wedr-defender-bridge";

fn main() -> Result<()> {
    let metadata = metadata!(PLUGIN_NAME)?;
    run_with(metadata, |sink, ctx| {
        let _subscription = eventlog::Subscription::start(sink.clone())?;
        eprintln!("[{PLUGIN_NAME}] subscribed to Defender/Operational");
        ctx.wait_for_shutdown();
        eprintln!("[{PLUGIN_NAME}] shutdown signalled, unsubscribing");
        Ok(())
    })
}

The non-obvious detail is the ordering inside the closure.

The subscription is started first; it returns an RAII handle (RAII: Resource 
Acquisition Is Initialisation, the pattern of binding a resource's lifetime to a 
stack variable's scope so that the destructor releases the resource 
deterministically). The handle is held on the stack via `let _subscription = …`. 
Then `wait_for_shutdown` parks the thread. On wake-up, the closure returns, the 
handle drops (which tells the Event Log to stop dispatching callbacks), and 
*then* the SDK tears down the pipe.

The reverse order would race: if the SDK tore down the pipe before the 
subscription stopped dispatching, an in-flight callback's `sink.emit` could 
write to a dead pipe. Holding the subscription as a local inside the closure 
pins the drop order: the subscription drops when the closure returns, and the 
SDK tears down the sink only after that.
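The ordering argument can be checked in isolation. The sketch below uses hypothetical stand-ins (`Subscription`, `run_with` and a shared log are all invented here and are not the SDK's real types) to show that a guard held as a local in the closure drops before the caller's teardown code runs:

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical stand-in for `eventlog::Subscription`.
struct Subscription {
    log: Log,
}

impl Drop for Subscription {
    fn drop(&mut self) {
        // In the real plugin this is where the subscription handle is closed.
        self.log.borrow_mut().push("subscription closed");
    }
}

/// Hypothetical stand-in for the SDK entry point: run the plugin body,
/// then tear down the pipe once the body has returned.
fn run_with(log: Log, body: impl FnOnce(Log)) {
    body(log.clone());
    log.borrow_mut().push("pipe torn down");
}

/// Runs the simulated plugin and returns the recorded order of steps.
fn shutdown_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    run_with(log.clone(), |log| {
        let _subscription = Subscription { log: log.clone() };
        log.borrow_mut().push("shutdown signalled");
        // `_subscription` drops here, when the closure body ends.
    });
    let steps = log.borrow().clone();
    steps
}
```

Calling `shutdown_order()` yields the three steps in the order the section describes: shutdown signalled, subscription closed, pipe torn down.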

──[ 2. EvtSubscribe — the API Surface ]──

The Windows Event Log API (exposed through `wevtapi.dll`, the Vista-era 
replacement for the legacy Event Logging API in `advapi32.dll`) supports two 
subscription models: pull, driven by a signal event, and push, driven by a 
callback. We use the callback flavour:

let handle = unsafe {
    EvtSubscribe(
        0,                          // local session
        ptr::null_mut(),            // no signal event — using callback
        channel_w.as_ptr(),         // "Microsoft-Windows-Windows Defender/Operational"
        ptr::null(),                // no XPath filter — deliver every event
        0,                          // no bookmark
        ctx as *const c_void,
        Some(callback),
        SUBSCRIBE_FUTURE_EVENTS,
    )
};

`SUBSCRIBE_FUTURE_EVENTS` (the `EvtSubscribeToFutureEvents` flag value) tells 
the API to deliver only events that occur from this point onward; it does not 
replay history. For an EDR plugin this is the right default — we don't want a 
fresh plugin start to flood the agent with last week's Defender events. Replay 
is supported via bookmarks (an opaque handle returned by `EvtCreateBookmark` 
that records "I have seen up to this event") but is out of scope for the v1 of 
this plugin.
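The excerpt does not show how `channel_w` is built. A minimal sketch, assuming it is a NUL-terminated UTF-16 buffer as wide-character Win32 parameters expect (the helper name `to_wide` is hypothetical):

```rust
/// Encode a Rust string as a NUL-terminated UTF-16 buffer, the form
/// expected by wide-character (`LPCWSTR`) parameters.
fn to_wide(s: &str) -> Vec<u16> {
    s.encode_utf16().chain(std::iter::once(0)).collect()
}

/// The channel path used by the subscription, in wide form.
fn defender_channel_wide() -> Vec<u16> {
    to_wide("Microsoft-Windows-Windows Defender/Operational")
}
```

The returned `Vec<u16>` has to stay alive until the subscribe call returns, since only its raw pointer is handed to the API.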

The `ctx as *const c_void` is the standard idiom for smuggling Rust data into a 
C-callback API. Windows passes the pointer back verbatim on every dispatch:

struct CallbackCtx {
    sink: EventSink,
}

We `Box::into_raw` the context once before passing it to `EvtSubscribe`, and the 
callback dereferences it back to a Rust reference. The lifetime of that 
allocation needs care — see the next section.
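The round trip through an opaque pointer can be sketched without any Windows API. Everything here is a hypothetical stand-in: `DemoCtx` replaces `CallbackCtx`, and `demo_callback` is invoked directly where the OS would normally call back.

```rust
use std::ffi::c_void;

/// Hypothetical stand-in for `CallbackCtx`; the real one holds an `EventSink`.
struct DemoCtx {
    label: String,
}

/// A C-style callback: all it receives is the opaque pointer it was
/// registered with.
unsafe extern "C" fn demo_callback(user_ctx: *const c_void) -> usize {
    if user_ctx.is_null() {
        return 0;
    }
    // SAFETY: the pointer came from `Box::into_raw` and has not been freed.
    let ctx = unsafe { &*(user_ctx as *const DemoCtx) };
    ctx.label.len()
}

/// Box the context, pass it through a raw pointer, and read it back.
fn round_trip(label: &str) -> usize {
    let raw: *mut DemoCtx = Box::into_raw(Box::new(DemoCtx {
        label: label.to_owned(),
    }));
    // Stand-in for the OS invoking the callback with the registered context.
    let seen = unsafe { demo_callback(raw as *const c_void) };
    // This demo reclaims the allocation; whether and when real code can do
    // so depends on when callbacks are guaranteed to have stopped.
    drop(unsafe { Box::from_raw(raw) });
    seen
}
```

The callback only ever borrows the context (`&*ptr`); ownership stays with whoever called `Box::into_raw`.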

──[ 3. The Callback and the Lifetime Model ]──

The Event Log API dispatches callbacks on its own internal threads. The 
documentation states that `EvtClose` drains in-flight callbacks before 
returning, but the contract is described loosely enough that production code 
typically takes a defensive posture against a stray callback executing after 
close.

impl Drop for Subscription {
    fn drop(&mut self) {
        unsafe { EvtClose(self.handle); }
    }
}

The `CallbackCtx` allocation is **intentionally leaked**. The plugin process is 
exiting after `Subscription` drops, and the cost of leaking one `EventSink` 
handle plus a small wrapper struct (a few words of memory) is negligible 
compared to the cost of a use-after-free if `EvtClose` returns before draining a 
still-in-flight callback. The alternative, refcounting the context with `Arc` so 
that a stray callback on an Event Log thread can safely keep using it, is more 
code and adds one more thing that can be subtly wrong.

The callback itself is short:

unsafe extern "system" fn callback(
    action: i32,
    user_ctx: *const c_void,
    event: EVT_HANDLE,
) -> u32 {
    if action != ACTION_DELIVER { return 0; }
    if user_ctx.is_null() { return 0; }
    let ctx = unsafe { &*(user_ctx as *const CallbackCtx) };

    let xml = match render_event_xml(event) {
        Ok(s) => s,
        Err(_) => return 0,
    };

    let event_id = extract_event_id(&xml).unwrap_or(0);
    let kind = kind_for(event_id);

    let payload = json!({
        "event_id": event_id,
        "raw_xml":  xml,
    });

    let _ = ctx.sink.emit(&kind, payload);
    0
}

`action == 0` is the API's error-delivery variant — Windows calls the same 
callback to inform us that the subscription itself has a problem (channel went 
away, etc.). We drop those because re-emitting them as plugin events would 
create a feedback loop: every error becomes another event becomes another error.

The callback returns 0 unconditionally. The `EVT_SUBSCRIBE_CALLBACK` 
documentation states that the service ignores the return code, so there is no 
way to report a per-event failure back to the API, and a single render failure 
should not poison the entire subscription anyway: drop the event and continue. 
The swallowed error in `let _ = ctx.sink.emit(...)` follows the same reasoning: 
if the pipe is broken, the plugin's main thread will discover that and exit; the 
callback in an Event Log thread cannot productively react.

──[ 4. Rendering the Event XML ]──

`EvtRender` follows the standard Win32 two-call probe-then-render pattern. First 
call with an empty buffer to learn the required size, second call with a buffer 
of that size:

fn render_event_xml(event: EVT_HANDLE) -> std::io::Result<String> {
    let mut buffer_used: u32 = 0;
    let mut property_count: u32 = 0;

    unsafe {
        let _ = EvtRender(
            0, event, RENDER_EVENT_XML, 0, ptr::null_mut(),
            &mut buffer_used, &mut property_count,
        );
        let err = GetLastError();
        if err != ERROR_INSUFFICIENT_BUFFER {
            return Err(std::io::Error::from_raw_os_error(err as i32));
        }
    }

    let u16_cap = (buffer_used as usize).div_ceil(2);
    let mut buf: Vec<u16> = vec![0; u16_cap];

    let ok = unsafe {
        EvtRender(
            0, event, RENDER_EVENT_XML, buffer_used,
            buf.as_mut_ptr() as *mut c_void,
            &mut buffer_used, &mut property_count,
        )
    };
    if ok == 0 {
        return Err(std::io::Error::last_os_error());
    }
    // … UTF-16 buf → String, trim trailing NUL …
}
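The final step is elided in the excerpt above. One possible shape for it, as a sketch only: the helper name `utf16_buf_to_string` is hypothetical, and it assumes the caller passes the byte count reported by the render call.

```rust
/// Convert a rendered UTF-16 buffer into a `String`.
/// `bytes_used` is a byte count, so it is halved to get UTF-16 code units.
fn utf16_buf_to_string(buf: &[u16], bytes_used: usize) -> String {
    // Clamp in case the reported count exceeds the buffer we actually hold.
    let units = (bytes_used / 2).min(buf.len());
    let slice = &buf[..units];
    // Stop at the first NUL so the terminator is not part of the result.
    let end = slice.iter().position(|&c| c == 0).unwrap_or(slice.len());
    // Lossy decoding replaces unpaired surrogates instead of failing.
    String::from_utf16_lossy(&slice[..end])
}
```

Lossy decoding is a design choice here: a malformed code unit costs one replacement character rather than the whole event.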

One unit-conversion landmine: `buffer_used` is reported in **bytes** even though 
the rendered output is UTF-16. The `Vec<u16>` length is therefore `div_ceil(2)` 
of the byte count, not the byte count itself. The MSDN documentation says 
"bytes" but it's easy to miss.

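The second call succeeds because the buffer was sized from the probe, and 
Defender events are small (under 4 KiB each), so the allocation is cheap. A 
delivered event is a fixed record, so its rendered size should not change 
between the probe and the render. Channels with much larger events (Security 
with audit-policy detail, or any channel carrying inline PowerShell command 
logs) mainly raise the allocation cost; a retry loop on 
`ERROR_INSUFFICIENT_BUFFER` is still cheap insurance for production code.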
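A retry loop around the probe-then-render pair could be structured as below. This is a sketch against a stand-in closure rather than `EvtRender` itself; `Attempt` and `render_with_retry` are hypothetical names, and the closure models a call that either fills the buffer or reports how many bytes it needs.

```rust
/// Outcome of one render attempt: bytes written, or bytes required.
enum Attempt {
    Done(usize),
    NeedBytes(usize),
}

/// Call `render` with a growing buffer until it succeeds or the attempt
/// budget is exhausted. The first call sees an empty buffer and so acts
/// as the size probe.
fn render_with_retry<F>(mut render: F, max_attempts: usize) -> Option<Vec<u16>>
where
    F: FnMut(&mut [u16]) -> Attempt,
{
    let mut buf: Vec<u16> = Vec::new();
    for _ in 0..max_attempts {
        match render(buf.as_mut_slice()) {
            Attempt::Done(bytes) => {
                // Byte count to UTF-16 code units, rounding up.
                buf.truncate(bytes.div_ceil(2));
                return Some(buf);
            }
            Attempt::NeedBytes(bytes) => {
                // Resize to exactly what the call asked for and try again.
                buf.resize(bytes.div_ceil(2), 0);
            }
        }
    }
    None
}
```

Bounding the loop with `max_attempts` keeps a misbehaving source from spinning an Event Log thread forever.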

──[ 5. Routing by Event ID ]──

The `kind` field on the wire is the agent's primary routing key for plugin 
events. The plugin selects it by parsing the EventID out of the rendered XML:

fn kind_for(event_id: u32) -> String {
    match event_id {
        1116 | 1117 => "defender.threat_detected".into(),
        1118 | 1119 => "defender.threat_remediation_failed".into(),
        5001        => "defender.realtime_protection_disabled".into(),
        5004        => "defender.realtime_protection_config_change".into(),
        2000 | 2001 => "defender.engine_definition_update".into(),
        other       => format!("defender.event_{other}"),
    }
}

A handful of explicit mappings for events that drive downstream alerting, then a 
fallback to `defender.event_N` so nothing is silently dropped. Consumers that 
don't care about a specific EventID see a single bucket; consumers that do (a 
server-side rule on EventID 5001) match the dedicated `kind`.

The parsing is a substring search for `<EventID>N</EventID>` rather than a full 
XML parser. Defender's XML schema is stable across Windows 10 and Windows 11, 
the `EventID` element always appears inside `<System>`, and a substring search 
is faster than instantiating an XML parser on every callback invocation.
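The plugin's own helper is not shown, so the following is only one possible shape for it. In rendered event XML the ID is the text of the `<EventID>` element under `<System>`; the sketch also tolerates an attribute on the opening tag (such as `Qualifiers`), which some providers emit.

```rust
/// Pull the numeric event ID out of rendered event XML with a substring
/// search. Accepts `<EventID>1116</EventID>` as well as
/// `<EventID Qualifiers="0">1116</EventID>`.
fn extract_event_id(xml: &str) -> Option<u32> {
    let open = xml.find("<EventID")?;
    let after_tag = &xml[open..];
    // Skip to the end of the opening tag, past any attributes.
    let start = after_tag.find('>')? + 1;
    let rest = &after_tag[start..];
    let end = rest.find("</EventID>")?;
    rest[..end].trim().parse().ok()
}
```

Returning `Option<u32>` matches how the callback uses it: `unwrap_or(0)` turns any parse miss into event ID 0, which then falls through to the `defender.event_0` bucket.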

EventID 5001 (real-time protection disabled) gets dedicated treatment because it 
is the canary for an antimalware-tampering attack: a piece of malware that 
disables Defender as a precondition for its actual payload will produce a 5001 
before any 1116/1117 detection could fire. The dedicated `kind` lets a 
server-side rule react to "antivirus just got turned off on host X" with higher 
priority than ordinary detections.
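On the consuming side, a rule keyed on `kind` might look like the sketch below. The `Priority` levels and the `priority_for` function are hypothetical and not part of the agent or server described in this series; only the `kind` strings come from `kind_for` above.

```rust
/// Hypothetical priority levels for a server-side rule engine.
/// Variant order defines the ordering: Low < Normal < High < Critical.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Priority {
    Low,
    Normal,
    High,
    Critical,
}

/// Map a plugin event `kind` to an alert priority.
fn priority_for(kind: &str) -> Priority {
    match kind {
        // Tampering outranks ordinary detections.
        "defender.realtime_protection_disabled" => Priority::Critical,
        "defender.threat_detected" | "defender.threat_remediation_failed" => Priority::High,
        "defender.realtime_protection_config_change" => Priority::Normal,
        // Includes the `defender.event_N` fallback bucket.
        _ => Priority::Low,
    }
}
```

Because the fallback kinds all share the `defender.event_` prefix, a rule that wants the whole bucket can match on the prefix instead of enumerating IDs.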

──[ 6. Known Limitations ]──

The plugin's README is explicit about what is missing:

    - No bookmark, so no replay of events that occurred while the
      plugin was down. The plugin is the agent's only source of
      Defender events, so that window is a blind spot; comparing
      against any other Defender collector would show the gap.
    - No XPath filtering. The plugin could pass an XPath query
      to EvtSubscribe and let the Event Log service drop
      uninteresting events before delivery; v1 forwards
      everything and lets the agent decide.
    - No WSC primary-AV check. If WazabiEDR were registered as
      primary AV via the Windows Security Center API, Defender
      would drop into passive mode and stop emitting 11xx events
      without notice. The plugin should detect that state and
      log it.
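For the XPath item, the query string itself is simple to build. A minimal sketch, assuming the filter is a plain list of event IDs (the helper name `xpath_for_event_ids` is hypothetical); the result would still need converting to NUL-terminated UTF-16 before being passed as the query argument:

```rust
/// Build an Event Log XPath query that selects the given event IDs,
/// for example `*[System[(EventID=1116 or EventID=5001)]]`.
/// An empty list yields `*`, which matches every event.
fn xpath_for_event_ids(ids: &[u32]) -> String {
    if ids.is_empty() {
        return "*".to_owned();
    }
    let terms: Vec<String> = ids.iter().map(|id| format!("EventID={id}")).collect();
    format!("*[System[({})]]", terms.join(" or "))
}
```

Filtering this way trades the "nothing is silently dropped" property of the fallback `kind` for less traffic on the pipe, so it is a policy decision as much as an optimisation.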

None are architectural; all are tracked. The plugin earns its keep as-is for a 
first-version EDR build-out.

Next post: the install layout, the parts of the system that hot-reload, and the 
parts that require a service restart.

