Browser-Based Remote Desktop on Windows and Linux

May 6, 2026 Christopher 5 min read

Tags: remote-desktop, webcodecs, wayland, cross-platform

This post describes the remote desktop implementation in the ET Ducky agent. As of May 2026, the agent supports Windows (user desktop and secure desktop) and Linux (Wayland via xdg-desktop-portal, X11 via x11vnc). The browser viewer is one implementation shared across all backends.

Capture backends

The agent selects a capture backend at session start based on the host operating system and the active session type. The dashboard viewer renders frames from whichever backend the agent selects.

OS	Capture backend	When it is used
Windows (user desktop)	DXGI Desktop Duplication via signed helper	Default. The agent service launches a separately-signed helper process into the interactive user's session. The helper uses DXGI Desktop Duplication for GPU-assisted capture and ships frames over the same authenticated WebSocket the dashboard uses. Hardware-accelerated, low CPU overhead.
Windows (secure desktop)	GDI BitBlt against Winlogon	Engages automatically while a UAC prompt, Ctrl+Alt+Del screen, or lock screen is on display. The helper switches its thread to the Winlogon desktop, captures via GDI BitBlt, and overlays the cursor manually. Slower than DXGI but works deterministically across GPU drivers and lets the operator interact with UAC from the viewer.
Linux (Wayland)	xdg-desktop-portal RemoteDesktop + PipeWire	Modern Wayland sessions on GNOME, KDE Plasma, or any compositor that implements the portal interface. PipeWire gives the agent a stream of frames from the compositor's own buffers. Input injection rides the same portal interface.
Linux (X11)	x11vnc + noVNC	Older Linux desktops or hosts that have not migrated to Wayland. The agent spawns x11vnc against the logged-in user's `$XAUTHORITY`, resolved at runtime via `loginctl`, and bridges the RFB stream to the dashboard.

Backend selection is invisible to the dashboard. The viewer handles frames from every backend the same way and needs no backend-specific logic.
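The selection rule in the table above can be sketched as a small pure function. This is a minimal illustration, not the agent's actual code: the function names and backend labels are hypothetical, and reading the session type from XDG_SESSION_TYPE is an assumption (a system service would more likely ask loginctl for the session type).

```python
import os
import platform
from typing import Optional


def select_backend(os_name: str,
                   session_type: Optional[str] = None,
                   secure_desktop: bool = False) -> str:
    """Map host OS and session type to a capture backend label.

    The labels are hypothetical names for the four rows of the table.
    """
    if os_name == "Windows":
        # UAC, Ctrl+Alt+Del, and the lock screen need the GDI path.
        return "gdi-bitblt" if secure_desktop else "dxgi-duplication"
    if os_name == "Linux":
        if session_type == "wayland":
            return "portal-pipewire"
        if session_type == "x11":
            return "x11vnc"
    raise ValueError(f"no capture backend for {os_name!r} / {session_type!r}")


def detect_backend() -> str:
    """Assumed detection: platform.system() plus XDG_SESSION_TYPE."""
    return select_backend(platform.system(),
                          os.environ.get("XDG_SESSION_TYPE"))
```

Keeping the decision in one pure function makes it easy to test without a real desktop session; only detect_backend touches the host.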

H.264 over WebCodecs

On hosts with hardware H.264 encoding (VA-API on Linux, QuickSync on Intel, NVENC on NVIDIA), the agent encodes frames using the hardware encoder. The Linux helper selects among vah264enc (VA-API), x264enc (software), or JPEG fallback based on host capability. The Windows helper streams WebP at present; H.264 via Media Foundation is planned. The wire format and browser viewer support H.264 on either OS, so a Windows encoder change is a backend-only change.
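The Linux encoder preference order can be sketched as follows. vah264enc and x264enc are GStreamer element names, so this sketch assumes the helper builds on GStreamer and probes with the `gst-inspect-1.0 --exists` command. The function names are hypothetical, and the presence of an element is only a rough proxy for capability: a real probe would also confirm that the hardware encoder can actually be opened.

```python
import shutil
import subprocess
from typing import Callable

# Preference order taken from the text: hardware first, then software.
ENCODER_PREFERENCE = ("vah264enc", "x264enc")


def gst_element_exists(name: str) -> bool:
    """Return True if GStreamer reports that the named element is installed."""
    if shutil.which("gst-inspect-1.0") is None:
        return False
    result = subprocess.run(
        ["gst-inspect-1.0", "--exists", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def choose_encoder(exists: Callable[[str], bool] = gst_element_exists) -> str:
    """Pick the first available H.264 encoder, else fall back to JPEG."""
    for element in ENCODER_PREFERENCE:
        if exists(element):
            return element
    return "jpeg"
```

Passing the probe in as a parameter keeps the ordering logic testable on machines without GStreamer installed.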

The browser decodes H.264 frames via the WebCodecs VideoDecoder API. WebP frames are decoded via SkiaSharp and painted to the same canvas.

Encoding load on the host is bounded. Hardware encoding runs on dedicated silicon. Software encoding runs in the helper process under the agent's cgroup limits on Linux and job-object limits on Windows.

Browser support and JPEG fallback

WebCodecs is supported on Chromium-based browsers at version 94 or later, Firefox 130 or later on desktop, and Safari 16.4 or later. On browsers without WebCodecs, the agent falls back to JPEG at session start. The viewer render path is the same canvas in either case.

The agent emits a codec-announce message (type=6) at session establishment. If the announce indicates JPEG, the viewer does not initialize WebCodecs. If the announce indicates H.264, the viewer configures a VideoDecoder with the SPS/PPS from the announce and decodes type=1 NALU frames as they arrive.
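One detail of the H.264 path is worth illustrating: VideoDecoder.configure() takes a codec string such as avc1.42E01E, which can be derived from the three bytes that follow the NAL header in the SPS (profile, constraint flags, level). The sketch below shows that derivation and the JPEG/H.264 branch. It is written in Python for illustration only, since the viewer would do this in browser-side code. The announce's byte layout is not described in this post, so the function takes already-parsed fields, and the names decoder_plan and use_webcodecs are hypothetical.

```python
def strip_start_code(nalu: bytes) -> bytes:
    """Remove an Annex B start code (00 00 00 01 or 00 00 01) if present."""
    for prefix in (b"\x00\x00\x00\x01", b"\x00\x00\x01"):
        if nalu.startswith(prefix):
            return nalu[len(prefix):]
    return nalu


def codec_string_from_sps(sps: bytes) -> str:
    """Build the 'avc1.PPCCLL' codec string from an H.264 SPS NAL unit."""
    sps = strip_start_code(sps)
    # The low five bits of the NAL header hold the unit type; 7 means SPS.
    if len(sps) < 4 or (sps[0] & 0x1F) != 7:
        raise ValueError("not an H.264 SPS NAL unit")
    profile_idc, constraint_flags, level_idc = sps[1], sps[2], sps[3]
    return f"avc1.{profile_idc:02X}{constraint_flags:02X}{level_idc:02X}"


def decoder_plan(codec: str, sps: bytes = b"") -> dict:
    """Decide how the viewer should decode, given the announced codec."""
    if codec == "jpeg":
        return {"use_webcodecs": False}
    if codec == "h264":
        return {"use_webcodecs": True, "codec": codec_string_from_sps(sps)}
    raise ValueError(f"unknown codec {codec!r}")
```

For example, an SPS beginning 67 42 E0 1E (Constrained Baseline, level 3.0) yields avc1.42E01E.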

Windows secure desktop

When a UAC prompt fires, a user presses Ctrl+Alt+Del, or the workstation locks, Windows switches the input desktop to Winlogon, a separate desktop object within the interactive window station that is isolated at the kernel level from the user's interactive desktop. Capture bound to the user's desktop does not produce frames during this period.

The ET Ducky helper polls the input desktop on each iteration of its capture loop. When Winlogon is the active desktop, the capture thread switches its desktop binding to Winlogon via SetThreadDesktop, captures the screen via GDI BitBlt (DXGI Desktop Duplication does not bind to the secure desktop), and overlays the cursor with GetCursorInfo and DrawIconEx. When the secure desktop is dismissed, the helper switches the thread binding back to the user desktop and DXGI re-initializes.

Input injection follows the same pattern. The helper's injection thread re-attaches to the current input desktop before each batch of SendInput calls.
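The polling step can be sketched with the Win32 calls OpenInputDesktop and GetUserObjectInformationW through ctypes. This is an illustration under assumptions, not the helper's code: the function names, the path labels, and the access mask are hypothetical choices, and the actual rebinding (SetThreadDesktop), BitBlt capture, and SendInput calls are omitted. On non-Windows hosts the probe simply returns None.

```python
import ctypes
import sys
from typing import Optional

UOI_NAME = 2                  # GetUserObjectInformationW index for the name
DESKTOP_READOBJECTS = 0x0001  # access mask chosen for this sketch (assumption)


def input_desktop_name() -> Optional[str]:
    """Return the name of the desktop receiving input, or None if unavailable."""
    if sys.platform != "win32":
        return None  # the Win32 calls below exist only on Windows
    user32 = ctypes.WinDLL("user32", use_last_error=True)
    user32.OpenInputDesktop.restype = ctypes.c_void_p
    user32.OpenInputDesktop.argtypes = [
        ctypes.c_uint32, ctypes.c_int, ctypes.c_uint32]
    user32.GetUserObjectInformationW.restype = ctypes.c_int
    user32.GetUserObjectInformationW.argtypes = [
        ctypes.c_void_p, ctypes.c_int, ctypes.c_void_p,
        ctypes.c_uint32, ctypes.POINTER(ctypes.c_uint32)]
    user32.CloseDesktop.argtypes = [ctypes.c_void_p]

    hdesk = user32.OpenInputDesktop(0, False, DESKTOP_READOBJECTS)
    if not hdesk:
        return None
    try:
        buf = ctypes.create_unicode_buffer(256)
        needed = ctypes.c_uint32(0)
        ok = user32.GetUserObjectInformationW(
            hdesk, UOI_NAME, buf, ctypes.sizeof(buf), ctypes.byref(needed))
        return buf.value if ok else None
    finally:
        user32.CloseDesktop(hdesk)  # always release the desktop handle


def capture_path(desktop_name: Optional[str]) -> str:
    """Pick the capture path for one loop iteration (labels are hypothetical)."""
    if desktop_name is not None and desktop_name.lower() == "winlogon":
        return "gdi-bitblt"
    return "dxgi-duplication"
```

A capture loop would call capture_path(input_desktop_name()) each iteration and rebind its thread only when the result changes.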

The mechanism that makes capture and injection cross the secure-desktop boundary is the helper's uiAccess=true manifest. Windows enforces three preconditions before honoring that flag: the binary must be Authenticode-signed by a CA-trusted publisher, it must reside in a Windows-defined secure location, and the launching process must hold SeTcbPrivilege. ET Ducky meets them in the same order: the helper is signed with ET Ducky LLC's EV code-signing certificate, it is cached under C:\Program Files\ETDucky\Agent\RdpHelper\<version>\ and the agent service ships an explicit privilege whitelist via sc privs. With all three in place, the helper can capture and inject across the User Interface Privilege Isolation (UIPI) boundary. Without all three, Windows does not honor the flag and the helper is limited to the user desktop.

The uiAccess bypass is scoped to UIPI. It does not grant additional file system, registry, network, or token privileges. The helper runs at the interactive user's integrity level. A binary swapped into the cache would fail Authenticode validation in WinVerifyTrust, so Windows would not honor the manifest. The signed-helper distribution path is documented on the security page.

Network requirements

The agent uses an outbound HTTPS connection to etducky.com for enrollment, heartbeat, event upload, live-session AI, and remote desktop. The remote-desktop session is a WebSocket upgraded from that connection. The agent does not open inbound ports and does not require port-forwarding or a VPN tunnel.

The agent's network footprint is one outbound TLS connection to one host.

Resource bounds

Remote desktop is the agent's highest-throughput operation. Capture, encode, and frame transmission run inside the agent's cgroup limits on Linux and job-object limits on Windows.

Hardware encoding has near-zero CPU cost. Software encoding under x264enc adapts to the available CPU budget within the cgroup. The frame-pump path bounds its own buffers and drops frames rather than backpressuring the kernel.
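The drop-rather-than-block behavior can be sketched with a fixed-capacity queue that discards the oldest frame when full. The class name and the drop-oldest policy are assumptions for illustration; the post does not say which frame the agent drops. A real H.264 path would also need to resume on a keyframe after a drop, which this sketch ignores.

```python
from collections import deque
from typing import Deque, Optional


class FrameQueue:
    """Bounded frame buffer that drops the oldest frame instead of blocking."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._frames: Deque[bytes] = deque(maxlen=capacity)
        self.dropped = 0  # number of frames discarded so far

    def push(self, frame: bytes) -> None:
        """Add a frame; never blocks the producer."""
        if len(self._frames) == self._frames.maxlen:
            # deque(maxlen=...) discards the oldest entry on append.
            self.dropped += 1
        self._frames.append(frame)

    def pop(self) -> Optional[bytes]:
        """Return the oldest buffered frame, or None if the queue is empty."""
        return self._frames.popleft() if self._frames else None
```

With a capacity of 2, pushing three frames leaves the second and third in the queue and records one drop, so memory use stays constant however slow the network is.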

Session integration

Remote desktop is one of three live-session interaction modes, alongside the AI question/answer flow and the shell plus file-transfer pane. The three modes share session state: a single session ID covers all three, with the same operator attribution, authentication, and audit log.
