Cross-Platform Inventory Snapshots
The ET Ducky agent now collects a per-host inventory snapshot on Linux and Windows. The data is captured at boot, refreshed every 24 hours, and refreshable on demand from the dashboard. This post describes what is collected, where it is stored, and how it is queried.
What gets collected
The snapshot has seven sections. Each section is stored in its own JSONB column on the AgentInventoryReports table, so a query about installed software does not have to parse the listening sockets list as well.
OS information
Linux: distro name and version parsed from /etc/os-release, kernel release from uname -r, architecture from uname -m. Windows: product name and build number read from HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion, with the build and Update Build Revision (UBR) combined into the kernel release field.
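As a rough illustration of the Linux half, the os-release format is a list of KEY=value lines with optional shell-style quoting. The sketch below is not the agent's code; the function name and the sample text are made up for the example.

```python
import shlex

def parse_os_release(text: str) -> dict[str, str]:
    """Parse os-release style KEY=value lines into a dict."""
    fields: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be shell-quoted; shlex.split removes the quotes.
        parts = shlex.split(value)
        fields[key] = parts[0] if parts else ""
    return fields

# Illustrative sample, not captured from a real host.
SAMPLE = '''NAME="Ubuntu"
VERSION_ID="22.04"
ID=ubuntu
PRETTY_NAME="Ubuntu 22.04.4 LTS"
'''

info = parse_os_release(SAMPLE)
print(info["NAME"], info["VERSION_ID"])
```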
Installed software
On Linux the collector queries dpkg-query, rpm, snap list, and flatpak list, skipping any package manager that is not installed. On Windows the collector walks the three uninstall registry keys: HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall (64-bit), the WOW6432Node copy of the same key (32-bit applications on 64-bit Windows), and the HKCU equivalent (per-user installs).
The agent does not call WMI Win32_Product. Querying that class triggers Windows Installer to validate every MSI on the system, which can take minutes and rewrites registry entries during validation. The registry uninstall keys are the canonical source of installed-software metadata on Windows; Win32_Product is a documented foot-gun.
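The "skip any package manager that is not installed" step on Linux can be sketched as a PATH lookup before each query. The command arguments, the output format, and the Version field name below are assumptions for illustration; the post does not state which arguments the agent actually passes.

```python
import shutil
from typing import Callable, Optional

# Hypothetical command table. The format strings are assumptions chosen to
# produce tab-separated "name<TAB>version" lines.
PACKAGE_QUERIES: dict[str, list[str]] = {
    "dpkg-query": ["dpkg-query", "-W", r"-f=${Package}\t${Version}\n"],
    "rpm": ["rpm", "-qa", "--queryformat", r"%{NAME}\t%{VERSION}-%{RELEASE}\n"],
    "snap": ["snap", "list"],
    "flatpak": ["flatpak", "list"],
}

def available_queries(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> dict[str, list[str]]:
    """Keep only the package managers whose binary is found on PATH."""
    return {name: argv for name, argv in PACKAGE_QUERIES.items() if which(name)}

def parse_tab_separated(output: str) -> list[dict[str, str]]:
    """Turn 'name<TAB>version' lines into package records."""
    packages = []
    for line in output.splitlines():
        name, sep, version = line.partition("\t")
        if sep and name:
            packages.append({"Name": name, "Version": version})
    return packages
```

Passing the lookup function as a parameter keeps the detection step testable on a machine that has none of these tools installed.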
Running services
Linux: systemctl list-units --type=service --all --output=json, then a batched systemctl show pass that pulls ExecStart, MainPID, User, and UnitFileState for each unit in chunks of 100 to keep the argv under sysconf(ARG_MAX). Windows: System.ServiceProcess.ServiceController.GetServices(), with a per-service registry read for ImagePath and ObjectName.
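The batching on the Linux side amounts to slicing the unit list into groups of 100 and issuing one systemctl show per group. A minimal sketch, with helper names invented for the example:

```python
from typing import Iterator, Sequence

SHOW_PROPERTIES = "ExecStart,MainPID,User,UnitFileState"
CHUNK_SIZE = 100  # batch size quoted in the post

def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def build_show_commands(units: Sequence[str]) -> list[list[str]]:
    """Build one `systemctl show` argv per batch of unit names."""
    return [
        ["systemctl", "show", f"--property={SHOW_PROPERTIES}", *chunk]
        for chunk in chunked(units, CHUNK_SIZE)
    ]

# 250 made-up unit names produce three commands: 100, 100, and 50 units.
units = [f"svc{i}.service" for i in range(250)]
commands = build_show_commands(units)
print(len(commands))
```

A fixed batch size is a simple stand-in for measuring the argv length against ARG_MAX directly; it trades a few extra process spawns for not having to compute byte counts.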
Scheduled tasks
Linux reads /etc/crontab, /etc/cron.d/*, /var/spool/cron/* (when readable; per-user crontabs are root-only on most distros), the four /etc/cron.{hourly,daily,weekly,monthly} directories for anacron-style entries, and systemctl list-timers --output=json. Windows uses schtasks /query /fo csv /v and parses the CSV output with a single-line RFC 4180 parser.
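For the Windows CSV step, the sketch below shows one way to split a single CSV record under RFC 4180 quoting rules (quoted fields, doubled quotes as an escaped quote). It assumes each record fits on one line, which is one reading of "single-line"; the sample row is invented and is not real schtasks output.

```python
def split_csv_line(line: str) -> list[str]:
    """Split one CSV record into fields, honoring RFC 4180 quoting."""
    fields: list[str] = []
    buf: list[str] = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == '"':
                if i + 1 < len(line) and line[i + 1] == '"':
                    buf.append('"')  # doubled quote is a literal quote
                    i += 1
                else:
                    in_quotes = False  # closing quote
            else:
                buf.append(ch)  # commas inside quotes are data
        elif ch == '"':
            in_quotes = True
        elif ch == ",":
            fields.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
        i += 1
    fields.append("".join(buf))
    return fields

# Invented sample row for illustration.
SAMPLE_ROW = r'"HOST1","\Microsoft\Windows\Defrag","Ready","cmd /c ""echo hi"", done"'
print(split_csv_line(SAMPLE_ROW))
```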
Kernel modules and drivers
Linux: /proc/modules gives name, size, use count, dependencies, and state. Windows: the HKLM\SYSTEM\CurrentControlSet\Services key filtered by Type values 1, 2, 4, or 8 (the four driver service types), with ServiceController.GetDevices() supplying running state and the driver image file size from disk when accessible.
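Each /proc/modules line is whitespace-separated: name, size, use count, a comma-terminated dependency list (or "-" when empty), state, and a load address. A minimal parsing sketch, with the dataclass and sample lines invented for the example:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class KernelModule:
    name: str
    size: int
    use_count: Optional[int]  # None when the kernel prints "-"
    dependencies: list[str]
    state: str

def parse_proc_modules(text: str) -> list[KernelModule]:
    """Parse the contents of /proc/modules into records."""
    modules = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # malformed or empty line
        name, size, use_count, deps, state = parts[:5]
        # The dependency list ends with a trailing comma; "-" means none.
        dep_list = [] if deps == "-" else [d for d in deps.split(",") if d]
        count = int(use_count) if use_count.isdigit() else None
        modules.append(KernelModule(name, int(size), count, dep_list, state))
    return modules

# Illustrative sample, not captured from a real host.
SAMPLE = """nft_compat 20480 2 - Live 0xffffffffc0b21000
nf_tables 249856 3 nft_compat, Live 0xffffffffc0ad0000
"""
mods = parse_proc_modules(SAMPLE)
```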
Listening sockets
Linux: ss -tulnpH for TCP, TCP6, UDP, and UDP6 in one call, with PID resolution included via the -p flag (which requires CAP_NET_ADMIN; the agent's systemd unit grants it). The /proc/net/tcp, tcp6, udp, and udp6 files are a fallback for hosts where ss is missing, but the fallback cannot resolve PIDs without an extra /proc walk.
Windows: GetExtendedTcpTable and GetExtendedUdpTable P/Invokes from iphlpapi.dll, called once each for IPv4 and IPv6. Both APIs return the owning process ID directly, so a single P/Invoke pass produces the full snapshot. Process names are resolved with cached Process.GetProcessById calls, so a process that owns multiple sockets is resolved only once.
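The Linux /proc/net/tcp fallback mentioned above stores endpoints as hexadecimal address:port pairs and the socket state as a hex code, with 0A meaning LISTEN. The sketch below decodes IPv4 TCP entries only and assumes a little-endian host; the LocalAddress and Inode field names and the sample text are assumptions for the example.

```python
import socket
import struct

TCP_LISTEN = "0A"  # state code for LISTEN in /proc/net/tcp

def decode_ipv4_endpoint(hex_endpoint: str) -> tuple[str, int]:
    """Decode 'AABBCCDD:PPPP' into (dotted address, port)."""
    addr_hex, port_hex = hex_endpoint.split(":")
    # The address is printed as a native-endian 32-bit word; this sketch
    # assumes a little-endian host (x86, most ARM).
    addr = socket.inet_ntoa(struct.pack("<I", int(addr_hex, 16)))
    return addr, int(port_hex, 16)

def parse_proc_net_tcp(text: str) -> list[dict]:
    """Return listening IPv4 TCP sockets from /proc/net/tcp contents."""
    sockets = []
    for line in text.splitlines()[1:]:  # first line is the column header
        cols = line.split()
        if len(cols) < 10 or cols[3] != TCP_LISTEN:
            continue
        addr, port = decode_ipv4_endpoint(cols[1])
        # Column 9 is the socket inode, the key a later /proc walk would
        # need in order to find the owning PID.
        sockets.append({"LocalAddress": addr, "LocalPort": port, "Inode": int(cols[9])})
    return sockets

# Illustrative sample, not captured from a real host.
SAMPLE = """  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 23456 1 0000000000000000 100 0 0 10 0
   1: 0100007F:0277 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 34567 1 0000000000000000 100 0 0 10 0
   2: 0100007F:D431 0100007F:0277 01 00000000:00000000 00:00000000 00000000  1000        0 45678 1 0000000000000000 20 4 30 10 -1
"""
listening = parse_proc_net_tcp(SAMPLE)
```

The inode column is why the fallback cannot report PIDs on its own: mapping an inode to a process means scanning the file descriptors under /proc for a matching socket.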
Persistence artifacts
A list of file paths and registry keys that are common persistence vectors. Each entry records existence, size, modified time, owner, permissions, and a SHA-256 hash for files smaller than 64 KB.
Linux entries include /etc/ld.so.preload, /etc/sudoers.d/*, /etc/profile.d/*, /etc/cron.allow, /etc/init.d/*, /etc/systemd/system/*, /root/.bashrc, and /root/.ssh/authorized_keys.
Windows entries include the Run and RunOnce registry keys (HKLM, WOW6432Node, and HKCU), Winlogon Userinit and Shell values, AppInit_DLLs, every Image File Execution Options subkey that has a Debugger value set, and the contents of the Common and User Startup folders.
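For file-based entries, the per-artifact record described above could be assembled roughly as follows. This is a sketch under assumptions: the field names are invented, "64 KB" is read as 65,536 bytes, and the owner is reported as a numeric UID rather than a resolved account name.

```python
import hashlib
import os
import stat

HASH_LIMIT_BYTES = 64 * 1024  # assumption: "64 KB" means 65,536 bytes

def describe_artifact(path: str) -> dict:
    """Record existence, metadata, and (for small files) a SHA-256 hash."""
    try:
        st = os.lstat(path)  # lstat so a symlink is described, not followed
    except OSError:
        return {"Path": path, "Exists": False}
    entry = {
        "Path": path,
        "Exists": True,
        "Size": st.st_size,
        "ModifiedTime": st.st_mtime,
        "OwnerUid": st.st_uid,
        "Permissions": stat.filemode(st.st_mode),
        "Sha256": None,
    }
    # Hash only regular files below the size limit.
    if stat.S_ISREG(st.st_mode) and st.st_size < HASH_LIMIT_BYTES:
        try:
            with open(path, "rb") as fh:
                entry["Sha256"] = hashlib.sha256(fh.read()).hexdigest()
        except OSError:
            pass  # unreadable file: keep the metadata, leave the hash empty
    return entry
```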
Cadence
The agent collects a snapshot at three points: 5 seconds after registration completes (so the bearer token is in place before the POST), every 24 hours after that, and on demand when an operator triggers it from the dashboard.
The on-demand path uses an SSE event named inventory.refresh. The dashboard endpoint POST /api/agents/{id}/inventory/refresh pushes that event to the agent, which calls InventoryReporterService.RunNowAsync(). The reporter collects a fresh snapshot and posts it back through the same path the boot and 24-hour snapshots use. There is no separate code branch for refresh versus interval; the only difference is whether the 24-hour Task.Delay ran to completion or was cancelled early by a refresh request.
A 24-hour interval is appropriate because inventory data changes slowly. Sending a 5,000-package list every minute alongside heartbeat metrics would be wasteful; sending it once a day matches the rate at which the underlying state actually changes.
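The "one loop, two wake-up reasons" design can be sketched with an interruptible wait. The agent itself uses .NET's Task.Delay; the version below is a Python asyncio analogue with invented names, not a port of the agent's code.

```python
import asyncio
from typing import Awaitable, Callable

async def wait_for_next_run(refresh: asyncio.Event, interval_s: float) -> str:
    """Sleep until the interval elapses or a refresh is requested."""
    try:
        await asyncio.wait_for(refresh.wait(), timeout=interval_s)
    except asyncio.TimeoutError:
        return "interval"  # the delay ran to completion
    refresh.clear()  # consume the request so the next wait blocks again
    return "refresh"  # the delay was cut short

async def reporter_loop(
    collect_and_post: Callable[[], Awaitable[None]],
    refresh: asyncio.Event,
    interval_s: float,
    max_runs: int,
) -> list[str]:
    """Run the same collect-and-post step regardless of the wake-up reason."""
    reasons = []
    for _ in range(max_runs):
        reasons.append(await wait_for_next_run(refresh, interval_s))
        await collect_and_post()
    return reasons
```

An SSE handler would only need to call refresh.set(); everything after the wait is shared between the two paths.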
Storage
One row per snapshot in AgentInventoryReports. The table has nine columns of structured metadata (Id, AgentId, OrganizationId, ReportedAt, CreatedAt, AgentVersion, Platform, ServerName, plus a CollectorErrorsJson field for per-domain failures) and seven JSONB payload columns, one per domain.
The table is indexed on (AgentId, ReportedAt DESC) for the latest-per-agent read pattern, on (OrganizationId, ReportedAt DESC) for org-wide rollups, and on ReportedAt for retention sweeps. InstalledSoftwareJson and ListeningSocketsJson also carry GIN indexes, which serve containment (@>) and jsonpath (@?, @@) predicates; the jsonb_path_query function itself cannot use an index. The other five domain columns do not get GIN indexes because the storage cost was not justified by the query patterns; full GIN coverage would roughly triple the table size.
Row-level security is applied via the standard app_apply_org_rls helper, which is the same policy every other org-scoped table uses.
Querying via Smart Reports
The AgentInventoryReports table is registered in the Smart Reports schema catalog with all per-domain JSONB columns marked filterable. The system prompt that drives the AI query planner has been extended with the per-domain shape of each JSONB column and three sample queries.
The most common query pattern is "latest snapshot per agent". Inventory is point-in-time data, not time-series, so the planner is told to use a row_number window function:
with latest as (
    select *, row_number() over (
        partition by "AgentId" order by "ReportedAt" desc
    ) as rn
    from "AgentInventoryReports"
)
select * from latest where rn = 1;
Three example questions and the queries they produce:
"Which agents have openssh-server installed?" unnests the Packages array in InstalledSoftwareJson with jsonb_array_elements and filters by Name.
"Which agents are listening on port 22?" unnests the Sockets array in ListeningSocketsJson and filters on LocalPort.
"Which agents have a cron entry that runs wget?" unnests ScheduledTasksJson and uses ILIKE on the Command field of each task.
The dashboard also has a per-agent inventory drawer, opened with window.ETDuckyInventory.openPanel(agentId). The drawer has tabs for OS, Software, Services, Scheduled, Sockets, Modules, and Persistence; each tab is a Grid.js table with sort, search, and pagination. A Refresh button calls POST /api/agents/{id}/inventory/refresh, which emits the inventory.refresh event, and re-fetches after a five-second wait.
Failure handling
Each domain collector runs inside its own try/catch with a 45-second time budget enforced by a CancellationTokenSource. A failed collector records its error in CollectorErrorsJson keyed by domain name and the rest of the snapshot is still posted. The dashboard inventory drawer surfaces these per-domain errors so an operator can tell the difference between "the agent has not posted a snapshot yet" and "the snapshot posted but the snap collector timed out".
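The isolation pattern can be sketched as a loop that gives every collector its own timeout and its own error slot. The agent does this in .NET with try/catch and a CancellationTokenSource; the Python asyncio version below is an analogue with invented names, and running the collectors sequentially is an assumption.

```python
import asyncio
from typing import Awaitable, Callable

COLLECTOR_BUDGET_S = 45.0  # per-domain time budget quoted in the post

async def run_collectors(
    collectors: dict[str, Callable[[], Awaitable[object]]],
    budget_s: float = COLLECTOR_BUDGET_S,
) -> tuple[dict, dict]:
    """Run each collector in isolation; return (snapshot, errors by domain)."""
    snapshot: dict = {}
    errors: dict = {}
    for domain, collect in collectors.items():
        try:
            snapshot[domain] = await asyncio.wait_for(collect(), timeout=budget_s)
        except asyncio.TimeoutError:
            errors[domain] = f"timed out after {budget_s:g}s"
        except Exception as exc:
            # One failing domain must not stop the others from being posted.
            errors[domain] = f"{type(exc).__name__}: {exc}"
    return snapshot, errors
```

Keeping the errors in a separate map from the payloads is what lets a consumer distinguish "no snapshot yet" from "snapshot posted, one domain failed".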
Per-domain JSON columns are individually capped at 1 MB by the ingestion controller. A runaway collector that produces an oversized payload has its column dropped and the rest of the row is still persisted. The largest realistic payload measured to date is around 600 KB (installed software on a Windows Server with a heavy MSI footprint), so the 1 MB cap leaves substantial headroom.
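A size cap of this kind might look like the following on the ingestion side. The sketch makes two assumptions that the post does not confirm: "1 MB" is read as 1,000,000 bytes of serialized JSON, and a dropped column leaves a note in the errors map.

```python
import json

MAX_COLUMN_BYTES = 1_000_000  # assumption: "1 MB" read as 10^6 bytes

def enforce_column_caps(
    payloads: dict, errors: dict, limit: int = MAX_COLUMN_BYTES
) -> dict:
    """Drop any per-domain payload whose serialized size exceeds the cap."""
    kept = {}
    for column, value in payloads.items():
        size = len(json.dumps(value, separators=(",", ":")).encode("utf-8"))
        if size > limit:
            # Hypothetical behavior: note the drop so it is visible later.
            errors[column] = f"payload dropped: {size} bytes exceeds {limit}"
        else:
            kept[column] = value
    return kept
```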
What is and is not in this release
In: the seven domains, the table, the cadence (boot + 24h + on-demand), the SSE refresh path, the dashboard drawer, the Smart Reports planner integration, and the per-domain failure isolation.
Not in: continuous diff alerts (for example, "a new file appeared in /etc/sudoers.d"). The data is captured but the diffing is not automated. The persistence-tampering rule planned for a follow-up release will use these snapshots once it lands.
Not in: software-version vulnerability matching. Identifying which hosts have a CVE-affected version of openssl requires a CVE feed integration, which is a separate piece of work.
Not in: backend-driven retention pruning. The recommended retention is 90 days, but no scheduler prunes AgentInventoryReports yet; until one ships, operators need to prune old rows themselves.
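Until pruning is built in, an operator-side job could compute the 90-day cutoff and issue a parameterized delete. This is only a sketch: the helper is hypothetical, the %s placeholder assumes a psycopg-style PostgreSQL driver, and the statement should be run under whatever role and RLS context the deployment requires.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION_DAYS = 90  # recommended retention from the post

PRUNE_SQL = 'DELETE FROM "AgentInventoryReports" WHERE "ReportedAt" < %s'

def prune_statement(now: Optional[datetime] = None) -> tuple[str, tuple]:
    """Return the delete statement and its cutoff parameter."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=RETENTION_DAYS)
    return PRUNE_SQL, (cutoff,)

# Usage with a hypothetical DB-API cursor:
#   sql, params = prune_statement()
#   cursor.execute(sql, params)
```

Note that a plain cutoff also removes the only snapshot of an agent that has been offline for more than 90 days; keeping the latest row per agent would need an extra condition.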
Inventory snapshots in the dashboard
The first snapshot is posted about five seconds after agent registration completes. Subsequent snapshots follow on a 24-hour interval, with on-demand refresh available from the per-agent inventory drawer.