How to use osquery evented tables

Mo Zhu

Osquery lets you see the state of your computers right now. However, because each query returns a snapshot, you have to check a table repeatedly if you want to view data over time. Actively watching and diffing tables can be challenging even with automation, especially for short-lived or high-churn processes.

This is where osquery's evented tables can help. Instead of displaying the point-in-time state for your host, osquery's evented tables store the host's historical data. You can configure osquery to capture certain types of information, which will be stored in the relevant *_events table for later analysis.

In this guide, we’ll go through the concepts, considerations, and best practices for setting up osquery evented tables. We’ll also cover basic information about commonly used evented tables to help you get started.

How do osquery evented tables work?

Osquery does not generate the events itself. Instead, it reads and formats event data generated by various OS components. For example, on Linux, the audit framework generates and broadcasts process and socket event data. Osquery receives the data, converts it into event rows, and buffers the rows (in its internal RocksDB store) in the process_events and socket_events tables to await querying. The data can then be filtered and transformed via SQL and shipped to a log destination with the scheduled query functionality.
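As a rough illustration of that last step, the sketch below builds an osquery configuration containing one scheduled query over process_events. The query name, the 60-second interval, and the selected columns are assumptions made for this example, not recommendations.

```python
import json

# Hypothetical scheduled query: the name, interval, and column list are
# illustrative assumptions, not values taken from this guide.
config = {
    "schedule": {
        "process_events_sample": {
            "query": "SELECT pid, path, cmdline, time FROM process_events;",
            "interval": 60,  # seconds between runs
        }
    }
}

# Serialize to the JSON form that an osquery configuration file uses.
config_json = json.dumps(config, indent=2)
print(config_json)
```

Each run of the scheduled query reads the rows buffered since the previous run, so the interval affects how long events sit in the local store before being shipped.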

For the purposes of this article, we'll use the term "utility" to mean the underlying OS component that osquery subscribes to for its various evented tables.

This separation between osquery and the utility means that some evented tables rely on configurations for the utility to determine which events will be generated. At the utility level, you can specify what data is captured. At the osquery level, you can specify what information is ingested, presented, and transmitted. For most evented tables, osquery works great out of the box with the utility’s default configurations, but some use cases may require adjusting the utility configuration.

How do I know whether a table is evented?

You can tell that an osquery table is evented in two ways:

  • The table ends in _events.
  • In the osquery schema, the table is marked with an "evented" tag near the table name.

What do I need to consider when configuring evented tables?

Performance impact

Capturing event data generates performance overhead from both the utility and osquery. If the utility is configured loosely to generate more events, then the utility performs more operations to generate events and osquery parses and stores more events.

Capturing only the events you need will cut down on the amount of work the host needs to do. For example, when monitoring processes, there may be frequent but low-value processes such as awk and sed which can be ignored, reducing work for the host. Collecting “good enough” data is key in managing performance impact.

Also, consider the impact of the queries you’re using to collect your event data. Queries using WHERE clauses will be fairly efficient (and minimize data volume), while many JOINs or wildcards (%) will use more resources.
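To make the filtering idea concrete, here is a minimal sketch that runs a WHERE-filtered query against an in-memory SQLite table standing in for process_events. The sample rows and binary paths are made up for illustration; on a real host, the rows come from osquery.

```python
import sqlite3

# Stand-in for osquery's process_events table so the query can run anywhere.
# The rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE process_events (pid INTEGER, path TEXT, cmdline TEXT, time INTEGER)"
)
conn.executemany(
    "INSERT INTO process_events VALUES (?, ?, ?, ?)",
    [
        (101, "/usr/bin/awk", "awk '{print $1}'", 1700000000),
        (102, "/usr/bin/sed", "sed -n 1p", 1700000001),
        (103, "/usr/bin/curl", "curl https://example.com", 1700000002),
    ],
)

# Exact-match filter on path: drops the frequent, low-value processes
# without resorting to wildcards or JOINs.
FILTERED_QUERY = """
SELECT pid, path, cmdline, time
FROM process_events
WHERE path NOT IN ('/usr/bin/awk', '/usr/bin/sed')
"""

rows = conn.execute(FILTERED_QUERY).fetchall()
print(rows)
```

Only the curl row survives the filter, which is the kind of volume reduction the paragraph above describes.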

Even after considering all of these factors, you may need to give osquery some additional resources. Osquery's watchdog automatically cancels queries if they exceed certain resource limits. These limits can be adjusted with the following flags:

  • --watchdog_memory_limit changes the maximum memory usage (expressed in MB).
  • --watchdog_utilization_limit changes the maximum CPU usage (counted from the processes table's user_time and system_time). The watchdog acts when usage stays above this limit for longer than the number of seconds set by --watchdog_latency_limit.
  • --watchdog_delay sets the delay in seconds before CPU and memory usage limits will be enforced (60 seconds by default).
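As a sketch of how these flags might be collected into a flagfile, the snippet below renders one flag per line. The numeric values are placeholders chosen for the example, not tuned recommendations, and render_flagfile is a hypothetical helper.

```python
# Placeholder values for illustration only; tune them for your own hosts.
watchdog_flags = {
    "watchdog_memory_limit": 350,       # MB
    "watchdog_utilization_limit": 130,  # CPU limit (see the flag description)
    "watchdog_latency_limit": 12,       # seconds above the CPU limit
    "watchdog_delay": 60,               # seconds before limits are enforced
}


def render_flagfile(flags):
    """Render a dict of flag names and values as flagfile lines."""
    return "\n".join(f"--{name}={value}" for name, value in flags.items())


flagfile_text = render_flagfile(watchdog_flags)
print(flagfile_text)
```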

Disk usage and data retention

Osquery collects data from the utility, formats it into an event row, and stores it in the evented table for querying. Of course, the more event data is collected, the more disk space it occupies. We recommend that you do not rely on osquery as long-term storage for event data. Instead, regularly schedule the data to be sent to an external destination for future analysis. Osquery has built-in options to automatically clean up the data.

The following osquery flags will help you manage the size of osquery's data:

  • --events_max sets the maximum number of event rows per evented table to store in the buffer before expiring them, with a default value of 50,000.
  • --events_expiry sets the lifetime of event rows in seconds, with a default value of 3,600 (1 hour). An event only expires if a query has been run against the table after the event was generated. When combined with scheduled queries, this is a handy way to clean out data automatically. Some osquery practitioners set this to 1 so that the data is cleared out immediately when a scheduled query runs.
  • --events_optimize=true saves the time that this table was last queried and only returns events after that time (enabled by default). This can be overridden in a one-off query by specifying the time column in a WHERE clause.
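For the one-off override mentioned in the last bullet, a query can state its own time window. The sketch below builds such a query, assuming the evented table exposes a time column in Unix seconds; events_since_query is a hypothetical helper name.

```python
import time


def events_since_query(table, lookback_seconds, now=None):
    """Build a one-off query that names its own time window explicitly.

    Assumes the evented table has a `time` column holding Unix seconds.
    """
    now = int(time.time()) if now is None else int(now)
    cutoff = now - lookback_seconds
    return f"SELECT * FROM {table} WHERE time > {cutoff};"


# Example: look back one hour in process_events.
print(events_since_query("process_events", 3600))
```

Rows that have already expired under --events_expiry or --events_max are gone regardless of the time window requested.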

You could also consider ignoring extraneous data, which reduces disk usage at both the utility and osquery levels. There are two ways to do this: configure the utility so that it never generates the data in the first place, or have osquery filter out the extraneous processes in the SELECT statement. Both will reduce data volume.

There is always a risk of data loss. Both osquery and the utilities limit their disk and processor usage, which can result in data loss during periods of high system load. Osquery will log when it drops events due to high load, but it cannot detect events dropped by the utility. To address the risk of data loss, you could schedule queries to run more frequently or allocate more resources to osquery or the utility. Of course, this comes at a cost and requires tuning to balance risk and reward.

Test for impact

Getting the right setup that balances performance, data volume, and data usefulness for the evented tables requires some trial and error. The best way is to try things out on a progressively larger set of machines. We recommend setting up a canary team on Fleet to test different combinations of configurations.

The osquery_schedule table lists all scheduled queries along with recent information about their memory usage and execution time. Note that this table has no visibility into the utility. For lower-level visibility, use the OS-native profilers.

Useful troubleshooting tools built into osquery

  • The osquery_events table tells you which evented tables are turned on (active column) and the number of events stored (events column) per table.
  • The osquery_flags table tells you the current set of flags for osquery. You can use this to confirm the desired flags are set correctly.
  • The osquery_schedule table lists all scheduled queries and collects memory and execution time for the latest execution.
  • The --verbose flag will generate more logs with troubleshooting information.
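The first three tools can be exercised with short queries like the ones sketched below. They are meant to be run against the running osquery daemon, for example as live queries. The column selections are assumptions based on the osquery schema, so check them against the schema for your osquery version.

```python
# Illustrative troubleshooting queries. Column choices are assumptions;
# verify them against the schema for your osquery version.
TROUBLESHOOTING_QUERIES = {
    # Which evented tables are active, and how many events each holds.
    "evented_tables": "SELECT name, active, events FROM osquery_events;",
    # Current values of event-related flags.
    "event_flags": (
        "SELECT name, value FROM osquery_flags WHERE name LIKE '%events%';"
    ),
    # Cost of each scheduled query.
    "schedule_cost": (
        "SELECT name, interval, executions, wall_time, average_memory "
        "FROM osquery_schedule;"
    ),
}

for label, query in TROUBLESHOOTING_QUERIES.items():
    print(f"{label}: {query}")
```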

How do I turn on an evented table?

To turn on osquery's eventing system, set the flag --disable_events=false. Eventing is disabled by default.

Each evented table is turned on by its own flag. For most evented tables, when you turn them on in osquery, osquery will use the default configuration of the utility. The defaults are good enough for most situations.

However, we recommend getting to know the underlying utility to optimize it for your use case. Let's go over the following topics:

  1. File integrity monitoring
  2. Process auditing
  3. YARA scanning

File integrity monitoring (FIM)

FIM refers to the monitoring of key files or filepaths. FIM enables organizations to audit the history of critical resources, detect intrusions, and apply remediations.

On all three operating systems, use the file_paths key in the osquery configuration to specify the files and directories from which osquery should collect file_events data. Use the exclude_paths key to ignore files and directories that generate too much noise. Wildcards are available in these configuration options. On Linux, there is a further file_accesses option, which specifies the file locations where an "access" event should be recorded in addition to created/modified/deleted events.
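A minimal sketch of such a configuration is shown below. The category names (etc, ssh_keys), the paths, and the schedule entry are assumptions picked for the example. In osquery path patterns, % matches one directory level and %% matches recursively.

```python
import json

# Category names and paths are illustrative assumptions.
fim_config = {
    "file_paths": {
        "etc": ["/etc/%%"],               # everything under /etc, recursively
        "ssh_keys": ["/home/%/.ssh/%%"],  # each user's .ssh directory
    },
    "exclude_paths": {
        # Keyed by the same category name as in file_paths.
        "etc": ["/etc/noisy_directory/%%"],
    },
    # Linux only: categories that should also record "access" events.
    "file_accesses": ["ssh_keys"],
    "schedule": {
        "file_events_sample": {
            "query": "SELECT target_path, category, action, time FROM file_events;",
            "interval": 300,
        }
    },
}

# Sanity check: every referenced category must be defined in file_paths.
referenced = set(fim_config["exclude_paths"]) | set(fim_config["file_accesses"])
undefined_categories = referenced - set(fim_config["file_paths"])

fim_config_json = json.dumps(fim_config, indent=2)
print(fim_config_json)
```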

FIM on macOS

To turn file_events on for macOS, use the flag --enable_file_events=true. The corresponding utility is FSEvents.

macOS also has an es_process_file_events table that uses the EndpointSecurity API. To use this table, osquery needs the Full Disk Access permission, which can be granted manually or via MDM. Then set the flags --disable_endpointsecurity=false --disable_endpointsecurity_fim=false.

es_process_file_events records which processes accessed which files, whereas file_events does not. However, es_process_file_events will generate more data volume because it captures everything by default. Currently, you can configure EndpointSecurity to ignore certain file paths, but there is no way to configure it to only watch certain filepaths.

Due to the data volume, Fleet suggests using file_events for macOS, but you can use es_process_file_events if you need to know which process accessed a file.

FIM on Linux

To turn file_events on for Linux, use the flag --enable_file_events=true. The corresponding utility is inotify.

Linux has a process_file_events table that uses the audit framework. To use this table, use the flags --disable_audit=false --audit_allow_fim_events=true.

Fleet recommends using the process_file_events table since it also includes data for which process accessed which file.

FIM on Windows

For Windows, use the --enable_ntfs_event_publisher=true flag to turn on ntfs_journal_events. The corresponding utility is the NTFS journal.

Learn more

Read the osquery FIM docs for more information on file integrity monitoring with osquery.

Process auditing

Process auditing refers to recording process executions and network, or socket, connections.

Process auditing on Linux

On Linux, there are two utilities that enable osquery process auditing: eBPF and the audit framework.

The choice of utility depends on your situation. Here are some factors to consider:

  • Audit is supported on older kernels (>2.6) than eBPF (>4.18).
  • Only one consumer of audit’s logs is allowed at a time. The --audit_persist=true flag tells osquery to retry connecting to the audit logs.
  • Audit has limited visibility inside containers.
  • The audit table and the eBPF table return slightly different data.

We recommend that you try both and compare results for your use case.

To use the bpf_process_events and bpf_socket_events tables, use the flag --enable_bpf_events=true. See the instructions on auditing using bpf for more information.

To use process_events and socket_events with the audit framework, use the flags --disable_audit=false --audit_allow_process_events=true --audit_allow_socket_events=true. See the instructions on using audit for more information.
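The two Linux options can be summarized in a small helper that returns the flags for a chosen backend. The flags and table names are the ones listed above, plus --disable_events=false to turn on eventing. The function name and the backend labels are hypothetical.

```python
# Flags and tables as described in this guide; the helper itself is a sketch.
BASE_FLAGS = ["--disable_events=false"]

BACKEND_FLAGS = {
    "audit": [
        "--disable_audit=false",
        "--audit_allow_process_events=true",
        "--audit_allow_socket_events=true",
    ],
    "bpf": ["--enable_bpf_events=true"],
}

BACKEND_TABLES = {
    "audit": ["process_events", "socket_events"],
    "bpf": ["bpf_process_events", "bpf_socket_events"],
}


def linux_process_auditing_flags(backend):
    """Return the osquery flags for the chosen Linux auditing backend."""
    if backend not in BACKEND_FLAGS:
        raise ValueError(f"unknown backend: {backend!r}")
    return BASE_FLAGS + BACKEND_FLAGS[backend]


for backend in ("audit", "bpf"):
    print(backend, BACKEND_TABLES[backend], linux_process_auditing_flags(backend))
```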

Process auditing on macOS

On macOS, there are two utilities that enable osquery process auditing: OpenBSM and EndpointSecurity. Fleet recommends using the EndpointSecurity implementation because it's intended to replace OpenBSM, which is deprecated. EndpointSecurity is available starting with macOS 10.15.

To use the es_process_events table, use the flag --disable_endpointsecurity=false. See the EndpointSecurity instructions for more information. To use process_events and socket_events with OpenBSM, see the OpenBSM instructions.

Process auditing on Windows

Fleet supports auditing process events on Windows via the process_etw_events table. To learn more about process auditing on Windows, visit Microsoft's security auditing overview. Fleet is tracking work to add file auditing for Windows in osquery. Stay up to date on GitHub.

YARA scanning

YARA is a malware research and detection tool available on Linux and macOS that allows users to create descriptions of malware families based on patterns of text or binary code. Each potential piece of malware is matched against a YARA rule and triggers if the specified conditions are met.

Osquery applies pre-specified YARA rules to incoming events in the file_events table to populate the yara_events table. As such, it requires the following flags:

  • --disable_events=false
  • --enable_file_events=true

With the appropriate flags set, specify the appropriate YARA rules in the osquery configuration using the format described in the YARA configuration doc.
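As a sketch of the shape this configuration usually takes, the example below pairs a file_paths category with a group of YARA signature files. The category name, signature group name, and file locations are all hypothetical. Consult the YARA configuration doc for the authoritative format.

```python
import json

# All names and paths here are hypothetical placeholders.
yara_config = {
    # FIM category whose file events should be scanned.
    "file_paths": {
        "downloads": ["/home/%/Downloads/%%"],
    },
    "yara": {
        # Named groups of YARA rule files.
        "signatures": {
            "example_rules": ["/etc/osquery/yara/example.yar"],
        },
        # Which signature groups apply to which file_paths category.
        "file_paths": {
            "downloads": ["example_rules"],
        },
    },
}

# Sanity checks: YARA categories and signature groups must be defined.
unknown_categories = set(yara_config["yara"]["file_paths"]) - set(yara_config["file_paths"])
unknown_groups = {
    group
    for groups in yara_config["yara"]["file_paths"].values()
    for group in groups
} - set(yara_config["yara"]["signatures"])

yara_config_json = json.dumps(yara_config, indent=2)
print(yara_config_json)
```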

Common event tables

These event tables are available in osquery. We will provide more information for them in other guides.

| Table name | OS | Flags |
| --- | --- | --- |
| apparmor_events | Linux | --audit_allow_apparmor_events=true |
| disk_events | macOS | no additional flags needed |
| hardware_events | macOS, Linux | no additional flags needed |
| seccomp_events | Linux | --audit_allow_seccomp_events |
| selinux_events | Linux | --audit_allow_selinux_events=true |
| syslog_events | Linux | no additional flags needed |
| user_interaction_events | macOS | --enable_keyboard_events=true --enable_mouse_events=true |
| user_events | Linux | --audit_allow_user_events=true |
| windows_events | Windows | --enable_windows_events_publisher=true |
| powershell_events | Windows | --enable_powershell_events_subscriber=true |