OSQuery: Explore your OS with SQL

Key Takeaways

OSQuery, released by Facebook, allows users to inspect the current state of their OS X or Linux operating system using SQL queries. The software is open source and works on CentOS, Ubuntu, and OS X.
OSQuery pretends to be a relational database and contains tables that expose the operating system data in a queryable manner. This simplifies the process of identifying and resolving system issues, such as a taken port or a dead instance of a program.
The software comes with an interactive console for experimenting with queries (osqueryi) and a daemon that can be scheduled to run regularly and aggregate data across monitored machines (osqueryd). It also includes a guide on creating your own tables if needed.
OSQuery provides a default Vagrant configuration for building the package. The installation process involves building the package manually and then installing it from a local location, rather than a remote repository. Once installed, it allows the user to query various types of system information, such as running processes, loaded kernel modules, open network connections, browser plugins, hardware events, and file hashes.

If the title sounds like a confusing hoax, that’s understandable – but it’s very, very real. In an announcement on October 30th, Facebook released OSQuery – a new way to inspect the current state of your OS X or Linux operating system by writing SQL queries.

At first, this might sound weird and your gut reaction might be a resounding “Why?!”, but upon further inspection, the useful aspects become obvious. In this post, I’ll tell you why it might be useful for you, show you how to install it, and guide you through some example queries on a prepared Vagrant box you can use if you’re not currently running OS X or Linux.

What is it?

I won’t regurgitate their announcement post – for implementation details see there. In a nutshell, OSQuery pretends to be a relational database and contains some “tables” (tables in quotes because they don’t actually exist as tables you’re used to in, for example, MySQL) which expose the OS data in a manner that makes it queryable by SQL statements (yes, including joins and the whole lot!).

If you ever ran into a situation where you couldn’t run Apache because a port was already taken and you had to go and grep the process list, only to find out a dead instance of Skype is hogging port 80, you’ll know to appreciate the simplicity of OSQuery.
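To make the port scenario concrete, here is a minimal sketch that rehearses the kind of join you would write. It does not touch your real OS: it uses Python's built-in sqlite3 module (OSQuery's SQL dialect comes from SQLite) with two hand-made tables. The table and column names are modelled on OSQuery's processes and listening_ports tables, and the rows are invented, so check .tables on your own install before relying on them.

```python
import sqlite3

# Stand-in data: OSQuery would fill these tables from the live OS.
# Here we fake two tiny versions so the query can run anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE processes (pid INTEGER, name TEXT, path TEXT);
    CREATE TABLE listening_ports (pid INTEGER, port INTEGER, address TEXT);
    INSERT INTO processes VALUES (812, 'skype', '/usr/bin/skype');
    INSERT INTO processes VALUES (977, 'sshd', '/usr/sbin/sshd');
    INSERT INTO listening_ports VALUES (812, 80, '0.0.0.0');
    INSERT INTO listening_ports VALUES (977, 22, '0.0.0.0');
""")

# Join the listening sockets to the processes that own them.
PORT_OWNER_SQL = """
    SELECT p.pid, p.name, p.path
    FROM listening_ports AS l
    JOIN processes AS p ON p.pid = l.pid
    WHERE l.port = ?;
"""


def port_owner(port):
    """Return (pid, name, path) rows for whatever is listening on `port`."""
    return conn.execute(PORT_OWNER_SQL, (port,)).fetchall()


print(port_owner(80))
```

In the interactive console you would type the SELECT directly and put the port number in place of the ? placeholder.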

OSQuery works on CentOS, Ubuntu, and OS X, thus supporting your production servers, your development playbox, and the operating systems of any other machine you have access to, like your children’s or your employees’, which lets you monitor the OS status of your entire ecosystem. It’s fully open source, and there’s even a guide on creating your own tables, in case some are missing and you need them. The team is adding new tables regularly, so even if you don’t feel like contributing but still want to use some missing ones, there’s a high chance they’ll pop up if you give it some time.

The software is installed via (currently) self-built packages for all supported operating systems, and comes with osqueryi – an interactive console for playing around with the queries – and osqueryd – a daemon you can schedule to run regularly and aggregate data across monitored machines, for example. The documentation is very good, so conquering every aspect of OSQuery is as simple as dedicating an afternoon to it.

Installing and Using OSQuery

OSQuery provides a default Vagrant configuration you can use to build the package, which you’ll eventually distribute across all the other machines you’d like it installed on. If you’re not familiar with Vagrant, and you really should be, see our other posts on the topic.

The installation process is somewhat convoluted if you’ve never used VMs, so let’s break it down. Let’s imagine we have an Ubuntu 14.04 machine onto which we’d like to install OSQuery. Typically, you install software via a package manager such as APT by issuing a command like apt-get install. However, since OSQuery is not in the official repos for these types of distributions yet, we’ll need to build the package manually, and then install it from a local location (by copying a .deb file onto the target machine), rather than from a remote repository as usual. This might sound more complicated than it really is, so let’s do the step by step dance.

1. Clone and Up the OSQuery box

Make sure you have Git, Vagrant and VirtualBox installed on your main machine, and execute the following:



git clone https://github.com/facebook/osquery

cd osquery

vagrant up ubuntu14

If your copy of Vagrant has an Ubuntu14 image downloaded from before, you should be up and running in a minute tops. Otherwise, it’ll download the image which might take a while, and then create the virtual machine.
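If vagrant up fails right away, a missing prerequisite is a likely culprit. The sketch below checks that the three tools this step depends on can be found on your PATH. It assumes the executables are called git, vagrant and VBoxManage (VirtualBox's command line tool), and it only checks for their presence, not their versions.

```python
import shutil

# Executables the walkthrough relies on; VBoxManage ships with VirtualBox.
REQUIRED_TOOLS = ("git", "vagrant", "VBoxManage")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Install these before running vagrant up:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```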

2. Build in the Virtual Environment

SSH into your VM with vagrant ssh. In our case, that’ll be:


vagrant ssh ubuntu14

Once inside, execute:



sudo su

cd /vagrant

./tools/provision.sh

This will update the Ubuntu instance and download everything OSQuery needs to build itself.

Note that if you’re on Windows, the famous symlink error will rear its ugly head again. Just re-run the provision script after it fails to complete, and it should work. This is a strange hiccup that warrants further investigation, and I’ll post back if I find any real workarounds or if the issue is fixed.

Once provisioning completes, we build OSQuery and then tell it to wrap itself into an installable package:



make

make package

You should then be able to see the package in /vagrant/build/linux/osquery-0.0.1-trusty.amd64.deb.

3. Installing OSQuery

To install this, we can use dpkg, Debian’s package management tool. Run the following from the directory that contains the package:


sudo dpkg -i osquery-0.0.1-trusty.amd64.deb

Installing it into any of your Ubuntu 14.04 machines is now as simple as copying the .deb file over, and running the above command. We can even install it into the very OS that built it.
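If you have more than a couple of machines, the copy-and-install routine is easy to script. The following is only a sketch: the host names are hypothetical, it assumes you have SSH key access and passwordless sudo on each target, and it prints the commands as a dry run rather than executing them.

```python
import shlex

PACKAGE = "osquery-0.0.1-trusty.amd64.deb"  # file name produced by the build step


def install_commands(host, package=PACKAGE, remote_dir="/tmp"):
    """Return the two commands that copy the package to `host` and install it there."""
    remote_path = f"{remote_dir}/{package}"
    return [
        ["scp", package, f"{host}:{remote_path}"],
        ["ssh", host, f"sudo dpkg -i {shlex.quote(remote_path)}"],
    ]


# Hypothetical host names; replace them with your own machines.
HOSTS = ["web1.example.com", "web2.example.com"]

for host in HOSTS:
    for command in install_commands(host):
        # Dry run: print the command instead of executing it.
        # To run it for real, pass `command` to subprocess.run(command, check=True).
        print(shlex.join(command))
```

Printing first lets you eyeball exactly what would be run on each machine before you let the script loose.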

If you need packages for other operating systems, the procedure is exactly the same with minimal alterations – just follow the instructions.

4. Using OSQuery

Let’s see if it works. Enter the interactive console by executing osqueryi. You should land at an interactive prompt that accepts SQL statements.

Now for a test query. Paste the following into the console and execute it:


SELECT * FROM users;

You should see a table of results with one row per user account on the system.

You can list all available tables by executing .tables, list all commands with .help, and exit with .exit.
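The console is handy for exploring, but you may also want query results inside a script. The sketch below shells out to osqueryi and parses its output. It assumes your build accepts a query as a command line argument together with a --json flag, which recent releases do; I have not verified this against the 0.0.1 package built above, so treat the flag as an assumption. The helper names are my own.

```python
import json
import shutil
import subprocess


def build_command(sql):
    """Command line for a one-shot query with JSON output."""
    # Assumption: the installed release supports `osqueryi --json "<sql>"`.
    return ["osqueryi", "--json", sql]


def parse_rows(stdout):
    """Turn JSON output (a list of objects) into a list of Python dicts."""
    return json.loads(stdout)


def run_query(sql):
    """Run `sql` through osqueryi and return the rows; raises if osqueryi fails."""
    result = subprocess.run(
        build_command(sql), capture_output=True, text=True, check=True
    )
    return parse_rows(result.stdout)


# Only attempt a real query when osqueryi is actually installed.
if __name__ == "__main__" and shutil.which("osqueryi"):
    for row in run_query("SELECT username, uid FROM users;"):
        print(row["username"], row["uid"])
```

Values come back as strings in the JSON output, so convert them yourself if you need numbers.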

Malicious Actors Example

As per their announcement post, the query:


SELECT name, path, pid FROM processes WHERE on_disk = 0;

lists all running processes whose original binary no longer exists on disk. Launching a process and then deleting its binary is a common tactic of malicious actors, so if your system isn’t compromised, this query shouldn’t return anything.

All Users with Groups Example


SELECT u.uid, u.gid, u.username, g.name, u.description FROM users u LEFT JOIN groups g ON (u.gid = g.gid);

The above query will output all the users of the OS with their IDs, their groups and group names, and their descriptions.

Find all empty groups


SELECT groups.gid, groups.name FROM groups LEFT JOIN users ON (groups.gid = users.gid) WHERE users.uid IS NULL;

This query finds all the groups on the system that no user has as a primary group. Because it joins on users.gid only, supplementary group memberships are not taken into account.

These are all very simple examples, but you can already see how interaction between tables can reveal interesting information quickly and efficiently.
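If you’d like to see how the two LEFT JOIN queries behave before running them on a real machine, here is a small rehearsal using Python's built-in sqlite3 module. The users and groups tables below are hand-made stand-ins with invented rows, holding only the columns the queries use. I added an ORDER BY to the first query so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (uid INTEGER, gid INTEGER, username TEXT, description TEXT);
    CREATE TABLE groups (gid INTEGER, name TEXT);
    INSERT INTO users VALUES (0, 0, 'root', 'root');
    INSERT INTO users VALUES (1000, 1000, 'vagrant', 'Vagrant user');
    INSERT INTO groups VALUES (0, 'root');
    INSERT INTO groups VALUES (1000, 'vagrant');
    INSERT INTO groups VALUES (44, 'video');
""")

# Every user, with the name of the group matching the user's gid.
USERS_WITH_GROUPS = """
    SELECT u.uid, u.gid, u.username, g.name, u.description
    FROM users u LEFT JOIN groups g ON (u.gid = g.gid)
    ORDER BY u.uid;
"""

# Groups for which the LEFT JOIN found no matching user row.
EMPTY_GROUPS = """
    SELECT groups.gid, groups.name
    FROM groups LEFT JOIN users ON (groups.gid = users.gid)
    WHERE users.uid IS NULL;
"""

users_with_groups = conn.execute(USERS_WITH_GROUPS).fetchall()
empty_groups = conn.execute(EMPTY_GROUPS).fetchall()

print(users_with_groups)
print(empty_groups)
```

The video group is the only one without a matching user row, so it is the only one the second query reports.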

Conclusion

OSQuery is Facebook’s latest open source wonder – a way to expose the system level data with a relational-database-like API that lets us query our OS as if it were a pile of relational data. While useful for monitoring a server or a cluster of servers, this definitely has other applications as well – from malware detection to zombie process kills, you name it.

Have you thought of any unique uses? Want to write about them? Get in touch!

Frequently Asked Questions (FAQs) about OSQuery

What is the primary function of OSQuery?

OSQuery is an open-source tool developed by Facebook that allows you to query your operating system as if it were a relational database. It provides a SQL interface for querying various types of system information, such as running processes, loaded kernel modules, open network connections, browser plugins, hardware events, and file hashes. This makes it easier to perform system monitoring, diagnose problems, and ensure compliance with security policies.

How does OSQuery differ from traditional monitoring tools?

Traditional monitoring tools often focus on specific aspects of system health, such as CPU usage, disk space, or network traffic. OSQuery, on the other hand, provides a more holistic view of system state by allowing you to query virtually any type of system information. This makes it a powerful tool for system administrators, security analysts, and developers who need to understand the state of a system at a given point in time.

How can I install OSQuery on my system?

OSQuery can be installed on various operating systems, including Linux, macOS, and Windows. The installation process varies depending on the operating system. For Linux, you can typically install OSQuery using the package manager for your distribution. For macOS, you can use Homebrew, and for Windows, you can download the OSQuery installer from the official website.

How can I use OSQuery to monitor system processes?

OSQuery allows you to query the running processes on your system using the “processes” table. You can use SQL queries to filter the processes based on various criteria, such as the process name, user ID, or CPU usage. This can be useful for identifying resource-hungry processes, detecting unauthorized processes, or troubleshooting system performance issues.

Can OSQuery be used for security monitoring?

Yes, OSQuery is a powerful tool for security monitoring. It can be used to detect changes in system state that may indicate a security breach, such as the creation of new user accounts, changes to system files, or the installation of unknown software. By regularly querying system state and comparing it against a known good baseline, you can identify potential security threats before they cause damage.
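As an illustration of the baseline idea, the sketch below compares two snapshots of query results and reports what was added or removed. The snapshots are invented lists of row dictionaries, shaped like the JSON rows you might save from a query on the users table. The function name and the choice of username as the matching key are my own assumptions, not part of OSQuery.

```python
def diff_snapshots(baseline, current, key):
    """Compare two lists of row dicts, matched on `key`; return (added, removed)."""
    old = {row[key]: row for row in baseline}
    new = {row[key]: row for row in current}
    added = [new[k] for k in sorted(new.keys() - old.keys())]
    removed = [old[k] for k in sorted(old.keys() - new.keys())]
    return added, removed


# Invented snapshots: the second one contains an account the baseline lacks.
baseline = [
    {"username": "root", "uid": "0"},
    {"username": "vagrant", "uid": "1000"},
]
current = [
    {"username": "root", "uid": "0"},
    {"username": "vagrant", "uid": "1000"},
    {"username": "intruder", "uid": "1001"},
]

added, removed = diff_snapshots(baseline, current, key="username")
print("new accounts:", added)
print("deleted accounts:", removed)
```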

How can I schedule regular queries with OSQuery?

OSQuery includes a feature called “scheduled queries” that allows you to run queries at regular intervals and log the results. This can be useful for monitoring system state over time, detecting changes that may indicate a problem, or collecting data for analysis.
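To give a rough idea of what a schedule looks like, the sketch below generates a small configuration as JSON. The layout (a top-level schedule object whose entries each have a query and an interval in seconds) follows the format documented for recent OSQuery releases; older versions may use different keys, so check the documentation for the version you built. The entry names and intervals are made up for illustration.

```python
import json


def schedule_entry(query, interval_seconds):
    """One scheduled query: the SQL to run and how often to run it, in seconds."""
    return {"query": query, "interval": interval_seconds}


# Hypothetical entry names; the queries are the ones used earlier in this post.
config = {
    "schedule": {
        "processes_without_binary": schedule_entry(
            "SELECT name, path, pid FROM processes WHERE on_disk = 0;", 300
        ),
        "user_accounts": schedule_entry("SELECT * FROM users;", 3600),
    }
}

config_text = json.dumps(config, indent=2)
print(config_text)
```

You would save the printed JSON as the daemon's configuration file; where that file lives depends on your platform and version.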

Can OSQuery be used in a networked environment?

Yes, OSQuery can be used in a networked environment. It includes features for distributed query execution, which allows you to run queries on multiple machines from a central location. This can be useful for managing large fleets of machines, where manually querying each machine would be impractical.

What types of data can OSQuery query?

OSQuery can query a wide range of system information, including hardware events, running processes, loaded kernel modules, open network connections, browser plugins, and file hashes. It can also query information about the operating system itself, such as the version, build number, and installed patches.

How can I learn more about using OSQuery?

The official OSQuery website includes comprehensive documentation that covers all aspects of using OSQuery, from installation and configuration to query syntax and examples. There are also numerous online tutorials and blog posts that provide practical examples of how to use OSQuery for system monitoring and security analysis.

Is OSQuery suitable for use in a production environment?

Yes, OSQuery is designed to be used in a production environment. It is used by many large organizations, including Facebook, to monitor their systems and ensure compliance with security policies. However, like any tool, it should be used responsibly and in accordance with your organization’s policies and procedures.