OSQuery: Explore your OS with SQL
If the title sounds like a confusing hoax, that’s understandable – but it’s very, very real. In an announcement on October 30th, Facebook released OSQuery – a new way to inspect the current state of your OS X or Linux operating system by writing SQL queries.
At first, this might sound weird and your gut reaction might be a resonating “Why?!”, but upon further inspection, useful aspects become obvious. Let’s see how. In this post, I’ll tell you why it might be useful for you, show you how to install it, and guide you through doing some example queries on a prepared Vagrant box you can use if you’re not currently running OS X or Linux.
What is it?
I won’t regurgitate their announcement post – for implementation details see there. In a nutshell, OSQuery pretends to be a relational database and contains some “tables” (tables in quotes because they don’t actually exist as tables you’re used to in, for example, MySQL) which expose the OS data in a manner that makes it queryable by SQL statements (yes, including joins and the whole lot!).
If you ever ran into a situation where you couldn’t run Apache because a port was already taken and you had to go and grep the process list, only to find out a dead instance of Skype is hogging port 80, you’ll know to appreciate the simplicity of OSQuery.
OSQuery works on CentOS, Ubuntu, and OS X, thus supporting your production servers, your development playbox, and the operating systems of any other machine you have access to, like your children’s or your employees’s – allowing you to use it to monitor the OS status of your entire ecosystem. It’s fully open source, and there’s even a guide on creating your own tables, in case some are missing and you need them. The team is adding new tables regularly, so even if you don’t feel like contributing but still want to use some missing ones, there’s a high chance they’ll pop up if you give it some time.
The software is installed via (currently) self-built packages for all supported operating systems, and comes with osqueryi
– an interactive console for playing around with the queries – and osqueryd
– a daemon you can schedule to run regularly and aggregate data across monitored machines, for example. The documentation is very good, so conquering every aspect of OSQuery is as simple as dedicating an afternoon to it.
Installing and Using OSQuery
OSQuery provides a default Vagrant configuration for you to use for building the package which you’ll eventually distribute across all other machines you’d like it installed on. If you’re not familiar with Vagrant, and you really should be, see our posts on the topic here.
The installation process is somewhat convoluted if you’ve never used VMs, so let’s break it down. Let’s imagine we have an Ubuntu 14.04 machine onto which we’d like to install OSQuery. Typically, you install software via a package manager such as Aptitude by issuing a command like apt-get install
. However, since OSQuery is not in the official repos for these types of distributions yet, we’ll need to build the package manually, and then install it from a local location (by copying a .deb
file onto the target machine), rather than a remote repository as usual. This might sound more complicated than it really is, so let’s do the step by step dance.
1. Clone and Up the OSQuery box
Make sure you have Git, Vagrant and Virtualbox installed on your main machine, and execute the following:
git clone https://github.com/facebook/osquery
cd osquery
vagrant up ubuntu14
If your copy of Vagrant has an Ubuntu14 image downloaded from before, you should be up and running in a minute tops. Otherwise, it’ll download the image which might take a while, and then create the virtual machine.
2. Build in the Virtual Environment
SSH into your VM with vagrant ssh
. In our case, that’ll be
vagrant ssh ubuntu14
Once inside, execute:
sudo su
cd /vagrant
./tools/provision.sh
Note that if you’re on Windows, the famous symlink error will rear its ugly head again. Just re-run the provision script after it fails to complete, and it should work. This is a strange hiccup that warrants further investigation, and I’ll post back if I find any real workarounds or if the issue is fixed.
This will update the Ubuntu instance and download everything OSQuery needs to build itself. Then, we tell it to wrap itself into an installable package.
make
make package
You should then be able to see the package in /vagrant/build/linux/osquery-0.0.1-trusty.amd64.deb
.
3. Installing OSQuery
To install this, we can use the default Debian Package Management System:
sudo dpkg -i osquery-0.0.1-trusty.amd64.deb
Installing it into any of your Ubuntu 14.04 machines is now as simple as copying the .deb
file over, and running the above command. We can even install it into the very OS that built it.
If you need packages for other operating systems, the procedure is exactly the same with minimal alterations – just follow the instructions.
5. Using OSQuery
Let’s see if it works. Enter the interactive console by executing osqueryi
. You should see something like this:
Let’s see a test query. Paste the following into the console and execute it:
SELECT * FROM users;
You should see something like this happen:
You can list all available tables by just executing .tables
, all commands with .help
and you can exit with .exit
.
Malicious Actors Example
As per their announcement post, the query:
SELECT name, path, pid FROM processes WHERE on_disk = 0;
lists all processes of which the binary which launched them no longer exists on disk. Running a process and disappearing is a common approach of malicious actors, and if your system isn’t compromised, it shouldn’t return anything.
All Users with Groups Example
SELECT u.uid, u.gid, u.username, g.name, u.description FROM users u LEFT JOIN groups g ON (u.gid = g. gid);
The above query will output all the users of the OS with their IDs, their groups and group names, and their descriptions.
Find all empty groups
SELECT groups.gid, groups.name FROM groups LEFT JOIN users ON (groups.gid = users.gid) WHERE users.ui d IS NULL;
This query finds all the user groups of the OS that are empty – that no user belongs to.
These are all very simple examples, but you can already see how interaction between tables can reveal interesting information quickly and efficiently.
Conclusion
OSQuery is Facebook’s latest open source wonder – a way to expose the system level data with a relational-database-like API that lets us query our OS as if it were a pile of relational data. While useful for monitoring a server or a cluster of servers, this definitely has other applications as well – from malware detection to zombie process kills, you name it.
Have you thought of any unique uses? Want to write about them? Get in touch!