Setup Guide: Monitor Your Site with Driftbot

Driftbot is a Node.js application that runs as a GitHub Action and uses Puppeteer for headless browser automation. It can be difficult to test the integrity and assess the risk of your application's real-world (i.e. deployed and live) software supply chain. Driftbot helps you do that.

Example Configs

Not sure what to expect? Check out these example sites configured with Driftbot alerts.

Set up Driftbot on your site

Start by cloning the Driftbot GitHub repo.

$ git clone git@github.com:darkbitio/driftbot.git

Then change into the cloned directory and rename the origin remote to upstream.

$ cd driftbot && git remote rename origin upstream

Then create a GitHub repo.

$ gh repo create driftbot --private -y
✓ Created repository username/driftbot on GitHub
✓ Added remote git@github.com:username/driftbot.git

Edit .github/workflows/driftbot.yml and set your site with the SITE_URL environment variable.

SITE_URL: 'https://mysite.com/'

Commit and push your changes to your new repo. The -u flag makes origin (your new repo) the upstream tracking remote for your local branch, so later pushes need no arguments.

$ git add . && \
  git commit -m "updated site URL" && \
  git push -u origin main

Create an empty authorized_hosts.json by copying the sample file.

$ cp authorized_hosts.json.sample authorized_hosts.json

Commit and push your changes back to the repo. This will trigger an initial run of the bot. The first two commits are necessary as GitHub Actions are not enabled on a cloned repo until after the first commit.

$ git add . && \
  git commit -m "added empty authorized hosts file" && \
  git push

Check the Action logs from the check-site job in the GitHub Actions UI. Observe the output at the end of the Run Bot step. This is your initial baseline config. Review the hosts listed and confirm they are approved or expected sources for your site.

-+-+-+-

[info] analysis complete
[warn] script host=github.githubassets.com
[warn] [script-hosts] 1 unauthorized host observed
[warn] xhr host=github.githubassets.com
[warn] [xhr-hosts] 1 unauthorized host observed
[info] [websocket-hosts] no unauthorized hosts observed
[info] [webworker-hosts] no unauthorized hosts observed
[info] [obfuscated-script-hosts] no unauthorized hosts observed
[info] [suspicious-script-hosts] no unauthorized hosts observed

-+-+-+- No baseline config found. Add the following
to `./authorized_hosts.json` to set the current baseline:

{
  "script_hosts": ["github.githubassets.com"],
  "xhr_hosts": ["github.githubassets.com"],
  "websocket_hosts": [],
  "webworker_hosts": [],
  "obfuscated_script_hosts": [],
  "suspicious_script_hosts": []
}

The JSON object at the end is your authorized hosts list. The bot will use this to check for unauthorized hosts during future runs. Copy and paste the JSON output into your local authorized_hosts.json file, overwriting the empty sample.

{
  "script_hosts": ["github.githubassets.com"],
  "xhr_hosts": ["github.githubassets.com"],
  "websocket_hosts": [],
  "webworker_hosts": [],
  "obfuscated_script_hosts": [],
  "suspicious_script_hosts": []
}

Commit the changes and push back to your new repo.

$ git add . && \
  git commit -m "updated authorized hosts list" && \
  git push

The commit will automatically trigger another job run since the authorized_hosts.json file was changed. Review the GitHub Action logs once again and you should see no unauthorized hosts.

[info] analysis complete
[info] [script-hosts] no unauthorized hosts observed
[info] [xhr-hosts] no unauthorized hosts observed
[info] [websocket-hosts] no unauthorized hosts observed
[info] [webworker-hosts] no unauthorized hosts observed
[info] [suspicious-script-hosts] no unauthorized hosts observed
[info] [obfuscated-script-hosts] no unauthorized hosts observed
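
The check behind these log lines can be pictured as a per-category set difference between the hosts observed in a run and the hosts in authorized_hosts.json. The sketch below is illustrative only: findUnauthorizedHosts is a hypothetical helper, not Driftbot's actual code, cdn.example.net is a made-up host, and matching here is exact (no wildcards).

```javascript
// Hypothetical helper, not Driftbot's actual implementation.
// The six categories mirror the keys of authorized_hosts.json.
const CATEGORIES = [
  "script_hosts",
  "xhr_hosts",
  "websocket_hosts",
  "webworker_hosts",
  "obfuscated_script_hosts",
  "suspicious_script_hosts",
];

// For each category, return the observed hosts that are missing from the baseline.
function findUnauthorizedHosts(observed, authorized) {
  const result = {};
  for (const category of CATEGORIES) {
    const allowed = new Set(authorized[category] ?? []);
    const seen = observed[category] ?? [];
    // Deduplicate, then keep only hosts absent from the baseline.
    result[category] = [...new Set(seen)].filter((host) => !allowed.has(host));
  }
  return result;
}

const baseline = {
  script_hosts: ["github.githubassets.com"],
  xhr_hosts: ["github.githubassets.com"],
  websocket_hosts: [],
  webworker_hosts: [],
  obfuscated_script_hosts: [],
  suspicious_script_hosts: [],
};

// A later run in which a new script host appears (made-up example host).
const observedHosts = {
  script_hosts: ["github.githubassets.com", "cdn.example.net"],
  xhr_hosts: ["github.githubassets.com"],
};

const unauthorized = findUnauthorizedHosts(observedHosts, baseline);
console.log(unauthorized.script_hosts); // [ 'cdn.example.net' ]
console.log(unauthorized.xhr_hosts);    // []
```

In this picture, an empty list for every category corresponds to the all-clear log output above.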

By default, Driftbot will run every 4 hours, at 5 minutes past the hour. You can modify this by changing the cron schedule in .github/workflows/driftbot.yml. It is possible to run Driftbot as often as hourly and still stay well under the free GitHub Actions runtime limits.
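
As a rough worked example of the schedule arithmetic: "every 4 hours, at 5 minutes past the hour" is what a cron expression such as 5 */4 * * * would produce (the exact expression in the workflow file is an assumption here), and GitHub Actions evaluates schedules in UTC. The sketch below lists the daily run times and estimates monthly billed minutes. The 2 minutes per run is a placeholder, not a measured value, so substitute the duration you see in your own job logs and compare the result against your plan's quota.

```javascript
// Run times (UTC) for "every N hours, at M minutes past the hour".
function dailyRunTimes(everyNHours, minute) {
  const times = [];
  for (let hour = 0; hour < 24; hour += everyNHours) {
    const hh = String(hour).padStart(2, "0");
    const mm = String(minute).padStart(2, "0");
    times.push(`${hh}:${mm}`);
  }
  return times;
}

// Rough monthly usage. minutesPerRun is an assumption; measure your own runs.
function monthlyMinutes(everyNHours, minutesPerRun, daysInMonth = 31) {
  const runsPerDay = dailyRunTimes(everyNHours, 0).length;
  return runsPerDay * daysInMonth * minutesPerRun;
}

console.log(dailyRunTimes(4, 5));
// [ '00:05', '04:05', '08:05', '12:05', '16:05', '20:05' ]
console.log(monthlyMinutes(4, 2)); // 372  (default schedule, assumed 2 min per run)
console.log(monthlyMinutes(1, 2)); // 1488 (hourly schedule, assumed 2 min per run)
```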

What now?

Driftbot will continue to monitor your site. The bot will open up to 6 GitHub issues for your site, depending on what type of third-party source hosts are detected. You can read about the different types of alerts below in the FAQ. If all of the source hosts observed for a given detection type are later authorized, the corresponding issues will be automatically closed.

Frequently asked questions

Can’t find the answer you’re looking for? Feel free to open an issue if your question isn't answered here.

Why is it called Driftbot?

Software supply chain drift can be the source of major headaches in modern app development & deployment. Drift that happens due to changes in third-party software packages is difficult to detect and can create serious risk. Some recent cases of software supply chain compromises are here, here, and here. Automated tools can help us detect software supply chain drift that could lead to a compromise or data breach.

What is the authorized_hosts.json file for?

This file holds a list of known hosts that have been loaded by your site. The first time Driftbot runs, it will output a list of hosts that you may wish to consider “authorized.” Authorized just means you are aware of it and consider it normal for your site to be communicating with that host.

Can I get Driftbot alerts in Slack?

Yes! Install the GitHub Slack app in your fork of the Driftbot repo and you’ll get Slack alerts when any issues are created.

What does "unauthorized script hosts" mean?

It means your site loaded JavaScript from one or more sources that aren’t authorized.

What does "unauthorized xhr hosts" mean?

It means your site made one or more XMLHttpRequest (XHR) requests to a source that isn’t authorized. XHR requests (sometimes referred to as Ajax requests) happen asynchronously and typically aren’t noticed by users. Malicious code can load data from or send data to a remote server via XHR and likely remain undetected by most users.
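
To make the script and XHR categories more concrete, here is a minimal sketch of grouping request hosts by type. It assumes each record carries the values Puppeteer exposes through request.url() and request.resourceType(). The collectRequestHosts helper, the mapping table, and the example URLs are hypothetical, and Driftbot's own collection logic may differ.

```javascript
// Assumption: an illustrative mapping from Puppeteer resource types to
// authorized_hosts.json categories. Driftbot's real mapping is not shown here.
const TYPE_TO_CATEGORY = {
  script: "script_hosts",
  xhr: "xhr_hosts",
};

// Group the hostnames of recorded requests by category.
function collectRequestHosts(requests) {
  const hosts = { script_hosts: new Set(), xhr_hosts: new Set() };
  for (const { url, resourceType } of requests) {
    const category = TYPE_TO_CATEGORY[resourceType];
    if (!category) continue; // ignore images, stylesheets, and so on
    hosts[category].add(new URL(url).hostname);
  }
  // Convert each Set to a sorted array for stable output.
  return Object.fromEntries(
    Object.entries(hosts).map(([category, set]) => [category, [...set].sort()])
  );
}

// Made-up request records standing in for what a page load might produce.
const recorded = [
  { url: "https://github.githubassets.com/assets/app.js", resourceType: "script" },
  { url: "https://api.example.org/v1/track", resourceType: "xhr" },
  { url: "https://img.example.org/logo.png", resourceType: "image" },
];

console.log(collectRequestHosts(recorded));
// { script_hosts: [ 'github.githubassets.com' ], xhr_hosts: [ 'api.example.org' ] }
```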

What does "unauthorized websocket hosts" mean?

It means your site opened one or more WebSocket connections to a host that isn’t authorized. WebSocket connections are normal on many sites, but a malicious actor can use them as another covert channel to exfiltrate data from a user’s browser to a server under their control.

What does "unauthorized webworker hosts" mean?

It means your site spawned one or more background WebWorkers that made a connection to a host that isn’t authorized. WebWorkers are normal on many sites, but a malicious actor can use them as another covert channel to exfiltrate data from a user’s browser to a server under their control.

What does "unauthorized obfuscated script hosts" mean?

It means your site loaded JavaScript that is heavily obfuscated from a source that isn’t authorized. Obfuscation isn’t dangerous on its own, but hackers and aggressive ad networks often use it as a way to mask the true purpose of their code. Hackers may be trying to steal data from your users. Advertisers may be running code that aggressively profiles user behavior based on their web browser fingerprints. Ad networks have also been targeted by malicious actors who inject malware into ads that are then served on the ad network.

What does "unauthorized suspicious script hosts" mean?

It means your site loaded JavaScript that makes use of suspicious function calls from a source that isn’t authorized. By default, suspicious calls are eval, atob, and btoa. These function calls are often abused by malicious actors trying to obfuscate their code or stage attacks against users’ web browsers. You can authorize script sources you trust to use these calls. You can also modify the list of function calls that are considered suspicious in the .github/workflows/driftbot.yml file.
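
As an illustration of what flagging "suspicious function calls" could look like, the sketch below does a naive text scan for the three default names. This is an assumption for explanation only: findSuspiciousCalls is a hypothetical helper, and Driftbot's real detection may work differently (for example, by parsing the script rather than pattern matching).

```javascript
// The default names come from the FAQ text; the scan itself is a hypothetical sketch.
const DEFAULT_SUSPICIOUS_CALLS = ["eval", "atob", "btoa"];

// Return the names that appear as function calls in the given source text.
// Names are assumed to be plain identifiers (no regex metacharacters).
function findSuspiciousCalls(source, names = DEFAULT_SUSPICIOUS_CALLS) {
  return names.filter((name) => {
    // Match the bare identifier followed by "(": eval( or window.eval(,
    // but not retrieval( or evaluate(.
    const pattern = new RegExp(`(?<![\\w$])${name}\\s*\\(`);
    return pattern.test(source);
  });
}

console.log(findSuspiciousCalls('var x = eval(atob("YQ=="));')); // [ 'eval', 'atob' ]
console.log(findSuspiciousCalls("retrieval(1); evaluate(2);"));  // []
```

A text scan like this can produce false positives (for instance, a call that appears only inside a comment or string), which is one reason the list of trusted sources and the list of suspicious names are both configurable.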

Can I use wildcards like '*' and '?' in authorized_hosts.json?

Yes, certain CDNs and dynamic source hosts may use many different hostnames that follow a predictable pattern. You can use wildcards to cover multiple patterns. For example, you could authorize scripts from any Google subdomain by adding "*.google.com" to the "script_hosts" section.
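
One way to picture the wildcard matching is as a translation of each pattern into an anchored regular expression, with '*' matching any run of characters and '?' matching exactly one. The sketch below is an assumption about the semantics, not Driftbot's actual matcher, and wildcardToRegExp and isAuthorizedHost are hypothetical names.

```javascript
// Convert a wildcard pattern into an anchored, case-insensitive RegExp.
function wildcardToRegExp(pattern) {
  // Escape regex metacharacters except the two wildcards.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  // '*' matches any run of characters, '?' matches exactly one character.
  const source = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp(`^${source}$`, "i");
}

// True if the host matches at least one authorized pattern.
function isAuthorizedHost(host, patterns) {
  return patterns.some((pattern) => wildcardToRegExp(pattern).test(host));
}

const scriptHostPatterns = ["*.google.com", "github.githubassets.com"];

console.log(isAuthorizedHost("apis.google.com", scriptHostPatterns));     // true
console.log(isAuthorizedHost("google.com.evil.net", scriptHostPatterns)); // false
console.log(isAuthorizedHost("google.com", scriptHostPatterns));          // false
```

Under these assumed semantics, "*.google.com" covers subdomains but not the bare google.com, so the apex domain would need its own entry. Broad patterns also authorize every host they match, so keep them as narrow as your site allows.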

There are no unauthorized hosts, but the GitHub Issue didn't close.

The GitHub Action needs to pass twice for the Issue to be closed. The first time the job run passes, a resolved label is applied to the Issue. The next time the job run passes, the Issue will be closed.
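
The two-pass behavior can be sketched as a small state machine. The state names and the nextIssueState function below are hypothetical, and the rule that a failing run sends a resolved Issue back to open is an assumption not stated above.

```javascript
// Hypothetical model of one Issue's lifecycle: "open" -> "resolved" -> "closed".
function nextIssueState(state, runPassed) {
  if (state === "closed") return "closed";
  // Assumption: a failing run drops the resolved label and keeps the Issue open.
  if (!runPassed) return "open";
  // First passing run applies the resolved label; the second closes the Issue.
  return state === "open" ? "resolved" : "closed";
}

let issueState = "open";
issueState = nextIssueState(issueState, true);
console.log(issueState); // resolved
issueState = nextIssueState(issueState, true);
console.log(issueState); // closed
```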

Driftbot opened an issue for my site, but I don't understand why.

The bot has no persistent storage, so it relies solely on the authorized_hosts.json file for its list of known and authorized hosts. If the obfuscated or suspicious script checks are triggering, examine the job run logs for clues about why the source host was flagged. For obfuscated scripts, there is a threshold that determines whether the script is considered “heavily obfuscated.” By default, this threshold is 25%, meaning a script is flagged when more than 25% of its code is obfuscated. This can be adjusted in the .github/workflows/driftbot.yml file.
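
The threshold comparison itself is simple; the sketch below shows the boundary case. How Driftbot measures the obfuscated portion of a script is not described here, so the character counts and the isHeavilyObfuscated function are assumptions for illustration.

```javascript
// Hypothetical check: flag a script when strictly more than thresholdPercent
// of its code is considered obfuscated.
function isHeavilyObfuscated(obfuscatedChars, totalChars, thresholdPercent = 25) {
  if (totalChars === 0) return false; // an empty script has nothing to flag
  const percent = (obfuscatedChars / totalChars) * 100;
  return percent > thresholdPercent;
}

console.log(isHeavilyObfuscated(250, 1000));     // false (exactly 25%)
console.log(isHeavilyObfuscated(251, 1000));     // true
console.log(isHeavilyObfuscated(251, 1000, 40)); // false (raised threshold)
```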

Why does Driftbot run a full Docker container just to run Puppeteer?

The fine folks at browserless.io have built a rock solid Docker image for running Puppeteer. The minor inconvenience of spinning up a Docker container to run headless Chrome is far outweighed by the stability of the browserless.io image.

I don't want to run this myself, is there a paid service somewhere?

No, not yet. If this is something you are interested in, please open a GitHub Issue or reach out on Twitter.