
Better Blocker HTTP Archive

To build the tracker blocking functionality in Better Blocker, we crawl the most popular sites on the Web using Better Inspector.

We examine what they’re loading to identify and block third-party trackers.

We save the information about the sites we’ve visited in a standard format called HAR (HTTP Archive format). Our archive is available for use by researchers (or anyone else who is interested) under a Creative Commons ShareAlike License.
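
For readers who have not worked with HAR before: a HAR file is a JSON document whose top-level `log` object holds a list of `entries`, one per request the page made. The sketch below shows roughly how third-party hosts could be picked out of that structure. The sample data and the `third_party_hosts` helper are hypothetical, and the subdomain check is a deliberately naive stand-in, not a description of how Better Blocker itself identifies trackers.

```python
from urllib.parse import urlsplit

# Hypothetical, hand-written sample in the HAR 1.2 layout (not taken from the archive).
sample_har = {
    "log": {
        "version": "1.2",
        "entries": [
            {"request": {"method": "GET", "url": "https://example.com/"}},
            {"request": {"method": "GET", "url": "https://static.example.com/app.js"}},
            {"request": {"method": "GET", "url": "https://tracker.example.net/pixel.gif"}},
        ],
    }
}


def third_party_hosts(har, site_host):
    """Return the sorted hosts requested in `har` that are neither `site_host` nor a subdomain of it."""
    hosts = set()
    for entry in har["log"]["entries"]:
        host = urlsplit(entry["request"]["url"]).hostname
        if host is None:
            continue  # skip entries without a usable host (e.g. data: URLs)
        if host == site_host or host.endswith("." + site_host):
            continue  # first-party request under this naive rule
        hosts.add(host)
    return sorted(hosts)


print(third_party_hosts(sample_har, "example.com"))  # ['tracker.example.net']
```

A real analysis would likely compare registrable domains (for example via the Public Suffix List) rather than relying on a plain suffix match.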

The archive (~6,800 sites, ~400MB) is available in DAT format at:

dat://archive.better.fyi

(You can also browse this site using the same DAT URL on the peer-to-peer Web via Beaker Browser.)

With love,
Ind.ie.

Usage instructions

To download the archive, you will need to use DAT.

Graphical (easy)

Screenshot: DAT Desktop downloading the Better HTTP Archive (progress at 42%)
  1. Download the DAT Desktop Application (available for Mac and Linux)
  2. Press the Download button.
  3. Enter the DAT address:
    dat://archive.better.fyi

Command line

If you’re having trouble with the DAT Desktop application or if you’re comfortable using the Terminal, you can also use the DAT CLI to get and stay synchronised with the archive (requires Node.js):

  1. Install DAT:
    npm install -g dat
  2. Clone the archive:
    dat clone dat://archive.better.fyi better-http-archive
You will find the HAR files (gzipped) in the better-http-archive directory.

To keep your copy in sync with updates, run the following from your archive directory:
    dat sync
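
Once the archive has been cloned, the gzipped HAR files can be read with nothing more than the Python standard library. The following is a minimal sketch, not an official tool: it assumes the files in the better-http-archive directory carry a `.gz` extension and follow the standard HAR layout (`log.entries`), and the helper names `load_har` and `count_requests` are made up for illustration.

```python
import gzip
import json
from pathlib import Path


def load_har(path):
    """Decompress a gzipped HAR file and parse it as JSON."""
    # utf-8-sig reads plain UTF-8 and also tolerates a leading byte-order mark.
    with gzip.open(path, "rt", encoding="utf-8-sig") as handle:
        return json.load(handle)


def count_requests(directory):
    """Map each gzipped file name under `directory` to its number of recorded requests."""
    counts = {}
    # Assumption: the HAR files end in .gz; adjust the pattern if they do not.
    for path in sorted(Path(directory).rglob("*.gz")):
        har = load_har(path)
        counts[path.name] = len(har["log"]["entries"])
    return counts


if __name__ == "__main__":
    for name, total in count_requests("better-http-archive").items():
        print(f"{name}: {total} requests")
```

Run it from the directory that contains better-http-archive. With roughly 6,800 sites in the archive, expect a full pass to take a little while.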