FracFocus Chemical Database Download
Please familiarize yourself with the metadata on this page before downloading:
This dataset contains information extracted from PDF files hosted on FracFocus.org: voluntary disclosure reports submitted by oil and gas drilling operators about the chemicals they used in hydraulic fracturing operations across the United States. The dataset represents all of the information included in each individual report, as submitted by industry operators. We have added one column that indicates whether a Chemical Abstracts Service number (CAS #) is valid (the checksum is correct), invalid (the number appears to be a CAS # but the checksum is wrong), or other (blank, NULL, or something else that does not resemble a CAS #). For all valid CAS numbers we run a script that standardizes the spacing and punctuation. We record when we first obtained the report from FracFocus and assign it a unique ID, but otherwise all the data in our database appear exactly as they are recorded on FracFocus.org: typos, "trade secrets," and all.
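To illustrate the valid / invalid / other classification described above, here is a minimal Python sketch of a CAS checksum test. It is not the script used to build this database. The function name, the regular expression that decides what "resembles" a CAS #, and the choice to return a normalized string only for valid numbers are all assumptions made for the example. The checksum rule is the standard one for CAS Registry Numbers: each digit before the check digit is weighted by its position counted from the right, and the sum modulo 10 must equal the check digit.

```python
import re

# Assumed notion of "resembles a CAS #": 2-7 digits, 2 digits, 1 check digit,
# separated by hyphens, with optional stray spaces around the hyphens.
_CAS_LIKE = re.compile(r"^\s*(\d{2,7})\s*-\s*(\d{2})\s*-\s*(\d)\s*$")


def classify_cas(raw):
    """Return (status, normalized): status is 'valid', 'invalid', or 'other'.

    normalized is the CAS # with standard hyphenation for valid numbers,
    and None otherwise.
    """
    if raw is None:
        return ("other", None)
    match = _CAS_LIKE.match(raw)
    if match is None:
        return ("other", None)

    body = match.group(1) + match.group(2)
    check_digit = int(match.group(3))

    # Weight the digits 1, 2, 3, ... starting from the rightmost body digit.
    total = sum(int(d) * w for w, d in enumerate(reversed(body), start=1))
    if total % 10 != check_digit:
        return ("invalid", None)

    normalized = f"{match.group(1)}-{match.group(2)}-{match.group(3)}"
    return ("valid", normalized)
```

For example, classify_cas(" 7732 - 18 - 5 ") treats the water CAS # as valid and tidies its spacing, while a string such as "Trade Secret" falls into the "other" category.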
This database is provided free of charge for research and educational purposes. If you use this database to produce a published work, all we ask is that you acknowledge SkyTruth in your website/blogpost/report/paper.
We refresh these datasets monthly and make them available for download from the links below. Because the database is large and changes frequently, we offer it as large zipped files in tab-separated value (.txt) format. We will update the current-year links on the tenth of each month with a dataset that encompasses all reports submitted to FracFocus.org between January 1 and the last day of the preceding month. Prior years' data are archived by year. Be aware of the following issues with the data:
CAVEAT 1: Many operators agreed to submit reports retroactively to January 2011, but some have submitted even older reports. This means that the dataset contains some records for fracks occurring prior to Jan. 1, 2011, but the data generally cover fracks from Jan. 1, 2011 to within a few months of the present (see CAVEAT 2).
CAVEAT 2: There is a lag between when a frack occurs and when it is reported, so the dataset updated on Dec. 31, 2012 is unlikely to have more than a few reports for the month of December in it. In subsequent updates, however, December 2012 will become more populated. Since Sept. 2012, we have checked the FracFocus website daily, and the field "Published" records when the report first appeared on the FracFocus website. This allows research on the timeliness of reporting, but we have not attempted to quantify how long the lag is or how it varies by state depending on disclosure laws or the lack thereof.
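For readers who want to explore the reporting lag themselves, the following Python sketch shows one possible way to compute it. Only the field name "Published" comes from this page. The "fracture_date" field name, the assumption that both values are plain ISO dates (YYYY-MM-DD), and the three sample rows are invented for illustration and should be checked against the actual files.

```python
from datetime import date
from statistics import median


def reporting_lag_days(frack_date, published):
    """Days between the frack and the report's first appearance on FracFocus.

    Assumes both arguments are ISO date strings such as '2012-10-01'.
    """
    return (date.fromisoformat(published) - date.fromisoformat(frack_date)).days


# Invented sample rows; "fracture_date" is a hypothetical column name.
rows = [
    {"fracture_date": "2012-10-01", "Published": "2012-11-15"},
    {"fracture_date": "2012-12-20", "Published": "2013-01-09"},
    {"fracture_date": "2012-09-15", "Published": "2012-09-30"},
]

lags = [reporting_lag_days(r["fracture_date"], r["Published"]) for r in rows]
median_lag = median(lags)
```

Because "Published" has only been recorded since Sept. 2012, a calculation like this is meaningful only for reports first seen after that date.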
We are making three datasets available (Reports, Chemicals, and Blended), all in a zipped, tab-separated format [.txt.zip].
Reports Data - One record for each disclosure report, listing the date, location, operator, total volume of fracking fluid, etc. for that individual frack.
Chemicals Data - One record for every chemical that was disclosed. Can be joined to the Reports data on the "pdf_sequid" field.
Blended Data - The largest of the three datasets, this combines the Reports and Chemicals datasets. Convenient if you want everything in one place, but it contains a lot of redundancy.
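If you prefer to build your own blended view from the Reports and Chemicals files, the join on "pdf_sequid" can be sketched in Python as follows. This is an illustration only, not the process used to produce the Blended dataset. The function name, the "operator" and "chemical_name" columns, and the sample rows are hypothetical; only the "pdf_sequid" key comes from this page. The sketch assumes the Reports data fit in memory, which is more plausible for Reports than for Chemicals.

```python
import csv
import io


def join_on_report_id(reports_tsv, chemicals_tsv, key="pdf_sequid"):
    """Yield one combined record per chemical row that has a matching report."""
    # Index the (smaller) reports data by report ID.
    reports = {row[key]: row for row in csv.DictReader(reports_tsv, delimiter="\t")}
    # Stream the chemicals data and attach the matching report fields.
    for chem in csv.DictReader(chemicals_tsv, delimiter="\t"):
        report = reports.get(chem[key])
        if report is not None:
            yield {**report, **chem}


# Invented sample data standing in for the unzipped .txt files.
reports_file = io.StringIO(
    "pdf_sequid\toperator\n1\tExample Operator A\n2\tExample Operator B\n"
)
chemicals_file = io.StringIO(
    "pdf_sequid\tchemical_name\n1\tWater\n1\tMethanol\n2\tWater\n"
)

blended = list(join_on_report_id(reports_file, chemicals_file))
```

With real files, the two io.StringIO objects would be replaced by files opened in text mode with newline="".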
WARNING: The Chemicals and Blended datasets for a full year or more can contain over 1,000,000 records, which can exceed what Microsoft Excel can handle (a worksheet is limited to 1,048,576 rows). You must use Microsoft Access or other database management software to work with the Chemicals or Blended dataset.
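As one alternative to Microsoft Access, a zipped tab-separated file can be streamed into a SQLite database with Python's standard library. The sketch below makes several assumptions that should be verified against the real downloads: each zip holds a single .txt file, the text is UTF-8, fields are not wrapped in quotes, and every row has the same number of fields as the header. The function name, table name, and sample data are hypothetical.

```python
import csv
import io
import sqlite3
import zipfile


def load_zipped_tsv(zip_source, db, table):
    """Stream the first file in a zip archive into a new SQLite table."""
    with zipfile.ZipFile(zip_source) as archive:
        member = archive.namelist()[0]  # assumes one data file per archive
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            # QUOTE_NONE: treat quote characters in the data as ordinary text.
            reader = csv.reader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            header = next(reader)
            columns = ", ".join(f'"{name}"' for name in header)
            placeholders = ", ".join("?" for _ in header)
            db.execute(f'CREATE TABLE "{table}" ({columns})')
            # Rows are read and inserted one at a time, never all in memory.
            db.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    db.commit()


# Invented in-memory archive standing in for a downloaded .txt.zip file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "chemicals.txt", "pdf_sequid\tchemical_name\n1\tWater\n1\tMethanol\n"
    )
buffer.seek(0)

db = sqlite3.connect(":memory:")
load_zipped_tsv(buffer, db, "chemicals")
row_count = db.execute('SELECT COUNT(*) FROM "chemicals"').fetchone()[0]
```

For a real download, pass the path to the .txt.zip file instead of the in-memory buffer and connect to a file on disk rather than ":memory:" so the loaded table persists.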
SkyTruth Fracking Chemical Database
All reports submitted to FracFocus through last month
Reports published in 2013 (through the latest complete month)
Reports published in 2011 - 2012