Launching PFT Extractor

We started with a simple-sounding problem: can we automate the interpretation of pulmonary function tests? We ended up needing to solve a few other problems first.

At Automate Medical we started working together on a simple-sounding problem: can we automate the interpretation of pulmonary function tests? We read the textbooks. We spoke to everyone we could. We traced the development of standards by the American Thoracic Society. We implemented reference value equations from papers and built training tools.

But automating medical tests like PFTs turned out to be anything but simple. As we’ve written previously, one of the biggest problems in health care is the interoperability of records and diagnostic reports like PFTs. We needed to build software to bridge the gap. That’s why we’re so excited to announce the availability of PFT Extractor today, our new SDK for pulmonary function tests.

PFT Extractor converts pulmonary function test reports from "digital" but flat PDFs into coded medical concepts, delivered as a FHIR Bundle in JSON (a DiagnosticReport plus a set of Observations). It is not automation yet, but it is an important step towards computability.

How does it work?

First, we OCR the original document. Next, we convert it to a text stream and apply a sequence of domain-specific filters, one for each kind of label you expect to find in a PFT, which we developed from manufacturer sample PFTs (data reference). We return a list of the observations we were able to extract, along with the coded medical concept (LOINC) associated with each observation.
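The filtering step can be pictured with a minimal Python sketch. It is not PFT Extractor's implementation: the label table, the row pattern, the `extract_observations` name, and the sample text are all hypothetical, and it assumes the first number on a row is the measured value. The two LOINC codes are the ones commonly cited for FVC and FEV1 and should be checked against the LOINC database before use.

```python
import re
from dataclasses import dataclass

# Hypothetical label table: report label -> (LOINC code, unit).
# Codes are illustrative and should be verified against LOINC.
LABEL_TO_LOINC = {
    "FVC": ("19868-9", "L"),
    "FEV1": ("20150-9", "L"),
}

# A row is assumed to look like: LABEL (unit)   measured   predicted   %pred
ROW = re.compile(
    r"^\s*(?P<label>[A-Za-z0-9/]+)\s*(?:\([^)]*\))?\s+(?P<value>\d+(?:\.\d+)?)"
)


@dataclass
class ExtractedObservation:
    label: str
    value: float
    unit: str
    loinc: str


def extract_observations(text: str) -> list[ExtractedObservation]:
    """Scan OCR text line by line and keep rows whose label has a known code."""
    found = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # headings, blank lines, free text
        label = match.group("label").upper()
        if label not in LABEL_TO_LOINC:
            continue  # no mapped concept for this label
        loinc, unit = LABEL_TO_LOINC[label]
        found.append(
            ExtractedObservation(label, float(match.group("value")), unit, loinc)
        )
    return found


# Made-up sample text standing in for OCR output.
SAMPLE_TEXT = """Spirometry
FVC (L)       4.12   4.50   92
FEV1 (L)      3.30   3.70   89
FEV1/FVC (%)  80     82
"""

if __name__ == "__main__":
    for obs in extract_observations(SAMPLE_TEXT):
        print(obs)
```

In this sketch the `FEV1/FVC` row is skipped because its label has no entry in the table, which is how an unmapped concept would show up in practice.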

One of the general interoperability challenges with PFTs is the gap between the clinical standards the ATS develops and the translation of those standards into existing code systems. LOINC coverage, for example, is incomplete for many important observations. We read about this challenge in Electronic Health Records and Pulmonary Function Data: Developing an Interoperability Roadmap at the start of the year:

Lack of variable-naming conventions and incomplete or duplicate LOINC

We’ve experimentally mapped as many of the Spirometry and Diffusion Capacity LOINC concepts as possible in our export.

This brings us to the FHIR Bundle export currently available for private review. All of the data we extract from the PDF is wrapped up into a FHIR-friendly package. FHIR is the health data protocol of the future. Unfortunately, clinical data for pulmonary medicine hasn’t been given the attention it deserves in the FHIR community. No general-purpose Implementation Guide exists for pulmonary function tests, and national profiling efforts are few and far between.
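To give a sense of the shape of such a package, here is a minimal Python sketch that assembles a DiagnosticReport and its Observations into a FHIR Bundle as JSON. It is an illustration under stated assumptions, not the actual PFT Extractor export: the helper names, the `collection` bundle type, the free-text report code, and the sample values are hypothetical, and the LOINC codes should be verified against the LOINC database.

```python
import json
import uuid

LOINC_SYSTEM = "http://loinc.org"
UCUM_SYSTEM = "http://unitsofmeasure.org"


def observation(loinc_code: str, label: str, value: float, unit: str) -> dict:
    """Build a minimal FHIR Observation carrying one coded measurement."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [{"system": LOINC_SYSTEM, "code": loinc_code}],
            "text": label,
        },
        "valueQuantity": {
            "value": value,
            "unit": unit,
            "system": UCUM_SYSTEM,
            "code": unit,
        },
    }


def build_bundle(observations: list[dict]) -> dict:
    """Wrap Observations and a DiagnosticReport that references them."""
    entries = [
        {"fullUrl": f"urn:uuid:{uuid.uuid4()}", "resource": obs}
        for obs in observations
    ]
    report = {
        "resourceType": "DiagnosticReport",
        "status": "final",
        # Free-text code only; a coded report type is left out on purpose.
        "code": {"text": "Pulmonary function test"},
        "result": [{"reference": entry["fullUrl"]} for entry in entries],
    }
    report_entry = {"fullUrl": f"urn:uuid:{uuid.uuid4()}", "resource": report}
    return {
        "resourceType": "Bundle",
        "type": "collection",
        "entry": [report_entry, *entries],
    }


if __name__ == "__main__":
    # Made-up values for illustration only.
    bundle = build_bundle(
        [
            observation("19868-9", "FVC", 4.12, "L"),
            observation("20150-9", "FEV1", 3.30, "L"),
        ]
    )
    print(json.dumps(bundle, indent=2))
```

The report points at each Observation through its `urn:uuid:` full URL, so the bundle is self-contained and can be read without a FHIR server.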

We’re changing that. This summer we are publishing an Implementation Guide for pulmonary function tests on FHIR, containing a set of Profiles that describe a PFT conforming to ATS standardization best practices to date. This effort is happening in the open and we intend to share it for critical review. PFT Extractor will natively export bundles conforming to this Implementation Guide once it advances beyond the draft stage. We are actively seeking collaborators on this project; please reach out to us at

With PFT Extractor in our toolkit, we can advance interoperability and data standards for an important underserved segment of clinical work and patient outcomes. These are building blocks towards computability. Of course, that realization got us thinking: there are bigger problems we need to solve for everyone.

Big picture

At Automate Medical we started working together on a simple-sounding problem: can we automate the interpretation of pulmonary function tests? What we uncovered was an opportunity to build something bigger and more important than what we had originally set out to build: the rails for the health technology stack of the future.

Our mission is building that health data stack. We build tools that work to power new patient-centric care models. Startups can find faster paths to market. Patients get to be the source of truth on their health data. Providers like clinics and hospitals can react to shifts in data exchange requirements by payers. Entirely new ways of delivering health care will exist in the future because we solved the health data problems of today.

Does making pulmonary function tests computable excite you? How about the health technology stack of the future? We’d love to hear from you at if you’re excited about these things.