De-identified Data in Healthcare


What’s wrong with de-identified data in healthcare? Maybe nothing. Maybe everything.

The term de-identification as it relates to healthcare is defined in the HIPAA Privacy Rule. The specific sections of the Privacy Rule that pertain to de-identification of protected health information (PHI) are:

45 CFR § 164.514(a) – Standard: De-identification of protected health information.
45 CFR § 164.514(b) – Implementation specifications: Requirements for de-identification of protected health information.

An easier read is the U.S. Department of Health and Human Services’ (HHS) guidance on de-identification of PHI. Basically, de-identification means removing or altering personal information so it can’t be linked back to a specific person. It’s like removing the labels from things so you can’t tell one from the other. This is done to protect people’s privacy when using their information for things like research or analysis. By itself, this is not a bad thing.

However, some healthcare apps, including at least one EHR, have recently updated their Terms of Service in response to new privacy laws enacted by some states. In their new Terms, these companies have carved out the right to de-identify healthcare data entered in their product. The problem is that once healthcare data has been de-identified, it is no longer considered PHI, so HIPAA no longer restricts what a company does with it. The company doesn’t have to ask or even inform the subscriber when or how their patients’ data is being used.

Even that might not be TOO bad, except for these things:

  • Even though the risk is minimal, de-identified data can, at times, be used to identify the individual.
  • The rules for de-identification are fairly stringent. If they are not applied correctly, the data is not technically de-identified, and it becomes more likely to be linked back to the individual.
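
To make the first point concrete, here is a minimal Python sketch of how a record can remain distinctive after direct identifiers are gone. It is an illustration only, not a test defined by HIPAA. The records, the field names, and the `unique_combinations` helper are all hypothetical. The idea is simply to count how many rows share the same combination of remaining attributes; a combination that appears only once is the kind of record that someone with outside knowledge might be able to single out.

```python
from collections import Counter

# Hypothetical records that have already had direct identifiers removed.
records = [
    {"state": "GA", "birth_year": 1975, "sex": "F", "diagnosis_code": "F41.1"},
    {"state": "GA", "birth_year": 1975, "sex": "F", "diagnosis_code": "F32.9"},
    {"state": "GA", "birth_year": 1931, "sex": "M", "diagnosis_code": "F03.90"},
]

# Attributes that are not identifiers on their own but can narrow things down together.
QUASI_IDENTIFIERS = ("state", "birth_year", "sex")


def unique_combinations(rows, keys=QUASI_IDENTIFIERS):
    """Return the attribute combinations that appear in exactly one row."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]


singletons = unique_combinations(records)
print(singletons)  # [('GA', 1931, 'M')]
```

In this made-up data set, two records share the same state, birth year, and sex, while the third stands alone. The smaller the data set and the more attributes it retains, the more often that happens.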

Methods of De-Identification

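There are two approved methods for de-identifying healthcare data: the Expert Determination method and the Safe Harbor method. The Safe Harbor method is usually considered the more objective of the two, because it follows a fixed checklist rather than relying on an expert’s statistical judgment. Safe Harbor involves removing 18 specific types of identifiers from PHI, both for the individual and for the individual’s relatives, employers, and household members. The entity must also have no actual knowledge that the remaining information could be used to identify the individual. These 18 identifiers are: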

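  1. Names: All names. HHS guidance treats initials as a derivative of a name, so they must be removed as well.
  2. Geographic Information: All geographic subdivisions smaller than a state, including street address, city, county, precinct, ZIP code, and their equivalent geocodes. The first three digits of a ZIP code may be kept only if the area formed by all ZIP codes sharing those digits contains more than 20,000 people.
  3. Dates: All elements of dates, except the year, directly related to an individual, including birthdates, admission dates, discharge dates, and death dates. Ages over 89 must be grouped into a single category of 90 or older.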
  4. Telephone Numbers: All telephone numbers.
  5. Fax Numbers: All fax numbers.
  6. Email Addresses: All email addresses.
  7. Social Security Numbers: All Social Security numbers.
  8. Medical Record Numbers: All medical record numbers.
  9. Health Plan Beneficiary Numbers: All health plan beneficiary numbers.
  10. Account Numbers: All account numbers.
  11. Certificate/License Numbers: All certificate or license numbers.
  12. Vehicle Identifiers and Serial Numbers: All vehicle identifiers and serial numbers, including license plate numbers.
  13. Device Identifiers and Serial Numbers: All device identifiers and serial numbers.
  14. Web Universal Resource Locators (URLs): All URLs.
  15. IP Addresses: All IP addresses.
  16. Biometric Identifiers: All biometric identifiers, including fingerprints and voiceprints.
  17. Full-Face Photographs: All full-face photographs and any comparable images.
  18. Any Unique Identifying Number or Code: Any other unique identifying number, characteristic, or code, except a re-identification code assigned as permitted by 45 CFR § 164.514(c).
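
As a rough illustration of what applying this list to a single record might look like, here is a minimal Python sketch. It is not a compliant implementation. The field names and the `scrub_record` helper are hypothetical, dates are assumed to be ISO strings, and ZIP codes are dropped entirely rather than truncated to three digits. The sketch also ignores free-text notes, the rule for ages over 89, and the "no actual knowledge" requirement, all of which a real process would have to address.

```python
# Hypothetical field names for a flat patient record; a real schema will differ.
IDENTIFIER_FIELDS = {
    "name", "initials", "street_address", "city", "county", "zip",
    "phone", "fax", "email", "ssn", "medical_record_number",
    "health_plan_id", "account_number", "license_number", "vehicle_id",
    "device_serial", "url", "ip_address", "biometric_id", "photo",
}
DATE_FIELDS = {"birth_date", "admission_date", "discharge_date", "death_date"}


def scrub_record(record):
    """Return a new dict without identifier fields and with dates cut to the year."""
    cleaned = {}
    for key, value in record.items():
        if key in IDENTIFIER_FIELDS:
            continue  # drop the whole field; partial values such as initials are not kept
        if key in DATE_FIELDS:
            # Dates are assumed to be ISO strings (YYYY-MM-DD); keep only the year.
            cleaned[key.replace("_date", "_year")] = str(value)[:4]
        else:
            cleaned[key] = value
    return cleaned


example = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip": "30030",
    "birth_date": "1975-04-12",
    "diagnosis_code": "F41.1",
}
scrubbed = scrub_record(example)
print(scrubbed)  # {'birth_year': '1975', 'diagnosis_code': 'F41.1'}
```

Note that the sketch removes fields outright instead of shortening them. Keeping a fragment of an identifier, such as initials or the last four digits of a Social Security number, is exactly the kind of shortcut that HHS guidance says does not satisfy Safe Harbor.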

However, even when companies apply the Safe Harbor method correctly, it is not foolproof.1 Although unlikely, there will be times when someone will be able to recognize an individual from de-identified data. To complicate matters, in some cases data has not been de-identified correctly. For example, some companies have said they use initials in their de-identified data. HHS guidance treats initials as a derivative of a name and states that a data set containing patient initials does not meet the Safe Harbor requirement. We are left to wonder whether these companies don’t fully understand the Safe Harbor method or whether they are choosing to ignore it, creating their own de-identification methods instead. Neither would be a good thing.

Misuse of Business Associate Agreements

The concept of a “Business Associate” (BA) in healthcare originated in 2000 with the publication of HIPAA’s Privacy Rule. The HITECH Act (2009) extended HIPAA’s privacy and security provisions directly to Business Associates. It mandated that Business Associates comply with many of the same privacy and security rules that apply to Covered Entities (CEs).

In 2013, HHS issued the HIPAA Omnibus Rule, which further clarified and expanded the requirements related to Business Associates. This rule specified the obligations of Business Associates regarding security breaches, compliance with HIPAA rules, and the need for Business Associate Agreements (BAAs) between CEs and their Business Associates. According to the Omnibus Rule, Business Associates are directly liable for the security of protected health information (PHI) entrusted to them. Should there be a breach caused by the BA’s product, the BA is required to take specific actions related to the breach. In other words, the BA is supposed to “make things right” as much as is possible.

Thus, the authors of HIPAA originally intended a BAA to serve as a kind of protection for Covered Entities (in this case, healthcare providers). It was meant to be the company’s pledge that they understood HIPAA and that their software (or other service) was secure. However, some healthcare software companies are weaponizing their BAAs in ways that not only fail to protect CEs, but that may actually harm them.

Specifically, by de-identifying the data in their systems, the BA can legally claim that it is no longer PHI. Since it’s no longer PHI, that gets them off the hook should any leaks occur with that data. If patients are identified from the de-identified data, the BA can truthfully claim that even HIPAA admits that de-identification isn’t foolproof.

Additionally, the rewording of their Terms of Service, which you must sign to use their product, gives them the broadest possible permissions to use the data in any way they deem fit. Should there be any issues at all, the BA will be able to bring out documents signed by their users, which assigned all rights to the company.

Some may be thinking that perhaps the best solution is just to not sign a company’s BAA if you disagree with the terms. However, if you plan to keep using the software, that’s also a Catch-22. Covered Entities are required to have a Business Associate Agreement in place for each healthcare software product they use.2 Also, in some of the current cases, the company will not allow you – or your clients – to continue to access their product if you refuse to sign their terms.

If you find yourself being appalled by this recent turn of events, you are not alone. As the founder and CEO of PSYBooks, I can assure you that I am, too.

PSYBooks has never de-identified data. Furthermore, we have never sold, bartered or traded any data our subscribers have entrusted us with, or used it for anything other than what is required to run the program.

If you’re looking for more privacy and autonomy over the patient data you enter in your EHR, we’d love to show you around. We offer both Free Demos and Free Trials of our program.

Susan C. Litton, Ph.D.


1 U.S. Department of Health and Human Services. (n.d.). De-identification of Protected Health Information. HHS.gov. https://www.hhs.gov/hipaa/for-professionals/privacy/special-topics/de-identification/index.html
2 Exceptions are products that are only “conduits” for PHI, as opposed to storing it.

September 2nd, 2023 | Current, HIPAA/HITECH, How To Choose an EHR

About the Author:

Susan C. Litton, Ph.D. holds degrees in both psychology and IT. In addition to being the developer of the PSYBooks EHR & Portal, she's been a practicing clinical psychologist in Decatur, GA, since 1985.