What is considered personal data under the EU GDPR?

The EU’s GDPR only applies to personal data, which is any piece of information that relates to an identifiable person. It’s crucial for any business with EU consumers to understand this concept for GDPR compliance.

The EU’s General Data Protection Regulation (GDPR) tries to strike a balance between being strong enough to give individuals clear and tangible protection while being flexible enough to allow for the legitimate interests of businesses and the public. As part of this balancing act, the GDPR goes to great lengths to define what is and is not personal data.

If your organization collects, uses, or stores the personal data of people in the EU, then you must comply with the GDPR’s privacy and security requirements or face large fines. (If you’re not sure whether your organization is subject to the GDPR, read our article about companies outside of Europe.)

In Article 4, the GDPR gives the following definition for “personal data”:

‘Personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.

Furthermore, the GDPR only applies to personal data processed in one of two ways:

Personal data processed wholly or partly by automated means (that is, information in electronic form); and
Personal data processed in a non-automated manner which forms part of, or is intended to form part of, a ‘filing system’ (that is, written records in a manual filing system).
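The two processing conditions above can be sketched as a simple check. This is only an illustration: the `Processing` class, its field names, and `in_material_scope` are hypothetical, and a real scoping decision requires legal analysis, not a boolean test.

```python
from dataclasses import dataclass


@dataclass
class Processing:
    """Hypothetical description of how a set of personal data is handled."""
    automated: bool                           # wholly or partly automated (electronic form)
    in_filing_system: bool = False            # manual records that form part of a filing system
    intended_for_filing_system: bool = False  # manual records intended to be filed


def in_material_scope(p: Processing) -> bool:
    """Return True if the processing matches either condition listed above."""
    if p.automated:
        return True
    return p.in_filing_system or p.intended_for_filing_system


crm_database = Processing(automated=True)
paper_hr_files = Processing(automated=False, in_filing_system=True)
loose_handwritten_note = Processing(automated=False)

print(in_material_scope(crm_database))
print(in_material_scope(paper_hr_files))
print(in_material_scope(loose_handwritten_note))
```

In this sketch, the database and the filed paper records fall within scope, while an unfiled handwritten note that is not intended for any filing system does not.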

There is a lot to unpack here, but the first line of the definition contains four elements that are the foundation of determining whether information should be considered as personal data:

“any information”
“relating to”
“an identified or identifiable”
“natural person”

These four elements work together to create the definition of personal data. We will break each one down in the following paragraphs.

Natural person

This element is the easiest to define. By using “natural person,” the GDPR is saying that data about companies, which are sometimes considered “legal persons,” are not personal data. One caveat is that the individual must be alive. Data related to the deceased are not considered personal data in most cases under the GDPR.

Any information

This element is very inclusive. It includes “objective” information, such as an individual’s height, and “subjective” information, like employment evaluations. It is also not limited to any particular format. Video, audio, numerical, graphical, and photographic data can all contain personal data. For example, a child’s drawing of their family that is done as part of a psychiatric evaluation to determine how they feel about different members of their family could be considered personal data, insofar as this picture reveals information relating to the child (their mental health as evaluated by a psychiatrist) and their parents’ behavior.

Inaccurate information

Information that is inaccurately attributed to a specific individual, whether it is factually incorrect or actually relates to another individual, is still considered personal data because it relates to that specific individual. If data are inaccurate to the point that no individual can be identified, then the information is not personal data. (For example, if you say “the man who lives at 12 Mulberry Lane had a party last night” when Mulberry Lane ends at number 10, that is not personal data.)

Identifiable individuals and identifiers

At its most basic, whenever you differentiate one individual from others, you are identifying that individual. Any individual who can be distinguished from others is considered identifiable.

Calling someone by their name is the most common way of identifying someone, but it is often context-dependent. There are millions of Roberts in the world, but when you say the name “Robert,” generally you are trying to get the attention of the person you are facing. By adding another data point to the name (in this example, proximity), you have enough information to identify one specific individual. These data points are identifiers.

Looking back at the GDPR’s definition, we have a list of different types of identifiers: “a name, an identification number, location data, an online identifier.” A special mention should be made for biometric data as well, such as fingerprints, which can also work as identifiers. While most of these are straightforward, online identifiers are a bit trickier. Fortunately, the GDPR provides several examples in Recital 30 that include:

Internet protocol (IP) addresses;
cookie identifiers; and
other identifiers such as radio frequency identification (RFID) tags.

These identifiers refer to information that is related to an individual’s tools, applications, or devices, like their computer or smartphone. The above is by no means an exhaustive list. Any information that could identify a specific device, like its digital fingerprint, is an identifier.
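As a rough illustration of how an organization might start auditing its own records for the identifiers discussed above, the sketch below flags fields in a record that correspond to the identifier types named in Article 4 and Recital 30. The field names (`ip_address`, `cookie_id`, and so on) and the `find_identifiers` helper are assumptions made for this example. Like the list above, the mapping is not exhaustive.

```python
# Hypothetical mapping from field names to the identifier types named in the GDPR.
IDENTIFIER_FIELDS = {
    "name": "name",
    "national_id": "identification number",
    "gps_position": "location data",
    "ip_address": "online identifier",
    "cookie_id": "online identifier",
    "rfid_tag": "online identifier",
    "device_fingerprint": "online identifier",
}


def find_identifiers(record: dict) -> dict:
    """Return the non-empty fields of a record that match a known identifier type."""
    return {
        field: IDENTIFIER_FIELDS[field]
        for field, value in record.items()
        if field in IDENTIFIER_FIELDS and value not in (None, "")
    }


web_log_entry = {
    "ip_address": "203.0.113.7",
    "cookie_id": "",
    "page": "/pricing",
}

print(find_identifiers(web_log_entry))
```

A field that is absent from the mapping is not necessarily safe. It only means this simple check does not recognize it.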

Finally, there are “related factors,” which the GDPR lists as “factors specific to the physical, physiological, genetic, mental, economic, cultural, or social identity of that natural person.” These factors are characteristics that are directly related to a specific individual that could help you identify them.

Identifying an individual directly and indirectly

An individual is directly identifiable if you can identify them using nothing but the information you possess. In the previous example, by knowing his name and location, you were able to directly identify Robert. However, a name is not always necessary. Had you not known Robert’s name, you could have still identified him through his proximity and some combination of physical factors, like height and hair color.

There are more factors to consider with indirect identification. Indirect identification means you cannot identify an individual through the information you are processing alone, but you may be able to by using other information you hold or information you can reasonably access from another source. A third party using your data and combining it with information they can reasonably access to identify an individual is another form of indirect identification.

An easy example of information that could be used to indirectly identify someone is an individual’s license plate number. The police (a third party) can quickly match a name to a license plate number.
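The license plate example can be sketched in code. Here a parking log contains no names, but anyone who can reasonably access a second source (the registry) can link plates to people. All data, names, and the `link_records` function below are invented for illustration.

```python
# Data you hold: no names, only plate numbers and entry times.
parking_log = [
    {"plate": "AB-123-CD", "entered": "08:02"},
    {"plate": "ZX-987-YT", "entered": "08:15"},
]

# A second source held by a third party (hypothetical vehicle registry).
vehicle_registry = {"AB-123-CD": "Robert Example"}


def link_records(records: list, registry: dict) -> list:
    """Attach a name to every record whose plate appears in the registry."""
    return [
        {**record, "name": registry[record["plate"]]}
        for record in records
        if record["plate"] in registry
    ]


for match in link_records(parking_log, vehicle_registry):
    print(match)
```

Neither data set identifies a driver’s movements on its own. Combining them does, which is why the plate number would likely be treated as personal data.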

The qualifier “reasonably” is an important one. Methods of identification that do not exist today could be developed in the future, which means that data stored for long durations must be continuously reviewed to make sure they cannot be combined with new technology that would allow for indirect identification.

Any information that can lead to either the direct or indirect identification of an individual will likely be considered personal data under the GDPR.

Personal data that ‘relate to’ an identifiable individual

Here it is important to consider the content of the data. Information that identifies an individual, even without a name attached to it, may be personal data if you are processing it to learn something about that individual or if your processing of this information will have an impact on that individual. Records that contain information that is clearly about a specific individual are considered to be “related to” that individual, such as their medical history or criminal records. Records that have information that describes an individual’s activities may also qualify, such as a bank statement. Any data that relate to an identifiable individual are personal data.

Data that are used for learning or making decisions about an individual are also personal data. Records about electricity and water usage would be considered personal data as this information is used to determine how much to charge an individual.

Information that, when processed, could have an impact on an individual, even if that was not your primary aim, is also considered to be personal data. For instance, Uber tracks all of its drivers so that it can find the nearest available car to assign to an Uber request. However, this data could also be used to monitor whether Uber drivers follow the rules of the road and to measure their productivity rate. This processing of the data should be subject to data protection rules.

Personal data and the purpose for processing

The GDPR requires that consideration be given to how the data are being used to make decisions about specific individuals. A piece of information that does not qualify as personal data for one organization could become personal data if a different organization came into possession of it based on the impact this data could have on the individual. It all depends on the reason for which the organization is processing the data. If an organization processes data for the sole purpose of identifying someone, then the data are, by definition, personal data.

Two examples:

First, a photo of a street in the hands of a photographer is not personal data, while that same photo in the hands of an investigator who is working to identify the individuals and vehicles that were present on that street at that particular time would be considered personal data for the individuals concerned.

Second, video surveillance or security footage whose sole purpose is to be used to identify individuals when and where authorities see fit should be considered as processing data about identifiable persons, even if, in some cases, the individuals recorded cannot be identified.

This guide is not exhaustive, but it should help you understand some of the concepts for determining whether the data your organization processes are subject to the EU’s GDPR requirements. If you need further help with GDPR compliance, head over to our GDPR checklist, which can help you determine whether your organization is on the right track.

Richie Koch

Managing Editor, GDPR EU

Prior to joining Proton VPN, Richie spent several years working on tech solutions in the developing world. As a senior editor at Latterly magazine, he covered international human rights stories. He joined Proton VPN to advance the rights of online privacy and freedom.