A Comparative Study of Personal Data Classification (Part 2)

Latest 01-06

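Part 1 of this series, "A Comparative Study of Personal Data Classification (Part 1)", described the concept of PII commonly used in the United States, the EU GDPR's definition of personal data, and how the national standard Personal Information Security Specification (《個人信息安全規範》) defines personal information. Part 2 discusses two further classifications.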

PII 2.0, as Proposed by Two American Professors

The well-known American scholars Paul Schwartz and Daniel Solove have written two articles: "The PII Problem: Privacy and a New Concept of Personally Identifiable Information" and "Reconciling Personal Information in the European Union and the United States". In summary, their view is as follows:

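In the US, the law provides multiple definitions of PII, most focusing on whether the information pertains to an identified person. That is, under US law PII is mainly information associated with an identified individual. In other words, it corresponds to Path 2 in the national standard Personal Information Security Specification: "Association, that is, from the individual to the information: where a specific natural person is known, information generated by that natural person in the course of his or her activities (such as personal location information, call records, and browsing history) is personal information."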

In contrast, in the EU, there is a single definition of personal data to encompass all information identifiable to a person. Even if the data alone cannot be linked to a specific individual, if it is reasonably possible to use the data in combination with other information to identify a person, then the data is PII. In Europe, in other words, the definition of personal information covers both the identification path and the association path. In fact, the United States' own standard, NIST SP 800-122, also covers both paths.

The two scholars go on to propose PII 2.0. Put simply, PII 2.0 is this:

Under PII 2.0, data about identified individuals should be given the most protection. That is, personal information under the association path (identified data) needs the highest level of protection.

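Identifiable data still deserves protection too, but that protection differs from identified data in that only some of the Fair Information Practice Principles (FIPPs) should apply. That is, personal information under the identification path (identifiable data) also needs protection, but the level of protection need not be as strong as under the association path.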

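The most important reason for drawing this distinction is: PII 2.0 enhances the protection of privacy. It creates an incentive for companies to keep information in the least identifiable form. If we abandon PII, or treat identified and identifiable information as equivalents, companies will be less willing to expend resources to keep or transfer data in the most de-identifiable state practicable. Put simply, the distinction gives companies an incentive: when a processing activity does not need to identify a specific individual, do not identify one in the course of processing. The risk that processing harms individuals' lawful rights and interests is thereby reduced. This is a typical risk-management approach: by treating personal data of differing identifiability differently in law, it encourages data controllers to adopt lower-risk ways of processing.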

A Closer Look at How the GDPR Classifies Personal Data

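To many people, the classification proposed by the two American professors seems fundamentally at odds with the GDPR. After all, the GDPR's definition of personal data draws no distinction between "identifiable" and "identified". But is that really so?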

First, the definition of personal data was given in Part 1.

Second, anonymised data (Recital 26): The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable. This Regulation does not therefore concern the processing of such anonymous information, including for statistical or research purposes.

Third, pseudonymised data (Article 4(5)): 'pseudonymisation' means the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person.
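As an illustration of the pseudonymisation definition quoted above, here is a minimal Python sketch. It assumes keyed hashing (HMAC-SHA256) as the pseudonymisation technique, which is one common choice and not something the GDPR prescribes. The function names, the sample record, and the field names are all hypothetical. The secret key plays the role of the "additional information" that must be kept separately; whether a real deployment meets the legal definition also depends on the organisational measures around that key, which code alone cannot show.

```python
import hashlib
import hmac
import secrets


def new_secret_key() -> bytes:
    """Generate the 'additional information' (hypothetical helper).

    In practice this key would be stored separately from the
    pseudonymised data set and protected by access controls.
    """
    return secrets.token_bytes(32)


def pseudonymise(identifier: str, key: bytes) -> str:
    """Replace a direct identifier with a keyed-hash token.

    The same identifier and key always give the same token, so records
    about one person can still be linked to each other, but the token
    cannot be recomputed from the identifier without the key.
    """
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical example record containing a direct identifier.
key = new_secret_key()
record = {"email": "alice@example.com", "page_views": 42}

# The pseudonymised record drops the e-mail address and keeps only the token.
pseudonymised_record = {
    "subject_token": pseudonymise(record["email"], key),
    "page_views": record["page_views"],
}
```

Anyone holding only `pseudonymised_record` sees a token rather than an e-mail address, while the holder of `key` can recompute the token for a known address and so re-attribute the record. That is why the data remains personal data under the definition above.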

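These three categories are fairly well understood. But a careful reading of the GDPR text, especially Articles 11 and 12, reveals one more special category of personal data.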

Article 11 Processing which does not require identification

1. If the purposes for which a controller processes personal data do not or do no longer require the identification of a data subject by the controller, the controller shall not be obliged to maintain, acquire or process additional information in order to identify the data subject for the sole purpose of complying with this Regulation.

2. Where, in cases referred to in paragraph 1 of this Article, the controller is able to demonstrate that it is not in a position to identify the data subject, the controller shall inform the data subject accordingly, if possible. In such cases, Articles 15 to 20 shall not apply except where the data subject, for the purpose of exercising his or her rights under those articles, provides additional information enabling his or her identification.

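Doesn't Article 11 read rather like the "personal information under the identification path" (identifiable data) in the two American professors' PII 2.0? What is particularly interesting is that Article 11 does not use the word "pseudonymisation".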

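So, in summary, the GDPR in fact distinguishes four types of data: data about identified individuals, data about identifiable individuals (including pseudonymised data), Article 11 data, and anonymised data. Identifiability decreases from each type to the next.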
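The four-level ordering described above can be sketched as a small Python model. This is a simplified illustration, not a statement of how the GDPR must be implemented: the enum, its member names, and both helper functions are hypothetical. The second function encodes only the rule quoted from Article 11(2), and it assumes the controller has already demonstrated that it is not in a position to identify the data subject.

```python
from enum import IntEnum


class Identifiability(IntEnum):
    """Four data types, ordered by identifiability (higher = more identifiable)."""

    ANONYMISED = 0    # outside the scope of the Regulation
    ARTICLE_11 = 1    # controller cannot identify the data subject
    IDENTIFIABLE = 2  # includes pseudonymised data
    IDENTIFIED = 3


def in_gdpr_scope(level: Identifiability) -> bool:
    """Anonymous information is the only type the Regulation does not concern."""
    return level is not Identifiability.ANONYMISED


def articles_15_to_20_apply(
    level: Identifiability,
    subject_supplied_identifying_info: bool = False,
) -> bool:
    """Simplified reading of Article 11(2).

    For Article 11 data, Articles 15 to 20 apply only if the data subject
    provides additional information enabling his or her identification.
    """
    if level is Identifiability.ANONYMISED:
        return False
    if level is Identifiability.ARTICLE_11:
        return subject_supplied_identifying_info
    return True


# List the four types from most to least identifiable.
for level in sorted(Identifiability, reverse=True):
    print(level.name, in_gdpr_scope(level), articles_15_to_20_apply(level))
```

Using an `IntEnum` makes the "decreasing identifiability" ordering explicit and comparable, so a rule such as "treat anything at or below `ARTICLE_11` differently" becomes a simple comparison.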

A Brief Summary

The two parts of this series focus on identifiability in classifying personal data; they do not address the sensitivity of personal data.

Data about identified individuals can be further classified as follows:

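One kind of personal data serves, from the data controller's point of view, to establish a channel for interacting with a specific data subject. Examples include phone numbers, email addresses, postal addresses, and IMEI numbers.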

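Another kind of personal data serves to authenticate a specific individual's electronic identity. Examples include usernames and passwords, fingerprints, iris scans, and Face ID. Once personal data used to authenticate an electronic identity is leaked, abused, or misused, all the rights and interests closely tied to that identity are at great risk: bank funds may be stolen, social security records may be altered to fraudulently obtain social security funds, medical records may be altered so that the person is placed on a list of closely monitored groups, and so on.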

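A third kind of personal data describes certain characteristics or circumstances of a specific individual. Examples include browsing history, marital history, movement trajectories, educational background, medical history, religious belief, blood type, and genetic information. Once such data is leaked, abused, or misused, the individual may come under undue social pressure and withdraw from others, and others may use the information to blackmail the individual or coerce them into acting against their will.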

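Below data about identified individuals, in terms of identifiability, there are also data about identifiable individuals (including pseudonymised data), GDPR Article 11 data, and anonymised data.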

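In my personal view, whether our country's future legislation can set out a classification of personal data by identifiability, and how it treats data of different identifiability differently, will be one of the keys to whether that legislation succeeds or fails.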

TAG: 網安尋路人