Data privacy in DWH

More than a year after the introduction of the General Data Protection Regulation (GDPR), many enterprises still find it hard to reconcile data warehouses (DWH) with data privacy. The customer-centric data modelling that prevails at most enterprises poses a special challenge here: it conflicts with many GDPR requirements in virtually every data-driven process. But why is it so hard to unify the two topics?

Data subjects' rights - requirements for DWH

The biggest changes brought about by the General Data Protection Regulation relate to data subjects' rights. These rights place different demands on data warehouses and create several new areas in which measures are required. The following overview summarizes the most important rights of data subjects under the GDPR:

  • Right to information (right of access): Data subjects have the right to receive information about the personal data processed about them.
  • Right to rectification: Data subjects have the right to request rectification of personal data.
  • Right to erasure (right to be forgotten): Data subjects have the right to erasure of personal data.
  • Obligation to inform: The data controller is obliged to inform data subjects transparently about any collection and processing of their data.
  • Right to data portability: Data subjects are entitled to have personal data transferred to other data controllers.
  • Right to object: Data subjects have the right to object to the processing of personal data.

Personal data plays an important role in almost all DWH processes, so a large number of data-driven operations and data categories are affected by the rights listed above. Because data subjects' rights have a significant external impact, violations carry a particular risk of high fines. To help you comply with the legal requirements and avoid such fines, we have identified six DWH architecture modules that should make it easier to bring your DWH into line with data privacy requirements.

Architecture modules for a privacy-compliant data warehouse

Consent management platform: Lawful consent management is essential from the moment data is collected. A consent management platform makes it possible to document and keep track of the customer's consent, and to assign it to the different data categories in the DWH. Relying on documented consent allows GDPR-compliant implementation of operations on personal data that would otherwise be critical. Integrating consent management into DWH data administration also protects you against processing data for which consent has been withdrawn or is no longer up to date.
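A consent check of the kind described above could be sketched as follows. This is a minimal illustration, not a description of any particular consent management platform: the `Consent` record, its fields and the function `is_processing_allowed` are hypothetical names chosen for this example, and consent is assumed to be stored per customer, data category and purpose.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class Consent:
    """Hypothetical consent record, one per customer, data category and purpose."""
    customer_id: str
    data_category: str                   # e.g. "contact"
    purpose: str                         # e.g. "newsletter"
    granted_on: date
    withdrawn_on: Optional[date] = None  # set when the customer withdraws consent
    valid_until: Optional[date] = None   # set when consent expires automatically


def is_processing_allowed(consents: Iterable[Consent], customer_id: str,
                          data_category: str, purpose: str, on: date) -> bool:
    """Return True if a matching consent is in force on the given day."""
    for c in consents:
        if (c.customer_id, c.data_category, c.purpose) != (customer_id, data_category, purpose):
            continue
        if c.granted_on > on:
            continue  # consent not yet granted
        if c.withdrawn_on is not None and c.withdrawn_on <= on:
            continue  # consent has been withdrawn
        if c.valid_until is not None and c.valid_until < on:
            continue  # consent has expired
        return True
    return False
```

A load or analytics job would call such a check before touching a data category, so that records without valid consent are filtered out before processing rather than afterwards.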

Pseudonymization engine: Pseudonymization is a tried-and-tested means of ensuring a certain level of protection during critical processing of personal data. It is important to note, however, that pseudonymization does not exempt you from data privacy requirements, because pseudonymized data can still be re-identified. Pseudonymization nonetheless has advantages and is frequently used to enable comprehensive data analysis, particularly in the analytics area of the DWH. A pseudonymization engine that is integrated into all data management processes can help here. An absolute must, however, is a highly secure pseudonymization key that is applied consistently across all data processes within the DWH.
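One common way to obtain consistent pseudonyms from a secret key is a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library; this is one possible technique chosen for illustration, not necessarily the one a given pseudonymization engine uses. The function name `pseudonymize` and the normalization step (trimming and lower-casing) are assumptions.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes) -> str:
    """Map an identifier to a stable pseudonym using HMAC-SHA256.

    The same value and key always yield the same pseudonym, so joins across
    DWH tables keep working. Anyone holding the key can recompute pseudonyms
    for known identifiers, which is why the key must be stored securely.
    """
    normalized = value.strip().lower()  # assumption: identifiers are case-insensitive
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
```

Because the pseudonym depends on the key, using different keys in different processes would break joins between tables, which illustrates why the key has to be applied consistently throughout the DWH.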

Anonymization / erasure engine: Data protection requirements make an erasure engine an essential part of the DWH architecture. In the context of data privacy, anonymization can basically be considered equivalent to erasure. Because the consequences of erasure for reporting and analysis departments are often serious and irreversible, anonymization is usually preferable to physical erasure. The biggest challenge lies in designing a lawful implementation of erasure that minimizes information loss. With the help of a suitable, legally safeguarded concept, all relevant personal data can be identified, localized and automatically anonymized. This avoids time-consuming manual processes.
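The idea of anonymizing instead of physically deleting can be illustrated with a small sketch based on Python's built-in `sqlite3` module. The table and column names, the `PERSONAL_COLUMNS` mapping and the function `anonymize_customer` are hypothetical, and every listed table is assumed to carry a `customer_key` column. Whether the remaining attributes are anonymous in the legal sense has to be assessed separately for each data model.

```python
import sqlite3

# Hypothetical mapping: table name -> columns that hold personal data.
# In practice this would come from metadata management, not from code.
PERSONAL_COLUMNS = {"customer": ["name", "email"]}


def anonymize_customer(conn: sqlite3.Connection, customer_key: int) -> int:
    """Overwrite the personal columns of one customer with NULL.

    The rows themselves are kept, so measures such as revenue stay
    available for reporting. Returns the number of rows changed.
    """
    changed = 0
    for table, columns in PERSONAL_COLUMNS.items():
        # Table and column names come from the fixed mapping above,
        # never from user input; only the key is passed as a parameter.
        assignments = ", ".join(f"{col} = NULL" for col in columns)
        cursor = conn.execute(
            f"UPDATE {table} SET {assignments} WHERE customer_key = ?",
            (customer_key,),
        )
        changed += cursor.rowcount
    conn.commit()
    return changed
```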

Information engine / reports: The information engine localizes personal data automatically, completely and rapidly, and generates reports for individuals who have requested information. Because an individual's personal data is often distributed over the entire DWH, the main advantages are that manual operations are avoided and that complete information increases legal certainty. Structured categorization of data via metadata management, in particular, is an indispensable basis for the information engine.
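How a metadata catalogue can drive such a report is sketched below. The catalogue layout (table, key column, personal columns, category) and the function `build_subject_report` are assumptions made for this example, and the tables are represented as in-memory lists of dictionaries rather than real DWH tables.

```python
from typing import Any, Dict, List


def build_subject_report(tables: Dict[str, List[Dict[str, Any]]],
                         catalogue: List[Dict[str, Any]],
                         subject_key: Any) -> Dict[str, Dict[str, Any]]:
    """Collect all catalogued personal data of one data subject.

    Each catalogue entry names a table, the column identifying the data
    subject, the columns holding personal data and a data category.
    """
    report: Dict[str, Dict[str, Any]] = {}
    for entry in catalogue:
        rows = tables.get(entry["table"], [])
        records = [
            {col: row.get(col) for col in entry["columns"]}
            for row in rows
            if row.get(entry["key"]) == subject_key
        ]
        if records:  # only list tables that actually contain data on the subject
            report[entry["table"]] = {"category": entry["category"], "records": records}
    return report
```

The report is only as complete as the catalogue: a table that is missing from the metadata is silently skipped, which is why structured categorization is a precondition for this module.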

List of processing activities: The list of processing activities documents all processing of personal data. Especially in the DWH environment, a current and dynamic list of processes is indispensable for fulfilling the documentation and notification obligations laid down by law. Of special importance here is the completeness of the processing records as per Article 30 of the GDPR. Integrating the list of processing activities into data protection processes and into the directory of corporate procedures is a challenge that needs to be mastered. Linking the list with the release of processing operations can be particularly useful: processing can then be released automatically if the list contains a suitable, approved procedure. Appropriate tools (such as D-Quantum) and adequate metadata management are very helpful during such processes.
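The automatic release mentioned above could, in a much simplified form, look like the following sketch. It does not reflect how D-Quantum or any other tool works; the `ProcessingActivity` record and the function `can_release` are hypothetical, and a release is assumed to require an approved entry that covers every requested data category.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class ProcessingActivity:
    """Hypothetical entry in the list of processing activities."""
    name: str
    purpose: str
    data_categories: FrozenSet[str]
    approved: bool


def can_release(registry: Dict[str, ProcessingActivity], name: str,
                requested_categories: Iterable[str]) -> bool:
    """Release a processing run only if an approved entry covers it."""
    activity = registry.get(name)
    if activity is None or not activity.approved:
        return False  # unknown or unapproved procedure: needs manual review
    # Every requested data category must be part of the documented procedure.
    return set(requested_categories) <= activity.data_categories
```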


D-Quantum list of processing activities


Metadata management / data lineage: The requirements and solutions described for a GDPR-compliant DWH cannot be implemented without collecting information on the relevant data and data-processing operations. Developing a data catalogue that displays data lineage and data processing operations requires sustainable metadata management under the responsibility of a data governance body. For more information on metadata management and data governance, we recommend our webinar "GDPR - using the procedure index correctly!".
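Data lineage can be thought of as a directed graph from source fields to the fields derived from them. The following sketch, with a hypothetical `lineage` mapping and function `downstream`, uses a breadth-first search to find every field that a personal data field flows into, which is the information an erasure or information engine needs.

```python
from collections import deque
from typing import Dict, Iterable, Set


def downstream(lineage: Dict[str, Iterable[str]], start: str) -> Set[str]:
    """Return all fields reachable from `start` in the lineage graph.

    `lineage` maps a field to the fields loaded directly from it.
    `start` itself is only included if a cycle leads back to it.
    """
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in lineage.get(node, ()):
            if target not in seen:  # visit each field once, even with cycles
                seen.add(target)
                queue.append(target)
    return seen
```

For example, if a CRM e-mail field feeds a staging field that in turn feeds a core table and a campaign mart, calling `downstream` on the CRM field returns all three downstream fields.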


All these modules are tried-and-tested means of resolving the apparent conflict between data privacy and data warehouses. Using the right architecture and implementing all the aspects mentioned properly helps you not only to comply with the law, but also to minimize costs and draw maximum benefit from the available data. Reporting and analysis departments can then continue to operate with a high degree of legal certainty and generate maximum added value.

If you have any questions about the modules or need assistance in their implementation, do not hesitate to contact our experts!