In this integrated system, data can flow between the services in more than one direction. A service such as a relational database can not only supply data to the data lake but also receive reprocessed data from the data lake or the data warehouse. The AWS Glue Data Catalog gives all services a common metadata view of the data, and AWS Lake Formation governs access to the data registered in the Data Catalog.
The interoperability between the data lake and Amazon Redshift, the AWS cloud data warehouse service, illustrates the concept. The metadata in the Data Catalog allows a specific part of the data lake to be accessed transparently as part of the data warehouse, using Amazon Redshift Spectrum. With Spectrum, a database defined in the Glue Data Catalog appears as an external schema within the Amazon Redshift database. The tables in this schema can then be used in SQL queries alongside Redshift-native tables, including in joins. This access remains subject to governance by AWS Lake Formation, so users can only access the tables, columns and data for which they have been granted access rights.
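As a rough illustration of the Spectrum mechanism described above, the following Python sketch only assembles the SQL text involved; it does not connect to Redshift. The `CREATE EXTERNAL SCHEMA ... FROM DATA CATALOG` statement is Redshift's documented way of mapping a Glue Data Catalog database to a schema. Everything else is an assumption made for illustration: the schema name `lake`, the catalog database `sales_lake_db`, the role ARN, and the tables `public.customers` and `clickstream` are hypothetical, as are the helper function names.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def check_identifier(name: str) -> str:
    """Accept only simple identifiers so names can be embedded unquoted."""
    if not name.isidentifier():
        raise ValueError(f"not a simple SQL identifier: {name!r}")
    return name


def create_external_schema_sql(schema: str, catalog_database: str,
                               iam_role_arn: str) -> str:
    """Build the statement that exposes a Glue Data Catalog database
    as an external schema inside the Redshift database."""
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {check_identifier(schema)}\n"
        f"FROM DATA CATALOG\n"
        f"DATABASE {sql_literal(catalog_database)}\n"
        f"IAM_ROLE {sql_literal(iam_role_arn)};"
    )


def join_query_sql(schema: str) -> str:
    """Build a query joining a Redshift-native table with a data lake table.
    Both table names and all column names are hypothetical."""
    s = check_identifier(schema)
    return (
        "SELECT c.customer_id, c.segment, COUNT(*) AS page_views\n"
        "FROM public.customers AS c\n"      # Redshift-native table (assumed)
        f"JOIN {s}.clickstream AS e\n"      # external table in the data lake (assumed)
        "  ON e.customer_id = c.customer_id\n"
        "GROUP BY c.customer_id, c.segment;"
    )


if __name__ == "__main__":
    # Placeholder account ID and role name; substitute real values.
    role = "arn:aws:iam::123456789012:role/ExampleSpectrumRole"
    print(create_external_schema_sql("lake", "sales_lake_db", role))
    print()
    print(join_query_sql("lake"))
```

The generated statements would be run through any Redshift SQL client. Whether the join returns rows, and which columns are visible, would still depend on the Lake Formation permissions granted on the underlying catalog tables.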