AWS Data Platform to gain insight and future-proof the business
foodforplanet GmbH & Co. KG, the in-house e-commerce agency of Erbacher – the food family, was tasked with building a modern cloud data platform for the entire Erbacher group. After foodforplanet chose AWS as its cloud provider, b.telligent, an AWS Advanced Consulting Partner, was selected to build the infrastructure and implement the first use case. The journey started with integrating various e-commerce data sources into the data platform, enabling foodforplanet to gain deeper insight into its online activities and to create value by feeding that insight back into its online shops.
Erbacher – the food family and foodforplanet GmbH & Co. KG
Erbacher – the food family has been producing fodder, pet food and cereals for humans since 1941 in Kleinheubach, Germany. The group currently employs over 800 people in three countries. Its companies include Josera Agrar, which produces fodder for cattle and pigs; Josera Petfood, which produces food for dogs, cats and horses; and Farm Champs, a service company that helps cattle farmers optimize the health and productivity of their livestock. foodforplanet runs several international online shops for the group’s pet food brands Josera and Green Petfood. The food family strives to build a sustainable business by creating new and innovative green products and aiming for climate-neutral processes.
b.telligent, the trusted partner for the data platform
To start implementing its data platform on AWS, foodforplanet chose b.telligent, an independent consulting company based in Germany and an AWS Advanced Consulting Partner. b.telligent specializes in BI, CRM, DWH, big data, data science and cloud technologies. It first delivered a successful proof-of-concept project that demonstrated how to build a data platform in the AWS cloud, and was then tasked with designing the platform architecture on AWS and implementing the first use cases.
The need for a data platform
As a traditional mid-sized production company, the food family realized that sustainability and digitalization are key to future-proofing the company for the years to come. After establishing a successful online business, the next step was to raise its data processing capabilities to the next level and to prepare for the new possibilities that the use of data brings. This required a modern data platform as a foundation, one flexible enough to cater to the needs of the various companies and departments in the group. The data platform has to support the growing need to store and process data and has to scale to use cases ranging from the online business to IoT and Industry 4.0 initiatives that lay the foundation for modern production.
Decision to build a modern data platform on Amazon Web Services
foodforplanet, which leads this initiative for the whole group, chose to work with AWS for the flexibility and openness of its cloud services. AWS offers a wide variety of services for implementing data and analytics use cases, which integrate in a plug-and-play fashion and are priced mainly on a pay-per-use basis. For example, a modern data platform must provide scalable data stores that support different usage scenarios. In the AWS Lake House Architecture, a data lake is built with services such as Amazon S3, AWS Glue and AWS Lake Formation to provide low-cost storage with virtually no capacity limit. The data lake is complemented by Amazon Redshift, the cloud data warehouse service, which integrates with the data lake through Redshift Spectrum and offers performant SQL-based access to curated and refined data that is ready for end-user consumption.
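To make the data lake and warehouse integration more concrete, the following is a minimal sketch of how a Glue Data Catalog database can be exposed to Redshift through Redshift Spectrum and then joined with warehouse tables. All schema, table, database and role names (`lake`, `shop_raw`, `dwh.dim_product`, the IAM role ARN) are hypothetical and are not taken from the foodforplanet project; the Python code only renders the SQL text and does not connect to any AWS service.

```python
def external_schema_ddl(schema: str, catalog_db: str, iam_role_arn: str) -> str:
    """Render Redshift DDL that maps a Glue Data Catalog database to an external schema."""
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        f"FROM DATA CATALOG\n"
        f"DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )


# Hypothetical names, chosen only for illustration.
DDL = external_schema_ddl(
    schema="lake",
    catalog_db="shop_raw",
    iam_role_arn="arn:aws:iam::123456789012:role/spectrum-read",
)

# Once the external schema exists, files in S3 can be queried like tables
# and joined with curated tables stored inside Redshift.
QUERY = """
SELECT p.brand, COUNT(*) AS page_views
FROM lake.web_events AS e          -- external table, data stays in S3
JOIN dwh.dim_product AS p          -- regular Redshift table
  ON p.product_id = e.product_id
GROUP BY p.brand;
""".strip()

if __name__ == "__main__":
    print(DDL)
    print(QUERY)
```

In such a setup the raw data remains in the data lake, while the curated dimension tables live in the warehouse, which is one common way to split storage between the two layers.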
Implementing the initial use case
As the initial use case for building up the data platform, foodforplanet chose to integrate the various systems and service providers of its online business (web shop, tracking and online marketing).
A team of b.telligent consultants set up the infrastructure and implemented the first use case in an initial five-month project phase. They used the AWS Cloud Development Kit (AWS CDK) to define the components of the data platform as code, which makes it possible to deploy similar infrastructure in multiple AWS accounts.
The data platform’s initial architecture consists of a data lake built on Amazon S3, with AWS Lake Formation and the AWS Glue Data Catalog managing metadata and access, and AWS Glue ETL handling data processing within the data lake. Data is ingested through custom-built connectors in AWS Lambda and AWS Glue, and through Meltano, an open-source solution for running data pipelines based on an ecosystem of readily available connectors and data loaders. Amazon Redshift was chosen as the data warehouse, with dbt as the SQL-based, in-DWH transformation framework. dbt is likewise an open-source project; it lets developers write transformations as plain SQL SELECT statements while the framework handles the complexity of managing the underlying database objects and commands. The architecture also includes some additional services: an API Gateway for programmatic access, a Yellowfin BI server for reporting and dashboards, which runs on an EC2 instance and is backed by an RDS database for its metadata, and the CloudWatch and CloudTrail services for monitoring and logging.
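The division of labor between developer and transformation framework can be illustrated with a small sketch. The function below is not dbt and does not use dbt's API; it is a simplified, hypothetical stand-in that shows the idea: the developer supplies only a SELECT statement, and the framework generates the statements that create the view or table. The model name, schema name and column names are assumptions made for illustration.

```python
from typing import List


def compile_model(name: str, select_sql: str,
                  materialized: str = "view",
                  schema: str = "analytics") -> List[str]:
    """Wrap a plain SELECT statement in the DDL needed to persist it.

    Simplified illustration of what a transformation framework does;
    real frameworks also handle dependencies, templating and safe swaps.
    """
    relation = f"{schema}.{name}"
    body = select_sql.strip().rstrip(";")
    if materialized == "view":
        return [f"CREATE OR REPLACE VIEW {relation} AS\n{body};"]
    if materialized == "table":
        return [
            f"DROP TABLE IF EXISTS {relation};",
            f"CREATE TABLE {relation} AS\n{body};",
        ]
    raise ValueError(f"unsupported materialization: {materialized}")


# The only part a developer would write: a SELECT (hypothetical columns).
ORDERS_PER_DAY = """
SELECT order_date, COUNT(*) AS orders
FROM staging.shop_orders
GROUP BY order_date
"""

if __name__ == "__main__":
    for statement in compile_model("orders_per_day", ORDERS_PER_DAY, "table"):
        print(statement)
```

Switching `materialized` from `"view"` to `"table"` changes the generated statements without touching the SELECT itself, which is the separation of concerns the paragraph above describes.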
The benefit of the first use case
Bringing up the cloud infrastructure and integrating the data sources into one DWH data model inside the data platform enabled foodforplanet to streamline the reporting for its online activities and will provide a single source for further analysis. Having all the data available in a BI solution without lengthy manual preprocessing allows business users to work with the data daily and make better-informed decisions. This will lead to cost savings in data preparation and a better understanding of the business.
The data platform will also enable foodforplanet to use the insights gained to make its customers the right offers by personalizing the web shops, for example by integrating data from the data warehouse and the results of advanced analytics into product recommendations in real time.
Growing the data platform
The solution described here is just the beginning of Erbacher – the food family’s journey into the data capabilities of the AWS cloud. The platform will soon be offered not only to foodforplanet but to other parts of the company, too. Use cases from the production sphere, such as predictive maintenance of production lines, image recognition in data quality control or logistics use cases, will build on the foundations laid with this first iteration of a truly scalable and flexible data platform on AWS.