Data Science & AI

Quantile Regression With Gradient Boosted Trees

When we do simple descriptive data exploration, we are seldom content with analyzing mean values only. More often, we take a closer look at the distribution: histograms, quantiles, and the like. Mean values alone often lead to erroneous conclusions and keep important information hidden. But if this is the case, why do we forget about it as soon as we build predictive models? These usually aim only at mean values - and they lie.
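
One way to obtain quantile predictions from gradient boosted trees, sketched here with scikit-learn on toy data (not necessarily the exact setup used in the article), is to fit one model per quantile with a quantile loss:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Toy data standing in for a real use case
X, y = make_regression(n_samples=1000, n_features=5, noise=10.0, random_state=0)

# One model per quantile: 10th percentile, median, 90th percentile
models = {
    q: GradientBoostingRegressor(loss="quantile", alpha=q, random_state=0).fit(X, y)
    for q in (0.1, 0.5, 0.9)
}

# Each model predicts its conditional quantile, yielding a prediction
# interval instead of a single mean value
lower, median, upper = (models[q].predict(X[:5]) for q in (0.1, 0.5, 0.9))
```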

Large Language Models – An Overview Of The Model Landscape

Since the release of ChatGPT and the attention it has drawn to large language models, we have seen a rapid stream of new model releases and a fast-evolving market around the use of LLMs. How well a model suits a given business context depends heavily on the use case. In this blog post, we take a closer look at the currently most important models and compare them against enterprise-relevant criteria, so that you can keep a better overview.

Data Science for Kids: How To Win at “Guess Who?”

The other day, I played "Guess Who?", the classic game for children aged about 6 to 9, with my six-year-old son. While we were playing, we both tried to work out the best way to win. This article series is the result of our search for an effective game plan. Part 1 is aimed at the whole family. OK - let's find out how to win!

Deliver Projects Faster With Python Ibis Analytics

Moving from a successful proof of concept (PoC) for a data-analysis pipeline into production often proves to be a long road. Ibis makes it possible to simplify this process and thus deliver value faster.

After a data-analysis pipeline has been developed locally in Python, the code often needs to be rewritten before it can run in production. But does it really have to be that way? Created by Wes McKinney, the lead author of the Python pandas library, the Ibis library offers a fascinating way to use the same data-processing code in both development and production environments, enabling analytics teams to reach production faster. This blog post shows how it works.
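
As a rough sketch of the idea (table and column names are invented), the same Ibis expression can run against a local DuckDB database during development and against a production backend later by swapping only the connection:

```python
import ibis
from ibis import _

# Development: a local DuckDB database. For production, only this
# connection line would change (e.g. to a BigQuery or Postgres backend).
con = ibis.duckdb.connect("local_dev.ddb")

orders = con.table("orders")  # hypothetical table

# Backend-agnostic expression: filtering, grouping, and aggregation
revenue_per_customer = (
    orders.filter(_.status == "completed")
    .group_by("customer_id")
    .aggregate(total_revenue=_.amount.sum())
)

# Execution is deferred until the result is actually requested
df = revenue_per_customer.execute()
```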

Brief Guide to Using Generative AI and LLMs

Ever since ChatGPT was introduced in late 2022, we have all been thrilled by the possibilities of generative AI and large language models (LLMs). What intrigues people is the incredible ease of generating high-quality text and getting responses to questions, code fragments, and more. You simply write a prompt, which is a text input, feed it to ChatGPT's API, and voilà, you have a response.
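
As an illustrative sketch (the model name and prompt are placeholders), such a call with the official OpenAI Python client might look like this:

```python
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any available chat model works
    messages=[
        {
            "role": "user",
            "content": "Write a two-sentence summary of what large language models are.",
        }
    ],
)

print(response.choices[0].message.content)
```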

We are still very much in a generative AI hype cycle, where the benefits of a technology are typically overstated. For businesses, it is important to avoid the attendant pitfalls and to understand when and how best to use ChatGPT or generative AI solutions. In this blog, we look beyond the hype and show you an approach for evaluating and implementing LLM-based generative AI use cases.

Caret: A Cornucopia of Functions For Doing Predictive Analytics In R

R is one of the most popular open-source programming languages for predictive analytics. One of its upsides is the abundance of modeling choices provided by more than 10,000 user-created packages on the Comprehensive R Archive Network (CRAN). On the downside, package-specific syntax (a much bigger problem in R than in, say, Python) makes it harder to adopt new models. The caret package attempts to streamline the process of creating predictive models by providing a uniform interface to various training and prediction functions. Caret's data preparation, feature selection, and model tuning functionalities facilitate building and evaluating predictive models. This blog post focuses on model tuning and selection and shows how to tackle common model-building challenges with caret.

Recommender Systems – Part 3: Personalized Recommender Systems, ML and Evaluation

Algorithms for Personalized Recommendations

Users do not always leave behind enough personal information along their customer journey. For instance, newly acquired customers or existing customers may browse an e-commerce website without being logged in. In such cases, non-personalized recommender systems, for example those that propose products frequently purchased together, still give companies a way to make recommendations. However, the more closely recommendations are tailored to the individual customer, the better.
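
As a minimal sketch of such a non-personalized approach, using invented order data, one can simply count how often product pairs occur in the same order and recommend the most frequent partners of a given product:

```python
from collections import Counter
from itertools import combinations

import pandas as pd

# Hypothetical order data: one row per (order, product)
orders = pd.DataFrame({
    "order_id": [1, 1, 2, 2, 2, 3, 3],
    "product":  ["shoes", "socks", "shoes", "socks", "laces", "shoes", "laces"],
})

# Count how often each product pair appears in the same order
pair_counts = Counter()
for _, items in orders.groupby("order_id")["product"]:
    for pair in combinations(sorted(items), 2):
        pair_counts[pair] += 1

# "Frequently purchased together" with a given product, e.g. "shoes"
together_with_shoes = Counter()
for (a, b), n in pair_counts.items():
    if a == "shoes":
        together_with_shoes[b] += n
    elif b == "shoes":
        together_with_shoes[a] += n

print(together_with_shoes.most_common())  # [('socks', 2), ('laces', 2)]
```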

Use of Private Python Packages in Vertex AI - 3

As data scientists, we regularly train different machine-learning models in the cloud. Here you can find out how to structure your model training with the help of Python packages. Although each model has its own specific purpose, some code snippets are ultimately copied from one project to another. In my case, this code is often for reading data from a database or for a pre-processing step. By collecting frequently used functions in one place, Python packages are ideal for avoiding this kind of code duplication, which offers many advantages for maintaining and testing code.

In this blog article, we will see how a Python package can be used in GCP and integrated into a Vertex AI training job.
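
To give a rough idea (project, bucket, package, and container URIs are all placeholders), the google-cloud-aiplatform SDK can submit a training job that runs a module from your own Python package roughly like this:

```python
from google.cloud import aiplatform

# Placeholder project, region, and staging bucket
aiplatform.init(
    project="my-project",
    location="europe-west1",
    staging_bucket="gs://my-bucket",
)

job = aiplatform.CustomPythonPackageTrainingJob(
    display_name="train-my-model",
    # Source distribution of the private package, uploaded to Cloud Storage beforehand
    python_package_gcs_uri="gs://my-bucket/packages/my_training_pkg-0.1.0.tar.gz",
    # Module inside the package that starts the training
    python_module_name="my_training_pkg.task",
    # Pre-built training container (placeholder URI)
    container_uri="europe-docker.pkg.dev/vertex-ai/training/sklearn-cpu.1-0:latest",
)

job.run(replica_count=1, machine_type="n1-standard-4")
```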

Howto: Splitting Files With Standard Python Scripts

Ready-Made Data Sets That Burst the Limits

I am frequently given raw data for analysis which, when uncompressed, can easily comprise files of half a gigabyte or more. From about one gigabyte upwards, desktop statistics tools begin to struggle. There are, of course, options for selecting only some of the columns, loading only the first 10,000 lines, and so on.

But what should you do when you only want to take a random sample from the data provided? You should never rely on the file being randomly sorted: the database export may already have introduced systematic ordering effects. It may also be that you only want to analyze a tenth of a grouping, such as the purchases made by every tenth customer. For this, the complete file has to be read, since otherwise it is impossible to ensure that all purchases of the selected customers are taken into account.
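
As a small sketch of this idea (file name and column position are invented), a plain standard-library script can stream the file line by line and keep every tenth customer's complete purchase history:

```python
import csv
import zlib

# Hypothetical input: a large CSV with the customer ID in the first column
with open("purchases.csv", newline="") as src, open("sample.csv", "w", newline="") as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst)

    writer.writerow(next(reader))  # copy the header row

    for row in reader:
        customer_id = row[0]
        # A stable hash of the customer ID selects roughly every tenth customer,
        # independent of how the exported file happens to be sorted, while
        # keeping all purchases of a selected customer together.
        if zlib.crc32(customer_id.encode()) % 10 == 0:
            writer.writerow(row)
```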