Data Science & AI

Snowflake Document AI – Easily Extract Data From Unstructured Documents

With Snowflake Document AI, information can be easily extracted from documents, such as invoices or handwritten documents, directly within the data platform. Document AI is straightforward to use, whether through a graphical user interface, through code in a pipeline, or integrated into a Streamlit application. In this article, we explain the feature, describe how it integrates into the platform, and present some interesting applications.

Neural Averaging Ensembles for Tabular Data With TensorFlow 2.0

Neural Networks for Tabular Data: Ensemble Learning Without Trees

Neural networks are applied to just about any kind of data (images, audio, text, video, graphs, ...). Only for tabular data are tree-based ensembles like random forests and gradient boosted trees still much more popular. If you want to replace these successful classics with neural networks, ensemble learning may still be a key idea. This blog post tells you why. It is complemented by a notebook in which you can follow the practical details.

Sizing and Scaling Azure AI Search

Azure AI Search, Microsoft’s top serverless option for the retrieval part of RAG, has its own sizing, scaling, and pricing logic. While it hides many of the complexities of server-based solutions, it demands specific knowledge of its configuration options.

Efficient Distance Joins in Polars

Polars: Develop Faster, Execute Faster

Polars, the Pandas challenger written in Rust, is much faster, not only in executing code but also in development. Pandas has always suffered from an API that "grew historically" in many places. Polars is different: its API was designed to be logically consistent from the outset, and that consistency is carefully maintained with every release (sometimes at the expense of backwards compatibility), which makes development significantly faster. Polars can often replace Pandas easily: for example, in Ibis Analytics projects and, of course, for all kinds of daily data preparation tasks. Polars’ superior performance is also helpful in interactive environments like Power BI.

How Mature Is Your ML Approach?

Machine Learning Operations (MLOps) is a practice for collaboration and communication between data scientists and operations professionals that helps manage the lifecycle of Machine Learning (ML) models in production. It applies the principles of DevOps to the ML lifecycle in order to streamline and automate the process from model development to deployment and monitoring. The aim of MLOps is to enable faster deployment and scaling of ML models in a structured and efficient manner.

Automated Image Processing: A Standard Architecture

The PoC is complete, a production-ready model has been trained, and the showcase has inspired all stakeholders. But for business cases to be realized with the model, it (and the related processing) must be embedded in the existing (cloud) landscape.

LightGBM On Vertex AI

In Google Cloud, Vertex AI is the MLOps framework. It is very flexible, and you can use basically any modelling framework you like. However, some frameworks are a bit easier to use than others: TensorFlow, XGBoost and scikit-learn are supported with prebuilt images, which are very helpful. This blog post shows how you can train and deploy models built with a framework that has no such prebuilt image. We will use a LightGBM model as an example, but the workflow can easily be transferred to any other modelling package.

How To Install Ray Under Windows

Ray enjoys growing popularity in the machine learning community. Getting it up and running under Windows can be tricky, however. This blog post tells you how.

Vertex AI Pipelines - Getting Started

After taking a trip into the world of Ray in the first article, we now turn to Vertex AI, the centerpiece of all machine learning services in GCP. Pipelines are meant to make life in the machine learning world easier. They promise to shorten development cycles through a high degree of automation. In addition, infrastructure abstraction is meant to allow teams to do without expertise in microservices and similar topics and instead focus on their core competencies.

In this blog post, we will look at a simple example of how a machine-learning pipeline can be set up in Vertex AI.