Python Package in VERTEX AI Training Job and GCP

Use of Private Python Packages in Vertex AI - 3

Laurenz Reitsam

Published on 7.6.2022

Updated on 8.5.2025

Data Science & AI

As data scientists, we regularly train different machine-learning models in the cloud. Although each model has its own intended application, some code snippets end up being copied from one project to another. In my case, this is often code for reading data from a database or for a pre-processing step. Python packages are ideal for avoiding this kind of copying: they collect frequently used functions in one place, where the code can be maintained and tested once. Here you can find out how to structure your model training with the help of Python packages.

In this blog article, we will see how a Python package can be utilized in GCP and integrated into a Vertex AI training job.

Creating a Python package

Before being able to make our Python code available as a package, we need to make sure that our Python module meets the requirements for this:

The module should have at least one of the files setup.py, setup.cfg or pyproject.toml. These can be used individually or in combination to define how the Python package should be installed later. Prerequisites such as Python version >= 3.9 can be specified in this manner, for example.
The code should have a folder structure as shown in the following snippet: There is a main folder containing all configuration files and a subfolder. The subfolder contains the actual Python code.

Structure of the package:

    ├── setup.py          # or setup.cfg or pyproject.toml
    ├── my_package
    │   ├── __init__.py
    │   └── example.py
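To make the layout concrete, here is a minimal sketch that writes such a structure to disk. The pyproject.toml contents, the version number and the greet function are illustrative assumptions, not taken from the original project.

```python
from pathlib import Path

# Hypothetical minimal pyproject.toml; name, version and backend are assumptions.
PYPROJECT_TOML = """\
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "my-package"
version = "0.1.0"
requires-python = ">=3.9"
"""

# Hypothetical module content standing in for the shared helper code.
EXAMPLE_PY = '''\
def greet(name: str) -> str:
    """Return a greeting; a placeholder for real shared functionality."""
    return f"Hello, {name}!"
'''


def scaffold_package(root: Path) -> list[Path]:
    """Create the main folder, the config file and the code subfolder under root."""
    package_dir = root / "my_package"
    package_dir.mkdir(parents=True, exist_ok=True)

    files = {
        root / "pyproject.toml": PYPROJECT_TOML,
        package_dir / "__init__.py": "from .example import greet\n",
        package_dir / "example.py": EXAMPLE_PY,
    }
    for path, content in files.items():
        path.write_text(content, encoding="utf-8")
    return sorted(files)
```

Calling scaffold_package(Path("my-package")) would create the main folder with the configuration file and the my_package subfolder next to it.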

After ensuring that these requirements have been met, we can create package distributions from our module:

    cd my-package
    python3 -m pip install --upgrade build
    python3 -m build
    ls ./dist

These commands install build, the standard PyPA build frontend, and use it to create the Python distributions. The result in ./dist is a wheel file (.whl) as well as a source archive (.tar.gz).
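Before uploading, it can be useful to confirm that both artifacts are really there. The following is a small sketch of such a check; the function names are hypothetical and the file-name suffixes are the usual ones for wheels and source archives.

```python
from pathlib import Path


def find_distributions(dist_dir: Path) -> dict[str, list[str]]:
    """Group the files in dist_dir into wheels and source archives."""
    found: dict[str, list[str]] = {"wheel": [], "sdist": []}
    for path in sorted(dist_dir.iterdir()):
        if path.name.endswith(".whl"):
            found["wheel"].append(path.name)
        elif path.name.endswith(".tar.gz"):
            found["sdist"].append(path.name)
    return found


def ready_for_upload(dist_dir: Path) -> bool:
    """Return True if at least one wheel and one source archive are present."""
    found = find_distributions(dist_dir)
    return bool(found["wheel"]) and bool(found["sdist"])
```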

Setting up Google's artifact registry

Google's Artifact Registry stores and manages container images and code libraries in one place. We will use it to version and manage our Python packages. A repository in Python format can be created easily via the UI or with gcloud.
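As a rough sketch of the gcloud route, the snippet below assembles the command for creating a Python-format repository and prints it. The repository name and the region are hypothetical placeholders; the command is only printed, not executed.

```python
import shlex


def create_repository_command(repository: str, location: str) -> list[str]:
    """Build the gcloud call that creates a Python-format repository."""
    return [
        "gcloud", "artifacts", "repositories", "create", repository,
        "--repository-format=python",
        f"--location={location}",
    ]


# Hypothetical repository name and region, for illustration only.
command = create_repository_command("my-repository", "europe-west3")
print(shlex.join(command))
```

The printed command can be pasted into a shell in which gcloud is already authenticated and pointed at the right project.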

Integrating Google's artifact registry

Before being able to load our package into the registry, we have to make a few more preparations:

First of all, a .pypirc file is needed. This file contains the settings for uploading packages to private registries. Here we list our newly created Artifact Registry repository and specify its URL.

    # ~/.pypirc
    [distutils]
    index-servers = my-repository

    [my-repository]
    repository =
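The same file can also be generated programmatically. The sketch below assumes the usual Artifact Registry URL pattern for Python repositories (LOCATION-python.pkg.dev/PROJECT/REPOSITORY); the project, region and repository names are hypothetical, so the real URL should be taken from your own repository.

```python
import configparser
import io


def repository_url(project: str, location: str, repository: str) -> str:
    """Assumed Artifact Registry URL pattern for Python repositories."""
    return f"https://{location}-python.pkg.dev/{project}/{repository}/"


def render_pypirc(alias: str, url: str) -> str:
    """Render .pypirc content that registers one upload target."""
    config = configparser.ConfigParser()
    config["distutils"] = {"index-servers": alias}
    config[alias] = {"repository": url}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Hypothetical project, region and repository names.
print(render_pypirc(
    "my-repository",
    repository_url("my-project", "europe-west3", "my-repository"),
))
```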

Now we need to authenticate against Google's Artifact Registry. This is done via Python's keyring library. For this we also need Google's keyring backend, which allows us to use our GCP credentials for login. After logging in with gcloud and installing the library, we no longer have to handle credentials manually.

    gcloud auth login
    python3 -m pip install keyrings.google-artifactregistry-auth
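Whether the keyring pieces actually ended up in the current environment can be checked from Python. This is only a sketch; the helper name is hypothetical, and it verifies installation, not that the credentials are valid.

```python
from importlib import metadata

# Distributions the authentication step relies on.
REQUIRED = ("keyring", "keyrings.google-artifactregistry-auth")


def missing_distributions(names=REQUIRED) -> list[str]:
    """Return the names from `names` that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


print("Missing:", missing_distributions() or "nothing")
```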

Uploading the package to Google's artifact registry

Our distributions have been built, our registry is ready and we are authorized for access. Now we can upload our package to the registry. This is done with Twine, the standard tool for publishing Python packages. After installing Twine, we can upload the package to the registry via the command line.

    python3 -m pip install twine
    twine upload --repository-url ./dist/*

Done! Our package is in the cloud. From now on, it is accessible for all our services.

Use of the private package in a docker container

Now we can use the package directly in a new service. The easiest way to do this is in container-based solutions such as Vertex AI training jobs with custom containers.

For this purpose, we list the package in the requirements of the Docker-based service. Here we just have to make sure to specify our registry's URL. This tells tools like pip where to look for the listed dependencies.

Important! The URL requires the "/simple" suffix. It points pip at the repository's simple index, the interface defined in PEP 503 that pip uses to look up packages and their files. For more details, refer to PEP 503.

    # requirements.txt
    --extra-index-url /simple/
    my-package
    ...
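How the "/simple" suffix and the package name combine into the address pip requests can be sketched as follows. The name normalization follows PEP 503; the helper names and the example URL are hypothetical.

```python
import re


def normalize_name(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def simple_index_url(repository_url: str) -> str:
    """Append the /simple/ suffix that pip expects for an index URL."""
    return repository_url.rstrip("/") + "/simple/"


def project_page_url(repository_url: str, name: str) -> str:
    """URL of the page listing all files of one project."""
    return simple_index_url(repository_url) + normalize_name(name) + "/"
```

For a hypothetical repository at https://example.com/repo, project_page_url("https://example.com/repo", "My_Package") yields https://example.com/repo/simple/my-package/, the page pip reads to find installable files.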

In the Docker build, Google's keyring library has to be installed again, this time inside the image. It allows pip in the build to authenticate against the registry, provided that Google credentials are available in the build environment.

    # Dockerfile
    FROM python:3.9-slim
    WORKDIR /app
    # requirements.txt must be in the image before pip can read it
    COPY ./requirements.txt .
    COPY ./train.py .
    RUN pip install keyrings.google-artifactregistry-auth
    RUN pip install -r requirements.txt
    CMD ["python", "./train.py"]

Finished! The image can be built and pushed.
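Once the image has been pushed, it can be started as a Vertex AI training job with a custom container. The following is a minimal sketch using the google-cloud-aiplatform SDK, which is assumed to be installed. All names (project, region, bucket, image, job name) are hypothetical, and the image is assumed to live in a Docker-format Artifact Registry repository, separate from the Python repository used above.

```python
def image_uri(project: str, location: str, repository: str,
              image: str, tag: str = "latest") -> str:
    """Assumed URI pattern of an image in a Docker-format repository."""
    return f"{location}-docker.pkg.dev/{project}/{repository}/{image}:{tag}"


def submit_training_job(project: str, location: str,
                        staging_bucket: str, container_uri: str) -> None:
    """Run the pushed image as a Vertex AI custom container training job."""
    # Imported here so the helper above works without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location,
                    staging_bucket=staging_bucket)
    job = aiplatform.CustomContainerTrainingJob(
        display_name="my-training-job",  # hypothetical job name
        container_uri=container_uri,
    )
    # Machine type and replica count are illustrative values.
    job.run(replica_count=1, machine_type="n1-standard-4")
```

A call could look like submit_training_job("my-project", "europe-west3", "gs://my-staging-bucket", image_uri("my-project", "europe-west3", "my-docker-repo", "train")).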

Conclusion

We have just seen how to make Python packages available with Google's cloud registry, and how they can be used from Vertex AI. Which functions do you often copy from one project to another? A perfect starting point for cleaning up here is to put the code into a package and make it usable for your future projects.
