Cataloging Drives In Python: An Introduction By b.telligent

Very Best Practice: Working With Paths In Python - Part 2

Stefan Seltmann

Published on 4.3.2019

Updated on 8.5.2025

Data Science & AI

The Same Problem: Listing Folders and Drives

In the last post, we used a recursive function, in a solution of fewer than 10 lines, to scan folders and evaluate files by modification date and size.

Now I’m going to raise the bar somewhat for this example by showing even better alternatives.

Catenate The Path With Pathlib

Old Wine in New Bottles?

In the earlier example, the paths were concatenated like this:

path_file = os.sep.join([path_dir, filename])

The advantage of this is that the solution is independent of the operating system, and one does not have to combine strings with a “+” sign or string formatting.

Yet this is error-prone: the directory path could inadvertently be defined with a trailing path separator.

path_dir: str = r"C:/Users/sselt/Documents/blog_demo/"  # trailing separator

filename: str = "some_file"

path_file = os.sep.join([path_dir, filename])

# C:/Users/sselt/Documents/blog_demo/\some_file

Although this code runs, the resulting path contains a doubled, mixed separator, which can lead to errors when the path is used. Such errors can occur whenever users maintain the path in config files, far from the code, without paying attention to the convention.
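The pitfall can be reproduced on any operating system. The following minimal sketch builds the directory name with os.sep so that it behaves the same on Windows and Linux; the relative folder name blog_demo is only a placeholder. For comparison, it also shows os.path.join from the standard library, which does not insert a second separator when the first component already ends with one.

```python
import os

# Directory path that (accidentally) ends with a separator
path_dir = "blog_demo" + os.sep
filename = "some_file"

# Plain string join: always inserts another separator
joined_naive = os.sep.join([path_dir, filename])

# os.path.join: adds a separator only if one is missing
joined_safe = os.path.join(path_dir, filename)

print(joined_naive)  # contains a doubled separator
print(joined_safe)   # contains a single separator
```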

A better solution has been available since Python 3.4: the pathlib module. It covers the file and folder functions of Python’s os module with an object-oriented approach.

To repeat, here’s the old variant:

import os

path = "C:/Users/sselt/Documents/blog_demo/"

os.path.isdir(path)

os.path.isfile(path)

os.path.getsize(path)

And Here Is The New Alternative

from pathlib import Path

path: Path = Path("C:/Users/sselt/Documents/blog_demo/")

path.is_dir()

path.is_file()

path.stat().st_size

Both deliver exactly the same result. So, why is the second one much better?

Object Oriented And More Error Tolerant

The calls are object oriented. That may or may not be your preference, but I like it a lot more. The path is now an object that has attributes and methods.
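To give an impression of these attributes and methods, here is a small sketch. The relative path blog_demo/some_file.txt is a placeholder and does not need to exist, because the calls shown only inspect the path itself.

```python
from pathlib import Path

path = Path("blog_demo/some_file.txt")  # placeholder, not read from disk

# Attributes: components of the path
print(path.name)    # some_file.txt
print(path.stem)    # some_file
print(path.suffix)  # .txt
print(path.parent)  # blog_demo

# Methods: queries and transformations that return new objects
print(path.is_absolute())  # False
renamed = path.with_suffix(".csv")
print(renamed.name)  # some_file.csv
```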

More exciting, however, is the following example, which relies on operator overloading:

filename: Path = Path("some_file.txt")

path: Path = Path("C:/Users/sselt/Documents/blog_demo")

print( path / filename )

# C:\Users\sselt\Documents\blog_demo\some_file.txt

At first glance, dividing one path by another looks like invalid code. However, the Path class simply overloads the division operator so that it joins the two paths.
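As a short sketch of what the operator does, the following lines compare it with two equivalent spellings from pathlib, joinpath and the multi-argument constructor. The folder and file names are placeholders.

```python
from pathlib import Path

base = Path("blog_demo")

# The "/" operator is handled by the Path class and joins path segments
with_operator = base / "some_file.txt"

# Equivalent spellings without the operator
with_method = base.joinpath("some_file.txt")
with_constructor = Path("blog_demo", "some_file.txt")

print(with_operator == with_method == with_constructor)  # True

# A plain string on the left works too, as long as the other operand is a Path
reversed_operands = "blog_demo" / Path("some_file.txt")
print(reversed_operands == with_operator)  # True
```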

This variant is not only nicer, but also more robust against malformed input. In addition to the syntactic sugar, path objects absorb other typical errors. The code is also independent of the operating system: you define only a generic Path object, which is instantiated as a WindowsPath on Windows and as a PosixPath on Linux.

The following variants illustrate the error tolerance:

filename: Path = Path("some_file.txt")

# path with a superfluous trailing separator

path: Path = Path("C:/Users/sselt/Documents/blog_demo/")

# path with a doubled separator

path: Path = Path("C:/Users/sselt/Documents/blog_demo//")

# path with completely mixed-up separators

path: Path = Path("C:\\Users/sselt\\Documents/blog_demo")  # a wild mix

# all variants lead to the same result (shown here on Windows)

print(path/filename)

# C:\Users\sselt\Documents\blog_demo\some_file.txt
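The normalization can also be checked on a machine that does not run Windows. The sketch below assumes PureWindowsPath and PurePosixPath as stand-ins, since these pathlib classes apply the rules of one platform regardless of where the code runs. It also shows the limit of the tolerance: under POSIX rules a backslash is an ordinary character, so the mixed variant is only cleaned up on Windows.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

variants = [
    "C:/Users/sselt/Documents/blog_demo/",
    "C:/Users/sselt/Documents/blog_demo//",
    "C:\\Users/sselt\\Documents/blog_demo",
]

# PureWindowsPath applies Windows rules on any operating system
results = {str(PureWindowsPath(v) / "some_file.txt") for v in variants}
print(len(results))  # 1, all variants collapse to the same path

# Path() picks the concrete class for the current operating system
concrete_class = type(Path("blog_demo")).__name__
print(concrete_class)  # WindowsPath or PosixPath

# Under POSIX rules the backslash is not treated as a separator
posix_parts = PurePosixPath("C:\\Users/sselt").parts
print(posix_parts)
```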

Most functions that expect a path as a string also accept a Path object directly. In the rare cases where they do not, you can convert the object with str(path).
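A brief sketch of both cases, using a temporary folder so that it can be run anywhere. The file name some_file.txt is a placeholder; os.fspath is shown as a standard-library alternative to str for the explicit conversion.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "some_file.txt"

    # Built-in open() and the os.path functions accept Path objects directly
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("demo")
    size = os.path.getsize(path)
    print(size)  # 4

    # Explicit conversion for APIs that insist on a plain string
    as_string = str(path)
    as_fspath = os.fspath(path)
    print(as_string == as_fspath)  # True
```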

Processing The path With os.walk

In the solution from my last post, I used os.listdir, os.path.isdir and a recursive function to iterate through the path tree and differentiate between folders and files.

But os.walk offers a better solution. This function does not build a list; it returns a generator that you can consume step by step. Each result is a tuple containing the current folder path, a list of its subfolders, and a list of the files in that folder. The recursion happens automatically, so you get all the files with a single call.
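The behavior described above can be tried out on a throwaway folder tree. In this sketch, the names a.txt, sub and b.txt are made up for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a small tree: root/a.txt and root/sub/b.txt
    os.makedirs(os.path.join(root, "sub"))
    for relative in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(root, relative), "w") as handle:
            handle.write("x")

    found = []
    # os.walk yields one (dirpath, dirnames, filenames) tuple per folder
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            found.append(os.path.relpath(full_path, root))

print(sorted(found))
```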

The Better Solution With os.walk And Pathlib

If you combine the two aforementioned techniques, you get a solution that is simpler, fully independent of the OS, more robust against inconsistent path formats, and free of explicit recursion.
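A minimal sketch of such a combination could look as follows. The function name scan_files and the temporary demo tree are assumptions made for this illustration and are not taken from the original listing. The function reports size and modification date, the two properties evaluated in the first part.

```python
import os
import tempfile
from datetime import datetime
from pathlib import Path


def scan_files(root):
    """Yield (path, size in bytes, modification time) for every file below root.

    scan_files is a hypothetical helper name chosen for this sketch.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = Path(dirpath) / filename
            stat = path.stat()
            yield path, stat.st_size, datetime.fromtimestamp(stat.st_mtime)


# Usage on a throwaway folder tree
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sub").mkdir()
    (Path(tmp) / "a.txt").write_text("abc")
    (Path(tmp) / "sub" / "b.txt").write_text("hello")
    catalog = {path.name: size for path, size, _mtime in scan_files(tmp)}

print(catalog)
```

Because scan_files is a generator, files are reported as they are found, so a large drive does not have to be held in memory as one list.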

If you can top this with a best practice, don’t hesitate to get in touch. I’d love your feedback!

Read the first part of the blog post here.
