How to Get a List of All Files in a Directory With Python
Post content below copied from Real Python.
To get all the files in a directory with Python, you can leverage the pathlib module. This tutorial covers how to use methods like .iterdir(), .glob(), and .rglob() to list directory contents.
For a direct list of files and folders, you use .iterdir(). The .glob() and .rglob() methods support glob patterns for filtering files with specific extensions or names. For advanced filtering, you can combine these methods with comprehensions or filter functions.
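As a minimal sketch of that kind of filtering, the snippet below builds a throwaway directory and then separates files from folders. The directory contents (Notes, todo.txt, script.py) are made up for illustration and aren't part of the tutorial's materials.

```python
import pathlib
import tempfile

# Build a small temporary directory so the example is self-contained.
# The file and folder names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "Notes").mkdir()
    (root / "todo.txt").write_text("buy milk")
    (root / "script.py").write_text("print('hi')")

    # A generator expression keeps only regular files, skipping folders.
    files_only = sorted(item.name for item in root.iterdir() if item.is_file())

    # The same idea with filter() and a predicate, this time keeping folders.
    dirs_only = sorted(
        item.name for item in filter(pathlib.Path.is_dir, root.iterdir())
    )

print(files_only)  # ['script.py', 'todo.txt']
print(dirs_only)   # ['Notes']
```

The results are sorted because .iterdir() makes no promise about the order in which it yields items.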
By the end of this tutorial, you’ll understand that you can:
- List all files of a directory in Python using pathlib.Path().iterdir()
- Find all files with a particular extension with pathlib.Path().glob("*.extension")
- Use pathlib.Path().rglob("*") to recursively find all files in a directory and its subdirectories
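The three calls listed above can be compared side by side in a short sketch. It assumes a hypothetical directory tree created in a temporary folder, so the names scripts, run.py, notes.txt, and todo.txt are illustrative only.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "scripts").mkdir()
    (root / "todo.txt").write_text("")
    (root / "scripts" / "run.py").write_text("")
    (root / "scripts" / "notes.txt").write_text("")

    # .iterdir(): direct children only, both files and folders.
    top_level = sorted(p.name for p in root.iterdir())

    # .glob("*.txt"): direct children whose names match the pattern.
    txt_here = sorted(p.name for p in root.glob("*.txt"))

    # .rglob("*"): everything below root, at any depth.
    everything = sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))

print(top_level)   # ['scripts', 'todo.txt']
print(txt_here)    # ['todo.txt']
print(everything)  # ['scripts', 'scripts/notes.txt', 'scripts/run.py', 'todo.txt']
```

Note that .rglob("*") yields folders as well as files, so you'd add an .is_file() check if you want files only.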
You’ll explore the most general-purpose techniques in the pathlib module for listing items in a directory, but you’ll also learn a bit about some alternative tools.
Source Code: Click here to download the free source code, directories, and bonus materials that showcase different ways to list files and folders in a directory with Python.
Before pathlib came out in Python 3.4, if you wanted to work with file paths, then you’d use the os module. While this was very efficient in terms of performance, you had to handle all the paths as strings.
Handling paths as strings may seem okay at first, but once you start bringing multiple operating systems into the mix, things get trickier. You also end up with a lot of string manipulation code that's far removed from what a file path actually is. Things can get cryptic pretty quickly.
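To make the contrast concrete, here's a small sketch that builds the same path both ways. It uses pathlib's PurePosixPath and PureWindowsPath classes, which aren't discussed in this tutorial, only so that the output is the same no matter which OS you run it on. The path Desktop/Notes/todo.txt is just an example.

```python
import os.path
import pathlib

# String-based approach: the path is a plain str, and each piece of
# information comes from a separate helper function.
string_path = os.path.join("Desktop", "Notes", "todo.txt")
string_suffix = os.path.splitext(string_path)[1]

# Object-based approach: pure paths model one platform's rules
# without touching the file system.
posix_path = pathlib.PurePosixPath("Desktop") / "Notes" / "todo.txt"
windows_path = pathlib.PureWindowsPath("Desktop") / "Notes" / "todo.txt"

print(string_suffix)      # .txt
print(str(posix_path))    # Desktop/Notes/todo.txt
print(str(windows_path))  # Desktop\Notes\todo.txt
print(posix_path.suffix, posix_path.name, posix_path.parent.name)
```

With the object-based version, the separator is handled for you and details like the suffix or parent folder are attributes rather than results of string slicing.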
Note: Check out the downloadable materials for some tests that you can run on your machine. The tests will compare the time it takes to return a list of all the items in a directory using methods from the pathlib module, the os module, and even the Python 3.12 version of pathlib. That version adds a .walk() method, modeled on the well-known os.walk() function, which this tutorial doesn't cover.
That’s not to say that working with paths as strings isn’t feasible. After all, developers managed fine without pathlib for many years! The pathlib module just takes care of a lot of the tricky stuff and lets you focus on the main logic of your code.
It all begins with creating a Path object, which will be different depending on your operating system (OS). On Windows, you’ll get a WindowsPath object, while Linux and macOS will return PosixPath.
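A minimal sketch of creating a Path object follows. It uses the relative path "Desktop" that appears later in this tutorial, and the path doesn't need to exist for the object to be created.

```python
import pathlib

# Instantiating Path picks the concrete class for the current OS.
desktop = pathlib.Path("Desktop")

# Prints WindowsPath on Windows and PosixPath on Linux and macOS.
print(type(desktop).__name__)
```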
With these OS-aware objects, you can take advantage of the many methods and properties available, such as ones to get a list of files and folders.
Note: If you’re interested in learning more about pathlib and its features, then check out Python’s pathlib Module: Taming the File System and the pathlib documentation.
Now, it’s time to dive into listing folder contents. Be aware that there are several ways to do this, and picking the right one will depend on your specific use case.
Getting a List of All Files and Folders in a Directory in Python
Before getting started on listing, you’ll want a set of files that matches what you’ll encounter in this tutorial. In the supplementary materials, you’ll find a folder called Desktop. If you plan to follow along, download this folder, navigate to its parent folder, and start your Python REPL there.
You could also use your own desktop. Just start the Python REPL in the parent directory of your desktop, and the examples should work, but you’ll see your own files in the output instead.
Note: You’ll mainly see WindowsPath objects as outputs in this tutorial. If you’re following along on Linux or macOS, then you’ll see PosixPath instead. That’ll be the only difference. The code you write is the same on all platforms.
If you only need to list the contents of a given directory, and you don’t need to get the contents of each subdirectory too, then you can use the Path object’s .iterdir() method. If your aim is to move through directories and subdirectories recursively, then you can jump ahead to the section on recursive listing.
The .iterdir() method, when called on a Path object, returns a generator that yields Path objects representing child items. If you wrap the generator in a list() constructor, then you can see your list of files and folders:
>>> import pathlib
>>> desktop = pathlib.Path("Desktop")
>>> # .iterdir() produces a generator
>>> desktop.iterdir()
<generator object Path.iterdir at 0x000001A8A5110740>
>>> # Which you can wrap in a list() constructor to materialize
>>> list(desktop.iterdir())
[WindowsPath('Desktop/Notes'),
WindowsPath('Desktop/realpython'),
WindowsPath('Desktop/scripts'),
WindowsPath('Desktop/todo.txt')]
Passing the generator produced by .iterdir() to the list() constructor provides you with a list of Path objects representing all the items in the Desktop directory.
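One detail worth knowing is that .iterdir() yields items in arbitrary order, so the listing on your machine may not be alphabetical. If you need a stable order, one option is to pass the generator to sorted() instead of list(). The sketch below assumes a temporary stand-in for the Desktop folder, created with the same item names as in the output above.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Recreate a stand-in Desktop folder with the tutorial's item names.
    desktop = pathlib.Path(tmp) / "Desktop"
    desktop.mkdir()
    for folder in ("Notes", "realpython", "scripts"):
        (desktop / folder).mkdir()
    (desktop / "todo.txt").write_text("")

    # sorted() consumes the generator and returns an ordered list of paths.
    items = sorted(desktop.iterdir())
    names = [item.name for item in items]

print(names)  # ['Notes', 'realpython', 'scripts', 'todo.txt']
```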
Read the full article at https://realpython.com/get-all-files-in-directory-python/ »
January 12, 2025 at 07:30PM