What Is the __pycache__ Folder in Python?
When you develop a self-contained Python script, you might not notice anything unusual about your directory structure. However, as soon as your project becomes more complex, you’ll often decide to extract parts of the functionality into additional modules or packages. That’s when you may start to see a __pycache__ folder appearing out of nowhere next to your source files in seemingly random places:
project/
│
├── mathematics/
│   │
│   ├── __pycache__/
│   │
│   ├── arithmetic/
│   │   ├── __init__.py
│   │   ├── add.py
│   │   └── sub.py
│   │
│   ├── geometry/
│   │   │
│   │   ├── __pycache__/
│   │   │
│   │   ├── __init__.py
│   │   └── shapes.py
│   │
│   └── __init__.py
│
└── calculator.py
Notice that the __pycache__ folder can be present at different levels in your project’s directory tree when you have multiple subpackages nested in one another. At the same time, other packages or folders with your Python source files may not contain this mysterious cache directory.
Note: To maintain a cleaner workspace, many Python IDEs and code editors are configured out-of-the-box to hide the __pycache__ folders from you, even if those folders exist on your file system.
You may encounter a similar situation after you clone a remote Git repository with a Python project and run the underlying code. So, what causes the __pycache__ folder to appear, and for what purpose?
In Short: It Makes Importing Python Modules Faster
Even though Python is an interpreted programming language, its interpreter doesn’t operate directly on your Python code, which would be very slow. Instead, when you run a Python script or import a Python module, the interpreter compiles your high-level Python source code into bytecode, which is an intermediate binary representation of the code.
This bytecode enables the interpreter to skip recurring steps, such as lexing and parsing the code into an abstract syntax tree and validating its correctness every time you run the same program. As long as the underlying source code hasn’t changed, Python can reuse the intermediate representation, which is immediately ready for execution. This saves time and speeds up your script’s startup.
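If you’re curious what this bytecode looks like, you can compile a small snippet yourself and inspect it with the standard-library dis module. The snippet below is just a quick illustration, not part of the example project used later in this article:

import dis

# Compile a one-line snippet into a code object, the same intermediate form
# that the interpreter produces before executing your program.
code = compile("print(2 + 3)", filename="<example>", mode="exec")

# Show a human-readable listing of the resulting bytecode instructions.
dis.dis(code)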
Remember that while loading the compiled bytecode from __pycache__ makes Python modules import faster, it doesn’t affect their execution speed!
Why bother with bytecode at all instead of compiling the code straight to the low-level machine code? While machine code is what executes on the hardware, providing the ultimate performance, it’s not as portable or quick to produce as bytecode.
Machine code is a set of binary instructions understood by your specific CPU architecture, wrapped in a container format like EXE, ELF, or Mach-O, depending on the operating system. In contrast, bytecode provides a platform-independent abstraction layer and is typically quicker to compile.
Python uses local __pycache__ folders to store the compiled bytecode of imported modules in your project. On subsequent runs, the interpreter will try to load precompiled versions of modules from these folders, provided they’re up-to-date with the corresponding source files. Note that this caching mechanism only gets triggered for modules you import in your code, not for scripts you execute directly in the terminal.
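If you want to know where Python would put the cached bytecode for a particular source file, the standard-library importlib.util.cache_from_source() function computes that path for you. The file name arithmetic.py below is only an example, and the function works on the path alone, so the file doesn’t need to exist:

import importlib.util

# Map a source file to its would-be cache location, for example
# __pycache__/arithmetic.cpython-312.pyc. The exact "cpython-312" tag
# depends on the interpreter you're running.
print(importlib.util.cache_from_source("arithmetic.py"))

If you’d rather create the cache ahead of time instead of waiting for the first import, the standard-library py_compile and compileall modules can compile source files explicitly.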
In addition to this on-disk bytecode caching, Python keeps an in-memory cache of modules, which you can access through the sys.modules dictionary. It ensures that when you import the same module multiple times from different places within your program, Python will use the already imported module without needing to reload or recompile it. Both mechanisms work together to reduce the overhead of importing Python modules.
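You can watch the in-memory cache at work with a few lines of code. Here’s a minimal sketch using a standard-library module:

import sys

import math
print("math" in sys.modules)         # True: the module object is now cached

import math                          # Reuses the cached object, no recompilation
print(sys.modules["math"] is math)   # True: both names refer to the same object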
Next, you’re going to find out exactly how much faster Python loads the cached bytecode as opposed to compiling the source code on the fly when you import a module.
How Much Faster Is Loading Modules From Cache?
The caching happens behind the scenes and usually goes unnoticed since Python is quite rapid at compiling the bytecode. Besides, unless you often run short-lived Python scripts, the compilation step remains insignificant when compared to the total execution time. That said, without caching, the overhead associated with bytecode compilation could add up if you had lots of modules and imported them many times over.
To measure the difference in import time between a cached and uncached module, you can pass the -X importtime option to the python command or set the equivalent PYTHONPROFILEIMPORTTIME environment variable. When this option is enabled, Python will display a table summarizing how long it took to import each module, including the cumulative time in case a module depends on other modules.
Suppose you had a calculator.py script that imports and calls a utility function from a local arithmetic.py module:
calculator.py
from arithmetic import add
add(3, 4)
The imported module defines a single function:
arithmetic.py
def add(a, b):
    return a + b
As you can see, the main script delegates the addition of two numbers, three and four, to the add() function imported from the arithmetic module.
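With both files saved in the same directory, you can produce the import-time report described earlier without leaving Python by launching the script in a child interpreter. This is only a sketch, roughly equivalent to typing python -X importtime calculator.py in a terminal:

import subprocess
import sys

# Run calculator.py in a fresh interpreter with import-time profiling enabled,
# using the same Python executable that runs this snippet.
result = subprocess.run(
    [sys.executable, "-X", "importtime", "calculator.py"],
    capture_output=True,
    text=True,
)

# Python writes the import-time table to standard error, not standard output.
print(result.stderr)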
Note: Even though you use the from ... import syntax, which only brings the specified symbol into your current namespace, Python reads and compiles the entire module anyway. Moreover, unused imports would also trigger the compilation.
The first time you run your script, Python compiles and saves the bytecode of the module you imported into a local __pycache__ folder. If such a folder doesn’t already exist, then Python automatically creates one before moving on. Now, when you execute your script again, Python should find and load the cached bytecode as long as you didn’t alter the associated source code.
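If you’d like to put a rough number on the difference for this example, the following sketch deletes the local __pycache__ folder, times one import of arithmetic that has to compile the source, and then times a second import that loads the freshly written bytecode. Run it from the directory containing arithmetic.py, and keep in mind that for a module this small the gap will be modest:

import importlib
import shutil
import sys
import time

def timed_import(name):
    """Import name with a cold in-memory cache and return the elapsed seconds."""
    sys.modules.pop(name, None)       # Forget any already imported copy.
    importlib.invalidate_caches()     # Make the import system re-scan the directory.
    start = time.perf_counter()
    importlib.import_module(name)
    return time.perf_counter() - start

# Remove the on-disk cache so the first import has to recompile arithmetic.py.
shutil.rmtree("__pycache__", ignore_errors=True)

print(f"compile and import: {timed_import('arithmetic'):.6f}s")
print(f"cached import:      {timed_import('arithmetic'):.6f}s")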
Read the full article at https://realpython.com/python-pycache/ »