PyFu

Python Packages

Core Python Concepts

In Python, a package is a way of structuring and organizing related modules (Python files) under a common namespace. This makes it easier to manage, distribute, and reuse code across different parts of a project or even between projects.

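Technically, a regular package is just a directory that contains a special __init__.py file, which tells Python to treat that directory as a package. Since Python 3.3, a directory without __init__.py can also be imported as a namespace package, but most projects still ship an explicit __init__.py.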

Packages allow developers to build modular, maintainable software. Instead of dumping all functionality into a single file, you can split your application into logical components such as auth, database, or utils, each with its own module.

This is especially useful in larger projects or in team environments, where organization and scalability are critical.
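
The layout described above can be sketched with a throwaway package. This is a minimal illustration, not a recommended project structure: the package name demo_app, the utils module, and the greet function are all hypothetical, and the package is created in a temporary directory purely so the example is self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create a tiny package named demo_app (hypothetical) under root."""
    pkg = root / "demo_app"
    pkg.mkdir()
    # An empty __init__.py marks demo_app as a regular package.
    (pkg / "__init__.py").write_text("")
    # One module inside the package, exposing a single function.
    (pkg / "utils.py").write_text(
        "def greet(name):\n    return 'hello, ' + name\n"
    )


def load_greeting(name: str) -> str:
    """Build the package, import demo_app.utils, and call greet()."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_demo_package(root)
        # The parent directory must be on sys.path for the import to resolve.
        sys.path.insert(0, str(root))
        try:
            importlib.invalidate_caches()
            utils = importlib.import_module("demo_app.utils")
            return utils.greet(name)
        finally:
            # Undo the path change and forget the temporary modules.
            sys.path.remove(str(root))
            sys.modules.pop("demo_app.utils", None)
            sys.modules.pop("demo_app", None)
```

Calling load_greeting("world") imports the module through its package namespace (demo_app.utils), which is the same mechanism a real project uses for components such as auth or database.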

Python comes with a set of default (standard library) packages that are automatically installed with the interpreter. These packages provide core functionality such as file I/O, networking, data handling, and system operations.

On Debian-based systems (such as Ubuntu), Python’s default packages, also known as the standard library, are typically stored in:

/usr/lib/python3.X/
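
Rather than assuming the path, you can ask the running interpreter where its standard library lives. The sketch below uses sysconfig from the standard library; the helper name stdlib_dir is only an illustration, and the value it returns depends on the platform and on how Python was installed.

```python
import sysconfig


def stdlib_dir() -> str:
    """Return the directory the interpreter reports for its standard library."""
    # get_paths() returns a dict of installation paths; "stdlib" is the
    # location of the pure-Python standard library modules.
    return sysconfig.get_paths()["stdlib"]
```

On a Debian-based system this is expected to point somewhere under /usr/lib/, but other distributions, pyenv builds, and Windows installations will report different locations.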

We can determine where a given Python package is installed by inspecting its __file__ attribute. This attribute reveals the full path to the module or package’s source file, helping identify whether it’s part of the standard library, a third-party package, or a user-installed module.

For example:

Python 3.12.3 (main, Feb  4 2025, 14:48:35) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>> json.__file__
'/usr/lib/python3.12/json/__init__.py'
>>>

In this case, json is part of the standard library and is located under /usr/lib/python3.12/, which is the default location for standard library packages on Debian-based systems.

Some packages also expose their version through a __version__ attribute:

Python 3.12.3 (main, Feb  4 2025, 14:48:35) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>> json.__file__
'/usr/lib/python3.12/json/__init__.py'
>>> json.__version__
'2.0.9'
>>>

Both __file__ and __version__ attributes can be helpful when inspecting Python modules, but it’s important to note that they are not guaranteed to be available for all modules.

The __file__ attribute typically points to the location of the module’s source file, but it may be missing for built-in modules (such as sys or math) that are compiled into the Python interpreter and do not exist as standalone files on disk.

Similarly, the __version__ attribute is not part of any formal requirement for Python packages. While many third-party libraries define it as a convention, standard library modules usually do not. Attempting to access it on such modules will often result in an AttributeError.
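
Because neither attribute is guaranteed, it is safer to read them with getattr and a default than to access them directly. The following is a minimal sketch; the function name describe_module and the shape of the returned dictionary are choices made for this example only.

```python
import importlib


def describe_module(name: str) -> dict:
    """Import a module and report its file and version, if it exposes them."""
    module = importlib.import_module(name)
    return {
        "name": name,
        # None when the module has no file on disk (for example, sys).
        "file": getattr(module, "__file__", None),
        # None when the module does not define __version__.
        "version": getattr(module, "__version__", None),
    }
```

For a built-in module such as sys, both values come back as None instead of raising an AttributeError.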

PyPI

To create and distribute your own Python packages, you typically structure your code in a certain way and publish it to the Python Package Index (PyPI).

PyPI is the central repository where Python developers publish, share, and discover reusable code modules. It serves as the default source for Python packages and is tightly integrated with pip, Python’s package installer.

When you run a command like pip install somepackage, pip connects to PyPI to locate, download, and install the specified package along with its dependencies.

PyPI hosts a vast ecosystem of open-source libraries that span nearly every domain in Python development.

pip

pip is the package installer for Python. It allows you to download, install, upgrade, and manage third-party Python packages from the Python Package Index (PyPI) and other sources.

Installing packages with pip is straightforward. If pip is not already installed on your system, you can install it with the get-pip.py bootstrap script:

curl -O https://bootstrap.pypa.io/get-pip.py
python3 get-pip.py

After successful installation, you can start installing packages with simple commands:

pip install flask
pip install scapy
pip install fastapi

pip packages are typically installed in one of two main locations, depending on how the installation is performed:

User-level Installation

When using the --user flag or when the environment restricts system-wide access, packages are installed in:

~/.local/lib/python3.12/site-packages/

System-wide Installation

For installations done with administrative privileges, packages are installed in:

/usr/local/lib/python3.12/dist-packages/

In both cases, python3.12 corresponds to the version of Python in use. These directories are automatically included in Python’s module search path, allowing installed packages to be imported and used in your applications.
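
The exact directories vary between systems, so it can help to ask the interpreter directly. This sketch relies on the standard site and sys modules; the helper name package_locations and the dictionary keys are illustrative. It assumes a standard CPython install or a venv, where site.getsitepackages() is available.

```python
import site
import sys


def package_locations() -> dict:
    """Collect the directories this interpreter uses for installed packages."""
    return {
        # Per-user directory used by "pip install --user".
        "user_site": site.getusersitepackages(),
        # False when the user site directory is disabled for this interpreter.
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
        # System-wide (or venv-wide) package directories.
        "global_sites": site.getsitepackages(),
        # The full module search path, in lookup order.
        "sys_path": list(sys.path),
    }
```

Inside a virtual environment, global_sites points at the environment’s own site-packages directory rather than at the system location.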

You can retrieve the version of a Python package using its __version__ attribute, which is commonly defined by third-party packages to indicate their release version.

For example:

Python 3.12.3 (main, Feb  4 2025, 14:48:35) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import scapy
>>> scapy.__file__
'/home/user/.local/lib/python3.12/site-packages/scapy/__init__.py'
>>> scapy.__version__
'2.6.1'
>>> 

Python Importlib

Another useful way to access information about installed Python libraries is through the importlib module, which provides tools for interacting with the import system. Specifically, importlib.metadata and importlib.util can be used to retrieve package metadata and installation details.

To get the version of a package, you can use importlib.metadata.version():

Python 3.12.3 (main, Feb  4 2025, 14:48:35) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import importlib.metadata
>>> importlib.metadata.version("flask")
'3.1.0'
>>>
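
importlib.metadata.version() raises PackageNotFoundError when the distribution is not installed, so a script that checks several packages usually wraps the call. A minimal sketch follows; the helper name installed_version is hypothetical. Note that the argument is the distribution name as published on PyPI, which is not always the same as the import name.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None
```

Unlike the __version__ attribute, this lookup reads the installed package metadata and does not import the package itself.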

To get the installation path of a module, use importlib.util.find_spec():

Python 3.12.3 (main, Feb  4 2025, 14:48:35) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import importlib.util
>>> spec = importlib.util.find_spec("flask")
>>> print(spec.origin)
/usr/local/lib/python3.12/dist-packages/flask-3.1.0-py3.12.egg/flask/__init__.py
>>>

For a built-in module such as sys, the origin is reported as built-in instead of a path:

Python 3.12.3 (main, Feb  4 2025, 14:48:35) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import importlib.util
>>> spec = importlib.util.find_spec("sys")
>>> print(spec.origin)
built-in
>>> 
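
The two cases above can be folded into one helper that also handles modules that cannot be found. This is a sketch under a few assumptions: the function name locate and the returned strings "not found" and "no single origin file" are choices made for this example, not values defined by importlib.

```python
import importlib.util


def locate(name: str) -> str:
    """Describe where a module would be loaded from, without importing it."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised, for example, when a parent package cannot be imported.
        return "not found"
    if spec is None:
        return "not found"
    if spec.origin in ("built-in", "frozen"):
        # Compiled into the interpreter; there is no file on disk.
        return spec.origin
    if spec.origin is None:
        # For example a namespace package spread over several directories.
        return "no single origin file"
    return spec.origin
```

For a top-level name, find_spec() only locates the module and does not execute it; for a dotted name such as a.b, the parent package a is imported as part of the lookup.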

Why packaging matters from an offensive security perspective

The packaging system is an attack surface in two directions: it installs and runs other people’s code, and it tells you exactly what code a target is running.

pip install runs arbitrary code. Installing a package is not a passive download. Legacy setup.py-based packages execute that script at install and build time, so a malicious package runs its payload the moment someone installs it, with that user’s privileges. This is the mechanism behind most PyPI supply-chain attacks, and it shows up as:

  • Typosquatting. A package named one keystroke away from a popular one (reqeusts, python-dateutil lookalikes) that runs code on install.
  • Dependency confusion. An internal package name that also resolves on public PyPI; pip pulls the attacker’s public version instead of the intended private one.
  • Malicious updates. A previously-trusted package compromised and shipped with a payload in a new release.
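
To make the typosquatting case concrete, the sketch below flags a requested name that is suspiciously close to a well-known one. It is a rough illustration only: the KNOWN_PACKAGES list, the possible_typosquat helper, and the 0.85 similarity cutoff are all assumptions made for this example, and string similarity from difflib is a simple heuristic, not how PyPI or pip detect malicious packages.

```python
import difflib

# Hypothetical allowlist; a real one would come from your own dependency policy.
KNOWN_PACKAGES = ("requests", "flask", "fastapi", "scapy", "python-dateutil")


def possible_typosquat(name, known=KNOWN_PACKAGES, cutoff=0.85):
    """Return the known package that name closely resembles, or None."""
    normalized = name.lower().replace("_", "-")
    if normalized in known:
        # An exact match is the real package, not a lookalike.
        return None
    matches = difflib.get_close_matches(normalized, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

With this list, reqeusts is reported as a lookalike of requests, while requests itself and unrelated names return None. A check like this would have to run before pip install, since the payload executes during installation.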

site-packages is writable code on the import path. As this page shows, installed packages live in directories that Python imports from automatically. Anyone who can write there does not need pip at all: they can drop a .pth file that executes at interpreter startup, or shadow a real module with a trojaned one. That is its own technique, covered in Import System Abuse with .pth Files and sys.meta_path, and it is also why a venv’s site-packages is a persistence target (Python Virtual Environment).
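
The .pth behaviour can be audited from the defender’s side. When the site module processes a .pth file at startup, it executes any line that begins with import followed by a space or tab, and treats other non-comment lines as paths to add to sys.path. The sketch below lists those executable lines; the function name executable_pth_lines is hypothetical, and by default it only scans the directories that site reports, so it may miss other entries on sys.path.

```python
import site
from pathlib import Path


def executable_pth_lines(directories=None):
    """Yield (pth_file, line) for .pth lines that run code at startup."""
    if directories is None:
        directories = set(site.getsitepackages() + [site.getusersitepackages()])
    for directory in directories:
        path = Path(directory)
        if not path.is_dir():
            continue
        for pth in sorted(path.glob("*.pth")):
            try:
                text = pth.read_text(errors="replace")
            except OSError:
                # Unreadable file; skip it rather than abort the scan.
                continue
            for line in text.splitlines():
                # Same prefix check the site module applies before exec().
                if line.startswith(("import ", "import\t")):
                    yield pth, line
```

Some legitimate tools install .pth files with import lines, so a hit is a lead to review, not proof of compromise.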

Version introspection is reconnaissance. The __version__, importlib.metadata.version(), and __file__ lookups shown above are exactly what an attacker runs after landing any code execution. Exact dependency versions map directly to known CVEs, so enumerating them turns “I can run a line of Python” into “here is the vulnerable library to pivot through.” __file__ and find_spec().origin reveal the filesystem layout, whether the code runs from a venv or system Python, and where writable package directories are. The same attributes the page presents as debugging conveniences are the first commands of post-exploitation enumeration.
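
The enumeration step described above takes only a few lines, which is also why defenders can run the same inventory to compare against vulnerability advisories. This is a minimal sketch using importlib.metadata.distributions(); the helper name installed_distributions is hypothetical, and matching the output against CVE data is outside what the code does.

```python
from importlib import metadata


def installed_distributions() -> dict:
    """Map each installed distribution name to its version string."""
    inventory = {}
    for dist in metadata.distributions():
        # .get() avoids errors on distributions with incomplete metadata.
        name = dist.metadata.get("Name")
        if name:
            inventory[name] = dist.metadata.get("Version")
    # Sort case-insensitively so the listing is stable and easy to diff.
    return dict(sorted(inventory.items(), key=lambda item: item[0].lower()))
```

The result covers only the environment of the interpreter that runs it, so a venv and the system Python will produce different inventories.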