Python Hacking: Techniques in Python Library Hijacking

Introduction

Today, many programming languages are used to build software ranging from simple to complex applications. Among them, Python has grown steadily in popularity since its introduction in 1991.

Python is one of the most widely used programming languages, powering everything from web applications to cybersecurity tools. As a consequence of its wide use, it has become a target for malicious actors, giving rise to a lesser-known but dangerous attack vector: Python Library Hijacking, a technique in which malicious actors inject rogue code into Python packages or exploit dependency resolution issues to run arbitrary code on a victim's device.

In this article, we explain the main techniques involved in Python Library Hijacking, along with real-world examples and test cases.

Python Library Hijacking and Its Techniques

Python applications often rely on third-party libraries installed via pip or other package managers. If an attacker manages to trick a developer or system into installing a malicious package instead of a legitimate one, they can execute arbitrary code on the target system.

Python Library Hijacking involves injecting malicious code into existing Python applications or scripts, through any applicable technique, so that it executes on the target system. It typically exploits vulnerabilities in Python libraries or manipulates Python's import mechanism to execute unauthorized or malicious code.

These attacks fall into the following categories (i.e., techniques):

  1. Typosquatting

  2. Dependency Confusion

  3. Hijacking Abandoned Packages

  4. Exploiting Misconfigured Imports

  1. Typosquatting:

    Here, attackers create and register malicious packages with names similar to popular Python packages, hoping that developers will download them by mistake (e.g., reqeusts for requests, urllib for urllib3).

    Research from ReversingLabs shows that threat actors have mimicked dozens of legitimate Python packages to fool developers into installing them. The original research includes a list of some of these malicious packages.
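As a rough illustration of how a near-miss name could be caught before installation, the sketch below compares a requested name against a small list of known packages using the standard library's difflib. The KNOWN_PACKAGES list, the possible_typosquat function, and the 0.8 similarity cutoff are all assumptions made for this example, not part of any official tool.

```python
import difflib

# Hypothetical allowlist; in practice this would come from your own
# dependency inventory rather than a hard-coded list.
KNOWN_PACKAGES = ["requests", "urllib3", "numpy", "pandas", "django"]


def possible_typosquat(name, known=KNOWN_PACKAGES, cutoff=0.8):
    """Return the known package that `name` closely resembles, or None.

    An exact match is treated as legitimate and returns None.
    """
    normalized = name.lower()
    if normalized in known:
        return None
    # get_close_matches ranks candidates by similarity ratio (0.0 to 1.0).
    matches = difflib.get_close_matches(normalized, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


for requested in ["reqeusts", "urllib", "requests"]:
    print(requested, "->", possible_typosquat(requested))
```

A check like this only flags names that look similar to packages you already trust; it cannot tell whether the similar-looking package is actually malicious.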

  2. Dependency Confusion:

    If an application references an internal package that is not published on a public index, and an attacker publishes a package with the same name on PyPI (the Python Package Index), Python's package manager might mistakenly install the attacker's package instead.

    In the article Dependency Confusion: How I Hacked Into Apple, Microsoft and Dozens of Other Companies, Alex Birsan describes how this technique can be used to compromise machines that install Python packages. Many organizations use internal packages that are not published on public package indexes, and the names of these packages can leak to attackers. Once attackers learn these names, they publish malicious libraries under the same names on a public index and assign them a higher version number. When the legitimate packages are later installed, the package manager tends to select the highest available version, so the malicious library is installed instead unless the installing machine has been properly configured to use the correct internal source.
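The version-based selection described above can be sketched with a toy resolver. This is a simplified model, not pip's actual implementation: the index names, version numbers, and the parse_version and pick_candidate helpers are hypothetical, and real version parsing handles many more formats than plain dotted integers.

```python
def parse_version(text):
    """Turn a simple dotted version such as '1.4.2' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def pick_candidate(candidates):
    """Pick the (index, version) pair with the highest version number.

    Simplified model: when several indexes offer the same package name,
    the highest version wins regardless of which index it came from.
    """
    return max(candidates, key=lambda item: parse_version(item[1]))


# Hypothetical data: the same package name is offered by two indexes.
candidates = [
    ("internal-index", "1.4.2"),  # the legitimate in-house release
    ("public-pypi", "99.0.0"),    # attacker's upload with an inflated version
]

chosen = pick_candidate(candidates)
print("selected:", chosen)
```

In this model the attacker only needs to know the package name and pick a version number larger than anything the organization is likely to publish.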

  3. Hijacking Abandoned Packages:

    Some maintainers abandon their packages over time, leaving them open to takeover. Attackers who gain control of such a package then publish harmful code as an update to it. Users who update to the latest version of the package unintentionally install that code on their devices.
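Because the harmful code arrives as a new version, one way to reduce exposure is to notice which dependencies would silently pick up the latest release. The sketch below flags requirement lines that are not pinned to an exact version. The unpinned_requirements helper, the regular expression, and the sample requirements text are assumptions for illustration, and the pattern covers only the simplest requirement syntax.

```python
import re

# Matches "name==version" or "name[extra]==version"; anything else is
# treated as unpinned. This is a deliberately narrow pattern for the demo.
PINNED = re.compile(
    r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*[A-Za-z0-9.!+_-]+(\s|;|$)"
)


def unpinned_requirements(text):
    """Return requirement lines that do not pin an exact version with '=='."""
    flagged = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            flagged.append(line)
    return flagged


# Illustrative requirements content, not taken from a real project.
sample = """\
requests==2.31.0
urllib3>=1.26
somepackage
# a comment
"""

print(unpinned_requirements(sample))
```

Pinning does not make an abandoned package safe; it only means a hijacked update is not pulled in until someone deliberately changes the pinned version.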

  4. Exploiting Misconfigured Imports:

Attackers place malicious Python modules in directories where they might be unintentionally imported by legitimate applications.

If a Python script imports a library by bare name (import module), Python searches a list of locations in order (sys.path), starting with the directory that contains the script. An attacker who places a malicious module in a directory that is searched before the legitimate one can execute their payload.
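The effect of search order can be seen in a small, harmless experiment. The sketch below writes two modules with the same name into two temporary directories and imports the name with one directory placed ahead of the other. The module name hijack_demo_module and the helper functions are hypothetical, chosen only for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "hijack_demo_module"  # hypothetical name, used only for this demo


def write_module(directory, marker):
    """Create MODULE_NAME.py in `directory` with a SOURCE constant."""
    path = Path(directory) / f"{MODULE_NAME}.py"
    path.write_text(f"SOURCE = {marker!r}\n")


def resolve_with_search_path(first_dir, second_dir):
    """Import MODULE_NAME with first_dir ahead of second_dir on sys.path."""
    original_path = list(sys.path)
    sys.path[:0] = [first_dir, second_dir]
    try:
        importlib.invalidate_caches()
        sys.modules.pop(MODULE_NAME, None)
        module = importlib.import_module(MODULE_NAME)
        return module.SOURCE
    finally:
        # Restore the interpreter state so the demo leaves no side effects.
        sys.path[:] = original_path
        sys.modules.pop(MODULE_NAME, None)


with tempfile.TemporaryDirectory() as legit_dir, tempfile.TemporaryDirectory() as rogue_dir:
    write_module(legit_dir, "legitimate")
    write_module(rogue_dir, "attacker-controlled")
    # The rogue directory comes first, so its copy is the one imported.
    winner = resolve_with_search_path(rogue_dir, legit_dir)

print("imported copy:", winner)
```

The importing code is identical in both cases; only the order of directories on the search path decides which file runs.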

Exploiting misconfigured imports usually takes one of three forms:

  • Write permissions exist on the imported Python module

  • Broken privileges on a higher-priority Python library path

  • Redirecting the Python library search through the PYTHONPATH environment variable
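A defender can look for the conditions listed above with a quick audit of the import search path. The following is a minimal sketch under stated assumptions: writable_search_dirs and pythonpath_entries are hypothetical helper names, and the result only reflects the permissions of the account running the script, so it would need to be run as the low-privileged user of interest.

```python
import os
import sys


def writable_search_dirs(paths=None):
    """Return directories on the import search path that the current user can write to."""
    paths = sys.path if paths is None else paths
    findings = []
    for entry in paths:
        directory = entry or os.getcwd()  # an empty entry means the current directory
        if os.path.isdir(directory) and os.access(directory, os.W_OK):
            findings.append(directory)
    return findings


def pythonpath_entries(environ=None):
    """Return the directories injected through the PYTHONPATH environment variable."""
    environ = os.environ if environ is None else environ
    raw = environ.get("PYTHONPATH", "")
    return [part for part in raw.split(os.pathsep) if part]


if __name__ == "__main__":
    for directory in writable_search_dirs():
        print("writable search directory:", directory)
    for directory in pythonpath_entries():
        print("added via PYTHONPATH:", directory)
```

A writable directory that appears early in the output, or an unexpected PYTHONPATH entry, is a candidate location for a planted module.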

In the Empire: LupinOne CTF solution, the first scenario (write permissions on the imported module) was used to exploit the target machine and escalate privileges horizontally.

These scenarios will be explained further, with examples, in a later article.

Mitigation Strategies for Python Library Hijacking

Having identified what Python Library Hijacking is and the techniques associated with it, the following measures can be implemented to prevent it:

  1. Verify Package Names:

    • Always double-check package names before installation.

    • Avoid typos by copying and pasting install commands from trusted sources only.

  2. Use Trusted Sources:

    • Always install packages from official repositories like PyPI.

    • Verify package maintainers before installation.

  3. Enable Hash-Based Verification:

    • Use pip install --require-hashes -r requirements.txt to ensure package integrity.

  4. Enforce Import Safety:

    • Use absolute imports (from package import module) to avoid unintended imports.

    • Validate the source of modules before importing them.

  5. Restrict External Dependencies:

    • Configure private package indexes for internal dependencies.

    • Point pip at the internal index, for example with pip config set global.index-url <internal-repo>, to prevent accidental external installs.
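To make the idea behind hash-based verification concrete, the sketch below computes a file's SHA-256 digest and compares it with a pinned value written in the sha256:<hex> form. It illustrates the principle only and is not how pip implements the check internally. The helper names, the file name, and the file contents are hypothetical.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_hash(path, pinned):
    """Check a file against a pinned value of the form 'sha256:<hex digest>'."""
    algorithm, _, expected = pinned.partition(":")
    if algorithm != "sha256":
        raise ValueError("this sketch only supports sha256 pins")
    return hmac.compare_digest(sha256_of_file(path), expected.lower())


with tempfile.TemporaryDirectory() as workdir:
    artifact = Path(workdir) / "example_pkg-1.0.tar.gz"  # hypothetical file name
    artifact.write_bytes(b"original release contents")
    pinned = "sha256:" + sha256_of_file(artifact)  # recorded when the file was trusted

    intact = matches_pinned_hash(artifact, pinned)

    # Simulate the file being replaced after the hash was recorded.
    artifact.write_bytes(b"tampered release contents")
    tampered_ok = matches_pinned_hash(artifact, pinned)

print("intact file accepted:", intact)
print("tampered file accepted:", tampered_ok)
```

Any change to the downloaded file, however small, produces a different digest, which is why a recorded hash can detect a substituted package even when the name and version look correct.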

Conclusion

Python Library Hijacking is a major cyber threat that takes advantage of developers' trust in open-source packages. Attackers can execute malicious code on vulnerable systems using techniques such as typosquatting, dependency confusion, hijacking abandoned packages, and exploiting misconfigured imports. These attacks can compromise critical data, install backdoors, or even take over entire infrastructures.

As Python continues to be widely adopted across industries, securing its ecosystem must remain a priority. Developers must be vigilant, cautious, and proactive in defending against Python Library Hijacking to ensure the integrity and security of their applications.

References

  1. Developers beware: Imposter HTTP libraries lurk on PyPI

  2. Dependency Confusion: How I Hacked Into Apple, Microsoft and Dozens of Other Companies

  3. Dependency Confusion: An Exploitation Overview

  4. Secure Installs

  5. Six Malicious Python Packages in the PyPI Targeting Windows Users