From Confusion to Compromise: Dependency Confusion Attacks
A primer on dependency confusion attacks.
# What are Dependency Confusion Attacks?
Dependency confusion, also known as namespace substitution or namespace confusion, is a type of supply chain attack. It was first introduced in February 2021 by security researcher Alex Birsan, who demonstrated the attack vector by infiltrating companies such as Apple and Microsoft, and it gained significant popularity in 2022.
Any npm package can execute arbitrary code before and after installation, for example through `preinstall` and `postinstall` lifecycle scripts.
It is common for a corporate codebase to use internal packages hosted on a private registry. Taking advantage of package manager behavior and automated development tools (such as the auto-update mechanisms common in CI workflows), attackers publish a malicious package to npm or PyPI with the same name as the internal, legitimate one. And once it is inside… well, it can do a lot of damage.
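The "package manager behavior" in question can be illustrated with a toy model. The sketch below assumes a resolver that merges candidates from every configured registry and simply prefers the highest version number; it is not any specific package manager's algorithm, and the package name `acme-billing` and the `Candidate` type are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: str
    index: str  # which registry offered this release


def version_key(version: str) -> tuple[int, ...]:
    """Turn '1.2.3' into (1, 2, 3) for comparison (numeric segments only)."""
    return tuple(int(part) for part in version.split("."))


def naive_resolve(name: str, candidates: list[Candidate]) -> Candidate:
    """Pick the highest version for `name`, ignoring where it came from."""
    matches = [c for c in candidates if c.name == name]
    return max(matches, key=lambda c: version_key(c.version))


# Hypothetical internal package and a public look-alike with an inflated version.
candidates = [
    Candidate("acme-billing", "1.4.0", "private"),
    Candidate("acme-billing", "99.0.0", "public"),
]

chosen = naive_resolve("acme-billing", candidates)
print(f"{chosen.name} {chosen.version} from {chosen.index} index")
```

Because the toy resolver never asks which registry a release came from, the inflated public version wins. That is the confusion the attack relies on.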
# PyTorch Dependency Poisoning Incident
In December 2022, PyTorch, a popular Python library used in the field of machine learning, was the target of such an attack. The PyTorch team released an announcement indicating that PyTorch nightly builds installed via pip between December 25, 2022 and December 30, 2022 pulled in a compromised version of the related torchtriton dependency.
In this case, a malicious actor uploaded a poisoned package to the Python Package Index (PyPI) under the same name as the real dependency, torchtriton. Once installed, the impostor package ran a binary that extracted sensitive information from infected host systems:
- nameservers from `/etc/resolv.conf`
- hostname from `gethostname()`
- current username from `getlogin()`
- current working directory name from `getcwd()`
- environment variables
The malicious code also read the following files:
- `/etc/hosts`
- `/etc/passwd`
- The first 1,000 files in `$HOME/*`
- `$HOME/.gitconfig`
- `$HOME/.ssh/*`
Then the attacker exfiltrated all of this information over the network, via encrypted DNS queries to the domain `*.h4ck[.]cfd`, using the DNS server `wheezy[.]io`. The malicious version of torchtriton ended up being downloaded 2,386 times.
Just in time for the holidays! ☃️
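A first, non-conclusive step for anyone wondering about their own environment is to see whether a distribution named torchtriton is installed at all. The sketch below uses only the standard library. The helper name `installed_version` is made up for illustration, and finding the package does not by itself show that it is the malicious build; the PyTorch team's announcement remains the authoritative guidance.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of `distribution`, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("torchtriton")
if version is None:
    print("torchtriton is not installed in this environment")
else:
    # Presence alone is not proof of compromise; check where it came from.
    print(f"torchtriton {version} is installed; verify its origin")
```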
# How this Relates to Transitive Dependencies
It’s important to note that dependency confusion attacks can also occur with transitive dependencies, which makes them even harder to detect. Transitive dependencies are the dependencies of your dependencies: packages you never list yourself, but which get installed because something you depend on requires them.
If torchtriton is a dependency of another package, then installing that package would result in the malicious torchtriton being installed along with it.
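To make the idea concrete, here is a minimal sketch that walks a dependency graph and collects everything reachable from a top-level package. The graph and the names `my-app`, `ml-toolkit`, and `nightly-framework` are hypothetical; a real project would build this graph from its lockfile or from its package manager's output.

```python
def transitive_dependencies(package: str, graph: dict[str, list[str]]) -> set[str]:
    """Return every package reachable from `package` via dependency edges."""
    seen: set[str] = set()
    stack = list(graph.get(package, []))
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue  # already visited; also guards against cycles
        seen.add(dep)
        stack.extend(graph.get(dep, []))
    return seen


# Hypothetical graph: the application never lists torchtriton itself.
graph = {
    "my-app": ["ml-toolkit"],
    "ml-toolkit": ["nightly-framework", "numpy"],
    "nightly-framework": ["torchtriton"],
}

deps = transitive_dependencies("my-app", graph)
print("torchtriton" in deps)  # True: pulled in two levels down
```

Nothing in the manifest for `my-app` mentions torchtriton, yet it ends up in the install set, which is why reviewing only direct dependencies is not enough.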
# Why Dependency Confusion Attacks are Hard to Detect
Dependency confusion attacks can be hard to detect because they rely on the automated processes that developers use to manage dependencies in their code. Developers typically trust that the packages they download and install are legitimate and have been reviewed for vulnerabilities. However, this trust can be exploited by malicious actors who create malicious packages with the same name as internal packages and upload them to public package repositories.
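Since the attack hinges on a public package sharing a name with an internal one, name overlap is one signal worth checking. The sketch below assumes PyPI's JSON endpoint at `https://pypi.org/pypi/<name>/json`, which returns 404 for unknown projects. The function names and the `acme-*` package names are made up for illustration, and the demonstration uses a stand-in lookup so that it runs offline.

```python
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Iterable


def exists_on_pypi(name: str) -> bool:
    """Return True if PyPI's JSON API knows a project called `name`."""
    url = f"https://pypi.org/pypi/{urllib.parse.quote(name)}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise  # other HTTP errors are unexpected; surface them


def find_collisions(
    internal_names: Iterable[str],
    exists: Callable[[str], bool] = exists_on_pypi,
) -> list[str]:
    """Return the internal package names that also exist publicly."""
    return [name for name in internal_names if exists(name)]


# Offline demonstration with a stand-in lookup; pass nothing for `exists`
# to query PyPI for real.
fake_public_index = {"acme-billing"}
collisions = find_collisions(
    ["acme-billing", "acme-auth"],
    exists=fake_public_index.__contains__,
)
print(collisions)
```

A collision is not necessarily malicious, but an internal name that suddenly appears on a public registry deserves a closer look.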
To mitigate the risk of dependency confusion attacks, organizations must adopt a proactive approach to managing dependencies. This includes:
- Regularly reviewing and updating dependencies to ensure that they are up-to-date and do not contain known vulnerabilities.
- Using proactive security scanners that detect previously unknown threats such as malware, so that malicious packages can be detected and blocked before they get downloaded.
- Using version pinning. In manifest files such as `package.json` and `requirements.txt`, it’s best to pin each package to an exact version. A resolver that honors the pin will not jump to a higher-numbered release that an attacker publishes under the same name on the public registry. Pinning alone does not verify where a package came from, so pair it with lockfiles or hash checking where the tooling supports them.
- Monitoring package repositories for malicious packages that may have the same name as internal packages.
- Using private package repositories or internal package management systems to avoid relying on public package repositories.
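As one illustration of the version pinning item, the sketch below scans `requirements.txt`-style text and reports entries that are not pinned to a single exact version. It is a simplified check under stated assumptions: it only recognizes plain `name==version` lines, it skips option lines that start with `-`, and it does not handle URLs, inline hashes, or line continuations. The function name and sample contents are made up for illustration.

```python
import re

# Matches "name==version" with optional extras and an optional environment
# marker; anything else is treated as unpinned.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s;,*]+(\s*;.*)?$"
)


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that are not pinned to one exact version."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems


sample = """\
requests==2.31.0
acme-billing>=1.0      # hypothetical internal package, floating range
numpy
flask==2.*
"""
print(unpinned_requirements(sample))
```

A check like this could run in CI so that floating ranges and wildcards are caught before a resolver gets the chance to pick up an unexpected release.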
In conclusion, dependency confusion attacks are a real threat to the software supply chain. By understanding how these attacks work, and adopting a proactive approach to managing dependencies, developers and organizations can reduce their risk exposure to novel supply chain threats.