The NPM (Node Package Manager) registry suffers from a security lapse called “manifest confusion,” which undermines the trustworthiness of packages and makes it possible for attackers to hide malware in dependencies or perform malicious script execution during installation.
NPM is a package manager for the JavaScript programming language and the default for the widely used Node.js environment. The package manager helps project owners automate the installation, upgrading, and configuration of software packages hosted on the “npm registry” database at npmjs.com.
In 2020, the platform was acquired by Microsoft through GitHub, and today, it is estimated that over 17 million software developers worldwide are using it, downloading 208 billion packages per month.
Darcy Clarke, a former GitHub engineer, has highlighted the manifest confusion problem in a write-up on his blog, explaining that despite his former employer knowing about the issue since at least November 2022, little has been done to address the associated risks.
The NPM registry is also immensely popular among developers as it contains a wide assortment of packages that can be used to extend an application’s features without requiring additional development work.
However, its popularity makes it a prime target for threat actors, who distribute malicious packages to take over developers’ computers, steal credentials, or even deploy ransomware.
Manifest confusion
Manifest confusion occurs when there is an inconsistency between a package’s manifest information presented on the npm registry and the actual ‘package.json’ file in the tarball of the published npm package, which is what is used when the package is installed.
Both the manifest data submitted to NPM when publishing a package and the package.json contain information about the package name, version, and other metadata, such as scripts used in deployment, build dependencies, etc.
The two are submitted separately to the npm registry, and the platform does not validate if they match, so their data could differ, and no one would know unless they scrutinize their contents.
This allows a threat actor to modify the manifest data submitted with a new package to remove dependencies and scripts so that they do not appear in the NPM registry. However, these scripts and dependencies still exist in the package.json file and will be executed when the package is installed.
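A check for this kind of mismatch can be sketched in a few lines of Python. This is only an illustration, assuming the registry manifest and the tarball's package.json have already been loaded as dictionaries; the function name, the list of compared fields, and the sample package data are all hypothetical.

```python
from typing import Any

# Fields the article names as present in both places. A registry manifest
# may carry extra registry-only keys, so only these are compared.
COMPARED_FIELDS = ("name", "version", "dependencies", "scripts")


def find_mismatches(
    manifest: dict[str, Any], package_json: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return {field: (manifest_value, package_json_value)} for differing fields."""
    mismatches = {}
    for field in COMPARED_FIELDS:
        # Treat a missing field and an empty one as equivalent.
        in_manifest = manifest.get(field) or None
        in_tarball = package_json.get(field) or None
        if in_manifest != in_tarball:
            mismatches[field] = (in_manifest, in_tarball)
    return mismatches


# Hypothetical data: the manifest shows no dependencies or scripts, while
# the package.json shipped in the tarball declares both.
manifest = {"name": "example-pkg", "version": "1.0.0", "dependencies": {}}
package_json = {
    "name": "example-pkg",
    "version": "1.0.0",
    "dependencies": {"hidden-dep": "^2.0.0"},
    "scripts": {"install": "node setup.js"},
}

for field, (listed, actual) in find_mismatches(manifest, package_json).items():
    print(f"{field}: registry shows {listed!r}, tarball has {actual!r}")
```

Any non-empty result means the registry listing cannot be taken at face value for that package.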
This “manifest confusion” is illustrated in the image below, showing that there are zero dependencies listed on NPM for Clarke’s proof-of-concept package, even though there are dependencies listed in the package.json.
The risks that arise from the “manifest confusion” inconsistency include cache poisoning, installation of unknown dependencies, execution of unknown scripts, and potentially also downgrade attacks.
“And to be clear, it’s not just hidden dependencies,” Socket CEO Feross Aboukhadijeh warned BleepingComputer, adding that Socket’s tools are not affected by this issue.
“Manifest confusion also allows an attacker to include hidden install scripts too. These hidden scripts and dependencies won’t show up on the npm website or in most security tools, even though they will be installed by the npm CLI.”
The problem affects the npm community as a whole and all major package managers, including npm@6, npm@9, yarn@1, and pnpm@7.
This leads to a lack of trust in the NPM registry, as the dependencies, version numbers, and even package names it displays may not be accurate.
Instead of relying on the registry listing, developers should manually read the package.json inside the package tarball to determine version numbers, what dependencies will be installed, and what scripts will be executed.
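Reading the package.json by hand can be partly automated. The following Python sketch pulls the file out of a package tarball and lists the dependencies and the install-time lifecycle scripts (preinstall, install, postinstall). It assumes the tarball is gzip-compressed and keeps package.json one level below a single top-level folder; the helper names and the demo package are hypothetical, and the demo tarball is built in memory rather than downloaded.

```python
import io
import json
import tarfile

# Lifecycle scripts that npm runs automatically during installation.
INSTALL_SCRIPTS = ("preinstall", "install", "postinstall")


def read_package_json(tarball_bytes: bytes) -> dict:
    """Return the parsed package.json from a gzipped package tarball."""
    with tarfile.open(fileobj=io.BytesIO(tarball_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            # Assumption: one top-level folder, then package.json.
            parts = member.name.split("/")
            if member.isfile() and len(parts) == 2 and parts[1] == "package.json":
                return json.load(archive.extractfile(member))
    raise FileNotFoundError("no top-level package.json in tarball")


def summarize(package_json: dict) -> dict:
    """Pick out what will actually be installed and executed."""
    scripts = package_json.get("scripts", {})
    return {
        "name": package_json.get("name"),
        "version": package_json.get("version"),
        "dependencies": package_json.get("dependencies", {}),
        "install_scripts": {k: scripts[k] for k in INSTALL_SCRIPTS if k in scripts},
    }


def build_demo_tarball(package_json: dict) -> bytes:
    """Build a small in-memory tarball so the example runs offline."""
    payload = json.dumps(package_json).encode("utf-8")
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo(name="package/package.json")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


demo = build_demo_tarball({
    "name": "example-pkg",
    "version": "1.0.0",
    "dependencies": {"hidden-dep": "^2.0.0"},
    "scripts": {"postinstall": "node setup.js", "test": "jest"},
})
print(json.dumps(summarize(read_package_json(demo)), indent=2))
```

With a real package, the bytes passed to read_package_json would come from the downloaded tarball instead of build_demo_tarball.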
Problem still unfixed
Clarke says GitHub has known about manifest confusion problems since at least 2022, and a bug report filed on the npm CLI’s GitHub repository concerning the node-canvas package seems to confirm that.
On March 9, 2023, the engineer submitted a detailed HackerOne report that presented examples of the problem.
On March 21, 2023, GitHub closed the ticket, responding that they were dealing with the issue internally. Still, they have yet to remediate the risks and have not communicated them to the npm community.
Clarke mentions that due to npm’s sheer size and the fact that it has been following this unsafe practice for many years, addressing this problem is far from trivial.
Until GitHub forges a plan to deal with manifest confusion on npm, Clarke suggests that authors and maintainers of packages remove the reliance on manifest data and instead source all metadata apart from the name and version from ‘package.json’ files that are less prone to manipulation.
Another protection measure would be to use a registry proxy between the package database and the npm client, which could implement additional checks and validations to ensure the consistency between the manifest data and the information in the package’s tarball.
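The core check such a proxy might run before handing a package to the client could look roughly like the Python sketch below. It is not based on any existing proxy product: the function, the exception class, the compared fields, and the sample data are hypothetical, and it assumes the proxy already holds the manifest as a dictionary and the tarball as bytes, with package.json stored at package/package.json inside the archive.

```python
import io
import json
import tarfile

COMPARED_FIELDS = ("name", "version", "dependencies", "scripts")


class ManifestMismatch(Exception):
    """Raised when registry metadata disagrees with the tarball's package.json."""


def check_before_serving(manifest: dict, tarball_bytes: bytes) -> None:
    """Refuse to serve a package whose manifest and package.json differ."""
    with tarfile.open(fileobj=io.BytesIO(tarball_bytes), mode="r:gz") as archive:
        # Assumption: package.json sits at package/package.json in the archive.
        package_json = json.load(archive.extractfile("package/package.json"))
    differing = [
        field
        for field in COMPARED_FIELDS
        # A missing field and an empty one are treated as equivalent.
        if (manifest.get(field) or None) != (package_json.get(field) or None)
    ]
    if differing:
        raise ManifestMismatch(
            "manifest disagrees with package.json on: " + ", ".join(differing)
        )


def make_tarball(package_json: dict) -> bytes:
    """Build an in-memory tarball so the example runs without a registry."""
    payload = json.dumps(package_json).encode("utf-8")
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo(name="package/package.json")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


listed = {"name": "example-pkg", "version": "1.0.0"}
shipped = dict(listed, scripts={"install": "node setup.js"})

check_before_serving(listed, make_tarball(listed))  # consistent: no exception
try:
    check_before_serving(listed, make_tarball(shipped))
except ManifestMismatch as error:
    print(f"blocked: {error}")
```

Raising an exception rather than returning a flag makes the default outcome a refusal, which suits a gate that sits in front of every install.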
BleepingComputer has contacted GitHub about the issue, and we will update this post as soon as we receive a response.