Securing Developer Tools: A New Supply Chain Attack on PHP
Sonar's R&D team discovered a new attack vector in the PHP supply chain. Read about their findings and how to prevent and patch these code vulnerabilities.
Supply chain attacks are a hot topic for development organizations today. Last year, in the largest-ever software supply chain attack against SolarWinds, "roughly 18,000 entities downloaded the malicious update," and "nine federal agencies and about 100 private sector companies were compromised," according to a press briefing by the White House. Earlier this year, a security researcher was able to breach Apple, Microsoft, PayPal, and other tech giants using a new supply chain attack technique.
The underlying design exploited by these attacks is that all modern software is built on top of other third-party software components, often without clear visibility of all the downloaded packages. And while reusing many components allows for speeding up the development process, infecting the supply chain is a very effective and subtle attack vector to compromise many organizations at once.
While supply chain attacks can take different forms, one of them is significantly more impactful: by gaining access to the servers distributing these third-party software components, threat actors can alter them to obtain a foothold in the systems of their users.
One year after our first publication about a critical vulnerability in the PHP supply chain (read more in PHP Supply Chain Attack on Composer), the Sonar R&D team uncovered a new critical vulnerability in similar components. It allowed taking control of the server distributing information about existing PHP software packages, and ultimately compromising every organization that uses them.
In this publication, I present our findings in the biggest PHP package manager, Composer, and its official package repository Packagist. I explain how the discovered code vulnerability works in theory, how it affected Packagist, and how we could demonstrate it on both a test instance and the real one. I will also look at how these code vulnerabilities can be prevented and how the maintainers patched this particular one.
Impact
The attack we demonstrate in this publication allowed us to execute arbitrary commands on the server running the official instance of Packagist. Composer uses this service to fetch the metadata associated with a given package and its dependencies. Every month, around 2 billion software dependencies are installed with Composer through Packagist, and at least 100 million of these installs require fetching metadata from it.
The security of these backend services is critical: they perform the association between the name of a package and where the package manager should download it from, so compromising them would allow attackers to force users to download backdoored software dependencies the next time they do a fresh install or an update of a Composer package based on data from 2021. Since Composer is the standard package manager for PHP, most open-source and commercial PHP projects would have been impacted.
You are already safe if you are using the default, official Packagist instance or Private Packagist. I responsibly disclosed our findings, and maintainers patched it on the public production instances within hours.
If you integrate Composer as a library and operate on untrusted repositories, upgrade at least to Composer 1.10.26, 2.2.12, or 2.3.5 to benefit from the security patches for CVE-2022-24828.
Previous Work
Now, let's dive into the technical details of this new finding to see what we can learn. As you'll see, there is a direct link with what we documented in PHP Supply Chain Attack on Composer: I will first summarize what my team and I did a year ago, show how one of our approaches leads to a dead end, and finally see how we could reuse the same exploitation technique that we introduced last year.
Discovery of CVE-2021-29472
Our previous work on CVE-2021-29472 gave us insights into interesting attack surfaces. Even though I reviewed the patches fixing CVE-2021-29472, I could have missed something, so revisiting them is worthwhile.
The vulnerability my team and I identified occurred in the implementation of VcsDriver sub-classes: one driver exists for every supported Version Control System (hence the name) like Git, Mercurial, Subversion, etc. Their role is to interact with code repositories created by these tools without re-implementing the related necessary code; instead, Composer invokes them as external commands.
Code that calls system commands is commonly prone to two major classes of vulnerabilities:
- Command Injection: attackers can inject command substitution sequences later interpreted by the shell to force the execution of additional, arbitrary commands (also see Sonar rule S2076).
- Argument Injection: attackers can add extra arguments to the invoked command in the hope of influencing its behavior in a dangerous way (also see Sonar rule S5883).
Command Injection? Argument Injection?
To better understand these concepts, let's go through a few slides from the talk we presented at BARBHACK at the end of August.
In the case of a command injection bug, where the attacker-controlled value is not escaped at all, the command within $() is first executed by the shell, and its output is used in the second command:
Suppose the attacker-controlled value is correctly enclosed by single quotes by an escaping function. In that case, the command substitution will be ignored by the shell and treated as regular characters in a string literal:
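Both cases can be reproduced with a minimal PHP sketch. It is only an illustration: it assumes a POSIX shell, uses a harmless echo as the invoked command, and relies on PHP's built-in escapeshellarg() rather than any Composer helper.

```php
<?php
// Attacker-controlled value containing a command substitution sequence.
$value = '$(echo injected)';

// Unsafe: the value is concatenated as-is, so the shell runs the
// substitution first and passes its output to echo.
$unsafe = shell_exec('echo ' . $value);

// Escaped: escapeshellarg() wraps the value in single quotes, so the
// shell treats it as a plain string literal.
$escaped = shell_exec('echo ' . escapeshellarg($value));

echo $unsafe;   // prints: injected
echo $escaped;  // prints: $(echo injected)
```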
However, the invoked command's argument parser is going to interpret this value as operands and as arguments when prefixed by one or more dashes (-h, --help):
In this example, a harmless help message will be displayed, but we discovered a specific option of the hg client that enables the execution of arbitrary commands in all cases. Again, you can find more details about the exploitation in our previous publication.
As you can see, it is impossible to protect against argument injection vulnerabilities using escaping functions. It can be surprising as we are used to neutralizing special characters by escaping or encoding them to prevent so-called injection vulnerabilities (e.g., SQL injections).
Here, developers have to use a special option called the end-of-options: as part of the POSIX specification, it is used to tell the program that parses its arguments to separate options from operands. In simpler terms, anything located at the right of the end-of-options sequence will be treated as an operand: running hg identify -- --help won't display the help message.
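To make the effect of the end-of-options marker concrete, here is a toy argument parser in PHP. It is not Mercurial's actual parser, and the function name parseArguments is made up for this illustration; it only models the convention described above.

```php
<?php
/**
 * Toy argument parser (NOT Mercurial's real one): values starting with a
 * dash are options until "--" is seen; everything after it is an operand.
 *
 * @param string[] $args
 * @return array{options: string[], operands: string[]}
 */
function parseArguments(array $args): array
{
    $options = [];
    $operands = [];
    $endOfOptions = false;

    foreach ($args as $arg) {
        if (!$endOfOptions && $arg === '--') {
            $endOfOptions = true; // end-of-options marker
            continue;
        }
        if (!$endOfOptions && strlen($arg) > 1 && $arg[0] === '-') {
            $options[] = $arg;
        } else {
            $operands[] = $arg;
        }
    }

    return ['options' => $options, 'operands' => $operands];
}

// Shell quoting is gone by the time a program sees its arguments:
// '--help' and --help both arrive as the string "--help".
$withoutMarker = parseArguments(['identify', '--help']);
$withMarker = parseArguments(['identify', '--', '--help']);

print_r($withoutMarker); // "--help" is listed under options
print_r($withMarker);    // "--help" is listed under operands
```

The parser never sees the quotes added by an escaping function, which is why only the marker, or a check on the value itself, changes how a leading dash is interpreted.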
Uncovering a New Vulnerability
The Packagist interface displays information about packages, for instance, here for the famous Symfony framework:
When a new package is imported or updated, asynchronous workers are notified. They will then pull the entire repository associated with it. One of the steps of this process is to update the main documentation page of this package.
This content originates from a file named README.md by default. This filename could conflict with other services, so the maintainers added an option to specify this file name directly in the package's manifest, as documented in https://getcomposer.org/doc/04-schema.md#readme.
To fetch the contents of this file, the name of the branch is obtained at [1], the file name at [2], and finally, getFileContent() is invoked at [3]:
packagist/src/Package/Updater.php
<?php
private function updateReadme(IOInterface $io, Package $package, VcsDriverInterface $driver): void {
    // [...]
    try {
        // [1]
        $composerInfo = $driver->getComposerInformation($driver->getRootIdentifier());
        if (isset($composerInfo['readme']) && is_string($composerInfo['readme'])) {
            // [2]
            $readmeFile = $composerInfo['readme'];
        } else {
            $readmeFile = 'README.md';
        }
        // [...]
        switch ($ext) {
            case '.txt':
                // [3]
                $source = $driver->getFileContent($readmeFile, $driver->getRootIdentifier());
                if (!empty($source)) {
                    $package->setReadme('<pre>' . htmlspecialchars($source) . '</pre>');
                }
                break;
            // [...]
The goal of getFileContent() is to allow reading files from a repository at a given branch, tag, or commit. This is the fastest way to proceed and probably safer, too: there is no risk of mistakenly following symbolic links pointing to unintended destinations or introducing command injection vulnerabilities when performing multiple shell commands.
Each VcsDriver implements its version of this method. Let's focus on GitDriver (for Git) and HgDriver (for Mercurial):
composer/src/Composer/Repository/Vcs/GitDriver.php
<?php
public function getFileContent(string $file, string $identifier): ?string {
    $resource = sprintf('%s:%s', ProcessExecutor::escape($identifier), ProcessExecutor::escape($file));
    $this->process->execute(sprintf('git show %s', $resource), $content, $this->repoDir);
    // [...]
}
composer/src/Composer/Repository/Vcs/HgDriver.php
<?php
public function getFileContent(string $file, string $identifier): ?string {
    $resource = sprintf('hg cat -r %s %s', ProcessExecutor::escape($identifier), ProcessExecutor::escape($file));
    $this->process->execute($resource, $content, $this->repoDir);
    // [...]
}
This situation is similar to our previous finding: I can inject additional arguments. Both drivers are ideal for exploitation, as the name of the branch and the file are fully controlled through the manifest file.
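The following sketch shows why escaping alone does not help in this situation. It assumes a POSIX system, uses PHP's escapeshellarg() as a stand-in for Composer's ProcessExecutor::escape(), and the readme value --some-option.md is a made-up placeholder, not a real Mercurial option.

```php
<?php
// escapeshellarg() stands in for ProcessExecutor::escape() here
// (an assumption made to keep the example self-contained).
$identifier = 'default';

// Hypothetical attacker-chosen "readme" value from composer.json.
$file = '--some-option.md';

$command = sprintf('hg cat -r %s %s', escapeshellarg($identifier), escapeshellarg($file));

// On a POSIX system this prints: hg cat -r 'default' '--some-option.md'
echo $command, PHP_EOL;
```

The quotes are consumed by the shell, so the invoked program still receives a value that starts with two dashes and may treat it as an option.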
Investigating GitDriver
As a reminder, this command will be invoked as git show '<branch>':'<file>'. I can't use the file's name to inject a new argument, so I have to figure out a way to create a Git branch with all the characters we need for our payload and take care of that mandatory suffix (:'<file>').
Among all the options supported by git show, only --output seems promising as it would allow writing the contents of all the files of the current Git repository into an arbitrary destination. In Securing Developer Tools: Git Integrations, I've already demonstrated that the security of a Git repository is very fragile when the attacker can control or modify internal files such as .git/config; this file would be a target of choice here.
The first step is to create a branch with our injected options in its name. What should be simple appears to be blocked:
$ git checkout -b --help
fatal: '--help' is not a valid branch name
I could still figure out a way to force it on the local repository, and this branch would be accepted by the Git remote:
$ echo "ref: refs/heads/--help" > .git/HEAD
$ mv .git/refs/heads/main .git/refs/heads/--help
$ git push origin -- --help
However, the mandatory suffix becomes a significant constraint. The only way to get around it would be to create a symbolic link between, for instance, foo:README.md and .git/config.
I quickly figured out that this path is a dead end: repositories are cloned as bare (notice the option --mirror in the code snippet below), which means that the directory won't expose files from the malicious package in the repository.
composer/src/Composer/Util/Git.php
<?php
public function syncMirror(string $url, string $dir): bool {
    // [...]
    $commandCallable = static function ($url) use ($dir): string {
        return sprintf('git clone --mirror -- %s %s', ProcessExecutor::escape($url), ProcessExecutor::escape($dir));
    };
    $this->runCommand($commandCallable, $url, $dir, true);
    // [...]
Back to HgDriver
Now, let's have a look at the other vulnerable VcsDriver. This time, the command is invoked as hg cat -r '<branch>' '<file>'; this is a more favorable context than in GitDriver.
As described in the section Previous Work, we can use Mercurial's --config option to override the behavior of a built-in command, e.g., cat, and make it execute an arbitrary shell script instead.
I can craft the following payload based on the information above in a very similar fashion to what my team and I did for CVE-2021-29472:
The payload may be slightly more complex than what you could have expected; let's break it down:
- Injected configuration override: this is the extra argument that declares a shell command overriding Mercurial's cat;
- Payload: the repository is cloned as bare, so we can't access files. Using an unmodified call to hg cat, we can read the repository's file named payload.sh and pipe it to a shell;
- Mandatory suffix: Packagist only processes files ending with .txt or .md; other ones are discarded.
An attacker would have to follow these steps to attempt exploiting this vulnerability against Packagist:
- Create a project in a remote Mercurial repository;
- Put the manifest in composer.json and add a malicious readme entry;
- When using a payload like the one depicted above, create a file named payload.sh to perform the desired actions;
- Import the package on Packagist, and request an update of the package.
I performed these steps on a test instance I set up and could demonstrate the execution of arbitrary commands on the server:
The next step would be to modify the definition of a package to point to an unintended destination and compromise the applications in which it is used; this is something that my team and I have already demonstrated in our Insomni'hack talk and won't be presented again in this article.
The exploitability of this vulnerability on the production instance, packagist.org, was also demonstrated with a non-destructive command. I immediately reached out to the maintainers with all the technical details of our attempt, IP address, etc. It should be noted that maintainers did not identify any prior exploitation of this vulnerability.
Patch
CVE-2022-24828
It is not possible to patch the injection in GitDriver with the POSIX end-of-options switch: Git commands use -- to separate revisions from paths, so the '<branch>':'<file>' argument cannot be placed after it. Git introduced a non-standard flag, --end-of-options, but it's only supported starting from Git 2.24, so relying on it may break Composer for some users.
As a result, the maintainers merged 2c40c53, containing a patch for both vulnerable VcsDriver classes. First, GitDriver is patched by forbidding any branch whose name starts with a dash:
  public function getFileContent($file, $identifier)
  {
+     if (isset($identifier[0]) && $identifier[0] === '-') {
+         throw new \RuntimeException('Invalid git identifier detected. Identifier must not start with a -, given: ' . $identifier);
+     }
+
      $resource = sprintf('%s:%s', ProcessExecutor::escape($identifier), ProcessExecutor::escape($file));
      $this->process->execute(sprintf('git show %s', $resource), $content, $this->repoDir);
In a similar fashion, HgDriver now forbids leading dashes in the branch name and introduces the end-of-options switch to protect against argument injections through the file name:
  public function getFileContent($file, $identifier) {
-     $resource = sprintf('hg cat -r %s %s', ProcessExecutor::escape($identifier), ProcessExecutor::escape($file));
+     if (isset($identifier[0]) && $identifier[0] === '-') {
+         throw new \RuntimeException('Invalid hg identifier detected. Identifier must not start with a -, given: ' . $identifier);
+     }
+
+     $resource = sprintf('hg cat -r %s -- %s', ProcessExecutor::escape($identifier), ProcessExecutor::escape($file));
      $this->process->execute($resource, $content, $this->repoDir);
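The patched pattern can be tried in isolation with the sketch below. It mirrors the two checks from the diff, but it is not Composer's code: the function name buildHgCatCommand is hypothetical, escapeshellarg() stands in for ProcessExecutor::escape(), the sample values are placeholders, and a POSIX system is assumed.

```php
<?php
/**
 * Builds an "hg cat" command line following the patched pattern.
 * escapeshellarg() stands in for ProcessExecutor::escape() (assumption).
 */
function buildHgCatCommand(string $file, string $identifier): string
{
    // Reject identifiers that the argument parser would read as options.
    if (isset($identifier[0]) && $identifier[0] === '-') {
        throw new \RuntimeException('Invalid hg identifier detected. Identifier must not start with a -, given: ' . $identifier);
    }

    // "--" marks the end of options: the file name is always an operand.
    return sprintf('hg cat -r %s -- %s', escapeshellarg($identifier), escapeshellarg($file));
}

// A file name with leading dashes now lands after the marker.
echo buildHgCatCommand('--some-option.md', 'default'), PHP_EOL;

// An identifier with a leading dash is refused before any command is built.
try {
    buildHgCatCommand('README.md', '--some-option');
} catch (\RuntimeException $e) {
    echo 'Rejected: ', $e->getMessage(), PHP_EOL;
}
```

Validating the identifier and adding the marker are complementary: the first protects the value passed to -r, the second protects the operand that follows.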
Further Hardening
Composer is slightly different from other package managers because it uses Packagist only to fetch metadata about a given package and downloads the dependency later from another source. Packagist does not host the packages, so it becomes slightly harder to integrate and enforce tools like sigstore.
Timeline
Date | Action |
---|---|
2022-04-07 | We report the vulnerability to the Packagist maintainers. |
2022-04-07 | Vendor acknowledges the issues and starts working on a patch. |
2022-04-08 | The public instance at packagist.org is hot-patched. |
2022-04-13 | CVE assigned, official communication by Packagist on their blog and new Composer releases. No indicator of previous exploitation of CVE-2022-24828 has been detected. |
Summary
I demonstrated how I discovered an argument injection in the backend services of the PHP package manager Composer and could successfully exploit it to compromise any PHP software dependency.
This is a perfect example of a bug that looks simple in retrospect yet was missed by both the maintainers and vulnerability researchers, even though both likely spent a few hours on this code before merging the security patch for CVE-2021-29472. Coming back to old bugs with a clear mind is a powerful tool that shouldn't be underestimated.
Published at DZone with permission of Thomas Chauchefoin. See the original article here.