What is the Polyfill incident trying to tell us, if we bother to listen


If you work in the security industry, you have likely heard about the polyfill.io incident that came to public light a couple of weeks ago. We don’t know exactly how many websites were affected, but estimates span a large window, from 110,000 to several million, according to a tweet by Cloudflare’s CEO, Matthew Prince.

Affected users were being redirected to a sports betting website, and what’s clear is that it could have been a lot worse. When an attacker gets their code running on a website without restriction, they can mass-collect any sensitive data the website normally handles: Personally Identifiable Information (PII), payment data, Protected Health Information (PHI), and so on. But that’s not all. Attackers can tamper with the page’s DOM, changing what the user sees, prompting the user for more information, or redirecting them to phishing websites where their credentials and money can be stolen.

Now, multiply that by the number of users across hundreds of thousands or millions of websites. Luckily, in this latest instance, Namecheap took the domain down after public pressure demanded it, and everyone is now safe from this particular incident (yes, you can still visit that sports betting website if you really want to!). As I mentioned earlier, it could have been far more damaging: the time it took to uncover the attack would have given a skimmer a big enough window to harvest thousands or even millions of credit cards, or to leak personal data from millions of people. Any of that would have repaid the attacker’s investment many times over (the word on the street is that the domain and the GitHub repo changed hands for $1M).

Despite the mild actual consequences, the potential scope of the attack was bad. Not because it was new, but because it wasn’t. Almost every year a major Node.js package is compromised with malware. In 2021, attackers hijacked the developer’s account and injected malicious code into the ua-parser-js npm package, including a crypto-miner and password-stealing code. In 2018, the widely used npm package event-stream was compromised to empty specific Bitcoin wallets of applications using that dependency. It was bad because, once more, it showed how vulnerable the web is right now to supply chain attacks.

The polyfill.io incident, like the XZ Utils backdoor or defunct-domain-based attacks, shows a trend: attackers are deliberately seeking control of third-party components to get malware running on thousands of apps without much hassle. They do this by, respectively, offering to buy these projects, tricking current maintainers into handing over control, or exploiting the lack of proper code hygiene that plagues the web today.

At first glance, we could argue that one way to address the problem would be to dramatically improve visibility into who owns these projects and to warn on ownership changes. Or we could simplify further and do that just for the corresponding domain names, assuming whoever owns the domain owns the project. This could begin with banning anonymized WHOIS records on these domains and end with proper validation of the domain owner’s identity. But, at the end of the day, we would still be vulnerable, because we don’t know the agenda of the new owners, and, as many security people know, security based only on reputation is bound to fail (as the XZ incident recently made evident). Nor can the polyfill.io problem be solved by asking people not to sell components used by thousands of websites, because more of that will happen.

The problem needs to be solved individually by every website developer. Developers must choose a sound security architecture, carefully pick which software components they are willing to be exposed to, and make sure that any impact coming from those components is attenuated with proper security isolation.
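One concrete, if partial, example of that careful picking: when loading a third-party script from a CDN, Subresource Integrity (SRI) pins it to a known hash, so a silently swapped payload is refused by the browser. A minimal sketch, with a placeholder URL and digest (note that SRI only works for byte-stable resources, which dynamically generated bundles like polyfill.io’s are not):

```html
<!-- Placeholder URL and hash: substitute the real file and its digest. -->
<script
  src="https://cdn.example.com/lib.min.js"
  integrity="sha384-REPLACE_WITH_THE_REAL_BASE64_DIGEST"
  crossorigin="anonymous"></script>
```

If the file served by the CDN no longer matches the digest, the browser refuses to execute it, turning a silent compromise into a visible breakage.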

We have talked about this in the past, first revisiting the last 20 years of web isolation features. From there, we concluded that the web does offer native isolation mechanisms, but knowing how to use them properly is not always simple. In fact, more often than not, people relax these mechanisms to make things work, sacrificing tighter security. We then addressed the elephant in the room by telling people what they already know: patching alone isn’t working and should be complemented with good security architecture, attack surface reduction, and proper isolation mechanisms.

So, if we have the tools and knowledge to do it, why do we continue to fail? Because, despite how many times we have said this, security is still an afterthought, and the only way to get it right is to plan for it from the beginning. Web applications became the standard way to offer software to people, not just because they are portable, but mainly because they are the quickest and cheapest to build. And the main reason behind this is that, intentionally or not, their nature is to be highly composable. Whoever came up with that model was almost certainly not weighing its security trade-offs.

We don’t believe people made a conscious decision to trade higher exposure to supply chain attacks for higher composability. It is just how things happened, and now that exposure is knocking on our door, the bill is due. So where does that leave us? Rather than crossing our fingers and hoping we aren’t using the next polyfill.io when it hits, the community needs to start figuring out how future web applications should be secured.

The good news is that the community has started taking action, and here is what we determined: unless you decide to stop using third-party software that you don’t control, your best option is to stop trusting your dependency stack and start properly isolating each component.

Today’s browsers already offer a number of mechanisms, like the Same-Origin Policy (leveraged by cross-origin iframes) or the Content Security Policy (CSP). These are great, but if you try to build a defense based solely on what the browser provides, it will soon feel like you are working on a patchwork quilt. Web security standards grew in a semi-chaotic way, with small, incremental improvements being picked over complete redesigns of security and isolation mechanisms. That makes sense, since the web is a living standard and keeping things working is critical, even when keeping things working equates to poking holes into browser isolation mechanisms and making them bypassable.
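To make those two mechanisms concrete, here is a minimal sketch with placeholder origins. The CSP restricts where scripts may load from, and the sandboxed cross-origin iframe keeps a third-party widget in an opaque origin (in practice, CSP is best delivered as an HTTP response header rather than a meta tag):

```html
<!-- Hypothetical policy: only run scripts from our own origin
     and one vetted CDN. -->
<meta http-equiv="Content-Security-Policy"
      content="default-src 'self'; script-src 'self' https://cdn.example.com">

<!-- Third-party widget isolated in a sandboxed iframe; without
     allow-same-origin it runs in an opaque origin and cannot read
     our cookies or touch our DOM. -->
<iframe src="https://widget.example.com/embed" sandbox="allow-scripts"></iframe>
```

Even this small pairing shows the patchwork feeling: each mechanism covers one slice of the problem, and the gaps between them are left to the developer.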

So, people came up with other mechanisms that provide stronger, more resilient defenses against third-party components. These solutions split into those that require you to adapt your code and those that are transparent and require no changes. The former, like approaches based on Secure ECMAScript (SES), follow the object-capability (ocap) principle and can be quite secure, but the integration work and the not-so-easy debugging can be intimidating. The latter were developed with visibility and transparency as main design principles, making adoption friendlier.

Both approaches offer ways to sandbox scripts, specifying in fine-grained detail which capabilities (or code behaviors) each dependency should be given access to, unlike browser-native features that lack this granularity. Ultimately, this is what protects applications from the issues that will inevitably arise with some of the components they use: it prevents a compromised component from touching sensitive data or from compromising the integrity of other parts of the application.
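A minimal sketch of the ocap idea in plain JavaScript (the names and the "dependency" here are hypothetical, not any particular library’s API): instead of letting third-party code reach ambient power like the DOM, fetch, or cookies, the application hands it a narrow, frozen capability object, so the dependency can only do what it was explicitly given.

```javascript
"use strict";

// Build a narrow capability: the dependency can append log lines and
// read them back, and nothing else. Freezing prevents it from adding
// or replacing methods on the object.
function makeLoggerCapability() {
  const lines = [];
  return Object.freeze({
    log(msg) { lines.push(String(msg)); },
    dump() { return lines.slice(); }, // returns a read-only copy
  });
}

// Stand-in for an untrusted third-party dependency: it only ever sees
// the arguments we pass, not the rest of the application.
function untrustedFormatter(text, logger) {
  logger.log("formatting input");
  return text.trim().toUpperCase();
}

const logger = makeLoggerCapability();
const result = untrustedFormatter("  hello  ", logger);
console.log(result); // HELLO
```

SES-based solutions generalize this pattern: they lock down the JavaScript primordials and evaluate each dependency inside a compartment that only receives explicitly endowed capabilities, so a compromised component has nothing dangerous within reach.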
