A flaw in the R programming language could allow code execution
May 01, 2024
A flaw in the R programming language enables the execution of arbitrary code when parsing specially crafted RDS and RDX files.
A vulnerability, tracked as CVE-2024-27322 (CVSS v3: 8.8), in the R programming language could allow arbitrary code execution upon deserializing specially crafted R Data Serialization (RDS) or R package files (RDX).
R is an open-source programming language widely used for statistical computing and graphics. It was initially developed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, in the early 1990s. Since then, it has gained popularity among statisticians and data miners for its powerful features and extensive libraries for data manipulation, visualization, and statistical analysis.
The R programming language has also become increasingly popular in the AI/ML field because it can handle large datasets.
The vulnerability was reported by researchers at HiddenLayer, who pointed out that the attack vector is very effective because RDS files and R packages are often shared among developers and data scientists.
“Our team discovered that it is possible to craft a malicious RDS file that will execute arbitrary code when loaded and referenced. This vulnerability, assigned CVE-2024-27322, involves the use of promise objects and lazy evaluation in R,” reads the analysis published by HiddenLayer.
The R programming language has its own serialization format, which is used to serialize objects with ‘saveRDS’ and deserialize them with ‘readRDS’. The same format is also used when saving and loading R packages.
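As a minimal sketch of that round trip, assuming a throwaway temporary file (the `scores` object and the file path are illustrative and do not come from the report):

```r
# Illustrative object; any R object can be serialized this way.
scores <- list(name = "example", values = c(1.5, 2.5, 3.5))

# Hypothetical location: a temporary file that is removed at the end.
path <- tempfile(fileext = ".rds")

saveRDS(scores, path)      # serialize the object to disk
restored <- readRDS(path)  # deserialize it back into a new object

print(identical(scores, restored))
unlink(path)
```

The `readRDS` call is the entry point the researchers describe, which is why the origin of an RDS file matters.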
The vulnerability is tied to how R handles serialization (‘saveRDS’) and deserialization (‘readRDS’), and in particular to its use of promise objects and lazy evaluation.
“Lazy evaluation is a strategy that allows for symbols to be evaluated only when needed, i.e., when they are accessed,” continues the analysis. “The above is achieved by creating a promise object that has both a symbol and an expression attached to it. Once the symbol ‘y’ is accessed, the expression assigning the value of ‘x’ to ‘y’ is run. The key here is that ‘y’ is not assigned the value 1 because ‘y’ is not assigned to ‘x’ until it is accessed. While we were not successful in gaining code execution within the deserialization code itself, we thought that since we could create all of the needed objects, it might be possible to create a promise that would be evaluated once someone tried to use whatever had been deserialized.”
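The example that the quoted passage refers to is not reproduced here. A minimal sketch of the same idea, assuming base R's `delayedAssign` as the way to create the promise (the variable names follow the quote; the printed message is illustrative):

```r
x <- 1

# Bind 'y' to a promise: the expression is stored, not evaluated.
delayedAssign("y", {
  cat("promise evaluated\n")  # side effect runs only on first access
  x
})

x <- 2    # changed before 'y' is ever touched

print(y)  # first access forces the promise, so 'y' picks up the current 'x'
print(y)  # the value is now cached; the side effect does not run again
```

Any expression can sit where the `cat` call is in this sketch, which is the property the attack described by the researchers relies on.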
Attackers can place promise objects containing arbitrary code in the metadata of an RDS file in the form of expressions. The embedded code is not run by the deserialization step itself; it is executed when the deserialized object is subsequently accessed.
Possible attack scenarios include threat actors tricking victims into loading malicious files, or distributing a malware-laced package through widely used repositories and waiting for victims to download it.
“Given the widespread usage of R and the readRDS function, the implications of this are far-reaching. Having followed our responsible disclosure process, we have worked closely with the team at R who have worked quickly to patch this vulnerability within the most recent release – R v4.4.0. In addition, HiddenLayer’s AISec Platform will provide additional protection from this vulnerability in its Q2 product release,” concludes the report.
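Since the report names R v4.4.0 as the patched release, a quick way to see where a given installation stands is to compare its version against that number. This is a sketch only; the variable names and messages are illustrative:

```r
# Release reported as containing the fix, per the HiddenLayer statement.
patched <- "4.4.0"

current <- getRversion()  # version of the running R interpreter

if (current >= patched) {
  message("R ", as.character(current), " is at or above ", patched, ".")
} else {
  message("R ", as.character(current), " predates ", patched, "; consider upgrading.")
}
```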
Pierluigi Paganini