Analysis of Google Keep WebAssembly module

Last month, I was at REcon Montreal to give my training on WebAssembly security, and after some discussion people always asked me the same question: is WebAssembly already used in the wild?

The answer is of course YES, and some WebAssembly modules are potentially running in your browser right now if you use Google web services. Google has used WebAssembly for the beta version of Google Earth, and also in production for services like Google Keep.

In this blog post, we are going to partially reverse the WebAssembly module loaded by Google Keep, determine its purpose, and extract as much information as possible for a future complete analysis. Let’s go!


1. Google Keep Wasm Module & JS File Extraction

Usually, to run a WebAssembly module in your web page, you fetch a wasm file and instantiate the module using the dedicated JavaScript API. Once that is done, you can call the module’s exported functions directly from JavaScript.

In the Google Keep web app, the WebAssembly module “ink.wasm” is fetched (image below – left) and instantiated by the minified JavaScript file ink-loader.js (image below – right).

Based on the JS function names, this JavaScript file seems to have been generated automatically by emscripten.


2. WebAssembly Module Reversing

One of the first steps when reversing a WebAssembly module is to convert the wasm binary (.wasm) to its text format representation (.wat/.wast). wasm2wat is the perfect tool for this job.

wasm2wat ink.wasm -o ink.wat

The output file (ink.wat) is a text file of around 1.5 million lines.

Based on the minified imported and exported function names (image – right), we can confirm that the module was compiled by emscripten with the optimization flag -O3.
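The export names can also be listed straight from the binary, without going through the 1.5 million lines of text. The following is a minimal Python sketch, assuming the standard WebAssembly binary layout (8-byte header, then sections, with id 7 for the Export section). The helper names `read_uleb`, `iter_sections` and `export_names` are made up for this illustration and do not come from any tool mentioned in this post.

```python
EXPORT_KINDS = {0: "func", 1: "table", 2: "memory", 3: "global"}


def read_uleb(buf, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def iter_sections(wasm):
    """Yield (section_id, payload) for every section of a wasm binary."""
    if wasm[:4] != b"\x00asm":
        raise ValueError("not a WebAssembly binary")
    pos = 8  # skip the magic number and the 4-byte version
    while pos < len(wasm):
        sec_id = wasm[pos]
        size, pos = read_uleb(wasm, pos + 1)
        yield sec_id, wasm[pos:pos + size]
        pos += size


def export_names(wasm):
    """Yield (name, kind, index) for every entry of the Export section."""
    for sec_id, payload in iter_sections(wasm):
        if sec_id != 7:  # 7 = Export section
            continue
        count, pos = read_uleb(payload, 0)
        for _ in range(count):
            size, pos = read_uleb(payload, pos)
            name = payload[pos:pos + size].decode("utf-8")
            pos += size
            kind = payload[pos]
            index, pos = read_uleb(payload, pos + 1)
            yield name, EXPORT_KINDS.get(kind, "unknown"), index
```

Reading ink.wasm in binary mode and passing its bytes to `export_names` should then produce one tuple per exported item, which makes very short (minified) names easy to spot.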


3. Extract Build Information

This module contains a Data section, and the content of this section is used to initialize the linear memory, i.e. an ArrayBuffer shared between the module and the loader script (ink-loader.js).
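To make this more concrete, here is a minimal Python sketch that pulls the data segments out of a wasm binary, together with the linear memory offset each one is copied to. It assumes the standard binary layout (Data section id 11) and only handles active segments whose offset is a plain `i32.const` expression, which is an assumption about this module rather than something verified here. All function names are hypothetical.

```python
def read_uleb(buf, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_sleb(buf, pos):
    """Decode a signed LEB128 integer; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:  # sign bit of the last group
                result -= 1 << shift
            return result, pos


def iter_sections(wasm):
    """Yield (section_id, payload) for every section of a wasm binary."""
    if wasm[:4] != b"\x00asm":
        raise ValueError("not a WebAssembly binary")
    pos = 8  # skip the magic number and the 4-byte version
    while pos < len(wasm):
        sec_id = wasm[pos]
        size, pos = read_uleb(wasm, pos + 1)
        yield sec_id, wasm[pos:pos + size]
        pos += size


def data_segments(wasm):
    """Yield (memory_offset, bytes) for each active data segment."""
    for sec_id, payload in iter_sections(wasm):
        if sec_id != 11:  # 11 = Data section
            continue
        count, pos = read_uleb(payload, 0)
        for _ in range(count):
            flag, pos = read_uleb(payload, pos)
            # Assumption: memory 0, offset given as "i32.const N; end".
            if flag != 0 or payload[pos] != 0x41:
                raise ValueError("unsupported data segment encoding")
            offset, pos = read_sleb(payload, pos + 1)
            if payload[pos] != 0x0B:  # 0x0B = end of the offset expression
                raise ValueError("unsupported offset expression")
            size, pos = read_uleb(payload, pos + 1)
            yield offset, payload[pos:pos + size]
            pos += size
```

Each yielded pair says "these bytes are copied to this address of the linear memory at instantiation", which is also the address the module code will later use to read the string.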


At the beginning of the module’s Data section, we get a lot of details about how the module was built:

  • WebAssembly Toolchain: emscripten-wasm
  • Building date: Jun 25 2019 06:29:13
  • Google prod server: votl8.prod.google.com
  • Project path: third_party/sketchology/public/js/wasm
  • Google building software (Bazel): Blaze, release blaze-2019.06.17-2

At this point, I first tried to retrieve the source code of the WebAssembly module by searching for the project path on the web. I found the repository of Chromium 66 (66.0.3359.158, about 1 year old), but without the C/C++ source code inside. On the master branch, there is no longer any reference to sketchology, but we get information about what Ink is. Finally, the GitHub repository (https://github.com/google/ink) returns a 404 error.

4. What is Sketchology and Ink?

After some research, it turns out that Sketchology refers to an iOS application called “Sketchology Review”. This application isn’t available anymore, the Twitter account is inactive, and the official website (sketchologyapp.com) is down, but you can find a copy using the Wayback Machine. On the creator’s LinkedIn profile, we can read that Sketchology is “the first vector drawing app with realtime natural media brush effects like blur or watercolor.”

On the other hand, “Ink is a software library enabling Google applications to let their users express themselves using freehand drawing and handwriting”. This library is also used in Google Canvas, released at the end of 2018 (source).

So, it seems that Ink is the evolution/successor of Sketchology, and Google Keep uses the ink.wasm module when the user wants to draw a note (image on top).

To verify this hypothesis, you can debug the WebAssembly module and set breakpoints using the developer console. In the image below, my breakpoint was triggered when I tried to create a new drawing note.


5. Reversing Protobuf Encoded Blobs

Still inside the module’s Data section, you will find multiple chunks of Google Protobuf encoded blobs (image on the top).

“Protocol buffers are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler.” (source)

These chunks of bytes can be reversed/deserialized using tools such as protobuf-inspector (image at the bottom). The source of the more generic protocol buffer definitions (like descriptor.proto) can be found directly in the GitHub repository of the protobuf project.
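To give an idea of what such a tool does without the .proto definitions, here is a minimal Python sketch of a schema-less decoder for the protobuf wire format. It is not the implementation of protobuf-inspector; the function names are hypothetical, and it only walks one level of fields (a length-delimited value may itself be a nested message, a string, or raw bytes, which cannot be told apart without the schema).

```python
def read_varint(buf, pos):
    """Decode a protobuf base-128 varint; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def iter_fields(buf):
    """Yield (field_number, wire_type, raw_value) for one message level."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field, wire = key >> 3, key & 0x07
        if wire == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire == 1:  # fixed 64-bit, kept as raw little-endian bytes
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire == 2:  # length-delimited: string, bytes or sub-message
            size, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + size], pos + size
        elif wire == 5:  # fixed 32-bit, kept as raw little-endian bytes
            value, pos = buf[pos:pos + 4], pos + 4
        else:            # deprecated group wire types are not handled
            raise ValueError(f"unsupported wire type {wire}")
        yield field, wire, value
```

For example, the bytes `08 96 01` decode to field 1 holding the varint 150, and `12 07` followed by the ASCII bytes of "testing" decode to field 2 holding a 7-byte length-delimited value.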

This kind of information is particularly useful if you are doing pentesting or vulnerability research on the server-side web API.


6. Extract WebGL Vertex Shader Structure

Another part of the Data section contains complete pieces of code (bottom image) with variables and main functions. This code is a WebGL “Vertex shader structure”, and it will be loaded at runtime by the WebGL shader-building functions.
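Locating these shader sources in a dump of the Data section can be automated with a simple heuristic. The Python sketch below assumes that each shader is stored as a NUL-terminated ASCII string containing a `void main` entry point; both points are assumptions for illustration, and `find_shader_sources` is a hypothetical helper name.

```python
def find_shader_sources(data):
    """Return the NUL-separated ASCII chunks that look like GLSL source."""
    shaders = []
    for chunk in data.split(b"\x00"):
        # Heuristic: GLSL shaders define a main() entry point.
        if b"void main" not in chunk:
            continue
        try:
            shaders.append(chunk.decode("ascii"))
        except UnicodeDecodeError:
            # The chunk mixes binary data with text: skip it in this sketch.
            continue
    return shaders
```

Chunks that contain `void main` but also non-ASCII bytes are skipped, so a real dump may need a more tolerant filter.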


7. Absolute path, Error messages, Mangling & Constant names

Finally, we reach the last part of the module’s Data section, which is for me the most interesting one. Inside, you will find more than 5 thousand strings like:

  • Absolute project file paths (“third_party/sketchology/engine/public/sengine.cc”)
  • Error messages (“Could not add image data, no URI specified.”)
  • Mangled names (“N3ink26ElementAnimationControllerE”)
  • Constant names (“GL_GEOMETRY_SHADER”)

Just with those strings, we can reconstruct the project tree (image on the left) and associate the corresponding error messages, mangled names, and constants with each file.
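The tree reconstruction can be sketched in a few lines of Python: extract the printable strings, keep those that look like source file paths, and fold them into a nested dictionary. The minimum string length of 4 and the `.cc`/`.h` extension filter are assumptions chosen for this illustration, and the function names are hypothetical.

```python
import re

# Printable ASCII runs of at least 4 characters, like the `strings` tool.
STRING_RE = re.compile(rb"[\x20-\x7e]{4,}")
# Assumption: source files are referenced as dir/.../name.cc or name.h.
PATH_RE = re.compile(r"^(?:[\w.-]+/)+[\w.-]+\.(?:cc|h)$")


def extract_strings(data):
    """Return every printable ASCII string found in the raw bytes."""
    return [m.group().decode("ascii") for m in STRING_RE.finditer(data)]


def build_tree(paths):
    """Fold slash-separated paths into a nested dict (leaves are empty)."""
    tree = {}
    for path in paths:
        node = tree
        for part in path.split("/"):
            node = node.setdefault(part, {})
    return tree


def project_tree(data):
    """Rebuild the source tree from the path-like strings in the data."""
    paths = sorted({s for s in extract_strings(data) if PATH_RE.match(s)})
    return build_tree(paths)
```

Associating error messages with files is harder to automate. One possible approach, not verified on this module, is to rely on the position of each string in the Data section, under the assumption that strings from the same source file tend to sit close together.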

If you want to completely reverse this module, you will first need to match the previous information (WebGL, debug strings, …) with memory accesses/offsets (image on the top).

Then, you can determine the function prototypes (mangled names + arguments) and associate each WebAssembly function with a C++ source file. Finally, you can try to decompile your newly labeled module into C code using a tool like wasm2c.
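Mangled names such as “N3ink26ElementAnimationControllerE” follow the Itanium C++ ABI encoding, where a nested name is written as `N`, then a series of length-prefixed identifiers, then `E`. The Python sketch below decodes only this simple form. Templates, substitutions, and qualifiers are deliberately not handled, and `demangle_nested` is a hypothetical helper, not a replacement for a real demangler.

```python
def demangle_nested(name):
    """Decode a plain Itanium nested name: N<len><id>...E -> a::b::c."""
    if not (name.startswith("N") and name.endswith("E")):
        raise ValueError("not a simple nested name")
    pos, end = 1, len(name) - 1  # skip the leading N and the trailing E
    parts = []
    while pos < end:
        start = pos
        while pos < end and name[pos].isdigit():
            pos += 1
        if pos == start:
            # Anything that is not <length><identifier> is out of scope.
            raise ValueError("unsupported construct at offset %d" % pos)
        length = int(name[start:pos])
        if pos + length > end:
            raise ValueError("identifier length exceeds the name")
        parts.append(name[pos:pos + length])
        pos += length
    return "::".join(parts)


print(demangle_nested("N3ink26ElementAnimationControllerE"))
# prints: ink::ElementAnimationController
```

The namespace prefix recovered this way (`ink::`) is what lets a class name be tied back to a source file of the reconstructed project tree.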


8. Conclusion

In this blog post, we have:

  • Extracted a WebAssembly module and the related JS file.
  • Converted the module to its text format representation.
  • Found build information.
  • Determined the origin and the purpose of the module.
  • Reversed Google Protobuf encoded blobs.
  • Extracted WebGL shader source code.
  • Reconstructed the project tree.
  • Found debug strings that would help to completely reverse the module (with more time).

All the files (wasm, js) and extracted information are available in this GitHub repository.

If you want to learn about WebAssembly security from module reversing to WebAssembly VM vulnerability research, you should consider taking one of our trainings. We also offer on-site trainings for companies, starting at just 5 participants.

Patrick Ventuzelo / @Pat_Ventuzelo
