Analysis of Google Keep WebAssembly module
The answer is of course YES, and some WebAssembly modules are potentially running in your browser right now if you use Google web services. Google recently used WebAssembly for the beta version of Google Earth, and it also uses it in production for services like Google Keep.
In this blog post, we are going to partially reverse the WebAssembly module loaded by Google Keep, determine its purpose, and extract as much information as possible for a future complete analysis. Let’s go!
1. Google Keep Wasm Module & JS File Extraction
2. WebAssembly Module Reversing
One of the first steps when reversing a WebAssembly module is to convert the wasm binary (.wasm) to its text format representation (.wat/.wast). wasm2wat is the perfect tool for this job.
wasm2wat ink.wasm -o ink.wat
The output file (ink.wat) is a text file with around 1.5 million lines.
Based on the minified names of the imported and exported functions (image, right), we can conclude that the module was compiled by emscripten with an optimization flag (-O3).
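A quick way to check this observation is to measure how short the import and export names are in the text output. The sketch below is a rough heuristic, not part of the original analysis: the regular expressions assume the (import "module" "field" ...) and (export "name" ...) forms printed by wasm2wat, and the helper names and the two-character threshold are arbitrary choices made for illustration.

```python
import re

# Assumed shapes of the import/export lines printed by wasm2wat.
IMPORT_RE = re.compile(r'\(import\s+"([^"]*)"\s+"([^"]*)"')
EXPORT_RE = re.compile(r'\(export\s+"([^"]*)"')


def import_export_names(wat_text):
    """Return (import field names, export names) found in .wat text."""
    imports = [field for _module, field in IMPORT_RE.findall(wat_text)]
    exports = EXPORT_RE.findall(wat_text)
    return imports, exports


def minified_ratio(names, max_len=2):
    """Fraction of names that are at most max_len characters long."""
    if not names:
        return 0.0
    return sum(len(name) <= max_len for name in names) / len(names)
```

Reading ink.wat into a string and passing it to import_export_names gives the two lists. A ratio close to 1.0 for both suggests that the names were minified by the toolchain.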
3. Extract Build Information
This module contains a Data section, and the content of this section is used to initialize the linear memory, i.e. an ArrayBuffer shared between the module and the loader script (ink-loader.js).
At the beginning of the module's Data section, we find a lot of details about how the module was built:
- WebAssembly Toolchain: emscripten-wasm
- Build date: Jun 25 2019 06:29:13
- Google prod server: votl8.prod.google.com
- Project path: third_party/sketchology/public/js/wasm
- Google build software (Bazel): Blaze, release blaze-2019.06.17-2
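Strings like the ones above can be pulled out of the binary without any external tool. The following is a minimal sketch, assuming a well-formed module: it walks the top-level sections of the wasm binary (one id byte, then an unsigned LEB128 size, then the payload), picks the Data section (id 11 in the WebAssembly binary format) and lists its printable ASCII runs. The function names and the minimum string length are hypothetical choices, not something taken from the original analysis.

```python
import re

DATA_SECTION_ID = 11  # id of the Data section in the wasm binary format


def read_uleb128(buf, pos):
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def iter_sections(wasm):
    """Yield (section id, payload bytes) for each top-level section."""
    if wasm[:4] != b"\0asm":
        raise ValueError("not a WebAssembly binary")
    pos = 8  # skip the 4-byte magic and the 4-byte version
    while pos < len(wasm):
        section_id = wasm[pos]
        size, pos = read_uleb128(wasm, pos + 1)
        yield section_id, wasm[pos:pos + size]
        pos += size


def printable_strings(blob, min_len=6):
    """Return runs of printable ASCII, similar to the strings utility."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, blob)]


def data_section_strings(wasm, min_len=6):
    """Return the printable strings of the Data section, if there is one."""
    for section_id, payload in iter_sections(wasm):
        if section_id == DATA_SECTION_ID:
            return printable_strings(payload, min_len)
    return []
```

Calling data_section_strings on the bytes of ink.wasm and looking at the first few results should surface the build details, since they sit at the beginning of the section.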
At this point, I first tried to retrieve the source code of the WebAssembly module by searching for the project path on the web. I found the repository of Chromium 66 (66.0.3359.158, about one year old), but without any C/C++ source code inside. On the master branch, there is no longer any reference to sketchology, but we do get information about what Ink is. Finally, the GitHub repository (https://github.com/google/ink) returns a 404 error.
4. What is Sketchology and Ink?
After some research, I found that Sketchology refers to an iOS application called “Sketchology Review”. This application isn’t available anymore, the Twitter account is inactive, and the official website (sketchologyapp.com) is down, but you can find a copy using the Wayback Machine. On the creator’s LinkedIn profile, we can read that “Sketchology is the first vector drawing app with realtime natural media brush effects like blur or watercolor.”
On the other hand, “Ink is a software library enabling Google applications to let their users express themselves using freehand drawing and handwriting”. This library is also used in Google Canvas, released at the end of 2018 (source).
So, it seems that Ink is the evolution/successor of Sketchology, and Google Keep uses this ink.wasm module when the user wants to draw a note (image on top).
To verify this hypothesis, you can debug the WebAssembly module and set breakpoints using the developer console. In the image below, my breakpoint was triggered when I tried to create a new drawing note.
5. Reversing Protobuf Encoded Blobs
Still inside the module's Data section, you will find multiple chunks of Google Protobuf encoded blobs (image on the top).
“Protocol buffers are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler.” – source
Those chunks of bytes can be reversed/deserialized using tools such as protobuf-inspector (image at the bottom). The source code of the more generic protocol buffer files (like descriptor.proto) can be found directly in the GitHub repository of the protobuf project.
This kind of information is particularly useful if you are doing pentesting/vulnerability research on the server-side web API.
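To give an idea of what such a tool does when no .proto schema is available, here is a minimal sketch of a protobuf wire-format decoder. It is not the protobuf-inspector implementation: it only splits a single message into (field number, wire type, raw value) tuples, does not recurse into nested messages, and rejects the deprecated group wire types. The function names are hypothetical.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_fields(buf):
    """Split one serialized message into (field number, wire type, value)."""
    fields, pos = [], 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:        # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:      # fixed 64-bit, kept as raw bytes
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:      # length-delimited: bytes, string, submessage
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:      # fixed 32-bit, kept as raw bytes
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if pos > len(buf):
            raise ValueError("truncated message")
        fields.append((field_number, wire_type, value))
    return fields
```

For example, decode_fields(b"\x08\x96\x01") yields a single varint field with number 1 and value 150. A length-delimited value that itself decodes cleanly is a good candidate for a nested message or a string.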
6. Extract WebGL Vertex Shader Structure
Another part of the Data section contains complete pieces of code (bottom image) with variables and main functions. This code is a WebGL “Vertex shader structure”, and it is loaded at runtime by the WebGL shader-building functions.
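The shader sources can be located automatically with a simple heuristic. The sketch below assumes that each shader is stored as a NUL-terminated ASCII string containing a "void main" entry point, and it guesses the shader kind from the GLSL built-in that is written to (gl_Position for vertex shaders, gl_FragColor for fragment shaders). Both the assumptions and the function name are mine, so treat the result as a starting point rather than an exact extraction.

```python
def find_shader_sources(blob):
    """Return (kind, source) pairs for NUL-separated chunks that look like GLSL."""
    shaders = []
    for chunk in blob.split(b"\0"):
        if b"void main" not in chunk:
            continue
        try:
            text = chunk.decode("ascii")
        except UnicodeDecodeError:
            continue  # binary data mixed in: not a clean source string
        if "gl_Position" in text:
            kind = "vertex"
        elif "gl_FragColor" in text:
            kind = "fragment"
        else:
            kind = "unknown"
        shaders.append((kind, text))
    return shaders
```

The function takes the raw bytes of the Data section (or of the whole module) and returns the candidate sources, which can then be written to individual files for review.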
If you want to learn more about WebGL and Vertex shader, take a look at those links:
7. Absolute Paths, Error Messages, Mangled Names & Constant Names
Finally, we reach the last part of the module's Data section, which is for me the most interesting one. Inside, you will find more than 5,000 strings, such as:
- Absolute project file paths (“third_party/sketchology/engine/public/sengine.cc”)
- Error messages (“Could not add image data, no URI specified.”)
- Mangled names (“N3ink26ElementAnimationControllerE”)
- Constant names (“GL_GEOMETRY_SHADER”)
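Sorting several thousand strings into these four groups by hand is tedious, so a small classifier helps. The sketch below is an assumption-heavy illustration: the file extensions (.cc, .h, .proto), the GL_ prefix rule and the category labels are my own choices, and the demangler only handles the simplest Itanium C++ ABI nested names (N, then length-prefixed identifiers, then E), not templates or substitutions.

```python
import re

PATH_RE = re.compile(r"[\w./-]+\.(?:cc|h|proto)")  # assumed source extensions
CONSTANT_RE = re.compile(r"GL_[A-Z0-9_]+")


def demangle_nested_name(mangled):
    """Turn 'N3ink3FooE' into 'ink::Foo'; return None if it is not that simple."""
    if not (mangled.startswith("N") and mangled.endswith("E")):
        return None
    body, parts, pos = mangled[1:-1], [], 0
    while pos < len(body):
        match = re.match(r"\d+", body[pos:])
        if not match or int(match.group()) == 0:
            return None
        length = int(match.group())
        start = pos + len(match.group())
        part = body[start:start + length]
        if len(part) != length:
            return None
        parts.append(part)
        pos = start + length
    return "::".join(parts) if parts else None


def classify(string):
    """Label a string as 'path', 'constant', 'mangled' or 'message'."""
    if "/" in string and PATH_RE.fullmatch(string):
        return "path"
    if CONSTANT_RE.fullmatch(string):
        return "constant"
    if demangle_nested_name(string):
        return "mangled"
    return "message"
```

With the example from the list, demangle_nested_name("N3ink26ElementAnimationControllerE") gives "ink::ElementAnimationController", which shows that the library's C++ namespace is simply ink.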
Just with those strings, we can reconstruct the project tree (image on the left) and associate the corresponding error messages, mangled names, and constants with each file.
If you want to completely reverse this module, you will first need to match the previous information (WebGL, debug strings, …) with memory accesses/offsets (image on the top).
Then, you can determine the function prototypes (mangled names + arguments) and associate each WebAssembly function with its C++ source file. Finally, you can try to decompile your newly labeled module into C code using a tool like wasm2c.
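Rebuilding the tree from the extracted paths only takes a nested dictionary. This is a minimal sketch with hypothetical function names. It covers the directory structure only: associating error messages and mangled names with each file still requires looking at where the strings sit in memory.

```python
def build_tree(paths):
    """Build a nested dict {name: subtree} from slash-separated paths."""
    tree = {}
    for path in paths:
        node = tree
        for part in path.split("/"):
            node = node.setdefault(part, {})
    return tree


def render_tree(tree, indent=0):
    """Return the tree as indented lines, sorted by name at each level."""
    lines = []
    for name in sorted(tree):
        lines.append("  " * indent + name)
        lines.extend(render_tree(tree[name], indent + 1))
    return lines
```

Printing "\n".join(render_tree(build_tree(paths))) for the list of extracted paths gives a text version of the project layout.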
Nevertheless, by the end of this blog post, we have:
- Extracted a WebAssembly module and the related JS file.
- Converted the module to its text format representation.
- Found build information.
- Determined the origin and the purpose of the module.
- Reversed Google Protobuf encoded blobs.
- Extracted WebGL shader source code.
- Reconstructed the project tree.
- Found debug strings that will help to completely reverse the module (with more time).
All the files (wasm, js) and extracted information are available in this GitHub repository.
If you want to learn about WebAssembly security from module reversing to WebAssembly VM vulnerability research, you should consider taking one of our trainings. We also offer on-site trainings for companies, starting at just 5 participants.
Patrick Ventuzelo / @Pat_Ventuzelo