Open-sourcing Sandboxed API

Many software projects process data which is externally generated, and thus potentially untrusted. For example, this could be the conversion of user-provided picture files into different formats, or even executing user-generated software code.

When a software library parsing such data is sufficiently complex, it might fall victim to certain types of security vulnerabilities: memory corruption bugs or certain other types of problems related to the parsing logic (e.g. path traversal issues). Those vulnerabilities can have serious security implications.

In order to mitigate those problems, developers frequently employ software isolation methods, a process commonly referred to as sandboxing. By using sandboxing methods, developers make sure that only resources (files, networking connections and other operating system resources) which are deemed necessary are accessible to the code involved in parsing user-generated content. In the worst-case scenario, when potential attackers gain remote code execution rights within the scope of a software project, a sandboxing technique can contain them, protecting the rest of the software infrastructure.

Sandboxing techniques must be highly resistant to attacks and sufficiently protect the rest of the operating system, yet must be sufficiently easy-to-use for software developers. Many popular software containment tools might not sufficiently isolate the rest of the OS, and those which do, might require time-consuming redefinition of security boundaries for each and every project that should be sandboxed.

Sandbox once, use anywhere

To help with this task, we are open-sourcing our battle-tested project called Sandboxed API. Sandboxed API makes it possible to create security policies for individual software libraries. This concept allows to create reusable and secure implementations of functionality residing within popular software libraries, yet is granular enough to protect the rest of used software infrastructure.

As Sandboxed API serves the purpose of accessing individual software functions inside a sandboxed library, we are also making publicly available our core sandboxing project, Sandbox2. This is now part of Sandboxed API and provides the underlying sandboxing primitives. It can be also used standalone to isolate arbitrary Linux processes, but is considered a lower-level API.

Overview

Sandboxed API is currently implemented for software libraries written in the C programming language (or providing C bindings), though we might add support for more programming runtimes in the future.

From a high-level perspective, Sandboxed API separates the library to be sandboxed and its callers into two separate OS processes: the host binary and the sandboxee. Actual library calls are then marshalled by an API object on the host side and send via interprocess communication to the sandboxee where an RPC stub unmarshals and forwards calls to the original library.

Both the API object (SAPI object) and the RPC stub are provided by the project, with the former being auto-generated by an interface generator. Users just need to provide a sandbox policy, a set of system calls that the underlying library is allowed to make, as well as the resources it is allowed to access and use. Once ready, a library based on sandboxed API can easily be reused in other projects.

The resulting API of the SAPI object is similar to the one of the original library. For example, when using zlib, the popular compression library, a code snippet like this compresses a chunk of data (error handling omitted for brevity):

void Compress(const std::string& chunk, std::string* out) {
 z_stream zst{};
 constexpr char kZlibVersion[] = “1.2.11”;
 CHECK(deflateInit_(&zst, /*level=*/4, kZlibVersion, sizeof(zst)) == Z_OK);
 zst.avail_in = chunk.size();
 zst.next_in = reinterpret_cast<uint8_t*>(&chunk[0]);
 zst.avail_out = out->size();
 zst.next_out = reinterpret_cast<uint8_t*>(&(*out)[0]);
 CHECK(deflate(&zst, Z_FINISH) != Z_STREAM_ERROR);
 out->resize(zst.avail_out);
 deflateEnd(&zst);
}


Using Sandboxed API, this becomes:
void CompressSapi(const std::string& chunk, std::string* out) {
 sapi::Sandbox sandbox(sapi::zlib::zlib_sapi_embed_create());
 CHECK(sandbox.Init().ok());
 sapi::zlib::ZlibApi api(&sandbox);
 sapi::v::Array<uint8_t> s_chunk(&chunk[0], chunk.size());
 sapi::v::Array<uint8_t> s_out(&(*out)[0], out->size());
 CHECK(sandbox.Allocate(&s_chunk).ok() && sandbox.Allocate(&s_out).ok());
 sapi::v::Struct<sapi::zlib::z_stream> s_zst;
 
 constexpr char kZlibVersion[] = “1.2.11”;
 sapi::v::Array<char> s_version(kZlibVersion, ABSL_ARRAYSIZE(kZlibVersion));
 CHECK(api.deflateInit_(s_zst.PtrBoth(), /*level=*/4, s_version.PtrBefore(),
                         sizeof(sapi::zlib::z_stream).ValueOrDie() == Z_OK));
 CHECK(sandbox.TransferToSandboxee(&s_chunk).ok());
 s_zst.mutable_data()->avail_in = chunk.size();
 s_zst.mutable_data()->next_in = reinterpet_cast<uint8_t*>(s_chunk.GetRemote());
 s_zst.mutable_data()->avail_out = out->size();
 s_zst.mutable_data()->next_out = reinterpret_cast<uint8_t*>(s_out.GetRemote());
 CHECK(api.deflate(s_zst.PtrBoth(), Z_FINISH).ValueOrDie() != Z_STREAM_ERROR);
 CHECK(sandbox.TransferFromSandboxee(&s_out).ok());
 out->resize(s_zst.data().avail_out);
 CHECK(api.deflateEnd(s_zst.PtrBoth()).ok());
}

As you can see, when using Sandboxed API there is extra code for setting up the sandbox itself and for transferring memory to and from the sandboxee, but other than that, the code flow stays the same.

Try for yourself

It only takes a few moments to get up and running with Sandboxed API. If Bazel is installed:

sudo apt-get install python-typing python-clang-7 libclang-7-dev linux-libc-dev
git clone github.com/google/sandboxed-api && cd sandboxed-api
bazel test //sandboxed_api/examples/stringop:main_stringop

This will download the necessary dependencies and run the project through its paces. More detailed instructions can be found in our Getting Started guide and be sure to check out the examples for Sandboxed API.

Where do we go from here?

Sandboxed API and Sandbox2 are used by many teams at Google. While the project is mature, we do have plans for the future beyond just maintaining it:

  • Support more operating systems – So far, only Linux is supported. We will look into bringing Sandboxed API to the Unix-like systems like the BSDs (FreeBSD, OpenBSD) and macOS. A Windows port is a bigger undertaking and will require some more groundwork to be done.
  • New sandboxing technologies – With things like hardware-virtualization becoming almost ubiquitous, confining code into VMs for sandboxing opens up new possibilities.
  • Build system – Right now, we are using Bazel to build everything, including dependencies. We acknowledge that this is not how everyone will want to use it, so CMake support is high on our priority list.
  • Spread the word – Use Sandboxed API to secure open source projects. If you want to get involved, this work is also eligible for the Patch Reward Program.
Get involved

We are constantly looking at improving Sandboxed API and Sandbox2 as well as adding more features: supporting more programming runtimes, different operating systems or alternative containment technologies.
Check out the Sandboxed API GitHub repository. We will be happy to consider your contributions and look forward to any suggestions to help improve and extend this code.


Source: Google
Open-sourcing Sandboxed API