> The problem is that in WASM-land we're heading towards WASI and WAT components, which is similar to the .NET, COM & IDL ecosystems. While this is actually really cool in terms of component and interface discovery, the downside is that it means you have to re-invent the world to work with this flavor of runtime.
At the application level, you're generally going to write to the standards + your embedding. Companies that write embeddings are encouraged/incentivized to write good abstractions that work with standards to reduce user friction.
For example, for making HTTP requests and responding to HTTP requests, there is WASI HTTP:
https://github.com/WebAssembly/wasi-http
It's written in a way that is robust enough to handle most use cases without much loss of efficiency. There are a few inefficiencies in the WIT contracts (that will go away soon, as async lands in p3), but it represents a near-ideal representation of an HTTP request and is easy for many vendors to build on/against.
As far as rewriting the world, this happens to luckily not be quite true, thanks to projects like wasi-libc:
https://github.com/webassembly/wasi-libc
Networking is actually much more solved in WASI now than it was roughly a year ago -- threads is taking a little longer to cook (for good reasons), but async (without function coloring) is coming this year (likely in the next 3-4 months).
The sandboxing abilities of WASM are near unmatched, along with its startup time and execution speed compared to native.
I'm really eager to see what happens in the near future with WAT & WASI, but I'm also very wary of a repeat of DLL hell.
There are a few niches where standardization of interfaces and discoverability will be extremely valuable, both for interoperability and for reducing the development effort to bring up products that deeply integrate with many things. Currently each team has to re-invent the wheel for every end-user product they integrate with; the more ideal alternative is that each product provides its own implementations of the standard interfaces, and those implementations are simply plugged together.
But the reason I'm still on the fence is that I think there's more value in the UNIX-style 'discrete commands' model. Whether it's WASM or RISC-V, I don't think anybody cares; it's much more about self-describing interfaces with discoverability that can be glued together using whatever tools you have at your disposal.
> I'm really eager to see what happens in the near future with WAT & WASI, but I'm also very wary of a repeat of DLL hell.
I think we can at least say WebAssembly + WASI is distinct from DLL hell because at the very least your DLLs will run everywhere, and be intrinsically tied to a version and strict interface.
These are things we've just never had before, which is what makes it "different this time". Having cross-language runnable/introspectable binaries/object files with implicit descriptions of their interfaces that are this approachable is new. You can't ensure semantics are the same but it's a better place than we've been before.
> But the reason I'm still on the fence is that I think there's more value in the UNIX-style 'discrete commands' model. Whether it's WASM or RISC-V, I don't think anybody cares; it's much more about self-describing interfaces with discoverability that can be glued together using whatever tools you have at your disposal.
It's a bit hard to understand the difference you intended here between discrete commands and a self-describing interface -- could you explain?
I'd also argue that WASM + Component Model/WASI as a (virtual) instruction set versus RISC-V are very different!
DLLs have already run everywhere since the CLR became cross-platform.
Really, this is walking an already-trodden path, multiple times; you can even spot the patches where grass no longer grows from how often it has been walked.
The "universal compile target" facet of wasm is much less focal than the "universally embeddable" one.
The sandboxing is the keystone holding up the entire wasm ecosystem; without it, no one would be interested in it, the same way nobody would run JavaScript in browsers without a sandbox (we used to, it was called Flash, and we no longer do).
I am curious why you focus so much on "universal runtimes/compile targets always fail" rather than on its actual strength, when at least in the case of Java applets, they failed because their sandbox sucked (and because of their startup times).
Because the WASM sandbox only works to the extent that hackers have not bothered attacking existing implementations as thoroughly as they did Java applets; and WASM is, in any case, just one implementation among many of the UNCOL idea from 1958.
Additionally, it is a somewhat worthless sandbox, given that the way it is designed it doesn't protect against memory corruption; it is still possible to devise attacks that trigger execution flows leading to internal memory corruption, possibly changing the behaviour of a WASM module.
> Nevertheless, other classes of bugs are not obviated by the semantics of WebAssembly. Although attackers cannot perform direct code injection attacks, it is possible to hijack the control flow of a module using code reuse attacks against indirect calls. However, conventional return-oriented programming (ROP) attacks using short sequences of instructions (“gadgets”) are not possible in WebAssembly, because control-flow integrity ensures that call targets are valid functions declared at load time. Likewise, race conditions, such as time of check to time of use (TOCTOU) vulnerabilities, are possible in WebAssembly, since no execution or scheduling guarantees are provided beyond in-order execution and post-MVP atomic memory primitives. Similarly, side channel attacks can occur, such as timing attacks against modules. In the future, additional protections may be provided by runtimes or the toolchain, such as code diversification or memory randomization (similar to address space layout randomization (ASLR)), or bounded pointers (“fat” pointers).
> The sandboxing abilities of WASM are near unmatched, along with it's startup time and execution speed compared to native.
Could you expand on this? I think everyone would agree with the first two of these - sandboxing is the whole point of WASM, so it would be excellent at that. And startup latency matters a great deal to WASM programs, again not surprised that runtimes have optimised that.
But execution speed compared to native? Are you saying WASM programs execute faster than native? Or even at the same speed?
Ah this could have been clearer -- the context is userland emulation (and I expand that to broadly mean emulation/VMs and even containers -- i.e. the current group of options). It's not that Wasm is likely to run faster than native, it's that it runs reasonably close to native speed when compared to the other options.
Separately, it also matters what you consider "native" -- it is possible to write programs in a more efficient language (ex. one without a runtime), apply reasonable optimizations, and with AOT/JIT be faster than what could be reasonably written idiomatically in the host language (e.g. some library that already exists to do X but just does it inefficiently).
> the downside is that it means you have to re-invent the world to work with this flavor of runtime.
This is at least one of the reasons we've been building thin kernel interfaces for Wasm. We've built two now, one for the Linux syscall interface (https://github.com/arjunr2/WALI) and one for Zephyr. A preliminary paper we wrote a year or so back is here (https://arxiv.org/abs/2312.03858), and we have a new one coming up in Eurosys 25.
One of the advantages of a thin kernel interface to something like Linux is really low overhead and low implementation burden for Wasm engines. This makes it easier to then build things like WASI just one level up, compiled against the kernel interface and delivered as a Wasm module. Thus a single WASI implementation can be reused across engines.
A thin kernel interface isn't a reimplementation of a kernel. The WALI implementation in WAMR is ~2000 lines of C, most of which is just pass-through system calls.
It does not throw away the Wasm sandbox. Sandboxing means two things: memory sandboxing and system sandboxing. It retains the former. For the latter you can apply the same kinds of sandboxing policies as native processes and achieve the same effect, or even do it more efficiently in-process by the engine, and do interposition and whitelist/blacklisting more robustly than, e.g. seccomp.
Alright, so you're selectively forwarding the syscalls. Now you're approaching the problem again where you need to reimplement parts of Linux to understand the state machine of what fd 432 means at any given point in time, etc.; basically you're implementing the ideas of gVisor in a slightly different shape, without being able to run preexisting binaries. That doesn't seem like a useful combination of features to me.
Again, no. The security policies we have in mind can be implemented above the WALI call layer and supplied as an interposition library as a Wasm module. So you can have custom policies that run on any engine, such as implementing the WASI security model as a library. As it is now, all of WASI has to be implemented within the Wasm engine because the engine is the only entity with authority to do so. That's problematic in that engines have N different incompatible, incomplete and buggy implementations of WASI, and those bugs can be memory safety violations that own the entire process.
Thin kernel interfaces separate the engine evolution problem from the system interface evolution problem and make the entire software stack more robust by providing isolation for higher-level interfaces.
To filter out syscalls for complex policies, you need to understand the semantics of prior syscalls. For example, you need to keep track of what the dirfd in an unlinkat call refers to. And to keep track of FDs you need to reimplement fcntl. And so on.
This is why gVisor contains a reimplementation of parts of Linux.
Yes, but the engine doesn't need to do this, you can do this on your own time as a library. As there are literally dozens of Wasm engines now, thin kernel interfaces are a stable interface that they can all implement in exactly the same way[1] (simple safety checks + pass through) and then higher-level, more safe, and in some way better policies and APIs can be implemented as Wasm modules on top.
[1] This makes the interface per-kernel, not per-kernel x per-engine. It's also not per-kernel x per-kernel; engines would not be required to emulate one kernel on another kernel.
> let's delegate the hardest part back to the caller!
Obviously, an expert would write the security policies and make them reusable as libraries. Incidentally, that is what WASI is--it's not only a new security model, but a new API that requires rewrites of applications to fit with the new capability design.
> Try writing a seccomp policy for filesystem access
Try implementing an entire new system API (like WASI) in every engine! You have that problem and a whole lot more.
For comparison, implementing WASI preview1 is 6000 lines of C code in libuvwasi--and that's not even complete. Other engines have their own less complete, broken, buggy versions of WASI p1. And WASI p2 completely upends all of that and needs to be redone all over again in every engine.
Obviously, WASI p1 and p2 should be implemented in an engine-independent way and linked in. Which is exactly the game plan of thin kernel interfaces. In that sense, at the very least, thin kernel interfaces are a layering tool for the engine/system API split that enhances security and evolvability of both. Nothing requires the engine to expose the kernel interface, so if you want a WASI-only engine then only expose WALI to WASI and call it a day.