Incredibly disappointing release, especially for a company with so much talent and capital.
Looking at the worlds generated here https://marble.worldlabs.ai/, it looks a lot more like they are just doing image generation for multiview stereo 360 panoramas and then reprojecting that into space. The generations exhibit all the same image artifacts that come from this type of scanning/reconstruction work, all the same data shadow artifacts, etc.
This is more of a glorified image generator, a far cry from a "world model".
To be fair, multiview-consistent diffusion is extremely hard. Getting it right is an accomplishment in its own right, and still very useful. "World model" is probably a misnomer though (what even is a world model?). Their recent work on frame gen models is probably a bit closer to an actual world model in the traditional sense (https://www.worldlabs.ai/blog/rtfm).
They have $230m in funding and some of the best CS/AI researchers in the world. People like Skybox labs have already released stuff that is effectively the same as this with far less capital and resources. This is THE premier world model company, and the fact that their first release is a far cry from the promise here feels like a bit of a bellwether.
I agree RTFM is more in the "right" direction here, and what is presented here is a bit of a derivative of that. Which makes this release so much more crass, as it seems like a ploy to get platform buy-in from users more so than a release of a "world model".
Yeah, I'm likewise a bit underwhelmed by the results.
If you go in with the expectation that you give it a single image and a prompt and it does Gaussian splatting from that, it's phenomenal. If you deviate too far from the image viewpoint it breaks down, but it looks decent long enough to be very usable. But if you go in with the expectation that it's generating "worlds", it's not very good. This only passes as a world in a 20 second tech demo where the user isn't given camera controls.
My best guess is that they are forced (by investors, lack of investors, fear of the AI bubble, or whatever) to release something, and this was something they could polish up to production quality and host with reasonable GPU resources.
I assume this is definitely the case, with a drive to create platform economics on their sharing platform so that there is platform lock-in when any better thing releases. This is more of a platform launch than any notable model launch imo.