Hacker News

> what is /dev/shm? I thought /dev is already a synthetic FS?

/dev/shm is a tmpfs mount. It's not synthetic. It's basically "/tmp but it uses your RAM so it's not essentially-unbounded, so we didn't put it on /tmp because then stupid legacy programs might OOM the kernel."

It's mounted under /dev presumably because you can think of "a tmpfs mount" as a system facility for "named shared-memory allocations" — in the same way that e.g. /dev/audio is a system facility, or /dev/random is a system facility. It's a facility that presents itself as a virtual filesystem, but it's still fundamentally a "low level system API" meant for consumption by the systems programmers writing e.g. glibc, rather than something the user is meant to see and browse into as part of the filesystem abstraction. (I believe that /dev/mapper is another thing that ended up under /dev for this reason.)
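To make "named shared-memory allocations as files" concrete, here's a minimal Python sketch. It assumes a Linux system, where POSIX shared memory objects are backed by files under /dev/shm; the name "hn_shm_demo" is just an illustrative example.

```python
import os
from multiprocessing import shared_memory

# Create a named shared-memory allocation. On Linux, glibc's shm_open()
# backs this with a plain file under the /dev/shm tmpfs.
shm = shared_memory.SharedMemory(create=True, size=4096, name="hn_shm_demo")
backing_file = "/dev/shm/hn_shm_demo"
visible = os.path.exists(backing_file)   # the allocation is browsable as a file

shm.buf[:5] = b"hello"                   # ordinary memory writes...
with open(backing_file, "rb") as f:      # ...readable through the filesystem API
    first5 = f.read(5)

shm.close()
shm.unlink()                             # removes /dev/shm/hn_shm_demo
print(visible, first5)
```

Which is exactly the "facility that presents itself as a virtual filesystem" point: the same bytes are reachable both as memory and as a file.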

Also, /dev was a synthetic FS for a little while — devfs — but this long post-dates the creation and conventional-ization of these other mountpoints inside /dev. Until 2009, /dev was just a directory in the rootfs, with special-device inodes created manually during rootfs creation with mknod(8).

These days, under udev, /dev is once again "just a directory" — just one with a daemon that automatically creates and destroys the special-device inodes in it to match what's actually available on the system.
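You can see what those special-device inodes look like from userspace; a small sketch, assuming a Linux system (where /dev/null has the conventional major:minor of 1:3):

```python
import os
import stat

# /dev/null is a character-device inode -- the same kind of inode that
# mknod(8) used to create manually, and that udev now creates automatically.
st = os.stat("/dev/null")
is_chardev = stat.S_ISCHR(st.st_mode)
major, minor = os.major(st.st_rdev), os.minor(st.st_rdev)
print(is_chardev, major, minor)
```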



Thank you for the explanation. The implementation details and the history are both interesting, but again - completely irrelevant to the question of "am I running out of disk space".

Having thought about your suggestion again (that df should hide irrelevant mounts), I think it would be unfortunate. I might have legitimate reasons to check on my tmpfs mounts (e.g. as you suggested, a stupid program dumping cruft into /tmp). Changing df also doesn't help clean up my Grafana dashboard.

Again, I think the root cause of the problem lies somewhere between exposing implementation details to the user, and layering on more uninvited complexity. Across the bunch of Linux boxes I have at hand, /run, /run/lock, /run/wrappers, /run/user/1000 (and /run/user/1001, 1002...) are all separate mount points. If /run is already a tmpfs mount point, why does /run/lock have to be one as well? The mount flags are pretty much identical. If uid 1000 shouldn't be able to deny uid 1001 the ability to write to their XDG_RUNTIME_DIR, why not implement some form of quotas?

I'm sure there are good answers to all of these questions, but my main point still stands - even when this information is relevant, it's being drowned in the noise.


> I might have legitimate reasons to check on my tmpfs mounts

Yeah, but the output of df(1) re: tmpfs is meaningless, as 1. every tmpfs mount is considered to have as much free space as there is RAM in the system (total RAM, not free RAM!); and 2. the "used space" stat will track unlinked-but-open files, which for a tmpfs shouldn't be thought of as even being "in" the tmpfs at that point, but rather as reserved IPC memory held by one or shared between several processes.

If you want to know the disk usage of a tmpfs dir, use du(1). That's what it's for: aggregating the space taken by the tree of inodes under a directory or mountpoint.
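The two tools answer different questions, which a short sketch makes clear: du(1) walks the tree and sums what the inodes hold, while df(1) asks the filesystem itself for its free-space counters via statfs/statvfs. (Real du counts allocated blocks, st_blocks * 512; st_size is close enough for a sketch.)

```python
import os
import tempfile

def du_bytes(path):
    """du-style: aggregate the space used by the tree under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.lstat(os.path.join(root, name)).st_size
    return total

def df_free_bytes(path):
    """df-style: free space in the filesystem containing `path`."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize

with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "scratch.bin"), "wb") as f:
        f.write(b"\0" * 1000)
    used = du_bytes(d)        # what the tree holds
    free = df_free_bytes(d)   # whatever the containing filesystem has free
print(used, free > 0)
```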

df(1) is an abbreviation for "disk free" for a reason — the output of df(1) only makes sense insofar as there's a bounded, independent, reserved pool of "free space" of a resource that you want to measure and manage.

Since df(1) used to exist in an environment where the only mount points were of such resources, df(1) never previously had to do any filtering of the mounts table to achieve its stated job. But the "spirit of its semantics" would today imply filtering.
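A sketch of what that filtering could look like, reading the mounts table directly. This is Linux-specific (/proc/self/mounts), and the list of pseudo-filesystem types here is illustrative, not exhaustive:

```python
def disk_backed_mounts(mounts_file="/proc/self/mounts"):
    """Keep only mounts whose source is a real block device, dropping
    tmpfs, devtmpfs, squashfs images and similar pseudo-filesystems."""
    pseudo = {"tmpfs", "devtmpfs", "squashfs", "overlay", "proc", "sysfs"}
    keep = []
    with open(mounts_file) as f:
        for line in f:
            source, target, fstype = line.split()[:3]
            if source.startswith("/dev/") and fstype not in pseudo:
                keep.append((target, fstype))
    return keep

for target, fstype in disk_backed_mounts():
    print(target, fstype)
```

(GNU df can do this itself with `-x tmpfs -x devtmpfs ...`, but the point is that the filtering decision lives outside df's defaults today.)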

> If /run is already a tmpfs mount point, why does /run/lock have to be one as well?

I believe this is a long-term deprecation caught in mid-transition — in theory, /run itself doesn't actually need to be a tmpfs any more; according to the FHS and the newer versions of XDG, all the tmpfs-es should be subdirectories of /run. Once that actually happens, /run would just become a regular directory, like /dev is.

> If uid 1000 shouldn't be able to deny uid 1001 the ability to write to their XDG_RUNTIME_DIR, why not implement some form of quotas?

I believe these dirs are actually done this way for efficiency, not security: one magical thing about a tmpfs mount is that it somewhat acts like a memory arena for the allocations within it; so unmounting it will batch free all those memory allocations in a very efficient manner. (Think: the time it takes to `rm -rf` a million tiny files in a scratch partition, vs. just reformatting said partition.) /run/user/... is created at user login; users expect logout [i.e. session refcount dropping to 0] — and/or graceful shutdown — to not hang on unlinking a million tiny files; and having /run/user/... be a separate tmpfs that gets unmounted on logout, helps with that.

Separately, I believe "application users" for containerized services get /run/user/... mounts created for them, within their separate mount namespace (and this is the only way that that can work); having non-containerized users also use mountpoints here allows code reuse, rather than two mostly-redundant and potentially-buggy codepaths.

> even when this information is relevant, it's being drowned in the noise

Yes, I agree. My point is more that only certain filesystem mounts are actually mounts of bounded-size writable disks. And that it's only mounts of bounded-size writable† disks that a system administrator would be concerned about "managing" by asking the question "how much of this disk is free?"

(† One example you didn't mention: read-only squashfs images, used for e.g. Ubuntu snaps — which show up under df(1) as always-100%-utilized mounts.)


> 1. every tmpfs mount is considered to have as much free space as there is RAM in the system (total RAM, not free RAM!)

This is wrong, as can be seen from the output of `df -h -t tmpfs` on a Linux system:

    df -t tmpfs -h
    Filesystem      Size  Used Avail Use% Mounted on
    tmpfs            22G  168K   22G   1% /dev/shm
    tmpfs           8,6G  2,2M  8,6G   1% /run
    tmpfs           5,0M     0  5,0M   0% /run/lock
    tmpfs           4,3G  464K  4,3G   1% /run/user/1000
Every tmpfs has a different size / available space.

> 2. the "used space" stat will track unlinked-but-open files, which for a tmpfs shouldn't be thought of as even being "in" the tmpfs at that point

This is also wrong, as they still count against the space the given tmpfs might use. That is relevant information when something reports ENOSPC.
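This is easy to demonstrate: an unlinked-but-open file still occupies the filesystem's space until the last descriptor is closed. A sketch, assuming a Linux system; it prefers /dev/shm (a tmpfs) and falls back to the regular temp directory otherwise:

```python
import os
import tempfile

d = "/dev/shm" if os.path.isdir("/dev/shm") else tempfile.gettempdir()
path = os.path.join(d, "hn_unlink_demo")

free_before = os.statvfs(d).f_bavail    # free blocks before
f = open(path, "wb")
f.write(b"\0" * (4 << 20))              # 4 MiB
f.flush()
os.unlink(path)                         # no directory entry -- du won't see it
free_unlinked = os.statvfs(d).f_bavail  # but the blocks are still in use
f.close()                               # only now is the space released

still_counted = free_before - free_unlinked
print(still_counted > 0)
```

So df's "used" number includes exactly the space that would trigger ENOSPC, even though no directory walk can find it.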

> If you want to know the disk usage of a tmpfs dir, use du(1).

This number is usually irrelevant.

> > If /run is already a tmpfs mount point, why does /run/lock have to be one as well?

> I believe this is a long-term deprecation caught in mid-transition

This is wrong too. /run/lock is a separate tmpfs because /run/lock is world-writable, but /run is not. It's basically to prevent denial of service: otherwise anyone could fill the /run tmpfs through the world-writable /run/lock. With a separate tmpfs for /run/lock, this is no longer possible.

This is also why /run/user/X is a separate tmpfs.


>/dev/shm is a tmpfs mount. It's not synthetic. It's basically "/tmp but it uses your RAM so it's not essentially-unbounded

/dev/shm is for the shared-memory facility. It's not used as a temporary filesystem by anything (possibly excepting the user). FreeBSD's shm implementation actually does support the read/write interface, but it doesn't expose a filesystem.

>, so we didn't put it on /tmp because then stupid legacy programs might OOM the kernel."

>Yeah, but the output of df(1) re: tmpfs is meaningless,

/tmp is a tmpfs on Fedora and other distros. The size parameter on mount can limit the available 'storage', so df's free-'space' output can be useful.
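For example, a size-capped tmpfs /tmp can be declared in /etc/fstab; the size and options here are illustrative, not a recommendation:

```
tmpfs   /tmp   tmpfs   size=2G,mode=1777,nosuid,nodev   0  0
```

With size=2G, df reports a 2G filesystem for /tmp, and writes past that limit fail with ENOSPC instead of eating all of RAM.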



