
Programs that don't work:

rm -r:
needs to open directory FDs and use fchdir() to work without being
vulnerable to symlink races.  fchdir() isn't implemented.
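
For reference, the race-free descent that rm -r depends on looks roughly
like the sketch below.  The helper name enter_dir_safely() is made up for
illustration and is not taken from rm's source; it only shows why the
directory FD and fchdir() are needed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for O_DIRECTORY and O_NOFOLLOW on older libcs */
#endif
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: enter the subdirectory `name' without following
   a symlink.  Returns an FD for the directory that was current before
   the call (so the caller can return with fchdir()), or -1 on error. */
int enter_dir_safely(const char *name)
{
  int parent_fd, dir_fd;

  parent_fd = open(".", O_RDONLY | O_DIRECTORY);
  if (parent_fd < 0)
    return -1;

  /* O_NOFOLLOW makes open() fail if `name' is a symlink. */
  dir_fd = open(name, O_RDONLY | O_DIRECTORY | O_NOFOLLOW);
  if (dir_fd < 0) {
    close(parent_fd);
    return -1;
  }

  /* fchdir() acts on the object we opened, not on the name, so
     replacing `name' with a symlink at this point has no effect. */
  if (fchdir(dir_fd) < 0) {
    close(dir_fd);
    close(parent_fd);
    return -1;
  }
  close(dir_fd);
  return parent_fd;
}
```

Using chdir(name) instead would resolve the name a second time, which is
the window a symlink race exploits.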

install:
uses fchdir() too, it seems.

gcc:
tries 1000 times to open files in /tmp in a loop (so it hangs for a while),
then aborts (without a descriptive error)
-- really, it should test for EEXIST and retry only in that case
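
A sketch of the retry loop with the EEXIST test added.  This is not gcc's
code: the function name, the naming scheme and the parameters are all
assumptions made for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: create a fresh file named <prefix><n> for some
   n < max_tries.  Returns an FD, or -1 with errno set. */
int create_temp_file(const char *prefix, int max_tries,
                     char *name_out, size_t name_size)
{
  int i;

  for (i = 0; i < max_tries; i++) {
    int fd;
    int n = snprintf(name_out, name_size, "%s%d", prefix, i);
    if (n < 0 || (size_t) n >= name_size) {
      errno = ENAMETOOLONG;
      return -1;
    }
    fd = open(name_out, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd >= 0)
      return fd;
    /* Only a name collision is worth retrying.  Any other error
       (EACCES, ENOENT, ENOSYS, ...) would fail the same way on every
       attempt, so report it immediately. */
    if (errno != EEXIST)
      return -1;
  }
  errno = EEXIST;
  return -1;
}
```

With this test, an open() that fails for any other reason is reported
after one attempt rather than after 1000.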

mkisofs:
/usr/bin/mkisofs: Operation not permitted. Panic cannot set back efective uid.
It does the following:
setreuid32(0xffffffff, 0x1f6)           = -1 EPERM (Operation not permitted)
where 0x1f6 = 502 (mrs), the UID it believes it already has

make:
setresuid32(0xffffffff, 0x1f6, 0xffffffff) = -1 EPERM (Operation not permitted)
(but this doesn't happen with Debian, even though make is 3.79.1 on both)

mailq:
masqmail is setuid root, and it tries to close all file descriptors,
and consequently cuts off its access to the server.

emacs:
  X-plash$ emacs
  emacs: Memory exhausted--use M-x save-some-buffers RET
I'm not sure why this is happening.
Emacs doesn't like being run like this, from Bash:
  /lib/ld-linux.so.2 /usr/bin/emacs
This error is mentioned in:
http://www.cs.berkeley.edu/~nks/fig/fig.ps

xemacs:
Running the shell "shell" gives:
  sendmsg: Bad file descriptor
  recvmsg: Bad file descriptor
  [2622] cap-protocol: [fd 5] to-server: connection error, errno 9 (Bad file descriptor)
  Can't exec program /bin/sh
  Process shell exited abnormally with code 1
It would appear that XEmacs is closing the file descriptor that Plash uses.

xemacs: also relies on pseudo-terminals; do this:
nm /usr/bin/xemacs | egrep 'getpt|grantpt|unlockpt|ptsname|openpty|forkpty'

xemacs:
Filename completion doesn't seem to work on fabricated directories,
only real directories.  Is it relying on the object type being included
in directory listings?


Bugs in Plash:

Race condition:
connect() on Unix domain sockets presumably follows symlinks, and
there is no way to switch this off.  This means an adversary could
replace a domain socket with a symlink to exploit the race condition
and open sockets in the server's namespace.
(bind() probably doesn't follow symlinks; it probably behaves like
open() with O_CREAT|O_EXCL.)
Note that this is a race condition between servers.  If there were only
one server process providing real filesystem objects, there would be
no race condition, because socket_connect() would be atomic.
Possible solution:
 * The server has a private directory in which no server will create
   symlinks on behalf of a client.
   The server's socket_connect() method hardlinks the socket file into
   the private directory, checks that the object that got hardlinked
   is not a symlink, and then calls connect() on it.
    * Problem: this doesn't work across devices.  We could try creating
      a hard link in the same directory, but that doesn't work if the
      server doesn't have write access to the directory.
      (But then, if the server doesn't have write access *and other
      servers don't either* the symlink attack cannot be carried out.)
      Doing this in /tmp/.X11-unix creates a file that is owned by
      root in a directory owned by root (with the sticky bit set), which
      then cannot be deleted.  (The sticky bit means you can unlink an
      object only if you own it, or you own the directory.)
    * We could have a list of directories into which we try to create
      hard links.  These are tried in turn.  The user can set these up
      to cover all the filesystems.
      Plash would have to create a directory per user ID inside these
      directories (as in /tmp) -- incidentally, this means the sticky
      bit isn't necessary, because processes with other user IDs can't
      delete the directories while they're in use.
      What should the default be?  "/tmp/plash/"
      This would be enough for opening X11 sockets.
      How should this be configured?
      Via an environment variable?  We would want to unset this so that
      programs run under Plash don't see it.
      Via a configuration file in /etc?  /etc/plash/hardlink-dirs could
      contain a list of directories, each on a separate line.
See "SSH authentication agent follows symlinks via a UNIX domain socket":
http://cert.uni-stuttgart.de/archive/bugtraq/1999/10/msg00011.html

Race conditions in gc-uid-locks:
This race means that run-as-anonymous could assign a process a UID that
*is* currently in use.  If a process manages to get the same UID as
another, it can kill the other process, or worse, ptrace() it and gain
its authority.  The risk is very low, because a program inside the
chroot() environment can't run run-as-anonymous or gc-uid-locks.
 * If gc-uid-locks runs between run-as-anonymous claiming the lock file
   and changing its uid, gc-uid-locks will delete the lock file
    * Solution:  run-as-anonymous should set its group before claiming
      the lock file
 * gc-uid-locks reads uids first then locks.  Consider:
    * gc-uid-locks reads uids
    * run-as-anonymous sets uids, claims lock
    * gc-uid-locks reads lock files -> deletes a lock that is actually in use
   Solution:  read lock files first.
 * gc-uid-locks deletes lock file after checks:
    * gc-uid-locks[1] determines that lock file F is to be deleted
    * gc-uid-locks[2] deletes F
    * run-as-anonymous claims lock file F
    * gc-uid-locks[1] deletes F, but it's actually in use
   Solution:  gc-uid-locks should have its own mutual exclusion lock
Assumptions:  that /proc/ is a reliable, *atomic* way of reading uid/gids
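
One way to give gc-uid-locks its own mutual exclusion lock is flock() on a
dedicated file, as in the sketch below.  The function names and the idea
of a separate lock file are assumptions, not existing gc-uid-locks code.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical: block until this process is the only gc-uid-locks
   instance allowed to scan and delete.  Returns an FD that holds the
   lock, or -1 on error. */
int acquire_gc_lock(const char *lock_path)
{
  int fd = open(lock_path, O_RDWR | O_CREAT, 0600);
  if (fd < 0)
    return -1;
  if (flock(fd, LOCK_EX) < 0) {
    close(fd);
    return -1;
  }
  return fd;
}

/* Closing the FD releases the lock, provided it has not been
   duplicated or inherited by a child process.  The kernel also
   releases it if the process dies, so a crash cannot leave the
   lock stuck. */
void release_gc_lock(int fd)
{
  close(fd);
}
```

The lock would have to be held across both the check and the deletion,
so that a second gc-uid-locks instance cannot delete F in between.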

Re-entrancy: run_server_step() is called while waiting for a reply on
a return continuation object.  It will handle incoming requests --
these should be queued instead.
 * I don't think this actually causes any bugs, since there are no
   TOCTTOU problems in the code.  (There aren't really any invariants
   that are broken during a method call.)

Running as root:  /dev/tty is a writable slot; it should be a writable
object in a read-only slot
 * The same applies to /tmp/.X11-unix

No thread safety in libc

No resource accountability (not really a bug)

stat64() doesn't work properly

Make sure that messages are encoded and decoded properly on 64-bit and
other-endian machines.
 * Currently I assume sizeof(int) == 4.
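
A sketch of encoding that does not depend on sizeof(int) or on the host's
byte order.  The little-endian choice and the function names are
assumptions; the existing wire format is not specified in these notes.

```c
#include <stdint.h>

/* Write a 32-bit value as four bytes, least significant byte first. */
void encode_u32_le(unsigned char *buf, uint32_t v)
{
  buf[0] = (unsigned char) (v & 0xff);
  buf[1] = (unsigned char) ((v >> 8) & 0xff);
  buf[2] = (unsigned char) ((v >> 16) & 0xff);
  buf[3] = (unsigned char) ((v >> 24) & 0xff);
}

/* Read back a value written by encode_u32_le(). */
uint32_t decode_u32_le(const unsigned char *buf)
{
  return (uint32_t) buf[0]
       | ((uint32_t) buf[1] << 8)
       | ((uint32_t) buf[2] << 16)
       | ((uint32_t) buf[3] << 24);
}
```

Because the bytes are produced by shifts rather than by copying an int
out of memory, the result is the same on any machine.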

Sending on a socket is never queued.  This could lead to DoS of
servers.  It could potentially lead to deadlocks, if both ends of a
connection send at the same time (this doesn't happen at the moment
because all connections are client-server and call-return).

The server processes are included in the same job as the client
processes.  The server has the same process group ID, and the shell
will wait for it.  This is convenient (for printing the exit status),
but wrong.  If the user presses Ctrl-C, and the client handles SIGINT
and survives, the server will still be killed, and the client will
become mostly useless.

Shell: build-fs.c:
If you have the command "cmd foo", and `foo' is a symlink, the symlink
will be followed and the shell will also grant access to the
destination of the link.
If you have the command "cmd => foo", the symlink is not followed.
This is inconsistent.
Actually, I have realised that following the symlink is not good from
a security point of view.

libc's object-based execve() ignores the close-on-exec flag
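
The flag in question can be read with fcntl(F_GETFD).  The helper below is
a hypothetical sketch of the check that an execve() replacement would have
to make for each open FD; it is not taken from Plash's libc.

```c
#include <fcntl.h>

/* Returns 1 if `fd' has close-on-exec set, 0 if it does not,
   and -1 on error (for example EBADF for a closed FD). */
int fd_has_cloexec(int fd)
{
  int flags = fcntl(fd, F_GETFD);
  if (flags < 0)
    return -1;
  return (flags & FD_CLOEXEC) ? 1 : 0;
}
```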

There may be cases where libc calls should preserve errno but don't.
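
The usual pattern is to save errno around any cleanup call that might
overwrite it.  A minimal sketch, with a made-up helper name:

```c
#include <errno.h>
#include <unistd.h>

/* close() that leaves errno untouched, for use on cleanup paths where
   the caller still needs the errno of an earlier failure. */
void close_preserving_errno(int fd)
{
  int saved_errno = errno;
  close(fd);
  errno = saved_errno;
}
```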



Behaviour that might need changing:

build-fs.c attaches copies of symlinks into processes' file
namespaces, so the process won't see them change when they change in
the real filesystem.  This may not be expected.



Fixed

Now works:

xemacs
(uses open() on directories)
