SmackerNews

Fork() without exec() is dangerous in large programs

84 points · 101 comments · 9 years ago · zdw

evanjones.ca

caf · 9 years ago
In my opinion, it's libraries that secretly use threads under the covers (but still access state shared with the rest of the program like the malloc heap) that are what's dangerous.
rdtsc · 9 years ago
Yep. Erlang 19 (the latest version) switched to a small spawner executable and now spawns all OS processes from there by forking. So it forks something small and restricted, not the whole VM.
Here are the details of how it works:
https://github.com/erlang/otp/blob/a5256e5221aff30f6d2cc7fab...
They also claim a 3-5x speedup for launching external commands because of it. So there is a nice performance boost as well.
So basically we can add a 4th strategy: fork a small program once at the start, then fork from it from then on.
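A minimal Python sketch of that fourth strategy, not Erlang's actual implementation. The `Spawner` class, its JSON-lines pipe protocol, and its method names are all assumptions made for illustration. The helper is forked once, before any threads exist, and every later command is launched from that small process:

```python
import json
import os
import subprocess
import sys


class Spawner:
    """Hypothetical helper: fork a small process early, launch commands from it."""

    def __init__(self):
        req_r, req_w = os.pipe()  # parent -> helper: one JSON argv per line
        res_r, res_w = os.pipe()  # helper -> parent: one exit code per line
        self._pid = os.fork()     # call this before the program starts threads
        if self._pid == 0:
            os.close(req_w)
            os.close(res_r)
            status = 1
            try:
                with os.fdopen(req_r, "r") as requests, os.fdopen(res_w, "w") as replies:
                    for line in requests:
                        # The fork+exec happens here, in the small helper.
                        code = subprocess.call(json.loads(line))
                        replies.write(f"{code}\n")
                        replies.flush()
                status = 0
            finally:
                os._exit(status)  # never fall back into the parent's code
        os.close(req_r)
        os.close(res_w)
        self._requests = os.fdopen(req_w, "w")
        self._replies = os.fdopen(res_r, "r")

    def run(self, argv):
        """Ask the helper to run argv and return the command's exit code."""
        self._requests.write(json.dumps(argv) + "\n")
        self._requests.flush()
        return int(self._replies.readline())

    def close(self):
        self._requests.close()  # EOF ends the helper's loop
        self._replies.close()
        os.waitpid(self._pid, 0)
```

Usage would look like `s = Spawner()` at startup, then `s.run(["ls", "-l"])` from anywhere later, and `s.close()` on shutdown. Calls to `run` from several threads would need a lock around them, which this sketch leaves out.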
qwertyuiop924 · 9 years ago
In some programs, fork(2) is the right way to do concurrency. It's simple (API-wise), and it makes IPC explicit, as opposed to the implicit IPC of threads. clone isn't POSIX, so you can forget it if you don't feel like being behind systemd in the "screw any system that isn't Linux" line.
In places where you need the speed, threads are useful, but they're harder to use than forks.
Annoyingly, threads and forks don't work well together. zzzcpan already talked about how to fork threaded code. As for the problem of libraries using threads, if a library you're using is using threads, and it's documented (it probably is), and you didn't know about it, I'll pencil that in as your fault.
Other comments in this thread have dismissed fork(2) as a bad job entirely, but I don't think I agree. It's an effective way to do simple multiprocessing, and it's a lot simpler than threads in many contexts.
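To make the "explicit IPC" point concrete, here is a small Python sketch, assuming a hypothetical helper named `run_in_child`. The child shares nothing with the parent after the fork; the only thing that comes back is what is written to the pipe:

```python
import os
import pickle


def run_in_child(func, *args):
    """Run func(*args) in a forked child and return the result sent over a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: compute, send the pickled result, and exit without running
        # the parent's cleanup handlers.
        os.close(read_fd)
        status = 1
        try:
            with os.fdopen(write_fd, "wb") as out:
                pickle.dump(func(*args), out)
            status = 0
        finally:
            os._exit(status)
    # Parent: read everything first so a large result cannot block the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as inp:
        data = inp.read()
    _, wait_status = os.waitpid(pid, 0)
    if wait_status != 0 or not data:
        raise RuntimeError("child failed")
    return pickle.loads(data)
```

For example, `run_in_child(sum, [1, 2, 3])` computes the sum in a separate process. This is only safe in the situations discussed in this thread, that is, when no other thread in the parent holds a lock at the moment of the fork.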
kostyash · 9 years ago
It sounds like using threads and fork together can be a disaster. That's true, but the author blames only fork and skips the other half of the problem: threads. This is not fair, because fork is much older than POSIX threads. If anything, POSIX threads were poorly designed for use with fork.
antirez · 9 years ago
If you are in control of all the threads that an application is running, you can call fork() safely by making sure the threads are put in a safe state (no critical locks retained) when the fork() call happens. Also note that many things that should normally be unsafe, like having running threads call malloc() while another thread is forking, are actually safe in the real world with certain implementations of fork, since there are pre/post fork "hooks" in the malloc implementation that fix up the state of the child.
So if you closely control the libs you link with and what they do, as well as the threads you use yourself, it is possible to use fork() in a reliable way.
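The same pre/post fork hook pattern can be sketched for an application-level lock. This is a Python illustration using `os.register_at_fork` (available on Unix since Python 3.7), not the malloc implementation's own code; `cache_lock` and `fork_and_use_lock` are hypothetical names:

```python
import os
import threading

cache_lock = threading.Lock()  # hypothetical lock guarding some shared state

# Take the lock just before fork() so no other thread can be in the middle of
# an update when the process is copied, then release it on both sides.
os.register_at_fork(
    before=cache_lock.acquire,
    after_in_parent=cache_lock.release,
    after_in_child=cache_lock.release,
)


def fork_and_use_lock():
    """Fork, have the child take the lock, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Without the hooks, this could hang if another thread held the lock
        # at fork time, because that thread does not exist in the child.
        acquired = cache_lock.acquire(timeout=1)
        os._exit(0 if acquired else 1)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

This only covers locks you know about, which is why it depends on controlling every library and thread in the process.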
ambrop7 · 9 years ago
The fork-exec code in my program currently does some things that are supposedly unsafe, but I can't figure out what to use instead of initgroups(), which is not async-signal-safe. I want to run the child as a different user, for which I do initgroups+setgid+setuid between fork and exec. The only solution I see is to run getgrouplist() before the fork and then use setgroups() instead of initgroups() in the child, but both of these functions are non-standard.
EDIT: Never mind, seems like initgroups() is also nonstandard but generally available on Unix-like systems.
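The workaround described above (look up the groups before the fork, apply them with setgroups() in the child) can be sketched in Python, whose `os` module wraps the same calls. The function names `resolve_identity` and `spawn_as` are assumptions. Python code running in the child is not itself async-signal-safe, so this only illustrates the order of the calls, and `spawn_as` needs root privileges to succeed:

```python
import os
import pwd


def resolve_identity(username):
    """Do the user-database lookups in the parent, before forking."""
    entry = pwd.getpwnam(username)
    groups = os.getgrouplist(username, entry.pw_gid)
    return entry.pw_uid, entry.pw_gid, groups


def spawn_as(username, argv):
    """Run argv as another user and return the raw wait status."""
    uid, gid, groups = resolve_identity(username)
    pid = os.fork()
    if pid == 0:
        try:
            os.setgroups(groups)  # supplementary groups first, while privileged
            os.setgid(gid)
            os.setuid(uid)        # drop the user ID last
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)         # reached only if one of the calls above failed
    _, status = os.waitpid(pid, 0)
    return status
```

The point of the split is that the lookup, which may read files or talk to a name service, happens where it is safe, and the child only makes plain system calls.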
datenwolf · 9 years ago
Hmm, suppose we want to keep fork(). Then the only safe way to deal with this would be for critical sections to be actual transactions implemented at the OS level, so that after fork() all transactions in flight get rolled back in the child before fork() returns.
I see two implementation challenges with that suggestion:
1.) implementing that transaction mechanism as a kernel feature: when entering a CS, mark all pages CoW; upon leaving the CS, merge the modified pages (problem: whole pages are then mutually exclusive, and dealing with this is the challenge)
2.) battling with user space implemented locks that use atomics.
-----
An immediate mitigation I see is to make fork() itself a CS on _all_ the locks of a process. If we assume that only the standard locking mechanisms are used, then whenever a CS is entered (which includes the creation of a locking primitive) it raises/posts a global fork-lock semaphore. Upon leaving the CS, that semaphore is lowered.
This still leaves the DIY-locking primitives problem open. But it should be more or less straightforward to add this to the system libc/pthread libraries' locking primitive implementation and fork() syscall wrappers.
Or did I miss something essential here? Talk is cheap, so if nobody has any obvious objections I'd actually go ahead implement it.
EDIT: Okay, one immediate problem I see is, that this would pose a challenge for calling fork inside a CS. Technically this is a situation where thread recursive locks would help, but as we all know, recursive locks are highly problematic.
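A rough Python sketch of that mitigation, under stated assumptions: `ForkGate` is a hypothetical class, the "semaphore" is a plain counter guarded by a condition variable, and only code that goes through `critical_section` is protected (the DIY-locking problem remains). As the EDIT notes, calling `fork` from inside a critical section would deadlock here:

```python
import os
import threading
from contextlib import contextmanager


class ForkGate:
    """Hypothetical process-wide gate: fork() waits until no CS is active."""

    def __init__(self):
        self._cond = threading.Condition(threading.Lock())
        self._active = 0  # number of critical sections entered or being entered

    @contextmanager
    def critical_section(self, lock):
        with self._cond:
            self._active += 1      # raise the counter before taking the lock
        try:
            with lock:
                yield
        finally:
            with self._cond:
                self._active -= 1  # lower it again after releasing the lock
                self._cond.notify_all()

    def fork(self):
        # The condition's lock is held across os.fork(), so no new critical
        # section can start between the check and the fork. It is released
        # on return in both the parent and the child.
        with self._cond:
            self._cond.wait_for(lambda: self._active == 0)
            return os.fork()
```

A busy program could starve the forking thread, since new sections may keep entering while it waits; a real implementation would need to block new entries once a fork is pending.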
ahh · 9 years ago
Endorsed.
Even fork _with_ exec can be real trouble. This is one of my bugaboos at work: due to poor life choices and high pain tolerance, I own the infrastructure we use to spawn subprocesses (carefully.) For various reasons (security most notably) we have to do some very tricky things in and to the forked child before exec(), and pretty much all of this code is a disaster waiting to happen. Every so often I get feature requests for more stupid pet tricks people would like out of subprocesses, and they're always surprised by what their "simple" change would entail.
I'd like it if Linux had native support for posix_spawn, but even that would require a lot of extensions to be useful.
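For reference, this is roughly what the posix_spawn interface looks like from Python (`os.posix_spawn`, Python 3.8+). The wrapper `spawn_and_wait` and its `stdout_path` parameter are assumptions for illustration. File actions are the kind of limited child-side setup the interface allows; anything beyond what it offers is where the extensions mentioned above would be needed:

```python
import os
import sys


def spawn_and_wait(argv, stdout_path=None):
    """Launch argv (argv[0] must be a full path) and return its exit code."""
    actions = None
    if stdout_path is not None:
        # Ask for the child's stdout to be opened onto a file, instead of
        # running our own code between fork() and exec().
        flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
        actions = [(os.POSIX_SPAWN_OPEN, 1, stdout_path, flags, 0o644)]
    pid = os.posix_spawn(argv[0], argv, os.environ, file_actions=actions)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # negative value: killed by that signal
```

For example, `spawn_and_wait([sys.executable, "-c", "print('hi')"], stdout_path="out.txt")` runs a child whose output lands in `out.txt`.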
Don't get me started on the teams that want to break forking rules and thus ask me how to guarantee a process has no non-main threads. There are few ways you can make me more upset than by building software that breaks if someone else happens to call pthread_create and doesn't tell you.
hyperbovine · 9 years ago
I ran into this exact problem recently in a multithreaded Python app and spent two days trying to figure out wtf was going on. The multiprocessing spawn startup mode that was added in 3.4 solves this for most use cases at the expense of a small performance hit. For 2.x you are SOL however.
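A minimal sketch of opting into the spawn start method, assuming hypothetical functions `square` and `squares_in_fresh_processes`. Each worker is a fresh interpreter rather than a forked copy of the (possibly multithreaded) parent, so it inherits no locks in a bad state:

```python
import multiprocessing


def square(x):
    # Must live at module level so the spawned workers can import it.
    return x * x


def squares_in_fresh_processes(values):
    """Map square over values using workers started with 'spawn'."""
    ctx = multiprocessing.get_context("spawn")  # per-context, not global
    with ctx.Pool(processes=2) as pool:
        return pool.map(square, values)
```

With spawn, the module is re-imported in every worker, so the call belongs under an `if __name__ == "__main__":` guard in a script file. The startup cost of a new interpreter per worker is the performance hit mentioned above.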
saynsedit · 9 years ago
Face the fact, fork() is fundamentally flawed!
xroche · 9 years ago
fork() is not per se "dangerous" in such a context. You just have to be careful enough to use only async-signal-safe functions, as you would in a signal handler (see https://www.securecoding.cert.org/confluence/display/c/SIG30...).
Typically calls such as malloc(), printf() etc. are strictly forbidden in the child after a fork().
lsiebert · 9 years ago
If there are specific issues with locks from dead child processes that deadlock, maybe the locks could be addressed specifically.
zzzcpan · 9 years ago
How to use fork safely:
1. Only use fork to immediately call exec.
2. Fork a worker at the beginning of your program, before there can be other threads.
3. Only use fork in toy programs.
4. Stop writing broken multithreaded code altogether or, if you must, at least run an event loop per core/thread and use a wrapper for fork() to put the system into a fork()able state before forking. It's nice and reliable.
Multithreading by itself is just not a high-level concept to be used reliably by programs.
known · 9 years ago
https://en.wikipedia.org/wiki/Thrashing_(computer_science)

news.ycombinator.com/item?id=12302539