How does Linux find a program to run
2024年7月4日
In a shell, if I type a-command-to-run
, how does Linux find this program?
Exit code
Z Shell has an option print_exit_value. If you set it, Z Shell will print the exit code if the exit code is not 0. See Listing .
If you load oh-my-zsh, the output may be [1] 179519 exit 127 some-command
. The deviation from standard zsh is seemingly due to async_prompt.zsh. [1]
is a file descriptor, and 179519 is the process id. Z Shell follows the convention of Bash that if a command is not found, the exit code is 127.[1]
The PATH environment variable
We know that in Bash, the PATH
Bourne variable controls the directories for command lookup. In addition to PATH
of type string, Z Shell provides path
of type array so that users can add or remove directories easily. Z Shell internally synchronizes path
and PATH
.
Despite that the PATH variable is special to Bourne shell, Bash, and Z Shell[2][3], is it special to the Linux (kernel) itself? We have to check the manual.
The exec()
family of functions is used to start a program[4]. I use execl()
to write C++ program.
Save Listing as p.cpp
and run Listing . If you already installed C++ development tools, you don’t have to install build-essential.
In Listing , I ask to run “uname” but I don’t give a full path. The execution result shows that execl()
failed and set errno to “ENOENT: No such file or directory”[5].
How execl()
looks up a program is defined in path_resolution. In a word, if pathname doesn’t start with /, it is assumed to be relative to the current directory. My “uname” contains no “.”, so execl()
just tries to run uname under the current folder, and since the current folder doesn’t have this file, the execution failed. Notice this whoe process does not involve the so-called PATH
environment variable.
I can make a.out
success. Don’t have to recompile the binary. I only need to provide my own version of uname under the current directory.
Listing shows that I created an executable file named uname with shebang, and echo something. I then run a.out
again. execl()
this time found my uname, and executed it.
How, we have hands-on experience that the PATH environment variable is not special to Linux. The PATH
variable is probabaly rooted in Bourne Shell and adopted by Bash and Z Shell.
How other programs use PATH
Take whereis, which attempts to locate the desired program in the hard-coded Linux places, as well as in the places specified by $PATH
and $MANPATH
.
My own uname is in ~/dev
, so I added this path to the path environment/shell variable. Then, whereis can find two uname programs, one in /usr/bin
, and the other in /home/vm/dev
. Next, I nuke path
, whereis cannot be found. But if I specify the path of whereis, whereis can run and can find uname in the hard-coded system path, as Listing shows.
How PHP uses PATH?
env
is a utility to run a program in a modified environment. Especially, it has an option --ignore-environment
(-i
) in order to start with an empty environment. For instance, env --ignore-environment printenv
will print nothing. Similarly, PHP function getenv()
will return nothing under -i
.
$ env php -r 'print_r(getenv());'
Array
(
[USER] => vm
[LOGNAME] => vm
[HOME] => /home/vm
[PATH] => /home/vm/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:
/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
...
)
$ env -i php -r 'print_r(getenv());'
Array
(
)
$handle = proc_open(array('env'), [
0 => ["pipe", "r"], // stdin
1 => ["pipe", "w"], // stdout
2 => ["pipe", "w"], // stderr
], $pipes, null, null, ['bypass_shell' => true]);
if (!is_resource($handle)) {
echo 'cannot create process';
return;
}
$exit_code = get_proc_output($handle, $pipes, $stdout, $stderr);
echo "exit code: $exit_code\n";
echo "\nstdout:\n";
echo $stdout;
echo "\nstderr:\n";
echo $stderr;
$ env -i php p.php
exit code: 0
stdout:
stderr:
This behavior is quite different. Even though the PATH environment variable is unset, php can still find the printenv program. Let dig into the source code of PHP.
The PHP function proc_open()
calls Linux kernel function posix_spawnp()
if an array is provided as the command[6]. Then according to manpages, posix_spawnp()
helps searches for the specified file in the PATH
environment variable (in the same way as for execvp(3)), if the executable file is a simple filename[7].
However, we have experimented that the PHP function getenv() returns empty list. How does posix_spawnp()
utilize the PATH
environment variable?
https://manpages.ubuntu.com/manpages/focal/en/man3/execvp.3.html says “If this [PATH] variable isn’t defined, the path list defaults to a list that includes the directories returned by confstr(_CS_PATH)
”
参考资料
- . Bash Reference Manual. . [2024-07-08].↑
- . The Z Shell Manual. . 2022-05-14 [2024-07-08].↑
- . Bash Reference Manual. . 2022-09-19 [2024-07-08].↑
- . Ubuntu Manpage: execl, execlp, execle, execv, execvp, execvpe - execute a file. . [2024-07-08].↑
- . Ubuntu Manpage: errno - number of last error. . [2024-07-08].↑
- . php-src/ext/standard/proc_open.c at· php/php-src. . [2024-07-08].↑
- . posix_spawn, posix_spawnp - spawn a process. . [2024-07-08].↑