Can the sys_execve() system call in the Linux kernel receive either an absolute or a relative path?
Should sys_execve() in kernel-level code receive an absolute or a relative path for the filename parameter?
sys_execve can take either absolute or relative paths
Let's verify it in the following ways:
- an experiment with the raw system call
- reading the kernel source
- running GDB on the kernel + QEMU to verify our source analysis
Experiment
main.c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
int main(void) {
    syscall(__NR_execve, "../main2.out", NULL, NULL);
}
main2.c
#include <stdio.h>
int main(void) {
    puts("hello main2");
}
Compile and run:
gcc -o main.out main.c
gcc -o ../main2.out main2.c
./main.out
Output:
hello main2
Tested in Ubuntu 16.10.
Kernel source
First, go into the kernel tree and run:
git grep '"\.\."' fs
We focus on fs because we know that execve is defined there.
This immediately gives results like https://github.com/torvalds/linux/blob/v4.9/fs/namei.c#L1759, which clearly indicate that the kernel knows about ..:
/*
* "." and ".." are special - ".." especially so because it has
* to be able to know about the current root directory and
* parent relationships.
*/
We then look at the definition of execve at https://github.com/torvalds/linux/blob/v4.9/fs/exec.c#L1869, and the first thing it does is call getname() on the input path:
SYSCALL_DEFINE3(execve,
        const char __user *, filename,
        const char __user *const __user *, argv,
        const char __user *const __user *, envp)
{
    return do_execve(getname(filename), argv, envp);
}
getname is defined in fs/namei.c, which is the file where the above ".." quote came from.
I haven't bothered to follow the full call path, but I bet that the path handed over by getname ends up going through .. resolution in that same file.
follow_dotdot looks especially promising.
GDB + QEMU
Reading the source is great, but we can never be sure that the code paths are actually used.
There are two ways to check that:
- printk, recompile, printk, recompile
- GDB + QEMU. Setup is a bit rougher, but once done it is pure bliss
First get the setup working as explained at: How to debug the Linux kernel with GDB and QEMU?
Now, we will use two programs:
init.c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
int main(void) {
    chdir("d");
    syscall(__NR_execve, "../b.out", NULL, NULL);
}
b.c
#include <unistd.h>
#include <stdio.h>
int main(void) {
    puts("hello");
    sleep(0xFFFFFFFF);
}
The rootfs file structure should look like:
init
b.out
d/
Once GDB is running, we will do:
b sys_execve
c
x/s filename
This outputs ../b.out, so we know it is the right syscall.
Now, the interesting ".." comment we had seen before was in a function called walk_component, so let's see if that is called:
b walk_component
c
And yes, we hit it.
If we read a bit into it, we see a call:
error = handle_dots(nd, nd->last_type);
which sounds promising and does:
static inline int handle_dots(struct nameidata *nd, int type)
{
    if (type == LAST_DOTDOT) {
        if (!nd->root.mnt)
            set_root(nd);
        if (nd->flags & LOOKUP_RCU) {
            return follow_dotdot_rcu(nd);
        } else
            return follow_dotdot(nd);
    }
    return 0;
}
So what is it that sets this type (nd->last_type) to LAST_DOTDOT?
Well, search the source for = LAST_DOTDOT, and we find that link_path_walk is doing it.
And even better: bt says that link_path_walk is a caller, so it will be easy to understand what is going on now.
In link_path_walk, we see:
if (name[0] == '.') switch (hashlen_len(hash_len)) {
    case 2:
        if (name[1] == '.') {
            type = LAST_DOTDOT;
and thus the mystery is solved: ".." was not the check that was being done, which foiled our previous greps!
Instead, the two dots were being checked separately (because . is a subcase).