This is a tool that should make the Linux kernel's NetEffect RNIC driver (nes)
return `NULL` to the uverbs layer. A `NULL` pointer is not an expected return
value there, which leads to a `NULL` pointer dereference in `uverbs_create_qp()`
(and, under specific circumstances, later in `ib_destroy_qp()`).

While the tool can be used with any InfiniBand, iWARP, RoCE or RDMA adapter
(HCA), only the NetEffect driver should be able to trigger the `NULL` pointer
dereference in the upper layer.

To test, a NetEffect HCA (aka NetEffect NE020 10Gb Accelerated Ethernet Adapter
(iWARP RNIC), aka Intel NetEffect Ethernet Server Cluster Adapter) is needed,
along with the matching kernel driver (`iw_nes`, enabled with
`CONFIG_INFINIBAND_NES=[my]`) and userspace driver (`libnes` library).

The tool can be built with:

The tool can be executed directly:

    $ ./ib-hw-nes-create-qp-null

Here's the output of a test run on a system with a QLogic^WIntel InfiniBand

    $ ./ib-hw-nes-create-qp-null

    Memory mapped @ 0x7f4ed4b04000 [page 0]
                  @ 0x7f4ed4b05000 [page 1]
                  @ 0x7f4ed4b06000 [page 2]
                  @ 0x7f4ed4b07000 [page 3]
                  @ 0x7f4ed4b08000 [page 4]
    Unmapping @ 0x7f4ed4b05000 [page 1]
    Unmapping @ 0x7f4ed4b07000 [page 3]
    Using unmmaped page @ 0x7f4ed4b07000 [page 3] for response
    Response @ 0x7f4ed4b06fe0 [page 2]
    Using unmmaped page @ 0x7f4ed4b05000 [page 1] for command
    Command @ 0x7f4ed4b04fc0 [page 0]
    CREATE_QP : sret = 64 errno = 0 : SUCCESS
    DESTROY_QP : sret = 24, errno = 0 : SUCCESS

In [drivers/infiniband/hw/nes/nes_verbs.c][1], function [`nes_create_qp()`][2] calls
[`ib_copy_from_udata()`][3], which in turn calls [`copy_from_user()`][4].
If the pointer [`udata->inbuf`][5] does not point to a valid userspace page,
[`ib_copy_from_udata()`][3] fails and returns `-EFAULT`.

[1]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/hw/nes/nes_verbs.c?id=v3.14-rc5 "nes_verbs.c"
[2]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/hw/nes/nes_verbs.c?id=v3.14-rc5#n1103 "nes_create_qp()"
[3]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/hw/nes/nes_verbs.c?id=v3.14-rc5#n1185 "calling ib_copy_from_udata()"
[4]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/rdma/ib_verbs.h?id=v3.14-rc5#n1506 "ib_copy_from_udata()"
[5]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/rdma/ib_verbs.h?id=v3.14-rc5#n1000 "struct ib_udata, inbuf member"

Then [`nes_create_qp()`][2] [returns `NULL`][6].

[6]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/hw/nes/nes_verbs.c?id=v3.14-rc5#n1189 "returns NULL;"

In [drivers/infiniband/core/uverbs_cmd.c][7], function [`uverbs_create_qp()`][8],
which [called][9] [`nes_create_qp()`][2] through the [`device->create_qp` function pointer][10],
[tests its return value][11] using [`IS_ERR()`][12], but does not check for `NULL`.

[7]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5 "uverbs_cmd.c"
[8]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n1489 "uverbs_create_qp()"
[9]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n1589 "call device->create_qp()"
[10]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/rdma/ib_verbs.h?id=v3.14-rc5#n1361 "struct ib_device, create_qp member"
[11]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n1591 "test return value with IS_ERR()"
[12]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/err.h?id=v3.14-rc5#n33 "IS_ERR()"

In the [following lines][13] of [`uverbs_create_qp()`][8], the [`NULL` pointer is dereferenced
to access various fields of the `struct ib_qp`][13].

[13]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n1597 "uverbs_create_qp() access to fields"

In most cases, the memory page at address 0 is not mapped, so the kernel will
report an Oops and terminate the test program.

Unfortunately, if a memory page is mapped by userspace at 0x0,
[`uverbs_create_qp()`][8] would [continue and record the `NULL` pointer][14] in
[`struct idr ib_uverbs_qp_idr`][15] with [`idr_add_uobj()`][16].

[14]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n1617 "NULL pointer passed to idr_add_uobj()"
[15]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_main.c?id=v3.14-rc5#n73 "struct idr ib_uverbs_qp_idr"
[16]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n118 "idr_add_uobj()"

Then [`uverbs_create_qp()`][8] returns [the QP handle as allocated][17] by
[`idr_add_uobj()`][16] to userspace. (This is only possible if the page holding the
response buffer is valid and mapped, otherwise [`ib_destroy_qp()`][18] is called

[17]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n1623 "QP handle"
[18]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/verbs.c?id=v3.14-rc5#n975 "ib_destroy_qp()"

[`uverbs_create_qp()`][8] will return success to userspace, with a valid
QP handle in the response buffer.

But most uverbs won't be able to use the QP handle returned to userspace:
they retrieve the QP by its handle using [`idr_read_qp()`][19], and treat a
`NULL` pointer returned by [`idr_read_obj()`][20] as an invalid or
non-matching handle.

[19]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n238 "idr_read_qp()"
[20]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n205 "idr_read_obj()"

The only uverb that can be called against the handle is [`uverbs_destroy_qp()`][21],
since it [uses][22] [`idr_write_uobj()`][23] [directly][22], allowing it to [retrieve
the `NULL` pointer][24]. The function then [calls][25] [`ib_destroy_qp()`][18] with
the `NULL` pointer as its `struct ib_qp *` argument.

[21]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n1987 "uverbs_destroy_qp()"
[22]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n2003 "call idr_write_uobj()"
[23]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n181 "idr_write_uobj()"
[24]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n2006 "retrieve NULL pointer from QP handle"
[25]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n2014 "call ib_destroy_qp()"

[`ib_destroy_qp()`][18] [dereferences the `NULL` pointer][26] to access the [`struct
ib_device` holding the function pointer `destroy_qp()`][27] in order to [call it][26].

[26]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/verbs.c?id=v3.14-rc5#n993 "dereferences NULL pointer to call function pointer device->destroy_qp()"
[27]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/rdma/ib_verbs.h?id=v3.14-rc5#n1372 "struct ib_device, destroy_qp member"

You can probably guess the end of the story (if not, see the links below for
related documentation).

Why is it not a big deal?
=========================

To turn this into an exploit, page 0 must be mapped and accessible by both
userspace and the kernel: it's not possible by default, thanks to features added to Linux

- `vm.mmap_min_addr` is the lowest address the kernel will allow you to map

        $ sysctl vm.mmap_min_addr

        $ cat /proc/sys/vm/mmap_min_addr

        # sysctl -w vm.mmap_min_addr=0

        # echo 0 > /proc/sys/vm/mmap_min_addr

- SELinux enforces another limit on context:

        $ getsebool mmap_low_allowed

        # setsebool mmap_low_allowed=on

- [PaX][PAX] UDEREF (and KERNEXEC) (within a [grsecurity][GRSECURITY] kernel) will disallow direct access
  to userspace from kernel mode, so the kernel won't be able to access page 0
  with a `NULL` pointer dereference.

- IA32 (e.g. x86) [SMAP][SMAP] (and [SMEP][SMEP]) can also defeat a `NULL` pointer dereference
  from kernel mode, [just like PaX UDEREF (and KERNEXEC)][PAXSMAPSMEP].

[PAX]: http://en.wikipedia.org/wiki/PaX "PaX"
[GRSECURITY]: http://grsecurity.net "grsecurity"
[SMAP]: https://lwn.net/Articles/517475/ "Supervisor Mode Access Prevention"
[SMEP]: http://vulnfactory.org/blog/2011/06/05/smep-what-is-it-and-how-to-beat-it-on-linux/ "Supervisor Mode Execution Prevention"
[PAXSMAPSMEP]: http://forums.grsecurity.net/viewtopic.php?f=7&t=3046

What's needed to turn the test program into an exploit?
=======================================================

Once the nes driver has returned `NULL` and the create QP uverb has returned
the QP handle to userspace, an exploit has to overwrite the `struct ib_device`
pointer and make it point to a `struct ib_device` of its own, in which the
`destroy_qp` function pointer refers to an exploit-controlled function.
Then the tool has to call the destroy QP uverb to make the kernel
call that function.

This also demonstrates that the nes driver does not check `udata->inlen` before
trying to access the userspace command buffer. Likewise, it does not check
`udata->outlen` before trying to write to the userspace response buffer.
Unfortunately, such behavior is quite common in InfiniBand drivers.

- ["Bypassing Linux' `NULL` pointer dereference exploit prevention (`mmap_min_addr`)"](http://blog.cr0.org/2009/06/bypassing-linux-null-pointer.html),
  by Julien Tinnes, June 26, 2009

- ["Fun with `NULL` pointers, part 1"](http://lwn.net/Articles/342330/),
  by Jonathan Corbet, July 20, 2009

- ["Fun with `NULL` pointers, part 2"](http://lwn.net/Articles/342420/),
  by Jonathan Corbet, July 21, 2009

- ["`mmap_min_addr` on SELinux and non-SELinux systems"](http://eparis.livejournal.com/606.html),
  by Eric Paris, July 21, 2009

- ["Confining the unconfined. Oxymoron?"](http://danwalsh.livejournal.com/30084.html),
  by Dan Walsh, July 21, 2009

- ["Bug 511143 - selinux policy allows addr 0 mappings by default"](https://bugzilla.redhat.com/show_bug.cgi?id=511143),

- ["Security-Enhanced Linux (SELinux) policy and the `mmap_min_addr` protection (CVE-2009-2695)"](https://access.redhat.com/site/articles/17995)

- ["How do I mitigate against `NULL` pointer dereference vulnerabilities?"](https://access.redhat.com/site/articles/20484)

- ["SELinux hardening for `mmap_min_addr` protections"](http://eparis.livejournal.com/891.html),
  by Eric Paris, August 26, 2009

- ["Much ado about `NULL`: Exploiting a kernel `NULL` dereference"](https://blogs.oracle.com/ksplice/entry/much_ado_about_null_exploiting1),
  by Nelson Elhage, April 12, 2010

Yann Droneaud <ydroneaud@opteya.com>

Copyright (C) 2014 OPTEYA SAS

This software is available to you under a choice of one of two licenses.
You may choose to be licensed under the terms of the OpenIB.org BSD license
or the GNU General Public License (GPL) Version 2, both included in file