Introduction
============

This is a tool that should make the Linux kernel's NetEffect RNIC driver (nes) return `NULL` to the uverbs layer. A `NULL` pointer is not an expected return value there, so it leads to a `NULL` pointer dereference in `uverbs_create_qp()` (and, under specific circumstances, later in `ib_destroy_qp()`).

The tool can be run against any InfiniBand, iWARP, or RoCE RDMA adapter (HCA), but only the NetEffect driver should be able to trigger the `NULL` pointer dereference in the upper layer.

To test, a NetEffect HCA (a.k.a. NetEffect NE020 10Gb Accelerated Ethernet Adapter (iWARP RNIC), a.k.a. Intel NetEffect Ethernet Server Cluster Adapter) is needed, along with the matching kernel driver (`iw_nes`, enabled with `CONFIG_INFINIBAND_NES=[my]`) and userspace driver (the `libnes` library).

Build
=====

The tool can be built with:

    $ autoreconf
    $ ./configure
    $ make

Run
===

The tool can be executed directly:

    $ ./ib-hw-nes-create-qp-null

Output
======

Here is the output of a test run on a system with a QLogic^WIntel InfiniBand HCA. Since this is not a NetEffect adapter, the bug is not triggered and both commands succeed.

    $ ./ib-hw-nes-create-qp-null
    Opening qib0
    Memory mapped
      @ 0x7f4ed4b04000 [page 0]
      @ 0x7f4ed4b05000 [page 1]
      @ 0x7f4ed4b06000 [page 2]
      @ 0x7f4ed4b07000 [page 3]
      @ 0x7f4ed4b08000 [page 4]
    Unmapping @ 0x7f4ed4b05000 [page 1]
    Unmapping @ 0x7f4ed4b07000 [page 3]
    Using unmmaped page @ 0x7f4ed4b07000 [page 3] for response
    Response @ 0x7f4ed4b06fe0 [page 2]
    Using unmmaped page @ 0x7f4ed4b05000 [page 1] for command
    Command @ 0x7f4ed4b04fc0 [page 0]
    CREATE_QP : sret = 64 errno = 0 : SUCCESS
    DESTROY_QP : sret = 24, errno = 0 : SUCCESS

Explanation
===========

In [drivers/infiniband/hw/nes/nes_verbs.c][1], the function [`nes_create_qp()`][2] calls [`ib_copy_from_udata()`][3], which in turn calls [`copy_from_user()`][4]. If the pointer [`udata->inbuf`][5] does not point to a valid userspace page, [`ib_copy_from_udata()`][3] fails and returns `-EFAULT`.

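
The `-EFAULT` path can be observed from userspace without any RDMA hardware. The sketch below is not the tool's source: it assumes a Linux system, and the helper names `map_with_hole()` and `write_from()` are hypothetical. It maps two pages, unmaps the second, and asks the kernel to copy data out of the process by writing it to a pipe, which is a stand-in for the `copy_from_user()` call made on behalf of `ib_copy_from_udata()`.

```c
#define _DEFAULT_SOURCE	/* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map two anonymous pages and unmap the second one, leaving a hole.
 * Returns the start of the first (still mapped) page, or NULL on error.
 * Hypothetical helper, for illustration only. */
static char *map_with_hole(size_t *page_size)
{
	long ps = sysconf(_SC_PAGESIZE);
	char *base;

	assert(page_size != NULL);
	if (ps <= 0)
		return NULL;

	base = mmap(NULL, 2 * (size_t)ps, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return NULL;

	if (munmap(base + ps, (size_t)ps) != 0) {
		munmap(base, 2 * (size_t)ps);
		return NULL;
	}

	*page_size = (size_t)ps;
	return base;
}

/* Ask the kernel to copy `len` bytes from `src` by writing them to a pipe.
 * Returns 0 on success, or the errno value reported by pipe() or write(). */
static int write_from(const void *src, size_t len)
{
	int fds[2];
	int err = 0;

	if (pipe(fds) != 0)
		return errno;
	if (write(fds[1], src, len) < 0)
		err = errno;
	close(fds[0]);
	close(fds[1]);
	return err;
}
```

With `base` and `ps` obtained from `map_with_hole()`, `write_from(base + ps - 64, 64)` reads the last 64 bytes of the mapped page and is expected to succeed, while `write_from(base + ps, 1)` starts in the hole and is expected to report `EFAULT`. The 64-byte figure mirrors the output above, where the command sits at `0x7f4ed4b04fc0` and ends exactly where the unmapped page 1 begins.
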

[1]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/hw/nes/nes_verbs.c?id=v3.14-rc5 "nes_verbs.c"
[2]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/hw/nes/nes_verbs.c?id=v3.14-rc5#n1103 "nes_create_qp()"
[3]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/hw/nes/nes_verbs.c?id=v3.14-rc5#n1185 "calling ib_copy_from_udata()"
[4]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/rdma/ib_verbs.h?id=v3.14-rc5#n1506 "ib_copy_from_udata()"
[5]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/rdma/ib_verbs.h?id=v3.14-rc5#n1000 "struct ib_udata, inbuf member"

Then [`nes_create_qp()`][2] [returns `NULL`][6].

[6]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/hw/nes/nes_verbs.c?id=v3.14-rc5#n1189 "returns NULL;"

In [drivers/infiniband/core/uverbs_cmd.c][7], the function [`uverbs_create_qp()`][8], which [called][9] [`nes_create_qp()`][2] through the [`device->create_qp` function pointer][10], [tests the return value][11] using [`IS_ERR()`][12], but does not check for `NULL`.

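
Why `IS_ERR()` lets `NULL` through can be shown with a userspace re-creation of the kernel helpers. The definitions below are simplified copies written for this example (the kernel versions in `include/linux/err.h` carry extra annotations and are not meant for userspace), and `classify()` is a hypothetical helper.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace re-creation of helpers from include/linux/err.h.
 * Error pointers occupy the last MAX_ERRNO addresses of the address space. */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *ERR_PTR(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)error;
}

/* True only for pointers in the error range; NULL (address 0) is not one. */
static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Variant that also rejects NULL. */
static inline int IS_ERR_OR_NULL(const void *ptr)
{
	return ptr == NULL || IS_ERR(ptr);
}

/* Hypothetical helper: how a caller that only uses IS_ERR() sees a pointer. */
static const char *classify(const void *ptr)
{
	return IS_ERR(ptr) ? "error" : "treated as a valid object";
}
```

In this model `classify(ERR_PTR(-EFAULT))` reports an error, but `classify(NULL)` reports a valid object, which is the situation `uverbs_create_qp()` ends up in. Returning `ERR_PTR(-EFAULT)` from the driver, or testing with `IS_ERR_OR_NULL()` in the caller, would both close the gap in the model.
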

[7]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5 "uverbs_cmd.c"
[8]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n1489 "uverbs_create_qp()"
[9]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n1589 "call device->create_qp()"
[10]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/rdma/ib_verbs.h?id=v3.14-rc5#n1361 "struct ib_device, create_qp member"
[11]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n1591 "test return value with IS_ERR()"
[12]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/linux/err.h?id=v3.14-rc5#n33 "IS_ERR()"

In the [following lines][13] of [`uverbs_create_qp()`][8], [the `NULL` pointer is dereferenced to access various fields of the `struct ib_qp`][13].

[13]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n1597 "uverbs_create_qp() access to fields"

In most cases, no memory page is mapped at address 0, so the kernel will report an Oops and terminate the test program.

Unfortunately, if a memory page is mapped by userspace at 0x0, [`uverbs_create_qp()`][8] will [continue and record the `NULL` pointer][14] in [`struct idr ib_uverbs_qp_idr`][15] with [`idr_add_uobj()`][16].

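
Whether a process may map page 0 at all can be probed with a plain `mmap()` call. The following is a minimal sketch, not part of the tool, and `try_map_page_zero()` is a hypothetical name. On a system with default settings the call is expected to fail, typically with `EPERM` or `EACCES`, because of the protections described in the section that follows the explanation.

```c
#define _DEFAULT_SOURCE	/* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Try to map one anonymous page at address 0.
 * Returns 0 if the mapping was allowed (it is released again right away),
 * or the errno value reported by mmap() otherwise. */
static int try_map_page_zero(void)
{
	long ps = sysconf(_SC_PAGESIZE);
	void *p;

	if (ps <= 0)
		return EINVAL;

	/* With MAP_FIXED, a NULL address is a request for address 0 itself. */
	p = mmap(NULL, (size_t)ps, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (p == MAP_FAILED)
		return errno;

	/* MAP_FIXED returns exactly the requested address. */
	assert(p == NULL);
	munmap(p, (size_t)ps);
	return 0;
}
```

A return value of 0 means the current process could place a page at address 0, which is the precondition for the scenario described above. Any other value is the reason the kernel refused.
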

[14]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n1617 "NULL pointer passed to idr_add_uobj()"
[15]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_main.c?id=v3.14-rc5#n73 "struct idr ib_uverbs_qp_idr"
[16]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n118 "idr_add_uobj()"

Then [`uverbs_create_qp()`][8] returns [the QP handle as allocated][17] by [`idr_add_uobj()`][16] to userspace. (This is only possible if the pages holding the response buffer are valid and mapped; otherwise [`ib_destroy_qp()`][18] is called early on `NULL`.)

[17]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n1623 "QP handle"
[18]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/verbs.c?id=v3.14-rc5#n975 "ib_destroy_qp()"

[`uverbs_create_qp()`][8] will return success to userspace, with a valid QP handle in the response buffer. However, most uverbs won't be able to use the QP handle returned to userspace, since they try to retrieve the QP by its handle using [`idr_read_qp()`][19] and check for the `NULL` pointer that [`idr_read_obj()`][20] returns for an invalid or non-matching handle.

[19]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n238 "idr_read_qp()"
[20]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n205 "idr_read_obj()"

The only uverb that can be called against the handle is [`uverbs_destroy_qp()`][21], since it [uses][22] [`idr_write_uobj()`][23] [directly][22], allowing it to [retrieve the `NULL` pointer][24]. The function then [calls][25] [`ib_destroy_qp()`][18] with the `NULL` pointer as its `struct ib_qp *` argument.

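
The difference between the two lookup paths can be pictured with a small userspace model. This is a deliberately simplified sketch, not kernel code: `struct uobject`, `read_obj()`, `write_uobj()` and `add_uobj()` are hypothetical stand-ins for `struct ib_uobject`, `idr_read_obj()`, `idr_write_uobj()` and `idr_add_uobj()`, a fixed table replaces the IDR, and the model assumes that the table stores a wrapper whose `object` member is the `NULL` QP pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ib_uobject: a wrapper around the object. */
struct uobject {
	void *object;	/* would point to the struct ib_qp; NULL in this scenario */
};

#define MAX_HANDLES 4
static struct uobject *table[MAX_HANDLES];	/* stand-in for the IDR */

/* Register a wrapper and return its handle, or -1 if the table is full. */
static int add_uobj(struct uobject *uobj)
{
	assert(uobj != NULL);
	for (int i = 0; i < MAX_HANDLES; i++) {
		if (!table[i]) {
			table[i] = uobj;
			return i;
		}
	}
	return -1;
}

/* Lookup returning the wrapper itself (modelled on idr_write_uobj()). */
static struct uobject *write_uobj(int handle)
{
	if (handle < 0 || handle >= MAX_HANDLES)
		return NULL;
	return table[handle];
}

/* Lookup returning the wrapped object (modelled on idr_read_obj()).
 * The caller takes a NULL result to mean "no such handle". */
static void *read_obj(int handle)
{
	struct uobject *uobj = write_uobj(handle);

	return uobj ? uobj->object : NULL;
}
```

After `add_uobj()` registers a wrapper whose `object` is `NULL`, `read_obj()` reports the handle as unknown, while `write_uobj()` still finds the wrapper and hands its caller a `NULL` object. In the model, that is why only the destroy path gets far enough to pass `NULL` on.
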

[21]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n1987 "uverbs_destroy_qp()"
[22]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n2003 "call idr_write_uobj()"
[23]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n181 "idr_write_uobj()"
[24]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n2006 "retrieve NULL pointer from QP handle"
[25]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/uverbs_cmd.c?id=v3.14-rc5#n2014 "call ib_destroy_qp()"

[`ib_destroy_qp()`][18] [dereferences the `NULL` pointer][26] to access the [`struct ib_device` holding the function pointer `destroy_qp()`][27] in order to [call it][26].

[26]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core/verbs.c?id=v3.14-rc5#n993 "dereferences NULL pointer to call function pointer device->destroy_qp()"
[27]: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/include/rdma/ib_verbs.h?id=v3.14-rc5#n1372 "struct ib_device, destroy_qp member"

You can probably guess the end of the story (if not, see the links below for related documentation).

Why is it not a big deal?
=========================

To turn this into an exploit, page 0 must be mapped and accessible by both userspace and the kernel. This is not possible by default, thanks to features added to the Linux kernel:

- `vm.mmap_min_addr` is the minimal address the kernel will allow you to map
    - get:
        - with `sysctl`

              $ sysctl vm.mmap_min_addr

        - with `cat`

              $ cat /proc/sys/vm/mmap_min_addr

    - set:
        - with `sysctl`

              # sysctl -w vm.mmap_min_addr=0

        - with `echo`

              # echo 0 > /proc/sys/vm/mmap_min_addr

- SELinux enforces another limit, based on the context:
    - get:

          $ getsebool mmap_low_allowed

    - set:

          # setsebool mmap_low_allowed=on

- [PaX][PAX] UDEREF (and KERNEXEC) (within a [grsecurity][GRSECURITY] kernel) disallows direct access to userspace from kernel mode, so the kernel won't be able to access page 0 through a `NULL` pointer dereference.
- IA32 (e.g. x86) [SMAP][SMAP] (and [SMEP][SMEP]) can also defeat a `NULL` pointer dereference from kernel mode, [just like PaX UDEREF (and KERNEXEC)][PAXSMAPSMEP].

[PAX]: http://en.wikipedia.org/wiki/PaX "PaX"
[GRSECURITY]: http://grsecurity.net "grsecurity"
[SMAP]: https://lwn.net/Articles/517475/ "Supervisor Mode Access Prevention"
[SMEP]: http://vulnfactory.org/blog/2011/06/05/smep-what-is-it-and-how-to-beat-it-on-linux/ "Supervisor Mode Execution Prevention"
[PAXSMAPSMEP]: http://forums.grsecurity.net/viewtopic.php?f=7&t=3046

What's needed to turn the test program into an exploit?
=======================================================

Once the nes driver has returned `NULL` and the create QP uverb has returned the QP handle to userspace, an exploit would have to overwrite the `struct ib_device` pointer and make it point to a `struct ib_device` of its own, in which the `destroy_qp` function pointer refers to an exploit-controlled function. Then the tool has to call the destroy QP uverb to make the kernel call that function.

Aside
=====

This also demonstrates that the nes driver does not check `udata->inlen` before trying to access the userspace command buffer.
Likewise, it does not check `udata->outlen` before trying to write to the userspace response buffer. Unfortunately, those behaviors are quite common in InfiniBand drivers.

Links
=====

- ["Bypassing Linux' `NULL` pointer dereference exploit prevention (`mmap_min_addr`)"](http://blog.cr0.org/2009/06/bypassing-linux-null-pointer.html), by Julien Tinnes, June 26, 2009
- ["Fun with `NULL` pointers, part 1"](http://lwn.net/Articles/342330/), by Jonathan Corbet, July 20, 2009
- ["Fun with `NULL` pointers, part 2"](http://lwn.net/Articles/342420/), by Jonathan Corbet, July 21, 2009
- ["`mmap_min_addr` on SELinux and non-SELinux systems"](http://eparis.livejournal.com/606.html), by Eric Paris, July 21, 2009
- ["Confining the unconfined. Oxymoron?"](http://danwalsh.livejournal.com/30084.html), by Dan Walsh, July 21, 2009
- ["Bug 511143 - selinux policy allows addr 0 mappings by default"](https://bugzilla.redhat.com/show_bug.cgi?id=511143), July 13, 2009
- ["Security-Enhanced Linux (SELinux) policy and the `mmap_min_addr` protection (CVE-2009-2695)"](https://access.redhat.com/site/articles/17995)
- ["How do I mitigate against `NULL` pointer dereference vulnerabilities?"](https://access.redhat.com/site/articles/20484)
- ["SELinux hardening for `mmap_min_addr` protections"](http://eparis.livejournal.com/891.html), by Eric Paris, August 26, 2009
- ["Much ado about `NULL`: Exploiting a kernel `NULL` dereference"](https://blogs.oracle.com/ksplice/entry/much_ado_about_null_exploiting1), by Nelson Elhage, April 12, 2010

Author
======

Yann Droneaud

License
=======

Copyright (C) 2014 OPTEYA SAS

This software is available to you under a choice of one of two licenses. You may choose to be licensed under the terms of the OpenIB.org BSD license or the GNU General Public License (GPL) Version 2, both included in the file COPYING.