summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw
Commit message (Collapse)AuthorAge
...
* | | | | | | | infiniband: Replace usnic_ib_netdev_event_to_string() with netdev_cmd_to_name()Kirill Tkhai2018-03-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function just calls netdev_cmd_to_name(). Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | net: Make NETDEV_XXX commands enum { }Kirill Tkhai2018-03-26
|/ / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is preparation to drop NETDEV_UNREGISTER_FINAL. Since the cmd is used in usnic_ib_netdev_event_to_string() to get cmd name, after plain removing NETDEV_UNREGISTER_FINAL from everywhere, we'd have holes in event2str[] in this function. Instead of that, let's make NETDEV_XXX commands names available for everyone, and to define netdev_cmd_to_name() in the way we won't have to shaffle names after their numbers are changed. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2018-03-23
|\ \ \ \ \ \ \ | |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fun set of conflict resolutions here... For the mac80211 stuff, these were fortunately just parallel adds. Trivially resolved. In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the function phy_disable_interrupts() earlier in the file, whilst in 'net-next' the phy_error() call from this function was removed. In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the 'rt_table_id' member of rtable collided with a bug fix in 'net' that added a new struct member "rt_mtu_locked" which needs to be copied over here. The mlxsw driver conflict consisted of net-next separating the span code and definitions into separate files, whilst a 'net' bug fix made some changes to that moved code. The mlx5 infiniband conflict resolution was quite non-trivial, the RDMA tree's merge commit was used as a guide here, and here are their notes: ==================== Due to bug fixes found by the syzkaller bot and taken into the for-rc branch after development for the 4.17 merge window had already started being taken into the for-next branch, there were fairly non-trivial merge issues that would need to be resolved between the for-rc branch and the for-next branch. This merge resolves those conflicts and provides a unified base upon which ongoing development for 4.17 can be based. Conflicts: drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f9524 (IB/mlx5: Fix cleanup order on unload) added to for-rc and commit b5ca15ad7e61 (IB/mlx5: Add proper representors support) add as part of the devel cycle both needed to modify the init/de-init functions used by mlx5. To support the new representors, the new functions added by the cleanup patch needed to be made non-static, and the init/de-init list added by the representors patch needed to be modified to match the init/de-init list changes made by the cleanup patch. Updates: drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function prototypes added by representors patch to reflect new function names as changed by cleanup patch drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init stage list to match new order from cleanup patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | infiniband: bnxt_re: use BIT_ULL() for 64-bit bit masksArnd Bergmann2018-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 32-bit targets, we otherwise get a warning about an impossible constant integer expression: In file included from include/linux/kernel.h:11, from include/linux/interrupt.h:6, from drivers/infiniband/hw/bnxt_re/ib_verbs.c:39: drivers/infiniband/hw/bnxt_re/ib_verbs.c: In function 'bnxt_re_query_device': include/linux/bitops.h:7:24: error: left shift count >= width of type [-Werror=shift-count-overflow] #define BIT(nr) (1UL << (nr)) ^~ drivers/infiniband/hw/bnxt_re/bnxt_re.h:61:34: note: in expansion of macro 'BIT' #define BNXT_RE_MAX_MR_SIZE_HIGH BIT(39) ^~~ drivers/infiniband/hw/bnxt_re/bnxt_re.h:62:30: note: in expansion of macro 'BNXT_RE_MAX_MR_SIZE_HIGH' #define BNXT_RE_MAX_MR_SIZE BNXT_RE_MAX_MR_SIZE_HIGH ^~~~~~~~~~~~~~~~~~~~~~~~ drivers/infiniband/hw/bnxt_re/ib_verbs.c:149:25: note: in expansion of macro 'BNXT_RE_MAX_MR_SIZE' ib_attr->max_mr_size = BNXT_RE_MAX_MR_SIZE; ^~~~~~~~~~~~~~~~~~~ Fixes: 872f3578241d ("RDMA/bnxt_re: Add support for MRs with Huge pages") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | infiniband: qplib_fp: fix pointer castArnd Bergmann2018-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Building for a 32-bit target results in a couple of warnings from casting between a 32-bit pointer and a 64-bit integer: drivers/infiniband/hw/bnxt_re/qplib_fp.c: In function 'bnxt_qplib_service_nq': drivers/infiniband/hw/bnxt_re/qplib_fp.c:333:23: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] bnxt_qplib_arm_srq((struct bnxt_qplib_srq *)q_handle, ^ drivers/infiniband/hw/bnxt_re/qplib_fp.c:336:12: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] (struct bnxt_qplib_srq *)q_handle, ^ In file included from include/linux/byteorder/little_endian.h:5, from arch/arm/include/uapi/asm/byteorder.h:22, from include/asm-generic/bitops/le.h:6, from arch/arm/include/asm/bitops.h:342, from include/linux/bitops.h:38, from include/linux/kernel.h:11, from include/linux/interrupt.h:6, from drivers/infiniband/hw/bnxt_re/qplib_fp.c:39: drivers/infiniband/hw/bnxt_re/qplib_fp.c: In function 'bnxt_qplib_create_srq': include/uapi/linux/byteorder/little_endian.h:31:43: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast] #define __cpu_to_le64(x) ((__force __le64)(__u64)(x)) ^ include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64' #define cpu_to_le64 __cpu_to_le64 ^~~~~~~~~~~~~ drivers/infiniband/hw/bnxt_re/qplib_fp.c:569:19: note: in expansion of macro 'cpu_to_le64' req.srq_handle = cpu_to_le64(srq); Using a uintptr_t as an intermediate works on all architectures. Fixes: 37cb11acf1f7 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | IB/mlx5: Fix cleanup order on unloadMark Bloch2018-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On load we create private CQ/QP/PD in order to be used by UMR, we create those resources after we register ourself as an IB device, and we destroy them after we unregister as an IB device. This was changed by commit 16c1975f1032 ("IB/mlx5: Create profile infrastructure to add and remove stages") which moved the destruction before we unregistration. This allowed to trigger an invalid memory access when unloading mlx5_ib while there are open resources: BUG: unable to handle kernel paging request at 00000001002c012c ... Call Trace: mlx5_ib_post_send_wait+0x75/0x110 [mlx5_ib] __slab_free+0x9a/0x2d0 delay_time_func+0x10/0x10 [mlx5_ib] unreg_umr.isra.15+0x4b/0x50 [mlx5_ib] mlx5_mr_cache_free+0x46/0x150 [mlx5_ib] clean_mr+0xc9/0x190 [mlx5_ib] dereg_mr+0xba/0xf0 [mlx5_ib] ib_dereg_mr+0x13/0x20 [ib_core] remove_commit_idr_uobject+0x16/0x70 [ib_uverbs] uverbs_cleanup_ucontext+0xe8/0x1a0 [ib_uverbs] ib_uverbs_cleanup_ucontext.isra.9+0x19/0x40 [ib_uverbs] ib_uverbs_remove_one+0x162/0x2e0 [ib_uverbs] ib_unregister_device+0xd4/0x190 [ib_core] __mlx5_ib_remove+0x2e/0x40 [mlx5_ib] mlx5_remove_device+0xf5/0x120 [mlx5_core] mlx5_unregister_interface+0x37/0x90 [mlx5_core] mlx5_ib_cleanup+0xc/0x225 [mlx5_ib] SyS_delete_module+0x153/0x230 do_syscall_64+0x62/0x110 entry_SYSCALL_64_after_hwframe+0x21/0x86 ... We restore the original behavior by breaking the UMR stage into two parts, pre and post IB registration stages, this way we can restore the original functionality and maintain clean separation of logic between stages. Fixes: 16c1975f1032 ("IB/mlx5: Create profile infrastructure to add and remove stages") Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | | | | RDMA/mlx5: Fix crash while accessing garbage pointer and freed memoryLeon Romanovsky2018-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The failure in rereg_mr flow caused to set garbage value (error value) into mr->umem pointer. This pointer is accessed at the release stage and it causes to the following crash. There is not enough to simply change umem to point to NULL, because the MR struct is needed to be accessed during MR deregistration phase, so delay kfree too. [ 6.237617] BUG: unable to handle kernel NULL pointer dereference a 0000000000000228 [ 6.238756] IP: ib_dereg_mr+0xd/0x30 [ 6.239264] PGD 80000000167eb067 P4D 80000000167eb067 PUD 167f9067 PMD 0 [ 6.240320] Oops: 0000 [#1] SMP PTI [ 6.240782] CPU: 0 PID: 367 Comm: dereg Not tainted 4.16.0-rc1-00029-gc198fafe0453 #183 [ 6.242120] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [ 6.244504] RIP: 0010:ib_dereg_mr+0xd/0x30 [ 6.245253] RSP: 0018:ffffaf5d001d7d68 EFLAGS: 00010246 [ 6.246100] RAX: 0000000000000000 RBX: ffff95d4172daf00 RCX: 0000000000000000 [ 6.247414] RDX: 00000000ffffffff RSI: 0000000000000001 RDI: ffff95d41a317600 [ 6.248591] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 [ 6.249810] R10: ffff95d417033c10 R11: 0000000000000000 R12: ffff95d4172c3a80 [ 6.251121] R13: ffff95d4172c3720 R14: ffff95d4172c3a98 R15: 00000000ffffffff [ 6.252437] FS: 0000000000000000(0000) GS:ffff95d41fc00000(0000) knlGS:0000000000000000 [ 6.253887] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6.254814] CR2: 0000000000000228 CR3: 00000000172b4000 CR4: 00000000000006b0 [ 6.255943] Call Trace: [ 6.256368] remove_commit_idr_uobject+0x1b/0x80 [ 6.257118] uverbs_cleanup_ucontext+0xe4/0x190 [ 6.257855] ib_uverbs_cleanup_ucontext.constprop.14+0x19/0x40 [ 6.258857] ib_uverbs_close+0x2a/0x100 [ 6.259494] __fput+0xca/0x1c0 [ 6.259938] task_work_run+0x84/0xa0 [ 6.260519] do_exit+0x312/0xb40 [ 6.261023] ? __do_page_fault+0x24d/0x490 [ 6.261707] do_group_exit+0x3a/0xa0 [ 6.262267] SyS_exit_group+0x10/0x10 [ 6.262802] do_syscall_64+0x75/0x180 [ 6.263391] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 6.264253] RIP: 0033:0x7f1b39c49488 [ 6.264827] RSP: 002b:00007ffe2de05b68 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 [ 6.266049] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1b39c49488 [ 6.267187] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000 [ 6.268377] RBP: 00007f1b39f258e0 R08: 00000000000000e7 R09: ffffffffffffff98 [ 6.269640] R10: 00007f1b3a147260 R11: 0000000000000246 R12: 00007f1b39f258e0 [ 6.270783] R13: 00007f1b39f2ac20 R14: 0000000000000000 R15: 0000000000000000 [ 6.271943] Code: 74 07 31 d2 e9 25 d8 6c 00 b8 da ff ff ff c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 07 53 48 8b 5f 08 <48> 8b 80 28 02 00 00 e8 f7 d7 6c 00 85 c0 75 04 3e ff 4b 18 5b [ 6.274927] RIP: ib_dereg_mr+0xd/0x30 RSP: ffffaf5d001d7d68 [ 6.275760] CR2: 0000000000000228 [ 6.276200] ---[ end trace a35641f1c474bd20 ]--- Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Cc: syzkaller <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | | | | IB/mlx5: Fix integer overflows in mlx5_ib_create_srqBoris Pismenny2018-03-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch validates user provided input to prevent integer overflow due to integer manipulation in the mlx5_ib_create_srq function. Cc: syzkaller <syzkaller@googlegroups.com> Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | | | | IB/mlx5: Fix out-of-bounds read in create_raw_packet_qp_rqBoris Pismenny2018-03-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a check for the length of the qpin structure to prevent out-of-bounds reads BUG: KASAN: slab-out-of-bounds in create_raw_packet_qp+0x114c/0x15e2 Read of size 8192 at addr ffff880066b99290 by task syz-executor3/549 CPU: 3 PID: 549 Comm: syz-executor3 Not tainted 4.15.0-rc2+ #27 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 Call Trace: dump_stack+0x8d/0xd4 print_address_description+0x73/0x290 kasan_report+0x25c/0x370 ? create_raw_packet_qp+0x114c/0x15e2 memcpy+0x1f/0x50 create_raw_packet_qp+0x114c/0x15e2 ? create_raw_packet_qp_tis.isra.28+0x13d/0x13d ? lock_acquire+0x370/0x370 create_qp_common+0x2245/0x3b50 ? destroy_qp_user.isra.47+0x100/0x100 ? kasan_kmalloc+0x13d/0x170 ? sched_clock_cpu+0x18/0x180 ? fs_reclaim_acquire.part.15+0x5/0x30 ? __lock_acquire+0xa11/0x1da0 ? sched_clock_cpu+0x18/0x180 ? kmem_cache_alloc_trace+0x17e/0x310 ? mlx5_ib_create_qp+0x30e/0x17b0 mlx5_ib_create_qp+0x33d/0x17b0 ? sched_clock_cpu+0x18/0x180 ? create_qp_common+0x3b50/0x3b50 ? lock_acquire+0x370/0x370 ? __radix_tree_lookup+0x180/0x220 ? uverbs_try_lock_object+0x68/0xc0 ? rdma_lookup_get_uobject+0x114/0x240 create_qp.isra.5+0xce4/0x1e20 ? ib_uverbs_ex_create_cq_cb+0xa0/0xa0 ? copy_ah_attr_from_uverbs.isra.2+0xa00/0xa00 ? ib_uverbs_cq_event_handler+0x160/0x160 ? __might_fault+0x17c/0x1c0 ib_uverbs_create_qp+0x21b/0x2a0 ? ib_uverbs_destroy_cq+0x2e0/0x2e0 ib_uverbs_write+0x55a/0xad0 ? ib_uverbs_destroy_cq+0x2e0/0x2e0 ? ib_uverbs_destroy_cq+0x2e0/0x2e0 ? ib_uverbs_open+0x760/0x760 ? futex_wake+0x147/0x410 ? check_prev_add+0x1680/0x1680 ? do_futex+0x3d3/0xa60 ? sched_clock_cpu+0x18/0x180 __vfs_write+0xf7/0x5c0 ? ib_uverbs_open+0x760/0x760 ? kernel_read+0x110/0x110 ? lock_acquire+0x370/0x370 ? __fget+0x264/0x3b0 vfs_write+0x18a/0x460 SyS_write+0xc7/0x1a0 ? SyS_read+0x1a0/0x1a0 ? trace_hardirqs_on_thunk+0x1a/0x1c entry_SYSCALL_64_fastpath+0x18/0x85 RIP: 0033:0x4477b9 RSP: 002b:00007f1822cadc18 EFLAGS: 00000292 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00000000004477b9 RDX: 0000000000000070 RSI: 000000002000a000 RDI: 0000000000000005 RBP: 0000000000708000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000292 R12: 00000000ffffffff R13: 0000000000005d70 R14: 00000000006e6e30 R15: 0000000020010ff0 Allocated by task 549: __kmalloc+0x15e/0x340 kvmalloc_node+0xa1/0xd0 create_user_qp.isra.46+0xd42/0x1610 create_qp_common+0x2e63/0x3b50 mlx5_ib_create_qp+0x33d/0x17b0 create_qp.isra.5+0xce4/0x1e20 ib_uverbs_create_qp+0x21b/0x2a0 ib_uverbs_write+0x55a/0xad0 __vfs_write+0xf7/0x5c0 vfs_write+0x18a/0x460 SyS_write+0xc7/0x1a0 entry_SYSCALL_64_fastpath+0x18/0x85 Freed by task 368: kfree+0xeb/0x2f0 kernfs_fop_release+0x140/0x180 __fput+0x266/0x700 task_work_run+0x104/0x180 exit_to_usermode_loop+0xf7/0x110 syscall_return_slowpath+0x298/0x370 entry_SYSCALL_64_fastpath+0x83/0x85 The buggy address belongs to the object at ffff880066b99180 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 272 bytes inside of 512-byte region [ffff880066b99180, ffff880066b99380) The buggy address belongs to the page: page:000000006040eedd count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0 flags: 0x4000000000008100(slab|head) raw: 4000000000008100 0000000000000000 0000000000000000 0000000180190019 raw: ffffea00019a7500 0000000b0000000b ffff88006c403080 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff880066b99180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff880066b99200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff880066b99280: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff880066b99300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff880066b99380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Cc: syzkaller <syzkaller@googlegroups.com> Fixes: 0fb2ed66a14c ("IB/mlx5: Add create and destroy functionality for Raw Packet QP") Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | | | | RDMA/mlx5: Fix integer overflow while resizing CQLeon Romanovsky2018-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The user can provide very large cqe_size which will cause to integer overflow as it can be seen in the following UBSAN warning: ======================================================================= UBSAN: Undefined behaviour in drivers/infiniband/hw/mlx5/cq.c:1192:53 signed integer overflow: 64870 * 65536 cannot be represented in type 'int' CPU: 0 PID: 267 Comm: syzkaller605279 Not tainted 4.15.0+ #90 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 Call Trace: dump_stack+0xde/0x164 ? dma_virt_map_sg+0x22c/0x22c ubsan_epilogue+0xe/0x81 handle_overflow+0x1f3/0x251 ? __ubsan_handle_negate_overflow+0x19b/0x19b ? lock_acquire+0x440/0x440 mlx5_ib_resize_cq+0x17e7/0x1e40 ? cyc2ns_read_end+0x10/0x10 ? native_read_msr_safe+0x6c/0x9b ? cyc2ns_read_end+0x10/0x10 ? mlx5_ib_modify_cq+0x220/0x220 ? sched_clock_cpu+0x18/0x200 ? lookup_get_idr_uobject+0x200/0x200 ? rdma_lookup_get_uobject+0x145/0x2f0 ib_uverbs_resize_cq+0x207/0x3e0 ? ib_uverbs_ex_create_cq+0x250/0x250 ib_uverbs_write+0x7f9/0xef0 ? cyc2ns_read_end+0x10/0x10 ? print_irqtrace_events+0x280/0x280 ? ib_uverbs_ex_create_cq+0x250/0x250 ? uverbs_devnode+0x110/0x110 ? sched_clock_cpu+0x18/0x200 ? do_raw_spin_trylock+0x100/0x100 ? __lru_cache_add+0x16e/0x290 __vfs_write+0x10d/0x700 ? uverbs_devnode+0x110/0x110 ? kernel_read+0x170/0x170 ? sched_clock_cpu+0x18/0x200 ? security_file_permission+0x93/0x260 vfs_write+0x1b0/0x550 SyS_write+0xc7/0x1a0 ? SyS_read+0x1a0/0x1a0 ? trace_hardirqs_on_thunk+0x1a/0x1c entry_SYSCALL_64_fastpath+0x1e/0x8b RIP: 0033:0x433549 RSP: 002b:00007ffe63bd1ea8 EFLAGS: 00000217 ======================================================================= Cc: syzkaller <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> # 3.13 Fixes: bde51583f49b ("IB/mlx5: Add support for resize CQ") Reported-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | | | | Revert "RDMA/mlx5: Fix integer overflow while resizing CQ"Doug Ledford2018-03-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original commit of this patch has a munged log message that is missing several of the tags the original author intended to be on the patch. This was due to patchworks misinterpreting a cut-n-paste separator line as an end of message line and munging the mbox that was used to import the patch: https://patchwork.kernel.org/patch/10264089/ The original patch will be reapplied with a fixed commit message so the proper tags are applied. This reverts commit aa0de36a40f446f5a21a7c1e677b98206e242edb. Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | | | | RDMA/mlx5: Fix integer overflow while resizing CQLeon Romanovsky2018-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The user can provide very large cqe_size which will cause to integer overflow as it can be seen in the following UBSAN warning: Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | | | | RDMA/bnxt_re: Avoid Hard lockup during error CQE processingSelvin Xavier2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hitting the following hardlockup due to a race condition in error CQE processing. [26146.879798] bnxt_en 0000:04:00.0: QPLIB: FP: CQ Processed Req [26146.886346] bnxt_en 0000:04:00.0: QPLIB: wr_id[1251] = 0x0 with status 0xa [26156.350935] NMI watchdog: Watchdog detected hard LOCKUP on cpu 4 [26156.357470] Modules linked in: nfsd auth_rpcgss nfs_acl lockd grace [26156.447957] CPU: 4 PID: 3413 Comm: kworker/4:1H Kdump: loaded [26156.457994] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, [26156.466390] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core] [26156.472639] Call Trace: [26156.475379] <NMI> [<ffffffff98d0d722>] dump_stack+0x19/0x1b [26156.481833] [<ffffffff9873f775>] watchdog_overflow_callback+0x135/0x140 [26156.489341] [<ffffffff9877f237>] __perf_event_overflow+0x57/0x100 [26156.496256] [<ffffffff98787c24>] perf_event_overflow+0x14/0x20 [26156.502887] [<ffffffff9860a580>] intel_pmu_handle_irq+0x220/0x510 [26156.509813] [<ffffffff98d16031>] perf_event_nmi_handler+0x31/0x50 [26156.516738] [<ffffffff98d1790c>] nmi_handle.isra.0+0x8c/0x150 [26156.523273] [<ffffffff98d17be8>] do_nmi+0x218/0x460 [26156.528834] [<ffffffff98d16d79>] end_repeat_nmi+0x1e/0x7e [26156.534980] [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200 [26156.543268] [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200 [26156.551556] [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200 [26156.559842] <EOE> [<ffffffff98d083e4>] queued_spin_lock_slowpath+0xb/0xf [26156.567555] [<ffffffff98d15690>] _raw_spin_lock+0x20/0x30 [26156.573696] [<ffffffffc08381a1>] bnxt_qplib_lock_buddy_cq+0x31/0x40 [bnxt_re] [26156.581789] [<ffffffffc083bbaa>] bnxt_qplib_poll_cq+0x43a/0xf10 [bnxt_re] [26156.589493] [<ffffffffc083239b>] bnxt_re_poll_cq+0x9b/0x760 [bnxt_re] The issue happens if RQ poll_cq or SQ poll_cq or Async error event tries to put the error QP in flush list. Since SQ and RQ of each error qp are added to two different flush list, we need to protect it using locks of corresponding CQs. Difference in order of acquiring the lock in SQ poll_cq and RQ poll_cq can cause a hard lockup. Revisits the locking strategy and removes the usage of qplib_cq.hwq.lock. Instead of this lock, introduces qplib_cq.flush_lock to handle addition/deletion of QPs in flush list. Also, always invoke the flush_lock in order (SQ CQ lock first and then RQ CQ lock) to avoid any potential deadlock. Other than the poll_cq context, the movement of QP to/from flush list can be done in modify_qp context or from an async error event from HW. Synchronize these operations using the bnxt_re verbs layer CQ locks. To achieve this, adds a call back to the HW abstraction layer(qplib) to bnxt_re ib_verbs layer in case of async error event. Also, removes the buddy cq functions as it is no longer required. Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | IB/mlx5: Fix an error code in __mlx5_ib_modify_qp()Dan Carpenter2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "err" is either zero or possibly uninitialized here. It should be -EINVAL. Fixes: 427c1e7bcd7e ("{IB, net}/mlx5: Move the modify QP operation table to mlx5_ib") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | IB/mlx5: When not in dual port RoCE mode, use provided port as nativeMark Bloch2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The series that introduced dual port RoCE mode assumed that we don't have a dual port HCA that use the mlx5 driver, this is not the case for Connect-IB HCAs. This reasoning led to assigning 1 as the native port index which causes issue when the second port is used. For example query_pkey() when called on the second port will return values of the first port. Make sure that we assign the right port index as the native port index. Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE") Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | IB/mlx4: Include GID type when deleting GIDs from HW table under RoCEJack M2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit cited below added a gid_type field (RoCEv1 or RoCEv2) to GID properties. When adding GIDs, this gid_type field was copied over to the hardware gid table. However, when deleting GIDs, the gid_type field was not copied over to the hardware gid table. As a result, when running RoCEv2, all RoCEv2 gids in the hardware gid table were set to type RoCEv1 when any gid was deleted. This problem would persist until the next gid was added (which would again restore the gid_type field for all the gids in the hardware gid table). Fix this by copying over the gid_type field to the hardware gid table when deleting gids, so that the gid_type of all remaining gids is preserved when a gid is deleted. Fixes: b699a859d17b ("IB/mlx4: Add gid_type to GID properties") Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | IB/mlx4: Fix corruption of RoCEv2 IPv4 GIDsJack Morgenstein2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using IPv4 addresses in RoCEv2, the GID format for the mapped IPv4 address should be: ::ffff:<4-byte IPv4 address>. In the cited commit, IPv4 mapped IPV6 addresses had the 3 upper dwords zeroed out by memset, which resulted in deleting the ffff field. However, since procedure ipv6_addr_v4mapped() already verifies that the gid has format ::ffff:<ipv4 address>, no change is needed for the gid, and the memset can simply be removed. Fixes: 7e57b85c444c ("IB/mlx4: Add support for setting RoCEv2 gids in hardware") Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | RDMA/qedr: Fix iWARP write and send with immediateKalderon, Michal2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iWARP does not support RDMA WRITE or SEND with immediate data. Driver should check this before submitting to FW and return an immediate error Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | RDMA/qedr: Fix kernel panic when running fio over NFSoRDMAKalderon, Michal2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Race in qedr_poll_cq, lastest_cqe wasn't protected by lock, leading to a case where two context's accessing poll_cq at the same time lead to one of them having a pointer to an old latest_cqe and reading an invalid cqe element Signed-off-by: Amit Radzi <Amit.Radzi@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | RDMA/qedr: Fix iWARP connect with port mapperKalderon, Michal2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix iWARP connect and listen to use the mapped port for ipv4 and ipv6. Without this fixed, running on a server that has iwpmd enabled will not use the correct port Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | RDMA/qedr: Fix ipv6 destination address resolutionKalderon, Michal2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The wrong parameter was passed to dst_neigh_lookup Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | RDMA/bnxt_re: Fix the ib_reg failure cleanupSelvin Xavier2018-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Release the netdev references in the cleanup path. Invokes the cleanup routines if bnxt_re_ib_reg fails. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | RDMA/bnxt_re: Fix incorrect DB offset calculationDevesh Sharma2018-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To support host systems with non 4K page size, l2_db_size shall be calculated with 4096 instead of PAGE_SIZE. Also, supply the host page size to FW during initialization. Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | RDMA/bnxt_re: Unconditionly fence non wire memory operationsDevesh Sharma2018-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | HW requires an unconditonal fence for all non-wire memory operations through SQ. This guarantees the completions of these memory operations. Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | IB/mlx: Set slid to zero in Ethernet completion structMoni Shoua2018-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IB spec says that a lid should be ignored when link layer is Ethernet, for example when building or parsing a CM request message (CA17-34). However, since ib_lid_be16() and ib_lid_cpu16() validates the slid, not only when link layer is IB, we set the slid to zero to prevent false warnings in the kernel log. Fixes: 62ede7779904 ("Add OPA extended LID support") Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | {net, IB}/mlx5: Raise fatal IB event when sys error occursDaniel Jurgens2018-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All other mlx5_events report the port number as 1 based, which is how FW reports it in the port event EQE. Reporting 0 for this event causes mlx5_ib to not raise a fatal event notification to registered clients due to a seemingly invalid port. All switch cases in mlx5_ib_event that go through the port check are supposed to set the port now, so just do it once at variable declaration. Fixes: 89d44f0a6c73("net/mlx5_core: Add pci error handlers to mlx5_core driver") Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | IB/mlx5: Avoid passing an invalid QP type to firmwareNoa Osherovich2018-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During QP creation, the mlx5 driver translates the QP type to an internal value which is passed on to FW. There was no check to make sure that the translated value is valid, and -EINVAL was coerced into the mailbox command. Current firmware refuses this as an invalid QP type, but future/past firmware may do something else. Fixes: 09a7d9eca1a6c ('{net,IB}/mlx5: QP/XRCD commands via mlx5 ifc') Reviewed-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | IB/mlx5: Fix incorrect size of klms in the memory regionSergey Gorenko2018-02-28
| | |_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The value of mr->ndescs greater than mr->max_descs is set in the function mlx5_ib_sg_to_klms() if sg_nents is greater than mr->max_descs. This is an invalid value and it causes the following error when registering mr: mlx5_0:dump_cqe:276:(pid 193): dump error cqe 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00000030: 00 00 00 00 0f 00 78 06 25 00 00 8b 08 1e 8f d3 Cc: <stable@vger.kernel.org> # 4.5 Fixes: b005d3164713 ("mlx5: Add arbitrary sg list support") Signed-off-by: Sergey Gorenko <sergeygo@mellanox.com> Tested-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | | | | cxgb4: notify fatal error to uld driversGanesh Goudar2018-03-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | notify uld drivers if the adapter encounters fatal error. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | Merge tag 'mlx5-updates-2018-02-28-1' of ↵David S. Miller2018-03-07
|\ \ \ \ \ \ | | |_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-updates-2018-02-28-1 (IPSec-1) This series consists of some fixes and refactors for the mlx5 drivers, especially around the FPGA and flow steering. Most of them are trivial fixes and are the foundation of allowing IPSec acceleration from user-space. We use flow steering abstraction in order to accelerate IPSec packets. When a user creates a steering rule, [s]he states that we'll carry an encrypt/decrypt flow action (using a specific configuration) for every packet which conforms to a certain match. Since currently offloading these packets is done via FPGA, we'll add another set of flow steering ops. These ops will execute the required FPGA commands and then call the standard steering ops. In order to achieve this, we need that the commands will get all the required information. Therefore, we pass the fte object and embed the flow_action struct inside the fte. In addition, we add the shim layer that will later be used for alternating between the standard and the FPGA steering commands. Some fixes, like " net/mlx5e: Wait for FPGA command responses with a timeout" are very relevant for user-space applications, as these applications could be killed, but we still want to wait for the FPGA and update the kernel's database. Regards, Aviad and Matan ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | {net,IB}/mlx5: Add flow steering helpersBoris Pismenny2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add helper functions that check if a protocol is part of a flow steering match criteria. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | | {net,IB}/mlx5: Add has_tag to mlx5_flow_actMatan Barak2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The has_tag member will indicate whether a tag action was specified in flow specification. A flow tag 0 = MLX5_FS_DEFAULT_FLOW_TAG is assumed a valid flow tag that is currently used by mlx5 RDMA driver, whereas in HW flow_tag = 0 means that the user doesn't care about flow_tag. HW always provide a flow_tag = 0 if all flow tags requested on a specific flow are 0. So we need a way (in the driver) to differentiate between a user really requesting flow_tag = 0 and a user who does not care, in order to be able to report conflicting flow tags on a specific flow. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | | IB/mlx5: Pass mlx5_flow_act struct instead of multiple argumentsBoris Pismenny2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Group and pass all function arguments of parse_flow_attr call in one common struct mlx5_flow_act. This patch passes all the action arguments of parse_flow_attr in one common struct mlx5_flow_act. It allows us to scale the number of actions without adding new arguments to the function. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | IB/mlx5: Removed not used parametersAviad Yehezkel2018-03-06
| | |_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com>
* | | | | Merge tag 'mlx5-updates-2018-02-23' of ↵David S. Miller2018-02-28
|\ \ \ \ \ | |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: mlx5-update-2018-02-23 (IB representors) From: Mark Bloch <markb@mellanox.com> ========= Add IB representor when in switchdev mode The following series adds support for an IB (RAW Ethernet only) device representor which is created when the user switches to switchdev mode. Today when switching to switchdev mode the only representors which are created are net devices. Each netdev is a representor of a virtual function and any data sent via the representor is received on the virtual function, and any data sent via the virtual function is received by the representor. For the mlx5 driver the main use of this functionality is to be able to use Open vSwitch on the hypervisor in order to manage/control traffic from/to the virtual functions. Open vSwitch can also work with DPDK devices and not just net devices, this series exposes an IB device, which Mellanox PMD driver uses, which then can be used by Open vSwitch DPDK. An IB device representor exposes only RAW Ethernet QP capabilities and the ability to create flow rules to direct traffic to its RX queues. The state of the IB device (ACTIVE/DOWN etc..) is based on the state of the corresponding net device representor. No other RDMA/RoCE functionality is currently supported and no GID table is exposed. ========= Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | IB/mlx5: Disable self loopback check when in switchdev modeMark Bloch2018-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When in switchdev mode, there is no need to do self loopback checks as we can't receive those packets, we insert steering rules to the eswitch that make sure packets can't be looped back. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | IB/mlx5: Add proper representors supportMark Bloch2018-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds full support for IB representor: 1) Representors profile, We add two new profiles: nic_rep_profile - This profile will be used to create an IB device that represents the PF/UPLINK. rep_profile - This profile will be used to create an IB device that represents VFs. Each VF will be its own representor. 2) Proper load/unload callbacks, Those are called by the E-Switch when moving to/from switchdev mode. 3) Different flow DB handling for when we in switchdev mode. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | IB/mlx5: E-Switch, Add rule to forward traffic to vportMark Bloch2018-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to forward traffic from representor's SQ to the right virtual function, every time an SQ is created also add the corresponding flow rule to the FDB. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | IB/mlx5: Don't expose MR cache in switchdev modeMark Bloch2018-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When enabling many VFs and switching to switchdev mode, the total amount of mkeys we try to allocate when loading representors is very large and may cause timeouts on allocations, the same issues was observed on VFs and we employ the same fix that was done for them. We avoid allocating the full MR cache on load but still allow it to be manipulated once the IB device is loaded. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | IB/mlx5: When in switchdev mode, expose only raw packet capabilitiesMark Bloch2018-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently in switchdev mode we allow only for raw packet QPs. Expose the right capabilities and set the gid table length to 0, also make sure we don't try to enable RoCE, so split the function to enable RoCE so representors can enable only the notifier needed for net device events. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | IB/mlx5: Listen to netdev register/unresiter events in switchdev modeMark Bloch2018-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we listen to netdev register/unregister event based on PCI device. When in switchdev mode PF and representors share the same PCI device, so in order to pair ib device and netdev in switchdev mode compare the netdev that triggered the event to that of the representor. Expose a function that lets you receive the netdev associated what a given representor. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | IB/mlx5: Add match on vport when in switchdev modeMark Bloch2018-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we point to a representor, it means we are in switchdev mode. The flow db is shared between PF and virtual function representors so each rule created needs to have a match on its specific source port. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | IB/mlx5: Allocate flow DB only on PF IB deviceMark Bloch2018-02-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A flow DB is a shared resource between PF and representors, need to allocate it only when creating the PF IB device. Once we add IB representors, they will use the flow db which was created by the PF. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
| * | | | IB/mlx5: Add basic regiser/unregister representors codeMark Bloch2018-02-23
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create the basic infrastructure of registering and unregistering IB representors. The load/unload callbacks are left empty and proper implementation will be introduced in following patches. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2018-02-24
|\ \ \ \ | | |/ / | |/| |
| * | | RDMA/bnxt_re: Avoid system hang during device un-regSelvin Xavier2018-02-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BNXT_RE_FLAG_TASK_IN_PROG doesn't handle multiple work requests posted together. Track schedule of multiple workqueue items by maintaining a per device counter and proceed with IB dereg only if this counter is zero. flush_workqueue is no longer required from NETDEV_UNREGISTER path. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | RDMA/bnxt_re: Fix system crash during load/unloadSelvin Xavier2018-02-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During driver unload, the driver proceeds with cleanup without waiting for the scheduled events. So the device pointers get freed up and driver crashes when the events are scheduled later. Flush the bnxt_re_task work queue before starting device removal. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | RDMA/bnxt_re: Synchronize destroy_qp with poll_cqSelvin Xavier2018-02-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid system crash when destroy_qp is invoked while the driver is processing the poll_cq. Synchronize these functions using the cq_lock. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | RDMA/bnxt_re: Unpin SQ and RQ memory if QP create failsDevesh Sharma2018-02-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Driver leaves the QP memory pinned if QP create command fails from the FW. Avoids this scenario by adding a proper exit path if the FW command fails. Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | RDMA/bnxt_re: Disable atomic capability on bnxt_re adaptersDevesh Sharma2018-02-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | More testing needs to be done before enabling this feature. Disabling the feature temporarily Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>