summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw
Commit message (Collapse)AuthorAge
...
| * Merge branch 'k.o/wip/dl-for-rc' into k.o/wip/dl-for-nextDoug Ledford2018-03-14
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to bug fixes found by the syzkaller bot and taken into the for-rc branch after development for the 4.17 merge window had already started being taken into the for-next branch, there were fairly non-trivial merge issues that would need to be resolved between the for-rc branch and the for-next branch. This merge resolves those conflicts and provides a unified base upon which ongoing development for 4.17 can be based. Conflicts: drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f9524 (IB/mlx5: Fix cleanup order on unload) added to for-rc and commit b5ca15ad7e61 (IB/mlx5: Add proper representors support) add as part of the devel cycle both needed to modify the init/de-init functions used by mlx5. To support the new representors, the new functions added by the cleanup patch needed to be made non-static, and the init/de-init list added by the representors patch needed to be modified to match the init/de-init list changes made by the cleanup patch. Updates: drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function prototypes added by representors patch to reflect new function names as changed by cleanup patch drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init stage list to match new order from cleanup patch Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA/i40iw: include linux/irq.hArnd Bergmann2018-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We get a build failure on ARM unless the header is included explicitly: drivers/infiniband/hw/i40iw/i40iw_verbs.c: In function 'i40iw_get_vector_affinity': drivers/infiniband/hw/i40iw/i40iw_verbs.c:2747:9: error: implicit declaration of function 'irq_get_affinity_mask'; did you mean 'irq_create_affinity_masks'? [-Werror=implicit-function-declaration] return irq_get_affinity_mask(msix_vec->irq); ^~~~~~~~~~~~~~~~~~~~~ irq_create_affinity_masks drivers/infiniband/hw/i40iw/i40iw_verbs.c:2747:9: error: returning 'int' from a function with return type 'const struct cpumask *' makes pointer from integer without a cast [-Werror=int-conversion] return irq_get_affinity_mask(msix_vec->irq); Fixes: 7e952b19eb63 ("i40iw: Implement get_vector_affinity API") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | IB/mlx5: Maintain a single emergency pageIlya Lesokhin2018-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The mlx5 driver needs to be able to issue invalidation to ODP MRs even if it cannot allocate memory. To this end it preallocates emergency pages to use when the situation arises. This flow should be extremely rare enough, that we don't need to worry about contention and therefore a single emergency page is good enough. Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | IB/mlx5: Only synchronize RCU once when removing mkeysDaniel Jurgens2018-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead synchronizing RCU in a loop when removing mkeys in a batch do it once at the end before freeing them. The result is only waiting for one RCU grace period instead of many serially. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA/pvrdma: Properly annotate QP statesLeon Romanovsky2018-03-14
| | | | | | | | | | | | | | | | | | | | | | | | QP states provided by core layer are converted to enum ib_qp_state and better to use internal variable in that type instead of int. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA/mlx5: Fix NULL dereference while accessing XRC_TGT QPsLeon Romanovsky2018-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mlx5 modify_qp() relies on FW that the error will be thrown if wrong state is supplied. The missing check in FW causes the following crash while using XRC_TGT QPs. [ 14.769632] BUG: unable to handle kernel NULL pointer dereference at (null) [ 14.771085] IP: mlx5_ib_modify_qp+0xf60/0x13f0 [ 14.771894] PGD 800000001472e067 P4D 800000001472e067 PUD 14529067 PMD 0 [ 14.773126] Oops: 0002 [#1] SMP PTI [ 14.773763] CPU: 0 PID: 365 Comm: ubsan Not tainted 4.16.0-rc1-00038-g8151138c0793 #119 [ 14.775192] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [ 14.777522] RIP: 0010:mlx5_ib_modify_qp+0xf60/0x13f0 [ 14.778417] RSP: 0018:ffffbf48001c7bd8 EFLAGS: 00010246 [ 14.779346] RAX: 0000000000000000 RBX: ffff9a8f9447d400 RCX: 0000000000000000 [ 14.780643] RDX: 0000000000000000 RSI: 000000000000000a RDI: 0000000000000000 [ 14.781930] RBP: 0000000000000000 R08: 00000000000217b0 R09: ffffffffbc9c1504 [ 14.783214] R10: fffff4a180519480 R11: ffff9a8f94523600 R12: ffff9a8f9493e240 [ 14.784507] R13: ffff9a8f9447d738 R14: 000000000000050a R15: 0000000000000000 [ 14.785800] FS: 00007f545b466700(0000) GS:ffff9a8f9fc00000(0000) knlGS:0000000000000000 [ 14.787073] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 14.787792] CR2: 0000000000000000 CR3: 00000000144be000 CR4: 00000000000006b0 [ 14.788689] Call Trace: [ 14.789007] _ib_modify_qp+0x71/0x120 [ 14.789475] modify_qp.isra.20+0x207/0x2f0 [ 14.790010] ib_uverbs_modify_qp+0x90/0xe0 [ 14.790532] ib_uverbs_write+0x1d2/0x3c0 [ 14.791049] ? __handle_mm_fault+0x93c/0xe40 [ 14.791644] __vfs_write+0x36/0x180 [ 14.792096] ? handle_mm_fault+0xc1/0x210 [ 14.792601] vfs_write+0xad/0x1e0 [ 14.793018] SyS_write+0x52/0xc0 [ 14.793422] do_syscall_64+0x75/0x180 [ 14.793888] entry_SYSCALL_64_after_hwframe+0x21/0x86 [ 14.794527] RIP: 0033:0x7f545ad76099 [ 14.794975] RSP: 002b:00007ffd78787468 EFLAGS: 00000287 ORIG_RAX: 0000000000000001 [ 14.795958] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f545ad76099 [ 14.797075] RDX: 0000000000000078 RSI: 0000000020009000 RDI: 0000000000000003 [ 14.798140] RBP: 00007ffd78787470 R08: 00007ffd78787480 R09: 00007ffd78787480 [ 14.799207] R10: 00007ffd78787480 R11: 0000000000000287 R12: 00005599ada98760 [ 14.800277] R13: 00007ffd78787560 R14: 0000000000000000 R15: 0000000000000000 [ 14.801341] Code: 4c 8b 1c 24 48 8b 83 70 02 00 00 48 c7 83 cc 02 00 00 00 00 00 00 48 c7 83 24 03 00 00 00 00 00 00 c7 83 2c 03 00 00 00 00 00 00 <c7> 00 00 00 00 00 48 8b 83 70 02 00 00 c7 40 04 00 00 00 00 4c [ 14.804012] RIP: mlx5_ib_modify_qp+0xf60/0x13f0 RSP: ffffbf48001c7bd8 [ 14.804838] CR2: 0000000000000000 [ 14.805288] ---[ end trace 3f1da0df5c8b7c37 ]--- Cc: syzkaller <syzkaller@googlegroups.com> Reported-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | IB: remove duplicate header filesZhu Yanjun2018-03-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | In hfi.h, the header file opa_addr.h is included twice. In vt.h, the header file mmap.h is included twice. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA/hns: Support cq record doorbell for kernel spaceYixian Liu2018-03-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch updates to support cq record doorbell for the kernel space. Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Shaobo Xu <xushaobo2@huawei.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA/hns: Support rq record doorbell for kernel spaceYixian Liu2018-03-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch updates to support rq record doorbell for the kernel space. Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Shaobo Xu <xushaobo2@huawei.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA/hns: Support cq record doorbell for the user spaceYixian Liu2018-03-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch updates to support cq record doorbell for the user space. Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Shaobo Xu <xushaobo2@huawei.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA/hns: Support rq record doorbell for the user spaceYixian Liu2018-03-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds interfaces and definitions to support the rq record doorbell for the user space. Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Shaobo Xu <xushaobo2@huawei.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | RDMA/bnxt_re: Remove an unused variableBart Van Assche2018-03-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Selvin Xavier <selvin.xavier@broadcom.com> Cc: Devesh Sharma <devesh.sharma@broadcom.com> Cc: Somnath Kotur <somnath.kotur@broadcom.com> Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | IB/hfi1: Fix a kernel-doc warningBart Van Assche2018-03-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid that building with W=1 causes the following warning to appear: drivers/infiniband/hw/hfi1/qp.c:484: warning: Cannot understand * on line 484 - I thought it was a doc line Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | mlx4_ib: zero out struct ib_pd when allocatingSteve Wise2018-03-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Zero out the fields of the struct ib_pd for user mode pds so that users querying pds via nldev will not get garbage. For simplicity, use kzalloc() to allocate the mlx4_ib_pd struct. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | mlx4_ib: set user mr attributes in struct ib_mrSteve Wise2018-03-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | Setting iova, length, and page_size allows this information to be seen via NLDEV netlink queries, which can aid in user rdma debugging. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | iw_cxgb4: initialize ib_mr fields for user mrsSteve Wise2018-03-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some of the struct ib_mr fields weren't getting initialized. This was benign, but will cause problems when dumping the mr resource via nldev/restrack. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | IB/mlx4: Move mlx4_uverbs_ex_query_device_resp to include/uapi/Yishai Hadas2018-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This struct is involved in the user API for mlx4 and should not be hidden inside a driver header file. Fixes: 09d208b258a2 ("IB/mlx4: Add report for RSS capabilities by vendor channel") Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | Merge tag 'mlx5-updates-2018-02-28-1' of ↵Doug Ledford2018-03-07
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into k.o/wip/dl-for-next mlx5-updates-2018-02-28-1 (IPSec-1) This series consists of some fixes and refactors for the mlx5 drivers, especially around the FPGA and flow steering. Most of them are trivial fixes and are the foundation of allowing IPSec acceleration from user-space. We use flow steering abstraction in order to accelerate IPSec packets. When a user creates a steering rule, [s]he states that we'll carry an encrypt/decrypt flow action (using a specific configuration) for every packet which conforms to a certain match. Since currently offloading these packets is done via FPGA, we'll add another set of flow steering ops. These ops will execute the required FPGA commands and then call the standard steering ops. In order to achieve this, we need that the commands will get all the required information. Therefore, we pass the fte object and embed the flow_action struct inside the fte. In addition, we add the shim layer that will later be used for alternating between the standard and the FPGA steering commands. Some fixes, like " net/mlx5e: Wait for FPGA command responses with a timeout" are very relevant for user-space applications, as these applications could be killed, but we still want to wait for the FPGA and update the kernel's database. Regards, Aviad and Matan Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | RDMA/bnxt_re/qplib_sp: Use true and false for boolean valuesGustavo A. R. Silva2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Assign true or false to boolean variables instead of an integer value. This issue was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/hfi1: Add a missing rcu_read_unlock()Bart Van Assche2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch avoids that sparse reports the following: drivers/infiniband/hw/hfi1/driver.c:251:13: warning: context imbalance in 'rcv_hdrerr' - different lock contexts for basic block Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | infiniband: hw: Drop unnecessary continueArushi2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continue at the bottom of a loop are removed. Issue found using drop_continue.cocci Coccinelle script. Signed-off-by: Arushi Singhal <arushisinghal19971997@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | i40iw: Implement get_vector_affinity APIShiraz Saleem2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Storage ULPs (like NVMEoF) benefit from exposing affinity mapping per completion vector to find the optimal multi-queue affinity assignments. The ULPs call the verbs API ib_get_vector_affinity introduced in commit c66cd353bbe ("RDMA/core: expose affinity mappings per completion vector") to get the underlying devices affinity mappings. Add support in driver to expose the affinity masks per MSI-X completion vector. Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | i40iw: Improve CM node lookup time on connection setupShiraz Saleem2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently all CM nodes involved in a connection are maintained in a connected_node list per dev. During connection setup, we need to search this every time we receive a packet on the iWARP LAN Queue (ILQ) and this can be pretty inefficient for large number of connections. Fix this by organizing the CM nodes in two lists - accelerated list and non-accelerated list. The search on ILQ receive would be limited to only non accelerated nodes. When a node moves to RTS, it is added to the accelerated list. Benchmarking ucmatose 16k connections shows a 20% improvement in test completion time. Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | i40iw: Refactor handling of txpend listMustafa Ismail2018-03-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the TX pending lists for IEQ and ILQ are handled separately. The handling of both can be consolidated in i40iw_poll_completion. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/qib: Move char *qib_sdma_state_names[] and constify while there.Hernán Gonzalez2018-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note: This is compile only tested as I have no access to the hw. This variable was not used in qib_sdma.c but in qib_iba7322.c. Declaring it there, as static, saves 56 bytes. add/remove: 0/2 grow/shrink: 0/0 up/down: 0/-144 (-144) Function old new delta qib_sdma_state_names 56 - -56 qib_sdma_event_names 88 - -88 Total: Before=2874565, After=2874421, chg -0.01% Signed-off-by: Hernán Gonzalez <hernan@vanguardiasur.com.ar> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/qib: Remove unused variable (char *qib_sdma_event_names[])Hernán Gonzalez2018-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note: This is compile only tested as I have no access to the hw. This variable was not used anywhere in the code. Removing it saves 88 bytes. add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-88 (-88) Function old new delta qib_sdma_event_names 88 - -88 Total: Before=2874565, After=2874477, chg -0.00% Signed-off-by: Hernán Gonzalez <hernan@vanguardiasur.com.ar> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | infiniband: bnxt_re: use BIT_ULL() for 64-bit bit masksArnd Bergmann2018-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 32-bit targets, we otherwise get a warning about an impossible constant integer expression: In file included from include/linux/kernel.h:11, from include/linux/interrupt.h:6, from drivers/infiniband/hw/bnxt_re/ib_verbs.c:39: drivers/infiniband/hw/bnxt_re/ib_verbs.c: In function 'bnxt_re_query_device': include/linux/bitops.h:7:24: error: left shift count >= width of type [-Werror=shift-count-overflow] #define BIT(nr) (1UL << (nr)) ^~ drivers/infiniband/hw/bnxt_re/bnxt_re.h:61:34: note: in expansion of macro 'BIT' #define BNXT_RE_MAX_MR_SIZE_HIGH BIT(39) ^~~ drivers/infiniband/hw/bnxt_re/bnxt_re.h:62:30: note: in expansion of macro 'BNXT_RE_MAX_MR_SIZE_HIGH' #define BNXT_RE_MAX_MR_SIZE BNXT_RE_MAX_MR_SIZE_HIGH ^~~~~~~~~~~~~~~~~~~~~~~~ drivers/infiniband/hw/bnxt_re/ib_verbs.c:149:25: note: in expansion of macro 'BNXT_RE_MAX_MR_SIZE' ib_attr->max_mr_size = BNXT_RE_MAX_MR_SIZE; ^~~~~~~~~~~~~~~~~~~ Fixes: 872f3578241d ("RDMA/bnxt_re: Add support for MRs with Huge pages") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | infiniband: qplib_fp: fix pointer castArnd Bergmann2018-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Building for a 32-bit target results in a couple of warnings from casting between a 32-bit pointer and a 64-bit integer: drivers/infiniband/hw/bnxt_re/qplib_fp.c: In function 'bnxt_qplib_service_nq': drivers/infiniband/hw/bnxt_re/qplib_fp.c:333:23: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] bnxt_qplib_arm_srq((struct bnxt_qplib_srq *)q_handle, ^ drivers/infiniband/hw/bnxt_re/qplib_fp.c:336:12: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] (struct bnxt_qplib_srq *)q_handle, ^ In file included from include/linux/byteorder/little_endian.h:5, from arch/arm/include/uapi/asm/byteorder.h:22, from include/asm-generic/bitops/le.h:6, from arch/arm/include/asm/bitops.h:342, from include/linux/bitops.h:38, from include/linux/kernel.h:11, from include/linux/interrupt.h:6, from drivers/infiniband/hw/bnxt_re/qplib_fp.c:39: drivers/infiniband/hw/bnxt_re/qplib_fp.c: In function 'bnxt_qplib_create_srq': include/uapi/linux/byteorder/little_endian.h:31:43: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast] #define __cpu_to_le64(x) ((__force __le64)(__u64)(x)) ^ include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64' #define cpu_to_le64 __cpu_to_le64 ^~~~~~~~~~~~~ drivers/infiniband/hw/bnxt_re/qplib_fp.c:569:19: note: in expansion of macro 'cpu_to_le64' req.srq_handle = cpu_to_le64(srq); Using a uintptr_t as an intermediate works on all architectures. Fixes: 37cb11acf1f7 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/usnic: Delete an error message for a failed memory allocation in ↵Markus Elfring2018-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | usnic_transport_init() Omit an extra message for a memory allocation failure in this function. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/mlx5: Refactor QP type check to be as early as possibleLeon Romanovsky2018-02-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Perform QP type check in one place and fail as early as possible. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | Merge tag 'mlx5-updates-2018-02-23' of ↵Doug Ledford2018-02-28
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into k.o/wip/dl-for-next mlx5-update-2018-02-23 (IB representors) From: Mark Bloch <markb@mellanox.com> ========= Add IB representor when in switchdev mode The following series adds support for an IB (RAW Ethernet only) device representor which is created when the user switches to switchdev mode. Today when switching to switchdev mode the only representors which are created are net devices. Each netdev is a representor of a virtual function and any data sent via the representor is received on the virtual function, and any data sent via the virtual function is received by the representor. For the mlx5 driver the main use of this functionality is to be able to use Open vSwitch on the hypervisor in order to manage/control traffic from/to the virtual functions. Open vSwitch can also work with DPDK devices and not just net devices, this series exposes an IB device, which Mellanox PMD driver uses, which then can be used by Open vSwitch DPDK. An IB device representor exposes only RAW Ethernet QP capabilities and the ability to create flow rules to direct traffic to its RX queues. The state of the IB device (ACTIVE/DOWN etc..) is based on the state of the corresponding net device representor. No other RDMA/RoCE functionality is currently supported and no GID table is exposed. ========= Signed-off-by: Doug Ledford <dledford@redhat.com>
| * \ \ \ Merge branch 'k.o/for-rc' into k.o/wip/dl-for-nextDoug Ledford2018-02-22
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a 14 patch series waiting to come into for-next that has a dependecy on code submitted into this kernel's for-rc series. So, merge the for-rc branch into the current for-next in order to make the patch series apply cleanly. Signed-off-by: Doug Ledford <dledford@redhat.com>
| * \ \ \ \ Merge tag 'mlx5-updates-2018-02-21' of ↵Doug Ledford2018-02-22
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into k.o/wip/dl-for-next mlx5-updates-2018-02-21 This series includes shared code updates for mlx5 core driver for both netdev and rdma subsystems. By Saeed, First six patches of the series are meant to address a performance issue and should provide a performance boost for multi core IRQ interrupt hungry workloads. The issue is fixed in the first patch, all other patches are meant to refactor the code in light of this fix. The problem it comes to fix, is a shared spinlock accessed across all HCA IRQs which protects the CQ database. To solve this we simply move the CQ database and its spinlock to be per EQ (IRQ), thus per core. By Yonatan, Fragmented completion queue (CQ) for RDMA, core driver implementation to create fragmented CQ buffers rather than one large contiguous memory buffer, the implementation scheme already exist and used by the netdev CQs, the patch shares that code with the rdma CQ creation flow and makes use of the new API in mlx5_ib driver. Thanks, Saeed. Merged into rdma-next tree as well as net-next tree to prevent conflicts in future patches between the two trees. Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | | | | RDMA/hns: Replace __raw_write*(cpu_to_le*()) with LE write*()Andy Shevchenko2018-02-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no need to repeat the semantics of writel() and similar. Moreover sparse complains about this: drivers/infiniband/hw/hns/hns_roce_hw_v1.c:1690:22: expected unsigned long long val drivers/infiniband/hw/hns/hns_roce_hw_v1.c:1690:22: got restricted __le64 <noident> Fixing this by replacing __raw_write*(cpu_to_le*()) calls by plain write*() ones. Note, write*() accessors are little endian by definition. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | RDMA/hns: Use free_pages function instead of free_pageoulijun2018-02-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It need to use free_pages function for free the memory allocated by __get_free_pages function. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | RDMA/hns: Fix QP state judgement before receiving work requestsYixian Liu2018-02-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The QP can accept receive work requests only when the QP is in the states that allow them to be submitted. This patch updates the QP state judgement based on the specification. Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | RDMA/hns: Fix a bug with modifying mac addressoulijun2018-02-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When modifying mac address, it will trigger hns_roce_del_gid function and can't delete the default gid matched the index because the attribute of gid is null. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | IB/cxgb3: remove cxio_dbg.cCorentin Labbe2018-02-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cxio_dbg.c is uncompiled since commit 2b540355cd2f ("RDMA/cxgb3: cleanups") 10 years after, we could remove it. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2018-04-01
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Minor conflicts in drivers/net/ethernet/mellanox/mlx5/core/en_rep.c, we had some overlapping changes: 1) In 'net' MLX5E_PARAMS_LOG_{SQ,RQ}_SIZE --> MLX5E_REP_PARAMS_LOG_{SQ,RQ}_SIZE 2) In 'net-next' params->log_rq_size is renamed to be params->log_rq_mtu_frames. 3) In 'net-next' params->hard_mtu is added. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | RDMA/hns: ensure for-loop actually iterates and free's buffersColin Ian King2018-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current for-loop zeros variable i and only loops once, hence not all the buffers are free'd. Fix this by setting i correctly. Detected by CoverityScan, CID#1463415 ("Operands don't affect result") Fixes: a5073d6054f7 ("RDMA/hns: Add eq support of hip08") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | | RDMA/qedr: Fix QP state initialization raceKalderon, Michal2018-03-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once the FW is transitioned to error, FLUSH cqes can be received. We want the driver to be aware of the fact that QP is already in error. Without this fix, a user may see false error messages in the dmesg log, mentioning that a FLUSH cqe was received while QP is not in error state. Fixes: cecbcddf ("qedr: Add support for QP verbs") Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | | RDMA/qedr: Fix rc initialization on CNQ allocation failureKalderon, Michal2018-03-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Return code wasn't set properly when CNQ allocation failed. This only affect error message logging, currently user will receive an error message that says the qedr driver load failed with rc '0', instead of ENOMEM Fixes: ec72fce4 ("qedr: Add support for RoCE HW init") Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | | RDMA/qedr: fix QP's ack timeout configurationKalderon, Michal2018-03-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | QPs that were configured with ack timeout value lower than 1 msec will not implement re-transmission timeout. This means that if a packet / ACK were dropped, the QP will not retransmit this packet. This can lead to an application hang. Fixes: cecbcddf6 ("qedr: Add support for QP verbs") Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | | | | | IB/mlx5: Don't clean uninitialized UMR resourcesMark Bloch2018-03-21
| | |_|_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case we failed to create UMR resources, mark them as invalid so we won't try to destroy them on the unwind path. Add the relevant checks to destroy_umrc_res(), this is done for the unlikely event ib_register_device() or create_umr_res() err out and we try to destroy invalid objects. Fixes: 42cea83f9524 ("IB/mlx5: Fix cleanup order on unload") Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2018-03-31
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf-next 2018-03-31 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Add raw BPF tracepoint API in order to have a BPF program type that can access kernel internal arguments of the tracepoints in their raw form similar to kprobes based BPF programs. This infrastructure also adds a new BPF_RAW_TRACEPOINT_OPEN command to BPF syscall which returns an anon-inode backed fd for the tracepoint object that allows for automatic detach of the BPF program resp. unregistering of the tracepoint probe on fd release, from Alexei. 2) Add new BPF cgroup hooks at bind() and connect() entry in order to allow BPF programs to reject, inspect or modify user space passed struct sockaddr, and as well a hook at post bind time once the port has been allocated. They are used in FB's container management engine for implementing policy, replacing fragile LD_PRELOAD wrapper intercepting bind() and connect() calls that only works in limited scenarios like glibc based apps but not for other runtimes in containerized applications, from Andrey. 3) BPF_F_INGRESS flag support has been added to sockmap programs for their redirect helper call bringing it in line with cls_bpf based programs. Support is added for both variants of sockmap programs, meaning for tx ULP hooks as well as recv skb hooks, from John. 4) Various improvements on BPF side for the nfp driver, besides others this work adds BPF map update and delete helper call support from the datapath, JITing of 32 and 64 bit XADD instructions as well as offload support of bpf_get_prandom_u32() call. Initial implementation of nfp packet cache has been tackled that optimizes memory access (see merge commit for further details), from Jakub and Jiong. 5) Removal of struct bpf_verifier_env argument from the print_bpf_insn() API has been done in order to prepare to use print_bpf_insn() soon out of perf tool directly. This makes the print_bpf_insn() API more generic and pushes the env into private data. bpftool is adjusted as well with the print_bpf_insn() argument removal, from Jiri. 6) Couple of cleanups and prep work for the upcoming BTF (BPF Type Format). The latter will reuse the current BPF verifier log as well, thus bpf_verifier_log() is further generalized, from Martin. 7) For bpf_getsockopt() and bpf_setsockopt() helpers, IPv4 IP_TOS read and write support has been added in similar fashion to existing IPv6 IPV6_TCLASS socket option we already have, from Nikita. 8) Fixes in recent sockmap scatterlist API usage, which did not use sg_init_table() for initialization thus triggering a BUG_ON() in scatterlist API when CONFIG_DEBUG_SG was enabled. This adds and uses a small helper sg_init_marker() to properly handle the affected cases, from Prashant. 9) Let the BPF core follow IDR code convention and therefore use the idr_preload() and idr_preload_end() helpers, which would also help idr_alloc_cyclic() under GFP_ATOMIC to better succeed under memory pressure, from Shaohua. 10) Last but not least, a spelling fix in an error message for the BPF cookie UID helper under BPF sample code, from Colin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | | treewide: remove large struct-pass-by-value from tracepoint argumentsAlexei Starovoitov2018-03-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - fix trace_hfi1_ctxt_info() to pass large struct by reference instead of by value - convert 'type array[]' tracepoint arguments into 'type *array', since compiler will warn that sizeof('type array[]') == sizeof('type *array') and later should be used instead The CAST_TO_U64 macro in the later patch will enforce that tracepoint arguments can only be integers, pointers, or less than 8 byte structures. Larger structures should be passed by reference. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | | | | | | | qed*: Utilize FW 8.33.11.0Michal Kalderon2018-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This FW contains several fixes and features RDMA Features - SRQ support - XRC support - Memory window support - RDMA low latency queue support - RDMA bonding support RDMA bug fixes - RDMA remote invalidate during retransmit fix - iWARP MPA connect interop issue with RTR fix - iWARP Legacy DPM support - Fix MPA reject flow - iWARP error handling - RQ WQE validation checks MISC - Fix some HSI types endianity - New Restriction: vlan insertion in core_tx_bd_data can't be set for LB packets ETH - HW QoS offload support - Fix vlan, dcb and sriov flow of VF sending a packet with inband VLAN tag instead of default VLAN - Allow GRE version 1 offloads in RX flow - Allow VXLAN steering iSCSI / FcoE - Fix bd availability checking flow - Support 256th sge proerly in iscsi/fcoe retransmit - Performance improvement - Fix handle iSCSI command arrival with AHS and with immediate - Fix ipv6 traffic class configuration DEBUG - Update debug utilities Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: Manish Rangankar <Manish.Rangankar@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | | | mlx5: Move dump error CQE function out of mlx5_ib for code sharingEran Ben Elisha2018-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move mlx5_ib dump error CQE implementation to mlx5 CQ header file in order to use it in a downstream patch from mlx5e. In addition, use print_hex_dump instead of manual dumping of the buffer. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | | | | | | mlx5_{ib,core}: Add query SQ state helper functionEran Ben Elisha2018-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move query SQ state function from mlx5_ib to mlx5_core in order to have it in shared code. It will be used in a downstream patch from mlx5e. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* | | | | | | | net: Drop NETDEV_UNREGISTER_FINALKirill Tkhai2018-03-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Last user is gone after bdf5bd7f2132 "rds: tcp: remove register_netdevice_notifier infrastructure.", so we can remove this netdevice command. This allows to delete rtnl_lock() in netdev_run_todo(), which is hot path for net namespace unregistration. dev_change_net_namespace() and netdev_wait_allrefs() have rcu_barrier() before NETDEV_UNREGISTER_FINAL call, and the source commits say they were introduced to delemit the call with NETDEV_UNREGISTER, but this patch leaves them on the places, since they require additional analysis, whether we need in them for something else. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>