android_kernel_msm-6.1_noth.../include/linux/mlx5
Parav Pandit fbdd0049d9 RDMA/mlx5: Fix devlink deadlock on net namespace deletion
When an mlx5 core devlink instance is reloaded in a different net namespace,
its associated IB device is deleted and recreated.

An example sequence is:
$ ip netns add foo
$ devlink dev reload pci/0000:00:08.0 netns foo
$ ip netns del foo

The mlx5 IB device needs to attach and detach its netdevice through the
netdev notifier chain during the load and unload sequences. Below is the
call graph of the unload flow:

cleanup_net()
   down_read(&pernet_ops_rwsem); <- first sem acquired
     ops_pre_exit_list()
       pre_exit()
         devlink_pernet_pre_exit()
           devlink_reload()
             mlx5_devlink_reload_down()
               mlx5_unload_one()
               [...]
                 mlx5_ib_remove()
                   mlx5_ib_unbind_slave_port()
                     mlx5_remove_netdev_notifier()
                       unregister_netdevice_notifier()
                         down_write(&pernet_ops_rwsem); <- recursive lock

Hence, when the net namespace is deleted, the mlx5 reload results in a deadlock.

When the deadlock occurs, the devlink mutex is also held. This deadlocks not
only the mlx5 device under reload but also every process that attempts to
access unrelated devlink devices.

Hence, fix this by having the mlx5 IB driver register a per-net netdev
notifier instead of the global one; the per-net notifier operates on the
net namespace without holding the pernet_ops_rwsem.

Fixes: 4383cfcc65 ("net/mlx5: Add devlink reload")
Link: https://lore.kernel.org/r/20201026134359.23150-1-parav@nvidia.com
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:18:19 -03:00
accel.h net/mlx5: Accel, Add core IPsec support for the Connect-X family 2020-07-16 16:36:42 -07:00
cq.h net/mlx5: Avoid RDMA file inclusion in core driver 2020-06-27 13:50:46 -07:00
device.h net/mlx5: Add support for fw live patch event 2020-10-09 12:06:53 -07:00
doorbell.h
driver.h RDMA/mlx5: Fix devlink deadlock on net namespace deletion 2020-10-26 19:18:19 -03:00
eq.h
eswitch.h net/mlx5: E-switch, Use PF num in metadata reg c0 2020-09-30 21:26:28 -07:00
fs.h net/mlx5: Add NIC TX domain namespace 2020-10-12 15:37:44 -07:00
fs_helpers.h
mlx5_ifc.h Merge branch 'mlx_sw_owner_v2' into rdma.git for-next 2020-09-18 10:31:45 -03:00
mlx5_ifc_fpga.h net/mlx5: fix spelling mistake "reserverd" -> "reserved" 2020-02-18 15:44:07 +02:00
port.h RDMA/mlx5: Delete duplicated mlx5_ptys_width enum 2020-09-17 19:33:03 +03:00
qp.h net/mlx5e: IPsec: Add TX steering rule per IPsec state 2020-10-12 15:37:45 -07:00
rsc_dump.h net/mlx5: Add support in query QP, CQ and MKEY segments 2020-06-23 17:26:10 +03:00
transobj.h net/mlx5: Update transobj.c new cmd interface 2020-04-23 21:42:16 +03:00
vport.h net/mlx5: Constify mac address pointer 2020-06-22 15:29:19 -07:00