Commit graph

8387 commits

Author SHA1 Message Date
Linus Torvalds
d7e0a795bf ARM:
* More progress on the protected VM front, now with the full
   fixed feature set as well as the limitation of some hypercalls
   after initialisation.
 
 * Cleanup of the RAZ/WI sysreg handling, which was pointlessly
   complicated
 
 * Fixes for the vgic placement in the IPA space, together with a
   bunch of selftests
 
 * More memcg accounting of the memory allocated on behalf of a guest
 
 * Timer and vgic selftests
 
 * Workarounds for the Apple M1 broken vgic implementation
 
 * KConfig cleanups
 
 * New kvmarm.mode=none option, for those who really dislike us
 
 RISC-V:
 * New KVM port.
 
 x86:
 * New API to control TSC offset from userspace
 
 * TSC scaling for nested hypervisors on SVM
 
 * Switch masterclock protection from raw_spin_lock to seqcount
 
 * Clean up function prototypes in the page fault code and avoid
 repeated memslot lookups
 
 * Convey the exit reason to userspace on emulation failure
 
 * Configure time between NX page recovery iterations
 
 * Expose Predictive Store Forwarding Disable CPUID leaf
 
 * Allocate page tracking data structures lazily (if the i915
 KVM-GT functionality is not compiled in)
 
 * Cleanups, fixes and optimizations for the shadow MMU code
 
 s390:
 * SIGP Fixes
 
 * initial preparations for lazy destroy of secure VMs
 
 * storage key improvements/fixes
 
 * Log the guest CPNC
 
 Starting from this release, KVM-PPC patches will come from
 Michael Ellerman's PPC tree.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGBOiEUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNowwf/axlx3g9sgCwQHr12/6UF/7hL/RwP
 9z+pGiUzjl2YQE+RjSvLqyd6zXh+h4dOdOKbZDLSkSTbcral/8U70ojKnQsXM0XM
 1LoymxBTJqkgQBLm9LjYreEbzrPV4irk4ygEmuk3CPOHZu8xX1ei6c5LdandtM/n
 XVUkXsQY+STkmnGv4P3GcPoDththCr0tBTWrFWtxa0w9hYOxx0ay1AZFlgM4FFX0
 QFuRc8VBLoDJpIUjbkhsIRIbrlHc/YDGjuYnAU7lV/CIME8vf2BW6uBwIZJdYcDj
 0ejozLjodEnuKXQGnc8sXFioLX2gbMyQJEvwCgRvUu/EU7ncFm1lfs7THQ==
 =UxKM
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "ARM:

   - More progress on the protected VM front, now with the full fixed
     feature set as well as the limitation of some hypercalls after
     initialisation.

   - Cleanup of the RAZ/WI sysreg handling, which was pointlessly
     complicated

   - Fixes for the vgic placement in the IPA space, together with a
     bunch of selftests

   - More memcg accounting of the memory allocated on behalf of a guest

   - Timer and vgic selftests

   - Workarounds for the Apple M1 broken vgic implementation

   - KConfig cleanups

   - New kvmarm.mode=none option, for those who really dislike us

  RISC-V:

   - New KVM port.

  x86:

   - New API to control TSC offset from userspace

   - TSC scaling for nested hypervisors on SVM

   - Switch masterclock protection from raw_spin_lock to seqcount

   - Clean up function prototypes in the page fault code and avoid
     repeated memslot lookups

   - Convey the exit reason to userspace on emulation failure

   - Configure time between NX page recovery iterations

   - Expose Predictive Store Forwarding Disable CPUID leaf

   - Allocate page tracking data structures lazily (if the i915 KVM-GT
     functionality is not compiled in)

   - Cleanups, fixes and optimizations for the shadow MMU code

  s390:

   - SIGP Fixes

   - initial preparations for lazy destroy of secure VMs

   - storage key improvements/fixes

   - Log the guest CPNC

  Starting from this release, KVM-PPC patches will come from Michael
  Ellerman's PPC tree"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
  RISC-V: KVM: fix boolreturn.cocci warnings
  RISC-V: KVM: remove unneeded semicolon
  RISC-V: KVM: Fix GPA passed to __kvm_riscv_hfence_gvma_xyz() functions
  RISC-V: KVM: Factor-out FP virtualization into separate sources
  KVM: s390: add debug statement for diag 318 CPNC data
  KVM: s390: pv: properly handle page flags for protected guests
  KVM: s390: Fix handle_sske page fault handling
  KVM: x86: SGX must obey the KVM_INTERNAL_ERROR_EMULATION protocol
  KVM: x86: On emulation failure, convey the exit reason, etc. to userspace
  KVM: x86: Get exit_reason as part of kvm_x86_ops.get_exit_info
  KVM: x86: Clarify the kvm_run.emulation_failure structure layout
  KVM: s390: Add a routine for setting userspace CPU state
  KVM: s390: Simplify SIGP Set Arch handling
  KVM: s390: pv: avoid stalls when making pages secure
  KVM: s390: pv: avoid stalls for kvm_s390_pv_init_vm
  KVM: s390: pv: avoid double free of sida page
  KVM: s390: pv: add macros for UVC CC values
  s390/mm: optimize reset_guest_reference_bit()
  s390/mm: optimize set_guest_storage_key()
  s390/mm: no need for pte_alloc_map_lock() if we know the pmd is present
  ...
2021-11-02 11:24:14 -07:00
Linus Torvalds
fc02cb2b37 Core:
- Remove socket skb caches
 
  - Add a SO_RESERVE_MEM socket op to forward allocate buffer space
    and avoid memory accounting overhead on each message sent
 
  - Introduce managed neighbor entries - added by control plane and
    resolved by the kernel for use in acceleration paths (BPF / XDP
    right now, HW offload users will benefit as well)
 
  - Make neighbor eviction on link down controllable by userspace
    to work around WiFi networks with bad roaming implementations
 
  - vrf: Rework interaction with netfilter/conntrack
 
  - fq_codel: implement L4S style ce_threshold_ect1 marking
 
  - sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
 
 BPF:
 
  - Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
    as implemented in LLVM14
 
  - Introduce bpf_get_branch_snapshot() to capture Last Branch Records
 
  - Implement variadic trace_printk helper
 
  - Add a new Bloomfilter map type
 
  - Track <8-byte scalar spill and refill
 
  - Access hw timestamp through BPF's __sk_buff
 
  - Disallow unprivileged BPF by default
 
  - Document BPF licensing
 
 Netfilter:
 
  - Introduce egress hook for looking at raw outgoing packets
 
  - Allow matching on and modifying inner headers / payload data
 
  - Add NFT_META_IFTYPE to match on the interface type either from
    ingress or egress
 
 Protocols:
 
  - Multi-Path TCP:
    - increase default max additional subflows to 2
    - rework forward memory allocation
    - add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
 
  - MCTP flow support allowing lower layer drivers to configure msg
    muxing as needed
 
  - Automatic Multicast Tunneling (AMT) driver based on RFC7450
 
  - HSR support the redbox supervision frames (IEC-62439-3:2018)
 
  - Support for the ip6ip6 encapsulation of IOAM
 
  - Netlink interface for CAN-FD's Transmitter Delay Compensation
 
  - Support SMC-Rv2 eliminating the current same-subnet restriction,
    by exploiting the UDP encapsulation feature of RoCE adapters
 
  - TLS: add SM4 GCM/CCM crypto support
 
  - Bluetooth: initial support for link quality and audio/codec
    offload
 
 Driver APIs:
 
  - Add a batched interface for RX buffer allocation in AF_XDP
    buffer pool
 
  - ethtool: Add ability to control transceiver modules' power mode
 
  - phy: Introduce supported interfaces bitmap to express MAC
    capabilities and simplify PHY code
 
  - Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
 
 New drivers:
 
  - WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
 
  - Ethernet driver for ASIX AX88796C SPI device (x88796c)
 
 Drivers:
 
  - Broadcom PHYs
    - support 72165, 7712 16nm PHYs
    - support IDDQ-SR for additional power savings
 
  - PHY support for QCA8081, QCA9561 PHYs
 
  - NXP DPAA2: support for IRQ coalescing
 
  - NXP Ethernet (enetc): support for software TCP segmentation
 
  - Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
    Gigabit-capable IP found on RZ/G2L SoC
 
  - Intel 100G Ethernet
    - support for eswitch offload of TC/OvS flow API, including
      offload of GRE, VxLAN, Geneve tunneling
    - support application device queues - ability to assign Rx and Tx
      queues to application threads
    - PTP and PPS (pulse-per-second) extensions
 
  - Broadcom Ethernet (bnxt)
    - devlink health reporting and device reload extensions
 
  - Mellanox Ethernet (mlx5)
    - offload macvlan interfaces
    - support HW offload of TC rules involving OVS internal ports
    - support HW-GRO and header/data split
    - support application device queues
 
  - Marvell OcteonTx2:
    - add XDP support for PF
    - add PTP support for VF
 
  - Qualcomm Ethernet switch (qca8k): support for QCA8328
 
  - Realtek Ethernet DSA switch (rtl8366rb)
    - support bridge offload
    - support STP, fast aging, disabling address learning
    - support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
 
  - Mellanox Ethernet/IB switch (mlxsw)
    - multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
    - offload root TBF qdisc as port shaper
    - support multiple routing interface MAC address prefixes
    - support for IP-in-IP with IPv6 underlay
 
  - MediaTek WiFi (mt76)
    - mt7921 - ASPM, 6GHz, SDIO and testmode support
    - mt7915 - LED and TWT support
 
  - Qualcomm WiFi (ath11k)
    - include channel rx and tx time in survey dump statistics
    - support for 80P80 and 160 MHz bandwidths
    - support channel 2 in 6 GHz band
    - spectral scan support for QCN9074
    - support for rx decapsulation offload (data frames in 802.3
      format)
 
  - Qualcomm phone SoC WiFi (wcn36xx)
    - enable Idle Mode Power Save (IMPS) to reduce power consumption
      during idle
 
  - Bluetooth driver support for MediaTek MT7922 and MT7921
 
  - Enable support for AOSP Bluetooth extension in Qualcomm WCN399x
    and Realtek 8822C/8852A
 
  - Microsoft vNIC driver (mana)
    - support hibernation and kexec
 
  - Google vNIC driver (gve)
    - support for jumbo frames
    - implement Rx page reuse
 
 Refactor:
 
  - Make all writes to netdev->dev_addr go thru helpers, so that we
    can add this address to the address rbtree and handle the updates
 
  - Various TCP cleanups and optimizations including improvements
    to CPU cache use
 
  - Simplify the gnet_stats, Qdisc stats' handling and remove
    qdisc->running sequence counter
 
  - Driver changes and API updates to address devlink locking
    deficiencies
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGAzX4ACgkQMUZtbf5S
 IrvW3g//Q0ZLrOuHK9pZ8sCXMMhDj8qL6ajm0otMddHWA/+1UglwVBKFhsajfxOf
 wJ/5LZis+XKLpLqKTU5chKVfn39HuDGe/D3l+egi01Gv5BW0+XzEhagfyR5tJX5z
 wsGG5CXO/we/laVSzRiFtwwVEKHKN20YC+tIQwYOYP5Wy3q4G7qDsFhT7GqgsGCS
 n74QUEAIB5Tz0ODWFqLtbsySzIurXrskibwt5T9bvAAlPw/lCU68mmG+NVJ7VddO
 lBbNkLMOo8yW9Ci20H09SrYd4jZTmMARo9tsFO1tAvAMk7qpn0Wd8pnOYTjFFoMD
 +qjiFSVMh7E0JGb8Y7NCvwaB99suAK5rfGP68Xwe62DfP7vYWEx4pZGxBP19F4ld
 6Kn1ME33BX9rUF9tBecf0bdKfJUwB2Q2Xou/b9laG04bwiqsc9iG5FQq1C46lnLZ
 QdzNiS1My4dJMczkWt66HF3Kx30ibwHfvKMIHjf4PqkzEatkv6Y6SBZ57KXL+Lde
 0BQSFhbf0tm2Gf55etzrczLElI3uqHSFWUNZZ2Bt6WmzO1e6tpV9nAtRWF4C/dFg
 QDpLJtOOOY65uq+qz09zoPfv2lem868SrCAuFrVn99bEpYjx/CGNFDeEI02l6jyr
 84eUxd364UcbIk3fc+eTGdXHLQNVk30G0AHVBBxaWNIidwfqXeE=
 =srde
 -----END PGP SIGNATURE-----

Merge tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - Remove socket skb caches

   - Add a SO_RESERVE_MEM socket op to forward allocate buffer space and
     avoid memory accounting overhead on each message sent

   - Introduce managed neighbor entries - added by control plane and
     resolved by the kernel for use in acceleration paths (BPF / XDP
     right now, HW offload users will benefit as well)

   - Make neighbor eviction on link down controllable by userspace to
     work around WiFi networks with bad roaming implementations

   - vrf: Rework interaction with netfilter/conntrack

   - fq_codel: implement L4S style ce_threshold_ect1 marking

   - sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()

  BPF:

   - Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
     as implemented in LLVM14

   - Introduce bpf_get_branch_snapshot() to capture Last Branch Records

   - Implement variadic trace_printk helper

   - Add a new Bloomfilter map type

   - Track <8-byte scalar spill and refill

   - Access hw timestamp through BPF's __sk_buff

   - Disallow unprivileged BPF by default

   - Document BPF licensing

  Netfilter:

   - Introduce egress hook for looking at raw outgoing packets

   - Allow matching on and modifying inner headers / payload data

   - Add NFT_META_IFTYPE to match on the interface type either from
     ingress or egress

  Protocols:

   - Multi-Path TCP:
      - increase default max additional subflows to 2
      - rework forward memory allocation
      - add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS

   - MCTP flow support allowing lower layer drivers to configure msg
     muxing as needed

   - Automatic Multicast Tunneling (AMT) driver based on RFC7450

   - HSR support the redbox supervision frames (IEC-62439-3:2018)

   - Support for the ip6ip6 encapsulation of IOAM

   - Netlink interface for CAN-FD's Transmitter Delay Compensation

   - Support SMC-Rv2 eliminating the current same-subnet restriction, by
     exploiting the UDP encapsulation feature of RoCE adapters

   - TLS: add SM4 GCM/CCM crypto support

   - Bluetooth: initial support for link quality and audio/codec offload

  Driver APIs:

   - Add a batched interface for RX buffer allocation in AF_XDP buffer
     pool

   - ethtool: Add ability to control transceiver modules' power mode

   - phy: Introduce supported interfaces bitmap to express MAC
     capabilities and simplify PHY code

   - Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks

  New drivers:

   - WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)

   - Ethernet driver for ASIX AX88796C SPI device (x88796c)

  Drivers:

   - Broadcom PHYs
      - support 72165, 7712 16nm PHYs
      - support IDDQ-SR for additional power savings

   - PHY support for QCA8081, QCA9561 PHYs

   - NXP DPAA2: support for IRQ coalescing

   - NXP Ethernet (enetc): support for software TCP segmentation

   - Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
     Gigabit-capable IP found on RZ/G2L SoC

   - Intel 100G Ethernet
      - support for eswitch offload of TC/OvS flow API, including
        offload of GRE, VxLAN, Geneve tunneling
      - support application device queues - ability to assign Rx and Tx
        queues to application threads
      - PTP and PPS (pulse-per-second) extensions

   - Broadcom Ethernet (bnxt)
      - devlink health reporting and device reload extensions

   - Mellanox Ethernet (mlx5)
      - offload macvlan interfaces
      - support HW offload of TC rules involving OVS internal ports
      - support HW-GRO and header/data split
      - support application device queues

   - Marvell OcteonTx2:
      - add XDP support for PF
      - add PTP support for VF

   - Qualcomm Ethernet switch (qca8k): support for QCA8328

   - Realtek Ethernet DSA switch (rtl8366rb)
      - support bridge offload
      - support STP, fast aging, disabling address learning
      - support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch

   - Mellanox Ethernet/IB switch (mlxsw)
      - multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
      - offload root TBF qdisc as port shaper
      - support multiple routing interface MAC address prefixes
      - support for IP-in-IP with IPv6 underlay

   - MediaTek WiFi (mt76)
      - mt7921 - ASPM, 6GHz, SDIO and testmode support
      - mt7915 - LED and TWT support

   - Qualcomm WiFi (ath11k)
      - include channel rx and tx time in survey dump statistics
      - support for 80P80 and 160 MHz bandwidths
      - support channel 2 in 6 GHz band
      - spectral scan support for QCN9074
      - support for rx decapsulation offload (data frames in 802.3
        format)

   - Qualcomm phone SoC WiFi (wcn36xx)
      - enable Idle Mode Power Save (IMPS) to reduce power consumption
        during idle

   - Bluetooth driver support for MediaTek MT7922 and MT7921

   - Enable support for AOSP Bluetooth extension in Qualcomm WCN399x and
     Realtek 8822C/8852A

   - Microsoft vNIC driver (mana)
      - support hibernation and kexec

   - Google vNIC driver (gve)
      - support for jumbo frames
      - implement Rx page reuse

  Refactor:

   - Make all writes to netdev->dev_addr go thru helpers, so that we can
     add this address to the address rbtree and handle the updates

   - Various TCP cleanups and optimizations including improvements to
     CPU cache use

   - Simplify the gnet_stats, Qdisc stats' handling and remove
     qdisc->running sequence counter

   - Driver changes and API updates to address devlink locking
     deficiencies"

* tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2122 commits)
  Revert "net: avoid double accounting for pure zerocopy skbs"
  selftests: net: add arp_ndisc_evict_nocarrier
  net: ndisc: introduce ndisc_evict_nocarrier sysctl parameter
  net: arp: introduce arp_evict_nocarrier sysctl parameter
  libbpf: Deprecate AF_XDP support
  kbuild: Unify options for BTF generation for vmlinux and modules
  selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
  bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
  bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
  net: vmxnet3: remove multiple false checks in vmxnet3_ethtool.c
  net: avoid double accounting for pure zerocopy skbs
  tcp: rename sk_wmem_free_skb
  netdevsim: fix uninit value in nsim_drv_configure_vfs()
  selftests/bpf: Fix also no-alu32 strobemeta selftest
  bpf: Add missing map_delete_elem method to bloom filter map
  selftests/bpf: Add bloom map success test for userspace calls
  bpf: Add alignment padding for "map_extra" + consolidate holes
  bpf: Bloom filter map naming fixups
  selftests/bpf: Add test cases for struct_ops prog
  bpf: Add dummy BPF STRUCT_OPS for test purpose
  ...
2021-11-02 06:20:58 -07:00
Linus Torvalds
d2fac0afe8 audit/stable-5.16 PR 20211101
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmGANdUUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOmihAAgKSTv4Jf0s4yopdcxfuLweiyqHX1
 719QJzdLZohmllrJPq/83FZL9qodCzxy87nAm67Ht0baSKiEjtVgRaVCqJWEE+l6
 oQL+wUsGLP7CmExOP503Uh6tW35AhETQA4Uwu6QtiUYLYG17kAgeR3cTFuekUsJS
 iL4K65PXE2bBxMe7Ta1YIZqcxptbknMgpqYkdne7xs7RS+UiVj8TyRle6ACrfzEX
 IVy4LTk+spHCy1a494g9pt/21xOnbiLHr/FpckALscnvJiUThxbfQHGSQeMpM4uM
 BnwCqFrj860vMeh52M11/GAAXmdPh6AjoLhaSIW2I3M2GbV8ZP2hu1HYUz3osmrT
 f+aeMPJ4feX1xVj6qAC+1G83XRO83tP/YIEuocGiwyepImB25NHPin21xepf6Ru0
 wJX+aXC9O1eG6E2ghT6tBim/MpeNH5OT0hNO3uhGmEQ6xZpArRVVaBwlEdufJiCx
 ZljqEFUT7wA9nGEQif6GdLnGezGr/aNL65caTkIAzHKamd79QIr7VZXYjYIfHSqE
 p74Aro6E8qoQJjsTSkvZceM0u1LRzwS4wPRroE6eGz98oYDpiDm1RPb+9Gw5jyJf
 JN7UjJKO9+iPGAi3KivGBqpBskw4cCp2y/nHrMYmpGUPELcr5kQtDfQ6yp59tVZ8
 Dwo5GeSlG6khmiI=
 =WrEw
 -----END PGP SIGNATURE-----

Merge tag 'audit-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit

Pull audit updates from Paul Moore:
 "Add some additional audit logging to capture the openat2() syscall
  open_how struct info.

  Previous variations of the open()/openat() syscalls allowed audit
  admins to inspect the syscall args to get the information contained in
  the new open_how struct used in openat2()"

* tag 'audit-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: return early if the filter rule has a lower priority
  audit: add OPENAT2 record to list "how" info
  audit: add support for the openat2 syscall
  audit: replace magic audit syscall class numbers with macros
  lsm_audit: avoid overloading the "key" audit field
  audit: Convert to SPDX identifier
  audit: rename struct node to struct audit_node to prevent future name collisions
2021-11-01 21:17:39 -07:00
Linus Torvalds
cdab10bf32 selinux/stable-5.16 PR 20211101
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmGANbAUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNaMBAAg+9gZr0F7xiafu8JFZqZfx/AQdJ2
 G2cn3le+/tXGZmF8m/+82lOaR6LeQLatgSDJNSkXWkKr0nRwseQJDbtRfvYJdn0t
 Ax05/Fmz6OGxQ2wgRYgaFiSrKpE5p3NhDtiLFVdkCJaQNe/8DZOc7NhBl6EjZf3x
 ubhl2hUiJ4AmiXGwcYhr4uKgP4nhW8OM1/OkskVi+bBMmLA8KTY9kslmIDP5E3BW
 29W4qhqeLNQupY5dGMEMVcyxY9ZUWpO39q4uOaQVZrUGE7xABkj/jhnxT5gFTSlI
 pu8VhsYXm9KuRVveIsv0L5SZfadwoM9YAl7ki1wD3W5rHqOAte3rBTm6VmNlQwfU
 MqxP65Jiyxudxet5Be3/dCRH/+MDQuwBxivgmZXbeVxor2SeznVb0GDaEUC5FSHu
 CJIgWtQzsPJMxgAEGXN4F3QGP0htTTJni56GUPOsrf4TIBW02TT+oLTLFRIokQQL
 INNOfwVSRXElnCsvxsHR4oB+JZ9pJyBaAmeupcQ6jmcKiWlbLj4s+W0U0pM5h91v
 hmMpz7KMxrX6gVL4gB2Jj4aN3r5YRbq26NBu6D+wdwwBTeTTocaHSpAqkv4buClf
 uNk3cG8Hkp8TTg9cM8jYgpxMyzKH/AI/Uw3VhEa1xCiq2Ck3DgfnZvnvcRRaZevU
 FPgmwgqePJXGi60=
 =sb8J
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:

 - Add LSM/SELinux/Smack controls and auditing for io-uring.

   As usual, the individual commit descriptions have more detail, but we
   were basically missing two things which we're adding here:

      + establishment of a proper audit context so that auditing of
        io-uring ops works similarly to how it does for syscalls (with
        some io-uring additions because io-uring ops are *not* syscalls)

      + additional LSM hooks to enable access control points for some of
        the more unusual io-uring features, e.g. credential overrides.

   The additional audit callouts and LSM hooks were done in conjunction
   with the io-uring folks, based on conversations and RFC patches
   earlier in the year.

 - Fixup the binder credential handling so that the proper credentials
   are used in the LSM hooks; the commit description and the code
   comment which is removed in these patches are helpful to understand
   the background and why this is the proper fix.

 - Enable SELinux genfscon policy support for securityfs, allowing
   improved SELinux filesystem labeling for other subsystems which make
   use of securityfs, e.g. IMA.

* tag 'selinux-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  security: Return xattr name from security_dentry_init_security()
  selinux: fix a sock regression in selinux_ip_postroute_compat()
  binder: use cred instead of task for getsecid
  binder: use cred instead of task for selinux checks
  binder: use euid from cred instead of using task
  LSM: Avoid warnings about potentially unused hook variables
  selinux: fix all of the W=1 build warnings
  selinux: make better use of the nf_hook_state passed to the NF hooks
  selinux: fix race condition when computing ocontext SIDs
  selinux: remove unneeded ipv6 hook wrappers
  selinux: remove the SELinux lockdown implementation
  selinux: enable genfscon labeling for securityfs
  Smack: Brutalist io_uring support
  selinux: add support for the io_uring access controls
  lsm,io_uring: add LSM hooks to io_uring
  io_uring: convert io_uring to the secure anon inode interface
  fs: add anon_inode_getfile_secure() similar to anon_inode_getfd_secure()
  audit: add filtering for io_uring records
  audit,io_uring,io-wq: add some basic audit support to io_uring
  audit: prepare audit_context for use in calling contexts beyond syscalls
2021-11-01 21:06:18 -07:00
Jakub Kicinski
b7b98f8689 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-11-01

We've added 181 non-merge commits during the last 28 day(s) which contain
a total of 280 files changed, 11791 insertions(+), 5879 deletions(-).

The main changes are:

1) Fix bpf verifier propagation of 64-bit bounds, from Alexei.

2) Parallelize bpf test_progs, from Yucong and Andrii.

3) Deprecate various libbpf apis including af_xdp, from Andrii, Hengqi, Magnus.

4) Improve bpf selftests on s390, from Ilya.

5) bloomfilter bpf map type, from Joanne.

6) Big improvements to JIT tests especially on Mips, from Johan.

7) Support kernel module function calls from bpf, from Kumar.

8) Support typeless and weak ksym in light skeleton, from Kumar.

9) Disallow unprivileged bpf by default, from Pawan.

10) BTF_KIND_DECL_TAG support, from Yonghong.

11) Various bpftool cleanups, from Quentin.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (181 commits)
  libbpf: Deprecate AF_XDP support
  kbuild: Unify options for BTF generation for vmlinux and modules
  selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
  bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
  bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
  selftests/bpf: Fix also no-alu32 strobemeta selftest
  bpf: Add missing map_delete_elem method to bloom filter map
  selftests/bpf: Add bloom map success test for userspace calls
  bpf: Add alignment padding for "map_extra" + consolidate holes
  bpf: Bloom filter map naming fixups
  selftests/bpf: Add test cases for struct_ops prog
  bpf: Add dummy BPF STRUCT_OPS for test purpose
  bpf: Factor out helpers for ctx access checking
  bpf: Factor out a helper to prepare trampoline for struct_ops prog
  selftests, bpf: Fix broken riscv build
  riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h
  tools, build: Add RISC-V to HOSTARCH parsing
  riscv, bpf: Increase the maximum number of iterations
  selftests, bpf: Add one test for sockmap with strparser
  selftests, bpf: Fix test_txmsg_ingress_parser error
  ...
====================

Link: https://lore.kernel.org/r/20211102013123.9005-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-01 19:59:46 -07:00
James Prestwood
18ac597af2 net: ndisc: introduce ndisc_evict_nocarrier sysctl parameter
In most situations the neighbor discovery cache should be cleared on a
NOCARRIER event which is currently done unconditionally. But for wireless
roams the neighbor discovery cache can and should remain intact since
the underlying network has not changed.

This patch introduces a sysctl option ndisc_evict_nocarrier which can
be disabled by a wireless supplicant during a roam. This allows packets
to be sent after a roam immediately without having to wait for
neighbor discovery.

A user reported roughly a 1 second delay after a roam before packets
could be sent out (note, on IPv4). This delay was due to the ARP
cache being cleared. During testing of this same scenario using IPv6
no delay was noticed, but regardless there is no reason to clear
the ndisc cache for wireless roams.

Signed-off-by: James Prestwood <prestwoj@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-01 19:57:14 -07:00
James Prestwood
fcdb44d08a net: arp: introduce arp_evict_nocarrier sysctl parameter
This change introduces a new sysctl parameter, arp_evict_nocarrier.
When set (default) the ARP cache will be cleared on a NOCARRIER event.
This new option has been defaulted to '1' which maintains existing
behavior.

Clearing the ARP cache on NOCARRIER is relatively new, introduced by:

commit 859bd2ef1f
Author: David Ahern <dsahern@gmail.com>
Date:   Thu Oct 11 20:33:49 2018 -0700

    net: Evict neighbor entries on carrier down

The reason for this changes is to prevent the ARP cache from being
cleared when a wireless device roams. Specifically for wireless roams
the ARP cache should not be cleared because the underlying network has not
changed. Clearing the ARP cache in this case can introduce significant
delays sending out packets after a roam.

A user reported such a situation here:

https://lore.kernel.org/linux-wireless/CACsRnHWa47zpx3D1oDq9JYnZWniS8yBwW1h0WAVZ6vrbwL_S0w@mail.gmail.com/

After some investigation it was found that the kernel was holding onto
packets until ARP finished which resulted in this 1 second delay. It
was also found that the first ARP who-has was never responded to,
which is actually what caues the delay. This change is more or less
working around this behavior, but again, there is no reason to clear
the cache on a roam anyways.

As for the unanswered who-has, we know the packet made it OTA since
it was seen while monitoring. Why it never received a response is
unknown. In any case, since this is a problem on the AP side of things
all that can be done is to work around it until it is solved.

Some background on testing/reproducing the packet delay:

Hardware:
 - 2 access points configured for Fast BSS Transition (Though I don't
   see why regular reassociation wouldn't have the same behavior)
 - Wireless station running IWD as supplicant
 - A device on network able to respond to pings (I used one of the APs)

Procedure:
 - Connect to first AP
 - Ping once to establish an ARP entry
 - Start a tcpdump
 - Roam to second AP
 - Wait for operstate UP event, and note the timestamp
 - Start pinging

Results:

Below is the tcpdump after UP. It was recorded the interface went UP at
10:42:01.432875.

10:42:01.461871 ARP, Request who-has 192.168.254.1 tell 192.168.254.71, length 28
10:42:02.497976 ARP, Request who-has 192.168.254.1 tell 192.168.254.71, length 28
10:42:02.507162 ARP, Reply 192.168.254.1 is-at ac:86:74:55:b0:20, length 46
10:42:02.507185 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 1, length 64
10:42:02.507205 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 2, length 64
10:42:02.507212 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 3, length 64
10:42:02.507219 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 4, length 64
10:42:02.507225 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 5, length 64
10:42:02.507232 IP 192.168.254.71 > 192.168.254.1: ICMP echo request, id 52792, seq 6, length 64
10:42:02.515373 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 1, length 64
10:42:02.521399 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 2, length 64
10:42:02.521612 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 3, length 64
10:42:02.521941 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 4, length 64
10:42:02.522419 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 5, length 64
10:42:02.523085 IP 192.168.254.1 > 192.168.254.71: ICMP echo reply, id 52792, seq 6, length 64

You can see the first ARP who-has went out very quickly after UP, but
was never responded to. Nearly a second later the kernel retries and
gets a response. Only then do the ping packets go out. If an ARP entry
is manually added prior to UP (after the cache is cleared) it is seen
that the first ping is never responded to, so its not only an issue with
ARP but with data packets in general.

As mentioned prior, the wireless interface was also monitored to verify
the ping/ARP packet made it OTA which was observed to be true.

Signed-off-by: James Prestwood <prestwoj@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-01 19:57:14 -07:00
Linus Torvalds
316b7eaa93 Updates for the IPMI driver
A new type of low-level IPMI driver is added for direct communication
 over the IPMI message bus without a BMC between the driver and the bus.
 
 Other than that, lots of little bug fixes and enhancements.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE/Q1c5nzg9ZpmiCaGYfOMkJGb/4EFAmF/5YwACgkQYfOMkJGb
 /4GV5A/8CSXagtu6JJTY88+7CNxXSca4fIGcpO8RZUA1b+Hkw3+y/Qlv/djK6bk1
 lt16OpMxSYggTXtkfH6f8zRnf5jAbXDIrcdYAZdCdIt/u4zgXksa8XycUgYDcVS8
 Tb07qO55HsNwjjjPglIUXqohkpyuilJLpBDhWW+AWNDs2niAwFFpFoBTDnDbZEe1
 6ide1f/IBhibwEx4T4nGs45VKUB+jHsORSj8+fB2suREgs0GTGf51E9NyR91CYyk
 uSkqpeHVN85KzTWAeAV5nCfRab/rBVS/EL8tEQEt1LniXZpJs3bJHLJFhgaQWesB
 6b2hVdL8FoG5qO1Xcf2ofbLKdrY+AOWW8gBWfsZmsK2dQYex+CHO7bm2BkRt037i
 YJdg5+qXz093uDCotYZYvGjm+TTmrTeFVhEJD6+CW0AyVeSAqBB3TBxqDpBn3LfN
 p2sTUS/1BZuChe0VXPXcDkkUFsVs3J+Z4lFl0f/WzTFUxWZEC/ELh/3Nuz7J2eYy
 gEXWqvox8BAU8byYucmLNb7xelFztjXpga+WSDk7kEIIStM6aeIVKenpYI/KJCr/
 jZh0M/zYUk4QVMf54nsWEfWOiKi7D4UyE+C3Ijta2UordLTetLeY40HVZqyPwbPE
 AMDJH2MhmIr5soIw1yMVP0cp+6CWatBgnrHcXSbBc8ZpJEJzCNU=
 =WxcC
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.16-1' of https://github.com/cminyard/linux-ipmi

Pull IPMI driver updates from Corey Minyard:
 "A new type of low-level IPMI driver is added for direct communication
  over the IPMI message bus without a BMC between the driver and the
  bus.

  Other than that, lots of little bug fixes and enhancements"

* tag 'for-linus-5.16-1' of https://github.com/cminyard/linux-ipmi:
  ipmi: kcs_bmc: Fix a memory leak in the error handling path of 'kcs_bmc_serio_add_device()'
  char: ipmi: replace snprintf in show functions with sysfs_emit
  ipmi: ipmb: fix dependencies to eliminate build error
  ipmi:ipmb: Add OF support
  ipmi: bt: Add ast2600 compatible string
  ipmi: bt-bmc: Use registers directly
  ipmi: ipmb: Fix off-by-one size check on rcvlen
  ipmi:ssif: Use depends on, not select, for I2C
  ipmi: Add docs for the IPMI IPMB driver
  ipmi: Add docs for IPMB direct addressing
  ipmi:ipmb: Add initial support for IPMI over IPMB
  ipmi: Add support for IPMB direct messages
  ipmi: Export ipmb_checksum()
  ipmi: Fix a typo
  ipmi: Check error code before processing BMC response
  ipmi:devintf: Return a proper error when recv buffer too small
  ipmi: Disable some operations during a panic
  ipmi:watchdog: Set panic count to proper value on a panic
2021-11-01 18:53:03 -07:00
Linus Torvalds
73d21a3579 media updates for v5.16-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmF/qI0ACgkQCF8+vY7k
 4RWt1A//QSrYq6cxSzdEB+Jr6sZ4ToD2Lhp+Iq4vBMjeWljmIXLmsez7JHqTw2us
 sYocQS4tHUht6Y3T+ZV7dM2KWzKZfv18qKz1h2u36QuEfGN8y3PYGcBF4MQMmVpw
 gznqZb8bWS2onIDKSwn5BW7VRmc/HwRk0k9lXwJ7RuN7UX6zkAWhMUlEA7MiztfF
 DeTQ7KTWqc6v3xzYZ8BHBSOm+GPxvG7oiZv839cFfkNWsgz2ETJYBUvBQdgoWvbC
 uE5P+0/U+VoksRCuKzItltJQKaUsrXPDKODtA/UvWFr8NfTN7SAa8EsWbzC/cgBo
 8caHuhz++aYCfKZ6+rpwZ9oRY0BTnVXI52K0qC+NtsybU4pIDueZqwcYfo6bZr46
 XQXoFELOV6wkC7dHZ0+pFC3Lrqm5qXM3tmtITHOk8Xp1aPAzvTMMj4+mL3CawnXZ
 YH+ViKfadGdQ7gQb6ybbxwl1bvhGdSd1Pg8twYwO8yo3UHl2KI1iPKg7s9sDJupg
 npUUFNacyuv1+cSgEy4xqtiO+j6eoRBjYoj/Nv2D8tF8mJ3IZsf2JS+aIVqEAvg0
 d7M1IEFdvsPxgwnFMoOInr3mOZf2om6gTXVY71y01S1EbSfOqv/UG9mVZS1VMhJK
 Z3oigVDsujYJbGRWRomQTNhNxlsgpvGf3dTYSViLAH05CgtEIY0=
 =eLup
 -----END PGP SIGNATURE-----

Merge tag 'media/v5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:

 - New driver for SK Hynix Hi-846 8M pixel camera

 - New driver for the ov13b10 camera

 - New driver for Renesas R-Car ISP

 - mtk-vcodec gained support for version 2 of decoder firmware ABI

 - The legacy sir_ir driver got removed

 - videobuf2: the vb2_mem_ops kAPI had some improvements

 - lots of cleanups, fixes and new features at device drivers

* tag 'media/v5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (328 commits)
  media: venus: core: Add sdm660 DT compatible and resource struct
  media: dt-bindings: media: venus: Add sdm660 dt schema
  media: venus: vdec: decoded picture buffer handling during reconfig sequence
  media: venus: Handle fatal errors during encoding and decoding
  media: venus: helpers: Add helper to mark fatal vb2 error
  media: venus: hfi: Check for sys error on session hfi functions
  media: venus: Make sys_error flag an atomic bitops
  media: venus: venc: Use pmruntime autosuspend
  media: allegro: write vui parameters for HEVC
  media: allegro: nal-hevc: implement generator for vui
  media: allegro: write correct colorspace into SPS
  media: allegro: extract nal value lookup functions to header
  media: allegro: correctly scale the bit rate in SPS
  media: allegro: remove external QP table
  media: allegro: fix row and column in response message
  media: allegro: add control to disable encoder buffer
  media: allegro: add encoder buffer support
  media: allegro: add pm_runtime support
  media: allegro: lookup VCU settings
  media: allegro: fix module removal if initialization failed
  ...
2021-11-01 18:45:08 -07:00
Linus Torvalds
2dc26d98cf overflow updates for v5.16-rc1
The end goal of the current buffer overflow detection work[0] is to gain
 full compile-time and run-time coverage of all detectable buffer overflows
 seen via array indexing or memcpy(), memmove(), and memset(). The str*()
 family of functions already have full coverage.
 
 While much of the work for these changes have been on-going for many
 releases (i.e. 0-element and 1-element array replacements, as well as
 avoiding false positives and fixing discovered overflows[1]), this series
 contains the foundational elements of several related buffer overflow
 detection improvements by providing new common helpers and FORTIFY_SOURCE
 changes needed to gain the introspection required for compiler visibility
 into array sizes. Also included are a handful of already Acked instances
 using the helpers (or related clean-ups), with many more waiting at the
 ready to be taken via subsystem-specific trees[2]. The new helpers are:
 
 - struct_group() for gaining struct member range introspection.
 - memset_after() and memset_startat() for clearing to the end of structures.
 - DECLARE_FLEX_ARRAY() for using flex arrays in unions or alone in structs.
 
 Also included is the beginning of the refactoring of FORTIFY_SOURCE to
 support memcpy() introspection, fix missing and regressed coverage under
 GCC, and to prepare to fix the currently broken Clang support. Finishing
 this work is part of the larger series[0], but depends on all the false
 positives and buffer overflow bug fixes to have landed already and those
 that depend on this series to land.
 
 As part of the FORTIFY_SOURCE refactoring, a set of both a compile-time
 and run-time tests are added for FORTIFY_SOURCE and the mem*()-family
 functions respectively. The compile time tests have found a legitimate
 (though corner-case) bug[6] already.
 
 Please note that the appearance of "panic" and "BUG" in the
 FORTIFY_SOURCE refactoring are the result of relocating existing code,
 and no new use of those code-paths are expected nor desired.
 
 Finally, there are two tree-wide conversions for 0-element arrays and
 flexible array unions to gain sane compiler introspection coverage that
 result in no known object code differences.
 
 After this series (and the changes that have now landed via netdev
 and usb), we are very close to finally being able to build with
 -Warray-bounds and -Wzero-length-bounds. However, due corner cases in
 GCC[3] and Clang[4], I have not included the last two patches that turn
 on these options, as I don't want to introduce any known warnings to
 the build. Hopefully these can be solved soon.
 
 [0] https://lore.kernel.org/lkml/20210818060533.3569517-1-keescook@chromium.org/
 [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=FORTIFY_SOURCE
 [2] https://lore.kernel.org/lkml/202108220107.3E26FE6C9C@keescook/
 [3] https://lore.kernel.org/lkml/3ab153ec-2798-da4c-f7b1-81b0ac8b0c5b@roeck-us.net/
 [4] https://bugs.llvm.org/show_bug.cgi?id=51682
 [5] https://lore.kernel.org/lkml/202109051257.29B29745C0@keescook/
 [6] https://lore.kernel.org/lkml/20211020200039.170424-1-keescook@chromium.org/
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmGAFWcWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJmKFD/45MJdnvW5MhIEeW5tc5UjfcIPS
 ae+YvlEX/2ZwgSlTxocFVocE6hz7b6eCiX3dSAChPkPxsSfgeiuhjxsU+4ROnELR
 04RqTA/rwT6JXfJcXbDPXfxDL4huUkgktAW3m1sT771AZspeap2GrSwFyttlTqKA
 +kTiZ3lXJVFcw10uyhfp3Lk6eFJxdf5iOjuEou5kBOQfpNKEOduRL2K15hSowOwB
 lARiAC+HbmN+E+npvDE7YqK4V7ZQ0/dtB0BlfqgTkn1spQz8N21kBAMpegV5vvIk
 A+qGHc7q2oyk4M14TRTidQHGQ4juW1Kkvq3NV6KzwQIVD+mIfz0ESn3d4tnp28Hk
 Y+OXTI1BRFlApQU9qGWv33gkNEozeyqMLDRLKhDYRSFPA9UKkpgXQRzeTzoLKyrQ
 4B6n5NnUGcu7I6WWhpyZQcZLDsHGyy0vHzjQGs/NXtb1PzXJ5XIGuPdmx9pVMykk
 IVKnqRcWyGWahfh3asOnoXvdhi1No4NSHQ/ZHfUM+SrIGYjBMaUisw66qm3Fe8ZU
 lbO2CFkCsfGSoKNPHf0lUEGlkyxAiDolazOfflDNxdzzlZo2X1l/a7O/yoO4Pqul
 cdL0eDjiNoQ2YR2TSYPnXq5KSL1RI0tlfS8pH8k1hVhZsQx0wpAQ+qki0S+fLePV
 PdA9XB82G2tmqKc9cQ==
 =9xbT
 -----END PGP SIGNATURE-----

Merge tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull overflow updates from Kees Cook:
 "The end goal of the current buffer overflow detection work[0] is to
  gain full compile-time and run-time coverage of all detectable buffer
  overflows seen via array indexing or memcpy(), memmove(), and
  memset(). The str*() family of functions already have full coverage.

  While much of the work for these changes have been on-going for many
  releases (i.e. 0-element and 1-element array replacements, as well as
  avoiding false positives and fixing discovered overflows[1]), this
  series contains the foundational elements of several related buffer
  overflow detection improvements by providing new common helpers and
  FORTIFY_SOURCE changes needed to gain the introspection required for
  compiler visibility into array sizes. Also included are a handful of
  already Acked instances using the helpers (or related clean-ups), with
  many more waiting at the ready to be taken via subsystem-specific
  trees[2].

  The new helpers are:

   - struct_group() for gaining struct member range introspection

   - memset_after() and memset_startat() for clearing to the end of
     structures

   - DECLARE_FLEX_ARRAY() for using flex arrays in unions or alone in
     structs

  Also included is the beginning of the refactoring of FORTIFY_SOURCE to
  support memcpy() introspection, fix missing and regressed coverage
  under GCC, and to prepare to fix the currently broken Clang support.
  Finishing this work is part of the larger series[0], but depends on
  all the false positives and buffer overflow bug fixes to have landed
  already and those that depend on this series to land.

  As part of the FORTIFY_SOURCE refactoring, a set of both a
  compile-time and run-time tests are added for FORTIFY_SOURCE and the
  mem*()-family functions respectively. The compile time tests have
  found a legitimate (though corner-case) bug[6] already.

  Please note that the appearance of "panic" and "BUG" in the
  FORTIFY_SOURCE refactoring are the result of relocating existing code,
  and no new use of those code-paths are expected nor desired.

  Finally, there are two tree-wide conversions for 0-element arrays and
  flexible array unions to gain sane compiler introspection coverage
  that result in no known object code differences.

  After this series (and the changes that have now landed via netdev and
  usb), we are very close to finally being able to build with
  -Warray-bounds and -Wzero-length-bounds.

  However, due corner cases in GCC[3] and Clang[4], I have not included
  the last two patches that turn on these options, as I don't want to
  introduce any known warnings to the build. Hopefully these can be
  solved soon"

Link: https://lore.kernel.org/lkml/20210818060533.3569517-1-keescook@chromium.org/ [0]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=FORTIFY_SOURCE [1]
Link: https://lore.kernel.org/lkml/202108220107.3E26FE6C9C@keescook/ [2]
Link: https://lore.kernel.org/lkml/3ab153ec-2798-da4c-f7b1-81b0ac8b0c5b@roeck-us.net/ [3]
Link: https://bugs.llvm.org/show_bug.cgi?id=51682 [4]
Link: https://lore.kernel.org/lkml/202109051257.29B29745C0@keescook/ [5]
Link: https://lore.kernel.org/lkml/20211020200039.170424-1-keescook@chromium.org/ [6]

* tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (30 commits)
  fortify: strlen: Avoid shadowing previous locals
  compiler-gcc.h: Define __SANITIZE_ADDRESS__ under hwaddress sanitizer
  treewide: Replace 0-element memcpy() destinations with flexible arrays
  treewide: Replace open-coded flex arrays in unions
  stddef: Introduce DECLARE_FLEX_ARRAY() helper
  btrfs: Use memset_startat() to clear end of struct
  string.h: Introduce memset_startat() for wiping trailing members and padding
  xfrm: Use memset_after() to clear padding
  string.h: Introduce memset_after() for wiping trailing members/padding
  lib: Introduce CONFIG_MEMCPY_KUNIT_TEST
  fortify: Add compile-time FORTIFY_SOURCE tests
  fortify: Allow strlen() and strnlen() to pass compile-time known lengths
  fortify: Prepare to improve strnlen() and strlen() warnings
  fortify: Fix dropped strcpy() compile-time write overflow check
  fortify: Explicitly disable Clang support
  fortify: Move remaining fortify helpers into fortify-string.h
  lib/string: Move helper functions out of string.c
  compiler_types.h: Remove __compiletime_object_size()
  cm4000_cs: Use struct_group() to zero struct cm4000_dev region
  can: flexcan: Use struct_group() to zero struct flexcan_regs regions
  ...
2021-11-01 17:12:56 -07:00
Joanne Koong
8845b4681b bpf: Add alignment padding for "map_extra" + consolidate holes
This patch makes 2 changes regarding alignment padding
for the "map_extra" field.

1) In the kernel header, "map_extra" and "btf_value_type_id"
are rearranged to consolidate the hole.

Before:
struct bpf_map {
	...
        u32		max_entries;	/*    36     4	*/
        u32		map_flags;	/*    40     4	*/

        /* XXX 4 bytes hole, try to pack */

        u64		map_extra;	/*    48     8	*/
        int		spin_lock_off;	/*    56     4	*/
        int		timer_off;	/*    60     4	*/
        /* --- cacheline 1 boundary (64 bytes) --- */
        u32		id;		/*    64     4	*/
        int		numa_node;	/*    68     4	*/
	...
        bool		frozen;		/*   117     1	*/

        /* XXX 10 bytes hole, try to pack */

        /* --- cacheline 2 boundary (128 bytes) --- */
	...
        struct work_struct	work;	/*   144    72	*/

        /* --- cacheline 3 boundary (192 bytes) was 24 bytes ago --- */
	struct mutex	freeze_mutex;	/*   216   144 	*/

        /* --- cacheline 5 boundary (320 bytes) was 40 bytes ago --- */
        u64		writecnt; 	/*   360     8	*/

    /* size: 384, cachelines: 6, members: 26 */
    /* sum members: 354, holes: 2, sum holes: 14 */
    /* padding: 16 */
    /* forced alignments: 2, forced holes: 1, sum forced holes: 10 */

} __attribute__((__aligned__(64)));

After:
struct bpf_map {
	...
        u32		max_entries;	/*    36     4	*/
        u64		map_extra;	/*    40     8 	*/
        u32		map_flags;	/*    48     4	*/
        int		spin_lock_off;	/*    52     4	*/
        int		timer_off;	/*    56     4	*/
        u32		id;		/*    60     4	*/

        /* --- cacheline 1 boundary (64 bytes) --- */
        int		numa_node;	/*    64     4	*/
	...
	bool		frozen		/*   113     1  */

        /* XXX 14 bytes hole, try to pack */

        /* --- cacheline 2 boundary (128 bytes) --- */
	...
        struct work_struct	work;	/*   144    72	*/

        /* --- cacheline 3 boundary (192 bytes) was 24 bytes ago --- */
        struct mutex	freeze_mutex;	/*   216   144	*/

        /* --- cacheline 5 boundary (320 bytes) was 40 bytes ago --- */
        u64		writecnt;       /*   360     8	*/

    /* size: 384, cachelines: 6, members: 26 */
    /* sum members: 354, holes: 1, sum holes: 14 */
    /* padding: 16 */
    /* forced alignments: 2, forced holes: 1, sum forced holes: 14 */

} __attribute__((__aligned__(64)));

2) Add alignment padding to the bpf_map_info struct
More details can be found in commit 36f9814a49 ("bpf: fix uapi hole
for 32 bit compat applications")

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211029224909.1721024-3-joannekoong@fb.com
2021-11-01 14:16:03 -07:00
Linus Torvalds
595b28fb0c Locking updates:
- Move futex code into kernel/futex/ and split up the kitchen sink into
    seperate files to make integration of sys_futex_waitv() simpler.
 
  - Add a new sys_futex_waitv() syscall which allows to wait on multiple
    futexes. The main use case is emulating Windows' WaitForMultipleObjects
    which allows Wine to improve the performance of Windows Games. Also
    native Linux games can benefit from this interface as this is a common
    wait pattern for this kind of applications.
 
  - Add context to ww_mutex_trylock() to provide a path for i915 to rework
    their eviction code step by step without making lockdep upset until the
    final steps of rework are completed. It's also useful for regulator and
    TTM to avoid dropping locks in the non contended path.
 
  - Lockdep and might_sleep() cleanups and improvements
 
  - A few improvements for the RT substitutions.
 
  - The usual small improvements and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/FTITHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoVNZD/9vIm3Bu1Coz8tbNXz58AiCYq9Y/vp5
 mzFgSzz+VJTkW5Vh8jo5Uel4rCKZyt+rL276EoaRPzYl8KFtWDbpK3qd3PrXKqTX
 At49JO4ttAMJUHIBQ6vblEkykmfEd9YPU1uSWk5roJ+s7Jmr5VWnu0FEWHP00As5
 tWOca/TM0ei9kof26V2fl5aecTGII4i4Zsvy+LPsXtI+TnmP0gSBcGAS/5UnZTtJ
 vQRWTR3ojoYvh5iTmNqbaURYoQLe2j8yscn1DSW1CABWVmP12eDWs+N7jRP4b5S9
 73xOv5P7vpva41wxrK2ir5iNkpsLE97VL2JOHTW8nm7orblfiuxHLTCkTjEdd2pO
 h8blI2IBizEB3JYn2BMkOAaZQOSjN8hd6Ye/b2B4AMEGWeXEoEv6eVy/orYKCluQ
 XDqGn47Vce/SYmo5vfTB8VMt6nANx8PKvOP3IvjHInYEQBgiT6QrlUw3RRkXBp5s
 clQkjYYwjAMVIXowcCrdhoKjMROzi6STShVwHwGL8MaZXqr8Vl6BUO9ckU0pY+4C
 F000Hzwxi8lGEQ9k+P+BnYOEzH5osCty8lloKiQ/7ciX6T+CZHGJPGK/iY4YL8P5
 C3CJWMsHCqST7DodNFJmdfZt99UfIMmEhshMDduU9AAH0tHCn8vOu0U6WvCtpyBp
 BvHj68zteAtlYg==
 =RZ4x
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Thomas Gleixner:

 - Move futex code into kernel/futex/ and split up the kitchen sink into
   seperate files to make integration of sys_futex_waitv() simpler.

 - Add a new sys_futex_waitv() syscall which allows to wait on multiple
   futexes.

   The main use case is emulating Windows' WaitForMultipleObjects which
   allows Wine to improve the performance of Windows Games. Also native
   Linux games can benefit from this interface as this is a common wait
   pattern for this kind of applications.

 - Add context to ww_mutex_trylock() to provide a path for i915 to
   rework their eviction code step by step without making lockdep upset
   until the final steps of rework are completed. It's also useful for
   regulator and TTM to avoid dropping locks in the non contended path.

 - Lockdep and might_sleep() cleanups and improvements

 - A few improvements for the RT substitutions.

 - The usual small improvements and cleanups.

* tag 'locking-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  locking: Remove spin_lock_flags() etc
  locking/rwsem: Fix comments about reader optimistic lock stealing conditions
  locking: Remove rcu_read_{,un}lock() for preempt_{dis,en}able()
  locking/rwsem: Disable preemption for spinning region
  docs: futex: Fix kernel-doc references
  futex: Fix PREEMPT_RT build
  futex2: Documentation: Document sys_futex_waitv() uAPI
  selftests: futex: Test sys_futex_waitv() wouldblock
  selftests: futex: Test sys_futex_waitv() timeout
  selftests: futex: Add sys_futex_waitv() test
  futex,arm: Wire up sys_futex_waitv()
  futex,x86: Wire up sys_futex_waitv()
  futex: Implement sys_futex_waitv()
  futex: Simplify double_lock_hb()
  futex: Split out wait/wake
  futex: Split out requeue
  futex: Rename mark_wake_futex()
  futex: Rename: match_futex()
  futex: Rename: hb_waiter_{inc,dec,pending}()
  futex: Split out PI futex
  ...
2021-11-01 13:15:36 -07:00
Linus Torvalds
91e1c99e17 perf updates:
core:
 
   - Allow ftrace to instrument parts of the perf core code
 
   - Add a new mem_hops field to perf_mem_data_src which allows to represent
     intra-node/package or inter-node/off-package details to prepare for
     next generation systems which have more hieararchy within the
     node/pacakge level.
 
  tools:
 
   - Update for the new mem_hops field in perf_mem_data_src
 
  arch:
 
   - A set of constraints fixes for the Intel uncore PMU
 
   - The usual set of small fixes and improvements for x86 and PPC
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/GkQTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoaD8D/wLhXR8RxtF4W9HJmHA+5XFsPtg+isp
 ZNU2kOs4gZskFx75NQaRv5ikA8y68TKdIx+NuQvRLYItaMveTToLSsJ55bfGMxIQ
 JHqDvANUNxBmAACnbYQlqf9WgB0i/3fCUHY5lpmN0waKjaswz7WNpycv4ccShVZr
 PKbgEjkeFBhplCqqOF0X5H3V+4q85+nZONm1iSNd4S7/3B6OCxOf1u78usL1bbtW
 yJAMSuTeOVUZCJm7oVywKW/ZlCscT135aKr6xe5QTrjlPuRWzuLaXNezdMnMyoVN
 HVv8a0ClACb8U5KiGfhvaipaIlIAliWJp2qoiNjrspDruhH6Yc+eNh1gUhLbtNpR
 4YZR5jxv4/mS13kzMMQg00cCWQl7N4whPT+ZE9pkpshGt+EwT+Iy3U+v13wDfnnp
 MnDggpWYGEkAck13t/T6DwC3qBIsVujtpiG+tt/ERbTxiuxi1ccQTGY3PDjtHV3k
 tIMH5n7l4jEpfl8VmoSUgz/2h1MLZnQUWp41GXkjkaOt7uunQZen+nAwqpTm28KV
 7U6U0h1q6r7HxOZRxkPPe4HSV+aBNH3H1LeNBfEd3hDCFGf6MY6vLow+2BE9ybk7
 Y6LPbRqq0SN3sd5MND0ZvQEt5Zgol8CMlX+UKoLEEv7RognGbIxkgpK7exv5pC9w
 nWj7TaMfpRzPgw==
 =Oj0G
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Thomas Gleixner:
 "Core:

   - Allow ftrace to instrument parts of the perf core code

   - Add a new mem_hops field to perf_mem_data_src which allows to
     represent intra-node/package or inter-node/off-package details to
     prepare for next generation systems which have more hieararchy
     within the node/pacakge level.

  Tools:

   - Update for the new mem_hops field in perf_mem_data_src

  Arch:

   - A set of constraints fixes for the Intel uncore PMU

   - The usual set of small fixes and improvements for x86 and PPC"

* tag 'perf-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings
  powerpc/perf: Fix data source encodings for L2.1 and L3.1 accesses
  tools/perf: Add mem_hops field in perf_mem_data_src structure
  perf: Add mem_hops field in perf_mem_data_src structure
  perf: Add comment about current state of PERF_MEM_LVL_* namespace and remove an extra line
  perf/core: Allow ftrace for functions in kernel/event/core.c
  perf/x86: Add new event for AUX output counter index
  perf/x86: Add compiler barrier after updating BTS
  perf/x86/intel/uncore: Fix Intel SPR M3UPI event constraints
  perf/x86/intel/uncore: Fix Intel SPR M2PCIE event constraints
  perf/x86/intel/uncore: Fix Intel SPR IIO event constraints
  perf/x86/intel/uncore: Fix Intel SPR CHA event constraints
  perf/x86/intel/uncore: Fix Intel ICX IIO event constraints
  perf/x86/intel/uncore: Fix invalid unit check
  perf/x86/intel/uncore: Support extra IMC channel on Ice Lake server
2021-11-01 13:12:15 -07:00
Linus Torvalds
037c50bfbe for-5.16-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmF/7PAACgkQxWXV+ddt
 WDtp6A//SbVYeuHWpsXkhBiOpJt2PpS1K8VY5LIJc3brua5EZm8IarlR57X9IqYu
 89ZlWnuANrw4d5RRiIO+NYhc+DR6+ydxHesJG+I2B+o5OnR0Ynb06gLhsP1tSK6y
 lYZORQFJZP051ODU/uEc8A0KZN7DySIUmqezAibfyxepF6oPEap0nFp17/B80tWp
 sKdMp2TBN5ymZwsdSK1nZ7ws1ZL57HgkFDPqp8m8CuPTkneG4CtNol6yUpuPExpL
 QzvQsqTygmiFoy0uNTG7Rg7IlKqEuhbR7lwfkmcBZCV66JmhFco5QhxN13QIn42s
 +YSug52SMWc8YVHIEj16xtBgHEqZXWYey8d2ewhc0tDSGDm0HmXCNjcn1vYr0NJr
 5bW/7/3bpkHYejasy1wDEK5P8Uo2xsgpRyAvuEReGoRi8ze66EohahvP3o7YJi/Q
 o0pROXdCT89JbM/T4MTvN/5MUlCSM7rnexXZ39ldGNacPgn9FAUCPw6KtzKKyVRe
 DF19nPOUXSg6SLECbVkRQUwcOjxOTFP+T0Jx61Um8bomFskYJJnmr4SD3pqlzgp7
 NxV5ad0+r7zU0x9MADkyqboObo0ROAfD4hthcZiRN+0UIK+Gq5nATTD5ur6/nwsT
 0PJGOXDPz7cmfqUdmvpA0ctRxbFEqpaz6sDh7nq/iUSmaGITcUM=
 =HvYu
 -----END PGP SIGNATURE-----

Merge tag 'for-5.16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "The updates this time are more under the hood and enhancing existing
  features (subpage with compression and zoned namespaces).

  Performance related:

   - misc small inode logging improvements (+3% throughput, -11% latency
     on sample dbench workload)

   - more efficient directory logging: bulk item insertion, less tree
     searches and locking

   - speed up bulk insertion of items into a b-tree, which is used when
     logging directories, when running delayed items for directories
     (fsync and transaction commits) and when running the slow path
     (full sync) of an fsync (bulk creation run time -4%, deletion -12%)

  Core:

   - continued subpage support
      - make defragmentation work
      - make compression write work

   - zoned mode
      - support ZNS (zoned namespaces), zone capacity is number of
        usable blocks in each zone
      - add dedicated block group (zoned) for relocation, to prevent
        out of order writes in some cases
      - greedy block group reclaim, pick the ones with least usable
        space first

   - preparatory work for send protocol updates

   - error handling improvements

   - cleanups and refactoring

  Fixes:

   - lockdep warnings
      - in show_devname callback, on seeding device
      - device delete on loop device due to conversions to workqueues

   - fix deadlock between chunk allocation and chunk btree modifications

   - fix tracking of missing device count and status"

* tag 'for-5.16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (140 commits)
  btrfs: remove root argument from check_item_in_log()
  btrfs: remove root argument from add_link()
  btrfs: remove root argument from btrfs_unlink_inode()
  btrfs: remove root argument from drop_one_dir_item()
  btrfs: clear MISSING device status bit in btrfs_close_one_device
  btrfs: call btrfs_check_rw_degradable only if there is a missing device
  btrfs: send: prepare for v2 protocol
  btrfs: fix comment about sector sizes supported in 64K systems
  btrfs: update device path inode time instead of bd_inode
  fs: export an inode_update_time helper
  btrfs: fix deadlock when defragging transparent huge pages
  btrfs: sysfs: convert scnprintf and snprintf to sysfs_emit
  btrfs: make btrfs_super_block size match BTRFS_SUPER_INFO_SIZE
  btrfs: update comments for chunk allocation -ENOSPC cases
  btrfs: fix deadlock between chunk allocation and chunk btree modifications
  btrfs: zoned: use greedy gc for auto reclaim
  btrfs: check-integrity: stop storing the block device name in btrfsic_dev_state
  btrfs: use btrfs_get_dev_args_from_path in dev removal ioctls
  btrfs: add a btrfs_get_dev_args_from_path helper
  btrfs: handle device lookup with btrfs_dev_lookup_args
  ...
2021-11-01 12:48:25 -07:00
Linus Torvalds
737f1cd8a8 for-5.16/cdrom-2021-10-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmF8M0EQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpv9yEAC8ZEn5XivbOZ3ZEMxZaFMUK6PzUKZA7tzj
 L94E1h7gbnsu7HcRsNz1jAHgdreMR/66o1R7jZxdB/knM+AKA8bKAMgSt1FIhaA7
 xLOF1CJIeGtEsS8GATb7w8Uz1PRWhy6vsrR+DMeAyXCZsG31Lchm9qvZCYpkZM6c
 3gGY8GhDPXPBBnVRk1R9IhJLxSbP7JFQlGb0ywfeW+AWGZszJDrUI61FCsauyBE8
 e8edQrUtR1CnboqYBKe3rZfh2nn9FYcDyQ/7r2mqnC5fLGyNUWwhEH5aju+QyDan
 H4+T3HYKUZd77OjPeEp5VzHCcJupBpsa/m0LKjRhnc3NqnSyjo0bwEJ8kqz9JjXa
 9DmpF/zCZy0rIK/iuUa09suTHXJ/JnuDKV+P66K5dW7t+4gwwXcBhsaB5bNyK1ha
 +VwkarKOkB/ngGD1Jm0ctykFdwsqKXa6ZjGgePWiNXeO9y8ZeCgKfPbl+4G8N5Db
 0uGGK0IfiHwIcgz2zJL/7k3CEmiuVNhOrFtFOfGKBhOQiEpUR8Yxf4Ox9vgfQ9qL
 FTguI+VWs6zXJAhx4NdxTAR+fprQPt3u2iszs2E1IbXk0leRW75g/sAeoe36KiPK
 8H/CcJbLrN+5n8g+pckz94zOW1K3GilXBPj0ZRualIJKEbrhcAjLNpx+Esx+wl0S
 8ZdzWf6dDQ==
 =NmI1
 -----END PGP SIGNATURE-----

Merge tag 'for-5.16/cdrom-2021-10-29' of git://git.kernel.dk/linux-block

Pull CDROM updates from Jens Axboe:
 "On behalf of Phillip, here are the CDROM updates for the 5.16-rc1
  merge window:

   - Add ioctl for improved media change detection (Lukas)

   - Reformat some documentation (Phillip)

   - Redundant variable removal (luo)"

* tag 'for-5.16/cdrom-2021-10-29' of git://git.kernel.dk/linux-block:
  cdrom: Remove redundant variable and its assignment
  cdrom: docs: reformat table in Documentation/userspace-api/ioctl/cdrom.rst
  drivers/cdrom: improved ioctl for media change detection
2021-11-01 10:09:14 -07:00
Linus Torvalds
8d1f01775f for-5.16/io_uring-2021-10-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmF8KHcQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgphvVEADHMsZP3fOGyJNqnIibIrDL5ZdUGtr5iH3c
 0UIi9It0jo9xOyPX/aY2n1pInXK4vvND9ULC+XGYttSJZXWuYEbMGYQ34du2EP0r
 dypN4JPwO6X+mFkJND6x8IeDCzj/fy6LCFbWbRlDNsndTZ/gavVTOybMpOLdCJx9
 IyXE1iHismaIaD7I3Q77zvN0ei87cEwBfg9R0vRAXKBKUh5raSiLWsOYOiXQkZH4
 8iUeDmOLlaWghgXwweODxARXuWq+gWZgiBMd0tp0QCECXMv+NIpfJYauvLHJDa/u
 QScr9uRMrJS3KgRgt61o+Z2fcpzJF/bL0e0s5Ul9CgflRWucARbgodUMl4rZCi9D
 WOwxPxv8Oab8IT7Qc/ZHdY3ULJsULRgbtmc/9OqPL5Y/Ww9/9E63Is8O4q/QFc7T
 xJ1p5yZKw3G+G7oG0YBYE0U+x3RUzi4b/Ob+ECeLcAAAcp+XFg6epK6Aj8HDWd8K
 kGYlEBKEq1hILM44K59YTwAT/Cp+fkwe+x7pNQ3JjqtPpVpqGT7RoMUuCduofT1J
 ROtB+S8/AwhdABL6KKUYSVF8zlfoXbQpQs3SUKjaBtPVjwXLZwXERy7ttD/4STtT
 QjC+5/qAWnMR8CYADE0E3rlicUkHJm1+AHukYLz0REphDcNO8GuB9PCDzX4SX/ol
 SGJ6hoprYQ==
 =5U4u
 -----END PGP SIGNATURE-----

Merge tag 'for-5.16/io_uring-2021-10-29' of git://git.kernel.dk/linux-block

Pull io_uring updates from Jens Axboe:
 "Light on new features - basically just the hybrid mode support.

  Outside of that it's just fixes, cleanups, and performance
  improvements.

  In detail:

   - Add ring related information to the fdinfo output (Hao)

   - Hybrid async mode (Hao)

   - Support for batched issue on block (me)

   - sqe error trace improvement (me)

   - IOPOLL efficiency improvements (Pavel)

   - submit state cleanups and improvements (Pavel)

   - Completion side improvements (Pavel)

   - Drain improvements (Pavel)

   - Buffer selection cleanups (Pavel)

   - Fixed file node improvements (Pavel)

   - io-wq setup cancelation fix (Pavel)

   - Various other performance improvements and cleanups (Pavel)

   - Misc fixes (Arnd, Bixuan, Changcheng, Hao, me, Noah)"

* tag 'for-5.16/io_uring-2021-10-29' of git://git.kernel.dk/linux-block: (97 commits)
  io-wq: remove worker to owner tw dependency
  io_uring: harder fdinfo sq/cq ring iterating
  io_uring: don't assign write hint in the read path
  io_uring: clusterise ki_flags access in rw_prep
  io_uring: kill unused param from io_file_supports_nowait
  io_uring: clean up timeout async_data allocation
  io_uring: don't try io-wq polling if not supported
  io_uring: check if opcode needs poll first on arming
  io_uring: clean iowq submit work cancellation
  io_uring: clean io_wq_submit_work()'s main loop
  io-wq: use helper for worker refcounting
  io_uring: implement async hybrid mode for pollable requests
  io_uring: Use ERR_CAST() instead of ERR_PTR(PTR_ERR())
  io_uring: split logic of force_nonblock
  io_uring: warning about unused-but-set parameter
  io_uring: inform block layer of how many requests we are submitting
  io_uring: simplify io_file_supports_nowait()
  io_uring: combine REQ_F_NOWAIT_{READ,WRITE} flags
  io_uring: arm poll for non-nowait files
  fs/io_uring: Prioritise checking faster conditions first in io_write
  ...
2021-11-01 09:41:33 -07:00
Linus Torvalds
643a7234e0 for-5.16/drivers-2021-10-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmF8KFsQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgph1ZEACwNuHkAZcIgNzKhzuLP9OjMhv9vV+q254G
 /EcM31e+qgRioMd0ihbVsgW76jOwLEmb3ldKGcN+0Wo5+Sv9Im8+wAWYY1REOZO5
 ZTUBfAzhEh63/EtqTFiU8U+7dmXqy4z7NaICnhlynjwkd3IT+I561os6kcqwJMMr
 G+Q1Cnk9rgCMIoLOCoVThIpjmjyZzF33qJb2VEIkHfkot62iNdpABWaSASF+CCba
 z8LfbvLAYz3YLl4thXlLJFU282T5y7gzgSomGvX4F0rMJSbqFbgoNEPxaYw9CvzC
 uC6MnYCYdCdvVkWVm1b8I8LYzPd5GrpVOSh3JQGvuA4Ppv2IyJCDSruYGgVUlhao
 cVPzuHCqNCfKk0ykYVRZy9oKiBk5wmFeKM/lSHu408y8VNraPNIAEpB6sA9qGr22
 AYr8lNh3JDr0g8dtFsDOq+7u3MANW0KQozfzwTPZo6NjzEE1D2jIg39Ljiijo9+Y
 3pU8pitIAhsKd2KhW1H6LmtJbF4dX756VKYDXOhzgORU0NZYgvGhBIj9tAdpQR0S
 xeae5Kj0/wBGcqR/owf/n1EY/q7rWgNDETnsBhbmzMZyhwH3L6zhT+bfD8YoQCHY
 ueyqhyIUe4YBxTrIpICqwDlqaMYAmQ0jRaci+bK9ovVlQ89FQ9o/BE2COPlI/DGX
 w+rUmmoX4g==
 =HiWU
 -----END PGP SIGNATURE-----

Merge tag 'for-5.16/drivers-2021-10-29' of git://git.kernel.dk/linux-block

Pull block driver updates from Jens Axboe:

 - paride driver cleanups (Christoph)

 - Remove cryptoloop support (Christoph)

 - null_blk poll support (me)

 - Now that add_disk() supports proper error handling, add it to various
   drivers (Luis)

 - Make ataflop actually work again (Michael)

 - s390 dasd fixes (Stefan, Heiko)

 - nbd fixes (Yu, Ye)

 - Remove redundant wq flush in mtip32xx (Christophe)

 - NVMe updates
      - fix a multipath partition scanning deadlock (Hannes Reinecke)
      - generate uevent once a multipath namespace is operational again
        (Hannes Reinecke)
      - support unique discovery controller NQNs (Hannes Reinecke)
      - fix use-after-free when a port is removed (Israel Rukshin)
      - clear shadow doorbell memory on resets (Keith Busch)
      - use struct_size (Len Baker)
      - add error handling support for add_disk (Luis Chamberlain)
      - limit the maximal queue size for RDMA controllers (Max Gurtovoy)
      - use a few more symbolic names (Max Gurtovoy)
      - fix error code in nvme_rdma_setup_ctrl (Max Gurtovoy)
      - add support for ->map_queues on FC (Saurav Kashyap)
      - support the current discovery subsystem entry (Hannes Reinecke)
      - use flex_array_size and struct_size (Len Baker)

 - bcache fixes (Christoph, Coly, Chao, Lin, Qing)

 - MD updates (Christoph, Guoqing, Xiao)

 - Misc fixes (Dan, Ding, Jiapeng, Shin'ichiro, Ye)

* tag 'for-5.16/drivers-2021-10-29' of git://git.kernel.dk/linux-block: (117 commits)
  null_blk: Fix handling of submit_queues and poll_queues attributes
  block: ataflop: Fix warning comparing pointer to 0
  bcache: replace snprintf in show functions with sysfs_emit
  bcache: move uapi header bcache.h to bcache code directory
  nvmet: use flex_array_size and struct_size
  nvmet: register discovery subsystem as 'current'
  nvmet: switch check for subsystem type
  nvme: add new discovery log page entry definitions
  block: ataflop: more blk-mq refactoring fixes
  block: remove support for cryptoloop and the xor transfer
  mtd: add add_disk() error handling
  rnbd: add error handling support for add_disk()
  um/drivers/ubd_kern: add error handling support for add_disk()
  m68k/emu/nfblock: add error handling support for add_disk()
  xen-blkfront: add error handling support for add_disk()
  bcache: add error handling support for add_disk()
  dm: add add_disk() error handling
  block: aoe: fixup coccinelle warnings
  nvmet: use struct_size over open coded arithmetic
  nvme: drop scan_lock and always kick requeue list when removing namespaces
  ...
2021-11-01 09:27:38 -07:00
Taehee Yoo
b9022b53ad amt: add control plane of amt interface
It adds definitions and control plane code for AMT.
this is very similar to udp tunneling interfaces such as gtp, vxlan, etc.
In the next patch, data plane code will be added.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:36:08 +00:00
David S. Miller
894d084434 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Use array_size() in ebtables, from Gustavo A. R. Silva.

2) Attach IPS_ASSURED to internal UDP stream state, reported by
   Maciej Zenczykowski.

3) Add NFT_META_IFTYPE to match on the interface type either
   from ingress or egress.

4) Generalize pktinfo->tprot_set to flags field.

5) Allow to match on inner headers / payload data.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 12:59:58 +00:00
Pablo Neira Ayuso
c46b38dc87 netfilter: nft_payload: support for inner header matching / mangling
Allow to match and mangle on inner headers / payload data after the
transport header. There is a new field in the pktinfo structure that
stores the inner header offset which is calculated only when requested.
Only TCP and UDP supported at this stage.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-11-01 09:31:03 +01:00
Pablo Neira Ayuso
56fa95014a netfilter: nft_meta: add NFT_META_IFTYPE
Generalize NFT_META_IIFTYPE to NFT_META_IFTYPE which allows you to match
on the interface type of the skb->dev field. This field is used by the
netdev family to add an implicit dependency to skip non-ethernet packets
when matching on layer 3 and 4 TCP/IP header fields.

For backward compatibility, add the NFT_META_IIFTYPE alias to
NFT_META_IFTYPE.

Add __NFT_META_IIFTYPE, to be used by userspace in the future to match
specifically on the iiftype.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-11-01 09:29:20 +01:00
Paolo Bonzini
4e33868433 KVM/arm64 updates for Linux 5.16
- More progress on the protected VM front, now with the full
   fixed feature set as well as the limitation of some hypercalls
   after initialisation.
 
 - Cleanup of the RAZ/WI sysreg handling, which was pointlessly
   complicated
 
 - Fixes for the vgic placement in the IPA space, together with a
   bunch of selftests
 
 - More memcg accounting of the memory allocated on behalf of a guest
 
 - Timer and vgic selftests
 
 - Workarounds for the Apple M1 broken vgic implementation
 
 - KConfig cleanups
 
 - New kvmarm.mode=none option, for those who really dislike us
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmF7u5YPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpD6w8QAIKDLJCTqkxv5Vh4ZSmtXxg4gTZMBlg8oSQ8
 sVL639aqBvFe3A6Vmz6IwBm+NT7Sm1zxkuH9qHzVR1gmXq0oLYNrIuyrzRW8PvqO
 hIkSRRoVsf03755TmkxwR7/2jAFxb6FhEVAy6VWdQyI44orihIPvMp8aTIq+jvU+
 XoNGb/rPf9HpSUtvuaHYvZhSZBhoi5dRnkr33R1+VR69n7Axs8lm905xcl6Pt0a0
 QqYZWQvFu/BXPyNflG7LUsegRF/iiV2vNTbNNowkzlV5suqxBpJAp6ApDL/gWrHv
 ya/6cMqicSjBIkWnawhXY98w6/5xfzK4IV/zc00FNWOlUdVP89Thqrgc8EkigS9R
 BGcxFFqj41snr+ensSBBIkNtV+dBX52H3rUE0F9seiTXm8QWI86JobdeNadT8tUP
 TXdOeCUcA+cp4Ngln18lsbOEaBkPA5H1po1nUFPHbKnVOxnqXScB7E/xF6rAbryV
 m+Z+oidU7MyS/Ev/Da0ww/XFx7cs2ez9EgeQvjcdFAvUMqS6kcXEExvgGYlm+KRQ
 GBMKPLCNHKdflMANoSpol7MZUmPJ45XoWKW1rntj2r9X+oJW2Z2hEx32xrWDJdqK
 ixnbjog5kNZb0CjLGsUC90lo2hpRJecaLhAjgTLYaNC1QxGPrt92eat6gnwuMTBc
 mpADqi7w
 =qBAO
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for Linux 5.16

- More progress on the protected VM front, now with the full
  fixed feature set as well as the limitation of some hypercalls
  after initialisation.

- Cleanup of the RAZ/WI sysreg handling, which was pointlessly
  complicated

- Fixes for the vgic placement in the IPA space, together with a
  bunch of selftests

- More memcg accounting of the memory allocated on behalf of a guest

- Timer and vgic selftests

- Workarounds for the Apple M1 broken vgic implementation

- KConfig cleanups

- New kvmarm.mode=none option, for those who really dislike us
2021-10-31 02:28:48 -04:00
Coly Li
cf2197ca4b bcache: move uapi header bcache.h to bcache code directory
The header file include/uapi/linux/bcache.h is not really a user space
API heaer. This file defines the ondisk format of bcache internal meta
data but no one includes it from user space, bcache-tools has its own
copy of this header with minor modification.

Therefore, this patch moves include/uapi/linux/bcache.h to bcache code
directory as drivers/md/bcache/bcache_ondisk.h.

Suggested-by: Arnd Bergmann <arnd@kernel.org>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20211029060930.119923-2-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-29 06:43:21 -06:00
David Sterba
e77fbf9903 btrfs: send: prepare for v2 protocol
This is preparatory work for send protocol update to version 2 and
higher.

We have many pending protocol update requests but still don't have the
basic protocol rev in place, the first thing that must happen is to do
the actual versioning support.

The protocol version is u32 and is a new member in the send ioctl
struct. Validity of the version field is backed by a new flag bit. Old
kernels would fail when a higher version is requested. Version protocol
0 will pick the highest supported version, BTRFS_SEND_STREAM_VERSION,
  that's also exported in sysfs.

The version is still unchanged and will be increased once we have new
incompatible commands or stream updates.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-10-29 12:38:43 +02:00
Kumar Kartikeya Dwivedi
d6aef08a87 bpf: Add bpf_kallsyms_lookup_name helper
This helper allows us to get the address of a kernel symbol from inside
a BPF_PROG_TYPE_SYSCALL prog (used by gen_loader), so that we can
relocate typeless ksym vars.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-2-memxor@gmail.com
2021-10-28 16:30:06 -07:00
Joanne Koong
9330986c03 bpf: Add bloom filter map implementation
This patch adds the kernel-side changes for the implementation of
a bpf bloom filter map.

The bloom filter map supports peek (determining whether an element
is present in the map) and push (adding an element to the map)
operations.These operations are exposed to userspace applications
through the already existing syscalls in the following way:

BPF_MAP_LOOKUP_ELEM -> peek
BPF_MAP_UPDATE_ELEM -> push

The bloom filter map does not have keys, only values. In light of
this, the bloom filter map's API matches that of queue stack maps:
user applications use BPF_MAP_LOOKUP_ELEM/BPF_MAP_UPDATE_ELEM
which correspond internally to bpf_map_peek_elem/bpf_map_push_elem,
and bpf programs must use the bpf_map_peek_elem and bpf_map_push_elem
APIs to query or add an element to the bloom filter map. When the
bloom filter map is created, it must be created with a key_size of 0.

For updates, the user will pass in the element to add to the map
as the value, with a NULL key. For lookups, the user will pass in the
element to query in the map as the value, with a NULL key. In the
verifier layer, this requires us to modify the argument type of
a bloom filter's BPF_FUNC_map_peek_elem call to ARG_PTR_TO_MAP_VALUE;
as well, in the syscall layer, we need to copy over the user value
so that in bpf_map_peek_elem, we know which specific value to query.

A few things to please take note of:
 * If there are any concurrent lookups + updates, the user is
responsible for synchronizing this to ensure no false negative lookups
occur.
 * The number of hashes to use for the bloom filter is configurable from
userspace. If no number is specified, the default used will be 5 hash
functions. The benchmarks later in this patchset can help compare the
performance of using different number of hashes on different entry
sizes. In general, using more hashes decreases both the false positive
rate and the speed of a lookup.
 * Deleting an element in the bloom filter map is not supported.
 * The bloom filter map may be used as an inner map.
 * The "max_entries" size that is specified at map creation time is used
to approximate a reasonable bitmap size for the bloom filter, and is not
otherwise strictly enforced. If the user wishes to insert more entries
into the bloom filter than "max_entries", they may do so but they should
be aware that this may lead to a higher false positive rate.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-2-joannekoong@fb.com
2021-10-28 13:22:49 -07:00
Jeremy Kerr
99ce45d5e7 mctp: Implement extended addressing
This change allows an extended address struct - struct sockaddr_mctp_ext
- to be passed to sendmsg/recvmsg. This allows userspace to specify
output ifindex and physical address information (for sendmsg) or receive
the input ifindex/physaddr for incoming messages (for recvmsg). This is
typically used by userspace for MCTP address discovery and assignment
operations.

The extended addressing facility is conditional on a new sockopt:
MCTP_OPT_ADDR_EXT; userspace must explicitly enable addressing before
the kernel will consume/populate the extended address data.

Includes a fix for an uninitialised var:
Reported-by: kernel test robot <lkp@intel.com>

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-26 14:58:45 +01:00
Luo Jie
1cf4e9a6fb net: phy: add constants for fast retrain related register
Add the constants for 2.5G fast retrain capability
in 10G AN control register, fast retrain status and
control register and THP bypass register into mdio.h.

Signed-off-by: Luo Jie <luoj@codeaurora.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25 14:04:18 +01:00
David Edmondson
e615e35589 KVM: x86: On emulation failure, convey the exit reason, etc. to userspace
Should instruction emulation fail, include the VM exit reason, etc. in
the emulation_failure data passed to userspace, in order that the VMM
can report it as a debugging aid when describing the failure.

Suggested-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210920103737.2696756-4-david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-25 06:48:24 -04:00
David Edmondson
a9d496d8e0 KVM: x86: Clarify the kvm_run.emulation_failure structure layout
Until more flags for kvm_run.emulation_failure flags are defined, it
is undetermined whether new payload elements corresponding to those
flags will be additive or alternative. As a hint to userspace that an
alternative is possible, wrap the current payload elements in a union.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210920103737.2696756-2-david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-25 06:48:24 -04:00
Vincent Mailhol
d99755f71a can: netlink: add interface for CAN-FD Transmitter Delay Compensation (TDC)
Add the netlink interface for TDC parameters of struct can_tdc_const
and can_tdc.

Contrary to the can_bittiming(_const) structures for which there is
just a single IFLA_CAN(_DATA)_BITTMING(_CONST) entry per structure,
here, we create a nested entry IFLA_CAN_TDC. Within this nested entry,
additional IFLA_CAN_TDC_TDC* entries are added for each of the TDC
parameters of the newly introduced struct can_tdc_const and struct
can_tdc.

For struct can_tdc_const, these are:
        IFLA_CAN_TDC_TDCV_MIN
        IFLA_CAN_TDC_TDCV_MAX
        IFLA_CAN_TDC_TDCO_MIN
        IFLA_CAN_TDC_TDCO_MAX
        IFLA_CAN_TDC_TDCF_MIN
        IFLA_CAN_TDC_TDCF_MAX

For struct can_tdc, these are:
        IFLA_CAN_TDC_TDCV
        IFLA_CAN_TDC_TDCO
        IFLA_CAN_TDC_TDCF

This is done so that changes can be applied in the future to the
structures without breaking the netlink interface.

The TDC netlink logic works as follow:

 * CAN_CTRLMODE_FD is not provided:
    - if any TDC parameters are provided: error.

    - TDC parameters not provided: TDC parameters unchanged.

 * CAN_CTRLMODE_FD is provided and is false:
     - TDC is deactivated: both the structure and the
       CAN_CTRLMODE_TDC_{AUTO,MANUAL} flags are flushed.

 * CAN_CTRLMODE_FD provided and is true:
    - CAN_CTRLMODE_TDC_{AUTO,MANUAL} and tdc{v,o,f} not provided: call
      can_calc_tdco() to automatically decide whether TDC should be
      activated and, if so, set CAN_CTRLMODE_TDC_AUTO and uses the
      calculated tdco value.

    - CAN_CTRLMODE_TDC_AUTO and tdco provided: set
      CAN_CTRLMODE_TDC_AUTO and use the provided tdco value. Here,
      tdcv is illegal and tdcf is optional.

    - CAN_CTRLMODE_TDC_MANUAL and both of tdcv and tdco provided: set
      CAN_CTRLMODE_TDC_MANUAL and use the provided tdcv and tdco
      value. Here, tdcf is optional.

    - CAN_CTRLMODE_TDC_{AUTO,MANUAL} are mutually exclusive. Whenever
      one flag is turned on, the other will automatically be turned
      off. Providing both returns an error.

    - Combination other than the one listed above are illegal and will
      return an error.

N.B. above rules mean that whenever CAN_CTRLMODE_FD is provided, the
previous TDC values will be overwritten. The only option to reuse
previous TDC value is to not provide CAN_CTRLMODE_FD.

All the new parameters are defined as u32. This arbitrary choice is
done to mimic the other bittiming values with are also all of type
u32. An u16 would have been sufficient to hold the TDC values.

This patch completes below series (c.f. [1]):
  - commit 289ea9e4ae ("can: add new CAN FD bittiming parameters:
    Transmitter Delay Compensation (TDC)")
  - commit c25cc79932 ("can: bittiming: add calculation for CAN FD
    Transmitter Delay Compensation (TDC)")

[1] https://lore.kernel.org/linux-can/20210224002008.4158-1-mailhol.vincent@wanadoo.fr/T/#t

Link: https://lore.kernel.org/all/20210918095637.20108-5-mailhol.vincent@wanadoo.fr
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-10-24 16:24:29 +02:00
Vincent Mailhol
63dfe07096 can: bittiming: allow TDC{V,O} to be zero and add can_tdc_const::tdc{v,o,f}_min
ISO 11898-1 specifies in section 11.3.3 "Transmitter delay
compensation" that "the configuration range for [the] SSP position
shall be at least 0 to 63 minimum time quanta."

Because SSP = TDCV + TDCO, it means that we should allow both TDCV and
TDCO to hold zero value in order to honor SSP's minimum possible
value.

However, current implementation assigned special meaning to TDCV and
TDCO's zero values:
  * TDCV = 0 -> TDCV is automatically measured by the transceiver.
  * TDCO = 0 -> TDC is off.

In order to allow for those values to really be zero and to maintain
current features, we introduce two new flags:
  * CAN_CTRLMODE_TDC_AUTO indicates that the controller support
    automatic measurement of TDCV.
  * CAN_CTRLMODE_TDC_MANUAL indicates that the controller support
    manual configuration of TDCV. N.B.: current implementation failed
    to provide an option for the driver to indicate that only manual
    mode was supported.

TDC is disabled if both CAN_CTRLMODE_TDC_AUTO and
CAN_CTRLMODE_TDC_MANUAL flags are off, c.f. the helper function
can_tdc_is_enabled() which is also introduced in this patch.

Also, this patch adds three fields: tdcv_min, tdco_min and tdcf_min to
struct can_tdc_const. While we are not convinced that those three
fields could be anything else than zero, we can imagine that some
controllers might specify a lower bound on these. Thus, those minimums
are really added "just in case".

Comments of struct can_tdc and can_tdc_const are updated accordingly.

Finally, the changes are applied to the etas_es58x driver.

Link: https://lore.kernel.org/all/20210918095637.20108-2-mailhol.vincent@wanadoo.fr
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-10-24 16:24:28 +02:00
Jakub Kicinski
24f7cf9b85 Quite a few changes:
* the applicable eth_hw_addr_set() and const hw_addr changes
  * various code cleanups/refactorings
  * stack usage reductions across the wireless stack
  * some unstructured find_ie() -> structured find_element()
    changes
  * a few more pieces of multi-BSSID support
  * some 6 GHz regulatory support
  * 6 GHz support in hwsim, for testing userspace code
  * Light Communications (LC, 802.11bb) early band definitions
    to be able to add a first driver soon
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAmFxi7MACgkQB8qZga/f
 l8SGYA/+IRqIfgIcdQW2XkRZanpYHirS4ZELcB7qH2XDAVLALpZx9h9kVhpVrjmh
 BFYaJz3H5cEfwH6+JSfc7dvYlzUN0oIwzs+s0PpUwK13R/NhqCcP0PSJESBtfk/4
 sxerJAHyWNZ2Ji0dq18m17IEU5IC03y8h3xekzeyl5UOmU72sqvqq3ZT8yQ48bu9
 K4BbTFv5/cPKS8EAiFDGQbzVYz94zj+6XZH/kiN3vnhtqhI7knbvwF5zAafXdZOF
 wnCucHbWcyvwxiqIMPZBUy0nP7p2Lrz24MFbDSUK/9umxO7JNyP6A8KOM5/CwsEF
 u0hcGmW8NSRIkMQZoZi41d0eHZa8iPMv9gRYlSDFvElBR/kPp206zOYgzYj9Fi5g
 zuAISwhzWT06BsfLvKLLjK/xZmxXybOvkdie/vT0VAWHBrGR8FZV5ovfac/Qb9FJ
 RwAuyOKpHP1q+Oy+tQdNpnAVx1W0E/VoHXyT0a+G4cXfHG6wZL6zG1F4/kS/g+wC
 184MpQwxZkliNhuHbHScIARhCBCGxSNvU9R9Kz3vGGS9l+syT8ecl2IQRgEvU6FY
 yo6bryWFWXkE7jtjEZWvnqQMOD1EcQeu66Nvg2LS82twcn/lJ/RlkjaXneeNP2wD
 BZMOk3u0wWgWhm7AcaehT7IuMJi3Q0BYask+ZTA8Uv5eqW7YADg=
 =ZXrI
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-next-for-net-next-2021-10-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
Quite a few changes:
 * the applicable eth_hw_addr_set() and const hw_addr changes
 * various code cleanups/refactorings
 * stack usage reductions across the wireless stack
 * some unstructured find_ie() -> structured find_element()
   changes
 * a few more pieces of multi-BSSID support
 * some 6 GHz regulatory support
 * 6 GHz support in hwsim, for testing userspace code
 * Light Communications (LC, 802.11bb) early band definitions
   to be able to add a first driver soon

* tag 'mac80211-next-for-net-next-2021-10-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next: (35 commits)
  cfg80211: fix kernel-doc for MBSSID EMA
  mac80211: Prevent AP probing during suspend
  nl80211: Add LC placeholder band definition to nl80211_band
  ...
====================

Link: https://lore.kernel.org/r/20211021154953.134849-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22 10:20:56 -07:00
David S. Miller
bdfa75ad70 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Lots of simnple overlapping additions.

With a build fix from Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-22 11:41:16 +01:00
Dave Marchevsky
aba64c7da9 bpf: Add verified_insns to bpf_prog_info and fdinfo
This stat is currently printed in the verifier log and not stored
anywhere. To ease consumption of this data, add a field to bpf_prog_aux
so it can be exposed via BPF_OBJ_GET_INFO_BY_FD and fdinfo.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211020074818.1017682-2-davemarchevsky@fb.com
2021-10-21 15:51:47 -07:00
Hengqi Chen
9eeb3aa33a bpf: Add bpf_skc_to_unix_sock() helper
The helper is used in tracing programs to cast a socket
pointer to a unix_sock pointer.
The return value could be NULL if the casting is illegal.

Suggested-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021134752.1223426-2-hengqi.chen@gmail.com
2021-10-21 15:11:06 -07:00
Johannes Berg
f9d366d420 cfg80211: fix kernel-doc for MBSSID EMA
The struct member ema_max_profile_periodicity was listed
with the wrong name in the kernel-doc, fix that.

Link: https://lore.kernel.org/r/20211021173038.18ec2030c66b.Iac731bb299525940948adad2c41f514b7dd81c47@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-21 17:34:10 +02:00
Srinivasan Raju
63fa042666 nl80211: Add LC placeholder band definition to nl80211_band
Define LC band which is a draft under IEEE 802.11bb.
Current NL80211_BAND_LC is a placeholder band and
will be more defined IEEE 802.11bb progresses.

Signed-off-by: Srinivasan Raju <srini.raju@purelifi.com>
Link: https://lore.kernel.org/r/20211018100143.7565-2-srini.raju@purelifi.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-21 17:25:17 +02:00
Emmanuel Grumbach
1add667da2 nl80211: vendor-cmd: intel: add more details for IWL_MVM_VENDOR_CMD_HOST_GET_OWNERSHIP
Explain more the expected flow for this command.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Link: https://lore.kernel.org/r/20211020051147.29297-1-emmanuel.grumbach@intel.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-21 17:25:17 +02:00
Toke Høiland-Jørgensen
dfcb63ce1d fq_codel: generalise ce_threshold marking for subset of traffic
Commit e72aeb9ee0 ("fq_codel: implement L4S style ce_threshold_ect1
marking") expanded the ce_threshold feature of FQ-CoDel so it can
be applied to a subset of the traffic, using the ECT(1) bit of the ECN
field as the classifier. However, hard-coding ECT(1) as the only
classifier for this feature seems limiting, so let's expand it to be more
general.

To this end, change the parameter from a ce_threshold_ect1 boolean, to a
one-byte selector/mask pair (ce_threshold_{selector,mask}) which is applied
to the whole diffserv/ECN field in the IP header. This makes it possible to
classify packets by any value in either the ECN field or the diffserv
field. In particular, setting a selector of INET_ECN_ECT_1 and a mask of
INET_ECN_MASK corresponds to the functionality before this patch, and a
mask of ~INET_ECN_MASK allows using the selector as a straight-forward
match against a diffserv code point:

 # apply ce_threshold to ECT(1) traffic
 tc qdisc replace dev eth0 root fq_codel ce_threshold 1ms ce_threshold_selector 0x1/0x3

 # apply ce_threshold to ECN-capable traffic marked as diffserv AF22
 tc qdisc replace dev eth0 root fq_codel ce_threshold 1ms ce_threshold_selector 0x50/0xfc

Regardless of the selector chosen, the normal rules for ECN-marking of
packets still apply, i.e., the flow must still declare itself ECN-capable
by setting one of the bits in the ECN field to get marked at all.

v2:
- Add tc usage examples to patch description

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20211019174709.69081-1-toke@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-20 15:24:36 -07:00
Michael Tretter
7aea2c0b48 media: allegro: add control to disable encoder buffer
The encoder buffer can have a negative impact on the quality of the
encoded video.

Add a control to allow user space to disable the encoder buffer per
channel if the VCU supports the encoder buffer but the quality is not
sufficient.

Signed-off-by: Michael Tretter <m.tretter@pengutronix.de>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2021-10-20 15:56:40 +01:00
Coly Li
0a2b3e3635 bcache: reserve never used bits from bkey.high
There sre 3 bits in member high of struct bkey are never used, and no
plan to support them in future,
- HEADER_SIZE, start at bit 58, length 2 bits
- KEY_PINNED,  start at bit 55, length 1 bit

No any kernel code, or user space tool references or accesses the three
bits. Therefore it is possible and feasible to reserve the valuable bits
from bkey.high. They can be used in future for other purpose.

Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20211020143812.6403-3-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-20 08:40:54 -06:00
Kajol Jain
fec9cc6175 perf: Add mem_hops field in perf_mem_data_src structure
Going forward, future generation systems can have more hierarchy
within the node/package level but currently we don't have any data source
encoding field in perf, which can be used to represent this level of data.

Add a new field called 'mem_hops' in the perf_mem_data_src structure
which can be used to represent intra-node/package or inter-node/off-package
details. This field is of size 3 bits where PERF_MEM_HOPS_{NA, 0..6} value
can be used to present different hop levels data.

Also add corresponding macros to define mem_hop field values
and shift value.

Currently we define macro for HOPS_0 which corresponds
to data coming from another core but same node.

For ex: Encodings for mem_hops fields with L2 cache:

L2			- local L2
L2 | REMOTE | HOPS_0	- remote core, same node L2

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211006140654.298352-3-kjain@linux.ibm.com
2021-10-19 17:27:00 +02:00
Kajol Jain
f4c6217f7f perf: Add comment about current state of PERF_MEM_LVL_* namespace and remove an extra line
Add a comment about PERF_MEM_LVL_* namespace being depricated
to some extent in favour of added PERF_MEM_{LVLNUM_,REMOTE_,SNOOPX_}
fields.

Remove an extra line present in perf_mem__lvl_scnprintf function.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211006140654.298352-2-kjain@linux.ibm.com
2021-10-19 17:27:00 +02:00
Pavel Begunkov
6224590d24 io_uring: add flag to not fail link after timeout
For some reason non-off IORING_OP_TIMEOUT always fails links, it's
pretty inconvenient and unnecessary limits chaining after it to hard
linking, which is far from ideal, e.g. doesn't pair well with timeout
cancellation. Add a flag forcing it to not fail links on -ETIME.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/17c7ec0fb7a6113cc6be8cdaedcada0ba836ac0e.1633199723.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-19 05:49:54 -06:00
Yonghong Song
223f903e9c bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG
Patch set [1] introduced BTF_KIND_TAG to allow tagging
declarations for struct/union, struct/union field, var, func
and func arguments and these tags will be encoded into
dwarf. They are also encoded to btf by llvm for the bpf target.

After BTF_KIND_TAG is introduced, we intended to use it
for kernel __user attributes. But kernel __user is actually
a type attribute. Upstream and internal discussion showed
it is not a good idea to mix declaration attribute and
type attribute. So we proposed to introduce btf_type_tag
as a type attribute and existing btf_tag renamed to
btf_decl_tag ([2]).

This patch renamed BTF_KIND_TAG to BTF_KIND_DECL_TAG and some
other declarations with *_tag to *_decl_tag to make it clear
the tag is for declaration. In the future, BTF_KIND_TYPE_TAG
might be introduced per [3].

 [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
 [2] https://reviews.llvm.org/D111588
 [3] https://reviews.llvm.org/D111199

Fixes: b5ea834dde ("bpf: Support for new btf kind BTF_KIND_TAG")
Fixes: 5b84bd1036 ("libbpf: Add support for BTF_KIND_TAG")
Fixes: 5c07f2fec0 ("bpftool: Add support for BTF_KIND_TAG")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com
2021-10-18 18:35:36 -07:00
Kees Cook
47c662486c treewide: Replace 0-element memcpy() destinations with flexible arrays
The 0-element arrays that are used as memcpy() destinations are actually
flexible arrays. Adjust their structures accordingly so that memcpy()
can better reason able their destination size (i.e. they need to be seen
as "unknown" length rather than "zero").

In some cases, use of the DECLARE_FLEX_ARRAY() helper is needed when a
flexible array is alone in a struct.

Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Manish Rangankar <mrangankar@marvell.com>
Cc: GR-QLogic-Storage-Upstream@marvell.com
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Phillip Potter <phil@philpotter.co.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Florian Schilhabel <florian.c.schilhabel@googlemail.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Fabio Aiuto <fabioaiuto83@gmail.com>
Cc: Ross Schmidt <ross.schm.dev@gmail.com>
Cc: Marco Cesati <marcocesati@gmail.com>
Cc: ath10k@lists.infradead.org
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: linux-staging@lists.linux.dev
Signed-off-by: Kees Cook <keescook@chromium.org>
2021-10-18 12:28:53 -07:00
Kees Cook
3080ea5553 stddef: Introduce DECLARE_FLEX_ARRAY() helper
There are many places where kernel code wants to have several different
typed trailing flexible arrays. This would normally be done with multiple
flexible arrays in a union, but since GCC and Clang don't (on the surface)
allow this, there have been many open-coded workarounds, usually involving
neighboring 0-element arrays at the end of a structure. For example,
instead of something like this:

struct thing {
	...
	union {
		struct type1 foo[];
		struct type2 bar[];
	};
};

code works around the compiler with:

struct thing {
	...
	struct type1 foo[0];
	struct type2 bar[];
};

Another case is when a flexible array is wanted as the single member
within a struct (which itself is usually in a union). For example, this
would be worked around as:

union many {
	...
	struct {
		struct type3 baz[0];
	};
};

These kinds of work-arounds cause problems with size checks against such
zero-element arrays (for example when building with -Warray-bounds and
-Wzero-length-bounds, and with the coming FORTIFY_SOURCE improvements),
so they must all be converted to "real" flexible arrays, avoiding warnings
like this:

fs/hpfs/anode.c: In function 'hpfs_add_sector_to_btree':
fs/hpfs/anode.c:209:27: warning: array subscript 0 is outside the bounds of an interior zero-length array 'struct bplus_internal_node[0]' [-Wzero-length-bounds]
  209 |    anode->btree.u.internal[0].down = cpu_to_le32(a);
      |    ~~~~~~~~~~~~~~~~~~~~~~~^~~
In file included from fs/hpfs/hpfs_fn.h:26,
                 from fs/hpfs/anode.c:10:
fs/hpfs/hpfs.h:412:32: note: while referencing 'internal'
  412 |     struct bplus_internal_node internal[0]; /* (internal) 2-word entries giving
      |                                ^~~~~~~~

drivers/net/can/usb/etas_es58x/es58x_fd.c: In function 'es58x_fd_tx_can_msg':
drivers/net/can/usb/etas_es58x/es58x_fd.c:360:35: warning: array subscript 65535 is outside the bounds of an interior zero-length array 'u8[0]' {aka 'unsigned char[]'} [-Wzero-length-bounds]
  360 |  tx_can_msg = (typeof(tx_can_msg))&es58x_fd_urb_cmd->raw_msg[msg_len];
      |                                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from drivers/net/can/usb/etas_es58x/es58x_core.h:22,
                 from drivers/net/can/usb/etas_es58x/es58x_fd.c:17:
drivers/net/can/usb/etas_es58x/es58x_fd.h:231:6: note: while referencing 'raw_msg'
  231 |   u8 raw_msg[0];
      |      ^~~~~~~

However, it _is_ entirely possible to have one or more flexible arrays
in a struct or union: it just has to be in another struct. And since it
cannot be alone in a struct, such a struct must have at least 1 other
named member -- but that member can be zero sized. Wrap all this nonsense
into the new DECLARE_FLEX_ARRAY() in support of having flexible arrays
in unions (or alone in a struct).

As with struct_group(), since this is needed in UAPI headers as well,
implement the core there, with a non-UAPI wrapper.

Additionally update kernel-doc to understand its existence.

https://github.com/KSPP/linux/issues/137

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
2021-10-18 12:28:52 -07:00
Oliver Upton
c68dc1b577 KVM: x86: Report host tsc and realtime values in KVM_GET_CLOCK
Handling the migration of TSCs correctly is difficult, in part because
Linux does not provide userspace with the ability to retrieve a (TSC,
realtime) clock pair for a single instant in time. In lieu of a more
convenient facility, KVM can report similar information in the kvm_clock
structure.

Provide userspace with a host TSC & realtime pair iff the realtime clock
is based on the TSC. If userspace provides KVM_SET_CLOCK with a valid
realtime value, advance the KVM clock by the amount of elapsed time. Do
not step the KVM clock backwards, though, as it is a monotonic
oscillator.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210916181538.968978-5-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-18 14:43:44 -04:00
David S. Miller
7adaf56edd Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS for net-next:

1) Add new run_estimation toggle to IPVS to stop the estimation_timer
   logic, from Dust Li.

2) Relax superfluous dynset check on NFT_SET_TIMEOUT.

3) Add egress hook, from Lukas Wunner.

4) Nowadays, almost all hook functions in x_table land just call the hook
   evaluation loop. Remove remaining hook wrappers from iptables and IPVS.
   From Florian Westphal.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 14:05:25 +01:00