Commit graph

11089 commits

Author SHA1 Message Date
Toms Atteka
28a3f06017 net: openvswitch: IPv6: Add IPv6 extension header support
This change adds a new OpenFlow field OFPXMT_OFB_IPV6_EXTHDR and
packets can be filtered using ipv6_ext flag.

Signed-off-by: Toms Atteka <cpp.code.lv@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25 10:32:55 +00:00
Subbaraya Sundeep
1241e329ce ethtool: add support to set/get completion queue event size
Add support to set completion queue event size via ethtool -G
parameter and get it via ethtool -g parameter.

~ # ./ethtool -G eth0 cqe-size 512
~ # ./ethtool -g eth0
Ring parameters for eth0:
Pre-set maximums:
RX:             1048576
RX Mini:        n/a
RX Jumbo:       n/a
TX:             1048576
Current hardware settings:
RX:             256
RX Mini:        n/a
RX Jumbo:       n/a
TX:             4096
RX Buf Len:             2048
CQE Size:                128

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-23 20:33:05 -08:00
Hans Schultz
a21d9a670d net: bridge: Add support for bridge port in locked mode
In a 802.1X scenario, clients connected to a bridge port shall not
be allowed to have traffic forwarded until fully authenticated.
A static fdb entry of the clients MAC address for the bridge port
unlocks the client and allows bidirectional communication.

This scenario is facilitated with setting the bridge port in locked
mode, which is also supported by various switchcore chipsets.

Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-23 12:52:34 +00:00
Hangbin Liu
129e3c1bab bonding: add new option ns_ip6_target
This patch add a new bonding option ns_ip6_target, which correspond
to the arp_ip_target. With this we set IPv6 targets and send IPv6 NS
request to determine the health of the link.

For other related options like the validation, we still use
arp_validate, and will change to ns_validate later.

Note: the sysfs configuration support was removed based on
https://lore.kernel.org/netdev/8863.1645071997@famine

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-21 12:13:45 +00:00
Mobashshera Rasool
4b340a5a72 net: ip6mr: add support for passing full packet on wrong mif
This patch adds support for MRT6MSG_WRMIFWHOLE which is used to pass
full packet and real vif id when the incoming interface is wrong.
While the RP and FHR are setting up state we need to be sending the
registers encapsulated with all the data inside otherwise we lose it.
The RP then decapsulates it and forwards it to the interested parties.
Currently with WRONGMIF we can only be sending empty register packets
and will lose that data.
This behaviour can be enabled by using MRT_PIM with
val == MRT6MSG_WRMIFWHOLE. This doesn't prevent MRT6MSG_WRONGMIF from
happening, it happens in addition to it, also it is controlled by the same
throttling parameters as WRONGMIF (i.e. 1 packet per 3 seconds currently).
Both messages are generated to keep backwards compatibily and avoid
breaking someone who was enabling MRT_PIM with val == 4, since any
positive val is accepted and treated the same.

Signed-off-by: Mobashshera Rasool <mobash.rasool.linux@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 16:05:54 +00:00
Jacques de Laval
47f0bd5032 net: Add new protocol attribute to IP addresses
This patch adds a new protocol attribute to IPv4 and IPv6 addresses.
Inspiration was taken from the protocol attribute of routes. User space
applications like iproute2 can set/get the protocol with the Netlink API.

The attribute is stored as an 8-bit unsigned integer.

The protocol attribute is set by kernel for these categories:

- IPv4 and IPv6 loopback addresses
- IPv6 addresses generated from router announcements
- IPv6 link local addresses

User space may pass custom protocols, not defined by the kernel.

Grouping addresses on their origin is useful in scenarios where you want
to distinguish between addresses based on who added them, e.g. kernel
vs. user space.

Tagging addresses with a string label is an existing feature that could be
used as a solution. Unfortunately the max length of a label is
15 characters, and for compatibility reasons the label must be prefixed
with the name of the device followed by a colon. Since device names also
have a max length of 15 characters, only -1 characters is guaranteed to be
available for any origin tag, which is not that much.

A reference implementation of user space setting and getting protocols
is available for iproute2:

9a6ea18bd7

Signed-off-by: Jacques de Laval <Jacques.De.Laval@westermo.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220217150202.80802-1-Jacques.De.Laval@westermo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-18 21:20:06 -08:00
D. Wythe
f9496b7c1b net/smc: Add global configure for handshake limitation by netlink
Although we can control SMC handshake limitation through socket options,
which means that applications who need it must modify their code. It's
quite troublesome for many existing applications. This patch modifies
the global default value of SMC handshake limitation through netlink,
providing a way to put constraint on handshake without modifies any code
for applications.

Suggested-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-11 11:14:58 +00:00
D. Wythe
a6a6fe27ba net/smc: Dynamic control handshake limitation by socket options
This patch aims to add dynamic control for SMC handshake limitation for
every smc sockets, in production environment, it is possible for the
same applications to handle different service types, and may have
different opinion on SMC handshake limitation.

This patch try socket options to complete it, since we don't have socket
option level for SMC yet, which requires us to implement it at the same
time.

This patch does the following:

- add new socket option level: SOL_SMC.
- add new SMC socket option: SMC_LIMIT_HS.
- provide getter/setter for SMC socket options.

Link: https://lore.kernel.org/all/20f504f961e1a803f85d64229ad84260434203bd.1644323503.git.alibuda@linux.alibaba.com/
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-11 11:14:58 +00:00
Jakub Kicinski
5b91c5cc0e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-10 17:29:56 -08:00
Linus Torvalds
f1baf68e13 Networking fixes for 5.17-rc4, including fixes from netfilter and can.
Current release - new code bugs:
 
  - sparx5: fix get_stat64 out-of-bound access and crash
 
  - smc: fix netdev ref tracker misuse
 
 Previous releases - regressions:
 
  - eth: ixgbevf: require large buffers for build_skb on 82599VF,
    avoid overflows
 
  - eth: ocelot: fix all IP traffic getting trapped to CPU with PTP
    over IP
 
  - bonding: fix rare link activation misses in 802.3ad mode
 
 Previous releases - always broken:
 
  - tcp: fix tcp sock mem accounting in zero-copy corner cases
 
  - remove the cached dst when uncloning an skb dst and its metadata,
    since we only have one ref it'd lead to an UaF
 
  - netfilter:
    - conntrack: don't refresh sctp entries in closed state
    - conntrack: re-init state for retransmitted syn-ack, avoid
      connection establishment getting stuck with strange stacks
    - ctnetlink: disable helper autoassign, avoid it getting lost
    - nft_payload: don't allow transport header access for fragments
 
  - dsa: fix use of devres for mdio throughout drivers
 
  - eth: amd-xgbe: disable interrupts during pci removal
 
  - eth: dpaa2-eth: unregister netdev before disconnecting the PHY
 
  - eth: ice: fix IPIP and SIT TSO offload
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmIFcW0ACgkQMUZtbf5S
 IrsloQ/+LrzlXgYrWf60DEFT4+AfJ20YxeakB+SpT0cZdvylU6NHF/U4rSHdUJRn
 xZiHfhfKzNWV5miJ2wzOuvufUvoz173dtrdgJJmp6G43qfAFyqiqtxrbVuuTQk0G
 C2i5M66zJ2svSj3EO5dYKV8zeydd14eVeoaJqfN5rjMARTmvpT/ssdzw0LTf0aXp
 87CF/WyeH8NyfQxQwPmbGRxDpnV2RqDJYSNdA4kOtDrnQmoKet32rhE6liHeP2jD
 OFQo70QXEMVyIZEh4wT17lMqA4M1zIEQtgrB0NsmVNU3jQnDI4UQ0aA2gHzwK31S
 glORZWGqTGGrhVy9uQBGKUK29RW8vb0D2ojZpi3zd0htWpIqpfFf6wuMAdUwEZag
 mTlZ1Yi6YgzAQEdSALoXLKps1GiHQzQNYPK6fNxoSOxnmoQTMQL9KxU/HB9d8W+L
 hjuYOGDw9vtGyUpNF7lktCQR/sWjFCmevLk97d2mhofFqfHBJYzFHGyd0GgNVzgl
 o3CMWnyiqTVapLRMTFPnEADEXGWq4DYAjRPkfjDCHeITB9Neg9S85Geth+xwfdi1
 cwSu64xFb7k7hiAbnymB3xg5sBQjXHrIQo03GHVymQE5p6XxIhn+5UD9CYHGJ2TF
 7PGyHeJJOZyWGQ/iSZJbo+BpsVGthIgk+9tTKPeTToej0tic27w=
 =jlhJ
 -----END PGP SIGNATURE-----

Merge tag 'net-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter and can.

Current release - new code bugs:

   - sparx5: fix get_stat64 out-of-bound access and crash

   - smc: fix netdev ref tracker misuse

  Previous releases - regressions:

   - eth: ixgbevf: require large buffers for build_skb on 82599VF, avoid
     overflows

   - eth: ocelot: fix all IP traffic getting trapped to CPU with PTP
     over IP

   - bonding: fix rare link activation misses in 802.3ad mode

  Previous releases - always broken:

   - tcp: fix tcp sock mem accounting in zero-copy corner cases

   - remove the cached dst when uncloning an skb dst and its metadata,
     since we only have one ref it'd lead to an UaF

   - netfilter:
      - conntrack: don't refresh sctp entries in closed state
      - conntrack: re-init state for retransmitted syn-ack, avoid
        connection establishment getting stuck with strange stacks
      - ctnetlink: disable helper autoassign, avoid it getting lost
      - nft_payload: don't allow transport header access for fragments

   - dsa: fix use of devres for mdio throughout drivers

   - eth: amd-xgbe: disable interrupts during pci removal

   - eth: dpaa2-eth: unregister netdev before disconnecting the PHY

   - eth: ice: fix IPIP and SIT TSO offload"

* tag 'net-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (53 commits)
  net: dsa: mv88e6xxx: fix use-after-free in mv88e6xxx_mdios_unregister
  net: mscc: ocelot: fix mutex lock error during ethtool stats read
  ice: Avoid RTNL lock when re-creating auxiliary device
  ice: Fix KASAN error in LAG NETDEV_UNREGISTER handler
  ice: fix IPIP and SIT TSO offload
  ice: fix an error code in ice_cfg_phy_fec()
  net: mpls: Fix GCC 12 warning
  dpaa2-eth: unregister the netdev before disconnecting from the PHY
  skbuff: cleanup double word in comment
  net: macb: Align the dma and coherent dma masks
  mptcp: netlink: process IPv6 addrs in creating listening sockets
  selftests: mptcp: add missing join check
  net: usb: qmi_wwan: Add support for Dell DW5829e
  vlan: move dev_put into vlan_dev_uninit
  vlan: introduce vlan_dev_free_egress_priority
  ax25: fix UAF bugs of net_device caused by rebinding operation
  net: dsa: fix panic when DSA master device unbinds on shutdown
  net: amd-xgbe: disable interrupts during pci removal
  tipc: rate limit warning for received illegal binding update
  net: mdio: aspeed: Add missing MODULE_DEVICE_TABLE
  ...
2022-02-10 16:01:22 -08:00
Jakub Kicinski
4523082982 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

1) Conntrack sets on CHECKSUM_UNNECESSARY for UDP packet with no checksum,
   from Kevin Mitchell.

2) skb->priority support for nfqueue, from Nicolas Dichtel.

3) Remove conntrack extension register API, from Florian Westphal.

4) Move nat destroy hook to nf_nat_hook instead, to remove
   nf_ct_ext_destroy(), also from Florian.

5) Wrap pptp conntrack NAT hooks into single structure, from Florian Westphal.

6) Support for tcp option set to noop for nf_tables, also from Florian.

7) Do not run x_tables comment match from packet path in nf_tables,
   from Florian Westphal.

8) Replace spinlock by cmpxchg() loop to update missed ct event,
   from Florian Westphal.

9) Wrap cttimeout hooks into single structure, from Florian.

10) Add fast nft_cmp expression for up to 16-bytes.

11) Use cb->ctx to store context in ctnetlink dump, instead of using
    cb->args[], from Florian Westphal.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: ctnetlink: use dump structure instead of raw args
  nfqueue: enable to set skb->priority
  netfilter: nft_cmp: optimize comparison for 16-bytes
  netfilter: cttimeout: use option structure
  netfilter: ecache: don't use nf_conn spinlock
  netfilter: nft_compat: suppress comment match
  netfilter: exthdr: add support for tcp option removal
  netfilter: conntrack: pptp: use single option structure
  netfilter: conntrack: remove extension register api
  netfilter: conntrack: handle ->destroy hook via nat_ops instead
  netfilter: conntrack: move extension sizes into core
  netfilter: conntrack: make all extensions 8-byte alignned
  netfilter: nfqueue: enable to get skb->priority
  netfilter: conntrack: mark UDP zero checksum as CHECKSUM_UNNECESSARY
====================

Link: https://lore.kernel.org/r/20220209133616.165104-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09 21:35:08 -08:00
Jakub Kicinski
1127170d45 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2022-02-09

We've added 126 non-merge commits during the last 16 day(s) which contain
a total of 201 files changed, 4049 insertions(+), 2215 deletions(-).

The main changes are:

1) Add custom BPF allocator for JITs that pack multiple programs into a huge
   page to reduce iTLB pressure, from Song Liu.

2) Add __user tagging support in vmlinux BTF and utilize it from BPF
   verifier when generating loads, from Yonghong Song.

3) Add per-socket fast path check guarding from cgroup/BPF overhead when
   used by only some sockets, from Pavel Begunkov.

4) Continued libbpf deprecation work of APIs/features and removal of their
   usage from samples, selftests, libbpf & bpftool, from Andrii Nakryiko
   and various others.

5) Improve BPF instruction set documentation by adding byte swap
   instructions and cleaning up load/store section, from Christoph Hellwig.

6) Switch BPF preload infra to light skeleton and remove libbpf dependency
   from it, from Alexei Starovoitov.

7) Fix architecture-agnostic macros in libbpf for accessing syscall
   arguments from BPF progs for non-x86 architectures,
   from Ilya Leoshkevich.

8) Rework port members in struct bpf_sk_lookup and struct bpf_sock to be
   of 16-bit field with anonymous zero padding, from Jakub Sitnicki.

9) Add new bpf_copy_from_user_task() helper to read memory from a different
   task than current. Add ability to create sleepable BPF iterator progs,
   from Kenny Yu.

10) Implement XSK batching for ice's zero-copy driver used by AF_XDP and
    utilize TX batching API from XSK buffer pool, from Maciej Fijalkowski.

11) Generate temporary netns names for BPF selftests to avoid naming
    collisions, from Hangbin Liu.

12) Implement bpf_core_types_are_compat() with limited recursion for
    in-kernel usage, from Matteo Croce.

13) Simplify pahole version detection and finally enable CONFIG_DEBUG_INFO_DWARF5
    to be selected with CONFIG_DEBUG_INFO_BTF, from Nathan Chancellor.

14) Misc minor fixes to libbpf and selftests from various folks.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (126 commits)
  selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup
  bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide
  libbpf: Fix compilation warning due to mismatched printf format
  selftests/bpf: Test BPF_KPROBE_SYSCALL macro
  libbpf: Add BPF_KPROBE_SYSCALL macro
  libbpf: Fix accessing the first syscall argument on s390
  libbpf: Fix accessing the first syscall argument on arm64
  libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL
  selftests/bpf: Skip test_bpf_syscall_macro's syscall_arg1 on arm64 and s390
  libbpf: Fix accessing syscall arguments on riscv
  libbpf: Fix riscv register names
  libbpf: Fix accessing syscall arguments on powerpc
  selftests/bpf: Use PT_REGS_SYSCALL_REGS in bpf_syscall_macro
  libbpf: Add PT_REGS_SYSCALL_REGS macro
  selftests/bpf: Fix an endianness issue in bpf_syscall_macro test
  bpf: Fix bpf_prog_pack build HPAGE_PMD_SIZE
  bpf: Fix leftover header->pages in sparc and powerpc code.
  libbpf: Fix signedness bug in btf_dump_array_data()
  selftests/bpf: Do not export subtest as standalone test
  bpf, x86_64: Fail gracefully on bpf_jit_binary_pack_finalize failures
  ...
====================

Link: https://lore.kernel.org/r/20220209210050.8425-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09 18:40:56 -08:00
Menglong Dong
5cad527d5f net: drop_monitor: support drop reason
In the commit c504e5c2f9 ("net: skb: introduce kfree_skb_reason()")
drop reason is introduced to the tracepoint of kfree_skb. Therefore,
drop_monitor is able to report the drop reason to users by netlink.

The drop reasons are reported as string to users, which is exactly
the same as what we do when reporting it to ftrace.

Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220209060838.55513-1-imagedong@tencent.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09 17:25:57 -08:00
Jakub Sitnicki
9a69e2b385 bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide
remote_port is another case of a BPF context field documented as a 32-bit
value in network byte order for which the BPF context access converter
generates a load of a zero-padded 16-bit integer in network byte order.

First such case was dst_port in bpf_sock which got addressed in commit
4421a58271 ("bpf: Make dst_port field in struct bpf_sock 16-bit wide").

Loading 4-bytes from the remote_port offset and converting the value with
bpf_ntohl() leads to surprising results, as the expected value is shifted
by 16 bits.

Reduce the confusion by splitting the field in two - a 16-bit field holding
a big-endian integer, and a 16-bit zero-padding anonymous field that
follows it.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220209184333.654927-2-jakub@cloudflare.com
2022-02-09 11:40:45 -08:00
Matt Johnston
63ed1aab3d mctp: Add SIOCMCTP{ALLOC,DROP}TAG ioctls for tag control
This change adds a couple of new ioctls for mctp sockets:
SIOCMCTPALLOCTAG and SIOCMCTPDROPTAG.  These ioctls provide facilities
for explicit allocation / release of tags, overriding the automatic
allocate-on-send/release-on-reply and timeout behaviours. This allows
userspace more control over messages that may not fit a simple
request/response model.

In order to indicate a pre-allocated tag to the sendmsg() syscall, we
introduce a new flag to the struct sockaddr_mctp.smctp_tag value:
MCTP_TAG_PREALLOC.

Additional changes from Jeremy Kerr <jk@codeconstruct.com.au>.

Contains a fix that was:
Reported-by: kernel test robot <lkp@intel.com>

Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-09 12:00:11 +00:00
Linus Torvalds
c3bf8a1440 perf/urgent contains 3 fixups:
- Intel/PT: filters could crash the kernel
 
  - Intel: default disable the PMU for SMM, some new-ish EFI firmware has
    started using CPL3 and the PMU CPL filters don't discriminate against
    SMM, meaning that CPL3 (userspace only) events now also count EFI/SMM
    cycles.
 
  - Fixup for perf_event_attr::sig_data
 
 (Peter Zijlstra)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmH/vpgACgkQEsHwGGHe
 VUqIGw/9EWg7Ek89BG9ZZui8EEDAzx3x0s/tyxiz0z18YfvtTnex5I87uJUYpw2s
 hFhxxmGN+rwhcMGQDc0sDLLLxp170Yg0383N6OBBBMWPtNyxMWihBOHQgz8hQzbW
 KtwoiBewmvAycHw0aoOtDMqFZTn5RToONnG9h7yV9rUIGKq75XNh72MBy9sCSE2F
 w8lA3WWVTrv91YTPSMbsrm/tMC6eQCRiJGRMHTapxrWxkVu/H8O42pxJgS6dlo+h
 vw025hXcf0KGBLzwVSHYdZg8jMn7uD2oSMh+wQ+Jy15XjKVWDfF1m3sA5S+zSJsS
 THHtmqni5mF5xn0H7eOK9nYmRXR013zx6weo9miK4SN1pcoJq+PTNdSZOIwBm3Nh
 eUXR/bXFYL0GGuPOk0QHA9AjqbCBPrkiw1nfppbJem2rrZ0uKHyKa8REVcVg/Xzy
 e/nDy8I2y2bnwU9Ugk9BNWBRmn54Q2kb4/egmtLME6oYiqOXumQ4ZB/CmwRaSwxG
 bB99/tBKblrWSA6wcgATkqYFSg4ZJniDxKipnrEYX8ePkGODKHoIQS4EUyjxuPW/
 fO2G4Oe8aO/qYS/yei8XcubyEFaSPyUo+th+ZiPODCt15JKzQCAxeOYxqnEI4I4s
 5afDBmAo47bs9Eem7GRjZOgrDOP88+lISZ1rZidp5paDwWAmL2E=
 =0tH5
 -----END PGP SIGNATURE-----

Merge tag 'perf_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

 - Intel/PT: filters could crash the kernel

 - Intel: default disable the PMU for SMM, some new-ish EFI firmware has
   started using CPL3 and the PMU CPL filters don't discriminate against
   SMM, meaning that CPL3 (userspace only) events now also count EFI/SMM
   cycles.

 - Fixup for perf_event_attr::sig_data

* tag 'perf_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/pt: Fix crash with stop filters in single-range mode
  perf: uapi: Document perf_event_attr::sig_data truncation on 32 bit architectures
  selftests/perf_events: Test modification of perf_event_attr::sig_data
  perf: Copy perf_event_attr::sig_data on modification
  x86/perf: Default set FREEZE_ON_SMI for all
2022-02-06 10:11:14 -08:00
Linus Torvalds
90c9e950c0 xen: branch for v5.17-rc3
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYf5Y3AAKCRCAXGG7T9hj
 vmfRAP9dBr4LfnfLkY+If70xAVdcImOjK7NzTYCaWAFF1evmJgEAueEWUrV7hJQq
 HYiLXPWFsr5eqnzlcWwLPaBxFH+uIAY=
 =Jxjf
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-5.17a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - documentation fixes related to Xen

 - enable x2apic mode when available when running as hardware
   virtualized guest under Xen

 - cleanup and fix a corner case of vcpu enumeration when running a
   paravirtualized Xen guest

* tag 'for-linus-5.17a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/Xen: streamline (and fix) PV CPU enumeration
  xen: update missing ioctl magic numers documentation
  Improve docs for IOCTL_GNTDEV_MAP_GRANT_REF
  xen: xenbus_dev.h: delete incorrect file name
  xen/x2apic: enable x2apic mode when supported for HVM
2022-02-05 10:40:17 -08:00
Linus Torvalds
5fdb26213f ARM:
* A couple of fixes when handling an exception while a SError has been delivered
 
 * Workaround for Cortex-A510's single-step erratum
 
 RISCV:
 
 * Make CY, TM, and IR counters accessible in VU mode
 
 * Fix SBI implementation version
 
 x86:
 
 * Report deprecation of x87 features in supported CPUID
 
 * Preparation for fixing an interrupt delivery race on AMD hardware
 
 * Sparse fix
 
 All except POWER and s390:
 
 * Rework guest entry code to correctly mark noinstr areas and fix vtime'
   accounting (for x86, this was already mostly correct but not entirely;
   for ARM, MIPS and RISC-V it wasn't)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmH+E4AUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNujwf+ON/8pBWyMPdjiY5l5SyNLpRup8Su
 zkQoMEDICI7khYUz2bEAjOazFWHmHPsdogAlG82QeJCbFmCqyMb6iX0uWj53BdGP
 P2bOM/tXbulvKBBeiTritkUUNO+hBmmSF+57AOJSW+Enhc7HFwk54cuft6f30r+d
 JRaEOhPOP34hQ+wFQQhZZh72BaZBqgnrYwZDp1TiC0Wh8v7P4Nf+NFtEgba2nsxC
 xfz5PrEhvegtU8Ee9JAF2bAl7851WJq557P2cOpghtUMgh4t6GzCcUOCKIie67oQ
 0Vaf+OieAopdT+QNazSEWO9zxl7eTpWjk2hrwsDrgKHAL/YmuWJuSyEYIA==
 =C0LZ
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - A couple of fixes when handling an exception while a SError has
     been delivered

   - Workaround for Cortex-A510's single-step erratum

  RISC-V:

   - Make CY, TM, and IR counters accessible in VU mode

   - Fix SBI implementation version

  x86:

   - Report deprecation of x87 features in supported CPUID

   - Preparation for fixing an interrupt delivery race on AMD hardware

   - Sparse fix

  All except POWER and s390:

   - Rework guest entry code to correctly mark noinstr areas and fix
     vtime' accounting (for x86, this was already mostly correct but not
     entirely; for ARM, MIPS and RISC-V it wasn't)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Use ERR_PTR_USR() to return -EFAULT as a __user pointer
  KVM: x86: Report deprecated x87 features in supported CPUID
  KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata
  KVM: arm64: Stop handle_exit() from handling HVC twice when an SError occurs
  KVM: arm64: Avoid consuming a stale esr value when SError occur
  RISC-V: KVM: Fix SBI implementation version
  RISC-V: KVM: make CY, TM, and IR counters accessible in VU mode
  kvm/riscv: rework guest entry logic
  kvm/arm64: rework guest entry logic
  kvm/x86: rework guest entry logic
  kvm/mips: rework guest entry logic
  kvm: add guest_state_{enter,exit}_irqoff()
  KVM: x86: Move delivery of non-APICv interrupt into vendor code
  kvm: Move KVM_GET_XSAVE2 IOCTL definition at the end of kvm.h
2022-02-05 09:55:59 -08:00
Paolo Bonzini
7e6a6b400d KVM/arm64 fixes for 5.17, take #2
- A couple of fixes when handling an exception while a SError has been
   delivered
 
 - Workaround for Cortex-A510's single-step[ erratum
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmH9LlcPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDLTcP/3Ry8CzvPubZquMyNdRUFvEg2EcfTa6vtIGW
 Fw7ap2hwPUaXUgJKDihMFIWj3Wf/wPmXw4t2Sr8R/yq8v9kWe+IG1isnT0yQhY3W
 kLXEqc8Mu4Rf8+jvlFHsp5mLENHIswpWAv/EY49ChgZkNmtkKpnPm1qnD89d8bNv
 tUwooDWidQ/7nXdM3z6zygSROJS24+OGTYTWzOQ1KgV3FGaXbqYiCleoPOpRR/Tc
 DQQWF/tVl8bZCqgkGKZCv3aXT0ZUPrQggARJGai78vP0l2sE/Kyaydgq5I7npZja
 2L2U4kDNoPYIVa8A1jvV3Ef3AqNFs6B7+jXWfYIgAcXjCYzDK3cZcxavf/Inq9F1
 3udVGJGSzH1KkGaihW3BVhsqGORRHKCdksJzWRgqf6vGyJhJw0u0D2u1rTWcT+jw
 Nm4KxShp0CX59HSLnVF5sR0Mct3jNNZ7UCCgH7q10wuBqYRfJT32hCo2ZrT7g9oD
 IQ+pa2dVYa3SaKZ4O6T/lSlbLOuuxtvmcEIfxYpPD6m10S5RrxOdsW3MCtiYM5HQ
 24oo2mk6NIu/va0XxhcW+NBMcYtLQD9JUGbkUkpcRy2mgilTi9b4YPp+muYM7plQ
 /S1gj2kGY8vjMg0H+wysjMJyl2huEwSRsZ/UfxCAgW+MYhHLDxhxAnDWc8EcwGgE
 tUzomowB
 =Mbx/
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 5.17, take #2

- A couple of fixes when handling an exception while a SError has been
  delivered

- Workaround for Cortex-A510's single-step[ erratum
2022-02-05 00:58:25 -05:00
Justin Iurman
be847673cf uapi: ioam: Insertion frequency
Add the insertion frequency uapi for IOAM lwtunnels.

Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-04 20:24:45 -08:00
Linus Torvalds
0a566d43c8 sound fixes for 5.17-rc3
A collection of small fixes.  The major changes are ASoC core fixes,
 addressing the DPCM locking issue after the recent code changes and
 the potentially invalid register accesses via control API.  Also,
 HD-audio got a core fix for Oops at dynamic unbinding.
 The rest are device-specific small fixes, including the usual stuff
 like HD-audio and USB-audio quirks.
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmH85I4OHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE8qfA/+N1mELke3hHJh91uxPLTNMYDaG/NwsnHD1s9y
 wO9+c47Ett5fmY5eq45+55ZhQlskvlx6s7v9VuVkGGBdg+Dif5cnvRzN6irKyywg
 MV/ZrTaDWAHMWO/NEeSkKgtJEicficvxxtqCrPb/INJaexB4ZMsSr06gtxFO8ojx
 O3Y492EiLWzQuDahaq0NfBdLtPA/i6q4Ed4FPkWM2i0zQyP2zFArdO3+SS7eVWQC
 A4PerEC04QZcDSp//pFodw5ecLfndhh0LCklJIQkHqpbP2UDrPcQNa64TfLevxha
 Q6N4qJvq3KOn269wSmdeapYfBUcOE7omPZlg1s7ttAraxGbxY8uafYNXjQPW+got
 l6jcZzI0aZeHFx5ffDgEwBf1GsCsO3vKN4ao74sPJXDlstjlvH5uQmKJLRRo2P/K
 ntL+UEG8QiFZ2lzdF+6rci/V6B9YQunaZvhCDijB31A+CZvCUnE9yqTZQrW8Udef
 sqARmfT5hyXQpwDT8w5DsIwZ9mjCchsX8CuhtE9aaGYhG+NTPeWu0IHPeXmvso/I
 /w40dkZGuNZBpX33m3kkOJToUgBP8Jo+J47Tb9nEErcQ9MnSjYG7q31mnxMR9YXy
 I0K/J94IQMzsR1M7u0AaLCaCvFWqhxDtRViTpk2iCiUDaFWRyvKTAAjzFN6yD8So
 pt8ROzA=
 =491j
 -----END PGP SIGNATURE-----

Merge tag 'sound-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A collection of small fixes.

  The major changes are ASoC core fixes, addressing the DPCM locking
  issue after the recent code changes and the potentially invalid
  register accesses via control API. Also, HD-audio got a core fix for
  Oops at dynamic unbinding.

  The rest are device-specific small fixes, including the usual stuff
  like HD-audio and USB-audio quirks"

* tag 'sound-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (31 commits)
  ALSA: hda: Skip codec shutdown in case the codec is not registered
  ALSA: usb-audio: Correct quirk for VF0770
  ALSA: Replace acpi_bus_get_device()
  Input: wm97xx: Simplify resource management
  ALSA: hda/realtek: Add quirk for ASUS GU603
  ALSA: hda/realtek: Fix silent output on Gigabyte X570 Aorus Xtreme after reboot from Windows
  ALSA: hda/realtek: Fix silent output on Gigabyte X570S Aorus Master (newer chipset)
  ALSA: hda/realtek: Add missing fixup-model entry for Gigabyte X570 ALC1220 quirks
  ALSA: hda: realtek: Fix race at concurrent COEF updates
  ASoC: ops: Check for negative values before reading them
  ASoC: rt5682: Fix deadlock on resume
  ASoC: hdmi-codec: Fix OOB memory accesses
  ASoC: soc-pcm: Move debugfs removal out of spinlock
  ASoC: soc-pcm: Fix DPCM lockdep warning due to nested stream locks
  ASoC: fsl: Add missing error handling in pcm030_fabric_probe
  ALSA: hda: Fix signedness of sscanf() arguments
  ALSA: usb-audio: initialize variables that could ignore errors
  ALSA: hda: Fix UAF of leds class devs at unbinding
  ASoC: qdsp6: q6apm-dai: only stop graphs that are started
  ASoC: codecs: wcd938x: fix return value of mixer put function
  ...
2022-02-04 11:24:28 -08:00
Nicolas Dichtel
8b54136472 netfilter: nfqueue: enable to get skb->priority
This info could be useful to improve traffic analysis.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-04 06:30:27 +01:00
Florian Westphal
d1ca60efc5 netfilter: ctnetlink: disable helper autoassign
When userspace, e.g. conntrackd, inserts an entry with a specified helper,
its possible that the helper is lost immediately after its added:

ctnetlink_create_conntrack
  -> nf_ct_helper_ext_add + assign helper
    -> ctnetlink_setup_nat
      -> ctnetlink_parse_nat_setup
         -> parse_nat_setup -> nfnetlink_parse_nat_setup
	                       -> nf_nat_setup_info
                                 -> nf_conntrack_alter_reply
                                   -> __nf_ct_try_assign_helper

... and __nf_ct_try_assign_helper will zero the helper again.

Set IPS_HELPER bit to bypass auto-assign logic, its unwanted, just like
when helper is assigned via ruleset.

Dropped old 'not strictly necessary' comment, it referred to use of
rcu_assign_pointer() before it got replaced by RCU_INIT_POINTER().

NB: Fixes tag intentionally incorrect, this extends the referenced commit,
but this change won't build without IPS_HELPER introduced there.

Fixes: 6714cf5465 ("netfilter: nf_conntrack: fix explicit helper attachment and NAT")
Reported-by: Pham Thanh Tuyen <phamtyn@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-04 05:39:57 +01:00
Jakub Kicinski
c59400a68c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-03 17:36:16 -08:00
Linus Torvalds
eb2eb5161c Networking fixes for 5.17-rc3, including fixes from bpf, netfilter,
and ieee802154.
 
 Current release - regressions:
 
  - Partially revert "net/smc: Add netlink net namespace support",
    fix uABI breakage
 
  - netfilter:
      - nft_ct: fix use after free when attaching zone template
      - nft_byteorder: track register operations
 
 Previous releases - regressions:
 
  - ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback
 
  - phy: qca8081: fix speeds lower than 2.5Gb/s
 
  - sched: fix use-after-free in tc_new_tfilter()
 
 Previous releases - always broken:
 
  - tcp: fix mem under-charging with zerocopy sendmsg()
 
  - tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data()
 
  - neigh: do not trigger immediate probes on NUD_FAILED from
    neigh_managed_work, avoid a deadlock
 
  - bpf: use VM_MAP instead of VM_ALLOC for ringbuf, avoid KASAN
    false-positives
 
  - netfilter: nft_reject_bridge: fix for missing reply from prerouting
 
  - smc: forward wakeup to smc socket waitqueue after fallback
 
  - ieee802154:
      - return meaningful error codes from the netlink helpers
      - mcr20a: fix lifs/sifs periods
      - at86rf230, ca8210: stop leaking skbs on error paths
 
  - macsec: add missing un-offload call for NETDEV_UNREGISTER of parent
 
  - ax25: add refcount in ax25_dev to avoid UAF bugs
 
  - eth: mlx5e:
      - fix SFP module EEPROM query
      - fix broken SKB allocation in HW-GRO
      - IPsec offload: fix tunnel mode crypto for non-TCP/UDP flows
 
  - eth: amd-xgbe:
      - fix skb data length underflow
      - ensure reset of the tx_timer_active flag, avoid Tx timeouts
 
  - eth: stmmac: fix runtime pm use in stmmac_dvr_remove()
 
  - eth: e1000e: handshake with CSME starts from Alder Lake platforms
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmH8X9UACgkQMUZtbf5S
 IrsxuhAAlAvFHGL6y5Y2gAmhKvVUvCYjiIJBcvk7R66CwYVRxofvlhmxi6GM/Czs
 9SrVSaN4RXu3p3d7UtAl1gAQwHqzLIHH3m2g5dSKVvHZWQgkm/+n74x0aZQ9Fll7
 mWs9uu5fWsQr/qZBnnjoQTvUxRUNVd4trBy7nXGzkNqJL5j0+2TT4BhH4qalhE28
 iPc9YFCyKPdjoWFksteZqD3hAQbXxK/xRRr6xuvFHENlZdEHM6ARftHnJthTG/fY
 32rdn9YUkQ9lNtOBJNMN9yP2z1B7TcxASBqjjk55I7XtT1QAI9/PskszavHC0hOk
 BCSMX779bLNW4+G0wiSKVB4tq4tvswtawq8Hxa6zdU4TKIzfQ84ZL/Nf66GtH+4W
 C0mbZohmyJV9hQFkNT0ZLeihljd7i4BkDttlbK3uz2IL9tHeX3uSo5V7AgS/Xaf6
 frXgbGgjQTaR6IL9AUhfN3GTCx60mzpH/aRpFho8A5xAl3EtHWCJcRhbY/CEhQBR
 zyCndcLcG5mUzbhx/TxlKrrpRCLxqCUG/Tsb2wCh5jMxO1zonW9Hhv4P1ie6EFuI
 h+XiJT2WWObS/KTze9S86WOR0zcqrtRqaOGJlNB+/+K8ClZU8UsDTFXLQ0dqpVZF
 Mvp7VchBzyFFJrrvO8WkkJgLTKdaPJmM9wuWUZb4J6d2MWlmDkE=
 =qKvf
 -----END PGP SIGNATURE-----

Merge tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf, netfilter, and ieee802154.

  Current release - regressions:

   - Partially revert "net/smc: Add netlink net namespace support", fix
     uABI breakage

   - netfilter:
      - nft_ct: fix use after free when attaching zone template
      - nft_byteorder: track register operations

  Previous releases - regressions:

   - ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback

   - phy: qca8081: fix speeds lower than 2.5Gb/s

   - sched: fix use-after-free in tc_new_tfilter()

  Previous releases - always broken:

   - tcp: fix mem under-charging with zerocopy sendmsg()

   - tcp: add missing tcp_skb_can_collapse() test in
     tcp_shift_skb_data()

   - neigh: do not trigger immediate probes on NUD_FAILED from
     neigh_managed_work, avoid a deadlock

   - bpf: use VM_MAP instead of VM_ALLOC for ringbuf, avoid KASAN
     false-positives

   - netfilter: nft_reject_bridge: fix for missing reply from prerouting

   - smc: forward wakeup to smc socket waitqueue after fallback

   - ieee802154:
      - return meaningful error codes from the netlink helpers
      - mcr20a: fix lifs/sifs periods
      - at86rf230, ca8210: stop leaking skbs on error paths

   - macsec: add missing un-offload call for NETDEV_UNREGISTER of parent

   - ax25: add refcount in ax25_dev to avoid UAF bugs

   - eth: mlx5e:
      - fix SFP module EEPROM query
      - fix broken SKB allocation in HW-GRO
      - IPsec offload: fix tunnel mode crypto for non-TCP/UDP flows

   - eth: amd-xgbe:
      - fix skb data length underflow
      - ensure reset of the tx_timer_active flag, avoid Tx timeouts

   - eth: stmmac: fix runtime pm use in stmmac_dvr_remove()

   - eth: e1000e: handshake with CSME starts from Alder Lake platforms"

* tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
  ax25: fix reference count leaks of ax25_dev
  net: stmmac: ensure PTP time register reads are consistent
  net: ipa: request IPA register values be retained
  dt-bindings: net: qcom,ipa: add optional qcom,qmp property
  tools/resolve_btfids: Do not print any commands when building silently
  bpf: Use VM_MAP instead of VM_ALLOC for ringbuf
  net, neigh: Do not trigger immediate probes on NUD_FAILED from neigh_managed_work
  tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data()
  net: sparx5: do not refer to skb after passing it on
  Partially revert "net/smc: Add netlink net namespace support"
  net/mlx5e: Avoid field-overflowing memcpy()
  net/mlx5e: Use struct_group() for memcpy() region
  net/mlx5e: Avoid implicit modify hdr for decap drop rule
  net/mlx5e: IPsec: Fix tunnel mode crypto offload for non TCP/UDP traffic
  net/mlx5e: IPsec: Fix crypto offload for non TCP/UDP encapsulated traffic
  net/mlx5e: Don't treat small ceil values as unlimited in HTB offload
  net/mlx5: E-Switch, Fix uninitialized variable modact
  net/mlx5e: Fix handling of wrong devices during bond netevent
  net/mlx5e: Fix broken SKB allocation in HW-GRO
  net/mlx5e: Fix wrong calculation of header index in HW_GRO
  ...
2022-02-03 16:54:18 -08:00
Demi Marie Obenour
164666fa66 Improve docs for IOCTL_GNTDEV_MAP_GRANT_REF
--------------cKY3Ggs6VDUCSn4I6iN78sHA
Content-Type: multipart/mixed; boundary="------------g0T69ASidFiPhh4eOY4XzIg1"

--------------g0T69ASidFiPhh4eOY4XzIg1
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

The current implementation of gntdev guarantees that the first call to
IOCTL_GNTDEV_MAP_GRANT_REF will set @index to 0.  This is required to
use gntdev for Wayland, which is a future desire of Qubes OS.
Additionally, requesting zero grants results in an error, but this was
not documented either.  Document both of these.

Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/f66c5a4e-2034-00b5-a635-6983bd999c07@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2022-02-03 08:23:32 +01:00
Dmitry V. Levin
c86d86131a Partially revert "net/smc: Add netlink net namespace support"
The change of sizeof(struct smc_diag_linkinfo) by commit 79d39fc503
("net/smc: Add netlink net namespace support") introduced an ABI
regression: since struct smc_diag_lgrinfo contains an object of
type "struct smc_diag_linkinfo", offset of all subsequent members
of struct smc_diag_lgrinfo was changed by that change.

As result, applications compiled with the old version
of struct smc_diag_linkinfo will receive garbage in
struct smc_diag_lgrinfo.role if the kernel implements
this new version of struct smc_diag_linkinfo.

Fix this regression by reverting the part of commit 79d39fc503 that
changes struct smc_diag_linkinfo.  After all, there is SMC_GEN_NETLINK
interface which is good enough, so there is probably no need to touch
the smc_diag ABI in the first place.

Fixes: 79d39fc503 ("net/smc: Add netlink net namespace support")
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/20220202030904.GA9742@altlinux.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-02 07:42:41 -08:00
Marco Elver
ddecd22878 perf: uapi: Document perf_event_attr::sig_data truncation on 32 bit architectures
Due to the alignment requirements of siginfo_t, as described in
3ddb3fd8cd ("signal, perf: Fix siginfo_t by avoiding u64 on 32-bit
architectures"), siginfo_t::si_perf_data is limited to an unsigned long.

However, perf_event_attr::sig_data is an u64, to avoid having to deal
with compat conversions. Due to being an u64, it may not immediately be
clear to users that sig_data is truncated on 32 bit architectures.

Add a comment to explicitly point this out, and hopefully help some
users save time by not having to deduce themselves what's happening.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://lore.kernel.org/r/20220131103407.1971678-3-elver@google.com
2022-02-02 13:11:40 +01:00
Takashi Iwai
52517d9c0c ASoC: Fixes for v5.17
Quite a few fixes here, including an unusually large set in the core
 spurred on by various testing efforts as well as the usual small driver
 fixes.  There are quite a few fixes for out of bounds writes in both the
 core and the various Qualcomm drivers, plus a couple of fixes for
 locking in the DPCM code.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmH5Vi4ACgkQJNaLcl1U
 h9Ah6gf/SdlpKMKgAYiXAfmhb/88UgWkXWwHYgK+uHQXy3VpiG3zOso2uHQTbZ0X
 +QMZ1LwPbcSsz6Ny/w/kQ4MKeklHou0X6GiQ6PRoDukBPQ6yidivwnVXt35qNBwT
 iYmMuLpWsq0CrbUp9cPX+zurX+jUv/CKuwG1JnHxS2+lQ7UjetUOfKzO80X4zUA5
 tm12soU9PSUcSyBICg5Jn+Tt9ZPccJ4cDPLfdRVkKjwb+6o4CnuM56Jq9UeHY6KT
 SGr8/Jt0I61p8UHPUpqJsj2ItHhy80bNraSaNptQljlIYCPidgFlPKyXxgunm8e5
 Z2PUG2mOIR4pk0cj1SxOXwhH9+d3dQ==
 =UdmQ
 -----END PGP SIGNATURE-----

Merge tag 'asoc-fix-v5.17-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v5.17

Quite a few fixes here, including an unusually large set in the core
spurred on by various testing efforts as well as the usual small driver
fixes.  There are quite a few fixes for out of bounds writes in both the
core and the various Qualcomm drivers, plus a couple of fixes for
locking in the DPCM code.
2022-02-01 16:52:54 +01:00
Jakub Sitnicki
4421a58271 bpf: Make dst_port field in struct bpf_sock 16-bit wide
Menglong Dong reports that the documentation for the dst_port field in
struct bpf_sock is inaccurate and confusing. From the BPF program PoV, the
field is a zero-padded 16-bit integer in network byte order. The value
appears to the BPF user as if laid out in memory as so:

  offsetof(struct bpf_sock, dst_port) + 0  <port MSB>
                                      + 8  <port LSB>
                                      +16  0x00
                                      +24  0x00

32-, 16-, and 8-bit wide loads from the field are all allowed, but only if
the offset into the field is 0.

32-bit wide loads from dst_port are especially confusing. The loaded value,
after converting to host byte order with bpf_ntohl(dst_port), contains the
port number in the upper 16-bits.

Remove the confusion by splitting the field into two 16-bit fields. For
backward compatibility, allow 32-bit wide loads from offsetof(struct
bpf_sock, dst_port).

While at it, allow loads 8-bit loads at offset [0] and [1] from dst_port.

Reported-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/r/20220130115518.213259-2-jakub@cloudflare.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-31 12:39:12 -08:00
Akhmat Karakotov
26859240e4 txhash: Add socket option to control TX hash rethink behavior
Add the SO_TXREHASH socket option to control hash rethink behavior per socket.
When default mode is set, sockets disable rehash at initialization and use
sysctl option when entering listen state. setsockopt() overrides default
behavior.

Signed-off-by: Akhmat Karakotov <hmukos@yandex-team.ru>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31 15:05:25 +00:00
Akhmat Karakotov
e187013abe txhash: Make rethinking txhash behavior configurable via sysctl
Add a per ns sysctl that controls the txhash rethink behavior:
net.core.txrehash. When enabled, the same behavior is retained,
when disabled, rethink is not performed. Sysctl is enabled by default.

Signed-off-by: Akhmat Karakotov <hmukos@yandex-team.ru>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31 15:05:24 +00:00
Janosch Frank
f6c6804c43 kvm: Move KVM_GET_XSAVE2 IOCTL definition at the end of kvm.h
This way we can more easily find the next free IOCTL number when
adding new IOCTLs.

Fixes: be50b2065d ("kvm: x86: Add support for getting/setting expanded xstate buffer")
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Message-Id: <20220128154025.102666-1-frankja@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-31 09:21:56 -05:00
Linus Torvalds
bb37101b36 TTY/Serial driver fixes for 5.17-rc2
Here are some small bug fixes and reverts for reported problems with the
 tty core and drivers.  They include:
 	- revert the fifo use for the 8250 console mode.  It caused too
 	  many regressions and problems, and had a bug in it as well.
 	  This is being reworked and should show up in a later -rc1
 	  release, but it's not ready for 5.17
 	- rpmsg tty race fix
 	- restore the cyclades.h uapi header file.  Turns out a compiler
 	  test suite used it for some unknown reason.  Bring it back
 	  just for the parts that are used by the builder test so they
 	  continue to build.  No functionality is restored as no one
 	  actually has this hardware anymore, nor is it really tested.
 	- stm32 driver fixes
 	- n_gsm flow control fixes
 	- pl011 driver fix
 	- rs485 initialization fix
 
 All of these have been in linux-next this week with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYfU8Qw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynkJQCfb/B943LpvBd0x1gDxwCraZ0ruNQAoI1GCT5v
 fP5ATuBpZnS92MCB2961
 =WfuL
 -----END PGP SIGNATURE-----

Merge tag 'tty-5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial driver fixes from Greg KH:
 "Here are some small bug fixes and reverts for reported problems with
  the tty core and drivers. They include:

   - revert the fifo use for the 8250 console mode. It caused too many
     regressions and problems, and had a bug in it as well. This is
     being reworked and should show up in a later -rc1 release, but it's
     not ready for 5.17

   - rpmsg tty race fix

   - restore the cyclades.h uapi header file. Turns out a compiler test
     suite used it for some unknown reason. Bring it back just for the
     parts that are used by the builder test so they continue to build.
     No functionality is restored as no one actually has this hardware
     anymore, nor is it really tested.

   - stm32 driver fixes

   - n_gsm flow control fixes

   - pl011 driver fix

   - rs485 initialization fix

  All of these have been in linux-next this week with no reported
  problems"

* tag 'tty-5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  kbuild: remove include/linux/cyclades.h from header file check
  serial: core: Initialize rs485 RTS polarity already on probe
  serial: pl011: Fix incorrect rs485 RTS polarity on set_mctrl
  serial: stm32: fix software flow control transfer
  serial: stm32: prevent TDR register overwrite when sending x_char
  tty: n_gsm: fix SW flow control encoding/handling
  serial: 8250: of: Fix mapped region size when using reg-offset property
  tty: rpmsg: Fix race condition releasing tty port
  tty: Partially revert the removal of the Cyclades public API
  tty: Add support for Brainboxes UC cards.
  Revert "tty: serial: Use fifo in 8250 console driver"
2022-01-29 15:23:13 +02:00
Linus Torvalds
3cd7cd8a62 Two larger x86 series:
* Redo incorrect fix for SEV/SMAP erratum
 
 * Windows 11 Hyper-V workaround
 
 Other x86 changes:
 
 * Various x86 cleanups
 
 * Re-enable access_tracking_perf_test
 
 * Fix for #GP handling on SVM
 
 * Fix for CPUID leaf 0Dh in KVM_GET_SUPPORTED_CPUID
 
 * Fix for ICEBP in interrupt shadow
 
 * Avoid false-positive RCU splat
 
 * Enable Enlightened MSR-Bitmap support for real
 
 ARM:
 
 * Correctly update the shadow register on exception injection when
 running in nVHE mode
 
 * Correctly use the mm_ops indirection when performing cache invalidation
 from the page-table walker
 
 * Restrict the vgic-v3 workaround for SEIS to the two known broken
 implementations
 
 Generic code changes:
 
 * Dead code cleanup
 
 There will be another pull request for ARM fixes next week, but
 those patches need a bit more soak time.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmHz5eIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNv4wgAopj0Zlutrrtw3KT4/XnmSdMPgN0j
 jQNzysSLTO5wGQCEogycjYXkGUDFu1Gdi+K91QAyjeKja20pIhPLeS2CBDRJyOc5
 73K7sxqz51JnQiVFzkTuA+qzn+lXaJ9LUXtdg8BnQMSKyt2AJOqE8uT10kcYOD5q
 mW4V3QUA0QpVKN0cYHv/G/zvBwQGGSLZetFbuAzwH2EDTpIi1aio5ZN1r0AoH18L
 2x5kYPpqmnoBvo2cB4b7SNmxv3ZPQ5K+wta0uwZ4pO+UuYiRd84RPr5lErywJC3w
 nci0eC0DoXrC6h+35UItqM8RqAGv6LADbDnr1RGojmfogSD0OtbX8y3hjw==
 =iKnI
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Two larger x86 series:

   - Redo incorrect fix for SEV/SMAP erratum

   - Windows 11 Hyper-V workaround

  Other x86 changes:

   - Various x86 cleanups

   - Re-enable access_tracking_perf_test

   - Fix for #GP handling on SVM

   - Fix for CPUID leaf 0Dh in KVM_GET_SUPPORTED_CPUID

   - Fix for ICEBP in interrupt shadow

   - Avoid false-positive RCU splat

   - Enable Enlightened MSR-Bitmap support for real

  ARM:

   - Correctly update the shadow register on exception injection when
     running in nVHE mode

   - Correctly use the mm_ops indirection when performing cache
     invalidation from the page-table walker

   - Restrict the vgic-v3 workaround for SEIS to the two known broken
     implementations

  Generic code changes:

   - Dead code cleanup"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits)
  KVM: eventfd: Fix false positive RCU usage warning
  KVM: nVMX: Allow VMREAD when Enlightened VMCS is in use
  KVM: nVMX: Implement evmcs_field_offset() suitable for handle_vmread()
  KVM: nVMX: Rename vmcs_to_field_offset{,_table}
  KVM: nVMX: eVMCS: Filter out VM_EXIT_SAVE_VMX_PREEMPTION_TIMER
  KVM: nVMX: Also filter MSR_IA32_VMX_TRUE_PINBASED_CTLS when eVMCS
  selftests: kvm: check dynamic bits against KVM_X86_XCOMP_GUEST_SUPP
  KVM: x86: add system attribute to retrieve full set of supported xsave states
  KVM: x86: Add a helper to retrieve userspace address from kvm_device_attr
  selftests: kvm: move vm_xsave_req_perm call to amx_test
  KVM: x86: Sync the states size with the XCR0/IA32_XSS at, any time
  KVM: x86: Update vCPU's runtime CPUID on write to MSR_IA32_XSS
  KVM: x86: Keep MSR_IA32_XSS unchanged for INIT
  KVM: x86: Free kvm_cpuid_entry2 array on post-KVM_RUN KVM_SET_CPUID{,2}
  KVM: nVMX: WARN on any attempt to allocate shadow VMCS for vmcs02
  KVM: selftests: Don't skip L2's VMCALL in SMM test for SVM guest
  KVM: x86: Check .flags in kvm_cpuid_check_equal() too
  KVM: x86: Forcibly leave nested virt when SMM state is toggled
  KVM: SVM: drop unnecessary code in svm_hv_vmcb_dirty_nested_enlightenments()
  KVM: SVM: hyper-v: Enable Enlightened MSR-Bitmap support for real
  ...
2022-01-28 19:00:26 +02:00
Dmitry Osipenko
06feec6005
ASoC: hdmi-codec: Fix OOB memory accesses
Correct size of iec_status array by changing it to the size of status
array of the struct snd_aes_iec958. This fixes out-of-bounds slab
read accesses made by memcpy() of the hdmi-codec driver. This problem
is reported by KASAN.

Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Link: https://lore.kernel.org/r/20220112195039.1329-1-digetx@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-01-28 15:59:19 +00:00
Jakub Kicinski
9690ae6042 ethtool: add header/data split indication
For applications running on a mix of platforms it's useful
to have a clear indication whether host's NIC supports the
geometry requirements of TCP zero-copy. TCP zero-copy Rx
requires data to be neatly placed into memory pages.
Most NICs can't do that.

This patch is adding GET support only, since the NICs
I work with either always have the feature enabled or
enable it whenever MTU is set to jumbo. In other words
I don't need SET. But adding set should be trivial.
(The only note on SET is that we will likely want
the setting to be "sticky" and use 0 / `unknown`
to reset it back to driver default.)

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-28 14:43:47 +00:00
Paolo Bonzini
dd6e631220 KVM: x86: add system attribute to retrieve full set of supported xsave states
Because KVM_GET_SUPPORTED_CPUID is meant to be passed (by simple-minded
VMMs) to KVM_SET_CPUID2, it cannot include any dynamic xsave states that
have not been enabled.  Probing those, for example so that they can be
passed to ARCH_REQ_XCOMP_GUEST_PERM, requires a new ioctl or arch_prctl.
The latter is in fact worse, even though that is what the rest of the
API uses, because it would require supported_xcr0 to be moved from the
KVM module to the kernel just for this use.  In addition, the value
would be nonsensical (or an error would have to be returned) until
the KVM module is loaded in.

Therefore, to limit the growth of system ioctls, add a /dev/kvm
variant of KVM_{GET,HAS}_DEVICE_ATTR, and implement it in x86
with just one group (0) and attribute (KVM_X86_XCOMP_GUEST_SUPP).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-28 07:33:32 -05:00
Jakub Kicinski
72d044e4bf Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-27 12:54:16 -08:00
Maciej W. Rozycki
f23653fe64 tty: Partially revert the removal of the Cyclades public API
Fix a user API regression introduced with commit f76edd8f7c ("tty:
cyclades, remove this orphan"), which removed a part of the API and
caused compilation errors for user programs using said part, such as
GCC 9 in its libsanitizer component[1]:

.../libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc:160:10: fatal error: linux/cyclades.h: No such file or directory
  160 | #include <linux/cyclades.h>
      |          ^~~~~~~~~~~~~~~~~~
compilation terminated.
make[4]: *** [Makefile:664: sanitizer_platform_limits_posix.lo] Error 1

As the absolute minimum required bring `struct cyclades_monitor' and
ioctl numbers back then so as to make the library build again.  Add a
preprocessor warning as to the obsolescence of the features provided.

References:

[1] GCC PR sanitizer/100379, "cyclades.h is removed from linux kernel
    header files", <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100379>

Fixes: f76edd8f7c ("tty: cyclades, remove this orphan")
Cc: stable@vger.kernel.org # v5.13+
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Maciej W. Rozycki <macro@embecosm.com>
Link: https://lore.kernel.org/r/alpine.DEB.2.20.2201260733430.11348@tpp.orcam.me.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-26 14:49:46 +01:00
Kenny Yu
376040e473 bpf: Add bpf_copy_from_user_task() helper
This adds a helper for bpf programs to read the memory of other
tasks.

As an example use case at Meta, we are using a bpf task iterator program
and this new helper to print C++ async stack traces for all threads of
a given process.

Signed-off-by: Kenny Yu <kennyyu@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220124185403.468466-3-kennyyu@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-24 19:59:27 -08:00
Jakub Kicinski
caaba96131 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2022-01-24

We've added 80 non-merge commits during the last 14 day(s) which contain
a total of 128 files changed, 4990 insertions(+), 895 deletions(-).

The main changes are:

1) Add XDP multi-buffer support and implement it for the mvneta driver,
   from Lorenzo Bianconi, Eelco Chaudron and Toke Høiland-Jørgensen.

2) Add unstable conntrack lookup helpers for BPF by using the BPF kfunc
   infra, from Kumar Kartikeya Dwivedi.

3) Extend BPF cgroup programs to export custom ret value to userspace via
   two helpers bpf_get_retval() and bpf_set_retval(), from YiFei Zhu.

4) Add support for AF_UNIX iterator batching, from Kuniyuki Iwashima.

5) Complete missing UAPI BPF helper description and change bpf_doc.py script
   to enforce consistent & complete helper documentation, from Usama Arif.

6) Deprecate libbpf's legacy BPF map definitions and streamline XDP APIs to
   follow tc-based APIs, from Andrii Nakryiko.

7) Support BPF_PROG_QUERY for BPF programs attached to sockmap, from Di Zhu.

8) Deprecate libbpf's bpf_map__def() API and replace users with proper getters
   and setters, from Christy Lee.

9) Extend libbpf's btf__add_btf() with an additional hashmap for strings to
   reduce overhead, from Kui-Feng Lee.

10) Fix bpftool and libbpf error handling related to libbpf's hashmap__new()
    utility function, from Mauricio Vásquez.

11) Add support to BTF program names in bpftool's program dump, from Raman Shukhau.

12) Fix resolve_btfids build to pick up host flags, from Connor O'Brien.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (80 commits)
  selftests, bpf: Do not yet switch to new libbpf XDP APIs
  selftests, xsk: Fix rx_full stats test
  bpf: Fix flexible_array.cocci warnings
  xdp: disable XDP_REDIRECT for xdp frags
  bpf: selftests: add CPUMAP/DEVMAP selftests for xdp frags
  bpf: selftests: introduce bpf_xdp_{load,store}_bytes selftest
  net: xdp: introduce bpf_xdp_pointer utility routine
  bpf: generalise tail call map compatibility check
  libbpf: Add SEC name for xdp frags programs
  bpf: selftests: update xdp_adjust_tail selftest to include xdp frags
  bpf: test_run: add xdp_shared_info pointer in bpf_test_finish signature
  bpf: introduce frags support to bpf_prog_test_run_xdp()
  bpf: move user_size out of bpf_test_init
  bpf: add frags support to xdp copy helpers
  bpf: add frags support to the bpf_xdp_adjust_tail() API
  bpf: introduce bpf_xdp_get_buff_len helper
  net: mvneta: enable jumbo frames if the loaded XDP program support frags
  bpf: introduce BPF_F_XDP_HAS_FRAGS flag in prog_flags loading the ebpf program
  net: mvneta: add frags support to XDP_TX
  xdp: add frags support to xdp_return_{buff/frame}
  ...
====================

Link: https://lore.kernel.org/r/20220124221235.18993-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-24 15:42:29 -08:00
Lorenzo Bianconi
3f364222d0 net: xdp: introduce bpf_xdp_pointer utility routine
Similar to skb_header_pointer, introduce bpf_xdp_pointer utility routine
to return a pointer to a given position in the xdp_buff if the requested
area (offset + len) is contained in a contiguous memory area otherwise it
will be copied in a bounce buffer provided by the caller.
Similar to the tc counterpart, introduce the two following xdp helpers:
- bpf_xdp_load_bytes
- bpf_xdp_store_bytes

Reviewed-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/ab285c1efdd5b7a9d361348b1e7d3ef49f6382b3.1642758637.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21 14:14:03 -08:00
Lorenzo Bianconi
0165cc8170 bpf: introduce bpf_xdp_get_buff_len helper
Introduce bpf_xdp_get_buff_len helper in order to return the xdp buffer
total size (linear and paged area)

Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/aac9ac3504c84026cf66a3c71b7c5ae89bc991be.1642758637.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21 14:14:02 -08:00
Lorenzo Bianconi
c2f2cdbeff bpf: introduce BPF_F_XDP_HAS_FRAGS flag in prog_flags loading the ebpf program
Introduce BPF_F_XDP_HAS_FRAGS and the related field in bpf_prog_aux
in order to notify the driver the loaded program support xdp frags.

Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/db2e8075b7032a356003f407d1b0deb99adaa0ed.1642758637.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21 14:14:01 -08:00
Linus Torvalds
64f29d8856 The highlight is the new mount "device" string syntax implemented
by Venky Shankar.  It solves some long-standing issues with using
 different auth entities and/or mounting different CephFS filesystems
 from the same cluster, remounting and also misleading /proc/mounts
 contents.  The existing syntax of course remains to be maintained.
 
 On top of that, there is a couple of fixes for edge cases in quota
 and a new mount option for turning on unbuffered I/O mode globally
 instead of on a per-file basis with ioctl(CEPH_IOC_SYNCIO).
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmHpP5ATHGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzi0TgB/480i2lPHgA3ujJNqo5Q6z+W0vtTA2+
 Wx+4rAUgIESJVunbFxvecPbzyUXTe7wWFI11TCVHPpf6GyIIDTD+uHd3kKWtLsfL
 Zkk1/2PN9Q5Dh29R+N8rP9NaP8tIaTQjyiO3iqmRZlo+k0Z/lYtWUb+fUP05XlVY
 ML/ktW543tkKeYwl3SWdW5MqAAOVGDbTt+L51CraDhVoiUac5ptkP+cmDmIqsnGa
 ZHVqpwugxgndEIyuBHDLBps+5/LrEaL10xDhGcMtP9hwGYhyNr6Yj+azfGtHWwOi
 jdVsdHDiecUBVtGyZ351Y4pCMOmP0uJif6MOUZFXYYSSeUBUhH8UjgEi
 =jcte
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-5.17-rc1' of git://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
 "The highlight is the new mount "device" string syntax implemented by
  Venky Shankar. It solves some long-standing issues with using
  different auth entities and/or mounting different CephFS filesystems
  from the same cluster, remounting and also misleading /proc/mounts
  contents. The existing syntax of course remains to be maintained.

  On top of that, there is a couple of fixes for edge cases in quota and
  a new mount option for turning on unbuffered I/O mode globally instead
  of on a per-file basis with ioctl(CEPH_IOC_SYNCIO)"

* tag 'ceph-for-5.17-rc1' of git://github.com/ceph/ceph-client:
  ceph: move CEPH_SUPER_MAGIC definition to magic.h
  ceph: remove redundant Lsx caps check
  ceph: add new "nopagecache" option
  ceph: don't check for quotas on MDS stray dirs
  ceph: drop send metrics debug message
  rbd: make const pointer spaces a static const array
  ceph: Fix incorrect statfs report for small quota
  ceph: mount syntax module parameter
  doc: document new CephFS mount device syntax
  ceph: record updated mon_addr on remount
  ceph: new device mount syntax
  libceph: rename parse_fsid() to ceph_parse_fsid() and export
  libceph: generalize addr/ip parsing based on delimiter
2022-01-20 13:46:20 +02:00
Linus Torvalds
f4484d138b Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "55 patches.

  Subsystems affected by this patch series: percpu, procfs, sysctl,
  misc, core-kernel, get_maintainer, lib, checkpatch, binfmt, nilfs2,
  hfs, fat, adfs, panic, delayacct, kconfig, kcov, and ubsan"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (55 commits)
  lib: remove redundant assignment to variable ret
  ubsan: remove CONFIG_UBSAN_OBJECT_SIZE
  kcov: fix generic Kconfig dependencies if ARCH_WANTS_NO_INSTR
  lib/Kconfig.debug: make TEST_KMOD depend on PAGE_SIZE_LESS_THAN_256KB
  btrfs: use generic Kconfig option for 256kB page size limit
  arch/Kconfig: split PAGE_SIZE_LESS_THAN_256KB from PAGE_SIZE_LESS_THAN_64KB
  configs: introduce debug.config for CI-like setup
  delayacct: track delays from memory compact
  Documentation/accounting/delay-accounting.rst: add thrashing page cache and direct compact
  delayacct: cleanup flags in struct task_delay_info and functions use it
  delayacct: fix incomplete disable operation when switch enable to disable
  delayacct: support swapin delay accounting for swapping without blkio
  panic: remove oops_id
  panic: use error_report_end tracepoint on warnings
  fs/adfs: remove unneeded variable make code cleaner
  FAT: use io_schedule_timeout() instead of congestion_wait()
  hfsplus: use struct_group_attr() for memcpy() region
  nilfs2: remove redundant pointer sbufs
  fs/binfmt_elf: use PT_LOAD p_align values for static PIE
  const_structs.checkpatch: add frequently used ops structs
  ...
2022-01-20 10:41:01 +02:00
wangyong
5bf1828153 delayacct: track delays from memory compact
Delay accounting does not track the delay of memory compact.  When there
is not enough free memory, tasks can spend a amount of their time
waiting for compact.

To get the impact of tasks in direct memory compact, measure the delay
when allocating memory through memory compact.

Also update tools/accounting/getdelays.c:

    / # ./getdelays_next  -di -p 304
    print delayacct stats ON
    printing IO accounting
    PID     304

    CPU             count     real total  virtual total    delay total  delay average
                      277      780000000      849039485       18877296          0.068ms
    IO              count    delay total  delay average
                        0              0              0ms
    SWAP            count    delay total  delay average
                        0              0              0ms
    RECLAIM         count    delay total  delay average
                        5    11088812685           2217ms
    THRASHING       count    delay total  delay average
                        0              0              0ms
    COMPACT         count    delay total  delay average
                        3          72758              0ms
    watch: read=0, write=0, cancelled_write=0

Link: https://lkml.kernel.org/r/1638619795-71451-1-git-send-email-wang.yong12@zte.com.cn
Signed-off-by: wangyong <wang.yong12@zte.com.cn>
Reviewed-by: Jiang Xuexin <jiang.xuexin@zte.com.cn>
Reviewed-by: Zhang Wenya <zhang.wenya1@zte.com.cn>
Reviewed-by: Yang Yang <yang.yang29@zte.com.cn>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:55 +02:00
Andy Shevchenko
c7e4289cbe uuid: remove licence boilerplate text from the header
Remove licence boilerplate text from the UAPI header.

Link: https://lkml.kernel.org/r/20211216113552.81199-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:54 +02:00
Andy Shevchenko
8e930a6699 uuid: discourage people from using UAPI header in new code
Discourage people from using UAPI header in new code by adding a note.

Link: https://lkml.kernel.org/r/20211216113552.81199-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-20 08:52:54 +02:00