Commit graph

9253 commits

Author SHA1 Message Date
Yonghong Song
a5cbe05a66 bpf: Implement bpf iterator for map elements
The bpf iterator for map elements are implemented.
The bpf program will receive four parameters:
  bpf_iter_meta *meta: the meta data
  bpf_map *map:        the bpf_map whose elements are traversed
  void *key:           the key of one element
  void *value:         the value of the same element

Here, meta and map pointers are always valid, and
key has register type PTR_TO_RDONLY_BUF_OR_NULL and
value has register type PTR_TO_RDWR_BUF_OR_NULL.
The kernel will track the access range of key and value
during verification time. Later, these values will be compared
against the values in the actual map to ensure all accesses
are within range.

A new field iter_seq_info is added to bpf_map_ops which
is used to add map type specific information, i.e., seq_ops,
init/fini seq_file func and seq_file private data size.
Subsequent patches will have actual implementation
for bpf_map_ops->iter_seq_info.

In user space, BPF_ITER_LINK_MAP_FD needs to be
specified in prog attr->link_create.flags, which indicates
that attr->link_create.target_fd is a map_fd.
The reason for such an explicit flag is for possible
future cases where one bpf iterator may allow more than
one possible customization, e.g., pid and cgroup id for
task_file.

Current kernel internal implementation only allows
the target to register at most one required bpf_iter_link_info.
To support the above case, optional bpf_iter_link_info's
are needed, the target can be extended to register such link
infos, and user provided link_info needs to match one of
target supported ones.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200723184112.590360-1-yhs@fb.com
2020-07-25 20:16:32 -07:00
Dan Williams
6450ddbd5d ACPI: NFIT: Define runtime firmware activation commands
Platform reboots are expensive. Towards reducing downtime to apply
firmware updates the Intel NVDIMM command definition is growing support
for applying live firmware updates that only require temporarily
suspending memory traffic instead of a full reboot.

Follow-on commits add support for triggering firmware activation, this
patch only defines the commands, adds probe support, and validates that
they are blocked via the ioctl path. The ioctl-path block ensures that
the OS is in charge since these commands have side effects only the OS
can handle. Specifically firmware activation may cause the memory
controller to be quiesced on the order of 100s of milliseconds. In that
case Linux ensure the activation only takes place while the OS is in a
suspend state.

Link: https://pmem.io/documents/IntelOptanePMem_DSM_Interface-V2.0.pdf
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-07-25 19:34:47 -06:00
Dan Williams
92fe2aa859 libnvdimm: Validate command family indices
The ND_CMD_CALL format allows for a general passthrough of passlisted
commands targeting a given command set. However there is no validation
of the family index relative to what the bus supports.

- Update the NFIT bus implementation (the only one that supports
  ND_CMD_CALL passthrough) to also passlist the valid set of command
  family indices.

- Update the generic __nd_ioctl() path to validate that field on behalf
  of all implementations.

Fixes: 31eca76ba2 ("nfit, libnvdimm: limited/whitelisted dimm command marshaling mechanism")
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-07-25 19:34:47 -06:00
David S. Miller
a57066b1a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The UDP reuseport conflict was a little bit tricky.

The net-next code, via bpf-next, extracted the reuseport handling
into a helper so that the BPF sk lookup code could invoke it.

At the same time, the logic for reuseport handling of unconnected
sockets changed via commit efc6b6f6c3
which changed the logic to carry on the reuseport result into the
rest of the lookup loop if we do not return immediately.

This requires moving the reuseport_has_conns() logic into the callers.

While we are here, get rid of inline directives as they do not belong
in foo.c files.

The other changes were cases of more straightforward overlapping
modifications.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-25 17:49:04 -07:00
Coly Li
ffa4703275 bcache: add bucket_size_hi into struct cache_sb_disk for large bucket
The large bucket feature is to extend bucket_size from 16bit to 32bit.

When create cache device on zoned device (e.g. zoned NVMe SSD), making
a single bucket cover one or more zones of the zoned device is the
simplest way to support zoned device as cache by bcache.

But current maximum bucket size is 16MB and a typical zone size of zoned
device is 256MB, this is the major motiviation to extend bucket size to
a larger bit width.

This patch is the basic and first change to support large bucket size,
the major changes it makes are,
- Add BCH_FEATURE_INCOMPAT_LARGE_BUCKET for the large bucket feature,
  INCOMPAT means it introduces incompatible on-disk format change.
- Add BCH_FEATURE_INCOMPAT_FUNCS(large_bucket, LARGE_BUCKET) routines.
- Adds __le16 bucket_size_hi into struct cache_sb_disk at offset 0x8d0
  for the on-disk super block format.
- For the in-memory super block struct cache_sb, member bucket_size is
  extended from __u16 to __32.
- Add get_bucket_size() to combine the bucket_size and bucket_size_hi
  from struct cache_sb_disk into an unsigned int value.

Since we already have large bucket size helpers meta_bucket_pages(),
meta_bucket_bytes() and alloc_meta_bucket_pages(), they make sure when
bucket size > 8MB, the memory allocation for bcache meta data bucket
won't fail no matter how large the bucket size extended. So these meta
data buckets are handled properly when the bucket size width increase
from 16bit to 32bit, we don't need to worry about them.

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-25 07:38:21 -06:00
Coly Li
4c1ccd0896 bcache: struct cache_sb is only for in-memory super block now
We have struct cache_sb_disk for on-disk super block already, it is
unnecessary to keep the in-memory super block format exactly mapping
to the on-disk struct layout.

This patch adds code comments to notice that struct cache_sb is not
exactly mapping to cache_sb_disk, and removes the useless member csum
and pad[5].

Although struct cache_sb does not belong to uapi, but there are still
some on-disk format related macros reference it and it is unncessary to
get rid of such dependency now. So struct cache_sb will continue to stay
in include/uapi/linux/bache.h for now.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-25 07:38:20 -06:00
Coly Li
d721a43ff6 bcache: increase super block version for cache device and backing device
The new added super block version BCACHE_SB_VERSION_BDEV_WITH_FEATURES
(5) BCACHE_SB_VERSION_CDEV_WITH_FEATURES value (6), is for the feature
set bits.

Devices have super block version equal to the new version will have
three new members for feature set bits in the on-disk super block,
        __le64                  feature_compat;
        __le64                  feature_incompat;
        __le64                  feature_ro_compat;

They are used for further new features which may introduce on-disk
format change, and avoid unncessary super block version increase.

The very basic features handling code skeleton is also initialized in
this patch.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-25 07:38:20 -06:00
Willem de Bruijn
01370434df icmp6: support rfc 4884
Extend the rfc 4884 read interface introduced for ipv4 in
commit eba75c587e ("icmp: support rfc 4884") to ipv6.

Add socket option SOL_IPV6/IPV6_RECVERR_RFC4884.

Changes v1->v2:
  - make ipv6_icmp_error_rfc4884 static (file scope)

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 17:12:41 -07:00
Ariel Levkovich
5923b8f7fa net/sched: cls_flower: Add hash info to flow classification
Adding new cls flower keys for hash value and hash
mask and dissect the hash info from the skb into
the flow key towards flow classication.

Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 15:23:31 -07:00
Pavel Machek
8b603d0715 RDMA/mlx5: Fix typo in enum name
Nnothing uses the enum name, so this is harmless.

Fixes: 3226944124 ("IB/mlx5: Introduce driver create and destroy flow methods")
Link: https://lore.kernel.org/r/20200724084112.GC31930@amd
Signed-off-by: Pavel Machek (CIP) <pavel@denx.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-24 17:02:15 -03:00
Jens Axboe
760618f7a8 Merge branch 'io_uring-5.8' into for-5.9/io_uring
Merge in io_uring-5.8 fixes, as changes/cleanups to how we do locked
mem accounting require a fixup, and only one of the spots are noticed
by git as the other merges cleanly. The flags fix from io_uring-5.8
also causes a merge conflict, the leak fix for recvmsg, the double poll
fix, and the link failure locking fix.

* io_uring-5.8:
  io_uring: fix lockup in io_fail_links()
  io_uring: fix ->work corruption with poll_add
  io_uring: missed req_init_async() for IOSQE_ASYNC
  io_uring: always allow drain/link/hardlink/async sqe flags
  io_uring: ensure double poll additions work with both request types
  io_uring: fix recvmsg memory leak with buffer selection
  io_uring: fix not initialised work->flags
  io_uring: fix missing msg_name assignment
  io_uring: account user memory freed when exit has been queued
  io_uring: fix memleak in io_sqe_files_register()
  io_uring: fix memleak in __io_sqe_files_update()
  io_uring: export cq overflow status to userspace

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-24 12:53:31 -06:00
Ofir Bitton
db491e4f08 habanalabs: Add dropped cs statistics info struct
Add command submission statistics structure which can be obtained
through the info ioctl. Each drop counter describes the reason for
which the command submission was dropped.
This information is needed for the user to be aware of the specific
reason for which the submitted work was dropped. The user can then
utilize the driver more efficiently.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:36 +03:00
Oded Gabbay
3bf1c021e3 uapi/habanalabs: fix some comments
MAP/UNMAP are done also for device memory.

Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-07-24 20:31:35 +03:00
Dave Airlie
41206a073c Linux 5.8-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8UzA4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGQ7cH/3v+Gv+SmHJCvaT2
 CSu0+7okVnYbY3UTb3hykk7/aOqb6284KjxR03r0CWFzsEsZVhC5pvvruASSiMQg
 Pi04sLqv6CsGLHd1n+pl4AUYEaxq6k4KS3uU3HHSWxrahDDApQoRUx2F8lpOxyj8
 RiwnoO60IMPA7IFJqzcZuFqsgdxqiiYvnzT461KX8Mrw6fyMXeR2KAj2NwMX8dZN
 At21Sf8+LSoh6q2HnugfiUd/jR10XbfxIIx2lXgIinb15GXgWydEQVrDJ7cUV7ix
 Jd0S+dtOtp+lWtFHDoyjjqqsMV7+G8i/rFNZoxSkyZqsUTaKzaR6JD3moSyoYZgG
 0+eXO4A=
 =9EpR
 -----END PGP SIGNATURE-----

Merge v5.8-rc6 into drm-next

I've got a silent conflict + two trees based on fixes to merge.

Fixes a silent merge with amdgpu

Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-07-24 08:48:05 +10:00
Randy Dunlap
7deff7b5b4 hyperv: hyperv.h: drop a duplicated word
Drop the repeated word "the" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: linux-hyperv@vger.kernel.org
Link: https://lore.kernel.org/r/20200719002841.20369-1-rdunlap@infradead.org
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2020-07-23 17:55:20 +00:00
Randy Dunlap
1859f4ebcf android: binder.h: drop a duplicated word
Drop the repeated word "the" in a comment.

Cc: Arve Hjønnevåg <arve@android.com>
Cc: Todd Kjos <tkjos@android.com>
Cc: Martijn Coenen <maco@android.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: devel@driverdev.osuosl.org
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20200719002738.20210-1-rdunlap@infradead.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-23 09:35:36 +02:00
Greg Kroah-Hartman
cb0cec23ce FPGA Manager changes for 5.9-rc1
Here is the (slightly larger than usual) patch set for the 5.9-rc1 merge
 window.
 
 DFL:
 - Xu's changes add support for AFU interrupt handling and puts them to
   use for error handling.
 - Xu's other change also adds another device-id for the Intel FPGA PAC N3000.
 - John's change converts from using get_user_pages() to
   pin_user_pages().
 - Gustavo's patch cleans up some of the allocation by using
   struct_size().
 
 Xilinx:
 - Luca's changes clean up the xilinx-spi and xilinx-slave-serial drivers
   and updates the comments and dt-bindings to reflect the fact it also
   supports 7 series devices.
 
 Core:
 - Tom cleaned up the fpga-bridge / fpga-mgr core by removing some
   dead-stores.
 
 All patches have been reviewed on the mailing list, and have been in the
 last few linux-next releases (as part of my for-next branch) without issues.
 
 Signed-off-by: Moritz Fischer <mdf@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iIUEABYIAC0WIQRORt0E5Sb/c/mZMgkXxQAtim5VSwUCXxJbZQ8cbWRmQGtlcm5l
 bC5vcmcACgkQF8UALYpuVUsAIgD/eMm7y2Kgw0vnteTR4diLH0T8isv24ZjgRNL1
 2FpscKsBAIp7XJUF9z1I85D476GAlArbxGK8BpscbKPJVw3mP9AC
 =Owcl
 -----END PGP SIGNATURE-----

Merge tag 'fpga-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/mdf/linux-fpga into char-misc-next

Moritz writes:

FPGA Manager changes for 5.9-rc1

Here is the (slightly larger than usual) patch set for the 5.9-rc1 merge
window.

DFL:
- Xu's changes add support for AFU interrupt handling and puts them to
  use for error handling.
- Xu's other change also adds another device-id for the Intel FPGA PAC N3000.
- John's change converts from using get_user_pages() to
  pin_user_pages().
- Gustavo's patch cleans up some of the allocation by using
  struct_size().

Xilinx:
- Luca's changes clean up the xilinx-spi and xilinx-slave-serial drivers
  and updates the comments and dt-bindings to reflect the fact it also
  supports 7 series devices.

Core:
- Tom cleaned up the fpga-bridge / fpga-mgr core by removing some
  dead-stores.

All patches have been reviewed on the mailing list, and have been in the
last few linux-next releases (as part of my for-next branch) without issues.

Signed-off-by: Moritz Fischer <mdf@kernel.org>

* tag 'fpga-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/mdf/linux-fpga:
  fpga: dfl: pci: add device id for Intel FPGA PAC N3000
  Documentation: fpga: dfl: add descriptions for interrupt related interfaces.
  fpga: dfl: afu: add AFU interrupt support
  fpga: dfl: fme: add interrupt support for global error reporting
  fpga: dfl: afu: add interrupt support for port error reporting
  fpga: dfl: introduce interrupt trigger setting API
  fpga: dfl: pci: add irq info for feature devices enumeration
  fpga: dfl: parse interrupt info for feature devices on enumeration
  fpga manager: xilinx-spi: check INIT_B pin during write_init
  dt-bindings: fpga: xilinx-slave-serial: add optional INIT_B GPIO
  fpga: Fix dead store in fpga-bridge.c
  fpga: Fix dead store fpga-mgr.c
  fpga: dfl: Use struct_size() in kzalloc()
  fpga manager: xilinx-spi: remove unneeded, mistyped variables
  fpga manager: xilinx-spi: valid for the 7 Series too
  dt-bindings: fpga: xilinx-slave-serial: valid for the 7 Series too
  fpga: dfl: afu: convert get_user_pages() --> pin_user_pages()
2020-07-23 09:24:26 +02:00
Dave Airlie
2067391195 Merge tag 'amd-drm-next-5.9-2020-07-17' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.9-2020-07-17:

amdgpu:
- SI UVD/VCE clock support
- Updates for Sienna Cichlid
- Expose drm rotation property
- Atomfirmware updates for renoir
- updates to GPUVM hub handling for different register layouts
- swSMU restructuring and cleanups
- RAS fixes
- DC fixes
- mode1 reset support for Sienna Cichlid
- Add support for Navy Flounder GPUs

amdkfd:
- Add SMI events watch interface

UAPI:
- Add amdkfd SMI events watch interface
  Userspace which uses this interface:
  2235ede34c

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200717132022.4014-1-alexander.deucher@amd.com
2020-07-23 15:38:11 +10:00
Dave Airlie
4145cb5416 drm-misc-next for v5.9:
UAPI Changes:
 
 Cross-subsystem Changes:
 - Convert panel-dsi-cm and ingenic bindings to YAML.
 - Add lockdep annotations for dma-fence. \o/
 - Describe why indefinite fences are a bad idea
 - Update binding for rocktech jh057n00900.
 
 Core Changes:
 - Add vblank workers.
 - Use spin_(un)lock_irq instead of the irqsave/restore variants in crtc code.
 - Add managed vram helpers.
 - Convert more logging to drm functions.
 - Replace more http links with https in core and drivers.
 - Cleanup to ttm iomem functions and implementation.
 - Remove TTM CMA memtype as it doesn't work correctly.
 - Remove TTM_MEMTYPE_FLAG_MAPPABLE for many drivers that have no
   unmappable memory resources.
 
 Driver Changes:
 - Add CRC support to nouveau, using the new vblank workers.
 - Dithering and atomic state fix for nouveau.
 - Fixes for Frida FRD350H54004 panel.
 - Add support for OSD mode (sprite planes), IPU (scaling) and multiple
   panels/bridges to ingenic.
 - Use managed vram helpers in ast.
 - Assorted small fixes to ingenic, i810, mxsfb.
 - Remove optional unused ttm dummy functions.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl8YFw0ACgkQ/lWMcqZw
 E8MlixAAk+lI5qZ95xtZL8Evdg4c70wYYLuPKz5JPb/lTYoV0MciFEUCF5J2Df9N
 83oMB1fycCEe396fb0aQlzq7IzMV5RFF+Y4hrDSqq0m7prlK7EphVTmTlaSFovPW
 nQQTXBuET9LzgM7Dnhu4MPbD75IeFZ+pT+58yr3oUjki3r9bf0KIgy9QnkanDwl1
 nH1b2MCCqvVjgca8zk3NpD4H2FwOpLL87/SQmRINR9CshvuACZD6zRLoJuKrWGzs
 XZDVifhTib/ZfONr6tTWgsfv5d4IEifKml6XOV5OPy0K37u9tG0MmjJOm7pswq1/
 F8oyvNWbfP7IOgTeBvT3sDMgVv4v8rvHumYoL+J4v0Sg4Qpsro/KDX9aLQLT1SIA
 ZlHahSxW10H699UrV4Lr6DnW1caTaWLuvJyqmo838MNhskuEmKFWGxOPH8oGqcwW
 2/Hk8Ni+z2q7do+VWwezniy9k2d4AHF40B1ZjzWMR3dCQdz/sCHd7YY4fLOeGEF3
 C5zx6On9+S80iok4zfSATI/uVd+zwsngaqGsxZkoOhLum7xS1s8JwIyeVjTPQ2OX
 iOKH4vFu3sIBzq0dz1Wrha2uBRDur+nzDL2EqD9EuSBDMN0Du5cVic94QoaXNUeO
 9yBiDNF8p8xLX2TdfzeGPx9hxchs+2Tx9GV1B62KNpkbIo3Ix0I=
 =jDr3
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2020-07-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.9:

UAPI Changes:

Cross-subsystem Changes:
- Convert panel-dsi-cm and ingenic bindings to YAML.
- Add lockdep annotations for dma-fence. \o/
- Describe why indefinite fences are a bad idea
- Update binding for rocktech jh057n00900.

Core Changes:
- Add vblank workers.
- Use spin_(un)lock_irq instead of the irqsave/restore variants in crtc code.
- Add managed vram helpers.
- Convert more logging to drm functions.
- Replace more http links with https in core and drivers.
- Cleanup to ttm iomem functions and implementation.
- Remove TTM CMA memtype as it doesn't work correctly.
- Remove TTM_MEMTYPE_FLAG_MAPPABLE for many drivers that have no
  unmappable memory resources.

Driver Changes:
- Add CRC support to nouveau, using the new vblank workers.
- Dithering and atomic state fix for nouveau.
- Fixes for Frida FRD350H54004 panel.
- Add support for OSD mode (sprite planes), IPU (scaling) and multiple
  panels/bridges to ingenic.
- Use managed vram helpers in ast.
- Assorted small fixes to ingenic, i810, mxsfb.
- Remove optional unused ttm dummy functions.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/d6bf269e-ccb2-8a7b-fdae-226e9e3f8274@linux.intel.com
2020-07-23 14:01:45 +10:00
David S. Miller
dee72f8a0c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-07-21

The following pull-request contains BPF updates for your *net-next* tree.

We've added 46 non-merge commits during the last 6 day(s) which contain
a total of 68 files changed, 4929 insertions(+), 526 deletions(-).

The main changes are:

1) Run BPF program on socket lookup, from Jakub.

2) Introduce cpumap, from Lorenzo.

3) s390 JIT fixes, from Ilya.

4) teach riscv JIT to emit compressed insns, from Luke.

5) use build time computed BTF ids in bpf iter, from Yonghong.
====================

Purely independent overlapping changes in both filter.h and xdp.h

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 12:35:33 -07:00
Randy Dunlap
c333f9495c raid: md_p.h: drop duplicated word in a comment
Drop the doubled word "the" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: linux-raid@vger.kernel.org
Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-21 22:05:32 -07:00
Martin Varghese
4787dd582d bareudp: Reverted support to enable & disable rx metadata collection
The commit fe80536acf ("bareudp: Added attribute to enable & disable
rx metadata collection") breaks the the original(5.7) default behavior of
bareudp module to collect RX metadadata at the receive. It was added to
avoid the crash at the kernel neighbour subsytem when packet with metadata
from bareudp is processed. But it is no more needed as the
commit 394de110a7 ("net: Added pointer check for
dst->ops->neigh_lookup in dst_neigh_lookup_skb") solves this crash.

Fixes: fe80536acf ("bareudp: Added attribute to enable & disable rx metadata collection")
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 18:30:47 -07:00
Max Englander
b43870c74f audit: report audit wait metric in audit status reply
In environments where the preservation of audit events and predictable
usage of system memory are prioritized, admins may use a combination of
--backlog_wait_time and -b options at the risk of degraded performance
resulting from backlog waiting. In some cases, this risk may be
preferred to lost events or unbounded memory usage. Ideally, this risk
can be mitigated by making adjustments when backlog waiting is detected.

However, detection can be difficult using the currently available
metrics. For example, an admin attempting to debug degraded performance
may falsely believe a full backlog indicates backlog waiting. It may
turn out the backlog frequently fills up but drains quickly.

To make it easier to reliably track degraded performance to backlog
waiting, this patch makes the following changes:

Add a new field backlog_wait_time_total to the audit status reply.
Initialize this field to zero. Add to this field the total time spent
by the current task on scheduled timeouts while the backlog limit is
exceeded. Reset field to zero upon request via AUDIT_SET.

Tested on Ubuntu 18.04 using complementary changes to the
audit-userspace and audit-testsuite:
- https://github.com/linux-audit/audit-userspace/pull/134
- https://github.com/linux-audit/audit-testsuite/pull/97

Signed-off-by: Max Englander <max.englander@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-07-21 11:21:44 -04:00
Randy Dunlap
8e7eafb816 RDMA: rdma_user_ioctl.h: fix a duplicated word + clarify
Change the repeated word "it" in a comment to "it to".
Also insert a dash in the sentence to add clarity.

Link: https://lore.kernel.org/r/20200719003220.21250-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-20 16:42:08 -03:00
Peter Zijlstra
6c0246a458 perf: Add perf_event_mmap_page::cap_user_time_short ABI
In order to support short clock counters, provide an ABI extension.

As a whole:

    u64 time, delta, cyc = read_cycle_counter();

+   if (cap_user_time_short)
+	cyc = time_cycle + ((cyc - time_cycle) & time_mask);

    delta = mul_u64_u32_shr(cyc, time_mult, time_shift);

    if (cap_user_time_zero)
	time = time_zero + delta;

    delta += time_offset;

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20200716051130.4359-6-leo.yan@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
2020-07-20 11:50:47 +01:00
Alexander A. Klimov
b0487e0d96 drm: Replace HTTP links with HTTPS ones
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
	  If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200719171428.60470-1-grandmaster@al2klimov.de
2020-07-20 11:47:28 +02:00
Greg Kroah-Hartman
c4d41d0055 Merge v5.8-rc6 into char-misc-next
We need the char/misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-20 09:43:40 +02:00
Greg Kroah-Hartman
eed3c957dd Linux 5.8-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8UzA4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGQ7cH/3v+Gv+SmHJCvaT2
 CSu0+7okVnYbY3UTb3hykk7/aOqb6284KjxR03r0CWFzsEsZVhC5pvvruASSiMQg
 Pi04sLqv6CsGLHd1n+pl4AUYEaxq6k4KS3uU3HHSWxrahDDApQoRUx2F8lpOxyj8
 RiwnoO60IMPA7IFJqzcZuFqsgdxqiiYvnzT461KX8Mrw6fyMXeR2KAj2NwMX8dZN
 At21Sf8+LSoh6q2HnugfiUd/jR10XbfxIIx2lXgIinb15GXgWydEQVrDJ7cUV7ix
 Jd0S+dtOtp+lWtFHDoyjjqqsMV7+G8i/rFNZoxSkyZqsUTaKzaR6JD3moSyoYZgG
 0+eXO4A=
 =9EpR
 -----END PGP SIGNATURE-----

Merge 5.8-rc6 into usb-next

We need the USB fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-20 09:41:30 +02:00
Greg Kroah-Hartman
6f2c6599ba Linux 5.8-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8UzA4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGQ7cH/3v+Gv+SmHJCvaT2
 CSu0+7okVnYbY3UTb3hykk7/aOqb6284KjxR03r0CWFzsEsZVhC5pvvruASSiMQg
 Pi04sLqv6CsGLHd1n+pl4AUYEaxq6k4KS3uU3HHSWxrahDDApQoRUx2F8lpOxyj8
 RiwnoO60IMPA7IFJqzcZuFqsgdxqiiYvnzT461KX8Mrw6fyMXeR2KAj2NwMX8dZN
 At21Sf8+LSoh6q2HnugfiUd/jR10XbfxIIx2lXgIinb15GXgWydEQVrDJ7cUV7ix
 Jd0S+dtOtp+lWtFHDoyjjqqsMV7+G8i/rFNZoxSkyZqsUTaKzaR6JD3moSyoYZgG
 0+eXO4A=
 =9EpR
 -----END PGP SIGNATURE-----

Merge 5.8-rc6 into tty-next

We need the serial/tty fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-20 09:39:11 +02:00
Dave Airlie
3ffff3c685 drm-misc-next for v5.9:
UAPI Changes:
 
 Cross-subsystem Changes:
 - Add ckoenig as dma-buf maintainer.
 - Revert invalid fix for dma-fence-chain, and fix selftest.
 - Add fixmes to amifb about APUS support.
 - Use array3_size in fbcon_prepare_logo, and struct_size() in alloc_apertures.
 - Fix leaks in neofb, fb/savage and omapfb.
 - Other small fixes to fb code.
 - Convert some dt bindings to schema for some panels, and fix simple-framebuffer dt example.
 
 Core Changes:
 - Add DRM_FORMAT_MOD_GENERIC_16_16_TILE as alias to DRM_FORMAT_MOD_SAMSUNG_16_16_TILE,
   as it can be used more generic.
 - Add support for multiple DispID extension blocks in edid.
 - Use https instead of http for some of the urls.
 - Use drm_* macros for logging in mipi-dsi and fb-helper.
 - Further cleanup ttm_mem_reg handling.
 - Remove duplicated words in comments.
 
 Driver Changes:
 - Use __drm_atomic_helper_crtc_reset in all atomic drivers.
 - Add Amlogic Video FBC support to meson and fourcc to core.
 - Refactor hisilicon's hibmc_drv_vdac.
 - Create a TXP CRTC for vc4.
 - Rework cursor support in ast.
 - Fix runtime PM in STM.
 - Allow bigger cursors in vkms.
 - Cleanup sg handling in radeon and amdgpu, and stop creating dummy
   gtt nodes with ttm fixed.
 - Rework crtc handling in mgag200.
 - Miscellaneous small fixes to meson, vgem, bridge/dw-hdmi,
   panel/auo,b116xw03, panel/LG LB070WV8, lima, bridge/sil_sii8620,
   virtio, tilcdc.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl8QO6cACgkQ/lWMcqZw
 E8O/xxAAick0N8Q8sf0bL7Emh7iiHeUl2fWzbfQ8VmKEOjoScO9KYz7SSuSW8868
 qsy1pdI+T/ko2shl9w8hZ8aunNDOdCycL7F3WNSKT+SNAP2XY7R57xXjn0NCsfm/
 iY3RsST6vU1gJMFyS/6U+4OddbTcqjDT5dwSD26/kva86s2CiS/P/I6dxoH0bcDg
 Yo9QcNflZKjnP/0imRYDAmm3y7N09VKtYa2Df5dOCJiiijXxMTSQN6TB/TfFtYTi
 CfyIm7dEF1Nnmy+jlxiYxAXVZYlPvIJ/7nTInO/gRGLhiEyIuG9u1lZSna9kRGrn
 5eg+41u4sq3YnDB5+qZmMJ7yBKFIy51+5JweVQeaykBW8p4Z4Qrir2ENPLZWuyeC
 CR1cOUUrUkSaMThy2H6IPe+T6BDzKpceuHnOxv7MmTfBSzLwRR7Bn216zrC33sET
 i6AsS6Ir+lfkH26oGceceEHdL5biMjFuRPiq8MfzzEfnh1o7RZ2wvEg7gHV/QeiE
 ugD7peLR28gJnupFQyBzcbyqKr761W7twgwAOvEOo3Up1LldxYLmQmc3VQeB84j2
 mndhyBfXD6Jniuit2+PxuNXGRcK1oYExRxJKD9msZCkUMe1pezSDrHZcc+emnh2G
 bqy8EPWcpCL0KkO/xICdJx57UwaLfAMsyP1C4u0vxy2GGSirxeg=
 =XKYB
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-2020-07-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

drm-misc-next for v5.9:

UAPI Changes:

Cross-subsystem Changes:
- Add ckoenig as dma-buf maintainer.
- Revert invalid fix for dma-fence-chain, and fix selftest.
- Add fixmes to amifb about APUS support.
- Use array3_size in fbcon_prepare_logo, and struct_size() in alloc_apertures.
- Fix leaks in neofb, fb/savage and omapfb.
- Other small fixes to fb code.
- Convert some dt bindings to schema for some panels, and fix simple-framebuffer dt example.

Core Changes:
- Add DRM_FORMAT_MOD_GENERIC_16_16_TILE as alias to DRM_FORMAT_MOD_SAMSUNG_16_16_TILE,
  as it can be used more generic.
- Add support for multiple DispID extension blocks in edid.
- Use https instead of http for some of the urls.
- Use drm_* macros for logging in mipi-dsi and fb-helper.
- Further cleanup ttm_mem_reg handling.
- Remove duplicated words in comments.

Driver Changes:
- Use __drm_atomic_helper_crtc_reset in all atomic drivers.
- Add Amlogic Video FBC support to meson and fourcc to core.
- Refactor hisilicon's hibmc_drv_vdac.
- Create a TXP CRTC for vc4.
- Rework cursor support in ast.
- Fix runtime PM in STM.
- Allow bigger cursors in vkms.
- Cleanup sg handling in radeon and amdgpu, and stop creating dummy
  gtt nodes with ttm fixed.
- Rework crtc handling in mgag200.
- Miscellaneous small fixes to meson, vgem, bridge/dw-hdmi,
  panel/auo,b116xw03, panel/LG LB070WV8, lima, bridge/sil_sii8620,
  virtio, tilcdc.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8b360d65-f228-9286-d247-3004156a5254@linux.intel.com
2020-07-20 17:30:23 +10:00
Vladimir Oltean
b6bd41363a ptp: introduce a phase offset in the periodic output request
Some PHCs like the ocelot/felix switch cannot emit generic periodic
output, but just PPS (pulse per second) signals, which:
- don't start from arbitrary absolute times, but are rather
  phase-aligned to the beginning of [the closest next] second.
- have an optional phase offset relative to that beginning of the
  second.

For those, it was initially established that they should reject any
other absolute time for the PTP_PEROUT_REQUEST than 0.000000000 [1].

But when it actually came to writing an application [2] that makes use
of this functionality, we realized that we can't really deal generically
with PHCs that support absolute start time, and with PHCs that don't,
without an explicit interface. Namely, in an ideal world, PHC drivers
would ensure that the "perout.start" value written to hardware will
result in a functional output. This means that if the PTP time has
become in the past of this PHC's current time, it should be
automatically fast-forwarded by the driver into a close enough future
time that is known to work (note: this is necessary only if the hardware
doesn't do this fast-forward by itself). But we don't really know what
is the status for PHC drivers in use today, so in the general sense,
user space would be risking to have a non-functional periodic output if
it simply asked for a start time of 0.000000000.

So let's introduce a flag for this type of reduced-functionality
hardware, named PTP_PEROUT_PHASE. The start time is just "soon", the
only thing we know for sure about this signal is that its rising edge
events, Rn, occur at:

Rn = perout.phase + n * perout.period

The "phase" in the periodic output structure is simply an alias to the
"start" time, since both cannot logically be specified at the same time.
Therefore, the binary layout of the structure is not affected.

[1]: https://patchwork.ozlabs.org/project/netdev/patch/20200320103726.32559-7-yangbo.lu@nxp.com/
[2]: https://www.mail-archive.com/linuxptp-devel@lists.sourceforge.net/msg04142.html

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 19:22:56 -07:00
Vladimir Oltean
f65b71aa25 ptp: add ability to configure duty cycle for periodic output
There are external event timestampers (PHCs with support for
PTP_EXTTS_REQUEST) that timestamp both event edges.

When those edges are very close (such as in the case of a short pulse),
there is a chance that the collected timestamp might be of the rising,
or of the falling edge, we never know.

There are also PHCs capable of generating periodic output with a
configurable duty cycle. This is good news, because we can space the
rising and falling edge out enough in time, that the risks to overrun
the 1-entry timestamp FIFO of the extts PHC are lower (example: the
perout PHC can be configured for a period of 1 second, and an "on" time
of 0.5 seconds, resulting in a duty cycle of 50%).

A flag is introduced for signaling that an on time is present in the
perout request structure, for preserving compatibility. Logically
speaking, the duty cycle cannot exceed 100% and the PTP core checks for
this.

PHC drivers that don't support this flag emit a periodic output of an
unspecified duty cycle, same as before.

The duty cycle is encoded as an "on" time, similar to the "start" and
"period" times, and reuses the reserved space while preserving overall
binary layout.

Pahole reported before:

struct ptp_perout_request {
        struct ptp_clock_time start;                     /*     0    16 */
        struct ptp_clock_time period;                    /*    16    16 */
        unsigned int               index;                /*    32     4 */
        unsigned int               flags;                /*    36     4 */
        unsigned int               rsv[4];               /*    40    16 */

        /* size: 56, cachelines: 1, members: 5 */
        /* last cacheline: 56 bytes */
};

And now:

struct ptp_perout_request {
        struct ptp_clock_time start;                     /*     0    16 */
        struct ptp_clock_time period;                    /*    16    16 */
        unsigned int               index;                /*    32     4 */
        unsigned int               flags;                /*    36     4 */
        union {
                struct ptp_clock_time on;                /*    40    16 */
                unsigned int       rsv[4];               /*    40    16 */
        };                                               /*    40    16 */

        /* size: 56, cachelines: 1, members: 5 */
        /* last cacheline: 56 bytes */
};

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 19:22:56 -07:00
Willem de Bruijn
eba75c587e icmp: support rfc 4884
Add setsockopt SOL_IP/IP_RECVERR_4884 to return the offset to an
extension struct if present.

ICMP messages may include an extension structure after the original
datagram. RFC 4884 standardized this behavior. It stores the offset
in words to the extension header in u8 icmphdr.un.reserved[1].

The field is valid only for ICMP types destination unreachable, time
exceeded and parameter problem, if length is at least 128 bytes and
entire packet does not exceed 576 bytes.

Return the offset to the start of the extension struct when reading an
ICMP error from the error queue, if it matches the above constraints.

Do not return the raw u8 field. Return the offset from the start of
the user buffer, in bytes. The kernel does not return the network and
transport headers, so subtract those.

Also validate the headers. Return the offset regardless of validation,
as an invalid extension must still not be misinterpreted as part of
the original datagram. Note that !invalid does not imply valid. If
the extension version does not match, no validation can take place,
for instance.

For backward compatibility, make this optional, set by setsockopt
SOL_IP/IP_RECVERR_RFC4884. For API example and feature test, see
github.com/wdebruij/kerneltools/blob/master/tests/recv_icmp_v2.c

For forward compatibility, reserve only setsockopt value 1, leaving
other bits for additional icmp extensions.

Changes
  v1->v2:
  - convert word offset to byte offset from start of user buffer
    - return in ee_data as u8 may be insufficient
  - define extension struct and object header structs
  - return len only if constraints met
  - if returning len, also validate

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 19:20:22 -07:00
Christoph Hellwig
55db9c0e85 net: remove compat_sys_{get,set}sockopt
Now that the ->compat_{get,set}sockopt proto_ops methods are gone
there is no good reason left to keep the compat syscalls separate.

This fixes the odd use of unsigned int for the compat_setsockopt
optlen and the missing sock_use_custom_sol_socket.

It would also easily allow running the eBPF hooks for the compat
syscalls, but such a large change in behavior does not belong into
a consolidation patch like this one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:16:40 -07:00
Michael Walle
c4471ad9a5 net: phy: add USXGMII link partner ability constants
The constants are taken from the USXGMII Singleport Copper Interface
specification. The naming are based on the SGMII ones, but with an MDIO_
prefix.

Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:05:49 -07:00
Adrian Reber
124ea650d3 capabilities: Introduce CAP_CHECKPOINT_RESTORE
This patch introduces CAP_CHECKPOINT_RESTORE, a new capability facilitating
checkpoint/restore for non-root users.

Over the last years, The CRIU (Checkpoint/Restore In Userspace) team has
been asked numerous times if it is possible to checkpoint/restore a
process as non-root. The answer usually was: 'almost'.

The main blocker to restore a process as non-root was to control the PID
of the restored process. This feature available via the clone3 system
call, or via /proc/sys/kernel/ns_last_pid is unfortunately guarded by
CAP_SYS_ADMIN.

In the past two years, requests for non-root checkpoint/restore have
increased due to the following use cases:
* Checkpoint/Restore in an HPC environment in combination with a
  resource manager distributing jobs where users are always running as
  non-root. There is a desire to provide a way to checkpoint and
  restore long running jobs.
* Container migration as non-root
* We have been in contact with JVM developers who are integrating
  CRIU into a Java VM to decrease the startup time. These
  checkpoint/restore applications are not meant to be running with
  CAP_SYS_ADMIN.

We have seen the following workarounds:
* Use a setuid wrapper around CRIU:
  See https://github.com/FredHutch/slurm-examples/blob/master/checkpointer/lib/checkpointer/checkpointer-suid.c
* Use a setuid helper that writes to ns_last_pid.
  Unfortunately, this helper delegation technique is impossible to use
  with clone3, and is thus prone to races.
  See https://github.com/twosigma/set_ns_last_pid
* Cycle through PIDs with fork() until the desired PID is reached:
  This has been demonstrated to work with cycling rates of 100,000 PIDs/s
  See https://github.com/twosigma/set_ns_last_pid
* Patch out the CAP_SYS_ADMIN check from the kernel
* Run the desired application in a new user and PID namespace to provide
  a local CAP_SYS_ADMIN for controlling PIDs. This technique has limited
  use in typical container environments (e.g., Kubernetes) as /proc is
  typically protected with read-only layers (e.g., /proc/sys) for
  hardening purposes. Read-only layers prevent additional /proc mounts
  (due to proc's SB_I_USERNS_VISIBLE property), making the use of new
  PID namespaces limited as certain applications need access to /proc
  matching their PID namespace.

The introduced capability allows to:
* Control PIDs when the current user is CAP_CHECKPOINT_RESTORE capable
  for the corresponding PID namespace via ns_last_pid/clone3.
* Open files in /proc/pid/map_files when the current user is
  CAP_CHECKPOINT_RESTORE capable in the root namespace, useful for
  recovering files that are unreachable via the file system such as
  deleted files, or memfd files.

See corresponding selftest for an example with clone3().

Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20200719100418.2112740-2-areber@redhat.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-07-19 20:14:42 +02:00
Ezequiel Garcia
b3ab1c6058 media: Add V4L2_TYPE_IS_CAPTURE helper
It's all too easy to get confused by the V4L2_TYPE_IS_OUTPUT
macro, when it's used as !V4L2_TYPE_IS_OUTPUT.

Reduce the risk of confusion with macro to explicitly
check for the CAPTURE queue type case.

This change does not affect functionality, and it's
only intended to make the code more readable.

Suggested-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
[hverkuil-cisco@xs4all.nl: checkpatch: align with parenthesis]
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2020-07-19 08:13:24 +02:00
Jakub Sitnicki
e9ddbb7707 bpf: Introduce SK_LOOKUP program type with a dedicated attach point
Add a new program type BPF_PROG_TYPE_SK_LOOKUP with a dedicated attach type
BPF_SK_LOOKUP. The new program kind is to be invoked by the transport layer
when looking up a listening socket for a new connection request for
connection oriented protocols, or when looking up an unconnected socket for
a packet for connection-less protocols.

When called, SK_LOOKUP BPF program can select a socket that will receive
the packet. This serves as a mechanism to overcome the limits of what
bind() API allows to express. Two use-cases driving this work are:

 (1) steer packets destined to an IP range, on fixed port to a socket

     192.0.2.0/24, port 80 -> NGINX socket

 (2) steer packets destined to an IP address, on any port to a socket

     198.51.100.1, any port -> L7 proxy socket

In its run-time context program receives information about the packet that
triggered the socket lookup. Namely IP version, L4 protocol identifier, and
address 4-tuple. Context can be further extended to include ingress
interface identifier.

To select a socket BPF program fetches it from a map holding socket
references, like SOCKMAP or SOCKHASH, and calls bpf_sk_assign(ctx, sk, ...)
helper to record the selection. Transport layer then uses the selected
socket as a result of socket lookup.

In its basic form, SK_LOOKUP acts as a filter and hence must return either
SK_PASS or SK_DROP. If the program returns with SK_PASS, transport should
look for a socket to receive the packet, or use the one selected by the
program if available, while SK_DROP informs the transport layer that the
lookup should fail.

This patch only enables the user to attach an SK_LOOKUP program to a
network namespace. Subsequent patches hook it up to run on local delivery
path in ipv4 and ipv6 stacks.

Suggested-by: Marek Majkowski <marek@cloudflare.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717103536.397595-3-jakub@cloudflare.com
2020-07-17 20:18:16 -07:00
Priyaranjan Jha
e3a5a1e8b6 tcp: add SNMP counter for no. of duplicate segments reported by DSACK
There are two existing SNMP counters, TCPDSACKRecv and TCPDSACKOfoRecv,
which are incremented depending on whether the DSACKed range is below
the cumulative ACK sequence number or not. Unfortunately, these both
implicitly assume each DSACK covers only one segment. This makes these
counters unusable for estimating spurious retransmit rates,
or real/non-spurious loss rate.

This patch introduces a new SNMP counter, TCPDSACKRecvSegs, which tracks
the estimated number of duplicate segments based on:
(DSACKed sequence range) / MSS. This counter is usable for estimating
spurious retransmit rates, or real/non-spurious loss rate.

Signed-off-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:54:30 -07:00
Michal Kalderon
eb7f84e379 RDMA/qedr: Add EDPM max size to alloc ucontext response
User space should receive the maximum edpm size from kernel driver,
similar to other edpm/ldpm related limits.  Add an additional parameter to
the alloc_ucontext_resp structure for the edpm maximum size.

In addition, pass an indication from user-space to kernel
(and not just kernel to user) that the DPM sizes are supported.

This is for supporting backward-forward compatibility between driver and
lib for everything related to DPM transaction and limit sizes.

This should have been part of commit mentioned in Fixes tag.

Link: https://lore.kernel.org/r/20200707063100.3811-3-michal.kalderon@marvell.com
Fixes: 93a3d05f9d ("RDMA/qedr: Add kernel capability flags for dpm enabled mode")
Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16 16:01:55 -03:00
Michal Kalderon
bbe4f42452 RDMA/qedr: Add EDPM mode type for user-fw compatibility
In older FW versions the completion flag was treated as the ack flag in
edpm messages.  commit ff937b916e ("qed: Add EDPM mode type for user-fw
compatibility") exposed the FW option of setting which mode the QP is in
by adding a flag to the qedr <-> qed API.

This patch adds the qedr <-> libqedr interface so that the libqedr can set
the flag appropriately and qedr can pass it down to FW.  Flag is added for
backward compatibility with libqedr.

For older libs, this flag didn't exist and therefore set to zero.

Fixes: ac1b36e55a ("qedr: Add support for user context verbs")
Link: https://lore.kernel.org/r/20200707063100.3811-2-michal.kalderon@marvell.com
Signed-off-by: Yuval Bason <yuval.bason@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16 16:01:55 -03:00
Randy Dunlap
bfdfa51702 bpf: Drop duplicated words in uapi helper comments
Drop doubled words "will" and "attach".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/6b9f71ae-4f8e-0259-2c5d-187ddaefe6eb@infradead.org
2020-07-16 21:00:09 +02:00
Linus Torvalds
3e543a4d30 Char/Misc fixes for 5.8-rc6
Here are number of small char/misc driver fixes for 5.8-rc6
 
 Not that many complex fixes here, just a number of small fixes for
 reported issues, and some new device ids.  Nothing fancy.
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXxBvPg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynJEQCfTmYNFFZ7WTx1FNsG/ScZLvgEC/QAnA19tK48
 HBVDaLxodkGEa24u582V
 =EcB/
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc into master

Pull char/misc fixes from Greg KH:
 "Here are number of small char/misc driver fixes for 5.8-rc6

  Not that many complex fixes here, just a number of small fixes for
  reported issues, and some new device ids. Nothing fancy.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits)
  virtio: virtio_console: add missing MODULE_DEVICE_TABLE() for rproc serial
  intel_th: Fix a NULL dereference when hub driver is not loaded
  intel_th: pci: Add Emmitsburg PCH support
  intel_th: pci: Add Tiger Lake PCH-H support
  intel_th: pci: Add Jasper Lake CPU support
  virt: vbox: Fix guest capabilities mask check
  virt: vbox: Fix VBGL_IOCTL_VMMDEV_REQUEST_BIG and _LOG req numbers to match upstream
  uio_pdrv_genirq: fix use without device tree and no interrupt
  uio_pdrv_genirq: Remove warning when irq is not specified
  coresight: etmv4: Fix CPU power management setup in probe() function
  coresight: cti: Fix error handling in probe
  Revert "zram: convert remaining CLASS_ATTR() to CLASS_ATTR_RO()"
  mei: bus: don't clean driver pointer
  misc: atmel-ssc: lock with mutex instead of spinlock
  phy: sun4i-usb: fix dereference of pointer phy0 before it is null checked
  phy: rockchip: Fix return value of inno_dsidphy_probe()
  phy: ti: j721e-wiz: Constify structs
  phy: ti: am654-serdes: Constify regmap_config
  phy: intel: fix enum type mismatch warning
  phy: intel: Fix compilation error on FIELD_PREP usage
  ...
2020-07-16 11:26:40 -07:00
Lorenzo Bianconi
9216477449 bpf: cpumap: Add the possibility to attach an eBPF program to cpumap
Introduce the capability to attach an eBPF program to cpumap entries.
The idea behind this feature is to add the possibility to define on
which CPU run the eBPF program if the underlying hw does not support
RSS. Current supported verdicts are XDP_DROP and XDP_PASS.

This patch has been tested on Marvell ESPRESSObin using xdp_redirect_cpu
sample available in the kernel tree to identify possible performance
regressions. Results show there are no observable differences in
packet-per-second:

$./xdp_redirect_cpu --progname xdp_cpu_map0 --dev eth0 --cpu 1
rx: 354.8 Kpps
rx: 356.0 Kpps
rx: 356.8 Kpps
rx: 356.3 Kpps
rx: 356.6 Kpps
rx: 356.6 Kpps
rx: 356.7 Kpps
rx: 355.8 Kpps
rx: 356.8 Kpps
rx: 356.8 Kpps

Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/5c9febdf903d810b3415732e5cd98491d7d9067a.1594734381.git.lorenzo@kernel.org
2020-07-16 17:00:32 +02:00
Lorenzo Bianconi
644bfe51fa cpumap: Formalize map value as a named struct
As it has been already done for devmap, introduce 'struct bpf_cpumap_val'
to formalize the expected values that can be passed in for a CPUMAP.
Update cpumap code to use the struct.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/754f950674665dae6139c061d28c1d982aaf4170.1594734381.git.lorenzo@kernel.org
2020-07-16 17:00:32 +02:00
Randy Dunlap
c201324b54 net: caif: drop duplicate words in comments
Drop doubled words "or" and "the" in several comments.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-15 20:34:11 -07:00
Linus Torvalds
0665a4e9a1 dmaengine fixes for v5.5-rc6
- Update dmaengine tree location to k.org
  - dmatest fix for completing threads
  - Driver fixes for:
    - k3dma
    - fsl-dma
    - idxd
    - tegra
    - and few other in bunch of other drivers
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAl8Oy/IACgkQfBQHDyUj
 g0d73g/+PGhjOBwt+nb4e4Vo/cvXrkThfIe1xXb92ZFIh3AGRct6QPlpPrMautYJ
 i0xglxenmrOblr3OFaEwg5DmhV31UqCWN1U1aagNHe7daNEW0MZQ29dtpJnUT2+B
 MVBhrNcDPIjgWl8ujhO7y+5vkCWc2iGxNt7lgpE2lAelwmJztaxZiubDD9cVJOjb
 FOYPvQFrr4Rs3Xkg8yUOs1T7xbiL5R4xyGXQ12jfl0PCoAaQVi2JKholI9zs0j64
 4TYQjIixxFRxJd3b3Ml0u21IC514+NFlN91MNmlsNXstME4tE9emT9MHVznYPRgA
 cV1Fam9sRiX+zCLpktXaKws/MWTnBvgwx7ypbM7NCKBfOY85Ta7Xi47wixFwcQBB
 PjjloTNIINq3/ye3YEaI4ia8TxmDIUiovvAaYXmgdmQd3fUs+B/iHRGfXOHpTVVM
 sc0S8Q+alZ4Zj/qjz3VmfyCGp2ppDeFC5JcpDYsv0aTkUqpRZEdlZB6/Nh2IysLT
 2U4BFDnkuW5g5Was7hiWtdynqgEPCQFYe69azBQjh+ehj+cLOjPyQxP2yvKJBT5j
 O5sOb7CYTFu3TMH0bAFCbChsWRsK7hhYA+tjoC06Qfj4xm8QXJyGhJXgr60ob+9d
 5rKRY/AxdnR7dHALxICuUktj2q8YzZlCy2Oln0q0abJ4R4KyUm4=
 =F+IX
 -----END PGP SIGNATURE-----

Merge tag 'dmaengine-fix-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine into master

Pull dmaengine fixes from Vinod Koul:

 - update dmaengine tree location to kernel.org

 - dmatest fix for completing threads

 - driver fixes for k3dma, fsl-dma, idxd, ,tegra, and few other drivers

* tag 'dmaengine-fix-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (21 commits)
  dmaengine: ioat setting ioat timeout as module parameter
  dmaengine: fsl-edma: fix wrong tcd endianness for big-endian cpu
  dmaengine: dmatest: stop completed threads when running without set channel
  dmaengine: fsl-edma-common: correct DSIZE_32BYTE
  dmaengine: dw: Initialize channel before each transfer
  dmaengine: idxd: fix misc interrupt handler thread unmasking
  dmaengine: idxd: cleanup workqueue config after disabling
  dmaengine: tegra210-adma: Fix runtime PM imbalance on error
  dmaengine: mcf-edma: Fix NULL pointer exception in mcf_edma_tx_handler
  dmaengine: fsl-edma: Fix NULL pointer exception in fsl_edma_tx_handler
  dmaengine: fsl-edma: Add lockdep assert for exported function
  dmaengine: idxd: fix hw descriptor fields for delta record
  dmaengine: ti: k3-udma: add missing put_device() call in of_xudma_dev_get()
  dmaengine: sh: usb-dmac: set tx_result parameters
  dmaengine: ti: k3-udma: Fix delayed_work usage for tx drain workaround
  dmaengine: idxd: fix cdev locking for open and release
  dmaengine: imx-sdma: Fix: Remove 'always true' comparison
  MAINTAINERS: switch dmaengine tree to kernel.org
  dmaengine: ti: k3-udma: Fix the running channel handling in alloc_chan_resources
  dmaengine: ti: k3-udma: Fix cleanup code for alloc_chan_resources
  ...
2020-07-15 15:58:11 -07:00
Amber Lin
91e2c19192 include/uapi/linux: Update KFD ioctl version
Bump KFD ioctl after adding SMI events support

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-15 13:27:34 -04:00
Amber Lin
938a0650aa drm/amdkfd: Provide SMI events watch
When the compute is malfunctioning or performance drops, the system admin
will use SMI (System Management Interface) tool to monitor/diagnostic what
went wrong. This patch provides an event watch interface for the user
space to register devices and subscribe events they are interested. After
registered, the user can use annoymous file descriptor's poll function
with wait-time specified and wait for events to happen. Once an event
happens, the user can use read() to retrieve information related to the
event.

VM fault event is done in this patch.

v2: - remove UNREGISTER and add event ENABLE/DISABLE
    - correct kfifo usage
    - move event message API to kfd_ioctl.h
v3: send the event msg in text than in binary
v4: support multiple clients
v5: move events enablement from ioctl to fd write
v6: sparse fix

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-15 13:27:34 -04:00
Randy Dunlap
afae47af0c drm: msm_drm.h: delete duplicated words in comments
Drop the doubled word "to" in comments.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200715052349.23319-6-rdunlap@infradead.org
2020-07-15 14:02:58 +02:00