Commit graph

74829 commits

Author SHA1 Message Date
Florian Westphal
482cfc3185 netfilter: xtables: avoid percpu ruleset duplication
We store the rule blob per (possible) cpu.  Unfortunately this means we can
waste a lot of memory on big SMP machines. The ipt_entry structure ('rule head')
is 112 bytes, so e.g. with maxcpu=64 one single rule eats
close to 8k RAM.

Since previous patch made counters percpu it appears there is nothing
left in the rule blob that needs to be percpu.

On my test system (144 possible cpus, 400k dummy rules) this
change saves close to 9 Gigabyte of RAM.
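
The figures above can be checked with rough arithmetic. The sketch below is only an estimate: it assumes each dummy rule occupies 152 bytes (the 112-byte rule head plus a 40-byte standard target, a size the message does not state) and that the saving equals the copies that are no longer kept, i.e. possible cpus minus one.

```python
IPT_ENTRY_SIZE = 112       # bytes, 'rule head' size quoted in the message
ASSUMED_TARGET_SIZE = 40   # bytes, assumption: one standard target per dummy rule


def percpu_blob_bytes(rule_size, n_rules, copies):
    """Memory used when `copies` separate copies of the rule blob are kept."""
    return rule_size * n_rules * copies


def saved_bytes(rule_size, n_rules, possible_cpus):
    """Memory no longer needed once a single shared copy is stored."""
    return percpu_blob_bytes(rule_size, n_rules, possible_cpus - 1)


one_rule_64 = percpu_blob_bytes(IPT_ENTRY_SIZE, 1, 64)
saving = saved_bytes(IPT_ENTRY_SIZE + ASSUMED_TARGET_SIZE, 400_000, 144)
print(one_rule_64)    # 7168 bytes for one rule head on 64 possible cpus
print(saving / 1e9)   # 8.6944, i.e. close to 9 Gigabyte under these assumptions
```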

Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-12 14:27:10 +02:00
Florian Westphal
71ae0dff02 netfilter: xtables: use percpu rule counters
The binary arp/ip/ip6tables ruleset is stored per cpu.

The only reason left as to why we need percpu duplication are the rule
counters embedded into ipt_entry et al -- since each cpu has its own copy
of the rules, all counters can be lockless.

The downside is that the more cpus are supported, the more memory is
required.  Rules are not just duplicated per online cpu but for each
possible cpu, i.e. if maxcpu is 144, then the rule is duplicated 144 times,
not for the e.g. 64 cores present.

To save some memory and also improve utilization of shared caches it
would be preferable to only store the rule blob once.

So we first need to separate counters and the rule blob.

Instead of using entry->counters, allocate this percpu and store the
percpu address in entry->counters.pcnt on CONFIG_SMP.

This change makes no sense as-is; it is merely an intermediate step to
remove the percpu duplication of the rule set in a followup patch.
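
The separation described above can be pictured with a toy model. This is a plain Python sketch, not kernel code: the class and method names are hypothetical, and it shows the intended end state (one shared rule, one counter slot per possible cpu) rather than the intermediate step.

```python
class Rule:
    """A rule stored once; only its packet/byte counters are kept per cpu."""

    def __init__(self, name, possible_cpus):
        self.name = name
        # One [packets, bytes] pair per possible cpu, so each cpu updates
        # only its own slot and never contends with another cpu.
        self.pcnt = [[0, 0] for _ in range(possible_cpus)]

    def count(self, cpu, nbytes):
        """Account one packet of `nbytes` bytes on the given cpu."""
        slot = self.pcnt[cpu]
        slot[0] += 1
        slot[1] += nbytes

    def totals(self):
        """Sum all per-cpu slots, as a counter dump has to do."""
        packets = sum(slot[0] for slot in self.pcnt)
        nbytes = sum(slot[1] for slot in self.pcnt)
        return packets, nbytes


rule = Rule("dummy", possible_cpus=4)
rule.count(cpu=0, nbytes=60)
rule.count(cpu=3, nbytes=1500)
rule.count(cpu=3, nbytes=40)
print(rule.totals())  # (3, 1600)
```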

Suggested-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-12 14:27:09 +02:00
Florian Westphal
33b1f31392 net: ip_fragment: remove BRIDGE_NETFILTER mtu special handling
Since commit d6b915e29f
("ip_fragment: don't forward defragmented DF packet") the largest
fragment size is available in the IPCB.

Therefore we no longer need to care about 'encapsulation'
overhead of stripped PPPOE/VLAN headers since ip_do_fragment
doesn't use device mtu in such cases.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-12 14:16:46 +02:00
Bernhard Thaler
efb6de9b4b netfilter: bridge: forward IPv6 fragmented packets
IPv6 fragmented packets are not forwarded on an ethernet bridge
with netfilter ip6_tables loaded. Steps to reproduce:

1) create a simple bridge like this

        modprobe br_netfilter
        brctl addbr br0
        brctl addif br0 eth0
        brctl addif br0 eth2
        ifconfig eth0 up
        ifconfig eth2 up
        ifconfig br0 up

2) place a host with an IPv6 address on each side of the bridge

        set IPv6 address on host A:
        ip -6 addr add fd01:2345:6789:1::1/64 dev eth0

        set IPv6 address on host B:
        ip -6 addr add fd01:2345:6789:1::2/64 dev eth0

3) run a simple ping command on host A with packets > MTU

        ping6 -s 4000 fd01:2345:6789:1::2

4) wait some time and run e.g. "ip6tables -t nat -nvL" on the bridge

IPv6 fragmented packets traverse the bridge cleanly until somebody runs
"ip6tables -t nat -nvL". As soon as it is run (and netfilter modules are
loaded) IPv6 fragmented packets do not traverse the bridge any more (you
see no more responses in ping's output).

After applying this patch IPv6 fragmented packets traverse the bridge
cleanly in above scenario.
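
To see why the ping in step 3 produces fragments, the following sketch computes the fragment sizes. It assumes a 1500-byte MTU on the bridge ports (the message does not state one) and standard header sizes; the function name is illustrative only.

```python
IPV6_HEADER = 40      # fixed IPv6 header length in bytes
FRAGMENT_HEADER = 8   # IPv6 Fragment extension header length
ICMPV6_HEADER = 8     # ICMPv6 echo request header length


def fragment_sizes(payload, mtu=1500):
    """Return the data bytes carried by each fragment of an echo request."""
    total = ICMPV6_HEADER + payload
    if IPV6_HEADER + total <= mtu:
        return [total]  # fits in one packet, no fragmentation needed
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IPV6_HEADER - FRAGMENT_HEADER) // 8 * 8
    sizes = []
    while total > 0:
        chunk = min(per_fragment, total)
        sizes.append(chunk)
        total -= chunk
    return sizes


print(fragment_sizes(4000))  # [1448, 1448, 1112]
```

Under these assumptions "ping6 -s 4000" sends three fragments per echo request, which is the traffic that stops traversing the bridge in the failing case.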

Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
[pablo@netfilter.org: small changes to br_nf_dev_queue_xmit]
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-12 14:10:12 +02:00
Bernhard Thaler
411ffb4fde netfilter: bridge: refactor frag_max_size
Currently frag_max_size is a member of br_input_skb_cb and copied back and
forth using IPCB(skb) and BR_INPUT_SKB_CB(skb) each time it is changed or
used.

Attach frag_max_size to nf_bridge_info and set value in pre_routing and
forward functions. Use its value in forward and xmit functions.

Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-12 14:08:51 +02:00
Bernhard Thaler
72b31f7271 netfilter: bridge: detect NAT66 correctly and change MAC address
IPv4 iptables can REDIRECT/DNAT/SNAT any traffic over a bridge.

e.g. REDIRECT
$ sysctl -w net.bridge.bridge-nf-call-iptables=1
$ iptables -t nat -A PREROUTING -p tcp -m tcp --dport 8080 \
  -j REDIRECT --to-ports 81

This does not work with ip6tables on a bridge in a NAT66 scenario
because the REDIRECT/DNAT/SNAT is not correctly detected.

The bridge pre-routing (finish) netfilter hook has to check for a possible
redirect and then fix the destination MAC address. This allows ip6tables
rules to be used for local REDIRECT/DNAT/SNAT, similar to the IPv4
iptables version.

e.g. REDIRECT
$ sysctl -w net.bridge.bridge-nf-call-ip6tables=1
$ ip6tables -t nat -A PREROUTING -p tcp -m tcp --dport 8080 \
  -j REDIRECT --to-ports 81

This patch makes it possible to use IPv6 NAT66 on a bridge. It was tested
on a bridge with two interfaces using SNAT/DNAT NAT66 rules.

Reported-by: Artie Hamilton <artiemhamilton@yahoo.com>
Signed-off-by: Sven Eckelmann <sven@open-mesh.com>
[bernhard.thaler@wvnet.at: rebased, add indirect call to ip6_route_input()]
[bernhard.thaler@wvnet.at: rebased, split into separate patches]
Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-06-12 14:08:07 +02:00
Eric Dumazet
f69ad292cf tcp: fill shinfo->gso_size at last moment
In commit cd7d8498c9 ("tcp: change tcp_skb_pcount() location") we stored
gso_segs in a temporary cache hot location.

This patch does the same for gso_size.

This saves 2 cache line misses in the TCP xmit path for
the last packet that is considered but not sent because of
various conditions (cwnd, tso defer, receiver window, TSQ...)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-11 16:33:11 -07:00
Saeed Mahameed
fc11fbf9a7 net/mlx5e: Add HW cacheline start padding
Enable HW cacheline start padding and align RX WQE size to cacheline
while considering HW start padding. Also, fix dma_unmap call to use
the correct SKB data buffer size.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-11 15:55:25 -07:00
Saeed Mahameed
facc9699f0 net/mlx5e: Fix HW MTU settings
Previously we configured HW MTU to be netdev->mtu, actually we
need to configure netdev->mtu + (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN).

Also, querying the MTU cannot fail, so make the relevant helper a
void function. Add mlx5e_set_dev_port_mtu, a helper function to
handle MTU setting.
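
The MTU correction above is simple arithmetic. The sketch below uses the usual Ethernet sizes for the three constants named in the message; the function name is hypothetical and does not correspond to a driver symbol.

```python
ETH_HLEN = 14     # Ethernet header: two MAC addresses plus EtherType
VLAN_HLEN = 4     # one 802.1Q VLAN tag
ETH_FCS_LEN = 4   # frame check sequence


def hw_mtu(netdev_mtu):
    """Frame size the hardware must accept for a given netdev->mtu."""
    return netdev_mtu + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN


print(hw_mtu(1500))  # 1522
print(hw_mtu(9000))  # 9022
```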

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-11 15:55:25 -07:00
Hadar Hen Zion
a4244b0cf5 net/ethtool: Add current supported tunable options
Add strings array of the current supported tunable options.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-11 00:36:37 -07:00
Florian Fainelli
8bc84b7926 net: phy: broadcom: define Broadcom pseudo-PHY address in brcmphy.h
Define the pseudo-PHY address (30) which is used by all Broadcom
Ethernet switches in a shared header file.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-10 23:33:58 -07:00
Florian Fainelli
4f822c625f net: phy: broadcom: include phy.h for brcmphy.h
We utilize inline functions from the PHY library, make sure that we do
include phy.h in brcmphy.h in order for the code including brcmphy.h not
to have to resolve this inclusion dependency.

Fixes: 705314797b ("net: phy: broadcom: move shadow 0x1C register accessors to brcmphy.h")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-10 23:33:58 -07:00
David S. Miller
1edaa7e8a7 Merge tag 'mac80211-next-for-davem-2015-06-10' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
For this round we mostly have fixes:
 * mesh fixes from Alexis Green and Chun-Yeow Yeoh,
 * a documentation fix from Jakub Kicinski,
 * a missing channel release (from Michal Kazior),
 * a fix for a signal strength reporting bug (from Sara Sharon),
 * handle deauth while associating (myself),
 * don't report mangled TX SKB back to userspace for status (myself),
 * handle aggregation session timeouts properly in fast-xmit (myself)

However, there are also a few cleanups and one big change that
affects all drivers (and that required me to pull in your tree)
to change the mac80211 HW flags to use an unsigned long bitmap
so that we can extend them more easily - we're running out of
flags even with a cleanup to remove the two unused ones.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-10 22:49:49 -07:00
Stephen Smalley
37a9a8df8c net/unix: support SCM_SECURITY for stream sockets
SCM_SECURITY was originally only implemented for datagram sockets,
not for stream sockets.  However, SCM_CREDENTIALS is supported on
Unix stream sockets.  For consistency, implement Unix stream support
for SCM_SECURITY as well.  Also clean up the existing code and get
rid of the superfluous UNIXSID macro.

Motivated by https://bugzilla.redhat.com/show_bug.cgi?id=1224211,
where systemd was using SCM_CREDENTIALS and assumed wrongly that
SCM_SECURITY was also supported on Unix stream sockets.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-10 22:49:20 -07:00
Johannes Berg
30686bf7f5 mac80211: convert HW flags to unsigned long bitmap
As we're running out of hardware capability flags pretty quickly,
convert them to use the regular test_bit() style unsigned long
bitmaps.

This introduces a number of helper functions/macros to set and to
test the bits, along with new debugfs code.

The occurrences of an explicit __clear_bit() are intentional, the
drivers were never supposed to change their supported bits on the
fly. We should investigate changing this to be a per-frame flag.
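
As an illustration of the bitmap style mentioned above, here is a small model of a flag set stored in an array of machine words. It is a sketch only: it assumes 64-bit words, the class is hypothetical, and the method names merely mirror the kernel's set_bit/clear_bit/test_bit naming.

```python
BITS_PER_LONG = 64  # assumption: a 64-bit platform


class HwFlags:
    """Flag set backed by a list of word-sized integers."""

    def __init__(self, nflags):
        nwords = (nflags + BITS_PER_LONG - 1) // BITS_PER_LONG
        self.words = [0] * nwords

    def set_bit(self, nr):
        self.words[nr // BITS_PER_LONG] |= 1 << (nr % BITS_PER_LONG)

    def clear_bit(self, nr):
        self.words[nr // BITS_PER_LONG] &= ~(1 << (nr % BITS_PER_LONG))

    def test_bit(self, nr):
        return bool((self.words[nr // BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1)


flags = HwFlags(70)   # more flags than fit into a single 64-bit word
flags.set_bit(3)
flags.set_bit(65)     # lands in the second word
flags.clear_bit(3)
print(flags.test_bit(3), flags.test_bit(65), len(flags.words))  # False True 2
```

The point of the layout is that adding a flag beyond bit 63 only grows the word array instead of requiring a wider integer type.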

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-10 16:05:36 +02:00
Johannes Berg
206c59d1d7 Merge remote-tracking branch 'net-next/master' into mac80211-next
Merge back net-next to get wireless driver changes (from Kalle)
to be able to create the API change across all trees properly.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-10 12:45:09 +02:00
Jakub Kicinski
c2d3955ba3 mac80211: remove obsolete sentence from documentation
FIF_PROMISC_IN_BSS was removed in commit df1404650c
("mac80211: remove support for IFF_PROMISC").

Signed-off-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-09 21:48:20 +02:00
Oliver Hartkopp
dd895d7f21 can: cangw: introduce optional uid to reference created routing jobs
Similar to referencing iptables rules by their line number, this UID allows
created routing jobs to be referenced, e.g. to alter configured data modifications.

The UID is an optional non-zero value which can be provided at routing job
creation time. When the UID is set the UID replaces the data modification
configuration as job identification attribute e.g. at job removal time.
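
The identification rule above might be modelled as follows. This is one reading of the message, not the cangw implementation: the field names are hypothetical, and comparing the interfaces as well as treating a UID on either side as decisive are assumptions.

```python
from dataclasses import dataclass


@dataclass
class RoutingJob:
    src: str            # source CAN interface (hypothetical field name)
    dst: str            # destination CAN interface (hypothetical field name)
    mods: tuple = ()    # stand-in for the configured data modifications
    uid: int = 0        # 0 means no UID was provided


def identifies(job, request):
    """Return True if `request` refers to the existing routing job `job`."""
    if (job.src, job.dst) != (request.src, request.dst):
        return False
    if job.uid or request.uid:
        # A UID is in play: it replaces the modification config as identifier.
        return job.uid == request.uid
    return job.mods == request.mods


job = RoutingJob("can0", "can1", mods=(("SET", 0x123),), uid=42)
by_uid = RoutingJob("can0", "can1", uid=42)
by_mods = RoutingJob("can0", "can1", mods=(("SET", 0x123),))
print(identifies(job, by_uid), identifies(job, by_mods))  # True False
```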

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2015-06-09 09:39:49 +02:00
David S. Miller
941742f497 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-06-08 20:06:56 -07:00
Majd Dibbiny
7cf7fa529d net/mlx5_core: Fix static checker warnings around system guid query flow
Fix static checker warnings in the flow of system guid query.

Fixes: 707c4602cd ('net/mlx5_core: Add new query HCA vport commands')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07 20:11:17 -07:00
Eric Dumazet
b80c0e7858 tcp: get_cookie_sock() consolidation
IPv4 and IPv6 share same implementation of get_cookie_sock(),
and there is no point inlining it.

We add tcp_ prefix to the common helper name and export it.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07 15:19:52 -07:00
Alexei Starovoitov
d691f9e8d4 bpf: allow programs to write to certain skb fields
Allow programs to read/write the skb->mark and tc_index fields and
((struct qdisc_skb_cb *)cb)->data.

mark and tc_index are generically useful in TC.
cb[0]-cb[4] are primarily used to pass arguments from one
program to another called via bpf_tail_call() which can
be seen in sockex3_kern.c example.

All fields of 'struct __sk_buff' are readable to socket and tc_cls_act progs.
mark, tc_index are writeable from tc_cls_act only.
cb[0]-cb[4] are writeable by both sockets and tc_cls_act.
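
The access rules listed above can be summarised as a small lookup. This is only a restatement of the three sentences in Python, not verifier code; the program-type labels and the function name are hypothetical.

```python
CB_FIELDS = {f"cb[{i}]" for i in range(5)}

# Fields each program type may write; every field is readable by both.
WRITABLE = {
    "socket": CB_FIELDS,
    "tc_cls_act": CB_FIELDS | {"mark", "tc_index"},
}


def may_access(prog_type, field, write):
    """Return True if a program of `prog_type` may read or write `field`."""
    if prog_type not in WRITABLE:
        raise ValueError(f"unknown program type: {prog_type}")
    if not write:
        return True
    return field in WRITABLE[prog_type]


print(may_access("socket", "mark", write=False))      # True
print(may_access("socket", "mark", write=True))       # False
print(may_access("tc_cls_act", "mark", write=True))   # True
print(may_access("socket", "cb[4]", write=True))      # True
```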

Add verifier tests and improve sample code.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-07 02:01:33 -07:00
Eric Dumazet
90c337da15 inet: add IP_BIND_ADDRESS_NO_PORT to overcome bind(0) limitations
When an application needs to force a source IP on an active TCP socket
it has to use bind(IP, port=x).

As most applications do not want to deal with already used ports, x is
often set to 0, meaning the kernel is in charge of finding an available
port.
But the kernel does not know yet if this socket is going to be a listener or
be connected.
It has very limited choices (no full knowledge of the final 4-tuple for a
connect()).

With limited ephemeral port range (about 32K ports), it is very easy to
fill the space.

This patch adds a new SOL_IP socket option, asking kernel to ignore
the 0 port provided by application in bind(IP, port=0) and only
remember the given IP address.

The port will be automatically chosen at connect() time, in a way
that allows sharing a source port as long as the 4-tuples are unique.
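
A minimal sketch of how an application might use the option from Python, assuming a Linux kernel that carries this patch. The fallback value 24 is the Linux constant from <linux/in.h>, used when the running Python does not export it; the helper name connect_from is hypothetical, and loopback addresses are used only to keep the example self-contained.

```python
import socket

# Linux value of the option; used if the socket module lacks the constant.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def connect_from(src_ip, dst):
    """Connect to `dst` from `src_ip`; return (socket, port seen after bind)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_IP, IP_BIND_ADDRESS_NO_PORT, 1)
    s.bind((src_ip, 0))                    # only the address is remembered
    port_after_bind = s.getsockname()[1]
    s.connect(dst)                         # the source port is chosen here
    return s, port_after_bind


# A local listener so the example does not depend on an external server.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

conn, port_after_bind = connect_from("127.0.0.1", listener.getsockname())
port_after_connect = conn.getsockname()[1]
print(port_after_bind, port_after_connect)  # 0, then a kernel-chosen port

conn.close()
listener.close()
```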

This new feature is available for both IPv4 and IPv6 (Thanks Neal)

Tested:

Wrote a test program and checked its behavior on IPv4 and IPv6.

strace(1) shows sequences of bind(IP=127.0.0.2, port=0) followed by
connect().
Also getsockname() shows that the port is still 0 right after bind()
but properly allocated after connect().

socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 5
setsockopt(5, SOL_IP, IP_BIND_ADDRESS_NO_PORT, [1], 4) = 0
bind(5, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.2")}, 16) = 0
getsockname(5, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.2")}, [16]) = 0
connect(5, {sa_family=AF_INET, sin_port=htons(53174), sin_addr=inet_addr("127.0.0.3")}, 16) = 0
getsockname(5, {sa_family=AF_INET, sin_port=htons(38050), sin_addr=inet_addr("127.0.0.2")}, [16]) = 0

IPv6 test :

socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = 7
setsockopt(7, SOL_IP, IP_BIND_ADDRESS_NO_PORT, [1], 4) = 0
bind(7, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
getsockname(7, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
connect(7, {sa_family=AF_INET6, sin6_port=htons(57300), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
getsockname(7, {sa_family=AF_INET6, sin6_port=htons(60964), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0

I was able to bind()/connect() a million concurrent IPv4 sockets,
instead of ~32000 before patch.

lpaa23:~# ulimit -n 1000010
lpaa23:~# ./bind --connect --num-flows=1000000 &
1000000 sockets

lpaa23:~# grep TCP /proc/net/sockstat
TCP: inuse 2000063 orphan 0 tw 47 alloc 2000157 mem 66

Check that a given source port is indeed used by many different
connections :

lpaa23:~# ss -t src :40000 | head -10
State      Recv-Q Send-Q   Local Address:Port          Peer Address:Port
ESTAB      0      0           127.0.0.2:40000         127.0.202.33:44983
ESTAB      0      0           127.0.0.2:40000         127.2.27.240:44983
ESTAB      0      0           127.0.0.2:40000           127.2.98.5:44983
ESTAB      0      0           127.0.0.2:40000        127.0.124.196:44983
ESTAB      0      0           127.0.0.2:40000         127.2.139.38:44983
ESTAB      0      0           127.0.0.2:40000          127.1.59.80:44983
ESTAB      0      0           127.0.0.2:40000          127.3.6.228:44983
ESTAB      0      0           127.0.0.2:40000          127.0.38.53:44983
ESTAB      0      0           127.0.0.2:40000         127.1.197.10:44983

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-06 23:57:12 -07:00
Linus Torvalds
37ef1647b7 Driver core fixes for 4.1-rc7
Here are 2 fixes for the driver core that resolve some reported issues,
 one is a regression from 4.0, the other fixes a reported oops that has
 been there since 3.19.  Both have been in linux-next for a while with no
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'driver-core-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are two fixes for the driver core that resolve some reported
  issues.

  One is a regression from 4.0, the other fixes a reported oops that
  has been there since 3.19.

  Both have been in linux-next for a while with no problems"

* tag 'driver-core-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  drivers/base: cacheinfo: handle absence of caches
  drivers: of/base: move of_init to driver_init
2015-06-06 22:37:45 -07:00
Linus Torvalds
a0e9c6efa5 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "The biggest chunk of the changes are two regression fixes: a HT
  workaround fix and an event-group scheduling fix.  It's been verified
  with 5 days of fuzzer testing.

  Other fixes:

   - eBPF fix
   - a BIOS breakage detection fix
   - PMU driver fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/pt: Fix a refactoring bug
  perf/x86: Tweak broken BIOS rules during check_hw_exists()
  perf/x86/intel/pt: Untangle pt_buffer_reset_markers()
  perf: Disallow sparse AUX allocations for non-SG PMUs in overwrite mode
  perf/x86: Improve HT workaround GP counter constraint
  perf/x86: Fix event/group validation
  perf: Fix race in BPF program unregister
2015-06-05 10:00:53 -07:00
Majd Dibbiny
a124d13ef5 net/mlx5_core: Add more query port helpers
Add the following helpers:

1. mlx5_query_port_proto_oper -- queries the port speed port mask
2. mlx5_query_port_link_width_oper -- queries the port link width bitmask
3. mlx5_query_port_vl_hw_cap -- queries the Virtual Lanes supported on this port

These helpers will be used from the IB driver when working in ISSI > 0 mode.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 16:41:02 -07:00
Majd Dibbiny
a05bdefa40 net/mlx5_core: Use port number when querying port ptys
Until now, mlx5_query_port_ptys always queried port number one.

Add a new argument to the function's prototype so we can also query
the second port. This will be needed when the helper is invoked
from the IB driver on non-FPP (Function-Per-Port) devices.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 16:41:01 -07:00
Majd Dibbiny
e760152d08 net/mlx5_core: Use port number in the query port mtu helpers
Extend the function prototypes for max and operational mtu to take the
local port number. In the Ethernet driver this is hard coded to one,
since ConnectX4 Ethernet devices are always function-per-port.
The IB driver also serves older devices (ConnectIB) which are not,
and hence the port can vary.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 16:41:01 -07:00
Majd Dibbiny
211e6c80e5 net/mlx5_core: Get vendor-id using the query adapter command
Add two wrapper functions to the query adapter command:

1. mlx5_query_board_id -- replaces the old mlx5_cmd_query_adapter.

2. mlx5_core_query_vendor_id -- retrieves the vendor_id from the
   query_adapter command.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 16:41:01 -07:00
Majd Dibbiny
707c4602cd net/mlx5_core: Add new query HCA vport commands
Added the implementation for the following commands:

1. QUERY_HCA_VPORT_GID
2. QUERY_HCA_VPORT_PKEY
3. QUERY_HCA_VPORT_CONTEXT

They will be needed when we move to work with ISSI > 0 in the IB driver too.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 16:41:01 -07:00
Majd Dibbiny
d18a9470f8 net/mlx5_core: Make the vport helpers available for the IB driver too
Move the vport header file to be under include/linux/mlx5, such that
the mlx5 IB can use it as well.

Also add nic_ prefix to the vport NIC commands to differentiate between
HCA vport commands and NIC vport commands.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 16:41:01 -07:00
Haggai Abramonvsky
01949d0109 net/mlx5_core: Enable XRCs and SRQs when using ISSI > 0
When working in ISSI > 0 mode, the model exposed by the device for
XRCs and SRQs is different. XRCs use XRC SRQs and plain SRQs are based
on RMP (Receive Memory Pool).

Add helper functions to create, modify, query, and arm XRC SRQs and RMPs.

Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 16:41:01 -07:00
Tom Herbert
b3baa0fbd0 mpls: Add MPLS entropy label in flow_keys
In flow dissector if an MPLS header contains an entropy label this is
saved in the new keyid field of flow_keys. The entropy label is
then represented in the flow hash function input.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 15:44:31 -07:00
Tom Herbert
1fdd512c92 net: Add GRE keyid in flow_keys
In flow dissector if a GRE header contains a keyid this is saved in the
new keyid field of flow_keys. The GRE keyid is then represented
in the flow hash function input.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 15:44:31 -07:00
Tom Herbert
87ee9e52ff net: Add IPv6 flow label to flow_keys
In flow_dissector set the flow label in flow_keys for IPv6. This also
removes the shortcircuiting of flow dissection when a non-zero label
is present, the flow label can be considered to provide additional
entropy for a hash.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 15:44:31 -07:00
Tom Herbert
d34af823ff net: Add VLAN ID to flow_keys
In flow_dissector set vlan_id in flow_keys when VLAN is found.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 15:44:31 -07:00
Tom Herbert
45b47fd00c net: Get rid of IPv6 hash addresses flow keys
We don't need to return the IPv6 address hash as part of flow keys.
In general, using the IPv6 address hash is risky in a hash value
since the underlying use of xor provides no entropy. If someone
really needs the hash value they can get it from the full IPv6
addresses in flow keys (e.g. from flow_get_u32_src).
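
The weakness of an xor-based address hash is easy to demonstrate. The sketch below folds an IPv6 address by xor-ing its four 32-bit words, which is assumed here to approximate the kind of address hash the message refers to; the helper names are illustrative.

```python
import ipaddress


def words32(addr):
    """Split an IPv6 address into its four 32-bit words."""
    value = int(ipaddress.IPv6Address(addr))
    return [(value >> shift) & 0xFFFFFFFF for shift in (96, 64, 32, 0)]


def xor_fold(addr):
    """Fold the address into 32 bits by xor-ing its words together."""
    result = 0
    for word in words32(addr):
        result ^= word
    return result


a = "2001:db8::1"        # the low bit set in the last word
b = "2001:db8:0:1::"     # the same bit set in the second word instead
print(hex(xor_fold(a)), hex(xor_fold(b)))  # 0x20010db9 0x20010db9
```

Two distinct addresses fold to the same value because xor discards which word a bit came from, so a hash fed only the folded value cannot tell them apart.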

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 15:44:31 -07:00
Tom Herbert
9f24908901 net: Add keys for TIPC address
Add a new flow key for TIPC addresses.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 15:44:31 -07:00
Tom Herbert
c3f8324188 net: Add full IPv6 addresses to flow_keys
This patch adds full IPv6 addresses into flow_keys and uses them as
input to the flow hash function. The implementation supports either
IPv4 or IPv6 addresses in a union, and a selector is used to determine
how many words to input to jhash2.

We also add flow_get_u32_dst and flow_get_u32_src functions which are
used to get a u32 representation of the source and destination
addresses. For IPv6, ipv6_addr_hash is called. These functions retain
getting the legacy values of src and dst in flow_keys.

With this patch, Ethertype and IP protocol are now included in the
flow hash input.
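
The idea of hashing a variable number of address words can be sketched as follows. This is a loose model under stated assumptions: zlib.crc32 stands in for jhash2, the word layout is invented for illustration, and none of the names correspond to kernel symbols.

```python
import ipaddress
import struct
import zlib


def address_words(src, dst):
    """Return src and dst as 32-bit words: 2 words for IPv4, 8 for IPv6."""
    a, b = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    if a.version != b.version:
        raise ValueError("source and destination must be the same family")
    nwords = 1 if a.version == 4 else 4   # plays the role of the selector
    words = []
    for addr in (a, b):
        value = int(addr)
        for i in reversed(range(nwords)):
            words.append((value >> (32 * i)) & 0xFFFFFFFF)
    return words


def flow_hash(ethertype, ip_proto, sport, dport, src, dst):
    """Hash Ethertype, IP protocol, ports and full addresses together."""
    words = [ethertype, ip_proto, (sport << 16) | dport]
    words += address_words(src, dst)
    data = struct.pack(f"!{len(words)}I", *words)
    return zlib.crc32(data)   # stand-in for jhash2, not the kernel hash


print(len(address_words("10.0.0.1", "10.0.0.2")))        # 2
print(len(address_words("2001:db8::1", "2001:db8::2")))  # 8
tcp = flow_hash(0x0800, 6, 1234, 80, "10.0.0.1", "10.0.0.2")
udp = flow_hash(0x0800, 17, 1234, 80, "10.0.0.1", "10.0.0.2")
print(tcp != udp)  # True: the IP protocol is part of the hash input
```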

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 15:44:30 -07:00
Tom Herbert
42aecaa9bb net: Get skb hash over flow_keys structure
This patch changes flow hashing to use jhash2 over the flow_keys
structure instead of just doing jhash_3words over src, dst, and ports.
This method will allow us to take more input into the hashing function
so that we can include full IPv6 addresses, VLAN, flow labels etc.
without needing to resort to xor'ing which makes for a poor hash.

Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 15:44:30 -07:00
Tom Herbert
730fc43713 mpls: Add definition for IPPROTO_MPLS
Add uapi define for MPLS over IP.

Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-04 15:44:30 -07:00
David S. Miller
9d1dabfbd0 Merge tag 'wireless-drivers-next-for-davem-2015-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
new driver mt7601u for MediaTek Wi-Fi devices MT7601U

ath10k:

* qca6174 power consumption improvements, enable ASPM etc (Michal)

wil6210:

* support Wi-Fi Simple Configuration in STA mode

iwlwifi:

* a few fixes (re-enablement of interrupts for certain new
  platforms that have special power states)
* Rework completely the RBD allocation model towards new
  multi RX hardware.
* cleanups
* scan reworks continuation (Luca)

mwifiex:

* improve firmware debug functionality

rtlwifi:

* update regulatory database

brcmfmac:

* cleanup and new feature support in PCIe code
* alternative nvram loading for router support
====================

Conflicts:
	drivers/net/wireless/iwlwifi/Kconfig

Trivial conflict in iwlwifi Kconfig, two commits adding
the same two chip numbers to the help text, but order
transposed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-03 23:44:57 -07:00
Alexei Starovoitov
3896d655f4 bpf: introduce bpf_clone_redirect() helper
Allow eBPF programs attached to classifier/actions to call
bpf_clone_redirect(skb, ifindex, flags) helper which will
mirror or redirect the packet by dynamic ifindex selection
from within the program to a target device either at ingress
or at egress. Can be used for various scenarios, for example,
to load balance skbs into veths, split parts of the traffic
to local taps, etc.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-03 20:16:58 -07:00
Dan Murphy
2a10154abc net: phy: dp83867: Add TI dp83867 phy
Add support for the TI dp83867 Gigabit ethernet phy
device.

The DP83867 is a robust, low power, fully featured
Physical Layer transceiver with integrated PMD
sublayers to support 10BASE-T, 100BASE-TX and
1000BASE-T Ethernet protocols.

Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-03 19:41:04 -07:00
Linus Torvalds
8a7deb362b Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
 "Sending this off now, as I'm not aware of other current bugs, nor do I
  expect further fixes before 4.1 final.  This contains two fixes:

   - a fix for a bdi unregister warning that gets spewed on md, due to a
     regression introduced earlier in this cycle.  From Neil Brown.

   - a fix for a compile warning for NVMe on 32-bit platforms, also a
     regression introduced in this cycle.  From Arnd Bergmann"

* 'for-linus' of git://git.kernel.dk/linux-block:
  NVMe: fix type warning on 32-bit
  block: discard bdi_unregister() in favour of bdi_destroy()
2015-06-03 16:35:00 -07:00
Johannes Berg
c526a46767 mac80211: rename single hw-scan flag to follow naming convention
The naming convention is to always have the flags prefixed with
IEEE80211_HW_ so they're 'namespaced', make this flag follow it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-02 20:32:00 +02:00
Johannes Berg
ea1b2b45f5 mac80211: remove short slot/short preamble incapable flags
There are no drivers setting IEEE80211_HW_2GHZ_SHORT_SLOT_INCAPABLE
or IEEE80211_HW_2GHZ_SHORT_PREAMBLE_INCAPABLE, so any code using the
two flags is dead; it's also exceedingly unlikely that any new driver
could ever need to set these flags.

The wcn36xx code is almost certainly broken, but this preserves the
previous behaviour.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-02 20:28:58 +02:00
Johannes Berg
3b79af973c mac80211: stop using pointers as userspace cookies
Even if the pointers are really only accessible to root and used
pretty much only by wpa_supplicant, this is still not great; even
for debugging it'd be easier to have something that's easier to
read and guaranteed to never get reused.

With the recent change to make mac80211 create an ack_skb for the
mgmt-tx path this becomes possible, only the client probe method
needs to also allocate an ack_skb, and we can store the cookie in
that skb.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-02 13:07:59 +02:00
Johannes Berg
db388a567f mac80211: move TX PN to public part of key struct
For drivers supporting TSO or similar features, but that still have
PN assignment in software, there's a need to have some memory to
store the current PN value. As mac80211 already stores this and it's
somewhat complicated to add a per-driver area to the key struct (due
to the dynamic sizing thereof) it makes sense to just move the TX PN
to the keyconf, i.e. the public part of the key struct.

As TKIP is more complicated and we won't be able to offload it in this
way right now (fast-xmit is skipped for TKIP unless the HW does it
all, and our hardware needs MMIC calculation in software) I've not
moved that for now - it's possible but requires exposing a lot of
the internal TKIP state.

As a bonus side effect, we can remove a lot of code by assuming the
keyseq struct has a certain layout - with BUILD_BUG_ON to verify it.

This might also improve performance, since now TX and RX no longer
share a cacheline.

Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-06-02 11:16:35 +02:00
David S. Miller
dda922c831 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/phy/amd-xgbe-phy.c
	drivers/net/wireless/iwlwifi/Kconfig
	include/net/mac80211.h

iwlwifi/Kconfig and mac80211.h were both trivial overlapping
changes.

The drivers/net/phy/amd-xgbe-phy.c file got removed in 'net-next' and
the bug fix that happened on the 'net' side is already integrated
into the rest of the amd-xgbe driver.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-01 22:51:30 -07:00