Commit graph

44249 commits

Author SHA1 Message Date
David S. Miller
ca69d7102f Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next tree.
They are:

* nf_tables set timeout infrastructure from Patrick Mchardy.

1) Add support for set timeout support.

2) Add support for set element timeouts using the new set extension
   infrastructure.

4) Add garbage collection helper functions to get rid of stale elements.
   Elements are accumulated in a batch that are asynchronously released
   via RCU when the batch is full.

5) Add garbage collection synchronization helpers. This introduces a new
   element busy bit to address concurrent access from the netlink API and the
   garbage collector.

5) Add timeout support for the nft_hash set implementation. The garbage
   collector peridically checks for stale elements from the workqueue.

* iptables/nftables cgroup fixes:

6) Ignore non full-socket objects from the input path, otherwise cgroup
   match may crash, from Daniel Borkmann.

7) Fix cgroup in nf_tables.

8) Save some cycles from xt_socket by skipping packet header parsing when
   skb->sk is already set because of early demux. Also from Daniel.

* br_netfilter updates from Florian Westphal.

9) Save frag_max_size and restore it from the forward path too.

10) Use a per-cpu area to restore the original source MAC address when traffic
    is DNAT'ed.

11) Add helper functions to access physical devices.

12) Use these new physdev helper function from xt_physdev.

13) Add another nf_bridge_info_get() helper function to fetch the br_netfilter
    state information.

14) Annotate original layer 2 protocol number in nf_bridge info, instead of
    using kludgy flags.

15) Also annotate the pkttype mangling when the packet travels back and forth
    from the IP to the bridge layer, instead of using a flag.

* More nf_tables set enhancement from Patrick:

16) Fix possible usage of set variant that doesn't support timeouts.

17) Avoid spurious "set is full" errors from Netlink API when there are pending
    stale elements scheduled to be released.

18) Restrict loop checks to set maps.

19) Add support for dynamic set updates from the packet path.

20) Add support to store optional user data (eg. comments) per set element.

BTW, I have also pulled net-next into nf-next to anticipate the conflict
resolution between your okfn() signature changes and Florian's br_netfilter
updates.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09 14:46:04 -04:00
David S. Miller
82740b9ac2 NFC: 4.1 pull request
This is the NFC pull request for 4.1.
 
 This is a shorter one than usual, as the Intel Field Peak NFC
 driver could not make it in time.
 
 We have:
 
 - A new driver for NXP NCI based chipsets, like e.g. the NPC100 or
   the PN7150. It currently only supports an i2c physical layer, but
   could easily be extended to work on top of e.g. SPI.
   This driver also includes support for user space triggered firmware
   updates.
 
 - A few minor st21nfc[ab] fixes, cleanups, and comments improvements.
 
 - A pn533 error return fix.
 
 - A few NFC related logs formatting cleanups.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVJV81AAoJEIqAPN1PVmxKM3sP/R6fKXMxtVxXsiPzBMk+SpBI
 onNXbCx27rp1lHTFznE16JAaaSfcQEwSQmYx7xa6KXqRYVlDfRfC+5R+rPGrCjfH
 kV/eLBNnYpmPw0ViVT7dsWK0b4me0k5pr9ki9mze3YxvuMbA5vdv0tvuRFz5IRF/
 hl+WI5pntGuRtnIyBKIauAMylylUYVvCBGhmHnveiX0Dp8noJLBSV0wvdzujm51S
 +Uio/jHlUEV2lotrQBOfWNtEkwonXSwzZWSzimBCyEGLAwTx4lGXHmQftOtPd3zE
 sOT7Gw77ZCsulHoHiJyC0KpDSDS0NYVrtTI5BiuGgAivGi0YZw2XD3CYRIdy0m2Z
 lMoodYdiCgsxmrU6I6Rd/7DQBxD0Vhc+CGAyk41f7AwU4Fwq105kpSLupTtU+NzT
 ZpdvSXeirU8sxt+3WDOgv8esyYGZVVD8GuBbofMZQZ0vLq+FcDpYAW4w3LKvvi6X
 C+WN8f7SCI0kqpd4leyl6EG3SoQKFyPWobu0IlV520R9b76iBcyqooTIpvVa4Yk7
 az6fKhi9gK/T6FHW78y6fnkczd47JKrC924m+g6P/hhTD5zQ956ferp0uFTjkPtF
 8kNgRT7OIRi1JO783cnQx1uA61axC3GX1HFzsD9paVkTwRNtJ3eFsxtcYt7Nfv8D
 WGNWRQp5LmBsD/SqFBfk
 =MzOJ
 -----END PGP SIGNATURE-----

Merge tag 'nfc-next-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next

Samuel Ortiz says:

====================
NFC: 4.1 pull request

This is the NFC pull request for 4.1.

This is a shorter one than usual, as the Intel Field Peak NFC
driver could not make it in time.

We have:

- A new driver for NXP NCI based chipsets, like e.g. the NPC100 or
  the PN7150. It currently only supports an i2c physical layer, but
  could easily be extended to work on top of e.g. SPI.
  This driver also includes support for user space triggered firmware
  updates.

- A few minor st21nfc[ab] fixes, cleanups, and comments improvements.

- A pn533 error return fix.

- A few NFC related logs formatting cleanups.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-08 17:12:50 -04:00
Pablo Neira Ayuso
aadd51aa71 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Resolve conflicts between 5888b93 ("Merge branch 'nf-hook-compress'") and
Florian Westphal br_netfilter works.

Conflicts:
        net/bridge/br_netfilter.c

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 18:30:21 +02:00
Florian Westphal
a1e67951e6 netfilter: bridge: make BRNF_PKT_TYPE flag a bool
nf_bridge_info->mask is used for several things, for example to
remember if skb->pkt_type was set to OTHER_HOST.

For a bridge, OTHER_HOST is expected case. For ip forward its a non-starter
though -- routing expects PACKET_HOST.

Bridge netfilter thus changes OTHER_HOST to PACKET_HOST before hook
invocation and then un-does it after hook traversal.

This information is irrelevant outside of br_netfilter.

After this change, ->mask now only contains flags that need to be
known outside of br_netfilter in fast-path.

Future patch changes mask into a 2bit state field in sk_buff, so that
we can remove skb->nf_bridge pointer for good and consider all remaining
places that access nf_bridge info content a not-so fastpath.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:49:12 +02:00
Florian Westphal
3eaf402502 netfilter: bridge: start splitting mask into public/private chunks
->mask is a bit info field that mixes various use cases.

In particular, we have flags that are mutually exlusive, and flags that
are only used within br_netfilter while others need to be exposed to
other parts of the kernel.

Remove BRNF_8021Q/PPPoE flags.  They're mutually exclusive and only
needed within br_netfilter context.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:49:11 +02:00
Florian Westphal
c737b7c451 netfilter: bridge: add helpers for fetching physin/outdev
right now we store this in the nf_bridge_info struct, accessible
via skb->nf_bridge.  This patch prepares removal of this pointer from skb:

Instead of using skb->nf_bridge->x, we use helpers to obtain the in/out
device (or ifindexes).

Followup patches to netfilter will then allow nf_bridge_info to be
obtained by a call into the br_netfilter core, rather than keeping a
pointer to it in sk_buff.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:49:08 +02:00
Florian Westphal
e70deecbf8 netfilter: bridge: don't use nf_bridge_info data to store mac header
br_netfilter maintains an extra state, nf_bridge_info, which is attached
to skb via skb->nf_bridge pointer.

Amongst other things we use skb->nf_bridge->data to store the original
mac header for every processed skb.

This is required for ip refragmentation when using conntrack
on top of bridge, because ip_fragment doesn't copy it from original skb.

However there is no need anymore to do this unconditionally.

Move this to the one place where its needed -- when br_netfilter calls
ip_fragment().

Also switch to percpu storage for this so we can handle fragmenting
without accessing nf_bridge meta data.

Only user left is neigh resolution when DNAT is detected, to hold
the original source mac address (neigh resolution builds new mac header
using bridge mac), so rename ->data and reduce its size to whats needed.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-04-08 16:49:07 +02:00
Daniel Lee
2646c831c0 tcp: RFC7413 option support for Fast Open client
Fast Open has been using an experimental option with a magic number
(RFC6994). This patch makes the client by default use the RFC7413
option (34) to get and send Fast Open cookies.  This patch makes
the client solicit cookies from a given server first with the
RFC7413 option. If that fails to elicit a cookie, then it tries
the RFC6994 experimental option. If that also fails, it uses the
RFC7413 option on all subsequent connect attempts.  If the server
returns a Fast Open cookie then the client caches the form of the
option that successfully elicited a cookie, and uses that form on
later connects when it presents that cookie.

The idea is to gradually obsolete the use of experimental options as
the servers and clients upgrade, while keeping the interoperability
meanwhile.

Signed-off-by: Daniel Lee <Longinus00@gmail.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 18:36:39 -04:00
Daniel Lee
7f9b838b71 tcp: RFC7413 option support for Fast Open server
Fast Open has been using the experimental option with a magic number
(RFC6994) to request and grant Fast Open cookies. This patch enables
the server to support the official IANA option 34 in RFC7413 in
addition.

The change has passed all existing Fast Open tests with both
old and new options at Google.

Signed-off-by: Daniel Lee <Longinus00@gmail.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 18:36:39 -04:00
Nicolas Dichtel
388069d302 netdevice.h: remove iflink description
Also move 'group' description to match the order of the net_device structure.

Fixes: 7a66bbc96c ("net: remove iflink field from struct net_device")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 17:30:45 -04:00
David Miller
7026b1ddb6 netfilter: Pass socket pointer down through okfn().
On the output paths in particular, we have to sometimes deal with two
socket contexts.  First, and usually skb->sk, is the local socket that
generated the frame.

And second, is potentially the socket used to control a tunneling
socket, such as one the encapsulates using UDP.

We do not want to disassociate skb->sk when encapsulating in order
to fix this, because that would break socket memory accounting.

The most extreme case where this can cause huge problems is an
AF_PACKET socket transmitting over a vxlan device.  We hit code
paths doing checks that assume they are dealing with an ipv4
socket, but are actually operating upon the AF_PACKET one.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 15:25:55 -04:00
David Miller
1c984f8a5d netfilter: Add socket pointer to nf_hook_state.
It is currently always set to NULL, but nf_queue is adjusted to be
prepared for it being set to a real socket by taking and releasing a
reference to that socket when necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 15:25:55 -04:00
David Miller
107a9f4dc9 netfilter: Add nf_hook_state initializer function.
This way we can consolidate where we setup new nf_hook_state objects,
to make sure the entire thing is initialized.

The only other place an nf_hook_object is instantiated is nf_queue,
wherein a structure copy is used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-07 15:25:55 -04:00
David S. Miller
c85d6975ef Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/mellanox/mlx4/cmd.c
	net/core/fib_rules.c
	net/ipv4/fib_frontend.c

The fib_rules.c and fib_frontend.c conflicts were locking adjustments
in 'net' overlapping addition and removal of code in 'net-next'.

The mlx4 conflict was a bug fix in 'net' happening in the same
place a constant was being replaced with a more suitable macro.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-06 22:34:15 -04:00
Linus Torvalds
442bb4bad9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) In TCP, don't register an FRTO for cumulatively ACK'd data that was
    previously SACK'd, from Neal Cardwell.

 2) Need to hold RNL mutex in ipv4 multicast code namespace cleanup,
    from Cong WANG.

 3) Similarly we have to hold RNL mutex for fib_rules_unregister(), also
    from Cong WANG.

 4) Revert and rework netns nsid allocation fix, from Nicolas Dichtel.

 5) When we encapsulate for a tunnel device, skb->sk still points to the
    user socket.  So this leads to cases where we retraverse the
    ipv4/ipv6 output path with skb->sk being of some other address
    family (f.e. AF_PACKET).  This can cause things to crash since the
    ipv4 output path is dereferencing an AF_PACKET socket as if it were
    an ipv4 one.

    The short term fix for 'net' and -stable is to elide these socket
    checks once we've entered an encapsulation sequence by testing
    xmit_recursion.

    Longer term we have a better solution wherein we pass the tunnel's
    socket down through the output paths, but that is way too invasive
    for 'net' and -stable.

    From Hannes Frederic Sowa.

 6) l2tp_init() failure path forgets to unregister per-net ops, from
    Cong WANG.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net/mlx4_core: Fix error message deprecation for ConnectX-2 cards
  net: dsa: fix filling routing table from OF description
  l2tp: unregister l2tp_net_ops on failure path
  mvneta: dont call mvneta_adjust_link() manually
  ipv6: protect skb->sk accesses from recursive dereference inside the stack
  netns: don't allocate an id for dead netns
  Revert "netns: don't clear nsid too early on removal"
  ip6mr: call del_timer_sync() in ip6mr_free_table()
  net: move fib_rules_unregister() under rtnl lock
  ipv4: take rtnl_lock and mark mrt table as freed on namespace cleanup
  tcp: fix FRTO undo on cumulative ACK of SACKed range
  xen-netfront: transmit fully GSO-sized packets
2015-04-06 15:19:59 -07:00
hannes@stressinduktion.org
f60e5990d9 ipv6: protect skb->sk accesses from recursive dereference inside the stack
We should not consult skb->sk for output decisions in xmit recursion
levels > 0 in the stack. Otherwise local socket settings could influence
the result of e.g. tunnel encapsulation process.

ipv6 does not conform with this in three places:

1) ip6_fragment: we do consult ipv6_npinfo for frag_size

2) sk_mc_loop in ipv6 uses skb->sk and checks if we should
   loop the packet back to the local socket

3) ip6_skb_dst_mtu could query the settings from the user socket and
   force a wrong MTU

Furthermore:
In sk_mc_loop we could potentially land in WARN_ON(1) if we use a
PF_PACKET socket ontop of an IPv6-backed vxlan device.

Reuse xmit_recursion as we are currently only interested in protecting
tunnel devices.

Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-06 16:12:49 -04:00
David S. Miller
b85c3dc9bd netfilter: Pass nf_hook_state through arpt_do_table().
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-04 13:26:52 -04:00
David S. Miller
8f8a37152d netfilter: Pass nf_hook_state through ip6t_do_table().
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-04 12:52:06 -04:00
David S. Miller
1c491ba259 netfilter: Pass nf_hook_state through ipt_do_table().
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-04 12:47:04 -04:00
David S. Miller
238e54c9cb netfilter: Make nf_hookfn use nf_hook_state.
Pass the nf_hook_state all the way down into the hook
functions themselves.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-04 12:31:38 -04:00
David S. Miller
cfdfab3146 netfilter: Create and use nf_hook_state.
Instead of passing a large number of arguments down into the nf_hook()
entry points, create a structure which carries this state down through
the hook processing layers.

This makes is so that if we want to change the types or signatures of
any of these pieces of state, there are less places that need to be
changed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-04 12:17:40 -04:00
Linus Torvalds
57a9d89dc0 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fix from Jens Axboe:
 "Just one patch in this pull request, fixing a regression caused by a
  'mathematically correct' change to lcm()"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: fix blk_stack_limits() regression due to lcm() change
2015-04-03 14:49:26 -07:00
Stas Sergeev
a3bebdce41 add fixed_phy_update_state() - update state of fixed_phy
Currently fixed_phy uses a callback to periodically poll the link state.
This patch adds the fixed_phy_update_state() API.
It solves the following problems:
- On link state interrupt, MAC driver can't update status.
Instead it needs to provide the callback to periodically query
the HW about the link state. It is more efficient to update status
after interrupt.
- The callback needs to be unregistered before phy_disconnect(),
or otherwise it will be called with net_dev==NULL. phy_disconnect()
does not have enough info to unregister the callback automatically.
- The callback needs to be registered before of_phy_connect() to
avoid running with outdated state, but of_phy_connect() returns the
phy_device pointer, which is needed to register the callback. Registering
it before of_phy_connect() will therefore require a hack to get the
pointer earlier.

Overall, this addition makes the subsequent patch that implements
SGMII link status for mvneta, much cleaner.

CC: Florian Fainelli <f.fainelli@gmail.com>
CC: netdev@vger.kernel.org
CC: linux-kernel@vger.kernel.org

Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-03 15:08:20 -04:00
Alexander Duyck
2e7056c433 jhash: Update jhash_[321]words functions to use correct initval
Looking over the implementation for jhash2 and comparing it to jhash_3words
I realized that the two hashes were in fact very different.  Doing a bit of
digging led me to "The new jhash implementation" in which lookup2 was
supposed to have been replaced with lookup3.

In reviewing the patch I noticed that jhash2 had originally initialized a
and b to JHASH_GOLDENRATIO and c to initval, but after the patch a, b, and
c were initialized to initval + (length << 2) + JHASH_INITVAL.  However the
changes in jhash_3words simply replaced the initialization of a and b with
JHASH_INITVAL.

This change corrects what I believe was an oversight so that a, b, and c in
jhash_3words all have the same value added consisting of initval + (length
<< 2) + JHASH_INITVAL so that jhash2 and jhash_3words will now produce the
same hash result given the same inputs.

Fixes: 60d509c823 ("The new jhash implementation")
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-03 12:52:29 -04:00
Rusty Russell
e79d8429aa netdevice: document NETDEV_TX_BUSY deprecation.
This paraphrases DaveM (and steals some of his words) explaining why
a device shouldn't return NETDEV_TX_BUSY, even though it looks so inviting
to driver authors.

See http://www.spinics.net/lists/netdev/msg322350.html

Inspired-by: David Miller <davem@davemloft.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-03 12:37:36 -04:00
Linus Torvalds
4e8a4830dc irqchip fixes for v4.0 (round 2)
- GICv3 ITS
     - Small batch of fixes discovered while writing the kvm ITS emulation
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJVHcHDAAoJEP45WPkGe8ZnrXUP/Rp67onmizG6AtVNTCB5HT0V
 6HNrLS4cnlpkFpWLf75ojZKfjJ/Gi3Bo4aqie9iWbOgGuI4/vxo35FdoBP8992k2
 Stm++FR1jWdxbsTZJxdIqkl2uw7enA/0wpJL2JttdHFb+7Inirh0rphEyNAw24y4
 Qe1hQT+c4CzCLA6ED5bqN7z28X9RoesW6dt8zDk/08IrolVsVSt+ZEdepBBBIZYf
 u3d+mOP0UohAew+g+hVLvwTplcQVQK+ZGdnzoyMf9s/3r2bAobGFZvCQxoJLm7fw
 tAV1Uf/iYFkqBzWYj8VC9crxY7QK2sr68OA+QMAkJoBYW0ypBowvggx+noAd4FO4
 jPKGbmEJrYxbk3woxwZKvihU1fXbsQq8OykEk/urZeZyCPIXz9TIDneiI+RGGh9a
 1o6Iu605bbKttwzlHjzU8C0y34qDLyJpA+zjIQJbXWOExyHynP4jIDEkvV7PGIQn
 Z88QKvkDn0TUdnPkaBUHOI1Y6SJm/ernR0FomIKWyd4innwfjyQCTRXaCryfRVWW
 lLRSakFyaRWjmNX50eDRp8x8yUPJhyg4Dd25Dd4KWZv5QhD7DJctywwjDCf+l9jC
 8U8BbvhbPO2pwXratGNcbjapotAHX9Y46BHr0TrI4nbcaUz5+JvPLMQE6UA/92SQ
 aT8pb4m7zfS5zhZPL06D
 =Q4Dt
 -----END PGP SIGNATURE-----

Merge tag 'irqchip-fixes-4.0-2' of git://git.infradead.org/users/jcooper/linux

Pull irqchip fixes from Jason Cooper:
 "This is the second round of fixes for irqchip.  It contains some fixes
  found while the arm64 guys were writing the kvm gicv3 its emulation.

  GICv3 ITS:
    - Small batch of fixes discovered while writing the kvm ITS emulation"

* tag 'irqchip-fixes-4.0-2' of git://git.infradead.org/users/jcooper/linux:
  irqchip: gicv3-its: Use non-cacheable accesses when no shareability
  irqchip: gicv3-its: Fix PROP/PEND and BASE/CBASE confusion
  irqchip: gicv3-its: Fix device ID encoding
  irqchip: gicv3-its: Fix encoding of collection's target redistributor
2015-04-02 16:37:56 -07:00
Saeed Mahameed
64613d9499 net/mlx5_core: Extend struct mlx5_interface to support multiple protocols
Preparation for ethernet driver.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:43 -04:00
Saeed Mahameed
ce0f750932 net/mlx5_core: Modify arm CQ in preparation for upcoming Ethernet driver
Pass consumer index as a parameter to arm CQ

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:43 -04:00
Saeed Mahameed
233d05d28a net/mlx5_core: Move completion eqs from mlx5_ib to mlx5_core
Preparation for ethernet driver.
These functions will be used in drivers other than mlx5_ib.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:42 -04:00
Saeed Mahameed
302bdf68fc net/mlx5_core: Fix Mellanox copyright note
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:42 -04:00
Eli Cohen
64599cca51 net/mlx5_core: Use coherent memory for command interface page
Use coherent memory for the commands descriptor page. Take measures to make
sure the page is aligned to MLX5_ADAPTER_PAGE_SIZE as required by the hardware.

Reported-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:33:41 -04:00
Muhammad Mahajna
78500b8c03 net/mlx4_en: Add RX-ALL support
Enabled when the device supports KEEP FCS and IGNORE FCS.

When the flag is set, pass all received frames up the stack,
even ones with invalid FCS, controlled by ethtool.

Signed-off-by: Muhammad Mahajna <muhammadm@mellanox.com>
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:25:04 -04:00
Ido Shamay
51af33cfed net/mlx4_en: Add interface identify support
Add support for the interface ethtool identify feature.

Make the physical port LED to blink with green and yellow colors.

The device handles the LED blink by itself (synchrous use of
set_phys_id), by returning 0 to ETHTOOL_ID_ACTIVE command.

Signed-off-by: Eyal Grossman <eyalgr@mellanox.com>
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:25:03 -04:00
Ido Shamay
a130b59057 net/mlx4: Add SET_PORT opcode modifiers enumeration
The calls to SET_PORT used hard-code numbers, when supplying command's
opcode modifiers, fix that to use well defined constants.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:25:03 -04:00
Ido Shamay
3742cc6551 net/mlx4: Warn users of depracated QoS Firmware
A new capability bit was introduced in the past to to differ devices
using the QoS ETS feature. The old was deprecated since then.
If driver sees device which set only the old capabilty, it will print
warning to user suggesting to upgrade the FW.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:25:03 -04:00
Ido Shamay
cda373f484 net/mlx4_en: Enable TX rate limit per VF
Support granular QoS per VF, by implementing the ndo_set_vf_rate.

Enforce a rate limit per VF when called, and enabled only for VFs in
VST mode with user priority supported by the device.

We don't enforce VFs to be in VST mode at the moment of configuration,
but rather save the given rate limit and enforce it when the VF is
moved to VST with user priority which is supported (currently 0).

VST<->VGT or VST qos value state changes are disallowed when a rate
limit is configured. Minimum BW share is not supported yet.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:25:03 -04:00
Ido Shamay
08068cd568 net/mlx4: Added qos_vport QP configuration in VST mode
Granular QoS per VF feature introduce a new QP field, qos_vport.

PF administrator can connect VF QPs to a certain QoS Vport, to
inherit its proporties. Connecting QPs to the default QoS Vport
(defined as 0) is always allowed, even when there are no allocated VPPs.
At this point, only the default vport is connected to QPs.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:25:03 -04:00
Ido Shamay
d019fcb224 net/mlx4: Query device for QoS per VF support
Checks in QUERY_DEV_CAP if the granular QoS per VF feature is
supported by the device. Disabled for guests.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:25:02 -04:00
Ido Shamay
1c29146d38 net/mlx4: Add mlx4_SET_VPORT_QOS implementation
Add the SET_VPORT_QOS device command, which is ntended for virtual
granular QoS configuration per VF in SRIOV mode. The SET_VPORT_QOS
command sets and queries QoS parameters of a VPort. Each priority
allowed for a VPort is assigned with a share of the BW, and a BW
limitation. QoS parameters can be modified at any time, but must be
initialized before any QP is associated with the VPort.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:25:02 -04:00
Ido Shamay
7e95bb99a8 net/mlx4: Add mlx4_ALLOCATE_VPP implementation
Implements device ALLOCATE_VPP command, to be used for granular QoS
configuration of VFs by the PF device. Defines and queries the amount
of VPPs assigned to each port, and the amount of VPPs assigned to each
priority of each port. Once the total VPPs are split between the priorities
of a port, they may be assigned with a share of the BW or a rate limit.

Split into two functions (get/set) whoch are supplied with
mlx4_alloc_vpp_context and physical port number.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:25:02 -04:00
Ido Shamay
12a889c057 net/mlx4: New file for QoS related firmware commands
Create two new files fw_qos.h and fw_qos.c in mlx4_core module.

It gathers all relevant QoS firmware related commands etc, thus improving
encapsulation of the mlx4_core module. For now it contains the QoS existing
commands: mlx4_SET_PORT_SCHEDULER and mlx4_SET_PORT_PRIO2TC.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:25:02 -04:00
Ido Shamay
fccea6436a net/mlx4: Make mlx4_is_eth visible inline funcion
Currently implemented as static function in resource_tracker.c --
this change will allow other files in mlx4_core to use it as well.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:24:51 -04:00
Ido Shamay
802f42a8d9 net/mlx4: Add RSS support for fragmented IP datagrams
Enable RSS support for fragmented IP packets, when device supports it.
Until now, fragmented IP packets were directed only to the default_qpn.
Since IP fragments (datagram) have no upper protocols (L3 IP packets),
hash is performed on 3-tuple - dst MAC, source IP and dest IP. The HW
makes sure that this holds for the 1st fragment too, so all fragments
go to the same QP.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:24:50 -04:00
David S. Miller
9f0d34bc34 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/asix_common.c
	drivers/net/usb/sr9800.c
	drivers/net/usb/usbnet.c
	include/linux/usb/usbnet.h
	net/ipv4/tcp_ipv4.c
	net/ipv6/tcp_ipv6.c

The TCP conflicts were overlapping changes.  In 'net' we added a
READ_ONCE() to the socket cached RX route read, whilst in 'net-next'
Eric Dumazet touched the surrounding code dealing with how mini
sockets are handled.

With USB, it's a case of the same bug fix first going into net-next
and then I cherry picked it back into net.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 16:16:53 -04:00
Linus Torvalds
8172ba51e2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix use-after-free with mac80211 RX A-MPDU reorder timer, from
    Johannes Berg.

 2) iwlwifi leaks memory every module load/unload cycles, fix from Larry
    Finger.

 3) Need to use for_each_netdev_safe() in rtnl_group_changelink()
    otherwise we can crash, from WANG Cong.

 4) mlx4 driver does register_netdev() too early in the probe sequence,
    from Ido Shamay.

 5) Don't allow router discovery hop limit to decrease the interface's
    hop limit, from D.S. Ljungmark.

 6) tx_packets and tx_bytes improperly accounted for certain classes of
    USB network devices, fix from Ben Hutchings.

 7) ip{6}mr_rules_init() mistakenly use plain kfree to release the ipmr
    tables in the error path, they must instead use ip{6}mr_free_table().
    Fix from WANG Cong.

 8) cxgb4 doesn't properly quiesce all RX activity before unregistering
    the netdevice.  Fix from Hariprasad Shenai.

 9) Fix hash corruptions in ipvlan driver, from Jiri Benc.

10) nla_memcpy(), like a real memcpy, should fully initialize the
    destination buffer, even if the source attribute is smaller.  Fix
    from Jiri Benc.

11) Fix wrong error code returned from iucv_sock_sendmsg().  We should
    use whatever sock_alloc_send_skb() put into 'err'.  From Eugene
    Crosser.

12) Fix slab object leak on module unload in TIPC, from Ying Xue.

13) Need a READ_ONCE() when reading the cached RX socket route in
    tcp_v{4,6}_early_demux().  From Michal Kubecek.

14) Still too many problems with TPC support in the ath9k driver, so
    disable it for now.  From Felix Fietkau.

15) When in AP mode the rtlwifi driver can leak DMA mappings, fix from
    Larry Finger.

16) Missing kzalloc() failure check in gs_usb CAN driver, from Colin Ian
    King.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
  cxgb4: Fix to dump devlog, even if FW is crashed
  cxgb4: Firmware macro changes for fw verison 1.13.32.0
  bnx2x: Fix kdump when iommu=on
  bnx2x: Fix kdump on 4-port device
  mac80211: fix RX A-MPDU session reorder timer deletion
  MAINTAINERS: Update Intel Wired Ethernet Driver info
  tipc: fix a slab object leak
  net/usb/r8152: add device id for Lenovo TP USB 3.0 Ethernet
  af_iucv: fix AF_IUCV sendmsg() errno
  openvswitch: Return vport module ref before destruction
  netlink: pad nla_memcpy dest buffer with zeroes
  bonding: Bonding Overriding Configuration logic restored.
  ipvlan: fix check for IP addresses in control path
  ipvlan: do not use rcu operations for address list
  ipvlan: protect against concurrent link removal
  ipvlan: fix addr hash list corruption
  net: fec: setup right value for mdio hold time
  net: tcp6: fix double call of tcp_v6_fill_cb()
  cxgb4vf: Fix sparse warnings
  netns: don't clear nsid too early on removal
  ...
2015-04-02 11:09:41 -07:00
Nicolas Dichtel
7a66bbc96c net: remove iflink field from struct net_device
Now that all users of iflink have the ndo_get_iflink handler available, it's
possible to remove this field.

By default, dev_get_iflink() returns the ifindex of the interface.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 14:05:01 -04:00
Nicolas Dichtel
a54acb3a6f dev: introduce dev_get_iflink()
The goal of this patch is to prepare the removal of the iflink field. It
introduces a new ndo function, which will be implemented by virtual interfaces.

There is no functional change into this patch. All readers of iflink field
now call dev_get_iflink().

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02 14:04:59 -04:00
David S. Miller
45eb516887 Major changes:
ath9k:
 
 * add Active Interference Cancellation, a method implemented in the HW
   to counter WLAN RX > sensitivity degradation when BT is transmitting
   at the same time. This feature is supported by cards like WB222
   based on AR9462.
 
 iwlwifi:
 
 * Location Aware Regulatory was added by Arik
 * 8000 device family work
 * update to the BT Coex firmware API
 
 brmcfmac:
 
 * add new BCM43455 and BCM43457 SDIO device support
 * add new BCM43430 SDIO device support
 
 wil6210:
 
 * take care of AP bridging
 * fix NAPI behavior
 * found approach to achieve 4*n+2 alignment of Rx frames
 
 rt2x00:
 
 * add new rt2800usb device DWA 130
 
 rtlwifi:
 
 * add USB ID for D-Link DWA-131
 * add USB ID ASUS N10 WiFi dongle
 
 mwifiex:
 
 * throughput enhancements
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJVHATOAAoJEG4XJFUm622b4EsIAIMGOLb2GuqtxpN/Ei1TmUFV
 8B3KTLmeau/glqxquySQ2mIthDqijesbl0jyKWJP/+ZsXhjhpagXAIMw6E5KH3HN
 1XKCvyqbkGScTcheaS4HYFmKKoM0OlPnDKhybYtPiScW2kyQf8S9msbeEzLdYYen
 qYKMRq/J/QuobWvtapyGOBtyuVtuTrKuP8LzKQX9JAyKtM2si/iOvG7EiFmcAcis
 DGJBrwhy/i6oBoi6e6KGgxIpXJE9WHsMs2QBYpFkxV2SbzXTOyq1EMFJvMHCq927
 QfEsgOgAyVLfsXqN1uNK49eJtTuWRuJehEV88xjUbMXcYQNiaKdnYY6KvIBQ7XU=
 =POsk
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-next-for-davem-2015-04-01' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
Major changes:

ath9k:

* add Active Interference Cancellation, a method implemented in the HW
  to counter WLAN RX > sensitivity degradation when BT is transmitting
  at the same time. This feature is supported by cards like WB222
  based on AR9462.

iwlwifi:

* Location Aware Regulatory was added by Arik
* 8000 device family work
* update to the BT Coex firmware API

brmcfmac:

* add new BCM43455 and BCM43457 SDIO device support
* add new BCM43430 SDIO device support

wil6210:

* take care of AP bridging
* fix NAPI behavior
* found approach to achieve 4*n+2 alignment of Rx frames

rt2x00:

* add new rt2800usb device DWA 130

rtlwifi:

* add USB ID for D-Link DWA-131
* add USB ID ASUS N10 WiFi dongle

mwifiex:

* throughput enhancements
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-01 14:27:28 -04:00
Linus Torvalds
b6c3a5946c This fixes a problem in the lazy time patches, which can cause
frequently updated inods to never have their timestamps updated.
 These changes guarantee that no timestamp on disk will be stale by
 more than 24 hours.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJVGx6pAAoJEPL5WVaVDYGjh5cIAKQAyGST92IbTkxRZsxMgqnH
 7LQI+fbNn6oHGEjSSnsWLxl6CpwT4WrCmj8WhVmpAoTLU958nBbF7iZAaaeQCGeS
 3EqaNOlKvuOK9M5PKK7a5AWO04uJuj+t6s536OqHyB1zRb1yYMsywllPzu63eigA
 jxu2yZxkFIKjo2ohSaTDRONVCsQGlqgZ2Aq/Ho5vy5QffVJKTN1G/3Kf33xukUyr
 SAnndaax23jMqcFJE3gePYXc3W8EuGoloehKyo04qFeNNVMmSoytXAwMzcTmHn+H
 biOTN5ezSKbYzv1aevRg7UuSPv17/yIo3aEberfLBgsn5O4wJGDdS+LajaI5/x8=
 =0k0d
 -----END PGP SIGNATURE-----

Merge tag 'lazytime_fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull lazytime fixes from Ted Ts'o:
 "This fixes a problem in the lazy time patches, which can cause
  frequently updated inods to never have their timestamps updated.

  These changes guarantee that no timestamp on disk will be stale by
  more than 24 hours"

* tag 'lazytime_fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  fs: add dirtytime_expire_seconds sysctl
  fs: make sure the timestamps for lazytime inodes eventually get written
2015-04-01 10:05:42 -07:00
Linus Torvalds
1e848913f0 Merge branch 'for-4.0' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
 "Two main issues:

   - We found that turning on pNFS by default (when it's configured at
     build time) was too aggressive, so we want to switch the default
     before the 4.0 release.

   - Recent client changes to increase open parallelism uncovered a
     serious bug lurking in the server's open code.

  Also fix a krb5/selinux regression.

  The rest is mainly smaller pNFS fixes"

* 'for-4.0' of git://linux-nfs.org/~bfields/linux:
  sunrpc: make debugfs file creation failure non-fatal
  nfsd: require an explicit option to enable pNFS
  NFSD: Fix bad update of layout in nfsd4_return_file_layout
  NFSD: Take care the return value from nfsd4_encode_stateid
  NFSD: Printk blocklayout length and offset as format 0x%llx
  nfsd: return correct lockowner when there is a race on hash insert
  nfsd: return correct openowner when there is a race to put one in the hash
  NFSD: Put exports after nfsd4_layout_verify fail
  NFSD: Error out when register_shrinker() fail
  NFSD: Take care the return value from nfsd4_decode_stateid
  NFSD: Check layout type when returning client layouts
  NFSD: restore trace event lost in mismerge
2015-04-01 09:45:47 -07:00