Only one of the two callers of ceph_osdc_alloc_request() provides
page or bio data for its payload. And essentially all that function
was doing with those arguments was assigning them to fields in the
osd request structure.
Simplify ceph_osdc_alloc_request() by having the caller take care of
making those assignments
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
The only thing ceph_osdc_alloc_request() really does with the
flags value it is passed is assign it to the newly-created
osd request structure. Do that in the caller instead.
Both callers subsequently call ceph_osdc_build_request(), so have
that function (instead of ceph_osdc_alloc_request()) issue a warning
if a request comes through with neither the read nor write flags set.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
The osdc parameter to ceph_calc_raw_layout() is not used, so get rid
of it. Consequently, the corresponding parameter in calc_layout()
becomes unused, so get rid of that as well.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
A snapshot id must be provided to ceph_calc_raw_layout() even though
it is not needed at all for calculating the layout.
Where the snapshot id *is* needed is when building the request
message for an osd operation.
Drop the snapid parameter from ceph_calc_raw_layout() and pass
that value instead in ceph_osdc_build_request().
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
ceph_calc_file_object_mapping() takes (among other things) a "file"
offset and length, and based on the layout, determines the object
number ("bno") backing the affected portion of the file's data and
the offset into that object where the desired range begins. It also
computes the size that should be used for the request--either the
amount requested or something less if that would exceed the end of
the object.
This patch changes the input length parameter in this function so it
is used only for input. That is, the argument will be passed by
value rather than by address, so the value provided won't get
updated by the function.
The value would only get updated if the length would surpass the
current object, and in that case the value it got updated to would
be exactly that returned in *oxlen.
Only one of the two callers is affected by this change. Update
ceph_calc_raw_layout() so it records any updated value.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
The len argument to ceph_osdc_build_request() is set up to be
passed by address, but that function never updates its value
so there's no need to do this. Tighten up the interface by
passing the length directly.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
An osd request structure contains an optional trail portion, which
if present will contain data to be passed in the payload portion of
the message containing the request. The trail field is a
ceph_pagelist pointer, and if null it indicates there is no trail.
A ceph_pagelist structure contains a length field, and it can
legitimately hold value 0. Make use of this to change the
interpretation of the "trail" of an osd request so that every osd
request has trailing data, it just might have length 0.
This means we change the r_trail field in a ceph_osd_request
structure from a pointer to a structure that is always initialized.
Note that in ceph_osdc_start_request(), the trail pointer (or now
address of that structure) is assigned to a ceph message's trail
field. Here's why that's still OK (looking at net/ceph/messenger.c):
- What would have resulted in a null pointer previously will now
refer to a 0-length page list. That message trail pointer
is used in two functions, write_partial_msg_pages() and
out_msg_pos_next().
- In write_partial_msg_pages(), a null page list pointer is
handled the same as a message with 0-length trail, and both
result in a "in_trail" variable set to false. The trail
pointer is only used if in_trail is true.
- The only other place the message trail pointer is used is
out_msg_pos_next(). That function is only called by
write_partial_msg_pages() and only touches the trail pointer
if the in_trail value it is passed is true.
Therefore a null ceph_msg->trail pointer is equivalent to a non-null
pointer referring to a 0-length page list structure.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
The last two parameters to ceph_osd_build_request() describe the
object id, but the values passed always come from the osd request
structure whose address is also provided. Get rid of those last
two parameters.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
It's kind of a silly macro, but ceph_encode_8_safe() is the only one
missing from an otherwise pretty complete set. It's not used, but
neither are a couple of the others in this set.
While in there, insert some whitespace to tidy up the alignment of
the line-terminating backslashes in some of the macro definitions.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Add libceph support for a new CRUSH tunable recently added to Ceph servers.
Consider the CRUSH rule
step chooseleaf firstn 0 type <node_type>
This rule means that <n> replicas will be chosen in a manner such that
each chosen leaf's branch will contain a unique instance of <node_type>.
When an object is re-replicated after a leaf failure, if the CRUSH map uses
a chooseleaf rule the remapped replica ends up under the <node_type> bucket
that held the failed leaf. This causes uneven data distribution across the
storage cluster, to the point that when all the leaves but one fail under a
particular <node_type> bucket, that remaining leaf holds all the data from
its failed peers.
This behavior also limits the number of peers that can participate in the
re-replication of the data held by the failed leaf, which increases the
time required to re-replicate after a failure.
For a chooseleaf CRUSH rule, the tree descent has two steps: call them the
inner and outer descents.
If the tree descent down to <node_type> is the outer descent, and the descent
from <node_type> down to a leaf is the inner descent, the issue is that a
down leaf is detected on the inner descent, so only the inner descent is
retried.
In order to disperse re-replicated data as widely as possible across a
storage cluster after a failure, we want to retry the outer descent. So,
fix up crush_choose() to allow the inner descent to return immediately on
choosing a failed leaf. Wire this up as a new CRUSH tunable.
Note that after this change, for a chooseleaf rule, if the primary OSD
in a placement group has failed, choosing a replacement may result in
one of the other OSDs in the PG colliding with the new primary. This
requires that OSD's data for that PG to need moving as well. This
seems unavoidable but should be relatively rare.
This corresponds to ceph.git commit 88f218181a9e6d2292e2697fc93797d0f6d6e5dc.
Signed-off-by: Jim Schutt <jaschut@sandia.gov>
Reviewed-by: Sage Weil <sage@inktank.com>
The mds now sends back a created inode if the create request
performed the create. If the file already existed, no inode is
returned in the reply. This allows ceph to set the created flag
in atomic_open so that permissions are properly checked in the case
that the file wasn't created by the create call to the mds.
To ensure compability with previous kernels, a feature for sending
back the inode in the create reply was added, so that the mds will
only send back the inode if the client indicates it supports the
feature.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
This would reset a connection with any OSD that had an outstanding
request that was taking more than N seconds. The idea was that if the
OSD was buggy, the client could compensate by resending the request.
In reality, this only served to hide server bugs, and we haven't
actually seen such a bug in quite a while. Moreover, the userspace
client code never did this.
More importantly, often the request is taking a long time because the
OSD is trying to recover, or overloaded, and killing the connection
and retrying would only make the situation worse by giving the OSD
more work to do.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Define and export function ceph_pg_pool_name_by_id() to supply
the name of a pg pool whose id is given. This will be used by
the next patch.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Add support for getting the the information identifying the parent
image for rbd images that have them. The child image holds a
reference to its parent image specification structure. Create a new
entry "parent" in /sys/bus/rbd/image/N/ to report the identifying
information for the parent image, if any.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
If we encounter an invalid (e.g., zeroed) mapping, return an error
and avoid a divide by zero.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Two small patches:
* One patch to fix the function declarations for
!CONFIG_IOMMU_API. This is causing build errors
in linux-next and should be fixed for v3.6.
* Another patch to fix an IOMMU group related NULL pointer
dereference.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQZfKiAAoJECvwRC2XARrjc7wP/jprdrOfr8xD8jKxmodVpjyE
87yQmncv3YwbrcnHiPBywsFgdfdzgKp/ds7Rm9lR8Qn8bhLqvPI2fNpghAcA+ext
512I89JaUyrqRKcFKWAuFsrhzs58zlN0VfiZgksDpRTjBEqIUk12sHq5w8TRIiu2
SrHVxxUmcYIWyTjPBdCHbjjKlUXdd1XEtyF2l4Dk22JoUmXIwytHzrnit0bW2lW7
TtNBGVApwJomVYpAEkCzHYYgyX1HMSxwopm3DI2UpQGOpO2M11MXhFb1emEdwCeY
QMqRQxO6mqVDzQRC1AVBVT+CltI8ZUWy0mOvAYnp2DEgFFGhfLvtesLpws0FN3ie
IFm2PRahhlkPCBjH+K/AtC5MEnJBSATEg3VgRELWkUTz8pfmuKs4hwgkzb/SZqyX
cJAwuhSgsh8idwxrN78HlBoURMEvvi9MrYU8AguDKBpPPP1OlwE4/cEAFfsQINEL
7fqjUrF+mTmsnLpdTgth65fJZUmTd4jpsfCo5HKNW7FaidxyEFMW6Ws0hSz8B3h+
OcBLwj21CiFz5SLr1MQKt2koretb7GueFOTBTEWMOyQAputdHFjXHgcA3OPEy60O
Ou0cAU50V50DejxROwtJ0j9zPGnmqcAcnqXdZ4cvdCVFEjmEwNE22Ih914bAoT0x
6UxWUqSsIufKL1m6UNI1
=QISV
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v3.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"Two small patches:
* One patch to fix the function declarations for
!CONFIG_IOMMU_API. This is causing build errors
in linux-next and should be fixed for v3.6.
* Another patch to fix an IOMMU group related NULL pointer
dereference."
* tag 'iommu-fixes-v3.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Fix wrong assumption in iommu-group specific code
iommu: static inline iommu group stub functions
Pull NVMe driver fixes from Matthew Wilcox:
"Now that actual hardware has been released (don't have any yet
myself), people are starting to want some of these fixes merged."
Willy doesn't have hardware? Guys...
* git://git.infradead.org/users/willy/linux-nvme:
NVMe: Cancel outstanding IOs on queue deletion
NVMe: Free admin queue memory on initialisation failure
NVMe: Use ida for nvme device instance
NVMe: Fix whitespace damage in nvme_init
NVMe: handle allocation failure in nvme_map_user_pages()
NVMe: Fix uninitialized iod compiler warning
NVMe: Do not set IO queue depth beyond device max
NVMe: Set block queue max sectors
NVMe: use namespace id for nvme_get_features
NVMe: replace nvme_ns with nvme_dev for user admin
NVMe: Fix nvme module init when nvme_major is set
NVMe: Set request queue logical block size
Pull more networking fixes from David Miller:
1) Eric Dumazet discovered and fixed what turned out to be a family of
bugs. These functions were using pskb_may_pull() which might need
to reallocate the linear SKB data buffer, but the callers were not
expecting this possibility. The callers have cached pointers to the
packet header areas, and would need to reload them if we were to
continue using pskb_may_pull().
So they could end up reading garbage.
It's easier to just change these RAW4/RAW6/MIP6 routines to use
skb_header_pointer() instead of pskb_may_pull(), which won't modify
the linear SKB data area.
2) Dave Jone's syscall spammer caught a case where a non-TCP socket can
call down into the TCP keepalive code. The case basically involves
creating a raw socket with sk_protocol == IPPROTO_TCP, then calling
setsockopt(sock_fd, SO_KEEPALIVE, ...)
Fixed by Eric Dumazet.
3) Bluetooth devices do not get configured properly while being powered
on, resulting in always using legacy pairing instead of SSP. Fix
from Andrzej Kaczmarek.
4) Bluetooth cancels delayed work erroneously, put stricter checks in
place. From Andrei Emeltchenko.
5) Fix deadlock between cfg80211_mutex and reg_regdb_search_mutex in
cfg80211, from Luis R. Rodriguez.
6) Fix interrupt double release in iwlwifi, from Emmanuel Grumbach.
7) Missing module license in bcm87xx driver, from Peter Huewe.
8) Team driver can lose port changed events when adding devices to a
team, fix from Jiri Pirko.
9) Fix endless loop when trying ot unregister PPPOE device in zombie
state, from Xiaodong Xu.
10) batman-adv layer needs to set MAC address of software device
earlier, otherwise we call tt_local_add with it uninitialized.
11) Fix handling of KSZ8021 PHYs, it's matched currently by KS8051 but
that doesn't program the device properly. From Marek Vasut.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
ipv6: mip6: fix mip6_mh_filter()
ipv6: raw: fix icmpv6_filter()
net: guard tcp_set_keepalive() to tcp sockets
phy/micrel: Add missing header to micrel_phy.h
phy/micrel: Rename KS80xx to KSZ80xx
phy/micrel: Implement support for KSZ8021
batman-adv: Fix symmetry check / route flapping in multi interface setups
batman-adv: Fix change mac address of soft iface.
pppoe: drop PPPOX_ZOMBIEs in pppoe_release
team: send port changed when added
ipv4: raw: fix icmp_filter()
net/phy/bcm87xx: Add MODULE_LICENSE("GPL") to GPL driver
iwlwifi: don't double free the interrupt in failure path
cfg80211: fix possible circular lock on reg_regdb_search()
Bluetooth: Fix not removing power_off delayed work
Bluetooth: Fix freeing uninitialized delayed works
Bluetooth: mgmt: Fix enabling LE while powered off
Bluetooth: mgmt: Fix enabling SSP while powered off
Commit 1ad75b9e16 ("c/r: prctl: add minimal address test to
PR_SET_MM") added some address checking to prctl_set_mm() used by
checkpoint-restore. This causes a build error for no-MMU systems:
kernel/sys.c: In function 'prctl_set_mm':
kernel/sys.c:1868:34: error: 'mmap_min_addr' undeclared (first use in this function)
The test for mmap_min_addr doesn't make a lot of sense for no-MMU code
as noted in commit 6e14154676 ("NOMMU: Optimise away the
{dac_,}mmap_min_addr tests").
This patch defines mmap_min_addr as 0UL in the no-MMU case so that the
compiler will optimize away tests for "addr < mmap_min_addr".
Signed-off-by: Mark Salter <msalter@redhat.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: <stable@vger.kernel.org> [3.6.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The license header was missing in micrel_phy.h . This patch adds
one.
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David J. Choi <david.choi@micrel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no such part as KS8001, KS8041 or KS8051. There are only
KSZ8001, KSZ8041 and KSZ8051. Rename these parts as such to match
the Micrel naming.
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David J. Choi <david.choi@micrel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Cc: Linux ARM kernel <linux-arm-kernel@lists.infradead.org>
Cc: Fabio Estevam <fabio.estevam@freescale.com>
Cc: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The KSZ8021 PHY was previously caught by KS8051, which is not correct.
This PHY needs additional setup if it is strapped for address 0. In such
case an reserved bit must be written in the 0x16, "Operation Mode Strap
Override" register. According to the KS8051 datasheet, that bit means
"PHY Address 0 in non-broadcast" and it indeed behaves as such on KSZ8021.
The issue where the ethernet controller (Freescale FEC) did not communicate
with network is fixed by writing this bit as 1.
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: David J. Choi <david.choi@micrel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking updates from David Miller:
"More bug fixes, nothing gets past these guys"
1) More kernel info leaks found by Mathias Krause, this time in the
IPSEC configuration layers.
2) When IPSEC policies change, we do not properly make sure that cached
routes (which could now be stale) throughout the system will be
revalidated. Fix this by generalizing the generation count
invalidation scheme used by ipv4. From Nicolas Dichtel.
3) When repairing TCP sockets, we need to allow to restore not just the
send window scale, but the receive one too. Extend the existing
interface to achieve this in a backwards compatible way. From
Andrey Vagin.
4) A fix for FCOE scatter gather feature validation erroneously caused
scatter gather to be disabled for things like AOE too. From Ed L
Cashin.
5) Several cases of mishandling of error pointers, from Mathias Krause,
Wei Yongjun, and Devendra Naga.
6) Fix gianfar build, from Richard Cochran.
7) CAP_NET_* failures should return -EPERM not -EACCES, from Zhao
Hongjiang.
8) Hardware reset fix in janz-ican3 CAN driver, from Ira W Snyder.
9) Fix oops during rmmod in ti_hecc CAN driver, from Marc Kleine-Budde.
10) The removal of the conditional compilation of the clk support code
in the stmmac driver broke things. This is because the interfaces
used are the ones that don't also perform the enable/disable of the
clk. Fix from Stefan Roese.
11) The QFQ packet scheduler can record out of range virtual start
times, resulting later in misbehavior and even crashes. Fix from
Paolo Valente.
12) If MSG_WAITALL is used with IOAT DMA under TCP, we can wedge the
receiver when the advertised receive window goes to zero. Detect
this case and force the processing of the IOAT DMA queue when it
happens to avoid getting stuck. Fix from Michal Kubecek.
13) batman-adv assumes that test_bit() returns only 0 or 1, but this is
not true for x86 (which returns -1 or 0, via the 'sbb' instruction).
Fix from Linus Lussing.
14) Fix small packet corruption in e1000, from Tushar Dave.
15) make_blackhole() in the IPSEC policy code can do one read unlock too
many, fix from Li RongQing.
16) The new tcp_try_coalesce() code introduced a bug in TCP URG
handling, fix from Eric Dumazet.
17) Fix memory leak in __netif_receive_skb() when doing zerocopy and
when hit an OOM condition. From Michael S Tsirkin.
18) netxen blindly deferences pdev->bus->self, which is not guarenteed
to be non-NULL. Fix from Nikolay Aleksandrov.
19) Fix a performance regression caused by mistakes in ipv6 checksum
validation in the bnx2x driver, fix from Michal Schmidt.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits)
net/stmmac: Use clk_prepare_enable and clk_disable_unprepare
net: change return values from -EACCES to -EPERM
net/irda: sh_sir: fix return value check in sh_sir_set_baudrate()
stmmac: fix return value check in stmmac_open_ext_timer()
gianfar: fix phc index build failure
ipv6: fix return value check in fib6_add()
bnx2x: remove false warning regarding interrupt number
can: ti_hecc: fix oops during rmmod
can: janz-ican3: fix support for older hardware revisions
net: do not disable sg for packets requiring no checksum
aoe: assert AoE packets marked as requiring no checksum
at91ether: return PTR_ERR if call to clk_get fails
xfrm_user: don't copy esn replay window twice for new states
xfrm_user: ensure user supplied esn replay window is valid
xfrm_user: fix info leak in copy_to_user_tmpl()
xfrm_user: fix info leak in copy_to_user_policy()
xfrm_user: fix info leak in copy_to_user_state()
xfrm_user: fix info leak in copy_to_user_auth()
net: qmi_wwan: adding Huawei E367, ZTE MF683 and Pantech P4200
tcp: restore rcv_wscale in a repair mode (v2)
...
Bump maximum wait time for applesmc driver (again)
Fix build warning seen with W=1 in include/linux/kernel.h, introduced
with b6d86d3 (Fix DIV_ROUND_CLOSEST to support negative dividends)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQWhFlAAoJEMsfJm/On5mBNisP/iex3oyGvUjyW8ywdrEDZ03C
TPMn3CIajCIA9T9HJh3CBc0bUX/NP7+M2dzNsXl0Nh6voJy6+6u0AgF9SpZpH9ke
VDm5DVW8M66q/g0DRd++UO/KBfTWoQ+lncclhXErdnqIUSII40XE6N0o5VgpT5EJ
V13QlaS8EiEw/TD7tnOgOdLczM6TWYrsKVu2JjQDrRdJuMz0xvTXr4MFpdZuc0G1
oxYlvGI5rdkIfkdhXuyD4yxs34Pl//W6K0nj6M9F3cwZcmh3gdPLaQxeck5sHtL+
63QLdSc1BDmyRS2P0slFNZRmRvresxOSKL5CqXs+AyaQ5R9fiMKY0JOQJb9TME9R
5nND0ZyTbm57IKUxVAdDvdDD7C037vS8UZLyCXLDgNY1WNsMm8puk+cCvFtTxO3w
0wlmdPDLXihtgMkmGHssoRPSlcDrk9P6ovAyatbrEkbwUUzRDdAGN2cHkXuwuVkc
OrD7Bk8aTlJeR8nvL9dORcJtSZ+0xSOsv7/8j+sKpWu0D+i/TIoDPELfe0VvljwA
J46kS4oQR1tZzEZnEE54jWv/22I6WHll6vUzgGoRDp7zfuj/JAWlO9Ik8DUU1uBO
q/8Qf7RyN5p1PbKMO8l+23r4UC3MNczMzVlhLBBHGGUMY0F6u8Nq20Z0TE6fn10q
QITsxQ90n2dAicKhNFMD
=0dS3
-----END PGP SIGNATURE-----
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- Add missing 'name' sysfs attributes to ad7314 and ads7871 drivers
- Bump maximum wait time for applesmc driver (again)
- Fix build warning seen with W=1 in include/linux/kernel.h, introduced
with commit b6d86d3d6d ("Fix DIV_ROUND_CLOSEST to support negative
dividends")
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
linux/kernel.h: Fix warning seen with W=1 due to change in DIV_ROUND_CLOSEST
hwmon: (applesmc) Bump max wait
hwmon: (ad7314) Add 'name' sysfs attribute
hwmon: (ads7871) Add 'name' sysfs attribute
The current code fails to ensure that the netlink message actually
contains as many bytes as the header indicates. If a user creates a new
state or updates an existing one but does not supply the bytes for the
whole ESN replay window, the kernel copies random heap bytes into the
replay bitmap, the ones happen to follow the XFRMA_REPLAY_ESN_VAL
netlink attribute. This leads to following issues:
1. The replay window has random bits set confusing the replay handling
code later on.
2. A malicious user could use this flaw to leak up to ~3.5kB of heap
memory when she has access to the XFRM netlink interface (requires
CAP_NET_ADMIN).
Known users of the ESN replay window are strongSwan and Steffen's
iproute2 patch (<http://patchwork.ozlabs.org/patch/85962/>). The latter
uses the interface with a bitmap supplied while the former does not.
strongSwan is therefore prone to run into issue 1.
To fix both issues without breaking existing userland allow using the
XFRMA_REPLAY_ESN_VAL netlink attribute with either an empty bitmap or a
fully specified one. For the former case we initialize the in-kernel
bitmap with zero, for the latter we copy the user supplied bitmap. For
state updates the full bitmap must be supplied.
To prevent overflows in the bitmap length calculation the maximum size
of bmp_len is limited to 128 by this patch -- resulting in a maximum
replay window of 4096 packets. This should be sufficient for all real
life scenarios (RFC 4303 recommends a default replay window size of 64).
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Martin Willi <martin@revosec.ch>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit b6d86d3d (Fix DIV_ROUND_CLOSEST to support negative dividends),
the following warning is seen if the kernel is compiled with W=1 (-Wextra):
warning: comparison of unsigned expression >= 0 is always true
The warning is due to the test '((typeof(x))-1) >= 0', which is used to detect
if the variable type is unsigned. Research on the web suggests that the warning
disappears if '>' instead of '>=' is used for the comparison.
Tests after changing the macro along that line show that the warning is gone,
and that the result is still correct:
i=-4: DIV_ROUND_CLOSEST(i, 2)=-2
i=-3: DIV_ROUND_CLOSEST(i, 2)=-2
i=-2: DIV_ROUND_CLOSEST(i, 2)=-1
i=-1: DIV_ROUND_CLOSEST(i, 2)=-1
i=0: DIV_ROUND_CLOSEST(i, 2)=0
i=1: DIV_ROUND_CLOSEST(i, 2)=1
i=2: DIV_ROUND_CLOSEST(i, 2)=1
i=3: DIV_ROUND_CLOSEST(i, 2)=2
i=4: DIV_ROUND_CLOSEST(i, 2)=2
Code size is the same as before.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
IBM reported a soft lockup after applying the fix for the rename_lock
deadlock. Commit c83ce989cb ("VFS: Fix the nfs sillyrename regression
in kernel 2.6.38") was found to be the culprit.
The nfs sillyrename fix used DCACHE_DISCONNECTED to indicate that the
dentry was killed. This flag can be set on non-killed dentries too,
which results in infinite retries when trying to traverse the dentry
tree.
This patch introduces a separate flag: DCACHE_DENTRY_KILLED, which is
only set in d_kill() and makes try_to_ascend() test only this flag.
IBM reported successful test results with this patch.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
gcc 4.6+ has support for a externally_visible attribute that prevents the
optimizer from optimizing unused symbols away. Add a __visible macro to
use it with that compiler version or later.
This is used (at least) by the "Link Time Optimization" patchset.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I found following definition in include/linux/memory.h, in my IA64
platform, SECTION_SIZE_BITS is equal to 32, and MIN_MEMORY_BLOCK_SIZE
will be 0.
#define MIN_MEMORY_BLOCK_SIZE (1 << SECTION_SIZE_BITS)
Because MIN_MEMORY_BLOCK_SIZE is int type and length of 32bits,
so MIN_MEMORY_BLOCK_SIZE(1 << 32) will will equal to 0.
Actually when SECTION_SIZE_BITS >= 31, MIN_MEMORY_BLOCK_SIZE will be wrong.
This will cause wrong system memory infomation in sysfs.
I think it should be:
#define MIN_MEMORY_BLOCK_SIZE (1UL << SECTION_SIZE_BITS)
And "echo offline > memory0/state" will cause following call trace:
kernel BUG at mm/memory_hotplug.c:885!
sh[6455]: bugcheck! 0 [1]
Pid: 6455, CPU 0, comm: sh
psr : 0000101008526030 ifs : 8000000000000fa4 ip : [<a0000001008c40f0>] Not tainted (3.6.0-rc1)
ip is at offline_pages+0x210/0xee0
Call Trace:
show_stack+0x80/0xa0
show_regs+0x640/0x920
die+0x190/0x2c0
die_if_kernel+0x50/0x80
ia64_bad_break+0x3d0/0x6e0
ia64_native_leave_kernel+0x0/0x270
offline_pages+0x210/0xee0
alloc_pages_current+0x180/0x2a0
Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- A tps65217 build error fix.
- A lcp_ich regression fix caused by the MFD driver failing to initialize the
watchdog sub device due to ACPI conflicts.
- 2 MAX77693 interrupt handling bug fixes.
- An MFD core fix, adding an IRQ domain argument to the MFD device addition
API in order to prevent silent and potentially harmful remapping behaviour
changes for drivers supporting non-DT platforms.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJQVQpwAAoJEIqAPN1PVmxKpnQP/iDPakfGT0m00nxk9VvuXfaA
SlNxhhM1ZPxlZcBrVFULZ+dCVuUaBYIfwoT4Wwl2yZsmp8uYb62Rd6LGXDnjO+FH
zKMg3b/KDHq/jicYxqFy4bVq3I850jTLN4Hti6hArn8HM6gxZJQxM810Poxt6M0T
odud6oHSevGpCoL7GU4O+gmx1wKSDRrw6TZPokJDfXaQcFPJEt0YGGtI2zAO1sV+
iBPQE/bI3bF5r/kowbh9ro5WybGNyFrwiYtaTzJWntXoPKN1lmr0F2F2NaCEo1xj
2vJPFSCK3qC9Ft7sPQEBYxrSXzyzgu7G8HGgkcBG5SaSo+W3awe9DEaHfk5E7sdr
iyc01kh4FmGlMAvHG/zcK3YvchR7GJdRI7BrR+MTQ03ZUv/ZrmTYFAHv5fAzBQ+N
6ZUoC/1bqPKpfeFPOKYpDeYJVnFBWLYr+t7McTqqg+kpxUuYnQ0HyJVjDh01ZVQT
AtCjjW7R+Ka44B+xfMOartPZMqZZEHSJ0UjacyEyzMpAAY14eUDMtTqs/jtNp+8Z
B0bZiA8ZhUmNVH7TZoT/u57FgEW34JnM8oH6jLVN+yjInpuugIvqpZn0TA90GL7V
Fh2DALMgUN5z/TXduGSsSPg1hvXFPOujaS+4o78e8PWiVQd5XUk/mc8OOS/mrRwj
+mILnzxJcp8iYrYqESbt
=nP/y
-----END PGP SIGNATURE-----
Merge tag 'mfd-for-linus-3.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6
Pull mfd fixes from Samuel Ortiz:
"This is the remaining MFD fixes for 3.6, with 5 pending fixes:
- A tps65217 build error fix.
- A lcp_ich regression fix caused by the MFD driver failing to
initialize the watchdog sub device due to ACPI conflicts.
- 2 MAX77693 interrupt handling bug fixes.
- An MFD core fix, adding an IRQ domain argument to the MFD device
addition API in order to prevent silent and potentially harmful
remapping behaviour changes for drivers supporting non-DT
platforms."
* tag 'mfd-for-linus-3.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
mfd: MAX77693: Fix NULL pointer error when initializing irqs
mfd: MAX77693: Fix interrupt handling bug
mfd: core: Push irqdomain mapping out into devices
mfd: lpc_ich: Fix a 3.5 kernel regression for iTCO_wdt driver
mfd: Move tps65217 regulator plat data handling to regulator
Yet more (a bunch of) small fixes that slipped from the previous
pull request. Most of commits are pending ASoC fixes, all of which
are fairly trivial commits.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJQVGAHAAoJEGwxgFQ9KSmkyW8P/idw2CiiB2bdU3SwDfg6AXJE
CmvpTHtxAVtYvejq/WIFP9NkUAwZZePg+DQoVTOYmzdrZjNyCIW0rq7TG2PCo4Dx
v6Ek9MITrWbT8dil75SF0JeqglWznQFNkUinBCVIZPEzvpvTmjbnQNvja9iVQ41G
LpWbB0KPTNw88cILnH8YTO0tlvPFhOTx4ZRMZpq26q7nmph5abSjLlkKmYMa59sp
lbq8P9y2HRSLM7YR5WAV7ydg3L+euFe7ppbCqnp0l0mmhYjj3/ltI/wxkGIWNfRN
mSAW3ZM2Xz0ZO0NLuLEMcgoCZAoHy3KUMOJqt+DKe91Vn7DpBU/xWrcwU+wT7I3v
9O4vM6C4h89xxB41n1AejUQivPHIyT1ZmfSRByB5t5l2KwUI2VDD2p7VNHvY6FWF
JkbYfb2c1VB3sUZKDv0dKDfZDsc5ddLVSnujoRjApel9ghVI7wDZr5ZsLPMW8z/Y
6wJ5PsBAf1iPc+CS05mQXrLc8LQB3u3bR7xTEt9yVsj8lQIXmN6W6Vm0N0hut6Vs
snDpKHD0AQ9LjQZysUsX45qPPiSX6PlX2wEFyA49C1ahBKUJ0Nh7wmqvvF9/GA2R
kqK652uM7Mworw26eYrNfbyL2/DFrPea67lks1tqW3s1o7NQ9A1gNmrF0ZIIbaTt
zY0D01eyFWkIeeKqKtoU
=fIXH
-----END PGP SIGNATURE-----
Merge tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull more sound fixes from Takashi Iwai:
"Yet more (a bunch of) small fixes that slipped from the previous pull
request. Most of commits are pending ASoC fixes, all of which are
fairly trivial commits."
* tag 'sound-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: wm8904: correct the index
ALSA: hda - Yet another position_fix quirk for ASUS machines
ASoC: tegra: fix maxburst settings in dmaengine code
ASoC: samsung dma - Don't indicate support for pause/resume.
ASoC: mc13783: Remove mono support
ASoC: arizona: Fix typo in 44.1kHz rates
ASoC: spear: correct the check for NULL dma_buffer pointer
sound: tegra_alc5632: remove HP detect GPIO inversion
ASoC: atmel-ssc: include linux/io.h for raw io
ASoC: dapm: Don't force card bias level to be updated
ASoC: dapm: Make sure we update the bias level for CODECs with no op
ASoC: am3517evm: fix error return code
ASoC: ux500_msp_i2s: better use devm functions and fix error return code
ASoC: imx-sgtl5000: fix error return code
This reverts commit 970e178985.
Nikolay Ulyanitsky reported thatthe 3.6-rc5 kernel has a 15-20%
performance drop on PostgreSQL 9.2 on his machine (running "pgbench").
Borislav Petkov was able to reproduce this, and bisected it to this
commit 970e178985 ("sched: Improve scalability via 'CPU buddies' ...")
apparently because the new single-idle-buddy model simply doesn't find
idle CPU's to reschedule on aggressively enough.
Mike Galbraith suspects that it is likely due to the user-mode spinlocks
in PostgreSQL not reacting well to preemption, but we don't really know
the details - I'll just revert the commit for now.
There are hopefully other approaches to improve scheduler scalability
without it causing these kinds of downsides.
Reported-by: Nikolay Ulyanitsky <lystor@gmail.com>
Bisected-by: Borislav Petkov <bp@alien8.de>
Acked-by: Mike Galbraith <efault@gmx.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently the MFD core supports remapping MFD cell interrupts using an
irqdomain but only if the MFD is being instantiated using device tree
and only if the device tree bindings use the pattern of registering IPs
in the device tree with compatible properties. This will be actively
harmful for drivers which support non-DT platforms and use this pattern
for their DT bindings as it will mean that the core will silently change
remapping behaviour and it is also limiting for drivers which don't do
DT with this particular pattern. There is also a potential fragility if
there are interrupts not associated with MFD cells and all the cells are
omitted from the device tree for some reason.
Instead change the code to take an IRQ domain as an optional argument,
allowing drivers to take the decision about the parent domain for their
interrupts. The one current user of this feature is ab8500-core, it has
the domain lookup pushed out into the driver.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Pull i2c embedded fixes from Wolfram Sang:
"The last bunch of (typical) i2c-embedded driver fixes for 3.6.
Also update the MAINTAINERS file to point to my tree since people keep
asking where to find their patches."
* 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
i2c: algo: pca: Fix mode selection for PCA9665
MAINTAINERS: fix tree for current i2c-embedded development
i2c: mxs: correctly setup speed for non devicetree
i2c: pnx: Fix read transactions of >= 2 bytes
i2c: pnx: Fix bit definitions
Pull perf fixes from Ingo Molnar:
"This tree includes various fixes"
Ingo really needs to improve on the whole "explain git pull" part.
"Various fixes" indeed.
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/hwpb: Invoke __perf_event_disable() if interrupts are already disabled
perf/x86: Enable Intel Cedarview Atom suppport
perf_event: Switch to internal refcount, fix race with close()
oprofile, s390: Fix uninitialized memory access when writing to oprofilefs
perf/x86: Fix microcode revision check for SNB-PEBS
Pull networking fixes from David Miller:
1) Use after free and new device IDs in bluetooth from Andre Guedes,
Yevgeniy Melnichuk, Gustavo Padovan, and Henrik Rydberg.
2) Fix crashes with short packet lengths and VLAN in pktgen, from
Nishank Trivedi.
3) mISDN calls flush_work_sync() with locks held, fix from Karsten
Keil.
4) Packet scheduler gred parameters are reported to userspace
improperly scaled, and WRED idling is not performed correctly. All
from David Ward.
5) Fix TCP socket refcount problem in ipv6, from Julian Anastasov.
6) ibmveth device has RX queue alignment requirements which are not
being explicitly met resulting in sporadic failures, fix from
Santiago Leon.
7) Netfilter needs to take care when interpreting sockets attached to
socket buffers, they could be time-wait minisockets. Fix from Eric
Dumazet.
8) sock_edemux() has the same issue as netfilter did in #7 above, fix
from Eric Dumazet.
9) Avoid infinite loops in CBQ scheduler with some configurations, from
Eric Dumazet.
10) Deal with "Reflection scan: an Off-Path Attack on TCP", from Jozsef
Kadlecsik.
11) SCTP overcharges socket for TX packets, fix from Thomas Graf.
12) CODEL packet scheduler should not reset it's state every time it
builds a new flow, fix from Eric Dumazet.
13) Fix memory leak in nl80211, from Wei Yongjun.
14) NETROM doesn't check skb_copy_datagram_iovec() return values, from
Alan Cox.
15) l2tp ethernet was using sizeof(ETH_HLEN) instead of plain ETH_HLEN,
oops. From Eric Dumazet.
16) Fix selection of ath9k chips on which PA linearization and AM2PM
predistoration are used, from Felix Fietkau.
17) Flow steering settings in mlx4 driver need to be validated properly,
from Hadar Hen Zion.
18) bnx2x doesn't show the correct link duplex setting, from Yaniv
Rosner.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
pktgen: fix crash with vlan and packet size less than 46
bnx2x: Add missing afex code
bnx2x: fix registers dumped
bnx2x: correct advertisement of pause capabilities
bnx2x: display the correct duplex value
bnx2x: prevent timeouts when using PFC
bnx2x: fix stats copying logic
bnx2x: Avoid sending multiple statistics queries
net: qmi_wwan: call subdriver with control intf only
net_sched: gred: actually perform idling in WRED mode
net_sched: gred: fix qave reporting via netlink
net_sched: gred: eliminate redundant DP prio comparisons
net_sched: gred: correct comment about qavg calculation in RIO mode
mISDN: Fix wrong usage of flush_work_sync while holding locks
netfilter: log: Fix log-level processing
net-sched: sch_cbq: avoid infinite loop
net: qmi_wwan: fix Gobi device probing for un2430
net: fix net/core/sock.c build error
ixp4xx_hss: fix build failure due to missing linux/module.h inclusion
caif: move the dereference below the NULL test
...
Here is one fix for 3.6-rc6 for the kobject.h file.
It fixes a reported oops if CONFIG_HOTPLUG is disabled. It's been in
the linux-next tree for a while now.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iEYEABECAAYFAlBSFtQACgkQMUfUDdst+yksuQCg0HgFS6eYiprJCRm6OEwYH00F
SkAAn3JW81UB1s72y8GeCFSigemiCGW2
=nHRF
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg Kroah-Hartman:
"Here is one fix for 3.6-rc6 for the kobject.h file.
It fixes a reported oops if CONFIG_HOTPLUG is disabled. It's been in
the linux-next tree for a while now.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'driver-core-3.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
kobject: fix oops with "input0: bad kobj_uevent_env content in show_uevent()"
It is a bad idea to hold a spinlock and call flush_work_sync.
Move the workqueue cleanup outside the spinlock and use cancel_work_sync,
on closing the channel this seems to be the more correct function.
Remove the never used and constant return value of mISDN_freebchannel.
Signed-off-by: Karsten Keil <keil@b1-systems.de>
Cc: <stable@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current implementation simply ignores attribute flags. Thus, there is
no notification to userland of unsupported features. Check syscall's
attribute flags to let userland know if a feature is supported by the
kernel. This is also needed to distinguish between future kernels what
might support a feature.
Cc: <stable@vger.kernel.org> v3.5..
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120910093018.GO8285@erda.amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- Final (hopefully) fix for the range checking code in NFSv4 getacl. This
should fix the Oopses being seen when the acl size is close to PAGE_SIZE.
- Fix a regression with the legacy binary mount code
- Fix a regression in the readdir cookieverf initialisation
- Fix an RPC over UDP regression
- Ensure that we report all errors in the NFSv4 open code
- Ensure that fsync() reports all relevant synchronisation errors.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJQUN+KAAoJEGcL54qWCgDyHGcQAKj7MYVDIjhdmsVGGNWXUCnf
X0LVg/ajh+vjusK+hmquzcJikZqgce5IU5DW4vcFr1X8BgP+R51UVvU0KksByD5H
ourV2JVCztAQzQ4WWOsZAGqN0tooJUjyEjl4lEiDsQCF4Nk1HWbuCHeYuX74OToZ
jrgedj0EZ6zb7TOizvbgU/7lI+FKu3Hlw6+u27M9phtSuefJdYSHZHYVMOX81qPh
k0zgZ4tuLIaDuBB84iCrPwNt9icnevq6cIc+AGluI6xhDw+foPvUaUR+OUI420IZ
tunNzP2So+nNoyjEiyMVENaCdEyA75XAmmGHTUUdBiVOsMV4HF/TqvTtSsjk2mN1
FbZVvtjD6srjsQaKdVmqMIZBdhY9LSMLIQVqb4H2rYP6Mwq06WTuyCxf5YhzFfoy
2tai7JuqBkTAWfKB8ESWywV6Qk/MkUWRAOBO6ksS66gAwpcFDj6nfeAdwaEmoYKc
uzLUIRZaclPMZf661cs1fWeFV5XOnCL7je4owgTRGs7MHooWHPcC3273fEJqnhFz
5MkC7nfmUiGcdO1v0mfYTEtMj9Pp9icBoZcVTGn4eZIHzvhhZOx//8LhyBfS+jll
bKjaLZ1rErvIqwnSGcB7PK2yBYY9P6ZaxWjOrAAncZmiOxfhN0hvCo54jNOr/VZ+
atsDEAuqSTeK7ouBqyO4
=e5yE
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.6-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Final (hopefully) fix for the range checking code in NFSv4 getacl.
This should fix the Oopses being seen when the acl size is close to
PAGE_SIZE.
- Fix a regression with the legacy binary mount code
- Fix a regression in the readdir cookieverf initialisation
- Fix an RPC over UDP regression
- Ensure that we report all errors in the NFSv4 open code
- Ensure that fsync() reports all relevant synchronisation errors.
* tag 'nfs-for-3.6-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: fsync() must exit with an error if page writeback failed
SUNRPC: Fix a UDP transport regression
NFS: return error from decode_getfh in decode open
NFSv4: Fix buffer overflow checking in __nfs4_get_acl_uncached
NFSv4: Fix range checking in __nfs4_get_acl_uncached and __nfs4_proc_set_acl
NFS: Fix a problem with the legacy binary mount code
NFS: Fix the initialisation of the readdir 'cookieverf' array
On transactions with n>=2 bytes, the controller actually wrongly clocks in n+1
bytes. This is caused by the (wrong) assumption that RFE in the Status Register
is 1 iff there is no byte already ordered (via a dummy TX byte). This lead to
the implementation of synchronized byte ordering, e.g.:
Dummy-TX - RX - Dummy-TX - RX - ...
But since RFE actually stays high after some Dummy-TX, it rather looks like:
Dummy-TX - Dummy-TX - RX - Dummy-TX - RX - (RX)
The last RX byte is clocked in by the bus controller, but ignored by the kernel
when filling the userspace buffer.
This patch fixes the issue by asking for RX via Dummy-TX asynchronously.
Introducing a separate counter for TX bytes.
Signed-off-by: Roland Stigge <stigge@antcom.de>
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Since VFs may be mapped to VMs which aren't trusted entities, flow
steering rules attached through the wrapper on behalf of VFs must be
checked to make sure that their L2 specification relate to MAC address
assigned to that VF, and add L2 specification if its missing.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To allow for usage of the flow steering Firmware structures in more locations over the driver,
such as the resource tracker, move them from mcg.c to common header files.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 43cedbf0e8 (SUNRPC: Ensure that
we grab the XPRT_LOCK before calling xprt_alloc_slot) is causing
hangs in the case of NFS over UDP mounts.
Since neither the UDP or the RDMA transport mechanism use dynamic slot
allocation, we can skip grabbing the socket lock for those transports.
Add a new rpc_xprt_op to allow switching between the TCP and UDP/RDMA
case.
Note that the NFSv4.1 back channel assigns the slot directly
through rpc_run_bc_task, so we can ignore that case.
Reported-by: Dick Streefland <dick.streefland@altium.nl>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [>= 3.1]
Fengguang Wu <fengguang.wu@intel.com> writes:
> After the __devinit* removal series, I can still get kernel panic in
> show_uevent(). So there are more sources of bug..
>
> Debug patch:
>
> @@ -343,8 +343,11 @@ static ssize_t show_uevent(struct device
> goto out;
>
> /* copy keys to file */
> - for (i = 0; i < env->envp_idx; i++)
> + dev_err(dev, "uevent %d env[%d]: %s/.../%s\n", env->buflen, env->envp_idx, top_kobj->name, dev->kobj.name);
> + for (i = 0; i < env->envp_idx; i++) {
> + printk(KERN_ERR "uevent %d env[%d]: %s\n", (int)count, i, env->envp[i]);
> count += sprintf(&buf[count], "%s\n", env->envp[i]);
> + }
>
> Oops message, the env[] is again not properly initilized:
>
> [ 44.068623] input input0: uevent 61 env[805306368]: input0/.../input0
> [ 44.069552] uevent 0 env[0]: (null)
This is a completely different CONFIG_HOTPLUG problem, only
demonstrating another reason why CONFIG_HOTPLUG should go away. I had a
hard time trying to disable it anyway ;-)
The problem this time is lots of code assuming that a call to
add_uevent_var() will guarantee that env->buflen > 0. This is not true
if CONFIG_HOTPLUG is unset. So things like this end up overwriting
env->envp_idx because the array index is -1:
if (add_uevent_var(env, "MODALIAS="))
return -ENOMEM;
len = input_print_modalias(&env->buf[env->buflen - 1],
sizeof(env->buf) - env->buflen,
dev, 0);
Don't know what the best action is, given that there seem to be a *lot*
of this around the kernel. This patch "fixes" the problem for me, but I
don't know if it can be considered an appropriate fix.
[ It is the correct fix for now, for 3.7 forcing CONFIG_HOTPLUG to
always be on is the longterm fix, but it's too late for 3.6 and older
kernels to resolve this that way - gregkh ]
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJQSCWfAAoJEMsfJm/On5mBvicP/0CmzUgasB1MVpBpxZeaLrcf
buGw5GZpH/Kh7h+4mdfY5egzvn0J9lFt9gWB58lw1xaQhHgihaus/h63K9nDQpyb
NhsuGDY628st9cJWBFU18KcnSjVKNSEdVOZLtSkqzpbtAiy6zH0pQGfNPCSZaJuo
XngjHAyIHZaAyORgwudGm9hrTqTGNUdaLPp1NbXO1N7+/Upnm5f237XUWqgboTgK
4BGVOG6Prjm6ytJqc+eXg/iUACPgdG8Fe8rQMhRm0HjIEdX58+xfjrOZ3IPqxqcX
F+ri3W615PazVi27wC6Afk9NqssvJagImEzRh7DbbMAeTesh0vAPMb8UihhhX4M6
BsWU8zE1UVBEHJmi0ZT/Q+5v3heLbBd2kbmyorSBvHZyH+zFaAgvWVSqMCWguKO+
CBptQnFhFY209Pi3S79tZFe7g5T6xMFddGW+0Wpp7pdT+BgC9EL7UBJGctCzO7Yq
ipfCtRzZmJO7HnQic9T7XQhQvmCNjZEXHIooFZgZ6uF3GJL0Ipetc3t/uHwZ2C+E
TwX4eNJH/IxgyRJszjCyKWs6iu7RaRQu7lq4a6ZpROmQJIW64pEF/ZWQ2+QULwnN
pX2j2mLPltidjUiJ6uAIyzZr8FL97uLSLDkWhc4fZTsDC+2NOLwaK7nPErO1M6Rc
C9fde4G5GOhkCaGrZX83
=eZQ4
-----END PGP SIGNATURE-----
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull a hwmon fix from Guenter Roeck:
"One patch, fixing DIV_ROUND_CLOSEST to support negative dividends.
While the changes are not in the drivers/hwmon directory, the problem
primarily affects hwmon drivers, and it makes sense to push the patch
through the hwmon tree."
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
linux/kernel.h: Fix DIV_ROUND_CLOSEST to support negative dividends