From b5f862180d7011d9575d0499fa37f0f25b423b12 Mon Sep 17 00:00:00 2001 From: Hangbin Liu Date: Mon, 6 Nov 2017 09:01:57 +0800 Subject: bonding: discard lowest hash bit for 802.3ad layer3+4 After commit 07f4c90062f8 ("tcp/dccp: try to not exhaust ip_local_port_range in connect()"), we will try to use even ports for connect(). Then if an application (seen clearly with iperf) opens multiple streams to the same destination IP and port, each stream will be given an even source port. So the bonding driver's simple xmit_hash_policy based on layer3+4 addressing will always hash all these streams to the same interface. And the total throughput will limited to a single slave. Change the tcp code will impact the whole tcp behavior, only for bonding usage. Paolo Abeni suggested fix this by changing the bonding code only, which should be more reasonable, and less impact. Fix this by discarding the lowest hash bit because it contains little entropy. After the fix we can re-balance between slaves. Signed-off-by: Paolo Abeni Signed-off-by: Hangbin Liu Signed-off-by: David S. Miller --- drivers/net/bonding/bond_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'drivers/net/bonding/bond_main.c') diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index c99dc59d729b..76e8054bfc4e 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -3253,7 +3253,7 @@ u32 bond_xmit_hash(struct bonding *bond, struct sk_buff *skb) hash ^= (hash >> 16); hash ^= (hash >> 8); - return hash; + return hash >> 1; } /*-------------------------- Device entry points ----------------------------*/ -- cgit v1.2.3-55-g7522 From 055db6957e4735b16cd2fa94a5bbfb754c9b8023 Mon Sep 17 00:00:00 2001 From: Jay Vosburgh Date: Tue, 7 Nov 2017 19:50:07 +0900 Subject: bonding: fix slave stuck in BOND_LINK_FAIL state The bonding miimon logic has a flaw, in that a failure of the rtnl_trylock can cause a slave to become permanently stuck in BOND_LINK_FAIL state. The sequence of events to cause this is as follows: 1) bond_miimon_inspect finds that a slave's link is down, and so calls bond_propose_link_state, setting slave->new_link_state to BOND_LINK_FAIL, then sets slave->new_link to BOND_LINK_DOWN and returns non-zero. 2) In bond_mii_monitor, the rtnl_trylock fails, and the timer is rescheduled. No change is committed. 3) bond_miimon_inspect is called again, but this time the slave from step 1 has recovered. slave->new_link is reset to NOCHANGE, and, as slave->link was never changed, the switch enters the BOND_LINK_UP case, and does nothing. The pending BOND_LINK_FAIL state from step 1 remains pending, as new_link_state is not reset. 4) The state from step 3 persists until another slave changes link state and causes bond_miimon_inspect to return non-zero. At this point, the BOND_LINK_FAIL state change on the slave from steps 1-3 is committed, and the slave will remain stuck in BOND_LINK_FAIL state even though it is actually link up. The remedy for this is to initialize new_link_state on each entry to bond_miimon_inspect, as is already done with new_link. Fixes: fb9eb899a6dc ("bonding: handle link transition from FAIL to UP correctly") Reported-by: Alex Sidorenko Reviewed-by: Jarod Wilson Signed-off-by: Jay Vosburgh Acked-by: Mahesh Bandewar Signed-off-by: David S. Miller --- drivers/net/bonding/bond_main.c | 1 + 1 file changed, 1 insertion(+) (limited to 'drivers/net/bonding/bond_main.c') diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 76e8054bfc4e..b2db581131b2 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -2042,6 +2042,7 @@ static int bond_miimon_inspect(struct bonding *bond) bond_for_each_slave_rcu(bond, slave, iter) { slave->new_link = BOND_LINK_NOCHANGE; + slave->link_new_state = slave->link; link_state = bond_check_dev_link(bond, slave->dev, 0); -- cgit v1.2.3-55-g7522