From: Marc Kleine-Budde
Date: Wed, 5 Mar 2014 00:49:47 +0100
Subject: [PATCH 033/354] net: sched: Use msleep() instead of yield()
Origin: https://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit?id=c872ce657d413044d7a779f3ae549565df0e0bee

On PREEMPT_RT enabled systems the interrupt handlers run as threads at
prio 50 (by default). If a high priority userspace process tries to shut
down a busy network interface it might spin in a yield loop waiting for
the device to become idle. Because the interrupt thread has a lower
priority than the looping process, it might never be scheduled, which
results in a deadlock on UP systems.

With Magic SysRq the following backtrace can be produced:

> test_app R running 0 174 168 0x00000000
> [] (__schedule+0x220/0x3fc) from [] (preempt_schedule_irq+0x48/0x80)
> [] (preempt_schedule_irq+0x48/0x80) from [] (svc_preempt+0x8/0x20)
> [] (svc_preempt+0x8/0x20) from [] (local_bh_enable+0x18/0x88)
> [] (local_bh_enable+0x18/0x88) from [] (dev_deactivate_many+0x220/0x264)
> [] (dev_deactivate_many+0x220/0x264) from [] (__dev_close_many+0x64/0xd4)
> [] (__dev_close_many+0x64/0xd4) from [] (__dev_close+0x28/0x3c)
> [] (__dev_close+0x28/0x3c) from [] (__dev_change_flags+0x88/0x130)
> [] (__dev_change_flags+0x88/0x130) from [] (dev_change_flags+0x10/0x48)
> [] (dev_change_flags+0x10/0x48) from [] (do_setlink+0x370/0x7ec)
> [] (do_setlink+0x370/0x7ec) from [] (rtnl_newlink+0x2b4/0x450)
> [] (rtnl_newlink+0x2b4/0x450) from [] (rtnetlink_rcv_msg+0x158/0x1f4)
> [] (rtnetlink_rcv_msg+0x158/0x1f4) from [] (netlink_rcv_skb+0xac/0xc0)
> [] (netlink_rcv_skb+0xac/0xc0) from [] (rtnetlink_rcv+0x18/0x24)
> [] (rtnetlink_rcv+0x18/0x24) from [] (netlink_unicast+0x13c/0x198)
> [] (netlink_unicast+0x13c/0x198) from [] (netlink_sendmsg+0x264/0x2e0)
> [] (netlink_sendmsg+0x264/0x2e0) from [] (sock_sendmsg+0x78/0x98)
> [] (sock_sendmsg+0x78/0x98) from [] (___sys_sendmsg.part.25+0x268/0x278)
> [] (___sys_sendmsg.part.25+0x268/0x278) from [] (__sys_sendmsg+0x48/0x78)
> [] (__sys_sendmsg+0x48/0x78) from [] (ret_fast_syscall+0x0/0x2c)

This patch works around the problem by replacing yield() with msleep(1),
giving the interrupt thread time to finish, similar to other changes
contained in the rt patch set. Using wait_for_completion() instead would
probably be a better solution.

Signed-off-by: Marc Kleine-Budde
Signed-off-by: Sebastian Andrzej Siewior
---
 net/sched/sch_generic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index c966dacf1130..a5262b2ba536 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -1254,7 +1254,7 @@ void dev_deactivate_many(struct list_head *head)
 	/* Wait for outstanding qdisc_run calls. */
 	list_for_each_entry(dev, head, close_list) {
 		while (some_qdisc_is_busy(dev))
-			yield();
+			msleep(1);
 		/* The new qdisc is assigned at this point so we can safely
 		 * unwind stale skb lists and qdisc statistics
 		 */
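
Note for readers: the loop shape this patch switches to can be sketched in userspace. This is only an illustration under stated assumptions, not kernel code and not part of the patch. The names wait_until_idle(), sleep_1ms(), busy_fn and countdown_busy() are hypothetical, and nanosleep() stands in for msleep(1). The point it tries to show is that the waiter blocks between polls, so a lower priority thread gets CPU time, whereas a yield loop may be rescheduled immediately.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L	/* for nanosleep() under strict -std= modes */
#endif
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical predicate type: returns true while the resource is busy. */
typedef bool (*busy_fn)(void *ctx);

/* Block for roughly one millisecond; userspace stand-in for msleep(1). */
static void sleep_1ms(void)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000L };

	nanosleep(&ts, NULL);
}

/*
 * Poll until busy() reports idle, sleeping between polls instead of
 * yielding. Returns the number of sleeps that were needed.
 */
static unsigned int wait_until_idle(busy_fn busy, void *ctx)
{
	unsigned int sleeps = 0;

	assert(busy != NULL);
	while (busy(ctx)) {
		sleep_1ms();
		sleeps++;
	}
	return sleeps;
}

/*
 * Example predicate for demonstration only: reports busy for the first
 * *remaining calls, then idle.
 */
static bool countdown_busy(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining == 0)
		return false;
	(*remaining)--;
	return true;
}
```

Calling wait_until_idle(countdown_busy, &n) with n set to 3 sleeps three times before returning. A real waiter would pass a predicate that checks state updated by another thread, which is the role some_qdisc_is_busy() plays in the hunk above.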