Commit Graph

267 Commits

Author SHA1 Message Date
Lucas Zampieri f6029bf351 Merge: workqueue: Backport workqueue commits to v6.9
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3910

JIRA: https://issues.redhat.com/browse/RHEL-25103
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3910
Depends: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/3847

The primary purpose of this MR is to backport the upstream workqueue
commits which enable ordered workqueues and rescuers to follow
changes in the workqueue unbound cpumask. This is necessary to make
sure that isolated CPUs won't be disturbed by unbound work items
being handled on those CPUs.

These upstream commits were merged into the v6.9 kernel, which also
contains some major changes in workqueue code. This makes the required
commits dependent on some of the v6.9 workqueue commits. It is less risky
to sync the workqueue code up to v6.9 than to selectively backport
the dependent commits. This MR also includes some miscellaneous
commits in other subsystems made necessary by changes in the underlying
workqueue implementation.

A follow-up proactive workqueue fixes MR will be created later on,
if necessary.

Signed-off-by: Waiman Long <longman@redhat.com>

Approved-by: Tony Camuso <tcamuso@redhat.com>
Approved-by: Steve Best <sbest@redhat.com>
Approved-by: Vladis Dronov <vdronov@redhat.com>
Approved-by: Prarit Bhargava <prarit@redhat.com>
Approved-by: Wander Lairson Costa <wander@redhat.com>
Approved-by: Phil Auld <pauld@redhat.com>
Approved-by: Radu Rendec <rrendec@redhat.com>
Approved-by: Chris von Recklinghausen <crecklin@redhat.com>
Approved-by: CKI KWF Bot <cki-ci-bot+kwf-gitlab-com@redhat.com>

Merged-by: Lucas Zampieri <lzampier@redhat.com>
2024-06-13 13:07:43 +00:00
Waiman Long efd17b3bb7 workqueue: Drain BH work items on hot-unplugged CPUs
JIRA: https://issues.redhat.com/browse/RHEL-25103

commit 1acd92d95fa24edca8f0292b21870025da93e24f
Author: Tejun Heo <tj@kernel.org>
Date:   Mon, 26 Feb 2024 15:38:55 -1000

    workqueue: Drain BH work items on hot-unplugged CPUs

    Boqun pointed out that workqueues aren't handling BH work items on offlined
    CPUs. Unlike tasklet, which transfers out the pending tasks from
    CPUHP_SOFTIRQ_DEAD, BH workqueue would just leave them pending, which is
    problematic. Note that this behavior is specific to BH workqueues as the
    non-BH per-CPU workers just become unbound when the CPU goes offline.

    This patch fixes the issue by draining the pending BH work items from an
    offlined CPU from CPUHP_SOFTIRQ_DEAD. Because work items carry more context,
    it's not as easy to transfer the pending work items from one pool to
    another. Instead, the BH work items of the offlined pools are executed on an
    online CPU.

    Note that this assumes that no further BH work items will be queued on the
    offlined CPUs. This assumption is shared with tasklet and should be fine for
    conversions. However, this issue also exists for per-CPU workqueues which
    will just keep executing work items queued after CPU offline on unbound
    workers and workqueue should reject per-CPU and BH work items queued on
    offline CPUs. This will be addressed separately later.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reported-and-reviewed-by: Boqun Feng <boqun.feng@gmail.com>
    Link: http://lkml.kernel.org/r/Zdvw0HdSXcU3JZ4g@boqun-archlinux

Signed-off-by: Waiman Long <longman@redhat.com>
2024-05-03 13:39:33 -04:00
Waiman Long da3eaa2838 workqueue: Implement BH workqueues to eventually replace tasklets
JIRA: https://issues.redhat.com/browse/RHEL-25103
Conflicts: A minor context diff in kernel/workqueue.c due to missing
	   upstream commit 68279f9c9f59 ("treewide: mark stuff as
	   __ro_after_init").

commit 4cb1ef64609f9b0254184b2947824f4b46ccab22
Author: Tejun Heo <tj@kernel.org>
Date:   Sun, 4 Feb 2024 11:28:06 -1000

    workqueue: Implement BH workqueues to eventually replace tasklets

    The only generic interface to execute asynchronously in the BH context is
    tasklet; however, it's marked deprecated and has some design flaws, such as
    the execution code accessing the tasklet item after the execution is
    complete (which can lead to subtle use-after-free in certain usage
    scenarios) and less-developed flush and cancel mechanisms.

    This patch implements BH workqueues which share the same semantics and
    features of regular workqueues but execute their work items in the softirq
    context. As there is always only one BH execution context per CPU, none of
    the concurrency management mechanisms applies and a BH workqueue can be
    thought of as a convenience wrapper around softirq.

    Except for the inability to sleep while executing and lack of max_active
    adjustments, BH workqueues and work items should behave the same as regular
    workqueues and work items.

    Currently, the execution is hooked to tasklet[_hi]. However, the goal is to
    convert all tasklet users over to BH workqueues. Once the conversion is
    complete, tasklet can be removed and BH workqueues can directly take over
    the tasklet softirqs.

    system_bh[_highpri]_wq are added. As queue-wide flushing doesn't exist in
    tasklet, all existing tasklet users should be able to use the system BH
    workqueues without creating their own workqueues.

    v3: - Add missing interrupt.h include.

    v2: - Instead of using tasklets, hook directly into its softirq action
          functions - tasklet[_hi]_action(). This is slightly cheaper and closer
          to the eventual code structure we want to arrive at. Suggested by Lai.

        - Lai also pointed out several places which need NULL worker->task
          handling or can use clarification. Updated.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
    Link: http://lkml.kernel.org/r/CAHk-=wjDW53w4-YcSmgKC5RruiRLHmJ1sXeYdp_ZgVoBw=5byA@mail.gmail.com
    Tested-by: Allen Pais <allen.lkml@gmail.com>
    Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>

Signed-off-by: Waiman Long <longman@redhat.com>
2024-05-03 13:39:31 -04:00
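For illustration, queueing onto the new system BH workqueue uses the ordinary workqueue API; a minimal sketch (callback and work-item names are hypothetical):

    #include <linux/workqueue.h>

    static void my_bh_work_fn(struct work_struct *work)
    {
        /* Executes in softirq (BH) context: must not sleep. */
    }

    static DECLARE_WORK(my_bh_work, my_bh_work_fn);

    static void kick_bh_work(void)
    {
        /* Same API as regular workqueues, but runs in BH context. */
        queue_work(system_bh_wq, &my_bh_work);
    }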
Phil Auld 5664854fb4 sched/core: introduce sched_core_idle_cpu()
JIRA: https://issues.redhat.com/browse/RHEL-25535

commit 548796e2e70b44b4661fd7feee6eb239245ff1f8
Author: Cruz Zhao <CruzZhao@linux.alibaba.com>
Date:   Thu Jun 29 12:02:04 2023 +0800

    sched/core: introduce sched_core_idle_cpu()

    As core scheduling was introduced, a new idle state was defined:
    force idle, in which the CPU runs the idle task while nr_running is
    greater than zero.

    If a cpu is in the force idle state, idle_cpu() will return zero. This
    result makes sense in some scenarios, e.g., load balance, showacpu
    when dumping, and judging whether the RCU boost kthread is starving.

    But it causes errors in other scenarios, e.g., tick_irq_exit():
    When in force idle, rq->curr == rq->idle but rq->nr_running > 0, so
    idle_cpu() returns 0. In tick_irq_exit(), if idle_cpu() is 0,
    tick_nohz_irq_exit() will not be called, and ts->idle_active will not
    become 1 after having been set to 0 in tick_nohz_irq_enter().
    With ts->idle_active stuck at 0 when it should be 1,
    update_ts_time_stats() never updates ts->idle_sleeptime. As a result,
    ts->idle_sleeptime is less than the actual value, and finally the
    idle time reported in /proc/stat is less than the actual value.

    To solve this problem, introduce sched_core_idle_cpu(), which
    returns 1 when in force idle. All users of idle_cpu() were audited,
    and idle_cpu() is replaced with sched_core_idle_cpu() in
    tick_irq_exit().

    v2-->v3: Only replace idle_cpu() with sched_core_idle_cpu() in
    function tick_irq_exit(). And modify the corresponding commit log.

    Signed-off-by: Cruz Zhao <CruzZhao@linux.alibaba.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Reviewed-by: Peter Zijlstra <peterz@infradead.org>
    Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
    Reviewed-by: Joel Fernandes <joel@joelfernandes.org>
    Link: https://lore.kernel.org/r/1688011324-42406-1-git-send-email-CruzZhao@linux.alibaba.com

Signed-off-by: Phil Auld <pauld@redhat.com>
2024-04-05 09:49:13 -04:00
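For reference, a sketch of the helper's upstream shape: a force-idle-aware front end to idle_cpu():

    int sched_core_idle_cpu(int cpu)
    {
        struct rq *rq = cpu_rq(cpu);

        /* Force idle: the idle task runs but nr_running > 0. */
        if (sched_core_enabled(rq) && rq->curr == rq->idle)
            return 1;

        return idle_cpu(cpu);
    }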
Phil Auld 7d0cddf221 genirq, softirq: Use in_hardirq() instead of in_irq()
JIRA: https://issues.redhat.com/browse/RHEL-25535
Conflicts: Dropped one hunk because it was removed by a later
commit which is already in rhel 792ea6a074ae ("genirq: Remove
WARN_ON_ONCE() in generic_handle_domain_irq()").

commit fe13889c390e14205e064d7e159e61eb5da4b1c3
Author: Changbin Du <changbin.du@intel.com>
Date:   Fri Jan 28 19:07:27 2022 +0800

    genirq, softirq: Use in_hardirq() instead of in_irq()

    Replace the obsolete and ambiguous macro in_irq() with the new macro
    in_hardirq().

    Signed-off-by: Changbin Du <changbin.du@gmail.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Link: https://lore.kernel.org/r/20220128110727.5110-1-changbin.du@gmail.com

Signed-off-by: Phil Auld <pauld@redhat.com>
2024-04-05 09:49:13 -04:00
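The conversion is mechanical; a hypothetical call site, before and after:

    #include <linux/preempt.h>

    static bool called_from_hardirq(void)   /* illustrative helper */
    {
        return in_hardirq();   /* was: return in_irq(); */
    }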
Oleg Nesterov 7ca1a24ee7 Revert "softirq: Let ksoftirqd do its job"
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2196764
Upstream-status: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git

commit d15121be7485655129101f3960ae6add40204463
Author: Paolo Abeni <pabeni@redhat.com>
Date:   Mon May 8 08:17:44 2023 +0200

    Revert "softirq: Let ksoftirqd do its job"

    This reverts the following commits:

      4cd13c21b2 ("softirq: Let ksoftirqd do its job")
      3c53776e29 ("Mark HI and TASKLET softirq synchronous")
      1342d8080f ("softirq: Don't skip softirq execution when softirq thread is parking")

    in a single change to avoid known bad intermediate states introduced by a
    patch series reverting them individually.

    Due to the mentioned commit, when the ksoftirqd threads take charge of
    softirq processing, the system can experience high latencies.

    In the past a few workarounds have been implemented for specific
    side-effects of the initial ksoftirqd enforcement commit:

    commit 1ff688209e ("watchdog: core: make sure the watchdog_worker is not deferred")
    commit 8d5755b3f7 ("watchdog: softdog: fire watchdog even if softirqs do not get to run")
    commit 217f697436 ("net: busy-poll: allow preemption in sk_busy_loop()")
    commit 3c53776e29 ("Mark HI and TASKLET softirq synchronous")

    But the latency problem still exists in real-life workloads, see the link
    below.

    The reverted commit intended to solve a live-lock scenario that can now be
    addressed with the NAPI threaded mode, introduced with commit 29863d41bb
    ("net: implement threaded-able napi poll loop support"), which is nowadays
    in a pretty stable status.

    While a complete solution to put softirq processing under nice resource
    control would be preferable, that has proven to be a very hard task. In
    the short term, remove the main pain point, and also simplify a bit the
    current softirq implementation.

    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Jason Xing <kerneljasonxing@gmail.com>
    Reviewed-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Cc: "Paul E. McKenney" <paulmck@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: netdev@vger.kernel.org
    Link: https://lore.kernel.org/netdev/305d7742212cbe98621b16be782b0562f1012cb6.camel@redhat.com
    Link: https://lore.kernel.org/r/57e66b364f1b6f09c9bc0316742c3b14f4ce83bd.1683526542.git.pabeni@redhat.com

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2023-05-24 12:07:54 +02:00
Eder Zulian 921a5c7f51 softirq: Wake ktimers thread also in softirq.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2187979
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git

commit 40efd84fb5de11104d690043dd7247edd9c5f632
Author: Junxiao Chang <junxiao.chang@intel.com>
Date:   Mon Feb 20 09:12:20 2023 +0100

    softirq: Wake ktimers thread also in softirq.

    If the hrtimer softirq is raised while a softirq is being processed, it does
    not wake the corresponding ktimers thread. This is due to the optimisation
    in the irq-exit path which is also used to wake the ktimers thread. For the
    other softirqs this is okay, because the additional softirq bits will be
    handled by the currently running softirq handler.
    The timer related softirq bits are added to a different variable and rely on
    the ktimers thread.
    As a consequence, the wake up of ktimersd is delayed until the next timer tick.

    Always wake the ktimers thread if a timer related softirq is pending.

    Reported-by: Peh, Hock Zhang <hock.zhang.peh@intel.com>
    Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Signed-off-by: Eder Zulian <ezulian@redhat.com>
2023-05-16 20:09:33 +02:00
Waiman Long 4cabb4dcd7 context_tracking: Take IRQ eqs entrypoints over RCU
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2169516
Conflicts: The drivers/cpuidle/cpuidle-riscv-sbi.c hunk is dropped as
	   RISC-V is not a supported arch and the file is not currently
	   present in Centos-Stream-9.

commit 6f0e6c1598b1a3d19fc30db86b6e26d6f881b43d
Author: Frederic Weisbecker <frederic@kernel.org>
Date:   Wed, 8 Jun 2022 16:40:26 +0200

    context_tracking: Take IRQ eqs entrypoints over RCU

    The RCU dynticks counter is going to be merged into the context tracking
    subsystem. Prepare with moving the IRQ extended quiescent states
    entrypoints to context tracking. For now those are dumb redirection to
    existing RCU calls.

    [ paulmck: Apply Stephen Rothwell feedback from -next. ]
    [ paulmck: Apply Nathan Chancellor feedback. ]

    Acked-by: Paul E. McKenney <paulmck@kernel.org>
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
    Cc: Uladzislau Rezki <uladzislau.rezki@sony.com>
    Cc: Joel Fernandes <joel@joelfernandes.org>
    Cc: Boqun Feng <boqun.feng@gmail.com>
    Cc: Nicolas Saenz Julienne <nsaenz@kernel.org>
    Cc: Marcelo Tosatti <mtosatti@redhat.com>
    Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
    Cc: Yu Liao <liaoyu15@huawei.com>
    Cc: Phil Auld <pauld@redhat.com>
    Cc: Paul Gortmaker<paul.gortmaker@windriver.com>
    Cc: Alex Belits <abelits@marvell.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
    Tested-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>

Signed-off-by: Waiman Long <longman@redhat.com>
2023-03-30 08:36:16 -04:00
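A sketch of the "dumb redirection" stage described above (assuming the ct_irq_*() naming of the upstream series):

    /* kernel/context_tracking.c: plain wrappers at this point */
    noinstr void ct_irq_enter(void)
    {
        rcu_irq_enter();
    }

    noinstr void ct_irq_exit(void)
    {
        rcu_irq_exit();
    }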
Juri Lelli 5a2f14b035 tick: Fix timer storm since introduction of timersd
Bugzilla: https://bugzilla.redhat.com/2171995
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git

commit 4891aae98ee80dd177d32088bb2cda4a8445a2ba
Author:    Frederic Weisbecker <frederic@kernel.org>
Date:      Tue Apr 5 03:07:52 2022 +0200

    tick: Fix timer storm since introduction of timersd

    If timers are pending while the tick is reprogrammed on nohz_mode, the
    next expiry is not armed to fire now; it is instead delayed one jiffy
    forward so as not to raise an inextinguishable timer storm with the
    following scenario:

    1) IRQ triggers and queue a timer
    2) ksoftirqd() is woken up
    3) IRQ tail: timer is reprogrammed to fire now
    4) IRQ exit
    5) TIMER interrupt
    6) goto 3)

    ...all that until we finally reach ksoftirqd.

    Unfortunately we are checking the wrong softirq vector bitmask since
    timersd kthread has split from ksoftirqd. Timers now have their own
    vector state field that must be checked separately. As a result, the
    old timer storm is back. This shows up early on boot with extremely long
    initcalls:

    	[  333.004807] initcall dquot_init+0x0/0x111 returned 0 after 323822879 usecs

    and the cause is uncovered with the right trace events showing just
    10 microseconds between ticks (~100 000 Hz):

    |swapper/-1 1dn.h111 60818582us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415486608
    |swapper/-1 1dn.h111 60818592us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415496082
    |swapper/-1 1dn.h111 60818601us : hrtimer_expire_entry: hrtimer=00000000e0ef0f6b function=tick_sched_timer now=60415505550

    Fix this by checking the right timer vector state from the nohz code.

    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Link: https://lkml.kernel.org/r/20220405010752.1347437-2-frederic@kernel.org

Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
2023-02-27 13:46:09 +01:00
Juri Lelli 688e3bbf5b rcutorture: Also force sched priority to timersd on boosting test.
Bugzilla: https://bugzilla.redhat.com/2171995
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git

commit 2eeb264f1e03aa2969b5017861d316f4810dd2f9
Author:    Frederic Weisbecker <frederic@kernel.org>
Date:      Tue Apr 5 03:07:51 2022 +0200

    rcutorture: Also force sched priority to timersd on boosting test.

    ksoftirqd is statically boosted to the priority level right above the
    one of rcu_torture_boost() so that timers, which torture readers rely on,
    get a chance to run while rcu_torture_boost() is polling.

    However timers processing got split from ksoftirqd into their own kthread
    (timersd) that isn't boosted. It has the same SCHED_FIFO low prio as
    rcu_torture_boost() and therefore timers can't preempt it and may
    starve.

    The issue can be triggered in practice on v5.17.1-rt17 using:

    	./kvm.sh --allcpus --configs TREE04 --duration 10m --kconfig "CONFIG_EXPERT=y CONFIG_PREEMPT_RT=y"

    Fix this with statically boosting timersd just like is done with
    ksoftirqd in commit
       ea6d962e80 ("rcutorture: Judge RCU priority boosting on grace periods, not callbacks")

    Suggested-by: Mel Gorman <mgorman@suse.de>
    Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Link: https://lkml.kernel.org/r/20220405010752.1347437-1-frederic@kernel.org
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
2023-02-27 13:46:09 +01:00
Juri Lelli 87ec2dea19 softirq: Use a dedicated thread for timer wakeups.
Bugzilla: https://bugzilla.redhat.com/2171995
Upstream Status: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git

commit 4335d431bec7d67faea9a92afa66f4bbbcfde7f8
Author:    Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:      Wed Dec 1 17:41:09 2021 +0100

    softirq: Use a dedicated thread for timer wakeups.

    A timer/hrtimer softirq is raised in-IRQ context. With threaded
    interrupts enabled or on PREEMPT_RT this leads to waking the ksoftirqd
    for the processing of the softirq.
    Once the ksoftirqd is marked as pending (or is running) it will collect
    all raised softirqs. This in turn means that a softirq which would have
    been processed at the end of the threaded interrupt, which runs at an
    elevated priority, is now moved to ksoftirqd which runs at SCHED_OTHER
    priority and competes with every regular task for CPU resources.
    This introduces long delays on heavily loaded systems and is not desired,
    especially if the system is not overloaded by the softirqs.

    Split the TIMER_SOFTIRQ and HRTIMER_SOFTIRQ processing into a dedicated
    timers thread and let it run at the lowest SCHED_FIFO priority.
    RT tasks are woken up from hardirq context, so only timer_list timers
    and hrtimers for "regular" tasks are processed here. The higher priority
    ensures that wakeups are performed before scheduling SCHED_OTHER tasks.

    Using a dedicated variable to store the pending softirq bits ensures
    that the timer softirqs are not accidentally picked up by ksoftirqd or
    other threaded interrupts.
    They shouldn't be picked up by ksoftirqd since it runs at lower priority.
    However, if the timer bits are ORed while a threaded interrupt is
    running, then the timer softirq would be performed at higher priority.
    The new timer thread will block on the softirq lock before it starts
    softirq work. This "race window" isn't closed because, while the timer
    thread is performing the softirq work, it can get PI-boosted via the
    softirq lock by a random force-threaded thread.
    The timer thread can pick up pending softirqs from ksoftirqd, but only
    if the softirq load is high. It is not desired that the picked-up
    softirqs are processed at SCHED_FIFO priority under high softirq load,
    but this can already happen via a PI-boost by a force-threaded interrupt.

    Reported-by: kernel test robot <lkp@intel.com> [ static timer_threads ]
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
2023-02-27 13:46:08 +01:00
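A sketch of the per-CPU state this series introduces, per the linux-rt-devel patches (details may differ slightly in the backport):

    /* Dedicated kthread and pending-bits variable for timer softirqs. */
    static DEFINE_PER_CPU(struct task_struct *, timersd);
    static DEFINE_PER_CPU(unsigned long, pending_timer_softirq);

    static void wake_timersd(void)
    {
        struct task_struct *tsk = __this_cpu_read(timersd);

        if (tsk)
            wake_up_process(tsk);
    }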
Phil Auld eed8502760 smp: Make softirq handling RT safe in flush_smp_call_function_queue()
Bugzilla: https://bugzilla.redhat.com/2120671

commit 1a90bfd220201fbe050dfc15deaac20ca5f15638
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Wed Apr 13 15:31:05 2022 +0200

    smp: Make softirq handling RT safe in flush_smp_call_function_queue()

    flush_smp_call_function_queue() invokes do_softirq() which is not available
    on PREEMPT_RT. flush_smp_call_function_queue() is invoked from the idle
    task and the migration task with preemption or interrupts disabled.

    So RT kernels cannot process soft interrupts in that context as that has to
    acquire 'sleeping spinlocks' which is not possible with preemption or
    interrupts disabled and forbidden from the idle task anyway.

    The currently known SMP function call which raises a soft interrupt is in
    the block layer, but this functionality is not enabled on RT kernels due to
    latency and performance reasons.

    RT could wake up ksoftirqd unconditionally, but this wants to be avoided if
    there were soft interrupts pending already when this is invoked in the
    context of the migration task. The migration task might have preempted a
    threaded interrupt handler which raised a soft interrupt, but did not reach
    the local_bh_enable() to process it. The "running" ksoftirqd might prevent
    the handling in the interrupt thread context which is causing latency
    issues.

    Add a new function which handles this case explicitly for RT and falls
    back to do_softirq() on !RT kernels. In the RT case this warns when one of
    the flushed SMP function calls raised a soft interrupt so this can be
    investigated.

    [ tglx: Moved the RT part out of SMP code ]

    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lore.kernel.org/r/YgKgL6aPj8aBES6G@linutronix.de
    Link: https://lore.kernel.org/r/20220413133024.356509586@linutronix.de

Signed-off-by: Phil Auld <pauld@redhat.com>
2022-09-08 11:25:07 -04:00
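A sketch of the new function's upstream shape (kernel/softirq.c, PREEMPT_RT):

    /*
     * The caller snapshots pending softirqs before flushing the SMP
     * function queue; warn if the flushed calls raised new ones, and
     * hand them to the softirq machinery instead of running inline.
     */
    void do_softirq_post_smp_call_flush(unsigned int was_pending)
    {
        if (WARN_ON_ONCE(was_pending != local_softirq_pending()))
            invoke_softirq();
    }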
Phil Auld 4e19e0b26f timers/nohz: Last resort update jiffies on nohz_full IRQ entry
Bugzilla: https://bugzilla.redhat.com/2078906

commit 53e87e3cdc155f20c3417b689df8d2ac88d79576
Author: Frederic Weisbecker <frederic@kernel.org>
Date:   Tue Oct 26 16:10:54 2021 +0200

    timers/nohz: Last resort update jiffies on nohz_full IRQ entry

    When at least one CPU runs in nohz_full mode, a dedicated timekeeper CPU
    is guaranteed to stay online and to never stop its tick.

    Meanwhile on some rare case, the dedicated timekeeper may be running
    with interrupts disabled for a while, such as in stop_machine.

    If jiffies stop being updated, a nohz_full CPU may end up endlessly
    programming the next tick in the past, taking the last jiffies update
    monotonic timestamp as a stale base, resulting in a tick storm.

    Here is a scenario where it matters:

    0) CPU 0 is the timekeeper and CPU 1 a nohz_full CPU.

    1) A stop machine callback is queued to execute somewhere.

    2) CPU 0 reaches MULTI_STOP_DISABLE_IRQ while CPU 1 is still in
       MULTI_STOP_PREPARE. Hence CPU 0 can't do its timekeeping duty. CPU 1
       can still take IRQs.

    3) CPU 1 receives an IRQ which queues a timer callback one jiffy forward.

    4) On IRQ exit, CPU 1 schedules the tick one jiffy forward, taking
       last_jiffies_update as a base. But last_jiffies_update hasn't been
       updated for 2 jiffies since the timekeeper has interrupts disabled.

    5) clockevents_program_event(), which relies on ktime_get(), observes
       that the expiration is in the past and therefore programs the min
       delta event on the clock.

    6) The tick fires immediately, goto 3)

    7) Tick storm: the nohz_full CPU is drowned and takes ages to reach
       MULTI_STOP_DISABLE_IRQ, which is the only way out of this situation.

    Solve this with unconditionally updating jiffies if the value is stale
    on nohz_full IRQ entry. IRQs and other disturbances are expected to be
    rare enough on nohz_full for the unconditional call to ktime_get() to
    actually matter.

    Reported-by: Paul E. McKenney <paulmck@kernel.org>
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Paul E. McKenney <paulmck@kernel.org>
    Link: https://lore.kernel.org/r/20211026141055.57358-2-frederic@kernel.org

Signed-off-by: Phil Auld <pauld@redhat.com>
2022-06-01 13:54:12 -04:00
Prarit Bhargava 422a09ac44 genirq: Change force_irqthreads to a static key
Bugzilla: http://bugzilla.redhat.com/2023084

commit 91cc470e797828d779cd4c1efbe8519bcb358bae
Author: Tanner Love <tannerlove@google.com>
Date:   Wed Jun 2 14:03:38 2021 -0400

    genirq: Change force_irqthreads to a static key

    With CONFIG_IRQ_FORCED_THREADING=y, testing the boolean force_irqthreads
    could incur a cache line miss in invoke_softirq() and other places.

    Replace the test with a static key to avoid the potential cache miss.

    [ tglx: Dropped the IDE part, removed the export and updated blk-mq ]

    Suggested-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Tanner Love <tannerlove@google.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Link: https://lore.kernel.org/r/20210602180338.3324213-1-tannerlove.kernel@gmail.com

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
2022-01-11 18:23:31 -05:00
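A sketch of the pattern: the boolean becomes a static key, so the fast path is a patched branch rather than a memory load:

    #include <linux/jump_label.h>

    DEFINE_STATIC_KEY_FALSE(force_irqthreads_key);

    #define force_irqthreads()  (static_branch_unlikely(&force_irqthreads_key))

    static int __init setup_forced_irqthreads(char *arg)
    {
        static_branch_enable(&force_irqthreads_key);
        return 0;
    }
    early_param("threadirqs", setup_forced_irqthreads);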
Peter Zijlstra b03fbd4ff2 sched: Introduce task_is_running()
Replace a bunch of 'p->state == TASK_RUNNING' with a new helper:
task_is_running(p).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210611082838.222401495@infradead.org
2021-06-18 11:43:07 +02:00
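A sketch of the helper (as defined before the later ->state to ->__state rename), plus a hypothetical call-site conversion:

    #include <linux/sched.h>

    #define task_is_running(task)  (READ_ONCE((task)->state) == TASK_RUNNING)

    static void example_check(struct task_struct *p)   /* hypothetical caller */
    {
        if (task_is_running(p))   /* was: p->state == TASK_RUNNING */
            pr_debug("task %d is running or runnable\n", p->pid);
    }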
Peter Zijlstra 37aadc687a sched: Unbreak wakeups
Remove broken task->state references and let wake_up_process() DTRT.

The anti-pattern in these patches breaks the ordering of ->state vs
COND as described in the comment near set_current_state() and can lead
to missed wakeups:

	(OoO load, observes RUNNING)<-.
	for (;;) {                    |
	  t->state = UNINTERRUPTIBLE; |
	  smp_mb();          ,----->  | (observes !COND)
                             |        /
	  if (COND) ---------'       |	COND = 1;
		break;		     `- if (t->state != RUNNING)
					  wake_up_process(t); // not done
	  schedule(); // forever waiting
	}
	t->state = TASK_RUNNING;

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210611082838.160855222@infradead.org
2021-06-18 11:43:06 +02:00
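For contrast, a generic sketch of the non-broken idiom: the waiter sets its state before checking the condition, and the waker calls wake_up_process() unconditionally:

    #include <linux/sched.h>

    static bool cond;
    static struct task_struct *waiter_task;

    static void waiter(void)
    {
        for (;;) {
            /* store ->state, then full barrier, then load COND */
            set_current_state(TASK_UNINTERRUPTIBLE);
            if (READ_ONCE(cond))
                break;
            schedule();
        }
        __set_current_state(TASK_RUNNING);
    }

    static void waker(void)
    {
        WRITE_ONCE(cond, true);
        wake_up_process(waiter_task);  /* no ->state peeking: DTRT */
    }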
Linus Torvalds 9a45da9270 RCU changes for this cycle were:
  - Bitmap support for "N" as alias for last bit
  - kvfree_rcu updates
  - mm_dump_obj() updates.  (One of these is to mm, but was suggested by Andrew Morton.)
  - RCU callback offloading update
  - Polling RCU grace-period interfaces
  - Realtime-related RCU updates
  - Tasks-RCU updates
  - Torture-test updates
  - Torture-test scripting updates
  - Miscellaneous fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'core-rcu-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU updates from Ingo Molnar:

 - Support for "N" as alias for last bit in bitmap parsing library (eg
   using syntax like "nohz_full=2-N")

 - kvfree_rcu updates

 - mm_dump_obj() updates. (One of these is to mm, but was suggested by
   Andrew Morton.)

 - RCU callback offloading update

 - Polling RCU grace-period interfaces

 - Realtime-related RCU updates

 - Tasks-RCU updates

 - Torture-test updates

 - Torture-test scripting updates

 - Miscellaneous fixes

* tag 'core-rcu-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
  rcutorture: Test start_poll_synchronize_rcu() and poll_state_synchronize_rcu()
  rcu: Provide polling interfaces for Tiny RCU grace periods
  torture: Fix kvm.sh --datestamp regex check
  torture: Consolidate qemu-cmd duration editing into kvm-transform.sh
  torture: Print proper vmlinux path for kvm-again.sh runs
  torture: Make TORTURE_TRUST_MAKE available in kvm-again.sh environment
  torture: Make kvm-transform.sh update jitter commands
  torture: Add --duration argument to kvm-again.sh
  torture: Add kvm-again.sh to rerun a previous torture-test
  torture: Create a "batches" file for build reuse
  torture: De-capitalize TORTURE_SUITE
  torture: Make upper-case-only no-dot no-slash scenario names official
  torture: Rename SRCU-t and SRCU-u to avoid lowercase characters
  torture: Remove no-mpstat error message
  torture: Record kvm-test-1-run.sh and kvm-test-1-run-qemu.sh PIDs
  torture: Record jitter start/stop commands
  torture: Extract kvm-test-1-run-qemu.sh from kvm-test-1-run.sh
  torture: Record TORTURE_KCONFIG_GDB_ARG in qemu-cmd
  torture: Abstract jitter.sh start/stop into scripts
  rcu: Provide polling interfaces for Tree RCU grace periods
  ...
2021-04-28 12:00:13 -07:00
Thomas Gleixner 47c218dcae tick/sched: Prevent false positive softirq pending warnings on RT
On RT, a task which has soft interrupts disabled can block on a lock and
schedule out to idle while soft interrupts are pending. This triggers the
warning in the NOHZ idle code which complains about going idle with pending
soft interrupts. But as the task is blocked, soft interrupt processing is
temporarily blocked as well, which means that such a warning is a false
positive.

To prevent that, check the per-CPU state which indicates that a scheduled
out task has soft interrupts disabled.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210309085727.527563866@linutronix.de
2021-03-17 16:34:11 +01:00
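A sketch of the check: a per-CPU counter records that a scheduled-out task has softirqs disabled, and the NOHZ warning consults it:

    /*
     * kernel/softirq.c (PREEMPT_RT): true if a scheduled-out task still
     * holds the per-CPU softirq disable count. Softirq processing is then
     * temporarily blocked, so the NOHZ idle warning adds
     * !local_bh_blocked() to its condition to avoid the false positive.
     */
    bool local_bh_blocked(void)
    {
        return __this_cpu_read(softirq_ctrl.cnt) != 0;
    }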
Thomas Gleixner 8b1c04acad softirq: Make softirq control and processing RT aware
Provide a local lock based serialization for soft interrupts on RT which
allows the local_bh_disabled() sections and servicing soft interrupts to be
preemptible.

Provide the necessary inline helpers which allow to reuse the bulk of the
softirq processing code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210309085727.426370483@linutronix.de
2021-03-17 16:34:10 +01:00
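A sketch of the per-CPU local lock this serialization is built on:

    struct softirq_ctrl {
        local_lock_t lock;
        int          cnt;   /* softirq disable depth of the owner */
    };

    static DEFINE_PER_CPU(struct softirq_ctrl, softirq_ctrl) = {
        .lock = INIT_LOCAL_LOCK(softirq_ctrl.lock),
    };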
Thomas Gleixner f02fc963e9 softirq: Move various protections into inline helpers
To allow reuse of the bulk of softirq processing code for RT and to avoid
#ifdeffery all over the place, split protections for various code sections
out into inline helpers so the RT variant can just replace them in one go.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210309085727.310118772@linutronix.de
2021-03-17 16:34:10 +01:00
Thomas Gleixner eb2dafbba8 tasklets: Prevent tasklet_unlock_spin_wait() deadlock on RT
tasklet_unlock_spin_wait() spin waits for the TASKLET_STATE_SCHED bit in
the tasklet state to be cleared. This works on !RT nicely because the
corresponding execution can only happen on a different CPU.

On RT softirq processing is preemptible, therefore a task preempting the
softirq processing thread can spin forever.

Prevent this by invoking local_bh_disable()/enable() inside the loop. In
case that the softirq processing thread was preempted by the current task,
current will block on the local lock which yields the CPU to the preempted
softirq processing thread. If the tasklet is processed on a different CPU
then the local_bh_disable()/enable() pair is just a waste of processor
cycles.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210309084241.988908275@linutronix.de
2021-03-17 16:33:57 +01:00
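A sketch of the resulting wait loop:

    void tasklet_unlock_spin_wait(struct tasklet_struct *t)
    {
        while (test_bit(TASKLET_STATE_SCHED, &t->state)) {
            if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
                /*
                 * Blocking on the local lock yields the CPU to a
                 * preempted softirq processing thread; on a different
                 * CPU this pair is merely wasted cycles.
                 */
                local_bh_disable();
                local_bh_enable();
            } else {
                cpu_relax();
            }
        }
    }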
Peter Zijlstra 697d8c63c4 tasklets: Replace spin wait in tasklet_kill()
tasklet_kill() spin waits for TASKLET_STATE_SCHED to be cleared invoking
yield() from inside the loop. yield() is an ill defined mechanism and the
result might still be wasting CPU cycles in a tight loop which is
especially painful in a guest when the CPU running the tasklet is scheduled
out.

tasklet_kill() is used in teardown paths and not performance critical at
all. Replace the spin wait with wait_var_event().

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210309084241.890532921@linutronix.de
2021-03-17 16:33:57 +01:00
Peter Zijlstra da04474740 tasklets: Replace spin wait in tasklet_unlock_wait()
tasklet_unlock_wait() spin waits for TASKLET_STATE_RUN to be cleared. This
is wasting CPU cycles in a tight loop which is especially painful in a
guest when the CPU running the tasklet is scheduled out.

tasklet_unlock_wait() is invoked from tasklet_kill() which is used in
teardown paths and not performance critical at all. Replace the spin wait
with wait_var_event().

There are no users of tasklet_unlock_wait() which are invoked from atomic
contexts. The usage in tasklet_disable() has been replaced temporarily with
the spin waiting variant until the atomic users are fixed up and will be
converted to the sleep wait variant later.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210309084241.783936921@linutronix.de
2021-03-17 16:33:55 +01:00
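A sketch of the sleep-wait pairing that replaces the spin: the unlock side wakes waiters on the state word, and the wait side sleeps on it:

    #include <linux/wait_bit.h>

    void tasklet_unlock(struct tasklet_struct *t)
    {
        smp_mb__before_atomic();
        clear_bit(TASKLET_STATE_RUN, &t->state);
        smp_mb__after_atomic();
        wake_up_var(&t->state);
    }

    void tasklet_unlock_wait(struct tasklet_struct *t)
    {
        wait_var_event(&t->state, !test_bit(TASKLET_STATE_RUN, &t->state));
    }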
Dirk Behme 6b2c339df9 softirq: s/BUG/WARN_ONCE/ on tasklet SCHED state not set
Replace BUG() with WARN_ONCE() on wrong tasklet state, in order to:

 - increase the verbosity / aid in debugging
 - avoid fatal/unrecoverable state

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210317102012.32399-1-erosca@de.adit-jv.com
2021-03-17 12:59:35 +01:00
Davidlohr Bueso 3a0ade0c52 tasklet: Remove tasklet_kill_immediate
Ever since RCU was converted to softirq, tasklet_kill_immediate() has had no users.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20210306213658.12862-1-dave@stgolabs.net
2021-03-16 15:06:31 +01:00
Paul E. McKenney 1c0c4bc1ce softirq: Don't try waking ksoftirqd before it has been spawned
If there is heavy softirq activity, the softirq system will attempt
to awaken ksoftirqd and will stop the traditional back-of-interrupt
softirq processing.  This is all well and good, but only if the
ksoftirqd kthreads already exist, which is not the case during early
boot, in which case the system hangs.

One reproducer is as follows:

tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 2 --configs "TREE03" --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y CONFIG_NO_HZ_IDLE=y CONFIG_HZ_PERIODIC=n" --bootargs "threadirqs=1" --trust-make

This commit therefore adds a couple of existence checks for ksoftirqd
and forces back-of-interrupt softirq processing when ksoftirqd does not
yet exist.  With this change, the above test passes.

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reported-by: Uladzislau Rezki <urezki@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
[ paulmck: Remove unneeded check per Sebastian Siewior feedback. ]
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-03-15 13:51:48 -07:00
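A sketch of the existence check in invoke_softirq() (as of this era, before force_irqthreads became a static key):

    static inline void invoke_softirq(void)
    {
        if (ksoftirqd_running(local_softirq_pending()))
            return;

        /*
         * The added check: during early boot the per-CPU ksoftirqd
         * kthread does not exist yet, so fall back to back-of-interrupt
         * processing instead of trying to wake a missing kthread.
         */
        if (!force_irqthreads || !__this_cpu_read(ksoftirqd))
            do_softirq_own_stack();
        else
            wakeup_softirqd();
    }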
Thomas Gleixner db1cc7aede softirq: Move do_softirq_own_stack() to generic asm header
To avoid include recursion hell move the do_softirq_own_stack() related
content into a generic asm header and include it from all places in arch/
which need the prototype.

This allows architectures to provide an inline implementation of
do_softirq_own_stack() without introducing a lot of #ifdeffery all over the
place.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210210002513.289960691@linutronix.de
2021-02-10 23:34:16 +01:00
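A sketch of the generic header's shape (the exact config guard name has varied across kernel versions): a prototype when the arch switches stacks itself, an inline fallback otherwise.

    /* include/asm-generic/softirq_stack.h (sketch) */
    #ifndef __ASM_GENERIC_SOFTIRQ_STACK_H
    #define __ASM_GENERIC_SOFTIRQ_STACK_H

    #ifdef CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK
    void do_softirq_own_stack(void);
    #else
    static inline void do_softirq_own_stack(void)
    {
        __do_softirq();
    }
    #endif

    #endif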
Linus Torvalds 6be5f58215 Misc fixes/updates:
 - Fix static keys usage in module __init sections
 - Add separate MAINTAINERS entry for static branches/calls
 - Fix lockdep splat with CONFIG_PREEMPTIRQ_EVENTS=y tracing
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'locking-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Ingo Molnar:
 "Misc fixes/updates:

   - Fix static keys usage in module __init sections

   - Add separate MAINTAINERS entry for static branches/calls

   - Fix lockdep splat with CONFIG_PREEMPTIRQ_EVENTS=y tracing"

* tag 'locking-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  softirq: Avoid bad tracing / lockdep interaction
  jump_label/static_call: Add MAINTAINERS
  jump_label: Fix usage in module __init
2020-12-27 09:06:10 -08:00
Peter Zijlstra 91ea62d58b softirq: Avoid bad tracing / lockdep interaction
Similar to commit:

  1a63dcd876 ("softirq: Reorder trace_softirqs_on to prevent lockdep splat")

__local_bh_enable_ip() can also call into tracing with inconsistent
state. Unlike that commit we don't need to bother about the tracepoint
because 'cnt-1' never matches preempt_count() (by construction).

Reported-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lkml.kernel.org/r/20201218154519.GW3092@hirez.programming.kicks-ass.net
2020-12-18 16:53:13 +01:00
Frederic Weisbecker d14ce74f1f irq: Call tick_irq_enter() inside HARDIRQ_OFFSET
Now that account_hardirq_enter() is called after HARDIRQ_OFFSET has
been incremented, there is nothing left that prevents us from also
moving tick_irq_enter() after HARDIRQ_OFFSET is incremented.

The desired outcome is to remove the nasty hack that prevents softirqs
from being raised through ksoftirqd instead of the hardirq bottom half.
Also tick_irq_enter() then becomes appropriately covered by lockdep.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20201202115732.27827-6-frederic@kernel.org
2020-12-02 20:20:05 +01:00
Frederic Weisbecker d3759e7184 irqtime: Move irqtime entry accounting after irq offset incrementation
IRQ time entry is currently accounted before HARDIRQ_OFFSET or
SOFTIRQ_OFFSET are incremented. This is convenient to decide to which
index the cputime to account is dispatched.

Unfortunately it prevents tick_irq_enter() from being called under
HARDIRQ_OFFSET because tick_irq_enter() has to be called before the IRQ
entry accounting due to the necessary clock catch up. As a result we
don't benefit from appropriate lockdep coverage on tick_irq_enter().

To prepare for fixing this, move the IRQ entry cputime accounting after
the preempt offset is incremented. This requires the cputime dispatch
code to handle the extra offset.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20201202115732.27827-5-frederic@kernel.org
2020-12-02 20:20:05 +01:00
Thomas Gleixner ae9ef58996 softirq: Move related code into one section
To prepare for adding a RT aware variant of softirq serialization and
processing move related code into one section so the necessary #ifdeffery
is reduced to one.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20201113141733.974214480@linutronix.de
2020-11-23 10:31:06 +01:00
Jiafei Pan cdabce2e3d softirq: Add debug check to __raise_softirq_irqoff()
__raise_softirq_irqoff() must be called with interrupts disabled to protect
the per CPU softirq pending state update against an interrupt and soft
interrupt handling on return from interrupt.

Add a lockdep assertion to validate the calling convention.

[ tglx: Massaged changelog ]

Signed-off-by: Jiafei Pan <Jiafei.Pan@nxp.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200814045522.45719-1-Jiafei.Pan@nxp.com
2020-09-16 15:18:56 +02:00
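With the assertion added, the function reads roughly:

    void __raise_softirq_irqoff(unsigned int nr)
    {
        lockdep_assert_irqs_disabled();   /* the added debug check */
        trace_softirq_raise(nr);
        or_softirq_pending(1UL << nr);
    }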
Linus Torvalds 427714f258 tasklets API update for v5.9-rc1
- Prepare for tasklet API modernization (Romain Perier, Allen Pais, Kees Cook)

Merge tag 'tasklets-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull tasklets API update from Kees Cook:
 "These are the infrastructure updates needed to support converting the
  tasklet API to something more modern (and hopefully for removal
  further down the road).

  There is a 300-patch series waiting in the wings to get set out to
  subsystem maintainers, but these changes need to be present in the
  kernel first. Since this has some treewide changes, I carried this
  series for -next instead of paining Thomas with it in -tip, but it's
  got his Ack.

  This is similar to the timer_struct modernization from a while back,
  but not nearly as messy (I hope). :)

   - Prepare for tasklet API modernization (Romain Perier, Allen Pais,
     Kees Cook)"

* tag 'tasklets-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  tasklet: Introduce new initialization API
  treewide: Replace DECLARE_TASKLET() with DECLARE_TASKLET_OLD()
  usb: gadget: udc: Avoid tasklet passing a global
2020-08-04 13:40:35 -07:00
Romain Perier 12cc923f1c tasklet: Introduce new initialization API
Nowadays, modern kernel subsystems that use callbacks pass the data
structure associated with a given callback as argument to the callback.
The tasklet subsystem remains one which passes an arbitrary unsigned
long to the callback function. This has several problems:

- This keeps an extra field for storing the argument in each tasklet
  data structure, it bloats the tasklet_struct structure with a redundant
  .data field

- No type checking can be performed on this argument. Instead of
  using container_of() like other callback subsystems, it forces callbacks
  to do explicit type cast of the unsigned long argument into the required
  object type.

- Buffer overflows can overwrite the .func and the .data field, so
  an attacker can easily overwrite the function and its first argument
  to whatever it wants.

Add a new tasklet initialization API, via DECLARE_TASKLET() and
tasklet_setup(), which will replace the existing ones.

This work is greatly inspired by the timer_struct conversion series,
see commit e99e88a9d2 ("treewide: setup_timer() -> timer_setup()")

To avoid problems with both -Wcast-function-type (which is enabled in
the kernel via -Wextra in several subsystems), and with mismatched
function prototypes when built with Control Flow Integrity enabled,
this adds the "use_callback" member to let the tasklet caller choose
which union member to call through. Once all old API uses are removed,
this and the .data member will be removed as well. (On 64-bit this does
not grow the struct size as the new member fills the hole after atomic_t,
which is also "int" sized.)

Signed-off-by: Romain Perier <romain.perier@gmail.com>
Co-developed-by: Allen Pais <allen.lkml@gmail.com>
Signed-off-by: Allen Pais <allen.lkml@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-30 11:16:01 -07:00
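A sketch of new-style usage: the callback receives the tasklet pointer and recovers its container via from_tasklet(), a container_of() wrapper from the same API series; all names below are hypothetical:

    #include <linux/interrupt.h>

    struct foo {
        int pending;
        struct tasklet_struct tl;
    };

    static void foo_tasklet_fn(struct tasklet_struct *t)
    {
        struct foo *f = from_tasklet(f, t, tl);   /* container_of() */

        f->pending = 0;
    }

    static void foo_init(struct foo *f)
    {
        tasklet_setup(&f->tl, foo_tasklet_fn);    /* no unsigned long data */
        tasklet_schedule(&f->tl);
    }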
Peter Zijlstra f9ad4a5f3f lockdep: Remove lockdep_hardirq{s_enabled,_context}() argument
Now that the macros use per-cpu data, we no longer need the argument.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200623083721.571835311@infradead.org
2020-07-10 12:00:02 +02:00
Peter Zijlstra a21ee6055c lockdep: Change hardirq{s_enabled,_context} to per-cpu variables
Currently all IRQ-tracking state is in task_struct; this means that
task_struct needs to be defined before we use it.

Especially for lockdep_assert_irq*() this can lead to header-hell.

Move the hardirq state into per-cpu variables to avoid the task_struct
dependency.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200623083721.512673481@infradead.org
2020-07-10 12:00:02 +02:00
Peter Zijlstra 59bc300b71 x86/entry: Clarify irq_{enter,exit}_rcu()
Because:

  irq_enter_rcu() includes lockdep_hardirq_enter()
  irq_exit_rcu() does *NOT* include lockdep_hardirq_exit()

Which resulted in two 'stray' lockdep_hardirq_exit() calls in
idtentry.h, and me spending a long time trying to find the matching
enter calls.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200529213321.359433429@infradead.org
2020-06-11 15:15:24 +02:00
Thomas Gleixner 8a6bc4787f genirq: Provide irq_enter/exit_rcu()
irq_enter()/exit() currently include RCU handling. To properly separate the RCU
handling code, provide variants which contain only the non-RCU related
functionality.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/20200521202117.567023613@linutronix.de
2020-06-11 15:15:06 +02:00
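A sketch of the split: irq_enter() keeps the RCU call and delegates the rest to the new _rcu() variant:

    void irq_enter_rcu(void)
    {
        /* Everything irq_enter() did, minus the RCU handling. */
        if (is_idle_task(current) && !in_interrupt()) {
            local_bh_disable();
            tick_irq_enter();
            _local_bh_enable();
        }
        __irq_enter();
    }

    void irq_enter(void)
    {
        rcu_irq_enter();
        irq_enter_rcu();
    }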
Peter Zijlstra ef996916e7 lockdep: Rename trace_{hard,soft}{irq_context,irqs_enabled}()
Continue what commit:

  d820ac4c2f ("locking: rename trace_softirq_[enter|exit] => lockdep_softirq_[enter|exit]")

started, rename these to avoid confusing them with tracepoints.

git grep -l "trace_\(soft\|hard\)\(irq_context\|irqs_enabled\)" | while read file;
do
	sed -ie 's/trace_\(soft\|hard\)\(irq_context\|irqs_enabled\)/lockdep_\1\2/g' $file;
done

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200320115859.178626842@infradead.org
2020-03-21 16:03:54 +01:00
Peter Zijlstra 0d38453c85 lockdep: Rename trace_softirqs_{on,off}()
Continue what commit:

  d820ac4c2f ("locking: rename trace_softirq_[enter|exit] => lockdep_softirq_[enter|exit]")

started, rename these to avoid confusing them with tracepoints.

git grep -l "trace_softirqs_\(on\|off\)" | while read file;
do
	sed -ie 's/trace_softirqs_\(on\|off\)/lockdep_softirqs_\1/g' $file;
done

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200320115859.119434738@infradead.org
2020-03-21 16:03:54 +01:00
Thomas Gleixner 2502ec37a7 lockdep: Rename trace_hardirq_{enter,exit}()
Continue what commit:

  d820ac4c2f ("locking: rename trace_softirq_[enter|exit] => lockdep_softirq_[enter|exit]")

started, rename these to avoid confusing them with tracepoints.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/20200320115859.060481361@infradead.org
2020-03-21 16:03:53 +01:00
Linus Torvalds 2a1ccd3142 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "The irq departement provides the usual mixed bag:

  Core:

   - Further improvements to the irq timings code which aims to predict
     the next interrupt for power state selection to achieve better
     latency/power balance

   - Add interrupt statistics to the core NMI handlers

   - The usual small fixes and cleanups

  Drivers:

   - Support for Renesas RZ/A1, Annapurna Labs FIC, Meson-G12A SoC and
     Amazon Gravition AMR/GIC interrupt controllers.

   - Rework of the Renesas INTC controller driver

   - ACPI support for Socionext SoCs

   - Enhancements to the CSKY interrupt controller

   - The usual small fixes and cleanups"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
  irq/irqdomain: Fix comment typo
  genirq: Update irq stats from NMI handlers
  irqchip/gic-pm: Remove PM_CLK dependency
  irqchip/al-fic: Introduce Amazon's Annapurna Labs Fabric Interrupt Controller Driver
  dt-bindings: interrupt-controller: Add Amazon's Annapurna Labs FIC
  softirq: Use __this_cpu_write() in takeover_tasklets()
  irqchip/mbigen: Stop printing kernel addresses
  irqchip/gic: Add dependency for ARM_GIC_MAX_NR
  genirq/affinity: Remove unused argument from [__]irq_build_affinity_masks()
  genirq/timings: Add selftest for next event computation
  genirq/timings: Add selftest for irqs circular buffer
  genirq/timings: Add selftest for circular array
  genirq/timings: Encapsulate storing function
  genirq/timings: Encapsulate timings push
  genirq/timings: Optimize the period detection speed
  genirq/timings: Fix timings buffer inspection
  genirq/timings: Fix next event index function
  irqchip/qcom: Use struct_size() in devm_kzalloc()
  irqchip/irq-csky-mpintc: Remove unnecessary loop in interrupt handler
  dt-bindings: interrupt-controller: Update csky mpintc
  ...
2019-07-08 11:01:13 -07:00
Muchun Song 8afecaa68d softirq: Use __this_cpu_write() in takeover_tasklets()
The code is executed with interrupts disabled, so it's safe to use
__this_cpu_write().

[ tglx: Massaged changelog ]

Signed-off-by: Muchun Song <smuchun@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: joel@joelfernandes.org
Cc: rostedt@goodmis.org
Cc: frederic@kernel.org
Cc: paulmck@linux.vnet.ibm.com
Cc: alexander.levin@verizon.com
Cc: peterz@infradead.org
Link: https://lkml.kernel.org/r/20190618143305.2038-1-smuchun@gmail.com
2019-06-23 18:14:27 +02:00
Thomas Gleixner 767a67b0b3 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 430
Based on 1 normalized pattern(s):

  distribute under gplv2

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 8 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190114.475576622@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:37:16 +02:00
Thomas Gleixner d7dcf26ff0 softirq: Remove tasklet_hrtimer
There are no more users of this interface.
Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Link: https://lkml.kernel.org/r/20190301224821.29843-4-bigeasy@linutronix.de
2019-03-22 14:36:02 +01:00
Matthias Kaehlcke 1342d8080f softirq: Don't skip softirq execution when softirq thread is parking
When a CPU is unplugged the kernel threads of this CPU are parked (see
smpboot_park_threads()). kthread_park() is used to mark each thread as
parked and wake it up, so it can complete the process of parking itself
(see smpboot_thread_fn()).

If local softirqs are pending on interrupt exit, invoke_softirq() is called
to process the softirqs; however, it skips processing when the softirq
kernel thread of the local CPU is scheduled to run. The softirq kthread is
one of the threads that is parked when a CPU is unplugged. Parking the
kthread wakes it up, however only to complete the parking process, not to
process the pending softirqs. Hence processing of softirqs at the end of an
interrupt is skipped, but not done elsewhere, which can result in warnings
about pending softirqs when a CPU is unplugged:

/sys/devices/system/cpu # echo 0 > cpu4/online
[ ... ] NOHZ: local_softirq_pending 02
[ ... ] NOHZ: local_softirq_pending 202
[ ... ] CPU4: shutdown
[ ... ] psci: CPU4 killed.

Don't skip processing of softirqs at the end of an interrupt when the
softirq thread of the CPU is parking.

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Stephen Boyd <swboyd@chromium.org>
Link: https://lkml.kernel.org/r/20190128234625.78241-3-mka@chromium.org
2019-02-10 21:51:39 +01:00
Linus Torvalds 5947a64a7e Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "The interrupt brigade came up with the following updates:

   - Driver for the Marvell System Error Interrupt machinery

   - Overhaul of the GIC-V3 ITS driver

   - Small updates and fixes all over the place"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  genirq: Fix race on spurious interrupt detection
  softirq: Fix typo in __do_softirq() comments
  genirq: Fix grammar s/an /a /
  irqchip/gic: Unify GIC priority definitions
  irqchip/gic-v3: Remove acknowledge loop
  dt-bindings/interrupt-controller: Add documentation for Marvell SEI controller
  dt-bindings/interrupt-controller: Update Marvell ICU bindings
  irqchip/irq-mvebu-icu: Add support for System Error Interrupts (SEI)
  arm64: marvell: Enable SEI driver
  irqchip/irq-mvebu-sei: Add new driver for Marvell SEI
  irqchip/irq-mvebu-icu: Support ICU subnodes
  irqchip/irq-mvebu-icu: Disociate ICU and NSR
  irqchip/irq-mvebu-icu: Clarify the reset operation of configured interrupts
  irqchip/irq-mvebu-icu: Fix wrong private data retrieval
  dt-bindings/interrupt-controller: Fix Marvell ICU length in the example
  genirq/msi: Allow creation of a tree-based irqdomain for platform-msi
  dt-bindings: irqchip: renesas-irqc: Document r8a7744 support
  dt-bindings: irqchip: renesas-irqc: Document R-Car E3 support
  irqchip/pdc: Setup all edge interrupts as rising edge at GIC
  irqchip/gic-v3-its: Allow use of LPI tables in reserved memory
  ...
2018-10-25 11:43:47 -07:00
Yangtao Li e45506ac0a softirq: Fix typo in __do_softirq() comments
s/s/as

[ mingo: Also add a missing 'the', add proper punctuation and clarify what 'swap' means here. ]

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: alexander.levin@verizon.com
Cc: frederic@kernel.org
Cc: joel@joelfernandes.org
Cc: paulmck@linux.vnet.ibm.com
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/20181018142133.12341-1-tiny.windzz@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-18 18:10:23 +02:00
Paul E. McKenney 65cfe3583b rcu: Define RCU-bh update API in terms of RCU
Now that the main RCU API knows about softirq disabling and softirq's
quiescent states, the RCU-bh update code can be dispensed with.
This commit therefore removes the RCU-bh update-side implementation and
defines RCU-bh's update-side API in terms of that of either RCU-preempt or
RCU-sched, depending on the setting of the CONFIG_PREEMPT Kconfig option.

In kernels built with CONFIG_RCU_NOCB_CPU=y this has the knock-on effect
of reducing by one the number of rcuo kthreads per CPU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-08-30 16:02:40 -07:00
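Schematically, the update-side API collapses to wrappers; a rough sketch of the idea only, since the exact mapping depends on CONFIG_PREEMPT as the message notes:

    /* RCU-bh update-side calls become thin wrappers over RCU proper. */
    static inline void call_rcu_bh(struct rcu_head *head, rcu_callback_t func)
    {
        call_rcu(head, func);
    }

    static inline void synchronize_rcu_bh(void)
    {
        synchronize_rcu();
    }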