summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* powerpc: fix i8042 module build errorGrant Likely2010-08-071-0/+2
| | | | | | of_i8042_{kbd,aux}_irq needs to be exported Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
* sound/soc: mpc5200_psc_ac97: Use gpio pins for cold resetEric Millbrandt2010-08-071-4/+18
| | | | | | | | | | | | | | | Call the gpio reset platform function instead of using the flawed ac97 functionality of the MPC5200(b) From MPC5200B User's Manual: "Some AC97 devices goes to a test mode, if the Sync line is high during the Res line is low (reset phase). To avoid this behavior the Sync line must be also forced to zero during the reset phase. To do that, the pin muxing should switch to GPIO mode and the GPIO control register should be used to control the output lines." Signed-off-by: Eric Millbrandt <emillbrandt@dekaresearch.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
* powerpc/5200: add mpc5200_psc_ac97_gpio_resetEric Millbrandt2010-08-073-0/+108
| | | | | | | | | | | | | | | | | | Work around a silicon bug in the ac97 reset functionality of the mpc5200(b). The implementation of the ac97 "cold" reset is flawed. If the sync and output lines are high when reset is asserted the attached ac97 device may go into test mode. Avoid this by reconfiguring the psc to gpio mode and generating the reset manually. From MPC5200B User's Manual: "Some AC97 devices goes to a test mode, if the Sync line is high during the Res line is low (reset phase). To avoid this behavior the Sync line must be also forced to zero during the reset phase. To do that, the pin muxing should switch to GPIO mode and the GPIO control register should be used to control the output lines." Signed-off-by: Eric Millbrandt <emillbrandt@dekaresearch.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
* Merge branch 'irq-core-for-linus' of ↵Linus Torvalds2010-08-066-9/+18
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: xen: Do not suspend IPI IRQs. powerpc: Use IRQF_NO_SUSPEND not IRQF_TIMER for non-timer interrupts ixp4xx-beeper: Use IRQF_NO_SUSPEND not IRQF_TIMER for non-timer interrupt irq: Add new IRQ flag IRQF_NO_SUSPEND
| * xen: Do not suspend IPI IRQs.Ian Campbell2010-07-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In general the semantics of IPIs are that they are are expected to continue functioning after dpm_suspend_noirq(). Specifically I have seen a deadlock between the callfunc IPI and the stop machine used by xen's do_suspend() routine. If one CPU has already called dpm_suspend_noirq() then there is a window where it can be sent a callfunc IPI before all the other CPUs have entered stop_cpu(). If this happens then the first CPU ends up spinning in stop_cpu() waiting for the other to rendezvous in state STOPMACHINE_PREPARE while the other is spinning in csd_lock_wait(). Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: xen-devel@lists.xensource.com LKML-Reference: <1280398595-29708-4-git-send-email-ian.campbell@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * powerpc: Use IRQF_NO_SUSPEND not IRQF_TIMER for non-timer interruptsIan Campbell2010-07-292-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kw_i2c_irq and via_pmu_interrupt are not timer interrupts and therefore should not use IRQF_TIMER. Use the recently introduced IRQF_NO_SUSPEND instead since that is the actual desired behaviour. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: linuxppc-dev@ozlabs.org Cc: devicetree-discuss@lists.ozlabs.org LKML-Reference: <1280398595-29708-3-git-send-email-ian.campbell@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * ixp4xx-beeper: Use IRQF_NO_SUSPEND not IRQF_TIMER for non-timer interruptIan Campbell2010-07-291-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | ixp4xx_spkr_interrupt is not a timer interrupt and therefore should not use IRQF_TIMER. Use the recently introduced IRQF_NO_SUSPEND instead since that is the actual desired behaviour. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: linux-input@vger.kernel.org LKML-Reference: <1280398595-29708-2-git-send-email-ian.campbell@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * irq: Add new IRQ flag IRQF_NO_SUSPENDIan Campbell2010-07-292-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A small number of users of IRQF_TIMER are using it for the implied no suspend behaviour on interrupts which are not timer interrupts. Therefore add a new IRQF_NO_SUSPEND flag, rename IRQF_TIMER to __IRQF_TIMER and redefine IRQF_TIMER in terms of these new flags. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: xen-devel@lists.xensource.com Cc: linux-input@vger.kernel.org Cc: linuxppc-dev@ozlabs.org Cc: devicetree-discuss@lists.ozlabs.org LKML-Reference: <1280398595-29708-1-git-send-email-ian.campbell@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | Merge branch 'timers-timekeeping-for-linus' of ↵Linus Torvalds2010-08-0646-262/+165Star
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-timekeeping-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: um: Fix read_persistent_clock fallout kgdb: Do not access xtime directly powerpc: Clean up obsolete code relating to decrementer and timebase powerpc: Rework VDSO gettimeofday to prevent time going backwards clocksource: Add __clocksource_updatefreq_hz/khz methods x86: Convert common clocksources to use clocksource_register_hz/khz timekeeping: Make xtime and wall_to_monotonic static hrtimer: Cleanup direct access to wall_to_monotonic um: Convert to use read_persistent_clock timkeeping: Fix update_vsyscall to provide wall_to_monotonic offset powerpc: Cleanup xtime usage powerpc: Simplify update_vsyscall time: Kill off CONFIG_GENERIC_TIME time: Implement timespec_add x86: Fix vtime/file timestamp inconsistencies Trivial conflicts in Documentation/feature-removal-schedule.txt Much less trivial conflicts in arch/powerpc/kernel/time.c resolved as per Thomas' earlier merge commit 47916be4e28c ("Merge branch 'powerpc.cherry-picks' into timers/clocksource")
| * | um: Fix read_persistent_clock falloutThomas Gleixner2010-08-031-3/+2Star
| | | | | | | | | | | | | | | | | | | | | commit 9f31f57(um: Convert to use read_persistent_clock) moved the code, but not the variable. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | kgdb: Do not access xtime directlyThomas Gleixner2010-07-291-1/+3
| | | | | | | | | | | | | | | | | | The xtime cleanup missed the kgdb access to xtime. Fix it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | Merge branch 'powerpc.cherry-picks' into timers/clocksourceThomas Gleixner2010-07-2811-363/+80Star
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/powerpc/kernel/time.c Reason: The powerpc next tree contains two commits which conflict with the timekeeping changes: 8fd63a9e powerpc: Rework VDSO gettimeofday to prevent time going backwards c1aa687d powerpc: Clean up obsolete code relating to decrementer and timebase John Stultz identified them and provided the conflict resolution. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| | * | powerpc: Clean up obsolete code relating to decrementer and timebasePaul Mackerras2010-07-287-153/+9Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the decrementer and timekeeping code was moved over to using the generic clockevents and timekeeping infrastructure, several variables and functions have been obsolete and effectively unused. This deletes them. In particular, wakeup_decrementer() is no longer needed since the generic code reprograms the decrementer as part of the process of resuming the timekeeping code, which happens during sysdev resume. Thus the wakeup_decrementer calls in the suspend_enter methods for 52xx platforms have been removed. The call in the powermac cpu frequency change code has been replaced by set_dec(1), which will cause a timer interrupt as soon as interrupts are enabled, and the generic code will then reprogram the decrementer with the correct value. This also simplifies the generic_suspend_en/disable_irqs functions and makes them static since they are not referenced outside time.c. The preempt_enable/disable calls are removed because the generic code has disabled all but the boot cpu at the point where these functions are called, so we can't be moved to another cpu. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| | * | powerpc: Rework VDSO gettimeofday to prevent time going backwardsPaul Mackerras2010-07-285-237/+99Star
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently it is possible for userspace to see the result of gettimeofday() going backwards by 1 microsecond, assuming that userspace is using the gettimeofday() in the VDSO. The VDSO gettimeofday() algorithm computes the time in "xsecs", which are units of 2^-20 seconds, or approximately 0.954 microseconds, using the algorithm now = (timebase - tb_orig_stamp) * tb_to_xs + stamp_xsec and then converts the time in xsecs to seconds and microseconds. The kernel updates the tb_orig_stamp and stamp_xsec values every tick in update_vsyscall(). If the length of the tick is not an integer number of xsecs, then some precision is lost in converting the current time to xsecs. For example, with CONFIG_HZ=1000, the tick is 1ms long, which is 1048.576 xsecs. That means that stamp_xsec will advance by either 1048 or 1049 on each tick. With the right conditions, it is possible for userspace to get (timebase - tb_orig_stamp) * tb_to_xs being 1049 if the kernel is slightly late in updating the vdso_datapage, and then for stamp_xsec to advance by 1048 when the kernel does update it, and for userspace to then see (timebase - tb_orig_stamp) * tb_to_xs being zero due to integer truncation. The result is that time appears to go backwards by 1 microsecond. To fix this we change the VDSO gettimeofday to use a new field in the VDSO datapage which stores the nanoseconds part of the time as a fractional number of seconds in a 0.32 binary fraction format. (Or put another way, as a 32-bit number in units of 0.23283 ns.) This is convenient because we can use the mulhwu instruction to convert it to either microseconds or nanoseconds. Since it turns out that computing the time of day using this new field is simpler than either using stamp_xsec (as gettimeofday does) or stamp_xtime.tv_nsec (as clock_gettime does), this converts both gettimeofday and clock_gettime to use the new field. The existing __do_get_tspec function is converted to use the new field and take a parameter in r7 that indicates the desired resolution, 1,000,000 for microseconds or 1,000,000,000 for nanoseconds. The __do_get_xsec function is then unused and is deleted. The new algorithm is now = ((timebase - tb_orig_stamp) << 12) * tb_to_xs + (stamp_xtime_seconds << 32) + stamp_sec_fraction with 'now' in units of 2^-32 seconds. That is then converted to seconds and either microseconds or nanoseconds with seconds = now >> 32 partseconds = ((now & 0xffffffff) * resolution) >> 32 The 32-bit VDSO code also makes a further simplification: it ignores the bottom 32 bits of the tb_to_xs value, which is a 0.64 format binary fraction. Doing so gets rid of 4 multiply instructions. Assuming a timebase frequency of 1GHz or less and an update interval of no more than 10ms, the upper 32 bits of tb_to_xs will be at least 4503599, so the error from ignoring the low 32 bits will be at most 2.2ns, which is more than an order of magnitude less than the time taken to do gettimeofday or clock_gettime on our fastest processors, so there is no possibility of seeing inconsistent values due to this. This also moves update_gtod() down next to its only caller, and makes update_vsyscall use the time passed in via the wall_time argument rather than accessing xtime directly. At present, wall_time always points to xtime, but that could change in future. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | clocksource: Add __clocksource_updatefreq_hz/khz methodsJohn Stultz2010-07-272-5/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To properly handle clocksources that change frequencies at the clocksource->enable() point, this patch adds a method that will update the clocksource's mult/shift and max_idle_ns values. Signed-off-by: John Stultz <johnstul@us.ibm.com> LKML-Reference: <1279068988-21864-12-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | x86: Convert common clocksources to use clocksource_register_hz/khzJohn Stultz2010-07-273-15/+12Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | This converts the most common of the x86 clocksources over to use clocksource_register_hz/khz. Signed-off-by: John Stultz <johnstul@us.ibm.com> LKML-Reference: <1279068988-21864-11-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | timekeeping: Make xtime and wall_to_monotonic staticJohn Stultz2010-07-273-14/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes xtime and wall_to_monotonic static, as planned in Documentation/feature-removal-schedule.txt. This will allow for further cleanups to the timekeeping core. Signed-off-by: John Stultz <johnstul@us.ibm.com> LKML-Reference: <1279068988-21864-10-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | hrtimer: Cleanup direct access to wall_to_monotonicJohn Stultz2010-07-273-6/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provides an accessor function to replace hrtimer.c's direct access of wall_to_monotonic. This will allow wall_to_monotonic to be made static as planned in Documentation/feature-removal-schedule.txt Signed-off-by: John Stultz <johnstul@us.ibm.com> LKML-Reference: <1279068988-21864-9-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | um: Convert to use read_persistent_clockJohn Stultz2010-07-271-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch converts the um arch to use read_persistent_clock(). This allows it to avoid accessing xtime and wall_to_monotonic directly. Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> LKML-Reference: <1279068988-21864-8-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | timkeeping: Fix update_vsyscall to provide wall_to_monotonic offsetJohn Stultz2010-07-276-19/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | update_vsyscall() did not provide the wall_to_monotoinc offset, so arch specific implementations tend to reference wall_to_monotonic directly. This limits future cleanups in the timekeeping core, so this patch fixes the update_vsyscall interface to provide wall_to_monotonic, allowing wall_to_monotonic to be made static as planned in Documentation/feature-removal-schedule.txt Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Anton Blanchard <anton@samba.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Tony Luck <tony.luck@intel.com> LKML-Reference: <1279068988-21864-7-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | powerpc: Cleanup xtime usageJohn Stultz2010-07-271-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This removes powerpc's direct xtime usage, allowing for further generic timeekeping cleanups Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> LKML-Reference: <1279068988-21864-6-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | powerpc: Simplify update_vsyscallJohn Stultz2010-07-271-30/+25Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently powerpc's update_vsyscall calls an inline update_gtod. However, both are straightforward, and there are no other users, so this patch merges update_gtod into update_vsyscall. Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Anton Blanchard <anton@samba.org> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1279068988-21864-5-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | time: Kill off CONFIG_GENERIC_TIMEJohn Stultz2010-07-2733-159/+19Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that all arches have been converted over to use generic time via clocksources or arch_gettimeoffset(), we can remove the GENERIC_TIME config option and simplify the generic code. Signed-off-by: John Stultz <johnstul@us.ibm.com> LKML-Reference: <1279068988-21864-4-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | time: Implement timespec_addJohn Stultz2010-07-272-3/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After accidentally misusing timespec_add_safe, I wanted to make sure we don't accidently trip over that issue again, so I created a simple timespec_add() function which we can use to replace the instances of timespec_add_safe() that don't want the overflow detection. Signed-off-by: John Stultz <johnstul@us.ibm.com> LKML-Reference: <1279068988-21864-3-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | x86: Fix vtime/file timestamp inconsistenciesJohn Stultz2010-07-271-3/+8
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to vtime calling vgettimeofday(), its possible that an application could call time();create("stuff",O_RDRW); only to see the file's creation timestamp to be before the value returned by time. A similar way to reproduce the issue is to compare the vsyscall time() with the syscall time(), and observe ordering issues. The modified test case from Oleg Nesterov below can illustrate this: int main(void) { time_t sec1,sec2; do { sec1 = time(&sec2); sec2 = syscall(__NR_time, NULL); } while (sec1 <= sec2); printf("vtime: %d.000000\n", sec1); printf("time: %d.000000\n", sec2); return 0; } The proper fix is to make vtime use the same time value as current_kernel_time() (which is exported via update_vsyscall) instead of vgettime(). Thanks to Jiri Olsa for bringing up the issue and catching bugs in earlier verisons of this fix. Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> LKML-Reference: <1279068988-21864-2-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | Merge branch 'timers-for-linus' of ↵Linus Torvalds2010-08-065-12/+141
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: Documentation: Add timers/timers-howto.txt timer: Added usleep_range timer Revert "timer: Added usleep[_range] timer" clockevents: Remove the per cpu tick skew posix_timer: Move copy_to_user(created_timer_id) down in timer_create() timer: Added usleep[_range] timer timers: Document meaning of deferrable timer
| * | Documentation: Add timers/timers-howto.txtPatrick Pannuto2010-08-041-0/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This file seeks to explain the nuances in various delays; many driver writers are not necessarily familiar with the various kernel timers, their shortfalls, and quirks. When faced with ndelay, udelay, mdelay, usleep_range, msleep, and msleep_interrubtible the question "How do I just wait 1 ms for my hardware to latch?" has the non-intuitive "best" answer: usleep_range(1000,1500) This patch is followed by a series of checkpatch additions that seek to help kernel hackers pick the best delay. Signed-off-by: Patrick Pannuto <ppannuto@codeaurora.org> Cc: apw@canonical.com Cc: corbet@lwn.net Cc: arjan@linux.intel.com Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <1280786467-26999-3-git-send-email-ppannuto@codeaurora.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | timer: Added usleep_range timerPatrick Pannuto2010-08-042-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | usleep_range is a finer precision implementations of msleep and is designed to be a drop-in replacement for udelay where a precise sleep / busy-wait is unnecessary. Since an easy interface to hrtimers could lead to an undesired proliferation of interrupts, we provide only a "range" API, forcing the caller to think about an acceptable tolerance on both ends and hopefully avoiding introducing another interrupt. INTRO As discussed here ( http://lkml.org/lkml/2007/8/3/250 ), msleep(1) is not precise enough for many drivers (yes, sleep precision is an unfair notion, but consistently sleeping for ~an order of magnitude greater than requested is worth fixing). This patch adds a usleep API so that udelay does not have to be used. Obviously not every udelay can be replaced (those in atomic contexts or being used for simple bitbanging come to mind), but there are many, many examples of mydriver_write(...) /* Wait for hardware to latch */ udelay(100) in various drivers where a busy-wait loop is neither beneficial nor necessary, but msleep simply does not provide enough precision and people are using a busy-wait loop instead. CONCERNS FROM THE RFC Why is udelay a problem / necessary? Most callers of udelay are in device/ driver initialization code, which is serial... As I see it, there is only benefit to sleeping over a delay; the notion of "refactoring" areas that use udelay was presented, but I see usleep as the refactoring. Consider i2c, if the bus is busy, you need to wait a bit (say 100us) before trying again, your current options are: * udelay(100) * msleep(1) <-- As noted above, actually as high as ~20ms on some platforms, so not really an option * Manually set up an hrtimer to try again in 100us (which is what usleep does anyway...) People choose the udelay route because it is EASY; we need to provide a better easy route. Device / driver / boot code is *currently* serial, but every few months someone makes noise about parallelizing boot, and IMHO, a little forward-thinking now is one less thing to worry about if/when that ever happens udelay's could be preempted Sure, but if udelay plans on looping 1000 times, and it gets preempted on loop 200, whenever it's scheduled again, it is going to do the next 800 loops. Is the interruptible case needed? Probably not, but I see usleep as a very logical parallel to msleep, so it made sense to include the "full" API. Processors are getting faster (albeit not as quickly as they are becoming more parallel), so if someone wanted to be interruptible for a few usecs, why not let them? If this is a contentious point, I'm happy to remove it. OTHER THOUGHTS I believe there is also value in exposing the usleep_range option; it gives the scheduler a lot more flexibility and allows the programmer to express his intent much more clearly; it's something I would hope future driver writers will take advantage of. To get the results in the NUMBERS section below, I literally s/udelay/usleep the kernel tree; I had to go in and undo the changes to the USB drivers, but everything else booted successfully; I find that extremely telling in and of itself -- many people are using a delay API where a sleep will suit them just fine. SOME ATTEMPTS AT NUMBERS It turns out that calculating quantifiable benefit on this is challenging, so instead I will simply present the current state of things, and I hope this to be sufficient: How many udelay calls are there in 2.6.35-rc5? udealy(ARG) >= | COUNT 1000 | 319 500 | 414 100 | 1146 20 | 1832 I am working on Android, so that is my focus for this. The following table is a modified usleep that simply printk's the amount of time requested to sleep; these tests were run on a kernel with udelay >= 20 --> usleep "boot" is power-on to lock screen "power collapse" is when the power button is pushed and the device suspends "resume" is when the power button is pushed and the lock screen is displayed (no touchscreen events or anything, just turning on the display) "use device" is from the unlock swipe to clicking around a bit; there is no sd card in this phone, so fail loading music, video, camera ACTION | TOTAL NUMBER OF USLEEP CALLS | NET TIME (us) boot | 22 | 1250 power-collapse | 9 | 1200 resume | 5 | 500 use device | 59 | 7700 The most interesting category to me is the "use device" field; 7700us of busy-wait time that could be put towards better responsiveness, or at the least less power usage. Signed-off-by: Patrick Pannuto <ppannuto@codeaurora.org> Cc: apw@canonical.com Cc: corbet@lwn.net Cc: arjan@linux.intel.com Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | Revert "timer: Added usleep[_range] timer"Thomas Gleixner2010-08-042-28/+0Star
| | | | | | | | | | | | | | | | | | | | | This reverts commit 22b8f15c2f7130bb0386f548428df2ffd4e81903 to merge an advanced version. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | clockevents: Remove the per cpu tick skewArjan van de Ven2010-08-021-5/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Historically, Linux has tried to make the regular timer tick on the various CPUs not happen at the same time, to avoid contention on xtime_lock. Nowadays, with the tickless kernel, this contention no longer happens since time keeping and updating are done differently. In addition, this skew is actually hurting power consumption in a measurable way on many-core systems. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <20100727210210.58d3118c@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | posix_timer: Move copy_to_user(created_timer_id) down in timer_create()Andrey Vagin2010-07-231-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to Oleg Nesterov: We can move copy_to_user(created_timer_id) down after "if (timer_event_spec)" block too. (but before CLOCK_DISPATCH(), of course). Signed-off-by: Andrey Vagin <avagin@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Andrey Vagin <avagin@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | timer: Added usleep[_range] timerPatrick Pannuto2010-07-232-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | usleep[_range] are finer precision implementations of msleep and are designed to be drop-in replacements for udelay where a precise sleep / busy-wait is unnecessary. They also allow an easy interface to specify slack when a precise (ish) wakeup is unnecessary to help minimize wakeups Signed-off-by: Patrick Pannuto <ppannuto@codeaurora.org> Cc: akinobu.mita@gmail.com Cc: sboyd@codeaurora.org Acked-by: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <4C44CDD2.1070708@codeaurora.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * | timers: Document meaning of deferrable timerJ. Bruce Fields2010-07-231-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Steal some text from 6e453a67510 "Add support for deferrable timers". A reader shouldn't have to dig through the git logs for the basic description of a deferrable timer. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: johnstul@us.ibm.com Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* | | x86, kvm: Remove cast obsoleted by set_64bit() prototype cleanupH. Peter Anvin2010-08-061-5/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KVM ended up having to put a pretty ugly wrapper around set_64bit() in order to get the type right. Now set_64bit() takes the expected u64 type, and this wrapper can be cleaned up. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Avi Kivity <avi@redhat.com> LKML-Reference: <4C5C4E7A.8040603@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6Linus Torvalds2010-08-0697-3195/+1197Star
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6: pcmcia: avoid buffer overflow in pcmcia_setup_isa_irq pcmcia: do not request windows if you don't need to pcmcia: insert PCMCIA device resources into resource tree pcmcia: export resource information to sysfs pcmcia: use struct resource for PCMCIA devices, part 2 pcmcia: remove memreq_t pcmcia: move local definitions out of include/pcmcia/cs.h pcmcia: do not use io_req_t when calling pcmcia_request_io() pcmcia: do not use io_req_t after call to pcmcia_request_io() pcmcia: use struct resource for PCMCIA devices pcmcia: clean up cs.h pcmcia: use pcmica_{read,write}_config_byte pcmcia: remove cs_types.h pcmcia: remove unused flag, simplify headers pcmcia: remove obsolete CS_EVENT_ definitions pcmcia: split up central event handler pcmcia: simplify event callback pcmcia: remove obsolete ioctl Conflicts in: - drivers/staging/comedi/drivers/* - drivers/staging/wlags49_h2/wl_cs.c due to dev_info_t and whitespace changes
| * | | pcmcia: avoid buffer overflow in pcmcia_setup_isa_irqDominik Brodowski2010-08-031-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NR_IRQS may be as low as 16, causing a (harmless?) buffer overflow in pcmcia_setup_isa_irq(): static u8 pcmcia_used_irq[NR_IRQS]; ... if ((try < 32) && pcmcia_used_irq[irq]) continue; This is read-only, so if this address would be non-zero, it would just mean we would not attempt an IRQ >= NR_IRQS -- which would fail anyway! And as request_irq() fails for an irq >= NR_IRQS, the setting code path: pcmcia_used_irq[irq]++; is never reached as well. Reported-by: Christoph Fritz <chf.fritz@googlemail.com> CC: stable@kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com>
| * | | pcmcia: do not request windows if you don't need toDominik Brodowski2010-08-035-140/+5Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several drivers contained dummy code to request for memory windows, even though they never made use of it. Remove all such code snippets. CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | | pcmcia: insert PCMCIA device resources into resource treeDominik Brodowski2010-08-036-34/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Insert PCMCIA device resources into the resource tree. However, this is currently only implemented for sockets which do not statically map the resources. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | | pcmcia: export resource information to sysfsDominik Brodowski2010-08-031-0/+13
| | | | | | | | | | | | | | | | Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | | pcmcia: use struct resource for PCMCIA devices, part 2Dominik Brodowski2010-08-0310-78/+75Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use struct resource * also for iomem resources. CC: linux-mtd@lists.infradead.org CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org CC: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | | pcmcia: remove memreq_tDominik Brodowski2010-08-0318-105/+42Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Page already had to be set to 0; Offset can easily be passed as parameter to pcmcia_map_mem_page. CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org CC: linux-ide@vger.kernel.org CC: linux-usb@vger.kernel.org CC: laforge@gnumonks.org CC: linux-mtd@lists.infradead.org CC: linux-bluetooth@vger.kernel.org CC: alsa-devel@alsa-project.org CC: linux-serial@vger.kernel.org CC: Michael Buesch <mb@bu3sch.de> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | | pcmcia: move local definitions out of include/pcmcia/cs.hDominik Brodowski2010-08-033-19/+6Star
| | | | | | | | | | | | | | | | Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | | pcmcia: do not use io_req_t when calling pcmcia_request_io()Dominik Brodowski2010-08-0357-584/+527Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of io_req_t, drivers are now requested to fill out struct pcmcia_device *p_dev->resource[0,1] for up to two ioport ranges. After a call to pcmcia_request_io(), the ports found there are reserved, after calling pcmcia_request_configuration(), they may be used. CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org CC: linux-ide@vger.kernel.org CC: linux-usb@vger.kernel.org CC: laforge@gnumonks.org CC: linux-mtd@lists.infradead.org CC: alsa-devel@alsa-project.org CC: linux-serial@vger.kernel.org CC: Michael Buesch <mb@bu3sch.de> Acked-by: Marcel Holtmann <marcel@holtmann.org> (for drivers/bluetooth/) Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | | pcmcia: do not use io_req_t after call to pcmcia_request_io()Dominik Brodowski2010-08-0351-240/+223Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After pcmcia_request_io(), do not make use of the values stored in io_req_t, but instead use those found in struct pcmcia_device->resource[]. CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org CC: linux-ide@vger.kernel.org CC: linux-usb@vger.kernel.org CC: laforge@gnumonks.org CC: linux-mtd@lists.infradead.org CC: alsa-devel@alsa-project.org CC: linux-serial@vger.kernel.org Acked-by: Marcel Holtmann <marcel@holtmann.org> (for drivers/bluetooth/) Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | | pcmcia: use struct resource for PCMCIA devicesDominik Brodowski2010-08-035-73/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a new field into struct pcmcia_device named "resource" and of type struct resource *, which contains the IO port ranges allocated for this device. Memory window ranges and registration with the resource trees will follow at a later date. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | | pcmcia: clean up cs.hDominik Brodowski2010-08-032-12/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | Remove some obsolete definitions from cs.h Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | | pcmcia: use pcmica_{read,write}_config_byteDominik Brodowski2010-08-0312-178/+104Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use pcmcia_read_config_byte and pcmcia_write_config_byte instead of pcmcia_access_configuration_register. CC: netdev@vger.kernel.org CC: linux-wireless@vger.kernel.org CC: linux-serial@vger.kernel.org CC: Michael Buesch <mb@bu3sch.de> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | | pcmcia: remove cs_types.hDominik Brodowski2010-07-3090-151/+14Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove cs_types.h which is no longer needed: Most definitions aren't used at all, a few can be made away with, and two remaining definitions (typedefs, unfortunatley) may be moved to more specific places. CC: linux-ide@vger.kernel.org CC: linux-usb@vger.kernel.org CC: laforge@gnumonks.org CC: linux-mtd@lists.infradead.org CC: alsa-devel@alsa-project.org CC: linux-serial@vger.kernel.org Acked-by: Marcel Holtmann <marcel@holtmann.org> (for drivers/bluetooth/) Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | | pcmcia: remove unused flag, simplify headersDominik Brodowski2010-07-303-26/+10Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we only provide one way to set up resources now, we can remove the resource-setup-related bitfield (except resource_setup_done). In addition, pcmcia_state only consisted of one entry, so remove this bitfield as well. Suggested-by: Komuro <komurojun-mbn@nifty.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| * | | pcmcia: remove obsolete CS_EVENT_ definitionsDominik Brodowski2010-07-301-45/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove some definitions which became obsolete when the central event handler got removed. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>