diff options
author | Srikant Patnaik | 2015-01-11 12:28:04 +0530 |
---|---|---|
committer | Srikant Patnaik | 2015-01-11 12:28:04 +0530 |
commit | 871480933a1c28f8a9fed4c4d34d06c439a7a422 (patch) | |
tree | 8718f573808810c2a1e8cb8fb6ac469093ca2784 /Documentation/cpu-freq | |
parent | 9d40ac5867b9aefe0722bc1f110b965ff294d30d (diff) | |
download | FOSSEE-netbook-kernel-source-871480933a1c28f8a9fed4c4d34d06c439a7a422.tar.gz FOSSEE-netbook-kernel-source-871480933a1c28f8a9fed4c4d34d06c439a7a422.tar.bz2 FOSSEE-netbook-kernel-source-871480933a1c28f8a9fed4c4d34d06c439a7a422.zip |
Moved, renamed, and deleted files
The original directory structure was scattered and unorganized.
Changes are basically to make it look like kernel structure.
Diffstat (limited to 'Documentation/cpu-freq')
-rw-r--r-- | Documentation/cpu-freq/amd-powernow.txt | 38 | ||||
-rw-r--r-- | Documentation/cpu-freq/core.txt | 98 | ||||
-rw-r--r-- | Documentation/cpu-freq/cpu-drivers.txt | 216 | ||||
-rw-r--r-- | Documentation/cpu-freq/cpufreq-nforce2.txt | 19 | ||||
-rw-r--r-- | Documentation/cpu-freq/cpufreq-stats.txt | 128 | ||||
-rw-r--r-- | Documentation/cpu-freq/governors.txt | 301 | ||||
-rw-r--r-- | Documentation/cpu-freq/index.txt | 54 | ||||
-rw-r--r-- | Documentation/cpu-freq/pcc-cpufreq.txt | 207 | ||||
-rw-r--r-- | Documentation/cpu-freq/user-guide.txt | 224 |
9 files changed, 1285 insertions, 0 deletions
diff --git a/Documentation/cpu-freq/amd-powernow.txt b/Documentation/cpu-freq/amd-powernow.txt new file mode 100644 index 00000000..254da155 --- /dev/null +++ b/Documentation/cpu-freq/amd-powernow.txt @@ -0,0 +1,38 @@ + +PowerNow! and Cool'n'Quiet are AMD names for frequency +management capabilities in AMD processors. As the hardware +implementation changes in new generations of the processors, +there is a different cpu-freq driver for each generation. + +Note that the driver's will not load on the "wrong" hardware, +so it is safe to try each driver in turn when in doubt as to +which is the correct driver. + +Note that the functionality to change frequency (and voltage) +is not available in all processors. The drivers will refuse +to load on processors without this capability. The capability +is detected with the cpuid instruction. + +The drivers use BIOS supplied tables to obtain frequency and +voltage information appropriate for a particular platform. +Frequency transitions will be unavailable if the BIOS does +not supply these tables. + +6th Generation: powernow-k6 + +7th Generation: powernow-k7: Athlon, Duron, Geode. + +8th Generation: powernow-k8: Athlon, Athlon 64, Opteron, Sempron. +Documentation on this functionality in 8th generation processors +is available in the "BIOS and Kernel Developer's Guide", publication +26094, in chapter 9, available for download from www.amd.com. + +BIOS supplied data, for powernow-k7 and for powernow-k8, may be +from either the PSB table or from ACPI objects. The ACPI support +is only available if the kernel config sets CONFIG_ACPI_PROCESSOR. +The powernow-k8 driver will attempt to use ACPI if so configured, +and fall back to PST if that fails. +The powernow-k7 driver will try to use the PSB support first, and +fall back to ACPI if the PSB support fails. A module parameter, +acpi_force, is provided to force ACPI support to be used instead +of PSB support. diff --git a/Documentation/cpu-freq/core.txt b/Documentation/cpu-freq/core.txt new file mode 100644 index 00000000..ce0666e5 --- /dev/null +++ b/Documentation/cpu-freq/core.txt @@ -0,0 +1,98 @@ + CPU frequency and voltage scaling code in the Linux(TM) kernel + + + L i n u x C P U F r e q + + C P U F r e q C o r e + + + Dominik Brodowski <linux@brodo.de> + David Kimdon <dwhedon@debian.org> + + + + Clock scaling allows you to change the clock speed of the CPUs on the + fly. This is a nice method to save battery power, because the lower + the clock speed, the less power the CPU consumes. + + +Contents: +--------- +1. CPUFreq core and interfaces +2. CPUFreq notifiers + +1. General Information +======================= + +The CPUFreq core code is located in drivers/cpufreq/cpufreq.c. This +cpufreq code offers a standardized interface for the CPUFreq +architecture drivers (those pieces of code that do actual +frequency transitions), as well as to "notifiers". These are device +drivers or other part of the kernel that need to be informed of +policy changes (ex. thermal modules like ACPI) or of all +frequency changes (ex. timing code) or even need to force certain +speed limits (like LCD drivers on ARM architecture). Additionally, the +kernel "constant" loops_per_jiffy is updated on frequency changes +here. + +Reference counting is done by cpufreq_get_cpu and cpufreq_put_cpu, +which make sure that the cpufreq processor driver is correctly +registered with the core, and will not be unloaded until +cpufreq_put_cpu is called. + +2. CPUFreq notifiers +==================== + +CPUFreq notifiers conform to the standard kernel notifier interface. +See linux/include/linux/notifier.h for details on notifiers. + +There are two different CPUFreq notifiers - policy notifiers and +transition notifiers. + + +2.1 CPUFreq policy notifiers +---------------------------- + +These are notified when a new policy is intended to be set. Each +CPUFreq policy notifier is called three times for a policy transition: + +1.) During CPUFREQ_ADJUST all CPUFreq notifiers may change the limit if + they see a need for this - may it be thermal considerations or + hardware limitations. + +2.) During CPUFREQ_INCOMPATIBLE only changes may be done in order to avoid + hardware failure. + +3.) And during CPUFREQ_NOTIFY all notifiers are informed of the new policy + - if two hardware drivers failed to agree on a new policy before this + stage, the incompatible hardware shall be shut down, and the user + informed of this. + +The phase is specified in the second argument to the notifier. + +The third argument, a void *pointer, points to a struct cpufreq_policy +consisting of five values: cpu, min, max, policy and max_cpu_freq. min +and max are the lower and upper frequencies (in kHz) of the new +policy, policy the new policy, cpu the number of the affected CPU; and +max_cpu_freq the maximum supported CPU frequency. This value is given +for informational purposes only. + + +2.2 CPUFreq transition notifiers +-------------------------------- + +These are notified twice when the CPUfreq driver switches the CPU core +frequency and this change has any external implications. + +The second argument specifies the phase - CPUFREQ_PRECHANGE or +CPUFREQ_POSTCHANGE. + +The third argument is a struct cpufreq_freqs with the following +values: +cpu - number of the affected CPU +old - old frequency +new - new frequency + +If the cpufreq core detects the frequency has changed while the system +was suspended, these notifiers are called with CPUFREQ_RESUMECHANGE as +second argument. diff --git a/Documentation/cpu-freq/cpu-drivers.txt b/Documentation/cpu-freq/cpu-drivers.txt new file mode 100644 index 00000000..c4360963 --- /dev/null +++ b/Documentation/cpu-freq/cpu-drivers.txt @@ -0,0 +1,216 @@ + CPU frequency and voltage scaling code in the Linux(TM) kernel + + + L i n u x C P U F r e q + + C P U D r i v e r s + + - information for developers - + + + Dominik Brodowski <linux@brodo.de> + + + + Clock scaling allows you to change the clock speed of the CPUs on the + fly. This is a nice method to save battery power, because the lower + the clock speed, the less power the CPU consumes. + + +Contents: +--------- +1. What To Do? +1.1 Initialization +1.2 Per-CPU Initialization +1.3 verify +1.4 target or setpolicy? +1.5 target +1.6 setpolicy +2. Frequency Table Helpers + + + +1. What To Do? +============== + +So, you just got a brand-new CPU / chipset with datasheets and want to +add cpufreq support for this CPU / chipset? Great. Here are some hints +on what is necessary: + + +1.1 Initialization +------------------ + +First of all, in an __initcall level 7 (module_init()) or later +function check whether this kernel runs on the right CPU and the right +chipset. If so, register a struct cpufreq_driver with the CPUfreq core +using cpufreq_register_driver() + +What shall this struct cpufreq_driver contain? + +cpufreq_driver.name - The name of this driver. + +cpufreq_driver.owner - THIS_MODULE; + +cpufreq_driver.init - A pointer to the per-CPU initialization + function. + +cpufreq_driver.verify - A pointer to a "verification" function. + +cpufreq_driver.setpolicy _or_ +cpufreq_driver.target - See below on the differences. + +And optionally + +cpufreq_driver.exit - A pointer to a per-CPU cleanup function. + +cpufreq_driver.resume - A pointer to a per-CPU resume function + which is called with interrupts disabled + and _before_ the pre-suspend frequency + and/or policy is restored by a call to + ->target or ->setpolicy. + +cpufreq_driver.attr - A pointer to a NULL-terminated list of + "struct freq_attr" which allow to + export values to sysfs. + + +1.2 Per-CPU Initialization +-------------------------- + +Whenever a new CPU is registered with the device model, or after the +cpufreq driver registers itself, the per-CPU initialization function +cpufreq_driver.init is called. It takes a struct cpufreq_policy +*policy as argument. What to do now? + +If necessary, activate the CPUfreq support on your CPU. + +Then, the driver must fill in the following values: + +policy->cpuinfo.min_freq _and_ +policy->cpuinfo.max_freq - the minimum and maximum frequency + (in kHz) which is supported by + this CPU +policy->cpuinfo.transition_latency the time it takes on this CPU to + switch between two frequencies in + nanoseconds (if appropriate, else + specify CPUFREQ_ETERNAL) + +policy->cur The current operating frequency of + this CPU (if appropriate) +policy->min, +policy->max, +policy->policy and, if necessary, +policy->governor must contain the "default policy" for + this CPU. A few moments later, + cpufreq_driver.verify and either + cpufreq_driver.setpolicy or + cpufreq_driver.target is called with + these values. + +For setting some of these values, the frequency table helpers might be +helpful. See the section 2 for more information on them. + + +1.3 verify +------------ + +When the user decides a new policy (consisting of +"policy,governor,min,max") shall be set, this policy must be validated +so that incompatible values can be corrected. For verifying these +values, a frequency table helper and/or the +cpufreq_verify_within_limits(struct cpufreq_policy *policy, unsigned +int min_freq, unsigned int max_freq) function might be helpful. See +section 2 for details on frequency table helpers. + +You need to make sure that at least one valid frequency (or operating +range) is within policy->min and policy->max. If necessary, increase +policy->max first, and only if this is no solution, decrease policy->min. + + +1.4 target or setpolicy? +---------------------------- + +Most cpufreq drivers or even most cpu frequency scaling algorithms +only allow the CPU to be set to one frequency. For these, you use the +->target call. + +Some cpufreq-capable processors switch the frequency between certain +limits on their own. These shall use the ->setpolicy call + + +1.4. target +------------- + +The target call has three arguments: struct cpufreq_policy *policy, +unsigned int target_frequency, unsigned int relation. + +The CPUfreq driver must set the new frequency when called here. The +actual frequency must be determined using the following rules: + +- keep close to "target_freq" +- policy->min <= new_freq <= policy->max (THIS MUST BE VALID!!!) +- if relation==CPUFREQ_REL_L, try to select a new_freq higher than or equal + target_freq. ("L for lowest, but no lower than") +- if relation==CPUFREQ_REL_H, try to select a new_freq lower than or equal + target_freq. ("H for highest, but no higher than") + +Here again the frequency table helper might assist you - see section 2 +for details. + + +1.5 setpolicy +--------------- + +The setpolicy call only takes a struct cpufreq_policy *policy as +argument. You need to set the lower limit of the in-processor or +in-chipset dynamic frequency switching to policy->min, the upper limit +to policy->max, and -if supported- select a performance-oriented +setting when policy->policy is CPUFREQ_POLICY_PERFORMANCE, and a +powersaving-oriented setting when CPUFREQ_POLICY_POWERSAVE. Also check +the reference implementation in drivers/cpufreq/longrun.c + + + +2. Frequency Table Helpers +========================== + +As most cpufreq processors only allow for being set to a few specific +frequencies, a "frequency table" with some functions might assist in +some work of the processor driver. Such a "frequency table" consists +of an array of struct cpufreq_freq_table entries, with any value in +"index" you want to use, and the corresponding frequency in +"frequency". At the end of the table, you need to add a +cpufreq_freq_table entry with frequency set to CPUFREQ_TABLE_END. And +if you want to skip one entry in the table, set the frequency to +CPUFREQ_ENTRY_INVALID. The entries don't need to be in ascending +order. + +By calling cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy, + struct cpufreq_frequency_table *table); +the cpuinfo.min_freq and cpuinfo.max_freq values are detected, and +policy->min and policy->max are set to the same values. This is +helpful for the per-CPU initialization stage. + +int cpufreq_frequency_table_verify(struct cpufreq_policy *policy, + struct cpufreq_frequency_table *table); +assures that at least one valid frequency is within policy->min and +policy->max, and all other criteria are met. This is helpful for the +->verify call. + +int cpufreq_frequency_table_target(struct cpufreq_policy *policy, + struct cpufreq_frequency_table *table, + unsigned int target_freq, + unsigned int relation, + unsigned int *index); + +is the corresponding frequency table helper for the ->target +stage. Just pass the values to this function, and the unsigned int +index returns the number of the frequency table entry which contains +the frequency the CPU shall be set to. PLEASE NOTE: This is not the +"index" which is in this cpufreq_table_entry.index, but instead +cpufreq_table[index]. So, the new frequency is +cpufreq_table[index].frequency, and the value you stored into the +frequency table "index" field is +cpufreq_table[index].index. + diff --git a/Documentation/cpu-freq/cpufreq-nforce2.txt b/Documentation/cpu-freq/cpufreq-nforce2.txt new file mode 100644 index 00000000..babce131 --- /dev/null +++ b/Documentation/cpu-freq/cpufreq-nforce2.txt @@ -0,0 +1,19 @@ + +The cpufreq-nforce2 driver changes the FSB on nVidia nForce2 platforms. + +This works better than on other platforms, because the FSB of the CPU +can be controlled independently from the PCI/AGP clock. + +The module has two options: + + fid: multiplier * 10 (for example 8.5 = 85) + min_fsb: minimum FSB + +If not set, fid is calculated from the current CPU speed and the FSB. +min_fsb defaults to FSB at boot time - 50 MHz. + +IMPORTANT: The available range is limited downwards! + Also the minimum available FSB can differ, for systems + booting with 200 MHz, 150 should always work. + + diff --git a/Documentation/cpu-freq/cpufreq-stats.txt b/Documentation/cpu-freq/cpufreq-stats.txt new file mode 100644 index 00000000..fc647492 --- /dev/null +++ b/Documentation/cpu-freq/cpufreq-stats.txt @@ -0,0 +1,128 @@ + + CPU frequency and voltage scaling statistics in the Linux(TM) kernel + + + L i n u x c p u f r e q - s t a t s d r i v e r + + - information for users - + + + Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> + +Contents +1. Introduction +2. Statistics Provided (with example) +3. Configuring cpufreq-stats + + +1. Introduction + +cpufreq-stats is a driver that provides CPU frequency statistics for each CPU. +These statistics are provided in /sysfs as a bunch of read_only interfaces. This +interface (when configured) will appear in a separate directory under cpufreq +in /sysfs (<sysfs root>/devices/system/cpu/cpuX/cpufreq/stats/) for each CPU. +Various statistics will form read_only files under this directory. + +This driver is designed to be independent of any particular cpufreq_driver +that may be running on your CPU. So, it will work with any cpufreq_driver. + + +2. Statistics Provided (with example) + +cpufreq stats provides following statistics (explained in detail below). +- time_in_state +- total_trans +- trans_table + +All the statistics will be from the time the stats driver has been inserted +to the time when a read of a particular statistic is done. Obviously, stats +driver will not have any information about the frequency transitions before +the stats driver insertion. + +-------------------------------------------------------------------------------- +<mysystem>:/sys/devices/system/cpu/cpu0/cpufreq/stats # ls -l +total 0 +drwxr-xr-x 2 root root 0 May 14 16:06 . +drwxr-xr-x 3 root root 0 May 14 15:58 .. +-r--r--r-- 1 root root 4096 May 14 16:06 time_in_state +-r--r--r-- 1 root root 4096 May 14 16:06 total_trans +-r--r--r-- 1 root root 4096 May 14 16:06 trans_table +-------------------------------------------------------------------------------- + +- time_in_state +This gives the amount of time spent in each of the frequencies supported by +this CPU. The cat output will have "<frequency> <time>" pair in each line, which +will mean this CPU spent <time> usertime units of time at <frequency>. Output +will have one line for each of the supported frequencies. usertime units here +is 10mS (similar to other time exported in /proc). + +-------------------------------------------------------------------------------- +<mysystem>:/sys/devices/system/cpu/cpu0/cpufreq/stats # cat time_in_state +3600000 2089 +3400000 136 +3200000 34 +3000000 67 +2800000 172488 +-------------------------------------------------------------------------------- + + +- total_trans +This gives the total number of frequency transitions on this CPU. The cat +output will have a single count which is the total number of frequency +transitions. + +-------------------------------------------------------------------------------- +<mysystem>:/sys/devices/system/cpu/cpu0/cpufreq/stats # cat total_trans +20 +-------------------------------------------------------------------------------- + +- trans_table +This will give a fine grained information about all the CPU frequency +transitions. The cat output here is a two dimensional matrix, where an entry +<i,j> (row i, column j) represents the count of number of transitions from +Freq_i to Freq_j. Freq_i is in descending order with increasing rows and +Freq_j is in descending order with increasing columns. The output here also +contains the actual freq values for each row and column for better readability. + +-------------------------------------------------------------------------------- +<mysystem>:/sys/devices/system/cpu/cpu0/cpufreq/stats # cat trans_table + From : To + : 3600000 3400000 3200000 3000000 2800000 + 3600000: 0 5 0 0 0 + 3400000: 4 0 2 0 0 + 3200000: 0 1 0 2 0 + 3000000: 0 0 1 0 3 + 2800000: 0 0 0 2 0 +-------------------------------------------------------------------------------- + + +3. Configuring cpufreq-stats + +To configure cpufreq-stats in your kernel +Config Main Menu + Power management options (ACPI, APM) ---> + CPU Frequency scaling ---> + [*] CPU Frequency scaling + <*> CPU frequency translation statistics + [*] CPU frequency translation statistics details + + +"CPU Frequency scaling" (CONFIG_CPU_FREQ) should be enabled to configure +cpufreq-stats. + +"CPU frequency translation statistics" (CONFIG_CPU_FREQ_STAT) provides the +basic statistics which includes time_in_state and total_trans. + +"CPU frequency translation statistics details" (CONFIG_CPU_FREQ_STAT_DETAILS) +provides fine grained cpufreq stats by trans_table. The reason for having a +separate config option for trans_table is: +- trans_table goes against the traditional /sysfs rule of one value per + interface. It provides a whole bunch of value in a 2 dimensional matrix + form. + +Once these two options are enabled and your CPU supports cpufrequency, you +will be able to see the CPU frequency statistics in /sysfs. + + + + diff --git a/Documentation/cpu-freq/governors.txt b/Documentation/cpu-freq/governors.txt new file mode 100644 index 00000000..6aed1ce3 --- /dev/null +++ b/Documentation/cpu-freq/governors.txt @@ -0,0 +1,301 @@ + CPU frequency and voltage scaling code in the Linux(TM) kernel + + + L i n u x C P U F r e q + + C P U F r e q G o v e r n o r s + + - information for users and developers - + + + Dominik Brodowski <linux@brodo.de> + some additions and corrections by Nico Golde <nico@ngolde.de> + + + + Clock scaling allows you to change the clock speed of the CPUs on the + fly. This is a nice method to save battery power, because the lower + the clock speed, the less power the CPU consumes. + + +Contents: +--------- +1. What is a CPUFreq Governor? + +2. Governors In the Linux Kernel +2.1 Performance +2.2 Powersave +2.3 Userspace +2.4 Ondemand +2.5 Conservative +2.6 Interactive + +3. The Governor Interface in the CPUfreq Core + + + +1. What Is A CPUFreq Governor? +============================== + +Most cpufreq drivers (in fact, all except one, longrun) or even most +cpu frequency scaling algorithms only offer the CPU to be set to one +frequency. In order to offer dynamic frequency scaling, the cpufreq +core must be able to tell these drivers of a "target frequency". So +these specific drivers will be transformed to offer a "->target" +call instead of the existing "->setpolicy" call. For "longrun", all +stays the same, though. + +How to decide what frequency within the CPUfreq policy should be used? +That's done using "cpufreq governors". Two are already in this patch +-- they're the already existing "powersave" and "performance" which +set the frequency statically to the lowest or highest frequency, +respectively. At least two more such governors will be ready for +addition in the near future, but likely many more as there are various +different theories and models about dynamic frequency scaling +around. Using such a generic interface as cpufreq offers to scaling +governors, these can be tested extensively, and the best one can be +selected for each specific use. + +Basically, it's the following flow graph: + +CPU can be set to switch independently | CPU can only be set + within specific "limits" | to specific frequencies + + "CPUfreq policy" + consists of frequency limits (policy->{min,max}) + and CPUfreq governor to be used + / \ + / \ + / the cpufreq governor decides + / (dynamically or statically) + / what target_freq to set within + / the limits of policy->{min,max} + / \ + / \ + Using the ->setpolicy call, Using the ->target call, + the limits and the the frequency closest + "policy" is set. to target_freq is set. + It is assured that it + is within policy->{min,max} + + +2. Governors In the Linux Kernel +================================ + +2.1 Performance +--------------- + +The CPUfreq governor "performance" sets the CPU statically to the +highest frequency within the borders of scaling_min_freq and +scaling_max_freq. + + +2.2 Powersave +------------- + +The CPUfreq governor "powersave" sets the CPU statically to the +lowest frequency within the borders of scaling_min_freq and +scaling_max_freq. + + +2.3 Userspace +------------- + +The CPUfreq governor "userspace" allows the user, or any userspace +program running with UID "root", to set the CPU to a specific frequency +by making a sysfs file "scaling_setspeed" available in the CPU-device +directory. + + +2.4 Ondemand +------------ + +The CPUfreq governor "ondemand" sets the CPU depending on the +current usage. To do this the CPU must have the capability to +switch the frequency very quickly. There are a number of sysfs file +accessible parameters: + +sampling_rate: measured in uS (10^-6 seconds), this is how often you +want the kernel to look at the CPU usage and to make decisions on +what to do about the frequency. Typically this is set to values of +around '10000' or more. It's default value is (cmp. with users-guide.txt): +transition_latency * 1000 +Be aware that transition latency is in ns and sampling_rate is in us, so you +get the same sysfs value by default. +Sampling rate should always get adjusted considering the transition latency +To set the sampling rate 750 times as high as the transition latency +in the bash (as said, 1000 is default), do: +echo `$(($(cat cpuinfo_transition_latency) * 750 / 1000)) \ + >ondemand/sampling_rate + +sampling_rate_min: +The sampling rate is limited by the HW transition latency: +transition_latency * 100 +Or by kernel restrictions: +If CONFIG_NO_HZ is set, the limit is 10ms fixed. +If CONFIG_NO_HZ is not set or nohz=off boot parameter is used, the +limits depend on the CONFIG_HZ option: +HZ=1000: min=20000us (20ms) +HZ=250: min=80000us (80ms) +HZ=100: min=200000us (200ms) +The highest value of kernel and HW latency restrictions is shown and +used as the minimum sampling rate. + +up_threshold: defines what the average CPU usage between the samplings +of 'sampling_rate' needs to be for the kernel to make a decision on +whether it should increase the frequency. For example when it is set +to its default value of '95' it means that between the checking +intervals the CPU needs to be on average more than 95% in use to then +decide that the CPU frequency needs to be increased. + +ignore_nice_load: this parameter takes a value of '0' or '1'. When +set to '0' (its default), all processes are counted towards the +'cpu utilisation' value. When set to '1', the processes that are +run with a 'nice' value will not count (and thus be ignored) in the +overall usage calculation. This is useful if you are running a CPU +intensive calculation on your laptop that you do not care how long it +takes to complete as you can 'nice' it and prevent it from taking part +in the deciding process of whether to increase your CPU frequency. + +sampling_down_factor: this parameter controls the rate at which the +kernel makes a decision on when to decrease the frequency while running +at top speed. When set to 1 (the default) decisions to reevaluate load +are made at the same interval regardless of current clock speed. But +when set to greater than 1 (e.g. 100) it acts as a multiplier for the +scheduling interval for reevaluating load when the CPU is at its top +speed due to high load. This improves performance by reducing the overhead +of load evaluation and helping the CPU stay at its top speed when truly +busy, rather than shifting back and forth in speed. This tunable has no +effect on behavior at lower speeds/lower CPU loads. + + +2.5 Conservative +---------------- + +The CPUfreq governor "conservative", much like the "ondemand" +governor, sets the CPU depending on the current usage. It differs in +behaviour in that it gracefully increases and decreases the CPU speed +rather than jumping to max speed the moment there is any load on the +CPU. This behaviour more suitable in a battery powered environment. +The governor is tweaked in the same manner as the "ondemand" governor +through sysfs with the addition of: + +freq_step: this describes what percentage steps the cpu freq should be +increased and decreased smoothly by. By default the cpu frequency will +increase in 5% chunks of your maximum cpu frequency. You can change this +value to anywhere between 0 and 100 where '0' will effectively lock your +CPU at a speed regardless of its load whilst '100' will, in theory, make +it behave identically to the "ondemand" governor. + +down_threshold: same as the 'up_threshold' found for the "ondemand" +governor but for the opposite direction. For example when set to its +default value of '20' it means that if the CPU usage needs to be below +20% between samples to have the frequency decreased. + + +2.6 Interactive +--------------- + +The CPUfreq governor "interactive" is designed for latency-sensitive, +interactive workloads. This governor sets the CPU speed depending on +usage, similar to "ondemand" and "conservative" governors. However, +the governor is more aggressive about scaling the CPU speed up in +response to CPU-intensive activity. + +Sampling the CPU load every X ms can lead to under-powering the CPU +for X ms, leading to dropped frames, stuttering UI, etc. Instead of +sampling the cpu at a specified rate, the interactive governor will +check whether to scale the cpu frequency up soon after coming out of +idle. When the cpu comes out of idle, a timer is configured to fire +within 1-2 ticks. If the cpu is very busy between exiting idle and +when the timer fires then we assume the cpu is underpowered and ramp +to MAX speed. + +If the cpu was not sufficiently busy to immediately ramp to MAX speed, +then governor evaluates the cpu load since the last speed adjustment, +choosing the highest value between that longer-term load or the +short-term load since idle exit to determine the cpu speed to ramp to. + +The tuneable values for this governor are: + +min_sample_time: The minimum amount of time to spend at the current +frequency before ramping down. This is to ensure that the governor has +seen enough historic cpu load data to determine the appropriate +workload. Default is 80000 uS. + +hispeed_freq: An intermediate "hi speed" at which to initially ramp +when CPU load hits the value specified in go_hispeed_load. If load +stays high for the amount of time specified in above_hispeed_delay, +then speed may be bumped higher. Default is maximum speed. + +go_hispeed_load: The CPU load at which to ramp to the intermediate "hi +speed". Default is 85%. + +above_hispeed_delay: Once speed is set to hispeed_freq, wait for this +long before bumping speed higher in response to continued high load. +Default is 20000 uS. + +timer_rate: Sample rate for reevaluating cpu load when the system is +not idle. Default is 20000 uS. + +input_boost: If non-zero, boost speed of all CPUs to hispeed_freq on +touchscreen activity. Default is 0. + +boost: If non-zero, immediately boost speed of all CPUs to at least +hispeed_freq until zero is written to this attribute. If zero, allow +CPU speeds to drop below hispeed_freq according to load as usual. + +boostpulse: Immediately boost speed of all CPUs to hispeed_freq for +min_sample_time, after which speeds are allowed to drop below +hispeed_freq according to load as usual. + + +3. The Governor Interface in the CPUfreq Core +============================================= + +A new governor must register itself with the CPUfreq core using +"cpufreq_register_governor". The struct cpufreq_governor, which has to +be passed to that function, must contain the following values: + +governor->name - A unique name for this governor +governor->governor - The governor callback function +governor->owner - .THIS_MODULE for the governor module (if + appropriate) + +The governor->governor callback is called with the current (or to-be-set) +cpufreq_policy struct for that CPU, and an unsigned int event. The +following events are currently defined: + +CPUFREQ_GOV_START: This governor shall start its duty for the CPU + policy->cpu +CPUFREQ_GOV_STOP: This governor shall end its duty for the CPU + policy->cpu +CPUFREQ_GOV_LIMITS: The limits for CPU policy->cpu have changed to + policy->min and policy->max. + +If you need other "events" externally of your driver, _only_ use the +cpufreq_governor_l(unsigned int cpu, unsigned int event) call to the +CPUfreq core to ensure proper locking. + + +The CPUfreq governor may call the CPU processor driver using one of +these two functions: + +int cpufreq_driver_target(struct cpufreq_policy *policy, + unsigned int target_freq, + unsigned int relation); + +int __cpufreq_driver_target(struct cpufreq_policy *policy, + unsigned int target_freq, + unsigned int relation); + +target_freq must be within policy->min and policy->max, of course. +What's the difference between these two functions? When your governor +still is in a direct code path of a call to governor->governor, the +per-CPU cpufreq lock is still held in the cpufreq core, and there's +no need to lock it again (in fact, this would cause a deadlock). So +use __cpufreq_driver_target only in these cases. In all other cases +(for example, when there's a "daemonized" function that wakes up +every second), use cpufreq_driver_target to lock the cpufreq per-CPU +lock before the command is passed to the cpufreq processor driver. + diff --git a/Documentation/cpu-freq/index.txt b/Documentation/cpu-freq/index.txt new file mode 100644 index 00000000..3d0b9150 --- /dev/null +++ b/Documentation/cpu-freq/index.txt @@ -0,0 +1,54 @@ + CPU frequency and voltage scaling code in the Linux(TM) kernel + + + L i n u x C P U F r e q + + + + + Dominik Brodowski <linux@brodo.de> + + + + Clock scaling allows you to change the clock speed of the CPUs on the + fly. This is a nice method to save battery power, because the lower + the clock speed, the less power the CPU consumes. + + + +Documents in this directory: +---------------------------- +core.txt - General description of the CPUFreq core and + of CPUFreq notifiers + +cpu-drivers.txt - How to implement a new cpufreq processor driver + +governors.txt - What are cpufreq governors and how to + implement them? + +index.txt - File index, Mailing list and Links (this document) + +user-guide.txt - User Guide to CPUFreq + + +Mailing List +------------ +There is a CPU frequency changing CVS commit and general list where +you can report bugs, problems or submit patches. To post a message, +send an email to cpufreq@vger.kernel.org, to subscribe go to +http://vger.kernel.org/vger-lists.html#cpufreq and follow the +instructions there. + +Links +----- +the FTP archives: +* ftp://ftp.linux.org.uk/pub/linux/cpufreq/ + +how to access the CVS repository: +* http://cvs.arm.linux.org.uk/ + +the CPUFreq Mailing list: +* http://vger.kernel.org/vger-lists.html#cpufreq + +Clock and voltage scaling for the SA-1100: +* http://www.lartmaker.nl/projects/scaling diff --git a/Documentation/cpu-freq/pcc-cpufreq.txt b/Documentation/cpu-freq/pcc-cpufreq.txt new file mode 100644 index 00000000..9e3c3b33 --- /dev/null +++ b/Documentation/cpu-freq/pcc-cpufreq.txt @@ -0,0 +1,207 @@ +/* + * pcc-cpufreq.txt - PCC interface documentation + * + * Copyright (C) 2009 Red Hat, Matthew Garrett <mjg@redhat.com> + * Copyright (C) 2009 Hewlett-Packard Development Company, L.P. + * Nagananda Chumbalkar <nagananda.chumbalkar@hp.com> + * + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; version 2 of the License. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or NON + * INFRINGEMENT. See the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, write to the Free Software Foundation, Inc., + * 675 Mass Ave, Cambridge, MA 02139, USA. + * + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + */ + + + Processor Clocking Control Driver + --------------------------------- + +Contents: +--------- +1. Introduction +1.1 PCC interface +1.1.1 Get Average Frequency +1.1.2 Set Desired Frequency +1.2 Platforms affected +2. Driver and /sys details +2.1 scaling_available_frequencies +2.2 cpuinfo_transition_latency +2.3 cpuinfo_cur_freq +2.4 related_cpus +3. Caveats + +1. Introduction: +---------------- +Processor Clocking Control (PCC) is an interface between the platform +firmware and OSPM. It is a mechanism for coordinating processor +performance (ie: frequency) between the platform firmware and the OS. + +The PCC driver (pcc-cpufreq) allows OSPM to take advantage of the PCC +interface. + +OS utilizes the PCC interface to inform platform firmware what frequency the +OS wants for a logical processor. The platform firmware attempts to achieve +the requested frequency. If the request for the target frequency could not be +satisfied by platform firmware, then it usually means that power budget +conditions are in place, and "power capping" is taking place. + +1.1 PCC interface: +------------------ +The complete PCC specification is available here: +http://www.acpica.org/download/Processor-Clocking-Control-v1p0.pdf + +PCC relies on a shared memory region that provides a channel for communication +between the OS and platform firmware. PCC also implements a "doorbell" that +is used by the OS to inform the platform firmware that a command has been +sent. + +The ACPI PCCH() method is used to discover the location of the PCC shared +memory region. The shared memory region header contains the "command" and +"status" interface. PCCH() also contains details on how to access the platform +doorbell. + +The following commands are supported by the PCC interface: +* Get Average Frequency +* Set Desired Frequency + +The ACPI PCCP() method is implemented for each logical processor and is +used to discover the offsets for the input and output buffers in the shared +memory region. + +When PCC mode is enabled, the platform will not expose processor performance +or throttle states (_PSS, _TSS and related ACPI objects) to OSPM. Therefore, +the native P-state driver (such as acpi-cpufreq for Intel, powernow-k8 for +AMD) will not load. + +However, OSPM remains in control of policy. The governor (eg: "ondemand") +computes the required performance for each processor based on server workload. +The PCC driver fills in the command interface, and the input buffer and +communicates the request to the platform firmware. The platform firmware is +responsible for delivering the requested performance. + +Each PCC command is "global" in scope and can affect all the logical CPUs in +the system. Therefore, PCC is capable of performing "group" updates. With PCC +the OS is capable of getting/setting the frequency of all the logical CPUs in +the system with a single call to the BIOS. + +1.1.1 Get Average Frequency: +---------------------------- +This command is used by the OSPM to query the running frequency of the +processor since the last time this command was completed. The output buffer +indicates the average unhalted frequency of the logical processor expressed as +a percentage of the nominal (ie: maximum) CPU frequency. The output buffer +also signifies if the CPU frequency is limited by a power budget condition. + +1.1.2 Set Desired Frequency: +---------------------------- +This command is used by the OSPM to communicate to the platform firmware the +desired frequency for a logical processor. The output buffer is currently +ignored by OSPM. The next invocation of "Get Average Frequency" will inform +OSPM if the desired frequency was achieved or not. + +1.2 Platforms affected: +----------------------- +The PCC driver will load on any system where the platform firmware: +* supports the PCC interface, and the associated PCCH() and PCCP() methods +* assumes responsibility for managing the hardware clocking controls in order +to deliver the requested processor performance + +Currently, certain HP ProLiant platforms implement the PCC interface. On those +platforms PCC is the "default" choice. + +However, it is possible to disable this interface via a BIOS setting. In +such an instance, as is also the case on platforms where the PCC interface +is not implemented, the PCC driver will fail to load silently. + +2. Driver and /sys details: +--------------------------- +When the driver loads, it merely prints the lowest and the highest CPU +frequencies supported by the platform firmware. + +The PCC driver loads with a message such as: +pcc-cpufreq: (v1.00.00) driver loaded with frequency limits: 1600 MHz, 2933 +MHz + +This means that the OPSM can request the CPU to run at any frequency in +between the limits (1600 MHz, and 2933 MHz) specified in the message. + +Internally, there is no need for the driver to convert the "target" frequency +to a corresponding P-state. + +The VERSION number for the driver will be of the format v.xy.ab. +eg: 1.00.02 + ----- -- + | | + | -- this will increase with bug fixes/enhancements to the driver + |-- this is the version of the PCC specification the driver adheres to + + +The following is a brief discussion on some of the fields exported via the +/sys filesystem and how their values are affected by the PCC driver: + +2.1 scaling_available_frequencies: +---------------------------------- +scaling_available_frequencies is not created in /sys. No intermediate +frequencies need to be listed because the BIOS will try to achieve any +frequency, within limits, requested by the governor. A frequency does not have +to be strictly associated with a P-state. + +2.2 cpuinfo_transition_latency: +------------------------------- +The cpuinfo_transition_latency field is 0. The PCC specification does +not include a field to expose this value currently. + +2.3 cpuinfo_cur_freq: +--------------------- +A) Often cpuinfo_cur_freq will show a value different than what is declared +in the scaling_available_frequencies or scaling_cur_freq, or scaling_max_freq. +This is due to "turbo boost" available on recent Intel processors. If certain +conditions are met the BIOS can achieve a slightly higher speed than requested +by OSPM. An example: + +scaling_cur_freq : 2933000 +cpuinfo_cur_freq : 3196000 + +B) There is a round-off error associated with the cpuinfo_cur_freq value. +Since the driver obtains the current frequency as a "percentage" (%) of the +nominal frequency from the BIOS, sometimes, the values displayed by +scaling_cur_freq and cpuinfo_cur_freq may not match. An example: + +scaling_cur_freq : 1600000 +cpuinfo_cur_freq : 1583000 + +In this example, the nominal frequency is 2933 MHz. The driver obtains the +current frequency, cpuinfo_cur_freq, as 54% of the nominal frequency: + + 54% of 2933 MHz = 1583 MHz + +Nominal frequency is the maximum frequency of the processor, and it usually +corresponds to the frequency of the P0 P-state. + +2.4 related_cpus: +----------------- +The related_cpus field is identical to affected_cpus. + +affected_cpus : 4 +related_cpus : 4 + +Currently, the PCC driver does not evaluate _PSD. The platforms that support +PCC do not implement SW_ALL. So OSPM doesn't need to perform any coordination +to ensure that the same frequency is requested of all dependent CPUs. + +3. Caveats: +----------- +The "cpufreq_stats" module in its present form cannot be loaded and +expected to work with the PCC driver. Since the "cpufreq_stats" module +provides information wrt each P-state, it is not applicable to the PCC driver. diff --git a/Documentation/cpu-freq/user-guide.txt b/Documentation/cpu-freq/user-guide.txt new file mode 100644 index 00000000..04f6b329 --- /dev/null +++ b/Documentation/cpu-freq/user-guide.txt @@ -0,0 +1,224 @@ + CPU frequency and voltage scaling code in the Linux(TM) kernel + + + L i n u x C P U F r e q + + U S E R G U I D E + + + Dominik Brodowski <linux@brodo.de> + + + + Clock scaling allows you to change the clock speed of the CPUs on the + fly. This is a nice method to save battery power, because the lower + the clock speed, the less power the CPU consumes. + + +Contents: +--------- +1. Supported Architectures and Processors +1.1 ARM +1.2 x86 +1.3 sparc64 +1.4 ppc +1.5 SuperH +1.6 Blackfin + +2. "Policy" / "Governor"? +2.1 Policy +2.2 Governor + +3. How to change the CPU cpufreq policy and/or speed +3.1 Preferred interface: sysfs + + + +1. Supported Architectures and Processors +========================================= + +1.1 ARM +------- + +The following ARM processors are supported by cpufreq: + +ARM Integrator +ARM-SA1100 +ARM-SA1110 +Intel PXA + + +1.2 x86 +------- + +The following processors for the x86 architecture are supported by cpufreq: + +AMD Elan - SC400, SC410 +AMD mobile K6-2+ +AMD mobile K6-3+ +AMD mobile Duron +AMD mobile Athlon +AMD Opteron +AMD Athlon 64 +Cyrix Media GXm +Intel mobile PIII and Intel mobile PIII-M on certain chipsets +Intel Pentium 4, Intel Xeon +Intel Pentium M (Centrino) +National Semiconductors Geode GX +Transmeta Crusoe +Transmeta Efficeon +VIA Cyrix 3 / C3 +various processors on some ACPI 2.0-compatible systems [*] + +[*] Only if "ACPI Processor Performance States" are available +to the ACPI<->BIOS interface. + + +1.3 sparc64 +----------- + +The following processors for the sparc64 architecture are supported by +cpufreq: + +UltraSPARC-III + + +1.4 ppc +------- + +Several "PowerBook" and "iBook2" notebooks are supported. + + +1.5 SuperH +---------- + +All SuperH processors supporting rate rounding through the clock +framework are supported by cpufreq. + +1.6 Blackfin +------------ + +The following Blackfin processors are supported by cpufreq: + +BF522, BF523, BF524, BF525, BF526, BF527, Rev 0.1 or higher +BF531, BF532, BF533, Rev 0.3 or higher +BF534, BF536, BF537, Rev 0.2 or higher +BF561, Rev 0.3 or higher +BF542, BF544, BF547, BF548, BF549, Rev 0.1 or higher + + +2. "Policy" / "Governor" ? +========================== + +Some CPU frequency scaling-capable processor switch between various +frequencies and operating voltages "on the fly" without any kernel or +user involvement. This guarantees very fast switching to a frequency +which is high enough to serve the user's needs, but low enough to save +power. + + +2.1 Policy +---------- + +On these systems, all you can do is select the lower and upper +frequency limit as well as whether you want more aggressive +power-saving or more instantly available processing power. + + +2.2 Governor +------------ + +On all other cpufreq implementations, these boundaries still need to +be set. Then, a "governor" must be selected. Such a "governor" decides +what speed the processor shall run within the boundaries. One such +"governor" is the "userspace" governor. This one allows the user - or +a yet-to-implement userspace program - to decide what specific speed +the processor shall run at. + + +3. How to change the CPU cpufreq policy and/or speed +==================================================== + +3.1 Preferred Interface: sysfs +------------------------------ + +The preferred interface is located in the sysfs filesystem. If you +mounted it at /sys, the cpufreq interface is located in a subdirectory +"cpufreq" within the cpu-device directory +(e.g. /sys/devices/system/cpu/cpu0/cpufreq/ for the first CPU). + +cpuinfo_min_freq : this file shows the minimum operating + frequency the processor can run at(in kHz) +cpuinfo_max_freq : this file shows the maximum operating + frequency the processor can run at(in kHz) +cpuinfo_transition_latency The time it takes on this CPU to + switch between two frequencies in nano + seconds. If unknown or known to be + that high that the driver does not + work with the ondemand governor, -1 + (CPUFREQ_ETERNAL) will be returned. + Using this information can be useful + to choose an appropriate polling + frequency for a kernel governor or + userspace daemon. Make sure to not + switch the frequency too often + resulting in performance loss. +scaling_driver : this file shows what cpufreq driver is + used to set the frequency on this CPU + +scaling_available_governors : this file shows the CPUfreq governors + available in this kernel. You can see the + currently activated governor in + +scaling_governor, and by "echoing" the name of another + governor you can change it. Please note + that some governors won't load - they only + work on some specific architectures or + processors. + +cpuinfo_cur_freq : Current frequency of the CPU as obtained from + the hardware, in KHz. This is the frequency + the CPU actually runs at. + +scaling_available_frequencies : List of available frequencies, in KHz. + +scaling_min_freq and +scaling_max_freq show the current "policy limits" (in + kHz). By echoing new values into these + files, you can change these limits. + NOTE: when setting a policy you need to + first set scaling_max_freq, then + scaling_min_freq. + +affected_cpus : List of CPUs that require software coordination + of frequency. + +related_cpus : List of CPUs that need some sort of frequency + coordination, whether software or hardware. + +scaling_driver : Hardware driver for cpufreq. + +scaling_cur_freq : Current frequency of the CPU as determined by + the governor and cpufreq core, in KHz. This is + the frequency the kernel thinks the CPU runs + at. + +bios_limit : If the BIOS tells the OS to limit a CPU to + lower frequencies, the user can read out the + maximum available frequency from this file. + This typically can happen through (often not + intended) BIOS settings, restrictions + triggered through a service processor or other + BIOS/HW based implementations. + This does not cover thermal ACPI limitations + which can be detected through the generic + thermal driver. + +If you have selected the "userspace" governor which allows you to +set the CPU operating frequency to a specific value, you can read out +the current frequency in + +scaling_setspeed. By "echoing" a new frequency into this + you can change the speed of the CPU, + but only within the limits of + scaling_min_freq and scaling_max_freq. |