-rw-r--r-- Documentation/power/energy-model.rst 162
1 files changed, 162 insertions, 0 deletions
diff --git a/Documentation/power/energy-model.rst b/Documentation/power/energy-model.rst
new file mode 100644
index 000000000..a6fb986ab
--- /dev/null
+++ b/Documentation/power/energy-model.rst
@@ -0,0 +1,162 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================
+Energy Model of devices
+=======================
+
+1. Overview
+-----------
+
+The Energy Model (EM) framework serves as an interface between drivers knowing
+the power consumed by devices at various performance levels, and the kernel
+subsystems willing to use that information to make energy-aware decisions.
+
+The source of the information about the power consumed by devices can vary
+greatly from one platform to another. These power costs can be estimated using
+devicetree data in some cases. In others, the firmware will know better.
+Alternatively, userspace might be best positioned. And so on. To spare each
+client subsystem from re-implementing support for every possible source of
+information on its own, the EM framework acts as an abstraction layer which
+standardizes the format of power cost tables in the kernel, thereby avoiding
+redundant work.
+
+The figure below depicts an example of drivers (Arm-specific here, but the
+approach is applicable to any architecture) providing power costs to the EM
+framework, and interested clients reading the data from it::
+
+       +---------------+  +-----------------+  +---------------+
+       | Thermal (IPA) |  | Scheduler (EAS) |  |     Other     |
+       +---------------+  +-----------------+  +---------------+
+               |                   | em_cpu_energy()   |
+               |                   | em_cpu_get()      |
+               +---------+         |         +---------+
+                         |         |         |
+                         v         v         v
+                        +---------------------+
+                        |    Energy Model     |
+                        |     Framework       |
+                        +---------------------+
+                           ^       ^       ^
+                           |       |       | em_dev_register_perf_domain()
+                +----------+       |       +---------+
+                |                  |                 |
+        +---------------+  +---------------+  +--------------+
+        |  cpufreq-dt   |  |   arm_scmi    |  |    Other     |
+        +---------------+  +---------------+  +--------------+
+                ^                  ^                 ^
+                |                  |                 |
+        +--------------+   +---------------+  +--------------+
+        | Device Tree  |   |   Firmware    |  |      ?       |
+        +--------------+   +---------------+  +--------------+
+
+In the case of CPU devices, the EM framework manages power cost tables per
+'performance domain' in the system. A performance domain is a group of CPUs
+whose performance is scaled together. Performance domains generally have a
+1-to-1 mapping with CPUFreq policies. All CPUs in a performance domain are
+required to have the same micro-architecture. CPUs in different performance
+domains can have different micro-architectures.
+
+
+2. Core APIs
+------------
+
+2.1 Config options
+^^^^^^^^^^^^^^^^^^
+
+CONFIG_ENERGY_MODEL must be enabled to use the EM framework.
+
+
+2.2 Registration of performance domains
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Drivers are expected to register performance domains into the EM framework by
+calling the following API::
+
+  int em_dev_register_perf_domain(struct device *dev, unsigned int nr_states,
+		struct em_data_callback *cb, cpumask_t *cpus);
+
+Drivers must provide a callback function returning <frequency, power> tuples
+for each performance state. The callback function provided by the driver is free
+to fetch data from any relevant location (DT, firmware, ...), and by any means
+deemed necessary. For CPU devices only, drivers must specify the CPUs of the
+performance domain using a cpumask. For devices other than CPUs, the last
+argument must be set to NULL.
+See Section 3 for an example of a driver implementing this
+callback, and kernel/power/energy_model.c for further documentation on this
+API.
+
+
+2.3 Accessing performance domains
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+There are two API functions which provide access to the energy model:
+em_cpu_get(), which takes a CPU id as an argument, and em_pd_get(), which takes
+a device pointer as an argument. Which interface to use is up to the
+subsystem, but in the case of CPU devices both functions return the same
+performance domain.
+
+Subsystems interested in the energy model of a CPU can retrieve it using the
+em_cpu_get() API. The energy model tables are allocated once upon creation of
+the performance domains, and kept in memory untouched.
+
+The energy consumed by a performance domain can be estimated using the
+em_cpu_energy() API. For CPU devices, the estimation assumes that the schedutil
+CPUfreq governor is in use. Currently this calculation is not provided for
+other types of devices.
+
+More details about the above APIs can be found in include/linux/energy_model.h.
+
+
+3. Example driver
+-----------------
+
+This section provides a simple example of a CPUFreq driver registering a
+performance domain in the Energy Model framework using the (fake) 'foo'
+protocol. The driver implements an est_power() function to be provided to the
+EM framework::
+
+  -> drivers/cpufreq/foo_cpufreq.c
+
+  static int est_power(unsigned long *mW, unsigned long *KHz,
+  		struct device *dev)
+  {
+  	long freq, power;
+
+  	/* Use the 'foo' protocol to ceil the frequency */
+  	freq = foo_get_freq_ceil(dev, *KHz);
+  	if (freq < 0)
+  		return freq;
+
+  	/* Estimate the power cost for the dev at the relevant freq. */
+  	power = foo_estimate_power(dev, freq);
+  	if (power < 0)
+  		return power;
+
+  	/* Return the values to the EM framework */
+  	*mW = power;
+  	*KHz = freq;
+
+  	return 0;
+  }
+
+  static int foo_cpufreq_init(struct cpufreq_policy *policy)
+  {
+  	struct em_data_callback em_cb = EM_DATA_CB(est_power);
+  	struct device *cpu_dev;
+  	int nr_opp, ret;
+
+  	cpu_dev = get_cpu_device(cpumask_first(policy->cpus));
+
+  	/* Do the actual CPUFreq init work ... */
+  	ret = do_foo_cpufreq_init(policy);
+  	if (ret)
+  		return ret;
+
+  	/* Find the number of OPPs for this policy */
+  	nr_opp = foo_get_nr_opp(policy);
+
+  	/* And register the new performance domain */
+  	em_dev_register_perf_domain(cpu_dev, nr_opp, &em_cb, policy->cpus);
+
+  	return 0;
+  }
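
The callback contract described in Section 2.2 of the patch (the driver is handed a frequency, rounds it up to a supported one, and returns the matching power) can be tried out in userspace. The following is a minimal sketch under stated assumptions, not kernel code: `struct perf_state`, `active_power_cb`, `foo_opps` and `build_table()` are hypothetical names, the OPP values are made up, and the loop shows only one plausible way a framework could walk the performance states through such a callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct perf_state {
	unsigned long frequency;	/* kHz */
	unsigned long power;		/* mW */
};

typedef int (*active_power_cb)(unsigned long *mW, unsigned long *KHz,
			       void *dev);

/* Made-up table standing in for data from DT or firmware. */
static const struct perf_state foo_opps[] = {
	{  500000, 100 },
	{ 1000000, 250 },
	{ 1500000, 600 },
};
#define FOO_NR_OPPS (sizeof(foo_opps) / sizeof(foo_opps[0]))

/* Ceil *KHz to the next supported frequency and report its power. */
static int est_power(unsigned long *mW, unsigned long *KHz, void *dev)
{
	size_t i;

	(void)dev;	/* unused in this sketch */

	for (i = 0; i < FOO_NR_OPPS; i++) {
		if (foo_opps[i].frequency >= *KHz) {
			*KHz = foo_opps[i].frequency;
			*mW = foo_opps[i].power;
			return 0;
		}
	}

	return -1;	/* nothing at or above the requested frequency */
}

/*
 * Fill 'table' with nr_states entries by repeatedly asking the callback
 * for the lowest frequency strictly above the previous state.
 */
static int build_table(struct perf_state *table, unsigned int nr_states,
		       active_power_cb cb, void *dev)
{
	unsigned long freq = 0, power = 0;
	unsigned int i;

	assert(table && cb);

	for (i = 0; i < nr_states; i++) {
		freq++;		/* step just past the previous state */
		if (cb(&power, &freq, dev))
			return -1;
		table[i].frequency = freq;
		table[i].power = power;
	}

	return 0;
}
```

Calling `build_table(table, 3, est_power, NULL)` fills the three entries in ascending frequency order. Requesting a fourth state fails because the callback has no frequency left to round up to, which is why the number of states passed at registration has to match what the callback can actually provide.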