author    Daniel Baumann <daniel.baumann@progress-linux.org>  2024-05-04 12:15:05 +0000
committer Daniel Baumann <daniel.baumann@progress-linux.org>  2024-05-04 12:15:05 +0000
commit    46651ce6fe013220ed397add242004d764fc0153 (patch)
tree      6e5299f990f88e60174a1d3ae6e48eedd2688b2b /doc/src/sgml/ref/pgtesttiming.sgml
parent    Initial commit. (diff)
Adding upstream version 14.5.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/src/sgml/ref/pgtesttiming.sgml')
-rw-r--r--  doc/src/sgml/ref/pgtesttiming.sgml  303
1 files changed, 303 insertions, 0 deletions
diff --git a/doc/src/sgml/ref/pgtesttiming.sgml b/doc/src/sgml/ref/pgtesttiming.sgml
new file mode 100644
index 0000000..a5eb3aa
--- /dev/null
+++ b/doc/src/sgml/ref/pgtesttiming.sgml
@@ -0,0 +1,303 @@
+<!--
+doc/src/sgml/ref/pgtesttiming.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="pgtesttiming">
+ <indexterm zone="pgtesttiming">
+ <primary>pg_test_timing</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle><application>pg_test_timing</application></refentrytitle>
+ <manvolnum>1</manvolnum>
+ <refmiscinfo>Application</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>pg_test_timing</refname>
+ <refpurpose>measure timing overhead</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+ <cmdsynopsis>
+ <command>pg_test_timing</command>
+ <arg rep="repeat"><replaceable>option</replaceable></arg>
+ </cmdsynopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <application>pg_test_timing</application> is a tool to measure the timing overhead
+ on your system and confirm that the system time never moves backwards.
+ Systems that are slow to collect timing data can give less accurate
+ <command>EXPLAIN ANALYZE</command> results.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Options</title>
+
+ <para>
+ <application>pg_test_timing</application> accepts the following
+ command-line options:
+
+ <variablelist>
+
+ <varlistentry>
+ <term><option>-d <replaceable class="parameter">duration</replaceable></option></term>
+ <term><option>--duration=<replaceable class="parameter">duration</replaceable></option></term>
+ <listitem>
+ <para>
+ Specifies the test duration, in seconds. Longer durations
+ give slightly better accuracy, and are more likely to discover
+ problems with the system clock moving backwards. The default
+ test duration is 3 seconds.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-V</option></term>
+ <term><option>--version</option></term>
+ <listitem>
+ <para>
+ Print the <application>pg_test_timing</application> version and exit.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-?</option></term>
+ <term><option>--help</option></term>
+ <listitem>
+ <para>
+ Show help about <application>pg_test_timing</application> command line
+ arguments, and exit.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </para>
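+
+ <para>
+ For example, a five-second test run could be requested like this:
+
+<screen>
+pg_test_timing --duration=5
+</screen>
+ </para>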
+
+ </refsect1>
+
+ <refsect1>
+ <title>Usage</title>
+
+ <refsect2>
+ <title>Interpreting Results</title>
+
+ <para>
+ Good results will show most (>90%) individual timing calls take less than
+ one microsecond. Average per loop overhead will be even lower, below 100
+ nanoseconds. This example from an Intel i7-860 system using a TSC clock
+ source shows excellent performance:
+
+<screen><![CDATA[
+Testing timing overhead for 3 seconds.
+Per loop time including overhead: 35.96 ns
+Histogram of timing durations:
+ < us % of total count
+ 1 96.40465 80435604
+ 2 3.59518 2999652
+ 4 0.00015 126
+ 8 0.00002 13
+ 16 0.00000 2
+]]></screen>
+ </para>
+
+ <para>
+ Note that the per loop time and the histogram use different units. The
+ loop time can be resolved down to a few nanoseconds (ns), while the
+ individual timing calls can only be resolved down to one microsecond
+ (us).
+ </para>
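+
+ <para>
+ As a rough consistency check, the per loop time and the histogram agree:
+ dividing the test duration by the per loop time gives approximately the
+ sum of the count column above:
+
+<screen>
+3 s / 35.96 ns per loop            ~ 83.4 million timing calls
+80435604 + 2999652 + 126 + 13 + 2  = 83435397 calls counted
+</screen>
+ </para>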
+
+ </refsect2>
+ <refsect2>
+ <title>Measuring Executor Timing Overhead</title>
+
+ <para>
+ When the query executor is running a statement using
+ <command>EXPLAIN ANALYZE</command>, individual operations are timed as well
+ as showing a summary. The overhead of your system can be checked by
+ counting rows with the <application>psql</application> program:
+
+<screen>
+CREATE TABLE t AS SELECT * FROM generate_series(1,100000);
+\timing
+SELECT COUNT(*) FROM t;
+EXPLAIN ANALYZE SELECT COUNT(*) FROM t;
+</screen>
+ </para>
+
+ <para>
+ On the i7-860 system measured above, the count query runs in 9.8 ms while
+ the <command>EXPLAIN ANALYZE</command> version takes 16.6 ms, each
+ processing just over 100,000 rows. That 6.8 ms difference means the timing
+ overhead per row is 68 ns, about twice what pg_test_timing estimated it
+ would be. Even that relatively small amount of overhead is making the fully
+ timed count statement take almost 70% longer. On more substantial queries,
+ the timing overhead would be less problematic.
+ </para>
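+
+ <para>
+ The arithmetic behind those figures is:
+
+<screen>
+(16.6 ms - 9.8 ms) / 100000 rows  ~ 68 ns of timing overhead per row
+16.6 ms / 9.8 ms                  ~ 1.69, i.e. almost 70% longer
+</screen>
+
+ The per-row overhead being roughly twice the per-call figure reported by
+ <application>pg_test_timing</application> is consistent with about two
+ timing calls being made for each row processed.
+ </para>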
+
+ </refsect2>
+
+ <refsect2>
+ <title>Changing Time Sources</title>
+ <para>
+ On some newer Linux systems, it's possible to change the clock source used
+ to collect timing data at any time. A second example shows the slowdown
+ possible from switching to the slower acpi_pm time source, on the same
+ system used for the fast results above:
+
+<screen><![CDATA[
+# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
+tsc hpet acpi_pm
+# echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource
+# pg_test_timing
+Per loop time including overhead: 722.92 ns
+Histogram of timing durations:
+ < us % of total count
+ 1 27.84870 1155682
+ 2 72.05956 2990371
+ 4 0.07810 3241
+ 8 0.01357 563
+ 16 0.00007 3
+]]></screen>
+ </para>
+
+ <para>
+ In this configuration, the sample <command>EXPLAIN ANALYZE</command> above
+ takes 115.9 ms. That's 1061 ns of timing overhead, again a small multiple
+ of what's measured directly by this utility. That much timing overhead
+ means the actual query itself takes only a tiny fraction of the
+ accounted-for time; most of it is consumed by overhead instead. In
+ this configuration, any <command>EXPLAIN ANALYZE</command> totals involving
+ many timed operations would be inflated significantly by timing overhead.
+ </para>
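+
+ <para>
+ The same arithmetic applied to the slower clock source gives:
+
+<screen>
+(115.9 ms - 9.8 ms) / 100000 rows  ~ 1061 ns of timing overhead per row
+</screen>
+ </para>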
+
+ <para>
+ FreeBSD also allows changing the time source on the fly, and it logs
+ information about the timer selected during boot:
+
+<screen>
+# dmesg | grep "Timecounter"
+Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
+Timecounter "i8254" frequency 1193182 Hz quality 0
+Timecounters tick every 10.000 msec
+Timecounter "TSC" frequency 2531787134 Hz quality 800
+# sysctl kern.timecounter.hardware=TSC
+kern.timecounter.hardware: ACPI-fast -> TSC
+</screen>
+ </para>
+
+ <para>
+ Other systems may only allow setting the time source on boot. On older
+ Linux systems the "clock" kernel setting is the only way to make this sort
+ of change. And even on some more recent ones, the only option you'll see
+ for a clock source is "jiffies". Jiffies are the older Linux software clock
+ implementation, which can have good resolution when it's backed by fast
+ enough timing hardware, as in this example:
+
+<screen><![CDATA[
+$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
+jiffies
+$ dmesg | grep time.c
+time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer.
+time.c: Detected 2400.153 MHz processor.
+$ pg_test_timing
+Testing timing overhead for 3 seconds.
+Per timing duration including loop overhead: 97.75 ns
+Histogram of timing durations:
+ < us % of total count
+ 1 90.23734 27694571
+ 2 9.75277 2993204
+ 4 0.00981 3010
+ 8 0.00007 22
+ 16 0.00000 1
+ 32 0.00000 1
+]]></screen></para>
+
+ </refsect2>
+
+ <refsect2>
+ <title>Clock Hardware and Timing Accuracy</title>
+
+ <para>
+ Collecting accurate timing information is normally done on computers using
+ hardware clocks with various levels of accuracy. With some hardware the
+ operating systems can pass the system clock time almost directly to
+ programs. A system clock can also be derived from a chip that simply
+ provides timing interrupts, periodic ticks at some known time interval. In
+ either case, operating system kernels provide a clock source that hides
+ these details. But the accuracy of that clock source and how quickly it can
+ return results varies based on the underlying hardware.
+ </para>
+
+ <para>
+ Inaccurate time keeping can result in system instability. Test any change
+ to the clock source very carefully. Operating system defaults are sometimes
+ made to favor reliability over best accuracy. And if you are using a virtual
+ machine, look into the recommended time sources compatible with it. Virtual
+ hardware faces additional difficulties when emulating timers, and there are
+ often per operating system settings suggested by vendors.
+ </para>
+
+ <para>
+ The Time Stamp Counter (TSC) clock source is the most accurate one available
+ on current generation CPUs. It's the preferred way to track the system time
+ when it's supported by the operating system and the TSC clock is
+ reliable. There are several ways that TSC can fail to provide an accurate
+ timing source, making it unreliable. Older systems can have a TSC clock that
+ varies based on the CPU temperature, making it unusable for timing. Trying
+ to use TSC on some older multicore CPUs can give a reported time that's
+ inconsistent among multiple cores. This can result in the time going
+ backwards, a problem this program checks for. And even the newest systems
+ can fail to provide accurate TSC timing with very aggressive power saving
+ configurations.
+ </para>
+
+ <para>
+ Newer operating systems may check for the known TSC problems and switch to a
+ slower, more stable clock source when they are seen. If your system
+ supports TSC time but doesn't default to that, it may be disabled for a good
+ reason. And some operating systems may not detect all the possible problems
+ correctly, or will allow using TSC even in situations where it's known to be
+ inaccurate.
+ </para>
+
+ <para>
+ The High Precision Event Timer (HPET) is the preferred timer on systems
+ where it's available and TSC is not accurate. The timer chip itself is
+ programmable to allow up to 100 nanosecond resolution, but you may not see
+ that much accuracy in your system clock.
+ </para>
+
+ <para>
+ Advanced Configuration and Power Interface (ACPI) provides a Power
+ Management (PM) Timer, which Linux refers to as the acpi_pm. The clock
+ derived from acpi_pm will at best provide 300 nanosecond resolution.
+ </para>
+
+ <para>
+ Timers used on older PC hardware include the 8254 Programmable Interval
+ Timer (PIT), the real-time clock (RTC), the Advanced Programmable Interrupt
+ Controller (APIC) timer, and the Cyclone timer. These timers aim for
+ millisecond resolution.
+ </para>
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-explain"/></member>
+ </simplelist>
+ </refsect1>
+</refentry>