summaryrefslogtreecommitdiffstats
path: root/Documentation/parse-date.txt
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/parse-date.txt')
-rw-r--r--Documentation/parse-date.txt468
1 files changed, 468 insertions, 0 deletions
diff --git a/Documentation/parse-date.txt b/Documentation/parse-date.txt
new file mode 100644
index 0000000..b23abe7
--- /dev/null
+++ b/Documentation/parse-date.txt
@@ -0,0 +1,468 @@
+
+NAME
+ parse_date - parses a date string into a timespec struct.
+
+SYNOPSIS
+ #include "timeutils.h"
+
+ int parse_date(struct timespec *result, char const *p,
+ struct timespec const *now)
+
+ LDADD libcommon.la
+
+DESCRIPTION
+ Parse a date/time string, storing the resulting time value into *result.
+ The string itself is pointed to by *p. Return 1 if successful.
+ *p can be an incomplete or relative time specification; if so, use
+ *now as the basis for the returned time.
+
+
+This function is based upon gnulib's parse-datetime.y-dd7a871.
+
+Below is a plain text version of the gnulib parse-datetime.texi-dd7a871 manual
+describing the input strings that are recognized.
+
+Any future modifications to the util-linux parser that affect input strings
+should be noted below.
+
+
+1 Date input formats
+********************
+
+First, a quote:
+
+ Our units of temporal measurement, from seconds on up to months,
+ are so complicated, asymmetrical and disjunctive so as to make
+ coherent mental reckoning in time all but impossible. Indeed, had
+ some tyrannical god contrived to enslave our minds to time, to
+ make it all but impossible for us to escape subjection to sodden
+ routines and unpleasant surprises, he could hardly have done
+ better than handing down our present system. It is like a set of
+ trapezoidal building blocks, with no vertical or horizontal
+ surfaces, like a language in which the simplest thought demands
+ ornate constructions, useless particles and lengthy
+ circumlocutions. Unlike the more successful patterns of language
+ and science, which enable us to face experience boldly or at least
+ level-headedly, our system of temporal calculation silently and
+ persistently encourages our terror of time.
+
+ ... It is as though architects had to measure length in feet,
+ width in meters and height in ells; as though basic instruction
+ manuals demanded a knowledge of five different languages. It is
+ no wonder then that we often look into our own immediate past or
+ future, last Tuesday or a week from Sunday, with feelings of
+ helpless confusion. ...
+
+ --Robert Grudin, `Time and the Art of Living'.
+
+ This section describes the textual date representations that GNU
+programs accept. These are the strings you, as a user, can supply as
+arguments to the various programs. The C interface (via the
+`parse_datetime' function) is not described here.
+
+1.1 General date syntax
+=======================
+
+A "date" is a string, possibly empty, containing many items separated
+by whitespace. The whitespace may be omitted when no ambiguity arises.
+The empty string means the beginning of today (i.e., midnight). Order
+of the items is immaterial. A date string may contain many flavors of
+items:
+
+ * calendar date items
+
+ * time of day items
+
+ * time zone items
+
+ * combined date and time of day items
+
+ * day of the week items
+
+ * relative items
+
+ * pure numbers.
+
+We describe each of these item types in turn, below.
+
+ A few ordinal numbers may be written out in words in some contexts.
+This is most useful for specifying day of the week items or relative
+items (see below). Among the most commonly used ordinal numbers, the
+word `last' stands for -1, `this' stands for 0, and `first' and `next'
+both stand for 1. Because the word `second' stands for the unit of
+time there is no way to write the ordinal number 2, but for convenience
+`third' stands for 3, `fourth' for 4, `fifth' for 5, `sixth' for 6,
+`seventh' for 7, `eighth' for 8, `ninth' for 9, `tenth' for 10,
+`eleventh' for 11 and `twelfth' for 12.
+
+ When a month is written this way, it is still considered to be
+written numerically, instead of being "spelled in full"; this changes
+the allowed strings.
+
+ In the current implementation, only English is supported for words
+and abbreviations like `AM', `DST', `EST', `first', `January',
+`Sunday', `tomorrow', and `year'.
+
+ The output of the `date' command is not always acceptable as a date
+string, not only because of the language problem, but also because
+there is no standard meaning for time zone items like `IST'. When using
+`date' to generate a date string intended to be parsed later, specify a
+date format that is independent of language and that does not use time
+zone items other than `UTC' and `Z'. Here are some ways to do this:
+
+ $ LC_ALL=C TZ=UTC0 date
+ Mon Mar 1 00:21:42 UTC 2004
+ $ TZ=UTC0 date +'%Y-%m-%d %H:%M:%SZ'
+ 2004-03-01 00:21:42Z
+ $ date --rfc-3339=ns # --rfc-3339 is a GNU extension.
+ 2004-02-29 16:21:42.692722128-08:00
+ $ date --rfc-2822 # a GNU extension
+ Sun, 29 Feb 2004 16:21:42 -0800
+ $ date +'%Y-%m-%d %H:%M:%S %z' # %z is a GNU extension.
+ 2004-02-29 16:21:42 -0800
+ $ date +'@%s.%N' # %s and %N are GNU extensions.
+ @1078100502.692722128
+
+ Alphabetic case is completely ignored in dates. Comments may be
+introduced between round parentheses, as long as included parentheses
+are properly nested. Hyphens not followed by a digit are currently
+ignored. Leading zeros on numbers are ignored.
+
+ Invalid dates like `2005-02-29' or times like `24:00' are rejected.
+In the typical case of a host that does not support leap seconds, a
+time like `23:59:60' is rejected even if it corresponds to a valid leap
+second.
+
+1.2 Calendar date items
+=======================
+
+A "calendar date item" specifies a day of the year. It is specified
+differently, depending on whether the month is specified numerically or
+literally. All these strings specify the same calendar date:
+
+ 1972-09-24 # ISO 8601.
+ 72-9-24 # Assume 19xx for 69 through 99,
+ # 20xx for 00 through 68.
+ 72-09-24 # Leading zeros are ignored.
+ 9/24/72 # Common U.S. writing.
+ 24 September 1972
+ 24 Sept 72 # September has a special abbreviation.
+ 24 Sep 72 # Three-letter abbreviations always allowed.
+ Sep 24, 1972
+ 24-sep-72
+ 24sep72
+
+ The year can also be omitted. In this case, the last specified year
+is used, or the current year if none. For example:
+
+ 9/24
+ sep 24
+
+ Here are the rules.
+
+ For numeric months, the ISO 8601 format `YEAR-MONTH-DAY' is allowed,
+where YEAR is any positive number, MONTH is a number between 01 and 12,
+and DAY is a number between 01 and 31. A leading zero must be present
+if a number is less than ten. If YEAR is 68 or smaller, then 2000 is
+added to it; otherwise, if YEAR is less than 100, then 1900 is added to
+it. The construct `MONTH/DAY/YEAR', popular in the United States, is
+accepted. Also `MONTH/DAY', omitting the year.
+
+ Literal months may be spelled out in full: `January', `February',
+`March', `April', `May', `June', `July', `August', `September',
+`October', `November' or `December'. Literal months may be abbreviated
+to their first three letters, possibly followed by an abbreviating dot.
+It is also permitted to write `Sept' instead of `September'.
+
+ When months are written literally, the calendar date may be given as
+any of the following:
+
+ DAY MONTH YEAR
+ DAY MONTH
+ MONTH DAY YEAR
+ DAY-MONTH-YEAR
+
+ Or, omitting the year:
+
+ MONTH DAY
+
+1.3 Time of day items
+=====================
+
+A "time of day item" in date strings specifies the time on a given day.
+Here are some examples, all of which represent the same time:
+
+ 20:02:00.000000
+ 20:02
+ 8:02pm
+ 20:02-0500 # In EST (U.S. Eastern Standard Time).
+
+ More generally, the time of day may be given as
+`HOUR:MINUTE:SECOND', where HOUR is a number between 0 and 23, MINUTE
+is a number between 0 and 59, and SECOND is a number between 0 and 59
+possibly followed by `.' or `,' and a fraction containing one or more
+digits. Alternatively, `:SECOND' can be omitted, in which case it is
+taken to be zero. On the rare hosts that support leap seconds, SECOND
+may be 60.
+
+ If the time is followed by `am' or `pm' (or `a.m.' or `p.m.'), HOUR
+is restricted to run from 1 to 12, and `:MINUTE' may be omitted (taken
+to be zero). `am' indicates the first half of the day, `pm' indicates
+the second half of the day. In this notation, 12 is the predecessor of
+1: midnight is `12am' while noon is `12pm'. (This is the zero-oriented
+interpretation of `12am' and `12pm', as opposed to the old tradition
+derived from Latin which uses `12m' for noon and `12pm' for midnight.)
+
+ The time may alternatively be followed by a time zone correction,
+expressed as `SHHMM', where S is `+' or `-', HH is a number of zone
+hours and MM is a number of zone minutes. The zone minutes term, MM,
+may be omitted, in which case the one- or two-digit correction is
+interpreted as a number of hours. You can also separate HH from MM
+with a colon. When a time zone correction is given this way, it forces
+interpretation of the time relative to Coordinated Universal Time
+(UTC), overriding any previous specification for the time zone or the
+local time zone. For example, `+0530' and `+05:30' both stand for the
+time zone 5.5 hours ahead of UTC (e.g., India). This is the best way to
+specify a time zone correction by fractional parts of an hour. The
+maximum zone correction is 24 hours.
+
+ Either `am'/`pm' or a time zone correction may be specified, but not
+both.
+
+1.4 Time zone items
+===================
+
+A "time zone item" specifies an international time zone, indicated by a
+small set of letters, e.g., `UTC' or `Z' for Coordinated Universal
+Time. Any included periods are ignored. By following a
+non-daylight-saving time zone by the string `DST' in a separate word
+(that is, separated by some white space), the corresponding daylight
+saving time zone may be specified. Alternatively, a
+non-daylight-saving time zone can be followed by a time zone
+correction, to add the two values. This is normally done only for
+`UTC'; for example, `UTC+05:30' is equivalent to `+05:30'.
+
+ Time zone items other than `UTC' and `Z' are obsolescent and are not
+recommended, because they are ambiguous; for example, `EST' has a
+different meaning in Australia than in the United States. Instead,
+it's better to use unambiguous numeric time zone corrections like
+`-0500', as described in the previous section.
+
+ If neither a time zone item nor a time zone correction is supplied,
+timestamps are interpreted using the rules of the default time zone
+(*note Specifying time zone rules::).
+
+1.5 Combined date and time of day items
+=======================================
+
+The ISO 8601 date and time of day extended format consists of an ISO
+8601 date, a `T' character separator, and an ISO 8601 time of day.
+This format is also recognized if the `T' is replaced by a space.
+
+ In this format, the time of day should use 24-hour notation.
+Fractional seconds are allowed, with either comma or period preceding
+the fraction. ISO 8601 fractional minutes and hours are not supported.
+Typically, hosts support nanosecond timestamp resolution; excess
+precision is silently discarded.
+
+ Here are some examples:
+
+ 2012-09-24T20:02:00.052-05:00
+ 2012-12-31T23:59:59,999999999+11:00
+ 1970-01-01 00:00Z
+
+1.6 Day of week items
+=====================
+
+The explicit mention of a day of the week will forward the date (only
+if necessary) to reach that day of the week in the future.
+
+ Days of the week may be spelled out in full: `Sunday', `Monday',
+`Tuesday', `Wednesday', `Thursday', `Friday' or `Saturday'. Days may
+be abbreviated to their first three letters, optionally followed by a
+period. The special abbreviations `Tues' for `Tuesday', `Wednes' for
+`Wednesday' and `Thur' or `Thurs' for `Thursday' are also allowed.
+
+ A number may precede a day of the week item to move forward
+supplementary weeks. It is best used in expression like `third
+monday'. In this context, `last DAY' or `next DAY' is also acceptable;
+they move one week before or after the day that DAY by itself would
+represent.
+
+ A comma following a day of the week item is ignored.
+
+1.7 Relative items in date strings
+==================================
+
+"Relative items" adjust a date (or the current date if none) forward or
+backward. The effects of relative items accumulate. Here are some
+examples:
+
+ 1 year
+ 1 year ago
+ 3 years
+ 2 days
+
+ The unit of time displacement may be selected by the string `year'
+or `month' for moving by whole years or months. These are fuzzy units,
+as years and months are not all of equal duration. More precise units
+are `fortnight' which is worth 14 days, `week' worth 7 days, `day'
+worth 24 hours, `hour' worth 60 minutes, `minute' or `min' worth 60
+seconds, and `second' or `sec' worth one second. An `s' suffix on
+these units is accepted and ignored.
+
+ The unit of time may be preceded by a multiplier, given as an
+optionally signed number. Unsigned numbers are taken as positively
+signed. No number at all implies 1 for a multiplier. Following a
+relative item by the string `ago' is equivalent to preceding the unit
+by a multiplier with value -1.
+
+ The string `tomorrow' is worth one day in the future (equivalent to
+`day'), the string `yesterday' is worth one day in the past (equivalent
+to `day ago').
+
+ The strings `now' or `today' are relative items corresponding to
+zero-valued time displacement, these strings come from the fact a
+zero-valued time displacement represents the current time when not
+otherwise changed by previous items. They may be used to stress other
+items, like in `12:00 today'. The string `this' also has the meaning
+of a zero-valued time displacement, but is preferred in date strings
+like `this thursday'.
+
+ When a relative item causes the resulting date to cross a boundary
+where the clocks were adjusted, typically for daylight saving time, the
+resulting date and time are adjusted accordingly.
+
+ The fuzz in units can cause problems with relative items. For
+example, `2003-07-31 -1 month' might evaluate to 2003-07-01, because
+2003-06-31 is an invalid date. To determine the previous month more
+reliably, you can ask for the month before the 15th of the current
+month. For example:
+
+ $ date -R
+ Thu, 31 Jul 2003 13:02:39 -0700
+ $ date --date='-1 month' +'Last month was %B?'
+ Last month was July?
+ $ date --date="$(date +%Y-%m-15) -1 month" +'Last month was %B!'
+ Last month was June!
+
+ Also, take care when manipulating dates around clock changes such as
+daylight saving leaps. In a few cases these have added or subtracted
+as much as 24 hours from the clock, so it is often wise to adopt
+universal time by setting the `TZ' environment variable to `UTC0'
+before embarking on calendrical calculations.
+
+1.8 Pure numbers in date strings
+================================
+
+The precise interpretation of a pure decimal number depends on the
+context in the date string.
+
+ If the decimal number is of the form YYYYMMDD and no other calendar
+date item (*note Calendar date items::) appears before it in the date
+string, then YYYY is read as the year, MM as the month number and DD as
+the day of the month, for the specified calendar date.
+
+ If the decimal number is of the form HHMM and no other time of day
+item appears before it in the date string, then HH is read as the hour
+of the day and MM as the minute of the hour, for the specified time of
+day. MM can also be omitted.
+
+ If both a calendar date and a time of day appear to the left of a
+number in the date string, but no relative item, then the number
+overrides the year.
+
+1.9 Seconds since the Epoch
+===========================
+
+If you precede a number with `@', it represents an internal timestamp
+as a count of seconds. The number can contain an internal decimal
+point (either `.' or `,'); any excess precision not supported by the
+internal representation is truncated toward minus infinity. Such a
+number cannot be combined with any other date item, as it specifies a
+complete timestamp.
+
+ Internally, computer times are represented as a count of seconds
+since an epoch--a well-defined point of time. On GNU and POSIX
+systems, the epoch is 1970-01-01 00:00:00 UTC, so `@0' represents this
+time, `@1' represents 1970-01-01 00:00:01 UTC, and so forth. GNU and
+most other POSIX-compliant systems support such times as an extension
+to POSIX, using negative counts, so that `@-1' represents 1969-12-31
+23:59:59 UTC.
+
+ Traditional Unix systems count seconds with 32-bit two's-complement
+integers and can represent times from 1901-12-13 20:45:52 through
+2038-01-19 03:14:07 UTC. More modern systems use 64-bit counts of
+seconds with nanosecond subcounts, and can represent all the times in
+the known lifetime of the universe to a resolution of 1 nanosecond.
+
+ On most hosts, these counts ignore the presence of leap seconds.
+For example, on most hosts `@915148799' represents 1998-12-31 23:59:59
+UTC, `@915148800' represents 1999-01-01 00:00:00 UTC, and there is no
+way to represent the intervening leap second 1998-12-31 23:59:60 UTC.
+
+1.10 Specifying time zone rules
+===============================
+
+Normally, dates are interpreted using the rules of the current time
+zone, which in turn are specified by the `TZ' environment variable, or
+by a system default if `TZ' is not set. To specify a different set of
+default time zone rules that apply just to one date, start the date
+with a string of the form `TZ="RULE"'. The two quote characters (`"')
+must be present in the date, and any quotes or backslashes within RULE
+must be escaped by a backslash.
+
+ For example, with the GNU `date' command you can answer the question
+"What time is it in New York when a Paris clock shows 6:30am on October
+31, 2004?" by using a date beginning with `TZ="Europe/Paris"' as shown
+in the following shell transcript:
+
+ $ export TZ="America/New_York"
+ $ date --date='TZ="Europe/Paris" 2004-10-31 06:30'
+ Sun Oct 31 01:30:00 EDT 2004
+
+ In this example, the `--date' operand begins with its own `TZ'
+setting, so the rest of that operand is processed according to
+`Europe/Paris' rules, treating the string `2004-10-31 06:30' as if it
+were in Paris. However, since the output of the `date' command is
+processed according to the overall time zone rules, it uses New York
+time. (Paris was normally six hours ahead of New York in 2004, but
+this example refers to a brief Halloween period when the gap was five
+hours.)
+
+ A `TZ' value is a rule that typically names a location in the `tz'
+database (http://www.twinsun.com/tz/tz-link.htm). A recent catalog of
+location names appears in the TWiki Date and Time Gateway
+(http://twiki.org/cgi-bin/xtra/tzdate). A few non-GNU hosts require a
+colon before a location name in a `TZ' setting, e.g.,
+`TZ=":America/New_York"'.
+
+ The `tz' database includes a wide variety of locations ranging from
+`Arctic/Longyearbyen' to `Antarctica/South_Pole', but if you are at sea
+and have your own private time zone, or if you are using a non-GNU host
+that does not support the `tz' database, you may need to use a POSIX
+rule instead. Simple POSIX rules like `UTC0' specify a time zone
+without daylight saving time; other rules can specify simple daylight
+saving regimes. *Note Specifying the Time Zone with `TZ': (libc)TZ
+Variable.
+
+1.11 Authors of `parse_datetime'
+================================
+
+`parse_datetime' started life as `getdate', as originally implemented
+by Steven M. Bellovin (<smb@research.att.com>) while at the University
+of North Carolina at Chapel Hill. The code was later tweaked by a
+couple of people on Usenet, then completely overhauled by Rich $alz
+(<rsalz@bbn.com>) and Jim Berets (<jberets@bbn.com>) in August, 1990.
+Various revisions for the GNU system were made by David MacKenzie, Jim
+Meyering, Paul Eggert and others, including renaming it to `get_date' to
+avoid a conflict with the alternative Posix function `getdate', and a
+later rename to `parse_datetime'. The Posix function `getdate' can
+parse more locale-specific dates using `strptime', but relies on an
+environment variable and external file, and lacks the thread-safety of
+`parse_datetime'.
+
+ This chapter was originally produced by François Pinard
+(<pinard@iro.umontreal.ca>) from the `parse_datetime.y' source code,
+and then edited by K. Berry (<kb@cs.umb.edu>).
+