Diffstat (limited to 'third_party/python/ply/CHANGES')
-rw-r--r-- third_party/python/ply/CHANGES 1394
1 files changed, 1394 insertions, 0 deletions
diff --git a/third_party/python/ply/CHANGES b/third_party/python/ply/CHANGES
new file mode 100644
index 0000000000..815c23184e
--- /dev/null
+++ b/third_party/python/ply/CHANGES
@@ -0,0 +1,1394 @@
+Version 3.10
+---------------------
+01/31/17: beazley
+ Changed grammar signature computation to not involve hashing
+ functions. Parts are just combined into a big string.
+
+10/07/16: beazley
+ Fixed Issue #101: Incorrect shift-reduce conflict resolution with
+ precedence specifier.
+
+ PLY was incorrectly resolving shift-reduce conflicts in certain
+ cases. For example, in the example/calc/calc.py example, you
+ could trigger it doing this:
+
+ calc > -3 - 4
+ 1 (correct answer should be -7)
+ calc >
+
+ Issue and suggested patch contributed by https://github.com/RomaVis
+
+Version 3.9
+---------------------
+08/30/16: beazley
+ Exposed the parser state number as the parser.state attribute
+ in productions and error functions. For example:
+
+ def p_somerule(p):
+     '''
+     rule : A B C
+     '''
+     print('State:', p.parser.state)
+
+ May address issue #65 (publish current state in error callback).
+
+08/30/16: beazley
+ Fixed Issue #88. Python3 compatibility with ply/cpp.
+
+08/30/16: beazley
+ Fixed Issue #93. Ply can crash if SyntaxError is raised inside
+ a production. Not actually sure if the original implementation
+ worked as documented at all. Yacc has been modified to follow
+ the spec as outlined in the CHANGES noted for 11/27/07 below.
+
+08/30/16: beazley
+ Fixed Issue #97. Failure with code validation when the original
+ source files aren't present. Validation step now ignores
+ the missing file.
+
+08/30/16: beazley
+ Minor fixes to version numbers.
+
+Version 3.8
+---------------------
+10/02/15: beazley
+ Fixed issues related to Python 3.5. Patch contributed by Barry Warsaw.
+
+Version 3.7
+---------------------
+08/25/15: beazley
+ Fixed problems when reading table files from pickled data.
+
+05/07/15: beazley
+ Fixed regression in handling of table modules if specified as module
+ objects. See https://github.com/dabeaz/ply/issues/63
+
+Version 3.6
+---------------------
+04/25/15: beazley
+ If PLY is unable to create the 'parser.out' or 'parsetab.py' files due
+ to permission issues, it now just issues a warning message and
+ continues to operate. This could happen if a module using PLY
+ is installed in a funny way where tables have to be regenerated, but
+ for whatever reason, the user doesn't have write permission on
+ the directory where PLY wants to put them.
+
+04/24/15: beazley
+ Fixed some issues related to use of packages and table file
+ modules. Just to emphasize, PLY now generates its special
+ files such as 'parsetab.py' and 'lextab.py' in the *SAME*
+ directory as the source file that uses lex() and yacc().
+
+ If for some reason, you want to change the name of the table
+ module, use the tabmodule and lextab options:
+
+ lexer = lex.lex(lextab='spamlextab')
+ parser = yacc.yacc(tabmodule='spamparsetab')
+
+ If you specify a simple name as shown, the module will still be
+ created in the same directory as the file invoking lex() or yacc().
+ If you want the table files to be placed into a different package,
+ then give a fully qualified package name. For example:
+
+ lexer = lex.lex(lextab='pkgname.files.lextab')
+ parser = yacc.yacc(tabmodule='pkgname.files.parsetab')
+
+ For this to work, 'pkgname.files' must already exist as a valid
+ Python package (i.e., the directories must already exist and be
+ set up with the proper __init__.py files, etc.).
+
+Version 3.5
+---------------------
+04/21/15: beazley
+ Added support for defaulted_states in the parser. A
+ defaulted_state is a state where the only legal action is a
+ reduction of a single grammar rule across all valid input
+ tokens. For such states, the rule is reduced and the
+ reading of the next lookahead token is delayed until it is
+ actually needed at a later point in time.
+
+ This delay in consuming the next lookahead token is a
+ potentially important feature in advanced parsing
+ applications that require tight interaction between the
+ lexer and the parser. For example, a grammar rule can
+ modify the lexer state upon reduction and have such changes
+ take effect before the next input token is read.
+
+ *** POTENTIAL INCOMPATIBILITY ***
+ One potential danger of defaulted_states is that syntax
+ errors might be deferred to a later point of processing
+ than where they were detected in past versions of PLY.
+ Thus, it's possible that your error handling could change
+ slightly on the same inputs. defaulted_states do not change
+ the overall parsing of the input (i.e., the same grammar is
+ accepted).
+
+ If for some reason, you need to disable defaulted states,
+ you can do this:
+
+ parser = yacc.yacc()
+ parser.defaulted_states = {}
+
+04/21/15: beazley
+ Fixed debug logging in the parser. It wasn't properly reporting goto states
+ on grammar rule reductions.
+
+04/20/15: beazley
+ Added support for attaching actions to character literals (Issue #32). For example:
+
+ literals = [ '{', '}' ]
+
+ def t_lbrace(t):
+     r'\{'
+     # Some action
+     t.type = '{'
+     return t
+
+ def t_rbrace(t):
+     r'\}'
+     # Some action
+     t.type = '}'
+     return t
+
+04/19/15: beazley
+ Import of the 'parsetab.py' file is now constrained to only consider the
+ directory specified by the outputdir argument to yacc(). If not supplied,
+ the import will only consider the directory in which the grammar is defined.
+ This should greatly reduce problems with the wrong parsetab.py file being
+ imported by mistake. For example, if it's found somewhere else on the path
+ by accident.
+
+ *** POTENTIAL INCOMPATIBILITY *** It's possible that this might break some
+ packaging/deployment setup if PLY was instructed to place its parsetab.py
+ in a different location. You'll have to specify a proper outputdir= argument
+ to yacc() to fix this if needed.
+
+04/19/15: beazley
+ Changed default output directory to be the same as that in which the
+ yacc grammar is defined. If your grammar is in a file 'calc.py',
+ then the parsetab.py and parser.out files should be generated in the
+ same directory as that file. The destination directory can be changed
+ using the outputdir= argument to yacc().
+
+04/19/15: beazley
+ Changed the parsetab.py file signature slightly so that the parsetab won't
+ regenerate if created on a different major version of Python (i.e., a
+ parsetab created on Python 2 will work with Python 3).
+
+04/16/15: beazley
+ Fixed Issue #44: call_errorfunc() should return the result of errorfunc()
+
+04/16/15: beazley
+ Support for versions of Python <2.7 is officially dropped. PLY may work, but
+ the unit tests require Python 2.7 or newer.
+
+04/16/15: beazley
+ Fixed bug related to calling yacc(start=...). PLY wasn't regenerating the
+ table file correctly for this case.
+
+04/16/15: beazley
+ Added skipped tests for PyPy and Java. Related to use of Python's -O option.
+
+05/29/13: beazley
+ Added filter to make unit tests pass under 'python -3'.
+ Reported by Neil Muller.
+
+05/29/13: beazley
+ Fixed CPP_INTEGER regex in ply/cpp.py (Issue 21).
+ Reported by @vbraun.
+
+05/29/13: beazley
+ Fixed yacc validation bugs when from __future__ import unicode_literals
+ is being used. Reported by Kenn Knowles.
+
+05/29/13: beazley
+ Added support for Travis-CI. Contributed by Kenn Knowles.
+
+05/29/13: beazley
+ Added a .gitignore file. Suggested by Kenn Knowles.
+
+05/29/13: beazley
+ Fixed validation problems for source files that include a
+ different source code encoding specifier. Fix relies on
+ the inspect module. Should work on Python 2.6 and newer.
+ Not sure about older versions of Python.
+ Contributed by Michael Droettboom
+
+05/21/13: beazley
+ Fixed unit tests for yacc to eliminate random failures due to dict hash value
+ randomization in Python 3.3
+ Reported by Arfrever
+
+10/15/12: beazley
+ Fixed comment whitespace processing bugs in ply/cpp.py.
+ Reported by Alexei Pososin.
+
+10/15/12: beazley
+ Fixed token names in ply/ctokens.py to match rule names.
+ Reported by Alexei Pososin.
+
+04/26/12: beazley
+ Changes to functions available in panic mode error recovery. In previous versions
+ of PLY, the following global functions were available for use in the p_error() rule:
+
+ yacc.errok() # Reset error state
+ yacc.token() # Get the next token
+ yacc.restart() # Reset the parsing stack
+
+ The use of global variables was problematic for code involving multiple parsers
+ and frankly was a poor design overall. These functions have been moved to methods
+ of the parser instance created by the yacc() function. You should write code like
+ this:
+
+ def p_error(p):
+     ...
+     parser.errok()
+
+ parser = yacc.yacc()
+
+ *** POTENTIAL INCOMPATIBILITY *** The original global functions now issue a
+ DeprecationWarning.
+
+04/19/12: beazley
+ Fixed some problems with line and position tracking and the use of error
+ symbols. If you have a grammar rule involving an error rule like this:
+
+ def p_assignment_bad(p):
+     '''assignment : location EQUALS error SEMI'''
+     ...
+
+ You can now do line and position tracking on the error token. For example:
+
+ def p_assignment_bad(p):
+     '''assignment : location EQUALS error SEMI'''
+     start_line = p.lineno(3)
+     start_pos = p.lexpos(3)
+
+ If the tracking=True option is supplied to parse(), you can additionally get
+ spans:
+
+ def p_assignment_bad(p):
+     '''assignment : location EQUALS error SEMI'''
+     start_line, end_line = p.linespan(3)
+     start_pos, end_pos = p.lexspan(3)
+
+ Note that error handling is still a hairy thing in PLY. This won't work
+ unless your lexer is providing accurate information. Please report bugs.
+ Suggested by a bug reported by Davis Herring.
+
+04/18/12: beazley
+ Change to doc string handling in lex module. Regex patterns are now first
+ pulled from a function's .regex attribute. If that doesn't exist, then
+ the function's doc string is checked as a fallback. The @TOKEN decorator
+ now sets the .regex attribute of a function instead of its doc string.
+ Change suggested by Kristoffer Ellersgaard Koch.
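+
+ A minimal sketch of the lookup order described in this entry. The
+ TOKEN and get_regex functions below are illustrative stand-ins
+ written for this note, not PLY's actual source:
```python
def TOKEN(pattern):
    """Stand-in decorator: attach the pattern to the function as .regex."""
    def decorate(func):
        func.regex = pattern
        return func
    return decorate

def get_regex(func):
    """Prefer the .regex attribute; fall back to the doc string."""
    return getattr(func, 'regex', func.__doc__)

@TOKEN(r'\d+')
def t_NUMBER(t):
    "Convert the matched text to an integer."
    t.value = int(t.value)
    return t

def t_ID(t):
    r'[A-Za-z_]\w*'
    return t

print(get_regex(t_NUMBER))  # pattern comes from .regex
print(get_regex(t_ID))      # pattern comes from the doc string
```
+ With this arrangement a decorated rule keeps an ordinary doc string
+ for documentation, since the pattern no longer has to live there.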
+
+04/18/12: beazley
+ Fixed issue #1: Fixed _tabversion. It should use __tabversion__ instead of __version__
+ Reported by Daniele Tricoli
+
+04/18/12: beazley
+ Fixed issue #8: Literals empty list causes IndexError
+ Reported by Walter Nissen.
+
+04/18/12: beazley
+ Fixed issue #12: Typo in code snippet in documentation
+ Reported by florianschanda.
+
+04/18/12: beazley
+ Fixed issue #10: Correctly escape t_XOREQUAL pattern.
+ Reported by Andy Kittner.
+
+Version 3.4
+---------------------
+02/17/11: beazley
+ Minor patch to make cpp.py compatible with Python 3. Note: This
+ is an experimental file not currently used by the rest of PLY.
+
+02/17/11: beazley
+ Fixed setup.py trove classifiers to properly list PLY as
+ Python 3 compatible.
+
+01/02/11: beazley
+ Migration of repository to github.
+
+Version 3.3
+-----------------------------
+08/25/09: beazley
+ Fixed issue 15 related to the set_lineno() method in yacc. Reported by
+ mdsherry.
+
+08/25/09: beazley
+ Fixed a bug related to regular expression compilation flags not being
+ properly stored in lextab.py files created by the lexer when running
+ in optimize mode. Reported by Bruce Frederiksen.
+
+
+Version 3.2
+-----------------------------
+03/24/09: beazley
+ Added an extra check to not print duplicated warning messages
+ about reduce/reduce conflicts.
+
+03/24/09: beazley
+ Switched PLY over to a BSD-license.
+
+03/23/09: beazley
+ Performance optimization. Discovered a few places to make
+ speedups in LR table generation.
+
+03/23/09: beazley
+ New warning message. PLY now warns about rules never
+ reduced due to reduce/reduce conflicts. Suggested by
+ Bruce Frederiksen.
+
+03/23/09: beazley
+ Some clean-up of warning messages related to reduce/reduce errors.
+
+03/23/09: beazley
+ Added a new picklefile option to yacc() to write the parsing
+ tables to a filename using the pickle module. Here is how
+ it works:
+
+ yacc(picklefile="parsetab.p")
+
+ This option can be used if the normal parsetab.py file is
+ extremely large. For example, on jython, it is impossible
+ to read parsing tables if the parsetab.py exceeds a certain
+ threshold.
+
+ The filename supplied to the picklefile option is opened
+ relative to the current working directory of the Python
+ interpreter. If you need to refer to the file elsewhere,
+ you will need to supply an absolute or relative path.
+
+ For maximum portability, the pickle file is written
+ using protocol 0.
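+
+ A small illustration of what writing with protocol 0 means, using
+ only the standard pickle module. The table contents are made up for
+ the example and do not reflect PLY's real pickle layout:
```python
import pickle

# Hypothetical stand-in for parsing tables; PLY's real data differs.
tables = {'version': '3.2', 'method': 'LALR', 'action': {0: {'NUMBER': 3}}}

# Protocol 0 is the original text-based pickle format.
data = pickle.dumps(tables, protocol=0)
restored = pickle.loads(data)

print(restored == tables)
print(all(byte < 128 for byte in data))  # ASCII-only for this data
```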
+
+03/13/09: beazley
+ Fixed a bug in parser.out generation where the rule numbers
+ were off by one.
+
+03/13/09: beazley
+ Fixed a string formatting bug with one of the error messages.
+ Reported by Richard Reitmeyer
+
+Version 3.1
+-----------------------------
+02/28/09: beazley
+ Fixed broken start argument to yacc(). PLY-3.0 broke this
+ feature by accident.
+
+02/28/09: beazley
+ Fixed debugging output. yacc() no longer reports shift/reduce
+ or reduce/reduce conflicts if debugging is turned off. This
+ restores similar behavior in PLY-2.5. Reported by Andrew Waters.
+
+Version 3.0
+-----------------------------
+02/03/09: beazley
+ Fixed missing lexer attribute on certain tokens when
+ invoking the parser p_error() function. Reported by
+ Bart Whiteley.
+
+02/02/09: beazley
+ The lex() command now does all error-reporting and diagnostics
+ using the logging module interface. Pass in a Logger object
+ using the errorlog parameter to specify a different logger.
+
+02/02/09: beazley
+ Refactored ply.lex to use a more object-oriented and organized
+ approach to collecting lexer information.
+
+02/01/09: beazley
+ Removed the nowarn option from lex(). All output is controlled
+ by passing in a logger object. Just pass in a logger with a high
+ level setting to suppress output. This argument was never
+ documented to begin with so hopefully no one was relying upon it.
+
+02/01/09: beazley
+ Discovered and removed a dead if-statement in the lexer. This
+ resulted in a 6-7% speedup in lexing when I tested it.
+
+01/13/09: beazley
+ Minor change to the procedure for signalling a syntax error in a
+ production rule. A normal SyntaxError exception should be raised
+ instead of yacc.SyntaxError.
+
+01/13/09: beazley
+ Added a new method p.set_lineno(n,lineno) that can be used to set the
+ line number of symbol n in grammar rules. This simplifies manual
+ tracking of line numbers.
+
+01/11/09: beazley
+ Vastly improved debugging support for yacc.parse(). Instead of passing
+ debug as an integer, you can supply a Logging object (see the logging
+ module). Messages will be generated at the ERROR, INFO, and DEBUG
+ logging levels, each level providing progressively more information.
+ The debugging trace also shows states, grammar rule, values passed
+ into grammar rules, and the result of each reduction.
+
+01/09/09: beazley
+ The yacc() command now does all error-reporting and diagnostics using
+ the interface of the logging module. Use the errorlog parameter to
+ specify a logging object for error messages. Use the debuglog parameter
+ to specify a logging object for the 'parser.out' output.
+
+01/09/09: beazley
+ *HUGE* refactoring of the ply.yacc() implementation. The high-level
+ user interface is backwards compatible, but the internals are completely
+ reorganized into classes. No more global variables. The internals
+ are also more extensible. For example, you can use the classes to
+ construct a LALR(1) parser in an entirely different manner than
+ what is currently the case. Documentation is forthcoming.
+
+01/07/09: beazley
+ Various cleanup and refactoring of yacc internals.
+
+01/06/09: beazley
+ Fixed a bug with precedence assignment. yacc was assigning the precedence
+ of each rule based on the left-most token, when in fact, it should have been
+ using the right-most token. Reported by Bruce Frederiksen.
+
+11/27/08: beazley
+ Numerous changes to support Python 3.0 including removal of deprecated
+ statements (e.g., has_key) and the addition of compatibility code
+ to emulate features from Python 2 that have been removed, but which
+ are needed. Fixed the unit testing suite to work with Python 3.0.
+ The code should be backwards compatible with Python 2.
+
+11/26/08: beazley
+ Loosened the rules on what kind of objects can be passed in as the
+ "module" parameter to lex() and yacc(). Previously, you could only use
+ a module or an instance. Now, PLY just uses dir() to get a list of
+ symbols on whatever the object is without regard for its type.
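+
+ A rough sketch of why dir() makes the object's type irrelevant.
+ collect_token_rules is a hypothetical helper for illustration and
+ is not part of PLY:
```python
import types

def collect_token_rules(obj):
    """Return every attribute of obj whose name starts with 't_'."""
    return {name: getattr(obj, name)
            for name in dir(obj) if name.startswith('t_')}

class LexerRules:
    t_PLUS = r'\+'
    t_NUMBER = r'\d+'

# A module object built on the fly, standing in for an imported module.
rules_module = types.ModuleType('rules_module')
rules_module.t_MINUS = r'-'

print(collect_token_rules(LexerRules()))  # instance
print(collect_token_rules(LexerRules))    # class
print(collect_token_rules(rules_module))  # module
```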
+
+11/26/08: beazley
+ Changed all except: statements to be compatible with Python2.x/3.x syntax.
+
+11/26/08: beazley
+ Changed all raise Exception, value statements to raise Exception(value) for
+ forward compatibility.
+
+11/26/08: beazley
+ Removed all print statements from lex and yacc, using sys.stdout and sys.stderr
+ directly. Preparation for Python 3.0 support.
+
+11/04/08: beazley
+ Fixed a bug with referring to symbols on the parsing stack using negative
+ indices.
+
+05/29/08: beazley
+ Completely revamped the testing system to use the unittest module for everything.
+ Added additional tests to cover new errors/warnings.
+
+Version 2.5
+-----------------------------
+05/28/08: beazley
+ Fixed a bug with writing lex-tables in optimized mode and start states.
+ Reported by Kevin Henry.
+
+Version 2.4
+-----------------------------
+05/04/08: beazley
+ A version number is now embedded in the table file signature so that
+ yacc can more gracefully accommodate changes to the output format
+ in the future.
+
+05/04/08: beazley
+ Removed undocumented .pushback() method on grammar productions. I'm
+ not sure this ever worked and can't recall ever using it. Might have
+ been an abandoned idea that never really got fleshed out. This
+ feature was never described or tested so removing it is hopefully
+ harmless.
+
+05/04/08: beazley
+ Added extra error checking to yacc() to detect precedence rules defined
+ for undefined terminal symbols. This allows yacc() to detect a potential
+ problem that can be really tricky to debug if no warning message or error
+ message is generated about it.
+
+05/04/08: beazley
+ lex() now has an outputdir that can specify the output directory for
+ tables when running in optimize mode. For example:
+
+ lexer = lex.lex(optimize=True, lextab="ltab", outputdir="foo/bar")
+
+ The behavior of specifying a table module and output directory are
+ more aligned with the behavior of yacc().
+
+05/04/08: beazley
+ [Issue 9]
+ Fixed filename bug when specifying the modulename in lex() and yacc().
+ If you specified options such as the following:
+
+ parser = yacc.yacc(tabmodule="foo.bar.parsetab",outputdir="foo/bar")
+
+ yacc would create a file "foo.bar.parsetab.py" in the given directory.
+ Now, it simply generates a file "parsetab.py" in that directory.
+ Bug reported by cptbinho.
+
+05/04/08: beazley
+ Slight modification to lex() and yacc() to allow their table files
+ to be loaded from a previously loaded module. This might make
+ it easier to load the parsing tables from a complicated package
+ structure. For example:
+
+ import foo.bar.spam.parsetab as parsetab
+ parser = yacc.yacc(tabmodule=parsetab)
+
+ Note: lex and yacc will never regenerate the table file if used
+ in this form; you will get a warning message instead.
+ This idea suggested by Brian Clapper.
+
+
+04/28/08: beazley
+ Fixed a bug with p_error() functions not being picked up correctly
+ when running in yacc(optimize=1) mode. Patch contributed by
+ Bart Whiteley.
+
+02/28/08: beazley
+ Fixed a bug with 'nonassoc' precedence rules. Basically the
+ non-associativity was being ignored and not producing the correct
+ run-time behavior in the parser.
+
+02/16/08: beazley
+ Slight relaxation of what the input() method to a lexer will
+ accept as a string. Instead of testing the input to see
+ if the input is a string or unicode string, it checks to see
+ if the input object looks like it contains string data.
+ This change makes it possible to pass string-like objects
+ in as input. For example, the object returned by mmap.
+
+ import mmap, os
+ data = mmap.mmap(os.open(filename,os.O_RDONLY),
+                  os.path.getsize(filename),
+                  access=mmap.ACCESS_READ)
+ lexer.input(data)
+
+
+11/29/07: beazley
+ Modification of ply.lex to allow token functions to be aliased.
+ This is subtle, but it makes it easier to create libraries and
+ to reuse token specifications. For example, suppose you defined
+ a function like this:
+
+ def number(t):
+     r'\d+'
+     t.value = int(t.value)
+     return t
+
+ This change would allow you to define a token rule as follows:
+
+ t_NUMBER = number
+
+ In this case, the token type will be set to 'NUMBER' and use
+ the associated number() function to process tokens.
+
+11/28/07: beazley
+ Slight modification to lex and yacc to grab symbols from both
+ the local and global dictionaries of the caller. This
+ modification allows lexers and parsers to be defined using
+ inner functions and closures.
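+
+ One way such a lookup could work, sketched with sys._getframe().
+ The helper name and the merge order (locals over globals) are
+ assumptions made for this example, not a description of PLY's code:
```python
import sys

def get_caller_symbols(levels=1):
    """Merge the globals and locals of the frame `levels` calls up the stack."""
    frame = sys._getframe(levels)
    symbols = dict(frame.f_globals)
    symbols.update(frame.f_locals)  # locals shadow globals of the same name
    return symbols

def make_rules():
    # Rules defined as locals of an enclosing function.
    t_PLUS = r'\+'

    def t_NUMBER(t):
        r'\d+'
        return t

    return get_caller_symbols()

rules = make_rules()
print(rules['t_PLUS'])
print(rules['t_NUMBER'].__doc__)
```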
+
+11/28/07: beazley
+ Performance optimization: The lexer.lexmatch and t.lexer
+ attributes are no longer set for lexer tokens that are not
+ defined by functions. The only normal use of these attributes
+ would be in lexer rules that need to perform some kind of
+ special processing. Thus, it doesn't make any sense to set
+ them on every token.
+
+ *** POTENTIAL INCOMPATIBILITY *** This might break code
+ that is mucking around with internal lexer state in some
+ sort of magical way.
+
+11/27/07: beazley
+ Added the ability to put the parser into error-handling mode
+ from within a normal production. To do this, simply raise
+ a yacc.SyntaxError exception like this:
+
+ def p_some_production(p):
+     'some_production : prod1 prod2'
+     ...
+     raise yacc.SyntaxError # Signal an error
+
+ A number of things happen after this occurs:
+
+ - The last symbol shifted onto the symbol stack is discarded
+   and parser state backed up to what it was before the
+   rule reduction.
+
+ - The current lookahead symbol is saved and replaced by
+   the 'error' symbol.
+
+ - The parser enters error recovery mode where it tries
+   to either reduce the 'error' rule or it starts
+   discarding items off of the stack until the parser
+   resets.
+
+ When an error is manually set, the parser does *not* call
+ the p_error() function (if any is defined).
+ *** NEW FEATURE *** Suggested on the mailing list
+
+11/27/07: beazley
+ Fixed structure bug in examples/ansic. Reported by Dion Blazakis.
+
+11/27/07: beazley
+ Fixed a bug in the lexer related to start conditions and ignored
+ token rules. If a rule was defined that changed state, but
+ returned no token, the lexer could be left in an inconsistent
+ state. Reported by
+
+11/27/07: beazley
+ Modified setup.py to support Python Eggs. Patch contributed by
+ Simon Cross.
+
+11/09/07: beazley
+ Fixed a bug in error handling in yacc. If a syntax error occurred and the
+ parser rolled the entire parse stack back, the parser would be left in an
+ inconsistent state that would cause it to trigger incorrect actions on
+ subsequent input. Reported by Ton Biegstraaten, Justin King, and others.
+
+11/09/07: beazley
+ Fixed a bug when passing empty input strings to yacc.parse(). This
+ would result in an error message about "No input given". Reported
+ by Andrew Dalke.
+
+Version 2.3
+-----------------------------
+02/20/07: beazley
+ Fixed a bug with character literals if the literal '.' appeared as the
+ last symbol of a grammar rule. Reported by Ales Smrcka.
+
+02/19/07: beazley
+ Warning messages are now redirected to stderr instead of being printed
+ to standard output.
+
+02/19/07: beazley
+ Added a warning message to lex.py if it detects a literal backslash
+ character inside the t_ignore declaration. This is to help catch
+ problems that might occur if someone accidentally defines t_ignore
+ as a Python raw string. For example:
+
+ t_ignore = r' \t'
+
+ The idea for this is from an email I received from David Cimimi who
+ reported bizarre behavior in lexing as a result of defining t_ignore
+ as a raw string by accident.
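+
+ The difference is easy to see in plain Python, without PLY involved:
```python
plain = ' \t'   # two characters: space and tab
raw = r' \t'    # three characters: space, backslash, letter 't'

print(len(plain), len(raw))
print('\\' in raw, 't' in raw, '\t' in raw)
```
+ With the raw string, a literal backslash and the letter 't' end up
+ in the set of ignored characters, and the tab does not.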
+
+02/18/07: beazley
+ Performance improvements. Made some changes to the internal
+ table organization and LR parser to improve parsing performance.
+
+02/18/07: beazley
+ Automatic tracking of line number and position information must now be
+ enabled by a special flag to parse(). For example:
+
+ yacc.parse(data,tracking=True)
+
+ In many applications, it's just not that important to have the
+ parser automatically track all line numbers. By making this an
+ optional feature, it allows the parser to run significantly faster
+ (more than a 20% speed increase in many cases). Note: positional
+ information is always available for raw tokens---this change only
+ applies to positional information associated with nonterminal
+ grammar symbols.
+ *** POTENTIAL INCOMPATIBILITY ***
+
+02/18/07: beazley
+ Yacc no longer supports extended slices of grammar productions.
+ However, it does support regular slices. For example:
+
+ def p_foo(p):
+     '''foo : a b c d e'''
+     p[0] = p[1:3]
+
+ This change is a performance improvement to the parser--it streamlines
+ normal access to the grammar values since slices are now handled in
+ a __getslice__() method as opposed to __getitem__().
+
+02/12/07: beazley
+ Fixed a bug in the handling of token names when combined with
+ start conditions. Bug reported by Todd O'Bryan.
+
+Version 2.2
+------------------------------
+11/01/06: beazley
+ Added lexpos() and lexspan() methods to grammar symbols. These
+ mirror the same functionality of lineno() and linespan(). For
+ example:
+
+ def p_expr(p):
+     'expr : expr PLUS expr'
+     p.lexpos(1) # Lexing position of left-hand-expression
+     p.lexpos(2) # Lexing position of PLUS
+     start,end = p.lexspan(3) # Lexing range of right hand expression
+
+11/01/06: beazley
+ Minor change to error handling. The recommended way to skip characters
+ in the input is to use t.lexer.skip() as shown here:
+
+ def t_error(t):
+     print("Illegal character '%s'" % t.value[0])
+     t.lexer.skip(1)
+
+ The old approach of just using t.skip(1) will still work, but won't
+ be documented.
+
+10/31/06: beazley
+ Discarded tokens can now be specified as simple strings instead of
+ functions. To do this, simply include the text "ignore_" in the
+ token declaration. For example:
+
+ t_ignore_cppcomment = r'//.*'
+
+ Previously, this had to be done with a function. For example:
+
+ def t_ignore_cppcomment(t):
+     r'//.*'
+     pass
+
+ If start conditions/states are being used, state names should appear
+ before the "ignore_" text.
+
+10/19/06: beazley
+ The Lex module now provides support for flex-style start conditions
+ as described at http://www.gnu.org/software/flex/manual/html_chapter/flex_11.html.
+ Please refer to this document to understand this change note. Refer to
+ the PLY documentation for PLY-specific explanation of how this works.
+
+ To use start conditions, you first need to declare a set of states in
+ your lexer file:
+
+ states = (
+     ('foo','exclusive'),
+     ('bar','inclusive')
+ )
+
+ This serves the same role as the %s and %x specifiers in flex.
+
+ Once a state has been declared, tokens for that state can be
+ declared by defining rules of the form t_state_TOK. For example:
+
+ t_PLUS = r'\+' # Rule defined in INITIAL state
+ t_foo_NUM = r'\d+' # Rule defined in foo state
+ t_bar_NUM = r'\d+' # Rule defined in bar state
+
+ t_foo_bar_NUM = r'\d+' # Rule defined in both foo and bar
+ t_ANY_NUM = r'\d+' # Rule defined in all states
+
+ In addition to defining tokens for each state, the t_ignore and t_error
+ specifications can be customized for specific states. For example:
+
+ t_foo_ignore = " " # Ignored characters for foo state
+ def t_bar_error(t):
+     # Handle errors in bar state
+     pass
+
+ Within token rules, the following methods can be used to change states:
+
+ def t_TOKNAME(t):
+     t.lexer.begin('foo') # Begin state 'foo'
+     t.lexer.push_state('foo') # Begin state 'foo', push old state
+                               # onto a stack
+     t.lexer.pop_state() # Restore previous state
+     t.lexer.current_state() # Returns name of current state
+
+ These methods mirror the BEGIN(), yy_push_state(), yy_pop_state(), and
+ yy_top_state() functions in flex.
+
+ Start states can be used as one way to write sub-lexers.
+ For example, the lexer or parser might instruct the lexer to start
+ generating a different set of tokens depending on the context.
+
+ example/yply/ylex.py shows the use of start states to grab C/C++
+ code fragments out of traditional yacc specification files.
+
+ *** NEW FEATURE *** Suggested by Daniel Larraz with whom I also
+ discussed various aspects of the design.
+
+10/19/06: beazley
+ Minor change to the way in which yacc.py was reporting shift/reduce
+ conflicts. Although the underlying LALR(1) algorithm was correct,
+ PLY was under-reporting the number of conflicts compared to yacc/bison
+ when precedence rules were in effect. This change should make PLY
+ report the same number of conflicts as yacc.
+
+10/19/06: beazley
+ Modified yacc so that grammar rules could also include the '-'
+ character. For example:
+
+ def p_expr_list(p):
+     'expression-list : expression-list expression'
+
+ Suggested by Oldrich Jedlicka.
+
+10/18/06: beazley
+ Attribute lexer.lexmatch added so that token rules can access the re
+ match object that was generated. For example:
+
+ def t_FOO(t):
+     r'some regex'
+     m = t.lexer.lexmatch
+     # Do something with m
+
+
+ This may be useful if you want to access named groups specified within
+ the regex for a specific token. Suggested by Oldrich Jedlicka.
+
+10/16/06: beazley
+ Changed the error message that results if an illegal character
+ is encountered and no default error function is defined in lex.
+ The exception is now more informative about the actual cause of
+ the error.
+
+Version 2.1
+------------------------------
+10/02/06: beazley
+ The last Lexer object built by lex() can be found in lex.lexer.
+ The last Parser object built by yacc() can be found in yacc.parser.
+
+10/02/06: beazley
+ New example added: examples/yply
+
+ This example uses PLY to convert Unix-yacc specification files to
+ PLY programs with the same grammar. This may be useful if you
+ want to convert a grammar from bison/yacc to use with PLY.
+
+10/02/06: beazley
+ Added support for a start symbol to be specified in the yacc
+ input file itself. Just do this:
+
+ start = 'name'
+
+ where 'name' matches some grammar rule. For example:
+
+ def p_name(p):
+     'name : A B C'
+     ...
+
+ This mirrors the functionality of the yacc %start specifier.
+
+09/30/06: beazley
+ Some new examples added:
+
+ examples/GardenSnake : A simple indentation based language similar
+ to Python. Shows how you might handle
+ whitespace. Contributed by Andrew Dalke.
+
+ examples/BASIC : An implementation of 1964 Dartmouth BASIC.
+ Contributed by Dave against his better
+ judgement.
+
+09/28/06: beazley
+ Minor patch to allow named groups to be used in lex regular
+ expression rules. For example:
+
+ t_QSTRING = r'''(?P<quote>['"]).*?(?P=quote)'''
+
+ Patch submitted by Adam Ring.
+
+09/28/06: beazley
+ LALR(1) is now the default parsing method. To use SLR, use
+ yacc.yacc(method="SLR"). Note: there is no performance impact
+ on parsing when using LALR(1) instead of SLR. However, constructing
+ the parsing tables will take a little longer.
+
+09/26/06: beazley
+ Change to line number tracking. To modify line numbers, modify
+ the line number of the lexer itself. For example:
+
+ def t_NEWLINE(t):
+     r'\n'
+     t.lexer.lineno += 1
+
+ This modification is both cleanup and a performance optimization.
+ In past versions, lex was monitoring every token for changes in
+ the line number. This extra processing is unnecessary for a vast
+ majority of tokens. Thus, this new approach cleans it up a bit.
+
+ *** POTENTIAL INCOMPATIBILITY ***
+ You will need to change code in your lexer that updates the line
+ number. For example, "t.lineno += 1" becomes "t.lexer.lineno += 1"
+
+09/26/06: beazley
+ Added the lexing position to tokens as an attribute lexpos. This
+ is the raw index into the input text at which a token appears.
+ This information can be used to compute column numbers and other
+ details (e.g., scan backwards from lexpos to the first newline
+ to get a column position).
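+
+ The backwards scan described above can be sketched in plain Python.
+ The function name find_column is an assumption for this example; it
+ takes the full input text and a token's lexpos value:

```python
def find_column(text, lexpos):
    """Return the 1-based column of the character at index lexpos."""
    # rfind returns -1 when there is no earlier newline, so adding 1
    # yields 0, the start of the text.
    line_start = text.rfind('\n', 0, lexpos) + 1
    return lexpos - line_start + 1
```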
+
+09/25/06: beazley
+ Changed the name of the __copy__() method on the Lexer class
+ to clone(). This is used to clone a Lexer object (e.g., if
+ you're running different lexers at the same time).
+
+09/21/06: beazley
+ Limitations related to the use of the re module have been eliminated.
+ Several users reported problems with regular expressions containing
+ more than 100 named groups. To solve this, lex.py is now capable
+ of automatically splitting its master regular expression into
+ smaller expressions as needed. This should, in theory, make it
+ possible to specify an arbitrarily large number of tokens.
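+
+ One way to picture the splitting is the sketch below. It is not the
+ code in lex.py: it cuts the rule list into fixed-size chunks, and the
+ names build_master_res, match_any and chunk_size are assumptions made
+ for this example.

```python
import re


def build_master_res(rules, chunk_size=50):
    """Compile (name, pattern) rules into a list of combined regexes.

    chunk_size is a hypothetical knob for this sketch; each combined
    regex holds at most that many named groups.
    """
    compiled = []
    for i in range(0, len(rules), chunk_size):
        chunk = rules[i:i + chunk_size]
        combined = '|'.join('(?P<%s>%s)' % (name, pat) for name, pat in chunk)
        compiled.append(re.compile(combined, re.VERBOSE))
    return compiled


def match_any(master_res, text, pos=0):
    """Try each combined regex in order; return (rule name, lexeme) or None."""
    for regex in master_res:
        m = regex.match(text, pos)
        if m:
            return m.lastgroup, m.group(0)
    return None
```

+ Because the chunks are tried in order, earlier rules still take
+ priority over later ones, just as in a single combined expression.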
+
+09/21/06: beazley
+ Improved error checking in lex.py. Rules that match the empty string
+ are now rejected (otherwise they cause the lexer to enter an infinite
+ loop). An extra check for rules containing '#' has also been added.
+ Since lex compiles regular expressions in verbose mode, '#' is interpreted
+ as the start of a regex comment, so it is critical to use '\#' instead.
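+
+ Both checks can be illustrated with the standard re module alone.
+ This is only a sketch of the idea, not the validation code in lex.py;
+ compile_rule is a hypothetical helper:

```python
import re


def compile_rule(pattern):
    """Compile a token pattern in verbose mode, rejecting any pattern
    that can match the empty string."""
    regex = re.compile(pattern, re.VERBOSE)
    if regex.match(''):
        raise ValueError('pattern %r matches the empty string' % pattern)
    return regex
```

+ With this helper, r'\#.*' compiles and matches a comment line, while
+ r'#.*' is rejected: in verbose mode everything from the unescaped '#'
+ onward is a regex comment, so the pattern is effectively empty.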
+
+09/18/06: beazley
+ Added a @TOKEN decorator function to lex.py that can be used to
+ define token rules where the documentation string might be computed
+ in some way.
+
+ digit = r'([0-9])'
+ nondigit = r'([_A-Za-z])'
+ identifier = r'(' + nondigit + r'(' + digit + r'|' + nondigit + r')*)'
+
+ from ply.lex import TOKEN
+
+ @TOKEN(identifier)
+ def t_ID(t):
+ # Do whatever
+
+ The @TOKEN decorator merely sets the documentation string of the
+ associated token function as needed for lex to work.
+
+ Note: An alternative solution is the following:
+
+ def t_ID(t):
+ # Do whatever
+
+ t_ID.__doc__ = identifier
+
+ Note: Decorators require the use of Python 2.4 or later. If compatibility
+ with old versions is needed, use the latter solution.
+
+ The need for this feature was suggested by Cem Karan.
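+
+ Since the decorator only sets the documentation string, an equivalent
+ can be sketched in a few lines. The name token_doc is an assumption
+ chosen so it is not confused with the real ply.lex.TOKEN:

```python
def token_doc(pattern):
    """Decorator factory: attach pattern as the function's docstring."""
    def set_doc(func):
        func.__doc__ = pattern
        return func
    return set_doc


digit = r'([0-9])'
nondigit = r'([_A-Za-z])'
identifier = r'(' + nondigit + r'(' + digit + r'|' + nondigit + r')*)'


@token_doc(identifier)
def t_ID(t):
    # A real token rule would inspect or modify t before returning it.
    return t
```

+ After decoration, t_ID.__doc__ holds the computed pattern, which is
+ where lex looks for a token function's regular expression.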
+
+09/14/06: beazley
+ Support for single-character literal tokens has been added to yacc.
+ These literals must be enclosed in quotes. For example:
+
+ def p_expr(p):
+ "expr : expr '+' expr"
+ ...
+
+ def p_expr(p):
+ 'expr : expr "-" expr'
+ ...
+
+ In addition to this, it is necessary to tell the lexer module about
+ literal characters. This is done by defining the variable 'literals'
+ as a list of characters. This should be defined in the module that
+ invokes the lex.lex() function. For example:
+
+ literals = ['+','-','*','/','(',')','=']
+
+ or simply
+
+ literals = '+-*/()='
+
+ It is important to note that literals can only be a single character.
+ When the lexer fails to match a token using its normal regular expression
+ rules, it will check the current character against the literal list.
+ If found, it will be returned with a token type set to match the literal
+ character. Otherwise, an illegal character will be signalled.
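+
+ The fallback order described above can be sketched without PLY. The
+ function next_token and its tuple return value are assumptions for
+ this example, and a single compiled regex stands in for the lexer's
+ normal rules:

```python
import re


def next_token(text, pos, master_re, literals):
    """Return (type, value, new_pos) for the token starting at pos."""
    m = master_re.match(text, pos)
    if m:
        # A normal rule matched; the group name is the token type.
        return m.lastgroup, m.group(0), m.end()
    ch = text[pos]
    if ch in literals:
        # Literal tokens use the character itself as the token type.
        return ch, ch, pos + 1
    raise SyntaxError('illegal character %r at index %d' % (ch, pos))
```

+ The membership test works whether literals is a list of characters
+ or a plain string, matching the two forms shown above.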
+
+
+09/14/06: beazley
+ Modified PLY to install itself as a proper Python package called 'ply'.
+ This will make it a little more friendly to other modules. This
+ changes the usage of PLY only slightly. Just do this to import the
+ modules:
+
+ import ply.lex as lex
+ import ply.yacc as yacc
+
+ Alternatively, you can do this:
+
+ from ply import *
+
+ This imports both the lex and yacc modules.
+ Change suggested by Lee June.
+
+09/13/06: beazley
+ Changed the handling of negative indices when used in production rules.
+ A negative production index now accesses already parsed symbols on the
+ parsing stack. For example,
+
+ def p_foo(p):
+ "foo: A B C D"
+ print p[1] # Value of 'A' symbol
+ print p[2] # Value of 'B' symbol
+ print p[-1] # Value of whatever symbol appears before A
+ # on the parsing stack.
+
+ p[0] = some_val # Sets the value of the 'foo' grammar symbol
+
+ This behavior makes it easier to work with embedded actions within the
+ parsing rules. For example, in C-yacc, it is possible to write code like
+ this:
+
+ bar: A { printf("seen an A = %d\n", $1); } B { do_stuff; }
+
+ In this example, the printf() code executes immediately after A has been
+ parsed. Within the embedded action code, $1 refers to the A symbol on
+ the stack.
+
+ To perform this equivalent action in PLY, you need to write a pair
+ of rules like this:
+
+ def p_bar(p):
+ "bar : A seen_A B"
+ do_stuff
+
+ def p_seen_A(p):
+ "seen_A :"
+ print "seen an A =", p[-1]
+
+ The second rule "seen_A" is merely an empty production which should be
+ reduced as soon as A is parsed in the "bar" rule above. The negative
+ index p[-1] is used to access whatever symbol appeared before the
+ seen_A symbol.
+
+ This feature also makes it possible to support inherited attributes.
+ For example:
+
+ def p_decl(p):
+ "decl : scope name"
+
+ def p_scope(p):
+ """scope : GLOBAL
+ | LOCAL"""
+ p[0] = p[1]
+
+ def p_name(p):
+ "name : ID"
+ if p[-1] == "GLOBAL":
+ # ...
+ elif p[-1] == "LOCAL":
+ #...
+
+ In this case, the name rule is inheriting an attribute from the
+ scope declaration that precedes it.
+
+ *** POTENTIAL INCOMPATIBILITY ***
+ If you are currently using negative indices within existing grammar rules,
+ your code will break. This should be extremely rare, if not non-existent,
+ in most cases. The argument to the various grammar rules is usually not
+ processed in the same way as a list of items.
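+
+ To make the difference from ordinary list indexing concrete, here is
+ a toy model of the indexing rule. ProductionSketch is a hypothetical
+ class written for this note, not the object yacc actually passes to
+ grammar rules, and it stores plain values instead of symbol objects:

```python
class ProductionSketch:
    """Toy stand-in for the object passed to a grammar rule function."""

    def __init__(self, symbols, stack):
        self.symbols = symbols  # [result, rhs1, rhs2, ...] for this rule
        self.stack = stack      # values already parsed, most recent last

    def __getitem__(self, n):
        # Non-negative indices address the rule's own symbols; negative
        # indices reach back into the parsing stack.
        if n >= 0:
            return self.symbols[n]
        return self.stack[n]

    def __setitem__(self, n, value):
        self.symbols[n] = value
```

+ With a plain list, p[-1] would be the last symbol of the rule itself.
+ In this model it is the value parsed just before the rule started,
+ which is what the seen_A example relies on.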
+
+Version 2.0
+------------------------------
+09/07/06: beazley
+ Major cleanup and refactoring of the LR table generation code. Both SLR
+ and LALR(1) table generation is now performed by the same code base with
+ only minor extensions for extra LALR(1) processing.
+
+09/07/06: beazley
+ Completely reimplemented the entire LALR(1) parsing engine to use the
+ DeRemer and Pennello algorithm for calculating lookahead sets. This
+ significantly improves the performance of generating LALR(1) tables
+ and has the added feature of actually working correctly! If you
+ experienced weird behavior with LALR(1) in prior releases, this should
+ hopefully resolve all of those problems. Many thanks to
+ Andrew Waters and Markus Schoepflin for submitting bug reports
+ and helping me test out the revised LALR(1) support.
+
+Version 1.8
+------------------------------
+08/02/06: beazley
+ Fixed a problem related to the handling of default actions in LALR(1)
+ parsing. If you experienced subtle and/or bizarre behavior when trying
+ to use the LALR(1) engine, this may correct those problems. Patch
+ contributed by Russ Cox. Note: This patch has been superseded by
+ revisions for LALR(1) parsing in Ply-2.0.
+
+08/02/06: beazley
+ Added support for slicing of productions in yacc.
+ Patch contributed by Patrick Mezard.
+
+Version 1.7
+------------------------------
+03/02/06: beazley
+ Fixed an infinite recursion problem in the ReduceToTerminals() function that
+ would sometimes come up in LALR(1) table generation. Reported by
+ Markus Schoepflin.
+
+03/01/06: beazley
+ Added "reflags" argument to lex(). For example:
+
+ lex.lex(reflags=re.UNICODE)
+
+ This can be used to specify optional flags to the re.compile() function
+ used inside the lexer. This may be necessary for special situations such
+ as processing Unicode (e.g., if you want escapes like \w and \b to consult
+ the Unicode character property database). The need for this was suggested by
+ Andreas Jung.
+
+03/01/06: beazley
+ Fixed a bug with an uninitialized variable on repeated instantiations of parser
+ objects when the write_tables=0 argument was used. Reported by Michael Brown.
+
+03/01/06: beazley
+ Modified lex.py to accept Unicode strings both as the regular expressions for
+ tokens and as input. Hopefully this is the only change needed for Unicode support.
+ Patch contributed by Johan Dahl.
+
+03/01/06: beazley
+ Modified the class-based interface to work with new-style or old-style classes.
+ Patch contributed by Michael Brown (although I tweaked it slightly so it would work
+ with older versions of Python).
+
+Version 1.6
+------------------------------
+05/27/05: beazley
+ Incorporated patch contributed by Christopher Stawarz to fix an extremely
+ devious bug in LALR(1) parser generation. This patch should fix problems
+ numerous people reported with LALR parsing.
+
+05/27/05: beazley
+ Fixed problem with lex.py copy constructor. Reported by Dave Aitel, Aaron Lav,
+ and Thad Austin.
+
+05/27/05: beazley
+ Added outputdir option to yacc() to control output directory. Contributed
+ by Christopher Stawarz.
+
+05/27/05: beazley
+ Added rununit.py test script to run tests using the Python unittest module.
+ Contributed by Miki Tebeka.
+
+Version 1.5
+------------------------------
+05/26/04: beazley
+ Major enhancement. LALR(1) parsing support is now working.
+ This feature was implemented by Elias Ioup (ezioup@alumni.uchicago.edu)
+ and optimized by David Beazley. To use LALR(1) parsing do
+ the following:
+
+ yacc.yacc(method="LALR")
+
+ Computing LALR(1) parsing tables takes about twice as long as
+ the default SLR method. However, LALR(1) allows you to handle
+ more complex grammars. For example, the ANSI C grammar
+ (in example/ansic) has 13 shift-reduce conflicts with SLR, but
+ only has 1 shift-reduce conflict with LALR(1).
+
+05/20/04: beazley
+ Added a __len__ method to parser production lists. Can
+ be used in parser rules like this:
+
+ def p_somerule(p):
+ """a : B C D
+ | E F"""
+ if (len(p) == 3):
+ # Must have been first rule
+ elif (len(p) == 2):
+ # Must be second rule
+
+ Suggested by Joshua Gerth and others.
+
+Version 1.4
+------------------------------
+04/23/04: beazley
+ Incorporated a variety of patches contributed by Eric Raymond.
+ These include:
+
+ 0. Cleans up some comments so they don't wrap on an 80-column display.
+ 1. Directs compiler errors to stderr where they belong.
+ 2. Implements and documents automatic line counting when \n is ignored.
+ 3. Changes the way progress messages are dumped when debugging is on.
+ The new format is both less verbose and conveys more information than
+ the old, including shift and reduce actions.
+
+04/23/04: beazley
+ Added a Python setup.py file to simplify installation. Contributed
+ by Adam Kerrison.
+
+04/23/04: beazley
+ Added patches contributed by Adam Kerrison.
+
+ - Some output is now only shown when debugging is enabled. This
+ means that PLY will be completely silent when not in debugging mode.
+
+ - An optional parameter "write_tables" can be passed to yacc() to
+ control whether or not parsing tables are written. By default,
+ it is true, but it can be turned off if you don't want the yacc
+ table file. Note: disabling this will cause yacc() to regenerate
+ the parsing table each time.
+
+04/23/04: beazley
+ Added patches contributed by David McNab. This patch adds two
+ features:
+
+ - The parser can be supplied as a class instead of a module.
+ For an example of this, see the example/classcalc directory.
+
+ - Debugging output can be directed to a filename of the user's
+ choice. Use
+
+ yacc(debugfile="somefile.out")
+
+
+Version 1.3
+------------------------------
+12/10/02: jmdyck
+ Various minor adjustments to the code that Dave checked in today.
+ Updated test/yacc_{inf,unused}.exp to reflect today's changes.
+
+12/10/02: beazley
+ Incorporated a variety of minor bug fixes to empty production
+ handling and infinite recursion checking. Contributed by
+ Michael Dyck.
+
+12/10/02: beazley
+ Removed bogus recover() method call in yacc.restart()
+
+Version 1.2
+------------------------------
+11/27/02: beazley
+ Lexer and parser objects are now available as an attribute
+ of tokens and slices respectively. For example:
+
+ def t_NUMBER(t):
+ r'\d+'
+ print t.lexer
+
+ def p_expr_plus(t):
+ 'expr: expr PLUS expr'
+ print t.lexer
+ print t.parser
+
+ This can be used for state management (if needed).
+
+10/31/02: beazley
+ Modified yacc.py to work with Python optimize mode. To make
+ this work, you need to use
+
+ yacc.yacc(optimize=1)
+
+ Furthermore, you need to first run Python in normal mode
+ to generate the necessary parsetab.py files. After that,
+ you can use python -O or python -OO.
+
+ Note: optimized mode turns off a lot of error checking.
+ Only use when you are sure that your grammar is working.
+ Make sure parsetab.py is up to date!
+
+10/30/02: beazley
+ Added cloning of Lexer objects. For example:
+
+ import copy
+ l = lex.lex()
+ lc = copy.copy(l)
+
+ l.input("Some text")
+ lc.input("Some other text")
+ ...
+
+ This might be useful if the same "lexer" is meant to
+ be used in different contexts---or if multiple lexers
+ are running concurrently.
+
+10/30/02: beazley
+ Fixed subtle bug with first set computation and empty productions.
+ Patch submitted by Michael Dyck.
+
+10/30/02: beazley
+ Fixed error messages to use "filename:line: message" instead
+ of "filename:line. message". This makes error reporting more
+ friendly to emacs. Patch submitted by François Pinard.
+
+10/30/02: beazley
+ Improvements to parser.out file. Terminals and nonterminals
+ are sorted instead of being printed in random order.
+ Patch submitted by François Pinard.
+
+10/30/02: beazley
+ Improvements to parser.out file output. Rules are now printed
+ in a way that's easier to understand. Contributed by Russ Cox.
+
+10/30/02: beazley
+ Added 'nonassoc' associativity support. This can be used
+ to disable the chaining of operators like a < b < c.
+ To use, simply specify 'nonassoc' in the precedence table
+
+ precedence = (
+ ('nonassoc', 'LESSTHAN', 'GREATERTHAN'), # Nonassociative operators
+ ('left', 'PLUS', 'MINUS'),
+ ('left', 'TIMES', 'DIVIDE'),
+ ('right', 'UMINUS'), # Unary minus operator
+ )
+
+ Patch contributed by Russ Cox.
+
+10/30/02: beazley
+ Modified the lexer to provide optional support for Python -O and -OO
+ modes. To make this work, Python *first* needs to be run in
+ unoptimized mode. This reads the lexing information and creates a
+ file "lextab.py". Then, run lex like this:
+
+ # module foo.py
+ ...
+ ...
+ lex.lex(optimize=1)
+
+ Once the lextab file has been created, subsequent calls to
+ lex.lex() will read data from the lextab file instead of using
+ introspection. In optimized mode (-O, -OO) everything should
+ work normally despite the loss of doc strings.
+
+ To change the name of the file 'lextab.py' use the following:
+
+ lex.lex(lextab="footab")
+
+ (this creates a file footab.py)
+
+
+Version 1.1 October 25, 2001
+------------------------------
+
+10/25/01: beazley
+ Modified the table generator to produce much more compact data.
+ This should greatly reduce the size of the parsetab.py[c] file.
+ Caveat: the tables still need to be constructed so a little more
+ work is done in parsetab on import.
+
+10/25/01: beazley
+ There may be a bug in the cycle detector that reports errors
+ about infinite recursion. I'm having a little trouble tracking it
+ down, but if you get this problem, you can disable the cycle
+ detector as follows:
+
+ yacc.yacc(check_recursion = 0)
+
+10/25/01: beazley
+ Fixed a bug in lex.py that sometimes caused illegal characters to be
+ reported incorrectly. Reported by Sverre Jørgensen.
+
+7/8/01 : beazley
+ Added a reference to the underlying lexer object when tokens are handled by
+ functions. The lexer is available as the 'lexer' attribute. This
+ was added to provide better lexing support for languages such as Fortran
+ where certain types of tokens can't be conveniently expressed as regular
+ expressions (and where the tokenizing function may want to perform a
+ little backtracking). Suggested by Pearu Peterson.
+
+6/20/01 : beazley
+ Modified yacc() function so that an optional starting symbol can be specified.
+ For example:
+
+ yacc.yacc(start="statement")
+
+ Normally yacc always treats the first production rule as the starting symbol.
+ However, if you are debugging your grammar it may be useful to specify
+ an alternative starting symbol. Idea suggested by Rich Salz.
+
+Version 1.0 June 18, 2001
+--------------------------
+Initial public offering
+