Module peg :: Class Pattern

Class Pattern

Known Subclasses:

A PEG pattern or set of patterns.

PEG Syntax:

 Operator      Type      Precedence  Description
 --------  ------------  ----------  -----------
   ' '       primary         5       Literal string
   " "       primary         5       Literal string
  [' ']      primary         5       Character class
    1        primary         5       Any character
   (e)       primary         5       Grouping
  e[:1]    unary suffix      4       Optional
  e[0:]    unary suffix      4       Zero-or-more
  e[1:]    unary suffix      4       One-or-more
   +e      unary prefix      3       And-predicate
   -e      unary prefix      3       Not-predicate
 e1 + e2      binary         2       Sequence
 e1 | e2      binary         1       Prioritized Choice

Additionally:

 Operator      Type      Precedence  Description
 --------  ------------  ----------  -----------
 e1 - e2      binary         2       Syntactic sugar for -e2 + e1
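
The precedence column lines up with Python's own operator precedence: subscription binds tighter than unary ``+`` and ``-``, which bind tighter than binary ``+`` and ``-``, which bind tighter than ``|``. The sketch below uses a hypothetical ``Expr`` class (not part of this module) that only records how Python groups an expression, so the grouping can be inspected as text.

```python
class Expr:
    """Hypothetical stand-in that records how Python groups the operators."""

    def __init__(self, text):
        self.text = text

    def __getitem__(self, key):
        # Subscription is applied first (precedence 4 in the table).
        start = '' if key.start is None else key.start
        stop = '' if key.stop is None else key.stop
        return Expr('%s[%s:%s]' % (self.text, start, stop))

    def __pos__(self):
        return Expr('(+%s)' % self.text)

    def __neg__(self):
        return Expr('(-%s)' % self.text)

    def __add__(self, other):
        return Expr('(%s + %s)' % (self.text, other.text))

    def __or__(self, other):
        return Expr('(%s | %s)' % (self.text, other.text))


a, b, c = Expr('a'), Expr('b'), Expr('c')
grouped = (-a + b[1:] | c).text
print(grouped)  # (((-a) + b[1:]) | c)
```

Parentheses are still needed wherever the intended grouping differs, for example ``(a | b) + c``.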
Instance Methods
 
__init__(self, pattern)
Creates and returns a PEG pattern from a given parameter.
 
__getitem__(self, key)
Returns a PEG repetition of this pattern given a slice.
 
__pos__(self)
Returns a PEG And-predicate for the current pattern.
 
__neg__(self)
Returns a PEG Not-predicate for the current pattern.
 
__add__(self, other)
Joins this pattern and another together in a PEG Sequence and returns it.
 
__sub__(self, other)
Syntactic sugar for ``-other + self``.
 
__or__(self, other)
Joins this pattern and another together in a PEG Prioritized Choice and returns it.
 
match(self, subject, init=0)
Matches this PEG pattern against a subject string starting at init.
 
__match(self, pattern, subject, init, userdata)
Private helper function used internally.
Class Variables
  REPETITION_LIMIT = 100
Method Details

__init__(self, pattern)
(Constructor)

Creates and returns a PEG pattern from a given parameter.

For a given parameter:
* literal string - Matches its specific text (case insensitively) in a
                   subject string.
* list - Must contain a single string which is treated as a character class,
         otherwise a ``ValueError`` will be raised.
         Matches any character in a subject string contained in this class.
* integer - Must be positive or a ``ValueError`` will be raised.
            Matches exactly that many characters in a subject string.
* function - The function will be passed the subject string and current
             index the PEG is at. If a non-negative integer is returned,
             that integer becomes the new index, indicating a match.
* dictionary - A dictionary is interpreted as a grammar. These are useful
               for defining recursive patterns. Each key is a rule name
               whose value must be a pattern, otherwise a ``TypeError`` will
               be raised. There must also be a 'default' key whose value is
               the name of the rule to match first; a ``ValueError`` will be
               raised if it does not exist. If it names a rule that is not
               in the dictionary, ``KeyError`` will be raised.
* Pattern - The pattern is simply copied.
* anything else - ``TypeError`` will be raised.

Warning: left-recursive patterns are not checked for and will cause a
``RuntimeError`` when ``match`` is called due to a recursion overflow.
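
To illustrate the warning, here is a standalone sketch that does not use this module. The two hand-written functions are hypothetical stand-ins for grammar rules: they take a subject and an index and return the new index or ``-1``. The left-recursive rule re-enters itself before consuming any input, so Python's recursion limit is hit. On Python 3.5 and later the exception is ``RecursionError``, a subclass of ``RuntimeError``.

```python
def left_recursive(subject, index):
    # Rule: expr <- expr '+n' | 'n'
    # The first alternative re-enters the rule at the same index,
    # so no input is consumed before the next call.
    end = left_recursive(subject, index)
    if end != -1 and subject.startswith('+n', end):
        return end + 2
    return index + 1 if subject.startswith('n', index) else -1


def right_recursive(subject, index):
    # Equivalent rule without left recursion: expr <- 'n' ('+' expr)?
    if not subject.startswith('n', index):
        return -1
    index += 1
    if subject.startswith('+', index):
        rest = right_recursive(subject, index + 1)
        if rest != -1:
            return rest
    return index


try:
    left_recursive('n+n', 0)
    outcome = 'matched'
except RuntimeError as error:
    outcome = type(error).__name__

rewritten = right_recursive('n+n', 0)
print(outcome, rewritten)
```

Rewriting the rule so that it consumes input before recursing is the usual way around the problem.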

Examples:

  # creates a pattern that matches 'foo' literally
  patt = Pattern('foo')

  # creates a pattern that matches 'a', or 'b', or 'c' character once
  patt = Pattern(['abc'])

  # creates a pattern that matches any single character
  patt = Pattern(1)

  # creates a pattern that matches any even, single digit number
  def even_single_digit(subject, index):
    # guard the lookup so an index at the end of the subject is not an error
    if index < len(subject) and subject[index] in '02468':
      return index + 1
  patt = Pattern(even_single_digit)

  # creates a pattern that searches for 'e'
  patt = Pattern({
    1: Pattern('e') | Pattern(1) + PatternBackRef(1),
    'default': 1
  })
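
The grammar in the last example can be read as the recursive rule "match 'e' here, or else skip one character and try the rule again". The function below is a standalone sketch of that reading, not the module's implementation. The name ``search_e`` is hypothetical, and the comparison is case sensitive for simplicity, unlike the literal matching described above.

```python
def search_e(subject, index=0):
    # Rule 1 <- 'e' | any-character Rule 1
    if subject.startswith('e', index):
        # First alternative: 'e' found, return the index just past it.
        return index + 1
    if index < len(subject):
        # Second alternative: consume one character, then recurse.
        return search_e(subject, index + 1)
    # Ran out of input without finding 'e'.
    return -1


found = search_e('hello')
missing = search_e('sky')
print(found, missing)  # 2 -1
```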

__getitem__(self, key)
(Indexing operator)

Returns a PEG repetition of this pattern given a slice.

Valid slices are:
* ``[:1]`` - matches zero or one occurrence of the pattern.
* ``[0:]`` - matches zero or more occurrences of the pattern.
* ``[1:]`` - matches one or more occurrences of the pattern.
* anything else - ``ValueError`` will be raised.

Warning: applying zero-or-more repetition to a pattern that does not consume input will cause a ``RuntimeError`` after ``self.REPETITION_LIMIT`` loops. This cannot be checked for in advance.

Examples:

 # creates a pattern that matches 'foo' either not at all or only once
 patt = Pattern('foo')[:1]

 # creates a pattern that matches 'bar' either not at all or many times
 patt = Pattern('bar')[0:]

 # creates a pattern that matches 'baz' at least once
 patt = Pattern('baz')[1:]
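
The following standalone sketch shows one way a greedy repetition loop with a stall limit could behave. It is an assumption about the general technique, not a copy of this class's code. ``repeat``, ``match_ab`` and ``match_nothing`` are hypothetical helpers that take a subject and an index and return the new index or ``-1``.

```python
REPETITION_LIMIT = 100  # same value as the class variable listed on this page


def repeat(matcher, subject, index, minimum):
    """Greedy repetition. Returns the new index, or -1 if fewer than
    `minimum` repetitions matched."""
    count = 0
    stalled = 0  # consecutive iterations that consumed no input
    while True:
        end = matcher(subject, index)
        if end == -1:
            break
        count += 1
        stalled = stalled + 1 if end == index else 0
        if stalled >= REPETITION_LIMIT:
            raise RuntimeError('repeated pattern does not consume input')
        index = end
    return index if count >= minimum else -1


def match_ab(subject, index):
    # Hypothetical matcher for the literal 'ab'.
    return index + 2 if subject.startswith('ab', index) else -1


def match_nothing(subject, index):
    # Always succeeds without consuming input, like a predicate.
    return index


zero_or_more = repeat(match_ab, 'ababx', 0, 0)
one_or_more = repeat(match_ab, 'xab', 0, 1)
try:
    repeat(match_nothing, 'abc', 0, 0)
    stalled_outcome = 'finished'
except RuntimeError:
    stalled_outcome = 'RuntimeError'
print(zero_or_more, one_or_more, stalled_outcome)  # 4 -1 RuntimeError
```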

__pos__(self)

Returns a PEG And-predicate for the current pattern.

An And-predicate matches the pattern, but does not consume input.

Examples:

 # creates a pattern that matches 'foo' literally, but does not consume it
 patt = +Pattern('foo')

 # creates a pattern that matches 'foo', but only consumes 'f'
 patt = +Pattern('foo') + 'f'
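
As a standalone sketch of the semantics (not this module's code), an And-predicate can be modelled as a wrapper that runs the inner matcher and, on success, returns the original index instead of the advanced one. ``literal``, ``and_predicate`` and ``sequence`` are hypothetical helpers using the convention "return the new index, or ``-1`` on failure".

```python
def literal(text):
    def match(subject, index):
        return index + len(text) if subject.startswith(text, index) else -1
    return match


def and_predicate(matcher):
    def match(subject, index):
        # Succeed where the inner matcher succeeds, but consume nothing.
        return index if matcher(subject, index) != -1 else -1
    return match


def sequence(first, second):
    def match(subject, index):
        middle = first(subject, index)
        return second(subject, middle) if middle != -1 else -1
    return match


lookahead = and_predicate(literal('foo'))
peek_then_f = sequence(lookahead, literal('f'))
results = (lookahead('foobar', 0),
           peek_then_f('foobar', 0),
           peek_then_f('fab', 0))
print(results)  # (0, 1, -1)
```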

__neg__(self)

Returns a PEG Not-predicate for the current pattern.

A Not-predicate matches the inverse of the pattern, consuming no input.

Examples:

 # creates a pattern that does not match the characters 'a', 'b', or 'c'
 patt = -Pattern(['abc'])

 # creates a pattern that matches 'foo' only when it is not followed by 'bar'
 patt = Pattern('foo') + -Pattern('bar')
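
The second example can be spelled out by hand as a standalone function. This is a sketch of the semantics only: ``foo_not_bar`` is a hypothetical name, and the comparison is case sensitive for simplicity.

```python
def foo_not_bar(subject, index=0):
    # Sequence, part 1: the literal 'foo' must match and is consumed.
    if not subject.startswith('foo', index):
        return -1
    index += 3
    # Sequence, part 2: Not-predicate on 'bar'. It fails if 'bar' comes
    # next, and otherwise succeeds without advancing the index.
    return -1 if subject.startswith('bar', index) else index


outcomes = (foo_not_bar('foobaz'), foo_not_bar('foobar'), foo_not_bar('foo'))
print(outcomes)  # (3, -1, 3)
```

Note that the Not-predicate also succeeds at the end of the subject, since nothing follows that could match.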

__add__(self, other)
(Addition operator)

Joins this pattern and another together in a PEG Sequence and returns it.

A Sequence matches the first pattern and then the second.

If the other pattern is a literal string, list, integer, function, or dictionary, it is converted into a PEG pattern as in ``__init__`` first.

Examples:

 # creates a pattern that matches 'foo' and 'bar' literally
 patt = Pattern('foo') + 'bar'
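
The conversion of the right-hand operand can be sketched with a miniature class. ``Lit`` is hypothetical and handles literal strings only; it is meant to show the coercion step in an overloaded ``__add__``, not how this class is implemented.

```python
class Lit:
    """Hypothetical miniature pattern that matches literal text only."""

    def __init__(self, text):
        self.parts = [text]

    def __add__(self, other):
        # Convert a plain string into a pattern before joining.
        if not isinstance(other, Lit):
            other = Lit(other)
        joined = Lit('')
        joined.parts = self.parts + other.parts
        return joined

    def match(self, subject, init=0):
        index = init
        for part in self.parts:
            # Each part must match where the previous one ended.
            if not subject.startswith(part, index):
                return -1
            index += len(part)
        return index


joined = Lit('foo') + 'bar'
print(joined.match('foobar'), joined.match('foobaz'))  # 6 -1
```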

__sub__(self, other)
(Subtraction operator)

Syntactic sugar for ``-other + self``.

Examples:

 # creates a pattern that matches any character but 'a'
 patt = Pattern(1) - 'a'
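
Expanding the sugar by hand, ``Pattern(1) - 'a'`` reads as "fail if 'a' comes next, otherwise consume one character". A standalone sketch of that reading follows, with ``any_char_except`` as a hypothetical helper.

```python
def any_char_except(excluded):
    def match(subject, index):
        # -other: the Not-predicate fails when the excluded text comes next.
        if subject.startswith(excluded, index):
            return -1
        # + self: then match any single character, if one is left.
        return index + 1 if index < len(subject) else -1
    return match


not_a = any_char_except('a')
outcomes = (not_a('banana', 0), not_a('apple', 0), not_a('', 0))
print(outcomes)  # (1, -1, -1)
```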

__or__(self, other)
(Or operator)

Joins this pattern and another together in a PEG Prioritized Choice and returns it.

A Prioritized Choice matches either pattern, the first taking precedence.

If the other pattern is a literal string, list, integer, function, or dictionary, it is converted into a PEG pattern as in ``__init__`` first.

Examples:

 # creates a pattern that matches 'foo' first, or else 'bar'
 patt = Pattern('foo') | 'bar'
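
Because the choice is prioritized, the order of the alternatives matters: once an earlier alternative matches, later ones are not tried, even if they would match more input. The standalone sketch below shows this with a hypothetical ``choice`` helper over literal alternatives.

```python
def choice(subject, index, alternatives):
    # Try each alternative in order and commit to the first that matches.
    for text in alternatives:
        if subject.startswith(text, index):
            return index + len(text)
    return -1


short_first = choice('foobar', 0, ['foo', 'foobar'])
long_first = choice('foobar', 0, ['foobar', 'foo'])
no_match = choice('baz', 0, ['foo', 'bar'])
print(short_first, long_first, no_match)  # 3 6 -1
```

Listing the longer alternative first is the usual way to make it take effect when one alternative is a prefix of another.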

match(self, subject, init=0)

Matches this PEG pattern against a subject string starting at init. If any captures are contained in the pattern, they are returned as a tuple. Otherwise, the index of the first character in subject after the match is returned, or ``-1`` if the match failed.

``IndexError`` will be raised if ``init >= len(subject)``.

Examples:

 patt = Pattern('foo')
 index = patt.match('foobar')
 #result: index = 3

 patt = Pattern('bar')
 index = patt.match('foobar', 3)
 #result: index = 6

 patt = SimpleCapture('foo')
 text, = patt.match('foobar')
 #result: text = 'foo'
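
Since the return value is either a tuple of captures or an integer index, calling code may want to tell the cases apart. The helper below is a hypothetical caller-side sketch that relies only on the return values described above. What a failed match returns for a pattern that contains captures is not stated here, so that case is not covered.

```python
def describe(result):
    if isinstance(result, tuple):
        # The pattern contained captures.
        return 'captures: ' + ', '.join(repr(item) for item in result)
    if result == -1:
        # An integer -1 signals a failed match.
        return 'no match'
    # Any other integer is the index just past the match.
    return 'matched up to index %d' % result


summaries = [describe(3), describe(-1), describe(('foo',))]
print(summaries)
```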

__match(self, pattern, subject, init, userdata)

Private helper function used internally. Behaves like ``match``, but always returns an index.