MultiPassString¶
Converts an ITU to HTML.
-
class
cpip.util.MultiPassString.
MultiPassString
(theFileObj)¶ Reads a file. The file can then be translated any number of times and marked with word types. The result can then be generated using BufGen, for example:
    myBg = BufGen.BufGen(self._mps.genChars())
    try:
        i = 0
        while True:
            print(myBg[i])
            i += 1
    except IndexError:
        pass
-
_MultiPassString__set
(idx, repl)¶ Sets a marker.
Parameters:
- idx (int) – Index of marker.
- repl (str) – The marker.
Returns: NoneType
Raises: ExceptionMultiPass on index error.
-
__init__
(theFileObj)¶ Constructor.
Parameters: theFileObj (_io.TextIOWrapper) – File-like object.
Returns: NoneType
-
__weakref__
¶ list of weak references to the object (if defined)
-
_retZeroIndex
()¶ Returns the index to the first non-empty token in the current list.
Returns: int – The index.
-
clearMarker
()¶ Clears the mark at this point in the input.
Returns: NoneType
-
genChars
()¶ Generates the current set of characters. This can be used as the generator for a BufGen, and that BufGen can be passed to the PpTokeniser _slice... functions.
-
genWords
()¶ Generates pairs (word, type) from the original string.
TODO: Solve the overlap problem.
Returns: NoneType, tuple([str, str]) – a pair of (word, type)
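The shape of the (word, type) pairs can be illustrated with a minimal sketch. It is not the cpip implementation: it assumes that each record holds a length and a type (mirroring the fields of the Word class documented further down) and that records lie end to end with no overlap. The function name gen_words and the type strings are hypothetical.

```python
from collections import namedtuple

# Stand-in record with the same field names as the documented Word class.
Word = namedtuple("Word", ["wordLen", "wordType"])


def gen_words(text, words):
    """Yield (word, type) pairs by walking contiguous length records over text.

    Assumption: records do not overlap and are laid end to end.
    """
    start = 0
    for w in words:
        yield text[start:start + w.wordLen], w.wordType
        start += w.wordLen


# Hypothetical records for the three tokens of "int x".
records = [Word(3, "keyword"), Word(1, "whitespace"), Word(1, "identifier")]
pairs = list(gen_words("int x", records))
print(pairs)
```

The overlap problem mentioned in the TODO is deliberately outside this sketch, which only handles back-to-back words.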
-
hasWord
¶ True if the length of the current word is > 0.
-
prevChar
¶ The previous character of the input. A slight nod to K&R: this is (not) a bit like putc().
-
removeMarkedWord
(isTerm)¶ Removes the current marked word. isTerm is a boolean that is True if the current position is a terminal character.
For example, if you want to split a string into lines then \n is a terminal character and you would call this with isTerm=True.
However, if you were splitting a string into words and whitespace then a whitespace character following a word is not a terminal character, so at the point of receiving the whitespace character you would call this with isTerm=False.
Parameters: isTerm (bool) – True if a terminal character.
Returns: NoneType
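The difference between the two cases can be sketched without cpip. The code below is an illustration only. It assumes that a terminal character belongs to the marked word and that a non-terminal character starts the next one. The helper names marked_word, split_lines and split_words are hypothetical.

```python
def marked_word(text, marker, pos, is_term):
    """Return text from marker up to pos; text[pos] is included only if is_term."""
    end = pos + 1 if is_term else pos
    return text[marker:end]


def split_lines(text):
    """Split into lines: the newline ends the word, so is_term=True."""
    out, marker = [], 0
    for pos, ch in enumerate(text):
        if ch == "\n":
            out.append(marked_word(text, marker, pos, is_term=True))
            marker = pos + 1
    if marker < len(text):
        out.append(text[marker:])  # trailing text without a newline
    return out


def split_words(text):
    """Split into runs of words and whitespace: the character that reveals
    the boundary belongs to the next run, so is_term=False."""
    out, marker = [], 0
    for pos, ch in enumerate(text):
        if pos > marker and ch.isspace() != text[marker].isspace():
            out.append(marked_word(text, marker, pos, is_term=False))
            marker = pos
    if marker < len(text):
        out.append(text[marker:])  # final run
    return out


lines = split_lines("ab\ncd\n")
words = split_words("ab  cd")
print(lines, words)
```

In the line case the newline stays attached to its line. In the word case the whitespace that ends a word becomes the first character of the following whitespace run.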
-
removeSetReplaceClear
(isTerm, theType, theRepl)¶ A helper that combines a common sequence of operations:
- Removing the marked word from the output.
- Setting the word type in the input.
- Replacing the marked word with a replacement string.
- Clearing the current marker.
Parameters:
- isTerm (bool) – See removeMarkedWord() for an explanation of this.
- theType (str) – See setWordType() for an explanation of this.
- theRepl (str) – See setAtMarker() for an explanation of this.
Returns: NoneType
-
setAtMarker
(theRepl)¶ Sets the token at the current marker to be theRepl.
Parameters: theRepl (str) – The replacement string.
Returns: NoneType
Raises: ExceptionMultiPass if no marker is present.
-
setMarker
()¶ Sets a mark at this point in the input.
Returns: NoneType
-
setWordType
(theType, isTerm)¶ Marks a word in the input as a word of type theType starting from the marker up to the current place.
See removeMarkedWord() for an explanation of isTerm.
Parameters:
- theType (str) – The type.
- isTerm (bool) – Is a terminal character.
Returns: NoneType
-
wordLength
¶ The length of the current word.
Returns: int – The length.
-
-
class
cpip.util.MultiPassString.
Word
(wordLen, wordType)¶ -
__getnewargs__
()¶ Return self as a plain tuple. Used by copy and pickle.
-
static
__new__
(_cls, wordLen, wordType)¶ Create new instance of Word(wordLen, wordType)
-
__repr__
()¶ Return a nicely formatted representation string
-
_asdict
()¶ Return a new OrderedDict which maps field names to their values.
-
classmethod
_make
(iterable, new=<built-in method __new__ of type object at 0xa385c0>, len=<built-in function len>)¶ Make a new Word object from a sequence or iterable
-
_replace
(_self, **kwds)¶ Return a new Word object replacing specified fields with new values
-
wordLen
¶ Alias for field number 0
-
wordType
¶ Alias for field number 1
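The members listed for Word are the standard ones of a collections.namedtuple, so their behaviour can be tried on a stand-in with the same two field names. This is a sketch and does not import cpip. The type strings "keyword" and "comment" are arbitrary example values.

```python
from collections import namedtuple

# Stand-in with the same field names as the documented Word class.
Word = namedtuple("Word", ["wordLen", "wordType"])

w = Word(wordLen=5, wordType="keyword")

# Fields are reachable by name or by position (field 0 and field 1).
by_name = (w.wordLen, w.wordType)
by_index = (w[0], w[1])

# _make builds a Word from any iterable of two items.
from_list = Word._make([3, "comment"])

# _replace returns a new object; the original is left unchanged.
longer = w._replace(wordLen=7)

# _asdict maps field names to their values.
as_dict = dict(w._asdict())

print(w, from_list, longer, as_dict)
```

Because a Word is immutable, changing a word's length or type means building a new record with _replace.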
-