MultiPassString¶

Reads a file-like object into a string that can be translated in multiple passes and marked with word types.

class cpip.util.MultiPassString.MultiPassString(theFileObj)¶

Reads a file whose contents can then be translated any number of times and marked with word types. The result can be generated using BufGen, for example:

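from cpip.util import BufGen

# Wrap the character generator in an indexable buffer.
myBg = BufGen.BufGen(self._mps.genChars())
try:
    i = 0
    while True:
        # Indexing past the end of the generator raises IndexError.
        print(myBg[i])
        i += 1
except IndexError:
    pass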
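
The loop above relies on the buffer being indexable and raising IndexError once the underlying generator is exhausted. The following is a minimal sketch of that pattern only. `SimpleBufGen` and `read_all` are hypothetical names written for illustration; this is not the cpip BufGen implementation.

```python
class SimpleBufGen:
    """Hypothetical stand-in: buffers a generator so it can be indexed."""

    def __init__(self, gen):
        self._gen = gen
        self._buf = []

    def __getitem__(self, idx):
        # Pull items from the generator until the buffer covers idx.
        while len(self._buf) <= idx:
            try:
                self._buf.append(next(self._gen))
            except StopIteration:
                # Generator exhausted: report this as an index error.
                raise IndexError(idx) from None
        return self._buf[idx]


def read_all(buf):
    """Index upward until IndexError, mirroring the loop above."""
    result = []
    i = 0
    while True:
        try:
            result.append(buf[i])
        except IndexError:
            break
        i += 1
    return result


chars = read_all(SimpleBufGen(iter("abc")))
print(chars)
```

Buffering what has already been read means earlier indices stay readable while later ones are fetched on demand.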

_MultiPassString__set(idx, repl)¶

Sets a marker.

Parameters:	idx (`int`) – Index of marker. repl (`str`) – The marker.
Returns:	`NoneType`
Raises:	`ExceptionMultiPass` on index error.

__init__(theFileObj)¶

Constructor.

Parameters:	theFileObj (`_io.TextIOWrapper`) – File like object.
Returns:	`NoneType`

__weakref__¶: list of weak references to the object (if defined)

_retZeroIndex()¶

Returns the index to the first non-empty token in the current list.

Returns:	`int` – The index.

clearMarker()¶

Clears the mark at this point in the input.

Returns:	`NoneType`

genChars()¶: Generates the current set of characters. This can be used as the generator for a BufGen, and that BufGen can be passed to the PpTokeniser _slice... functions.

genWords()¶

Generates pairs (word, type) from the original string.

TODO: Solve the overlap problem.

Returns:	`NoneType`, `tuple([str, str])` – a pair of `(word, type)`
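
As an illustration of what a stream of `(word, type)` pairs looks like, here is a minimal sketch that walks a string using length and type records. It assumes the words are contiguous and non-overlapping (the overlap problem noted in the TODO is not addressed), and `gen_words` and the type names are hypothetical; this is not the library's implementation.

```python
import collections

# Record of a word's length and type, with the same fields as Word.
Word = collections.namedtuple("Word", "wordLen wordType")


def gen_words(text, words):
    """Yield (word, type) pairs by slicing text with each record's length."""
    pos = 0
    for w in words:
        yield text[pos:pos + w.wordLen], w.wordType
        pos += w.wordLen


records = [Word(3, "keyword"), Word(1, "whitespace"), Word(1, "identifier")]
pairs = list(gen_words("int x", records))
print(pairs)
```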

hasWord¶: True if the length of the current word is > 0.

prevChar¶: The previous character of the input. A slight nod to K&R, this is (not) a bit like putc().

removeMarkedWord(isTerm)¶

Remove the current marked word. isTerm is a boolean that is True if the current position is a terminal character.

For example, if you want to split a string into lines then \n is a terminal character and you would call this with isTerm=True.

However, if you were splitting a string into words and whitespace then a whitespace character following a word is not the terminal character, so at the point of receiving the whitespace character you would call this with isTerm=False.

Parameters:	isTerm (`bool`) – True if a terminal character.
Returns:	`NoneType`
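
The effect of isTerm can be pictured as whether the current character belongs to the marked word. The sketch below assumes exactly that reading: with isTerm=True the character at the current position is included, with isTerm=False it is excluded. `marked_span` is a hypothetical helper for illustration and does not reproduce how the class stores or removes words.

```python
def marked_span(text, mark, current, is_term):
    """Return the text from mark up to current.

    Assumption: a terminal character (is_term=True) is part of the word,
    a non-terminal one (is_term=False) is not.
    """
    end = current + 1 if is_term else current
    return text[mark:end]


text = "one two\n"

# Splitting into lines: the newline terminates the line and belongs to it.
line = marked_span(text, 0, text.index("\n"), True)

# Splitting into words: the space after "one" is not part of the word.
word = marked_span(text, 0, text.index(" "), False)

print(repr(line), repr(word))
```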

removeSetReplaceClear(isTerm, theType, theRepl)¶

This provides a helper combination function for a common operation of:

Removing the marked word from the output.
Setting the word type in the input.
Replacing the marked word with a replacement string.
Clearing the current marker.

Parameters:	isTerm (`bool`) – See `removeMarkedWord()` for an explanation of this. theType (`str`) – See `setWordType()` for an explanation of this. theRepl (`str`) – See `setAtMarker()` for an explanation of this.
Returns:	`NoneType`
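
The four steps listed above map onto the other methods documented on this page. The sketch below shows that composition, in the listed order, against a recording stand-in rather than a real MultiPassString. `RecordingStub`, `remove_set_replace_clear` and the argument values are hypothetical, and the actual method body may differ.

```python
class RecordingStub:
    """Hypothetical stand-in that records which methods are called."""

    def __init__(self):
        self.calls = []

    def removeMarkedWord(self, isTerm):
        self.calls.append(("removeMarkedWord", isTerm))

    def setWordType(self, theType, isTerm):
        self.calls.append(("setWordType", theType, isTerm))

    def setAtMarker(self, theRepl):
        self.calls.append(("setAtMarker", theRepl))

    def clearMarker(self):
        self.calls.append(("clearMarker",))


def remove_set_replace_clear(mps, isTerm, theType, theRepl):
    """Apply the four steps in the order the description lists them."""
    mps.removeMarkedWord(isTerm)      # remove the marked word
    mps.setWordType(theType, isTerm)  # record the word type
    mps.setAtMarker(theRepl)          # put the replacement at the marker
    mps.clearMarker()                 # clear the marker


stub = RecordingStub()
remove_set_replace_clear(stub, True, "comment", " ")
for call in stub.calls:
    print(call)
```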

setAtMarker(theRepl)¶

Sets the token at the current marker to be theRepl.

Parameters:	theRepl (`str`) – The replacement string.
Returns:	`NoneType`
Raises:	`ExceptionMultiPass` if no marker is present.

setMarker()¶

Sets a mark at this point in the input.

Returns:	`NoneType`

setWordType(theType, isTerm)¶

Marks a word in the input as a word of type theType starting from the marker up to the current place.

See removeMarkedWord() for an explanation of isTerm.

Parameters:	theType (`str`) – The type. isTerm (`bool`) – Is a terminal character.
Returns:	`NoneType`

wordLength¶

The length of the current word.

Returns:	`int` – The length.

class cpip.util.MultiPassString.Word(wordLen, wordType)¶

__getnewargs__()¶: Return self as a plain tuple. Used by copy and pickle.

static __new__(_cls, wordLen, wordType)¶: Create new instance of Word(wordLen, wordType)

__repr__()¶: Return a nicely formatted representation string

_asdict()¶: Return a new OrderedDict which maps field names to their values.

classmethod _make(iterable)¶: Make a new Word object from a sequence or iterable

_replace(_self, **kwds)¶: Return a new Word object replacing specified fields with new values

wordLen¶: Alias for field number 0

wordType¶: Alias for field number 1
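
The members listed for Word are those of a named tuple with two fields. The sketch below rebuilds an equivalent type with `collections.namedtuple` to show how those members behave; it does not import cpip, and the word type strings are made-up values for illustration.

```python
import collections

# Equivalent two-field named tuple, defined locally for the example.
Word = collections.namedtuple("Word", "wordLen wordType")

w = Word(wordLen=5, wordType="identifier")

# Fields are readable by name or by position (field numbers 0 and 1).
length_by_name = w.wordLen
length_by_index = w[0]

# _replace returns a new object; the original is unchanged.
w2 = w._replace(wordType="keyword")

# _make builds a Word from any sequence or iterable.
w3 = Word._make([3, "whitespace"])

# _asdict maps field names to their values.
as_dict = w._asdict()

print(w, w2, w3)
```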