PpToken

Represents a preprocessing token in C/C++ source code.

cpip.core.PpToken.ENUM_NAME = {0: 'header-name', 1: 'identifier', 2: 'pp-number', 3: 'character-literal', 4: 'string-literal', 5: 'preprocessing-op-or-punc', 6: 'non-whitespace', 7: 'whitespace', 8: 'concat'}

Map of {integer : PREPROCESS_TOKEN_TYPE, ...}. This can be used thus: if ENUM_NAME[token_type] == 'header-name':

exception cpip.core.PpToken.ExceptionCpipToken

Used by PpToken.

exception cpip.core.PpToken.ExceptionCpipTokenIllegalMerge

Used by PpToken when merge() is called illegally.

exception cpip.core.PpToken.ExceptionCpipTokenIllegalOperation

Used by PpToken when an illegal operation is performed.

exception cpip.core.PpToken.ExceptionCpipTokenReopenForExpansion

Used by PpToken when a non-expandable token is made available for expansion.

exception cpip.core.PpToken.ExceptionCpipTokenUnknownType

Used by PpToken when the token type is out of range.

cpip.core.PpToken.LEX_PPTOKEN_TYPES = ['header-name', 'identifier', 'pp-number', 'character-literal', 'string-literal', 'preprocessing-op-or-punc', 'non-whitespace', 'whitespace', 'concat']

Types of preprocessing-token, from ISO/IEC 14882:1998(E) 2.4 Preprocessing tokens [lex.pptoken]. NOTE: ISO/IEC 9899:1999(E) 6.4.7 Header names, paragraph 3, says that:

“A header name preprocessing token is recognized only within a #include preprocessing directive.”

So in other contexts a header-name that is a q-char-sequence should be treated as a string-literal.

This produces interesting issues in this case:

#define str(s) # s
#include str(foo.h)

The stringise operator creates a string-literal token but the #include directive expects a header-name. So in certain contexts (macro stringising followed by a #include directive) we need to 'downcast' a string-literal to a header-name. See PpLexer for how this is done.
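The retyping described above can be pictured with a small sketch. This is not how PpLexer does it; the function name retype_for_include and the (text, type) tuple representation are hypothetical and chosen only for illustration.

```python
def retype_for_include(token, in_include_directive):
    """Return (text, type), treating a string-literal as a header-name
    only when it appears inside a #include directive.

    Hypothetical helper: the name and the tuple form are assumptions,
    not part of cpip.
    """
    text, token_type = token
    if in_include_directive and token_type == 'string-literal':
        return (text, 'header-name')
    return (text, token_type)


# str(foo.h) stringises to the string-literal "foo.h"
stringised = ('"foo.h"', 'string-literal')
print(retype_for_include(stringised, in_include_directive=True))
print(retype_for_include(stringised, in_include_directive=False))
```

Outside a #include directive the token keeps its string-literal type, which matches the rule quoted from the C standard.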

cpip.core.PpToken.LEX_PPTOKEN_TYPE_ENUM_RANGE = [0, 1, 2, 3, 4, 5, 6, 7, 8]

Range of allowable enum values

cpip.core.PpToken.NAME_ENUM = {'header-name': 0, 'identifier': 1, 'whitespace': 7, 'non-whitespace': 6, 'character-literal': 3, 'pp-number': 2, 'preprocessing-op-or-punc': 5, 'concat': 8, 'string-literal': 4}

Map of {PREPROCESS_TOKEN_TYPE : integer, ...}. This can be used thus: self._cppTokType = NAME_ENUM['header-name']
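The values listed for ENUM_NAME, NAME_ENUM and LEX_PPTOKEN_TYPE_ENUM_RANGE are consistent with the positions in LEX_PPTOKEN_TYPES. The sketch below rebuilds them from the list to show the relationship. How the module actually constructs them is not shown on this page, so the construction here is an assumption.

```python
LEX_PPTOKEN_TYPES = [
    'header-name', 'identifier', 'pp-number', 'character-literal',
    'string-literal', 'preprocessing-op-or-punc', 'non-whitespace',
    'whitespace', 'concat',
]

# Allowable enum values: one per entry in the list.
LEX_PPTOKEN_TYPE_ENUM_RANGE = list(range(len(LEX_PPTOKEN_TYPES)))

# integer -> name, and the inverse name -> integer.
ENUM_NAME = dict(enumerate(LEX_PPTOKEN_TYPES))
NAME_ENUM = {name: value for value, name in ENUM_NAME.items()}

token_type = NAME_ENUM['header-name']
if ENUM_NAME[token_type] == 'header-name':
    print('round trip OK')
```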

class cpip.core.PpToken.PpToken(t, tt, lineNum=0, colNum=0)

Holds a preprocessor token, its type and whether the token can be replaced.

t is the token (a string) and tt is either an enumerated integer or a string. Internally tt is stored as an enumerated integer. If the token is an identifier then it is eligible for replacement unless marked otherwise.
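Accepting tt as either a string or an integer while storing an integer can be sketched as follows. The function normalise_token_type and the exception UnknownTokenType are hypothetical stand-ins, and raising on an unrecognised string is an assumption; the page only states that ExceptionCpipTokenUnknownType is used when the token type is out of range.

```python
LEX_PPTOKEN_TYPES = [
    'header-name', 'identifier', 'pp-number', 'character-literal',
    'string-literal', 'preprocessing-op-or-punc', 'non-whitespace',
    'whitespace', 'concat',
]


class UnknownTokenType(Exception):
    """Stand-in for ExceptionCpipTokenUnknownType."""


def normalise_token_type(tt):
    """Return the enumerated integer for tt (a type name or an integer)."""
    if isinstance(tt, str):
        try:
            return LEX_PPTOKEN_TYPES.index(tt)
        except ValueError:
            raise UnknownTokenType(tt) from None
    if isinstance(tt, int) and 0 <= tt < len(LEX_PPTOKEN_TYPES):
        return tt
    raise UnknownTokenType(tt)


print(normalise_token_type('identifier'))
print(normalise_token_type(7))
```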

SINGLE_SPACE = ' '

Representation of a single whitespace

WORD_REPLACE_MAP = {'false': 'False', '||': 'or', 'true': 'True', '/': '//', '&&': 'and'}

Operators that are replaced directly by Python equivalents for constant evaluation
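A minimal sketch of how such a map lets Python evaluate a C-style constant expression. The helper to_python_expr and the plain list of token strings are hypothetical; only the map itself is taken from this page.

```python
WORD_REPLACE_MAP = {
    'false': 'False', '||': 'or', 'true': 'True', '/': '//', '&&': 'and',
}


def to_python_expr(tokens):
    """Join token strings, swapping in the Python spelling where one exists."""
    return ' '.join(WORD_REPLACE_MAP.get(tok, tok) for tok in tokens)


expr = to_python_expr(['7', '/', '2', '&&', 'true'])
print(expr)
# Evaluate with no builtins available, since the input is only a constant expression.
print(eval(expr, {'__builtins__': {}}))
```

Mapping '/' to '//' keeps integer division semantics, which is what a preprocessor constant expression expects.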

canReplace

Flag to control whether this token is eligible for replacement

colNum

Returns the column number of the start of the token as an integer.

copy()

Returns a shallow copy of self. This is useful where the same token is added to multiple lists and then a merge() operation on one list will be seen by the others. To avoid this insert self.copy() in all but one of the lists.
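The aliasing problem that copy() avoids can be reproduced with a minimal stand-in class. Tok below is hypothetical and far simpler than PpToken; it only carries text and appends on merge().

```python
import copy


class Tok:
    """Minimal mutable stand-in for a token; not the cpip class."""

    def __init__(self, t):
        self.t = t

    def merge(self, other):
        self.t += other.t

    def copy(self):
        return copy.copy(self)


shared = Tok('a')
list_a = [shared]
list_b = [shared]          # same object as in list_a
list_c = [shared.copy()]   # independent shallow copy

list_a[0].merge(Tok('b'))

print(list_b[0].t)  # the merge is visible here
print(list_c[0].t)  # the copy is unaffected
```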

evalConstExpr()

Returns a string value suitable for eval'ing in a constant expression. For numbers this removes such tiresome trivia as the 'u' and 'L' suffixes. For others it replaces '&&' with 'and' and so on.

See ISO/IEC 14882:1998(E) 16.1 Conditional inclusion, sub-section 4 (i.e. section 16.1-4), and ISO/IEC 9899:1999(E) 6.10.1 Conditional inclusion, sub-section 3 (i.e. section 6.10.1-3).
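Removing integer suffixes from a pp-number might look like the sketch below. The helper strip_int_suffix and its regular expression are assumptions for illustration; the page does not show how evalConstExpr() does this.

```python
import re

# Trailing run of u/U/l/L characters, as found on C integer constants.
_INT_SUFFIX = re.compile(r'[uUlL]+$')


def strip_int_suffix(pp_number):
    """Return pp_number without any trailing integer suffix."""
    return _INT_SUFFIX.sub('', pp_number)


print(strip_int_suffix('42UL'))
print(int(strip_int_suffix('0x1Fu'), 0))
```

Hexadecimal digits never include 'u' or 'l', so the pattern cannot eat part of the value itself.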

getPrevWs()

Gets the flag that records prior whitespace.

getReplace()

Gets the flag that controls whether this can be replaced.

isCond

Flag that, if True, indicates that the token appeared within a section that was conditionally compiled. This is False on construction and can only be set True by setIsCond().

isIdentifier()

Returns True if the token type is ‘identifier’.

isUnCond

Flag that if True indicates that the token appeared within a section that was un-conditionally compiled. This is the negation of isCond.

isWs()

Returns True if the token type is ‘whitespace’.

lineNum

Returns the line number of the start of the token as an integer.

merge(other)

This will merge by appending the other token. If the two tokens are of different token types then the type becomes 'concat'.
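The type rule for merging can be sketched on plain (text, type) tuples. The function merge_tokens is hypothetical, and it does not model the conditions under which the real method raises ExceptionCpipTokenIllegalMerge, since those are not described here.

```python
def merge_tokens(left, right):
    """Append right's text to left's; differing types collapse to 'concat'."""
    left_text, left_type = left
    right_text, right_type = right
    merged_type = left_type if left_type == right_type else 'concat'
    return (left_text + right_text, merged_type)


print(merge_tokens(('foo', 'identifier'), ('bar', 'identifier')))
print(merge_tokens(('x', 'identifier'), ('1', 'pp-number')))
```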

prevWs

Flag to indicate whether this token is preceded by whitespace

replaceNewLine()

Replace any newline with a single whitespace character in-place.

See: C ISO/IEC 9899:1999(E) 6.10-3 and C++ ISO/IEC 14882:1998(E) 16.3-9

This will raise an ExceptionCpipTokenIllegalOperation if I am not a whitespace token.

setIsCond()

Sets self._isCond to be True.

setPrevWs(val)

Sets the flag that records prior whitespace.

setReplace(val)

Setter for the replacement flag. This will raise if I am not an identifier, or if val is True and I am otherwise not expandable.

shrinkWs()

Replace all whitespace with a single ' '.

This will raise an ExceptionCpipTokenIllegalOperation if I am not a whitespace token.
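The two whitespace operations, replaceNewLine() and shrinkWs(), share the same guard. The sketch below works on plain strings and returns new values rather than modifying a token in place. All names are hypothetical, and reading replaceNewLine() as "each newline becomes one space" is an assumption.

```python
class IllegalOperation(Exception):
    """Stand-in for ExceptionCpipTokenIllegalOperation."""


def _require_whitespace(token_type):
    if token_type != 'whitespace':
        raise IllegalOperation('not a whitespace token: %s' % token_type)


def replace_new_line(text, token_type):
    """Replace each newline in a whitespace token with a single space."""
    _require_whitespace(token_type)
    return text.replace('\n', ' ')


def shrink_ws(text, token_type):
    """Collapse a whitespace token to exactly one space."""
    _require_whitespace(token_type)
    return ' '


print(repr(replace_new_line(' \n\t', 'whitespace')))
print(repr(shrink_ws(' \n\t', 'whitespace')))
```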

subst(t, tt)

Substitutes token value and type.

t

Returns the token as a string.

tokEnumToktype

Returns the token and the enumerated token type as a tuple.

tokToktype

Returns the token and the token type (as a string) as a tuple.

tt

Returns the token type as a string.

cpip.core.PpToken.tokensStr(theTokens, shortForm=True)

Given a list of tokens this returns them as a string. If shortForm is True then the lexical string is returned. If False then the PpToken representations separated by ' | ' are returned, e.g. PpToken(t="f", tt=identifier, line=True, prev=False, ?=False) | ...
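The two output forms can be sketched on (text, type) tuples. The function tokens_str is a hypothetical stand-in: it assumes the short form is a plain concatenation of token text, and it uses Python's tuple repr in place of the PpToken representation.

```python
def tokens_str(tokens, short_form=True):
    """Return the token text joined together, or the reprs joined by ' | '."""
    if short_form:
        return ''.join(text for text, _ in tokens)
    return ' | '.join(repr(tok) for tok in tokens)


tokens = [
    ('f', 'identifier'),
    ('(', 'preprocessing-op-or-punc'),
    (')', 'preprocessing-op-or-punc'),
]
print(tokens_str(tokens))
print(tokens_str(tokens, short_form=False))
```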
