TLex

Source: source/classes/tlex.prg

Inherits from: TFile

TLex is a lexical analyzer (tokenizer) that reads source text from a file or a string and breaks it into tokens. It supports configurable separator characters, optional skipping of blanks and of CR/LF characters, an identifier table, and a token table that maps token strings to numeric IDs. Tokens are read one at a time with nGetToken(), which returns the numeric ID of the matching aTokens entry (0 if the text is unrecognized, -1 at the end of the input); the text of the last token is available in cToken.

Key DATA Members

DATA         Type       Description
aTokens      Array      Array of token definitions (each entry: { cTokenString, nTokenId })
aIds         Array      Array of recognized identifier strings
lSkipBlank   Logical    Skip whitespace characters when tokenizing
lSkipCRLF    Logical    Skip carriage return and line feed characters
cToken       Character  The last token text extracted
cSeparators  Character  String of separator characters used to delimit tokens
uValue       Any        Optional value associated with the current token
cText        Character  The full source text being tokenized

Methods

Method                                    Description
New( cFile, aTokens, aIds, cSeparators )  Create a TLex from a file, with token table, identifier table, and separators
nGetToken()                               Extract the next token and return its numeric ID (0 = unrecognized, -1 = end)
lEoF()                                    Return .T. when the end of the source text has been reached
SetText( c )                              Set the source text to be tokenized (alternative to file-based constructor)
Add( cToken, nId )                        Add a new token string with its numeric ID to the token table
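
To make the ID convention concrete, here is a minimal, self-contained sketch of how a token string could be mapped to its numeric ID, using the same { cTokenString, nTokenId } layout as aTokens. TokenId() is a hypothetical helper written for illustration only; it is not part of TLex and is not necessarily how TLex performs the lookup. The exact, case-sensitive comparison is also an assumption.

```harbour
// Hypothetical helper, NOT part of TLex: looks up a token string in an
// aTokens-style array and returns its ID, or 0 when there is no match.
function TokenId( aTokens, cText )

   // AScan with a code block returns the index of the first entry whose
   // token string equals cText exactly (case-sensitive: an assumption)
   local nAt := AScan( aTokens, { | aEntry | aEntry[ 1 ] == cText } )

return iif( nAt > 0, aTokens[ nAt ][ 2 ], 0 )

procedure Main()

   local aTokens := { { "IF", 1 }, { "THEN", 2 }, { "ELSE", 3 } }

   ? TokenId( aTokens, "THEN" )   // known token: returns 2
   ? TokenId( aTokens, "x" )      // not in the table: returns 0

return
```

The sketch uses only standard Harbour functions, so it builds without FiveWin. It mirrors the documented return values of nGetToken() for recognized and unrecognized text, but does not model separators, skipping, or the -1 end-of-input result.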

Example: Tokenize IF-THEN-ELSE Script

#include "FiveWin.ch"

function Main()

   local oLex, nToken
   local cScript := "IF x > 10 THEN print 'Hello' ELSE stop"

   // Define tokens: { string, id }
   local aTokens := { ;
      { "IF",    1 }, ;
      { "THEN",  2 }, ;
      { "ELSE",  3 }, ;
      { ">",     4 }, ;
      { "print", 5 }, ;
      { "stop",  6 }  }

   // Create lexer
   oLex := TLex():New( , aTokens )
   oLex:SetText( cScript )
   oLex:lSkipBlank := .T.

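   // Tokenize: 0 = unrecognized text, > 0 = ID from aTokens, -1 = end of input
   while ! oLex:lEoF()
      nToken := oLex:nGetToken()
      if nToken == 0
         ? "Unrecognized:", oLex:cToken
      elseif nToken > 0
         ? "Token ID:", nToken, "Text:", oLex:cToken
      else
         exit   // -1: end of input reached
      endif
   enddo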

return nil

Notes

See Also