Module ldoc.lexer
Lexical scanner for creating a sequence of tokens from text.
lexer.scan(s) returns an iterator over all tokens found in the
string s. This iterator returns two values: a token type string
(such as 'string' for a quoted string, or 'iden' for an identifier) and the
value of the token.
Versions specialized for Lua and C are available; these also handle block comments and classify keywords as 'keyword' tokens. For example:
    > s = 'for i=1,n do'
    > for t,v in lexer.lua(s) do print(t,v) end
    keyword for
    iden    i
    =       =
    number  1
    ,       ,
    iden    n
    keyword do
Based on pl.lexer from Penlight
Functions
| scan (s, matches, filter, options) | create a plain token iterator from a string or file-like object. |
| getline (tok) | get everything in a stream up to a newline. |
| lineno (tok) | get current line number. |
| get_keywords () | get the Lua keywords as a set-like table. |
| lua (s, filter, options) | create a Lua token iterator from a string or file-like object. |
| cpp (s, filter, options) | create a C/C++ token iterator from a string or file-like object. |
| get_separated_list (tok, endtoken, delim) | get a list of parameters separated by a delimiter from a stream. |
| skipws (tok) | get the next non-space token from the stream. |
| expecting (tok, expected_type, no_skip_ws) | get the next token, which must be of the expected type. |
Functions
- scan (s, matches, filter, options)
-
create a plain token iterator from a string or file-like object.
Parameters:
- s the string
- matches an optional match table (set of pattern-action pairs)
- filter a table of token types to exclude, by default {space=true}
- options a table of options; by default, {number=true,string=true}, which means convert numbers and strip string quotes.
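A minimal sketch of the plain scanner (this assumes LDoc is installed so that `require 'ldoc.lexer'` resolves; the API mirrors pl.lexer):

```lua
local lexer = require 'ldoc.lexer'

-- Space tokens are filtered out by default, and '2' is converted
-- to a Lua number because options default to {number=true,string=true}.
for t, v in lexer.scan('alpha = 2') do
    print(t, v)
end
```

The plain scanner has no keyword table, so `alpha` comes back as an 'iden' token; operator characters such as '=' are returned with the character itself as both type and value.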
- getline (tok)
-
get everything in a stream up to a newline.
Parameters:
- tok a token stream
Returns:
-
a string
- lineno (tok)
-
get current line number.
Only available if the input source is a file-like object.
Parameters:
- tok a token stream
Returns:
-
the line number and current column
- get_keywords ()
-
get the Lua keywords as a set-like table.
So res["and"] etc. would be true.
Returns:
-
a table
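A quick sketch of the set-like result (again assuming `require 'ldoc.lexer'` resolves):

```lua
local lexer = require 'ldoc.lexer'
local kw = lexer.get_keywords()

-- Keywords map to true; anything else is simply absent (nil).
print(kw['and'], kw['function'])  --> true  true
print(kw['hello'])                --> nil
```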
- lua (s, filter, options)
-
create a Lua token iterator from a string or file-like object.
Will return the token type and value.
Parameters:
- s the string
- filter a table of token types to exclude, by default {space=true,comments=true}
- options a table of options; by default, {number=true,string=true}, which means convert numbers and strip string quotes.
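A sketch showing the effect of the filter argument (assuming `require 'ldoc.lexer'` resolves):

```lua
local lexer = require 'ldoc.lexer'
local s = 'x = 1 -- set x'

-- Default filter is {space=true,comments=true}: no comment tokens appear.
for t, v in lexer.lua(s) do print(t, v) end

-- Passing an explicit filter that only excludes spaces keeps the
-- trailing '-- set x' as a 'comment' token.
for t, v in lexer.lua(s, {space=true}) do print(t, v) end
```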
- cpp (s, filter, options)
-
create a C/C++ token iterator from a string or file-like object.
Will return the token type and value.
Parameters:
- s the string
- filter a table of token types to exclude, by default {space=true,comments=true}
- options a table of options; by default, {number=true,string=true}, which means convert numbers and strip string quotes.
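A short sketch of the C/C++ scanner (assuming `require 'ldoc.lexer'` resolves); C keywords such as `int` are classified as 'keyword' tokens:

```lua
local lexer = require 'ldoc.lexer'

-- Expect: keyword int, iden n, = =, number 42, ; ;
for t, v in lexer.cpp('int n = 42;') do
    print(t, v)
end
```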
- get_separated_list (tok, endtoken, delim)
-
get a list of parameters separated by a delimiter from a stream.
Parameters:
- tok the token stream
- endtoken end of list (default ')'). Can be '\n'
- delim separator (default ',')
Returns:
-
a list of token lists.
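A sketch of parsing an argument list (assuming `require 'ldoc.lexer'` resolves; note that the opening '(' is expected to have been consumed already):

```lua
local lexer = require 'ldoc.lexer'
local tok = lexer.lua('(a, b, c)')

tok()  -- consume the opening '('
local args = lexer.get_separated_list(tok)  -- reads up to the matching ')'
print(#args)  -- one token list per parameter, so 3 here
```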
- skipws (tok)
-
get the next non-space token from the stream.
Parameters:
- tok the token stream.
- expecting (tok, expected_type, no_skip_ws)
-
get the next token, which must be of the expected type.
Throws an error if this type does not match!
Parameters:
- tok the token stream
- expected_type the token type
- no_skip_ws whether we should skip whitespace
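A sketch combining skipws and expecting (assuming `require 'ldoc.lexer'` resolves). The empty filter keeps space tokens in the stream so skipws has something to skip; expecting returns the token's value, and since operator tokens use the operator character as their type, '=' can be expected directly:

```lua
local lexer = require 'ldoc.lexer'
local tok = lexer.lua('answer = 42', {})  -- empty filter: keep spaces

local t, v = lexer.skipws(tok)            -- first non-space token
print(t, v)                               --> iden  answer
lexer.expecting(tok, '=')                 -- errors if the next token is not '='
local num = lexer.expecting(tok, 'number')
print(num)                                --> 42
```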