Class: RLTK::Lexer::LexerCore
- Inherits: Object
- Defined in: lib/rltk/lexer.rb
Overview
The LexerCore class provides most of the functionality of the Lexer class. A LexerCore is instantiated for each subclass of Lexer, thereby allowing multiple lexers to be defined inside a single Ruby program.
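One way a per-subclass core could be arranged is with Ruby's `inherited` hook, sketched below. The names `MiniCore`, `MiniLexer`, `Calc`, and `Words` are hypothetical stand-ins for illustration; this is not taken from the RLTK source and may differ from how the library wires things up.

```ruby
# Hypothetical stand-in for LexerCore: holds one lexer's configuration.
class MiniCore
  attr_reader :rules

  def initialize
    @rules = []
  end
end

# Hypothetical stand-in for Lexer.
class MiniLexer
  class << self
    # Reads the class-level @core of whichever class receives the call.
    attr_reader :core

    # Ruby calls this hook each time a subclass is created, so every
    # subclass receives its own core object.
    def inherited(klass)
      super
      klass.instance_variable_set(:@core, MiniCore.new)
    end

    # Class-level DSL method that forwards to the subclass's own core.
    def rule(pattern)
      core.rules << pattern
    end
  end
end

class Calc < MiniLexer
  rule(/\d+/)
end

class Words < MiniLexer
  rule(/[a-z]+/)
  rule(/\s+/)
end
```

Because each subclass owns a separate core, the rules declared in `Calc` never leak into `Words`, which is what lets several lexers coexist in one program.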
Instance Attribute Summary
- (Object) start_state (readonly)
  Returns the value of attribute start_state.
Instance Method Summary
- (LexerCore) initialize (constructor)
  Instantiate a new LexerCore object.
- (Object) lex(string, env, file_name = nil)
  Lex string, using env as the environment.
- (Object) lex_file(file_name, env)
  A wrapper function that calls lex on the contents of a file.
- (Object) match_first
  Used to tell a lexer to use the first match found instead of the longest match found.
- (Object) rule(pattern, state = :default, flags = [], &action) (also: #r)
  This method is used to define a new lexing rule.
- (Object) start(state)
  Changes the starting state of the lexer.
Constructor Details
- (LexerCore) initialize
Instantiate a new LexerCore object.
    # File 'lib/rltk/lexer.rb', line 106
    def initialize
      @match_type  = :longest
      @rules       = Hash.new {|h,k| h[k] = Array.new}
      @start_state = :default
    end
Instance Attribute Details
- (Object) start_state (readonly)
Returns the value of attribute start_state.
    # File 'lib/rltk/lexer.rb', line 103
    def start_state
      @start_state
    end
Instance Method Details
- (Object) lex(string, env, file_name = nil)
Lex string, using env as the environment. This method will return the array of tokens generated by the lexer with a token of type EOS (End of Stream) appended to the end.
    # File 'lib/rltk/lexer.rb', line 115
    def lex(string, env, file_name = nil)
      # Offset from start of stream.
      stream_offset = 0

      # Offset from the start of the line.
      line_offset = 0
      line_number = 1

      # Empty token list.
      tokens = Array.new

      # The scanner.
      scanner = StringScanner.new(string)

      # Start scanning the input string.
      until scanner.eos?
        match = nil

        # If the match_type is set to :longest all of the
        # rules for the current state need to be scanned
        # and the longest match returned.  If the
        # match_type is :first, we only need to scan until
        # we find a match.
        @rules[env.state].each do |rule|
          if (rule.flags - env.flags).empty?
            if txt = scanner.check(rule.pattern)
              if not match or match.first.length < txt.length
                match = [txt, rule]

                break if @match_type == :first
              end
            end
          end
        end

        if match
          rule = match.last

          txt = scanner.scan(rule.pattern)
          type, value = env.instance_exec(txt, &rule.action)

          if type
            pos = StreamPosition.new(stream_offset, line_number, line_offset, txt.length, file_name)
            tokens << Token.new(type, value, pos)
          end

          # Advance our stat counters.
          stream_offset += txt.length

          if (newlines = txt.count("\n")) > 0
            line_number += newlines
            line_offset = 0
          else
            line_offset += txt.length()
          end
        else
          error = LexingError.new(stream_offset, line_number, line_offset, scanner.post_match)
          raise(error, 'Unable to match string with any of the given rules')
        end
      end

      return tokens << Token.new(:EOS)
    end
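The core of the loop above (peek with `check`, keep the longest eligible match, consume it with `scan`, then append an end-of-stream marker) can be sketched without the rest of the library. `SketchRule`, `sketch_lex`, and `RULES` are hypothetical names introduced for illustration, tokens are plain `[type, value]` pairs rather than `Token` objects, and position tracking is left out.

```ruby
require 'strscan'

# Hypothetical rule record: a pattern, the flags it requires, and an action.
SketchRule = Struct.new(:pattern, :flags, :action)

# Tokenise +string+ using the longest-match strategy.
# +active_flags+ plays the role of env.flags in the listing above.
def sketch_lex(string, rules, active_flags = [])
  tokens  = []
  scanner = StringScanner.new(string)

  until scanner.eos?
    best = nil

    rules.each do |rule|
      # A rule is eligible only when every flag it requires is set.
      next unless (rule.flags - active_flags).empty?

      txt = scanner.check(rule.pattern) # peek without consuming input
      best = [txt, rule] if txt && (best.nil? || best.first.length < txt.length)
    end

    raise "no rule matches at offset #{scanner.pos}" unless best

    txt = scanner.scan(best.last.pattern) # consume the winning match
    type, value = best.last.action.call(txt)
    tokens << [type, value] if type       # a nil type discards the text
  end

  tokens << [:EOS, nil]
end

RULES = [
  SketchRule.new(/\d+/,    [],       ->(t) { [:NUM, t.to_i] }),
  SketchRule.new(/[a-z]+/, [],       ->(t) { [:IDENT, t] }),
  SketchRule.new(/\s+/,    [],       ->(_) { nil }),
  SketchRule.new(/;.*/,    [:notes], ->(t) { [:NOTE, t] })
]
```

Calling `sketch_lex('x 42', RULES)` yields an identifier token, a number token, and the end-of-stream marker; the whitespace rule returns `nil`, so its text is dropped. The `;` rule only takes part when `:notes` is passed in `active_flags`.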
- (Object) lex_file(file_name, env)
A wrapper function that calls lex on the contents of a file.
    # File 'lib/rltk/lexer.rb', line 181
    def lex_file(file_name, env)
      File.open(file_name, 'r') { |f| lex(f.read, env, file_name) }
    end
- (Object) match_first
Used to tell a lexer to use the first match found instead of the longest match found.
    # File 'lib/rltk/lexer.rb', line 187
    def match_first
      @match_type = :first
    end
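The difference between the two strategies shows up when an earlier rule matches a prefix of what a later rule matches. The sketch below isolates the selection step from the lex loop; `pick_match` and `PATTERNS` are hypothetical names used only for illustration.

```ruby
require 'strscan'

# Return the text chosen at the start of +input+ under the given strategy.
# +patterns+ are tried in definition order, as in the lex loop.
def pick_match(input, patterns, match_type)
  scanner = StringScanner.new(input)
  best = nil

  patterns.each do |pattern|
    txt = scanner.check(pattern)
    next unless txt

    if best.nil? || best.length < txt.length
      best = txt
      # With :first, stop at the first rule that matches at all.
      break if match_type == :first
    end
  end

  best
end

PATTERNS = [/if/, /[a-z]+/]

longest = pick_match('iffy', PATTERNS, :longest) # the whole word wins
first   = pick_match('iffy', PATTERNS, :first)   # the earlier rule wins
```

Under `:first`, rule order decides the outcome, so more specific rules generally need to be declared before more general ones.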
- (Object) rule(pattern, state = :default, flags = [], &action)
Also known as: r
This method is used to define a new lexing rule. The first argument is the regular expression used to match substrings of the input. The second argument is the state to which the rule belongs; passing :ALL adds the rule to every state that already has rules defined. Flags that need to be set for the rule to be considered are specified by the third argument. The last argument is a block that returns a type and value to be used in constructing a Token. If no block is specified, the matched substring will be discarded and lexing will continue.
    # File 'lib/rltk/lexer.rb', line 200
    def rule(pattern, state = :default, flags = [], &action)
      # If no action is given we will set it to an empty
      # action.
      action ||= Proc.new() {}

      r = Rule.new(pattern, action, state, flags)

      if state == :ALL then
        @rules.each_key { |k| @rules[k] << r }
      else
        @rules[state] << r
      end
    end
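The rule table is a hash whose default block creates an empty array for any state on first access, and `:ALL` walks only the keys present at the time of the call. A minimal sketch of that storage scheme follows; `RuleTable` and `TABLE` are hypothetical names, and rules are represented by bare symbols rather than `Rule` objects.

```ruby
# Hypothetical registry mirroring the @rules table: state => array of rules.
class RuleTable
  attr_reader :rules

  def initialize
    # Unknown states get a fresh array the first time they are accessed.
    @rules = Hash.new { |h, k| h[k] = [] }
  end

  def add(name, state = :default)
    if state == :ALL
      # Only states that already have an entry receive the rule.
      @rules.each_key { |k| @rules[k] << name }
    else
      @rules[state] << name
    end
  end
end

TABLE = RuleTable.new
TABLE.add(:number)           # stored under :default
TABLE.add(:text, :comment)   # creates the :comment state
TABLE.add(:whitespace, :ALL) # appended to :default and :comment
TABLE.add(:quote, :string)   # :string is created after the :ALL rule
```

In this sketch the `:string` state ends up without the `:whitespace` rule, because it did not exist when the `:ALL` rule was added. The listing above iterates over existing keys in the same way, so the order of rule definitions appears to matter there too.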
- (Object) start(state)
Changes the starting state of the lexer.
    # File 'lib/rltk/lexer.rb', line 213
    def start(state)
      @start_state = state
    end