Class: RDoc::Parser

Inherits:
Object
Defined in:
lib/rdoc/parser.rb

Overview

A parser is simply a class that implements

#initialize(top_level, file_name, content, options, stats)

and

#scan

The initialize method takes a top-level code object, the name of the file, the content of the file, an RDoc::Options object and a stats object. The scan method is then called to return an appropriately parsed TopLevel code object.

RDoc::Parser.for redirects to the correct parser given a file name extension. This works because individual parsers have to register themselves with RDoc::Parser as they are loaded. They do this by calling parse_files_matching, as follows:

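require "rdoc/parser"

class RDoc::Parser::Xyz < RDoc::Parser
  parse_files_matching(/\.xyz$/) # <<<< registers this class for .xyz files

  # Same five arguments that RDoc::Parser.for passes to new
  def initialize(top_level, file_name, content, options, stats)
    super
    ...
  end

  def scan
    ...
  end
end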

Just to make life interesting, if a file name has no extension we also look for a shebang line, in case the file is a script. In the listing for RDoc::Parser.for on this page, only Ruby shebangs are recognized.
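
The registration mechanism described above can be sketched without RDoc itself. The following is a minimal, standalone illustration rather than RDoc's implementation: ToyParser, ToyParser::Fallback and ToyParser::Xyz are hypothetical names. It assumes, as the source listings on this page suggest, that the registry is an ordered list of [regexp, parser class] pairs and that the first match wins.

```ruby
# Hypothetical stand-in for RDoc::Parser; not RDoc's actual implementation.
class ToyParser
  @parsers = []

  class << self
    attr_reader :parsers

    # Register the calling subclass for file names matching regexp.
    # Newer registrations go to the front, so they are consulted first.
    def parse_files_matching(regexp)
      ToyParser.parsers.unshift [regexp, self]
    end

    # Return the first registered parser class whose regexp matches, or nil.
    def can_parse(file_name)
      pair = ToyParser.parsers.find { |regexp, _| regexp =~ file_name }
      pair && pair.last
    end
  end
end

class ToyParser::Fallback < ToyParser
  parse_files_matching(//) # the empty regexp matches every file name
end

class ToyParser::Xyz < ToyParser
  parse_files_matching(/\.xyz$/)
end

ToyParser.can_parse("notes.xyz")  # => ToyParser::Xyz
ToyParser.can_parse("notes.abc")  # => ToyParser::Fallback
```

Because each registration is pushed to the front of the list, a parser loaded later can take over an extension that an earlier parser already claimed.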

Defined Under Namespace

Modules: RubyTools

Classes: C, Ruby, Simple

Constructor Details

- (Parser) initialize(top_level, file_name, content, options, stats)

Creates a new Parser storing top_level, file_name, content, options and stats in instance variables.

Usually invoked via super from a subclass's initialize.



# File 'lib/rdoc/parser.rb', line 193

def initialize(top_level, file_name, content, options, stats)
  @top_level = top_level
  @file_name = file_name
  @content = content
  @options = options
  @stats = stats
end

Class Attribute Details

+ (Object) parsers (readonly)

An Array of [regexp, parser] pairs that maps file extension regular expressions to the parsers that will consume them.

Use parse_files_matching to register a parser's file extensions.



# File 'lib/rdoc/parser.rb', line 53

def parsers
  @parsers
end

Class Method Details

+ (Object) alias_extension(old_ext, new_ext)

Alias an extension to another extension. After this call, files ending in "new_ext" will be parsed using the same parser as "old_ext". Returns false if no parser handles "old_ext", and true otherwise.



# File 'lib/rdoc/parser.rb', line 61

def self.alias_extension(old_ext, new_ext)
  old_ext = old_ext.sub(/^\.(.*)/, '\1')
  new_ext = new_ext.sub(/^\.(.*)/, '\1')

  parser = can_parse "xxx.#{old_ext}"
  return false unless parser

  RDoc::Parser.parsers.unshift [/\.#{new_ext}$/, parser]

  true
end
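
The extension handling in the listing above can be tried in isolation. This is a small sketch, not part of RDoc: extension_regexp is a hypothetical helper that repeats the same two steps, stripping an optional leading dot and then building the matching regexp.

```ruby
# Hypothetical helper mirroring the normalization in the listing above.
def extension_regexp(ext)
  bare = ext.sub(/^\.(.*)/, '\1') # ".rbw" and "rbw" both become "rbw"
  /\.#{bare}$/
end

with_dot    = extension_regexp(".rbw")
without_dot = extension_regexp("rbw")

with_dot == without_dot    # => true
with_dot =~ "script.rbw"   # matches
with_dot =~ "script.rb"    # => nil
```

Note that the extension is interpolated into the regexp unescaped, so any regexp metacharacters in it are interpreted rather than matched literally.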

+ (Boolean) binary?(file)

Determines if the file is a "binary" file, meaning it has content that an RDoc parser should not try to consume.

Returns:

  • (Boolean)


# File 'lib/rdoc/parser.rb', line 77

def self.binary?(file)
  return false if file =~ /\.(rdoc|txt)$/

  s = File.read(file, 1024) or return false

  have_encoding = s.respond_to? :encoding

  if have_encoding then
    return false if s.encoding != Encoding::ASCII_8BIT and s.valid_encoding?
  end

  return true if s[0, 2] == Marshal.dump('')[0, 2] or s.index("\x00")

  if have_encoding then
    s.force_encoding Encoding.default_external

    not s.valid_encoding?
  else
    if 0.respond_to? :fdiv then
      s.count("\x00-\x7F", "^ -~\t\r\n").fdiv(s.size) > 0.3
    else # HACK 1.8.6
      (s.count("\x00-\x7F", "^ -~\t\r\n").to_f / s.size) > 0.3
    end
  end
end
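
The main signals used by the listing above can be illustrated on an in-memory sample. This is a simplified sketch under stated assumptions, not RDoc's code: sample_looks_binary? is a hypothetical helper, it skips the file extension shortcut and the legacy Ruby 1.8 branch, and it checks validity as UTF-8 where the listing uses Encoding.default_external.

```ruby
# Hypothetical helper: applies similar signals to a string instead of a file.
def sample_looks_binary?(sample)
  bytes = sample.dup.force_encoding(Encoding::BINARY)

  # A Marshal header (its two version bytes) or a NUL byte is a strong hint.
  return true if bytes[0, 2] == Marshal.dump('')[0, 2]
  return true if bytes.include?("\x00".b)

  # Otherwise treat the sample as binary only if it is not valid text.
  # UTF-8 is an assumption made for this sketch.
  !bytes.force_encoding(Encoding::UTF_8).valid_encoding?
end

sample_looks_binary?("plain text\n")         # => false
sample_looks_binary?("PK\x03\x04\x00\x00")   # => true (contains NUL)
sample_looks_binary?(Marshal.dump([1, 2]))   # => true (Marshal header)
sample_looks_binary?("\xFF\xFE garbage".b)   # => true (invalid UTF-8)
```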

+ (Object) can_parse(file_name)

Returns the parser class that can handle a particular file name, or nil if no parser should be used for it.



# File 'lib/rdoc/parser.rb', line 141

def self.can_parse(file_name)
  parser = RDoc::Parser.parsers.find { |regexp,| regexp =~ file_name }.last

  # HACK Selenium hides a jar file using a .txt extension
  return if parser == RDoc::Parser::Simple and zip? file_name

  # The default parser must not parse binary files
  ext_name = File.extname file_name
  return parser if ext_name.empty?
  return if parser == RDoc::Parser::Simple and ext_name !~ /txt|rdoc/

  parser
end
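
The guard on the default parser at the end of the listing above can be looked at on its own. The following is a rough sketch, not RDoc's code: fallback_allowed? is a hypothetical helper that reproduces only the extension test and leaves out the zip? check.

```ruby
# Hypothetical helper: may the default (plain text) parser take this file?
def fallback_allowed?(file_name)
  ext_name = File.extname(file_name)
  # No extension at all, or an extension containing "txt" or "rdoc".
  ext_name.empty? || ext_name.match?(/txt|rdoc/)
end

fallback_allowed?("README")      # => true
fallback_allowed?("notes.txt")   # => true
fallback_allowed?("guide.rdoc")  # => true
fallback_allowed?("photo.png")   # => false
```

The pattern is unanchored, as in the listing, so it accepts any extension that merely contains "txt" or "rdoc".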

+ (Object) for(top_level, file_name, body, options, stats)

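Finds the correct parser for a particular file name and returns a new instance of it. RDoc::Parser::Simple is the default for text files that no other parser claims. Returns nil for binary files and for files that no parser accepts.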



# File 'lib/rdoc/parser.rb', line 159

def self.for(top_level, file_name, body, options, stats)
  return if binary? file_name

  # If no extension, look for shebang
  if file_name !~ /\.\w+$/ && body =~ %r{\A#!(.+)} then
    shebang = $1
    case shebang
    when %r{env\s+ruby}, %r{/ruby}
      file_name = "dummy.rb"
    end
  end

  parser = can_parse file_name

  return unless parser

  parser.new top_level, file_name, body, options, stats
end
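
The shebang step in the listing above can be isolated as follows. This is a sketch for illustration only: effective_file_name is a hypothetical helper that returns the file name the parser lookup would then use.

```ruby
# Hypothetical helper extracted from the shebang handling shown above.
def effective_file_name(file_name, body)
  # Only file names without an extension are inspected for a shebang line.
  if file_name !~ /\.\w+$/ && body =~ %r{\A#!(.+)}
    shebang = $1
    case shebang
    when %r{env\s+ruby}, %r{/ruby}
      return "dummy.rb" # makes the lookup choose the Ruby parser
    end
  end

  file_name
end

effective_file_name("bin/tool", "#!/usr/bin/env ruby\nputs 1\n")  # => "dummy.rb"
effective_file_name("bin/tool", "#!/bin/sh\necho hi\n")           # => "bin/tool"
effective_file_name("tool.txt", "#!/usr/bin/ruby\n")              # => "tool.txt"
```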

+ (Object) parse_files_matching(regexp)

Record which file types this parser can understand.

It is OK to call this multiple times. Later registrations are placed first in the list and so are consulted first.



# File 'lib/rdoc/parser.rb', line 183

def self.parse_files_matching(regexp)
  RDoc::Parser.parsers.unshift [regexp, self]
end

+ (Object) process_directive(code_object, directive, value)

Processes common directives for CodeObjects for the C and Ruby parsers.

Applies directive's value to code_object, if appropriate. Unrecognized directives are ignored.



# File 'lib/rdoc/parser.rb', line 108

def self.process_directive code_object, directive, value
  case directive
  when 'nodoc' then
    code_object.document_self = nil # notify nodoc
    code_object.document_children = value.downcase != 'all'
  when 'doc' then
    code_object.document_self = true
    code_object.force_documentation = true
  when 'yield', 'yields' then
    # remove parameter &block
    code_object.params.sub!(/,?\s*&\w+/, '') if code_object.params

    code_object.block_params = value
  when 'arg', 'args' then
    code_object.params = value
  end
end
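
To see the effect of each directive, the same logic can be run against a stand-in object. This is a sketch under stated assumptions: ToyCodeObject is a hypothetical Struct that models only the attributes touched in the listing above, and apply_directive is a hypothetical copy of that logic as a plain method.

```ruby
# Hypothetical stand-in for an RDoc code object.
ToyCodeObject = Struct.new(:document_self, :document_children,
                           :force_documentation, :params, :block_params)

# Copy of the directive handling shown above, as a standalone method.
def apply_directive(code_object, directive, value)
  case directive
  when 'nodoc'
    code_object.document_self = nil
    code_object.document_children = value.downcase != 'all'
  when 'doc'
    code_object.document_self = true
    code_object.force_documentation = true
  when 'yield', 'yields'
    # Drop a trailing &block parameter before recording the block params.
    code_object.params.sub!(/,?\s*&\w+/, '') if code_object.params
    code_object.block_params = value
  when 'arg', 'args'
    code_object.params = value
  end
end

meth = ToyCodeObject.new(true, true, false, +"(list, &block)", nil)

apply_directive(meth, 'yields', 'item, index')
meth.params        # => "(list)"
meth.block_params  # => "item, index"

apply_directive(meth, 'nodoc', 'all')
meth.document_self      # => nil
meth.document_children  # => false
```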

+ (Boolean) zip?(file)

Checks if file is a zip file in disguise. Signatures from www.garykessler.net/library/file_sigs.html

Returns:

  • (Boolean)


# File 'lib/rdoc/parser.rb', line 130

def self.zip? file
  zip_signature = File.read file, 4

  zip_signature == "PK\x03\x04" or
    zip_signature == "PK\x05\x06" or
    zip_signature == "PK\x07\x08"
end
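
A quick way to exercise this kind of signature check is to write a few bytes to a temporary file. The sketch below is illustrative only: zip_signature? and ZIP_SIGNATURES are hypothetical names, and the helper reads with File.binread rather than File.read so that the comparison is always between binary strings.

```ruby
require "tempfile"

# The three signatures tested in the listing above, as binary strings.
ZIP_SIGNATURES = ["PK\x03\x04", "PK\x05\x06", "PK\x07\x08"].map(&:b).freeze

# Hypothetical helper mirroring the check in the listing above.
def zip_signature?(path)
  ZIP_SIGNATURES.include?(File.binread(path, 4))
end

Tempfile.create("disguised") do |f|
  f.binmode
  f.write("PK\x03\x04rest of archive")
  f.flush
  zip_signature?(f.path)  # => true
end
```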