Class: Spider::FullSanitizer
Instance Method Summary
Methods inherited from Sanitizer
Instance Method Details
- (Object) process_node(node, result, options)
# File 'lib/spiderfw/utils/sanitizer.rb', line 42

    def process_node(node, result, options)
      result << node.to_s if node.class == HTML::Text
    end
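The filtering rule in process_node (append a node only when it is a text node) can be tried in isolation. This is a minimal sketch, not the framework's code: `TextNode`, `TagNode`, and `keep_text_only` are hypothetical stand-ins for the tokenizer's node classes and for the loop that calls process_node.

```ruby
# Hypothetical stand-ins for the tokenizer's node classes (assumed for this sketch).
TextNode = Struct.new(:content) do
  def to_s
    content
  end
end

TagNode = Struct.new(:name) do
  def to_s
    "<#{name}>"
  end
end

# Applies the same rule as process_node to a list of nodes:
# text nodes are appended to the result, everything else is skipped.
def keep_text_only(nodes)
  nodes.each_with_object(String.new) do |node, result|
    result << node.to_s if node.class == TextNode
  end
end

nodes = [TagNode.new("b"), TextNode.new("Hello"), TagNode.new("/b"), TextNode.new(" world")]
puts keep_text_only(nodes) # prints "Hello world"
```

Because only text nodes reach the result, every tag is dropped regardless of its name or attributes.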
- (Object) sanitize(text, options = {})
# File 'lib/spiderfw/utils/sanitizer.rb', line 33

    def sanitize(text, options = {})
      result = super
      # strip any comments, and if they have a newline at the end (i.e. a line
      # with only a comment), strip that too
      result.gsub!(/<!--(.*?)-->[\n]?/m, "") if result
      # Recurse - handle all dirty nested tags
      result == text ? result : sanitize(result, options)
    end
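The recursion at the end of sanitize repeats the stripping pass until a pass changes nothing. The sketch below illustrates why a single pass may not be enough, under a loud assumption: `strip_once` is a hypothetical regex-based stand-in for the inherited Sanitizer#sanitize, which in the real class works on tokenized nodes rather than a regex. Only the comment pattern is taken from the source.

```ruby
# Comment pattern copied from sanitize: removes the comment and one trailing newline.
COMMENT = /<!--(.*?)-->[\n]?/m

# Hypothetical single pass (assumed for this sketch): removes every
# innermost <...> sequence, that is, one containing no further angle brackets.
def strip_once(text)
  text.gsub(/<[^<>]*>/, "")
end

# Same fixed-point recursion as sanitize: stop once a pass changes nothing.
def strip_fully(text)
  result = strip_once(text)
  result == text ? result : strip_fully(result)
end

# Comment removal on its own, using the pattern from sanitize.
def strip_comments(text)
  text.gsub(COMMENT, "")
end

dirty = "<<b>script>alert(1)<</b>/script>"
puts strip_once(dirty)  # prints "<script>alert(1)</script>"
puts strip_fully(dirty) # prints "alert(1)"
```

In this sketch, removing the inner `<b>` and `</b>` joins the surrounding fragments into a new `<script>` element, so the first pass leaves markup behind; the recursive call removes it on the next pass and stops when the output equals the input.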