Class: RedCloth3

Inherits:
String
Defined in:
lib/redcloth3.rb

Overview

RedCloth

RedCloth is a Ruby library for converting Textile and/or Markdown into HTML. You can use either format, intermingled or separately. You can also extend RedCloth to honor your own custom text stylings.

RedCloth users are encouraged to use Textile if they are generating HTML and to use Markdown if others will be viewing the plain text.

What is Textile?

Textile is a simple formatting style for text documents, loosely based on some HTML conventions.

Sample Textile Text

h2. This is a title

h3. This is a subhead

This is a bit of paragraph.

bq. This is a blockquote.

Writing Textile

A Textile document consists of paragraphs. Paragraphs can be specially formatted by adding a small instruction to the beginning of the paragraph.

h[n].   Header of size [n].
bq.     Blockquote.
#       Numeric list.
*       Bulleted list.

Quick Phrase Modifiers

Quick phrase modifiers are also included, to allow formatting of small portions of text within a paragraph.

_emphasis_
__italicized__
*strong*
**bold**
??citation??
-deleted text-
+inserted text+
^superscript^
~subscript~
@code@
%(classname)span%

==notextile== (leave text alone)
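
The modifiers above correspond to the QTAGS table listed under Constant Summary. As a rough illustration of how a marker-to-tag table can drive inline substitution, here is a simplified sketch. The helper name `apply_phrase_modifiers`, the constant `QTAGS_SKETCH`, and the matching pattern are assumptions made for this example and are not RedCloth's actual span logic; the `%` (span with class) modifier is left out for brevity.

```ruby
# Simplified sketch only: RedCloth's real span handling is more involved.
# Longer markers come first so that "**" is tried before "*".
QTAGS_SKETCH = [
  ['**', 'b'], ['*', 'strong'], ['??', 'cite'], ['-', 'del'],
  ['__', 'i'], ['_', 'em'], ['+', 'ins'], ['^', 'sup'], ['~', 'sub']
].freeze

# Hypothetical helper: wraps text found between a pair of identical
# markers in the HTML tag mapped to that marker.
def apply_phrase_modifiers(text)
  QTAGS_SKETCH.reduce(text) do |out, (mark, tag)|
    q = Regexp.quote(mark)
    # The marked text must start and end with a non-space character.
    out.gsub(/#{q}(\S.*?\S|\S)#{q}/) { "<#{tag}>#{$1}</#{tag}>" }
  end
end

puts apply_phrase_modifiers("**bold** and *strong* and _em_")
# prints: <b>bold</b> and <strong>strong</strong> and <em>em</em>
```

The table order matters in this sketch: if `*` were processed before `**`, the double markers would be consumed as two single ones.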

Links

To make a hypertext link, put the link text in "quotation marks" followed immediately by a colon and the URL of the link.

Optional: text in (parentheses) following the link text, but before the closing quotation mark, will become a Title attribute for the link, visible as a tool tip when a cursor is above it.

Example:

"This is a link (This is a title) ":http://www.textism.com

Will become:

<a href="http://www.textism.com" title="This is a title">This is a link</a>

Images

To insert an image, put the URL for the image inside exclamation marks.

Optional: text that immediately follows the URL in (parentheses) will be used as the Alt text for the image. Images on the web should always have descriptive Alt text for the benefit of readers using non-graphical browsers.

Optional: place a colon followed by a URL immediately after the closing ! to make the image into a link.

Example:

!http://www.textism.com/common/textist.gif(Textist)!

Will become:

<img src="http://www.textism.com/common/textist.gif" alt="Textist" />

With a link:

!/common/textist.gif(Textist)!:http://textism.com

Will become:

<a href="http://textism.com"><img src="/common/textist.gif" alt="Textist" /></a>

Defining Acronyms

HTML allows authors to define acronyms via the <acronym> tag. The definition appears as a tool tip when a cursor hovers over the acronym. A crucial aid to clear writing, this should be used at least once for each acronym in documents where they appear.

To quickly define an acronym in Textile, place the full text in (parentheses) immediately following the acronym.

Example:

ACLU(American Civil Liberties Union)

Will become:

<acronym title="American Civil Liberties Union">ACLU</acronym>
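
The conversion above can be reproduced with the "3+ uppercase acronym" pattern that appears (commented out) in the GLYPHS listing under Constant Summary. The sketch below copies that pattern and its replacement string; the constant name `ACRONYM_RE` and the helper `expand_acronyms` are hypothetical, and whether the processor applies exactly this pattern at runtime is not confirmed by this page.

```ruby
# Pattern and replacement copied from the GLYPHS listing.
# It requires at least three uppercase characters before the parenthesis.
ACRONYM_RE = /\b([A-Z][A-Z0-9]{2,})\b(?:[(]([^)]*)[)])/

# Hypothetical helper: rewrites ACRONYM(full text) into an <acronym> tag.
def expand_acronyms(text)
  text.gsub(ACRONYM_RE, '<acronym title="\2">\1</acronym>')
end

puts expand_acronyms("ACLU(American Civil Liberties Union)")
# prints: <acronym title="American Civil Liberties Union">ACLU</acronym>
```

With this pattern, a two-letter acronym such as `UN(United Nations)` is left unchanged because it is shorter than three characters.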

Adding Tables

In Textile, simple tables can be added by separating each column with a pipe.

|a|simple|table|row|
|And|Another|table|row|

Attributes are defined by style definitions in parentheses.

table(border:1px solid black).
(background:#ddd;color:red). |{}| | | |

Using RedCloth

RedCloth is simply an extension of the String class, which can handle Textile formatting. Use it like a String and output HTML with its RedCloth#to_html method.

doc = RedCloth.new "

h2. Test document

Just a simple test."

puts doc.to_html

By default, RedCloth uses both Textile and Markdown formatting, with Textile formatting taking precedence. To turn off Markdown formatting, which boosts speed and limits the processor, use:

class RedCloth::Textile.new( str )

Direct Known Subclasses

Redmine::WikiFormatting::Textile::Formatter

Constant Summary

VERSION =
'3.0.4'
DEFAULT_RULES =
[:textile, :markdown]
TEXTILE_TAGS =

Mapping of 8-bit ASCII codes to HTML numerical entity equivalents. (from PyTextile)

[[128, 8364], [129, 0], [130, 8218], [131, 402], [132, 8222], [133, 8230], 
 [134, 8224], [135, 8225], [136, 710], [137, 8240], [138, 352], [139, 8249], 
 [140, 338], [141, 0], [142, 0], [143, 0], [144, 0], [145, 8216], [146, 8217], 
 [147, 8220], [148, 8221], [149, 8226], [150, 8211], [151, 8212], [152, 732], 
 [153, 8482], [154, 353], [155, 8250], [156, 339], [157, 0], [158, 0], [159, 376]].

collect! do |a, b|
    [a.chr, ( b.zero? and "" or "&#{ b };" )]
end
A_HLGN =

Regular expressions to convert to HTML.

/(?:(?:<>|<|>|\=|[()]+)+)/
A_VLGN =
/[\-^~]/
C_CLAS =
'(?:\([^)]+\))'
C_LNGE =
'(?:\[[^\[\]]+\])'
C_STYL =
'(?:\{[^}]+\})'
S_CSPN =
'(?:\\\\\d+)'
S_RSPN =
'(?:/\d+)'
A =
"(?:#{A_HLGN}?#{A_VLGN}?|#{A_VLGN}?#{A_HLGN}?)"
S =
"(?:#{S_CSPN}?#{S_RSPN}|#{S_RSPN}?#{S_CSPN}?)"
C =
"(?:#{C_CLAS}?#{C_STYL}?#{C_LNGE}?|#{C_STYL}?#{C_LNGE}?#{C_CLAS}?|#{C_LNGE}?#{C_STYL}?#{C_CLAS}?)"
PUNCT =

PUNCT = Regexp::quote( '!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~' )

Regexp::quote( '!"#$%&\'*+,-./:;=?@\\^_`|~' )
PUNCT_NOQ =
Regexp::quote( '!"#$&\',./:;=?@\\`|' )
PUNCT_Q =
Regexp::quote( '*-_+^~%' )
HYPERLINK =
'(\S+?)([^\w\s/;=\?]*?)(?=\s|<|$)'
SIMPLE_HTML_TAGS =

Text markup tags, don't conflict with block tags

[
    'tt', 'b', 'i', 'big', 'small', 'em', 'strong', 'dfn', 'code', 
    'samp', 'kbd', 'var', 'cite', 'abbr', 'acronym', 'a', 'img', 'br',
    'br', 'map', 'q', 'sub', 'sup', 'span', 'bdo'
]
QTAGS =
[
    ['**', 'b', :limit],
    ['*', 'strong', :limit],
    ['??', 'cite', :limit],
    ['-', 'del', :limit],
    ['__', 'i', :limit],
    ['_', 'em', :limit],
    ['%', 'span', :limit],
    ['+', 'ins', :limit],
    ['^', 'sup', :limit],
    ['~', 'sub', :limit]
]
GLYPHS =

Elements to handle

[
#   [ /([^\s\[{(>])?\'([dmst]\b|ll\b|ve\b|\s|:|$)/, '\1&#8217;\2' ], # single closing
#   [ /([^\s\[{(>#{PUNCT_Q}][#{PUNCT_Q}]*)\'/, '\1&#8217;' ], # single closing
#   [ /\'(?=[#{PUNCT_Q}]*(s\b|[\s#{PUNCT_NOQ}]))/, '&#8217;' ], # single closing
#   [ /\'/, '&#8216;' ], # single opening
#   [ /</, '&lt;' ], # less-than
#   [ />/, '&gt;' ], # greater-than
#   [ /([^\s\[{(])?"(\s|:|$)/, '\1&#8221;\2' ], # double closing
#   [ /([^\s\[{(>#{PUNCT_Q}][#{PUNCT_Q}]*)"/, '\1&#8221;' ], # double closing
#   [ /"(?=[#{PUNCT_Q}]*[\s#{PUNCT_NOQ}])/, '&#8221;' ], # double closing
#   [ /"/, '&#8220;' ], # double opening
#   [ /\b( )?\.{3}/, '\1&#8230;' ], # ellipsis
#   [ /\b([A-Z][A-Z0-9]{2,})\b(?:[(]([^)]*)[)])/, '<acronym title="\2">\1</acronym>' ], # 3+ uppercase acronym
#   [ /(^|[^"][>\s])([A-Z][A-Z0-9 ]+[A-Z0-9])([^<A-Za-z0-9]|$)/, '\1<span class="caps">\2</span>\3', :no_span_caps ], # 3+ uppercase caps
#   [ /(\.\s)?\s?--\s?/, '\1&#8212;' ], # em dash
#   [ /\s->\s/, ' &rarr; ' ], # right arrow
#   [ /\s-\s/, ' &#8211; ' ], # en dash
#   [ /(\d+) ?x ?(\d+)/, '\1&#215;\2' ], # dimension sign
#   [ /\b ?[(\[]TM[\])]/i, '&#8482;' ], # trademark
#   [ /\b ?[(\[]R[\])]/i, '&#174;' ], # registered
#   [ /\b ?[(\[]C[\])]/i, '&#169;' ] # copyright
]
H_ALGN_VALS =
{
    '<' => 'left',
    '=' => 'center',
    '>' => 'right',
    '<>' => 'justify'
}
V_ALGN_VALS =
{
    '^' => 'top',
    '-' => 'middle',
    '~' => 'bottom'
}
TABLE_RE =
/^(?:table(_?#{S}#{A}#{C})\. ?\n)?^(#{A}#{C}\.? ?\|.*?\|)(\n\n|\Z)/m
LISTS_RE =
/^([#*]+?#{C} .*?)$(?![^#*])/m
LISTS_CONTENT_RE =
/^([#*]+)(#{A}#{C}) (.*)$/m
QUOTES_RE =
/(^>+([^\n]*?)\n?)+/m
QUOTES_CONTENT_RE =
/^([> ]+)(.*)$/m
CODE_RE =
/(\W)
@
(?:\|(\w+?)\|)?
(.+?)
@
(?=\W)/x
BLOCKS_GROUP_RE =
/\n{2,}(?! )/m
BLOCK_RE =
/^(([a-z]+)(\d*))(#{A}#{C})\.(?::(\S+))? (.*)$/m
SETEXT_RE =
/\A(.+?)\n([=-])[=-]* *$/m
ATX_RE =
/\A(\#{1,6})  # $1 = string of #'s
[ ]*
(.+?)       # $2 = Header text
[ ]*
\#*         # optional closing #'s (not counted)
$/x
MARKDOWN_BQ_RE =
/\A(^ *> ?.+$(.+\n)*\n*)+/m
MARKDOWN_RULE_RE =
/^(#{
    ['*', '-', '_'].collect { |ch| ' ?(' + Regexp::quote( ch ) + ' ?){3,}' }.join( '|' )
})$/
LINK_RE =
/
    ([\s\[{(]|[#{PUNCT}])?     # $pre
    "                          # start
    (#{C})                     # $atts
    ([^"\n]+?)                 # $text
    \s?
    (?:\(([^)]+?)\)(?="))?     # $title
    ":
    (                          # $url
    (\/|[a-zA-Z]+:\/\/|www\.)  # $proto
    [\w\/]\S+?
    )               
    (\/)?                      # $slash
    ([^\w\=\/;\(\)]*?)         # $post
    (?=<|\s|$)
/x
MARKDOWN_REFLINK_RE =
/
    \[([^\[\]]+)\]      # $text
    [ ]?                # opt. space
    (?:\n[ ]*)?         # one optional newline followed by spaces
    \[(.*?)\]           # $id
/x
MARKDOWN_LINK_RE =
/
    \[([^\[\]]+)\]      # $text
    \(                  # open paren
    [ \t]*              # opt space
    <?(.+?)>?           # $href
    [ \t]*              # opt space
    (?:                 # whole title
    (['"])              # $quote
    (.*?)               # $title
    \3                  # matching quote
    )?                  # title is optional
    \)
/x
TEXTILE_REFS_RE =
/(^ *)\[([^\[\n]+?)\](#{HYPERLINK})(?=\s|$)/
MARKDOWN_REFS_RE =
/(^ *)\[([^\n]+?)\]:\s+<?(#{HYPERLINK})>?(?:\s+"((?:[^"]|\\")+)")?(?=\s|$)/m
IMAGE_RE =
/
    (<p>|.|^)            # start of line?
    \!                   # opening
    (\<|\=|\>)?          # optional alignment atts
    (#{C})               # optional style,class atts
    (?:\. )?             # optional dot-space
    ([^\s(!]+?)          # presume this is the src
    \s?                  # optional space
    (?:\(((?:[^\(\)]|\([^\)]+\))+?)\))?   # optional title
    \!                   # closing
    (?::#{ HYPERLINK })? # optional href
/x
OFFTAGS =
/(code|pre|kbd|notextile)/
OFFTAG_MATCH =
/(?:(<\/#{ OFFTAGS }>)|(<#{ OFFTAGS }[^>]*>))(.*?)(?=<\/?#{ OFFTAGS }|\Z)/mi
OFFTAG_OPEN =
/<#{ OFFTAGS }/
OFFTAG_CLOSE =
/<\/?#{ OFFTAGS }/
HASTAG_MATCH =
/(<\/?\w[^\n]*?>)/m
ALLTAG_MATCH =
/(<\/?\w[^\n]*?>)|.*?(?=<\/?\w[^\n]*?>|$)/m
BASIC_TAGS =

HTML cleansing stuff

{
    'a' => ['href', 'title'],
    'img' => ['src', 'alt', 'title'],
    'br' => [],
    'i' => nil,
    'u' => nil, 
    'b' => nil,
    'pre' => nil,
    'kbd' => nil,
    'code' => ['lang'],
    'cite' => nil,
    'strong' => nil,
    'em' => nil,
    'ins' => nil,
    'sup' => nil,
    'sub' => nil,
    'del' => nil,
    'table' => nil,
    'tr' => nil,
    'td' => ['colspan', 'rowspan'],
    'th' => nil,
    'ol' => nil,
    'ul' => nil,
    'li' => nil,
    'p' => nil,
    'h1' => nil,
    'h2' => nil,
    'h3' => nil,
    'h4' => nil,
    'h5' => nil,
    'h6' => nil, 
    'blockquote' => ['cite']
}
ALLOWED_TAGS =
%w(redpre pre code notextile)
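
As an illustration of how one of the patterns above can be used, the following sketch applies ATX_RE (copied from the listing) to a Markdown atx header. The helper `atx_to_html` and its output format are assumptions for this example; the HTML that RedCloth itself emits for such a header is not shown on this page.

```ruby
# Copied from the constant listing: $1 is the run of #'s, $2 the header text.
ATX_RE = /\A(\#{1,6})  # $1 = string of #'s
          [ ]*
          (.+?)       # $2 = Header text
          [ ]*
          \#*         # optional closing #'s (not counted)
          $/x

# Hypothetical helper: the header level is the number of leading #'s.
def atx_to_html(line)
  m = ATX_RE.match(line)
  return line unless m            # not an atx header, leave untouched
  level = m[1].length
  "<h#{level}>#{m[2]}</h#{level}>"
end

puts atx_to_html("## Hello ##")
# prints: <h2>Hello</h2>
```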

Instance Attribute Summary

Instance Method Summary

Methods inherited from String

#with_leading_slash

Methods included from Redmine::CoreExtensions::String::Conversions

#to_hours

Methods included from Diffable

#diff, #patch, #replacenextlarger, #reverse_hash

Constructor Details

- (RedCloth3) initialize(string, restrictions = [])

Returns a new RedCloth object, based on string and enforcing all the included restrictions.

r = RedCloth.new( "h1. A <b>bold</b> man", [:filter_html] )
r.to_html
  #=>"<h1>A &lt;b&gt;bold&lt;/b&gt; man</h1>"


# File 'lib/redcloth3.rb', line 254

def initialize( string, restrictions = [] )
    restrictions.each { |r| method( "#{ r }=" ).call( true ) }
    super( string )
end
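
The constructor turns each restriction symbol into a call to the setter of the same name. The sketch below isolates that mechanism in a hypothetical String subclass, `RestrictedString`, which is a stand-in for illustration and not part of RedCloth; only two of the accessors are reproduced.

```ruby
# Hypothetical stand-in for RedCloth3, reduced to the constructor logic.
class RestrictedString < String
  attr_accessor :filter_html, :lite_mode

  def initialize(string, restrictions = [])
    # Each symbol names a setter: :lite_mode calls lite_mode=(true).
    restrictions.each { |r| method("#{r}=").call(true) }
    super(string)
  end
end

s = RestrictedString.new("h1. Title", [:lite_mode])
puts s.lite_mode            # prints: true
puts s.filter_html.inspect  # prints: nil
```

One consequence of this design is that a symbol with no matching setter raises a NameError when the object is created, so a misspelled restriction fails loudly instead of being silently ignored.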

Instance Attribute Details

- (Object) filter_html

Two accessors for setting security restrictions.

This is a nice thing if you're using RedCloth for formatting in public places (e.g. Wikis) where you don't want users to abuse HTML for bad things.

If :filter_html is set, HTML which wasn't created by the Textile processor will be escaped.

If :filter_styles is set, it will also disable the style markup specifier. ('{color: red}')



# File 'lib/redcloth3.rb', line 185

def filter_html
  @filter_html
end

- (Object) filter_styles

Two accessors for setting security restrictions.

This is a nice thing if you're using RedCloth for formatting in public places (e.g. Wikis) where you don't want users to abuse HTML for bad things.

If :filter_html is set, HTML which wasn't created by the Textile processor will be escaped.

If :filter_styles is set, it will also disable the style markup specifier. ('{color: red}')



# File 'lib/redcloth3.rb', line 185

def filter_styles
  @filter_styles
end

- (Object) hard_breaks

Accessor for toggling hard breaks.

If :hard_breaks is set, single newlines will be converted to HTML break tags. This is the default behavior for traditional RedCloth.



# File 'lib/redcloth3.rb', line 194

def hard_breaks
  @hard_breaks
end

- (Object) lite_mode

Accessor for toggling lite mode.

In lite mode, block-level rules are ignored. This means that tables, paragraphs, lists, and such aren't available. Only the inline markup for bold, italics, entities and so on is processed.

r = RedCloth.new( "And then? She *fell*!", [:lite_mode] )
r.to_html
#=> "And then? She <strong>fell</strong>!"


# File 'lib/redcloth3.rb', line 206

def lite_mode
  @lite_mode
end

- (Object) no_span_caps

Accessor for toggling span caps.

Textile places `span' tags around capitalized words by default, but this wreaks havoc on Wikis. If :no_span_caps is set, this will be suppressed.



# File 'lib/redcloth3.rb', line 216

def no_span_caps
  @no_span_caps
end

- (Object) rules

Establishes the markup precedence. Available rules include:

Textile Rules

The following Textile rules can be set individually. Alternatively, add the complete set with the single :textile rule, which supplies the rules in the following precedence:

refs_textile

Textile references (i.e. [hobix]hobix.com/)

block_textile_table

Textile table block structures

block_textile_lists

Textile list structures

block_textile_prefix

Textile blocks with prefixes (e.g. bq., h2., etc.)

inline_textile_image

Textile inline images

inline_textile_link

Textile inline links

inline_textile_span

Textile inline spans

glyphs_textile

Textile entities (such as em-dashes and smart quotes)

Markdown

refs_markdown

Markdown references (for example: [hobix]: hobix.com/)

block_markdown_setext

Markdown setext headers

block_markdown_atx

Markdown atx headers

block_markdown_rule

Markdown horizontal rules

block_markdown_bq

Markdown blockquotes

block_markdown_lists

Markdown lists

inline_markdown_link

Markdown links



# File 'lib/redcloth3.rb', line 245

def rules
  @rules
end

Instance Method Details

- (Object) to_html(*rules)

Generates HTML from the Textile contents.

r = RedCloth.new( "And then? She *fell*!" )
r.to_html( true )
  #=>"And then? She <strong>fell</strong>!"


# File 'lib/redcloth3.rb', line 266

def to_html( *rules )
    rules = DEFAULT_RULES if rules.empty?
    # make our working copy
    text = self.dup
    
    @urlrefs = {}
    @shelf = []
    textile_rules = [:refs_textile, :block_textile_table, :block_textile_lists,
                     :block_textile_prefix, :inline_textile_image, :inline_textile_link,
                     :inline_textile_code, :inline_textile_span, :glyphs_textile]
    markdown_rules = [:refs_markdown, :block_markdown_setext, :block_markdown_atx, :block_markdown_rule,
                      :block_markdown_bq, :block_markdown_lists, 
                      :inline_markdown_reflink, :inline_markdown_link]
    @rules = rules.collect do |rule|
        case rule
        when :markdown
            markdown_rules
        when :textile
            textile_rules
        else
            rule
        end
    end.flatten

    # standard clean up
    incoming_entities text 
    clean_white_space text 

    # start processor
    @pre_list = []
    rip_offtags text
    no_textile text
    escape_html_tags text
    hard_break text 
    unless @lite_mode
        refs text
        # need to do this before text is split by #blocks
        block_textile_quotes text
        blocks text
    end
    inline text
    smooth_offtags text

    retrieve text

    text.gsub!( /<\/?notextile>/, '' )
    text.gsub!( /x%x%/, '&#38;' )
    clean_html text if filter_html
    text.strip!
    text

end
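
The first step of to_html expands the :textile and :markdown shortcuts into their individual rules and keeps any other symbol as-is. The sketch below extracts just that step into a standalone method; `expand_rules`, `TEXTILE_RULES`, and `MARKDOWN_RULES` are names chosen for this example (the source uses local variables inside to_html), while the rule lists themselves are copied from the method body above.

```ruby
DEFAULT_RULES = [:textile, :markdown]

TEXTILE_RULES = [:refs_textile, :block_textile_table, :block_textile_lists,
                 :block_textile_prefix, :inline_textile_image, :inline_textile_link,
                 :inline_textile_code, :inline_textile_span, :glyphs_textile].freeze

MARKDOWN_RULES = [:refs_markdown, :block_markdown_setext, :block_markdown_atx,
                  :block_markdown_rule, :block_markdown_bq, :block_markdown_lists,
                  :inline_markdown_reflink, :inline_markdown_link].freeze

# Hypothetical standalone version of the rule expansion in to_html.
def expand_rules(*rules)
  rules = DEFAULT_RULES if rules.empty?
  rules.collect do |rule|
    case rule
    when :markdown then MARKDOWN_RULES
    when :textile  then TEXTILE_RULES
    else rule                       # an individual rule is kept unchanged
    end
  end.flatten
end

puts expand_rules(:textile).length                       # prints: 9
puts expand_rules(:textile, :block_markdown_rule).last   # prints: block_markdown_rule
```

Going by this logic, passing :textile alone leaves every Markdown rule out of the active set, and individual rules can be mixed in after a shortcut.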