Class: Text::Hyphen::Language

Inherits:
Object
Defined in:
lib/text/hyphen/language.rb

Overview

Language scaffolding support for Text::Hyphen. Language hyphenation patterns are defined as instances of this class, and only this class. This deliberately breaks with Ruby's duck-typing convention: requiring an actual Text::Hyphen::Language instance signals that the patterns have been converted from TeX encodings to encodings (e.g., latin1 or UTF-8) that are better suited to general text manipulation.

Constant Summary

WORD_START_RE =
%r{^\.}
WORD_END_RE =
%r{\.$}
DIGIT_RE =
%r{\d}
NONDIGIT_RE =
%r{\D}
DASH_RE =
%r{-}
EXCEPTION_DASH0_RE =
%r{[^-](?=[^-])}
EXCEPTION_DASH1_RE =
%r{[^-]-}
EXCEPTION_NONUM_RE =
%r{[^01]}
ZERO_INSERT_RE =
%r{(\D)(?=\D)}
ZERO_START_RE =
%r{^(?=\D)}
DEFAULT_ENCODING =
if RUBY_VERSION < "1.9.1"
  "latin1"
else
  "utf-8"
end

Constructor Details

- (Language) initialize(language = nil) {|_self| ... }

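Creates a new language definition. With no argument, the encoding is set to DEFAULT_ENCODING, the patterns and exceptions are empty, left and right are both 2, and isocode is nil. If a language object is provided, these values are copied from it instead. A RuntimeError is raised if language is neither nil nor an instance of Text::Hyphen::Language (or of a subclass). If a block is given, the new object is yielded to it after the values have been set.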

Yields:

  • (_self)

Yield Parameters:

  • _self (Text::Hyphen::Language): the newly created language object

# File 'lib/text/hyphen/language.rb', line 141

def initialize(language = nil)
  if language.nil?
    self.encoding DEFAULT_ENCODING
    self.patterns ""
    self.exceptions ""
    self.left = 2
    self.right = 2
    self.isocode = nil
  elsif language.kind_of? Text::Hyphen::Language
    self.encoding language.encoding
    self.patterns language.instance_variable_get(:@pattern_text)
    self.exceptions language.instance_variable_get(:@exception_text)
    self.left = language.left
    self.right = language.right
    self.isocode = language.isocode
  else
    raise "Languages can only be created from descendants of Text::Hyphen::Language."
  end

  yield self if block_given?
end
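
The constructor's copy-then-customize behavior can be sketched without loading the library. In the minimal example below, SketchLanguage is a hypothetical stand-in that mirrors only the left, right, and isocode handling from the listing above; it is not the real Text::Hyphen::Language class, and the "en_us" value is illustrative.

```ruby
# SketchLanguage is a hypothetical stand-in that mirrors only part of the
# constructor logic shown above; it is not the library class.
class SketchLanguage
  attr_accessor :left, :right, :isocode

  def initialize(language = nil)
    if language.nil?
      # No source language: fall back to the documented defaults.
      self.left = 2
      self.right = 2
      self.isocode = nil
    elsif language.kind_of? SketchLanguage
      # Copy the settings from the language that was passed in.
      self.left = language.left
      self.right = language.right
      self.isocode = language.isocode
    else
      raise "Languages can only be created from descendants of SketchLanguage."
    end

    # Let the caller adjust the new object before it is returned.
    yield self if block_given?
  end
end

# Build a base language, then derive a variant that differs only in left.
base   = SketchLanguage.new { |lang| lang.isocode = "en_us" }
strict = SketchLanguage.new(base) { |lang| lang.left = 3 }
```

Because the block runs after the defaults or copied values are assigned, anything set inside the block overrides them.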

Instance Attribute Details

- (Object) isocode

The ISO language code for this language. Generally only used when there are multiple hyphenation tables available for a language.



# File 'lib/text/hyphen/language.rb', line 135

def isocode
  @isocode
end

- (Object) left

The minimum number of letters that can be on the left side of a hyphenation for this language. Defaults to 2.



# File 'lib/text/hyphen/language.rb', line 128

def left
  @left
end

- (Object) right

The minimum number of letters that can be on the right side of a hyphenation for this language. Defaults to 2.



# File 'lib/text/hyphen/language.rb', line 131

def right
  @right
end

Class Method Details

+ (Object) aliases_for(mapping)

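Creates constant aliases for language constants that are already defined. The mapping argument maps a language constant name to a single alias name or to an array of alias names. If the language constant has not been defined, a warning is issued and that entry is skipped. Alias names that are already defined are left untouched.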



# File 'lib/text/hyphen/language.rb', line 164

def self.aliases_for(mapping)
  mapping.each do |language, alias_names|
    unless const_defined? language
      warn "Aliases not created for #{language}; it has not been defined."
      next
    end
    language = const_get(language)

    [ alias_names ].flatten.each do |alias_name|
      next if const_defined? alias_name
      const_set(alias_name, language)
    end
  end
end
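
As a rough illustration of the aliasing behavior, the sketch below copies the method body into a hypothetical stand-in module named SketchLanguages. The constant names (EN_US, EN, ENGLISH, XX, YY) and the placeholder value are assumptions made for the example, not names taken from the library.

```ruby
# SketchLanguages is a hypothetical stand-in namespace; the method body
# mirrors aliases_for as shown above.
module SketchLanguages
  # Placeholder value standing in for a real language object.
  EN_US = :american_english_table

  def self.aliases_for(mapping)
    mapping.each do |language, alias_names|
      unless const_defined? language
        warn "Aliases not created for #{language}; it has not been defined."
        next
      end
      language = const_get(language)

      # Accept either a single alias name or an array of names.
      [ alias_names ].flatten.each do |alias_name|
        next if const_defined? alias_name
        const_set(alias_name, language)
      end
    end
  end
end

# EN and ENGLISH become aliases of EN_US; XX is undefined, so a warning
# is written to standard error and YY is never created.
SketchLanguages.aliases_for(:EN_US => [:EN, :ENGLISH], :XX => :YY)
```

Each alias refers to the same object as the original constant, so no pattern data is duplicated.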

Instance Method Details

- (Object) both

Patterns that are anchored at both the beginning and the end of a word, that is, whole-word patterns.



# File 'lib/text/hyphen/language.rb', line 41

def both
  @patterns[:both]
end

- (Object) encoding(enc = nil)

The encoding of the hyphenation definitions. The text to be hyphenated must be in the same encoding. Called with no argument, returns the current encoding; called with an argument, sets it.



# File 'lib/text/hyphen/language.rb', line 35

def encoding(enc = nil)
  return @encoding if enc.nil?
  @encoding = enc
end

- (Object) exceptions(exc = nil)

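Exceptions to the hyphenation patterns. Called with no argument, returns the parsed exceptions: a hash that maps each word (with its hyphens removed) to an array of 0/1 flags, where entry i is 1 if a hyphen is permitted after the first i letters. Called with a whitespace-separated string of hyphenated words (e.g., "ta-ble"), parses and stores them and returns true.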



# File 'lib/text/hyphen/language.rb', line 110

def exceptions(exc = nil)
  return @exceptions if exc.nil?

  @exception_text = exc.dup
  @exceptions = {}

  @exception_text.split.each do |word|
    tag   = word.gsub(DASH_RE,'')
    value = "0" + word.gsub(EXCEPTION_DASH0_RE, '0').gsub(EXCEPTION_DASH1_RE, '1')
    value.gsub!(EXCEPTION_NONUM_RE, '0')
    @exceptions[tag] = value.scan(self.scan_re).map { |c| c.to_i }
  end

  true
end
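
To make the exception encoding concrete, the following standalone sketch applies the same substitutions to a single word. The ExceptionSketch module and its entry_for method are hypothetical names introduced for the example; the regular expressions are copied from the constant summary, and a plain %r{.} stands in for scan_re.

```ruby
# ExceptionSketch is a hypothetical wrapper; the regular expressions are
# the ones listed in the constant summary.
module ExceptionSketch
  DASH_RE            = %r{-}
  EXCEPTION_DASH0_RE = %r{[^-](?=[^-])}
  EXCEPTION_DASH1_RE = %r{[^-]-}
  EXCEPTION_NONUM_RE = %r{[^01]}

  # Returns [tag, flags] for one hyphenated exception word.
  def self.entry_for(word)
    tag = word.gsub(DASH_RE, '')
    # A letter followed by another letter becomes 0 (no break after it);
    # a letter followed by a hyphen becomes 1 (break allowed after it).
    value = "0" + word.gsub(EXCEPTION_DASH0_RE, '0').gsub(EXCEPTION_DASH1_RE, '1')
    # The final letter is still present at this point; turn it into 0.
    value = value.gsub(EXCEPTION_NONUM_RE, '0')
    [tag, value.scan(%r{.}).map { |c| c.to_i }]
  end
end

tag, points = ExceptionSketch.entry_for("ta-ble")
# tag    => "table"
# points => [0, 0, 1, 0, 0, 0]
```

The array has one more entry than the word has letters; the 1 at index 2 records that "table" may be split after its first two letters.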

- (Object) hyphen

Patterns that hyphenate mid-word.



# File 'lib/text/hyphen/language.rb', line 56

def hyphen
  @patterns[:hyphen]
end

- (Object) patterns(pats = nil)

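The hyphenation patterns for this language. Called with no argument, returns a hash with the keys :both, :start, :stop, and :hyphen, each of which maps a pattern's letters to its digit string. Called with TeX-style pattern text, parses and stores the patterns and returns true. Text from a % to the end of the line is treated as a comment. A leading . anchors a pattern to the start of a word and a trailing . anchors it to the end. The patterns are assumed to be in lowercase already.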



# File 'lib/text/hyphen/language.rb', line 61

def patterns(pats = nil)
  return @patterns if pats.nil?

  @pattern_text = pats.dup

  @patterns = {
    :both   => {}, 
    :start  => {},
    :stop   => {},
    :hyphen => {}
  }

  plist = @pattern_text.split($/).map { |ln| ln.gsub(%r{%.*$}, '') }
  plist.each do |line|
    line.split.each do |word|
      next if word.empty?

      start = stop = false

      start = true if word.sub!(WORD_START_RE, '')
      stop  = true if word.sub!(WORD_END_RE, '')

      # Insert zeroes and start with some digit
      word.gsub!(ZERO_INSERT_RE) { "#{$1}0" }
      word.gsub!(ZERO_START_RE, "0")

      # This assumes that the pattern lists are already in lowercase
      # form only.
      tag   = word.gsub(DIGIT_RE, '')
      value = word.gsub(NONDIGIT_RE, '')

      if start and stop
        set = :both
      elsif start
        set = :start
      elsif stop
        set = :stop
      else
        set = :hyphen
      end

      @patterns[set][tag] = value
    end
  end

  true
end
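
The per-word conversion can be traced with a small standalone sketch. PatternSketch and its parse method are hypothetical names introduced for the example, and the sample pattern words are illustrative; the regular expressions are the ones from the constant summary, and the steps follow the listing above.

```ruby
# PatternSketch is a hypothetical wrapper around the per-word steps of
# the patterns method shown above.
module PatternSketch
  WORD_START_RE  = %r{^\.}
  WORD_END_RE    = %r{\.$}
  DIGIT_RE       = %r{\d}
  NONDIGIT_RE    = %r{\D}
  ZERO_INSERT_RE = %r{(\D)(?=\D)}
  ZERO_START_RE  = %r{^(?=\D)}

  # Returns [set, tag, value] for a single TeX-style pattern word.
  def self.parse(word)
    word = word.dup

    # Strip the anchoring dots and remember which ones were present.
    start = !word.sub!(WORD_START_RE, '').nil?
    stop  = !word.sub!(WORD_END_RE, '').nil?

    # Insert an explicit 0 between adjacent letters, and a leading 0
    # if the pattern does not already begin with a digit.
    word.gsub!(ZERO_INSERT_RE) { "#{$1}0" }
    word.gsub!(ZERO_START_RE, "0")

    set =
      if start && stop
        :both
      elsif start
        :start
      elsif stop
        :stop
      else
        :hyphen
      end

    [set, word.gsub(DIGIT_RE, ''), word.gsub(NONDIGIT_RE, '')]
  end
end

PatternSketch.parse(".ach4")  # => [:start, "ach", "0004"]
PatternSketch.parse("4te.")   # => [:stop, "te", "40"]
PatternSketch.parse("a1b")    # => [:hyphen, "ab", "01"]
```

After the zero insertion, every letter in the pattern is preceded by exactly one digit, which is why the tag and value can be separated by simply deleting digits or non-digits.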

- (Object) scan_re

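The regular expression used to split a string into individual characters. On Ruby versions before 1.9.1, the u flag is set when the encoding is UTF-8 so that each multibyte character is matched as a single character.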



# File 'lib/text/hyphen/language.rb', line 26

def scan_re #:nodoc:
  if RUBY_VERSION < '1.9.1'
    return %r{.}u if @encoding =~ /utf-?8/i
  end
  return %r{.}
end

- (Object) start

Patterns that match the beginning of a word.



# File 'lib/text/hyphen/language.rb', line 46

def start
  @patterns[:start]
end

- (Object) stop

Patterns that match the end of a word.



# File 'lib/text/hyphen/language.rb', line 51

def stop
  @patterns[:stop]
end