Class: Text::Hyphen::Language
- Inherits: Object
  - Object
    - Text::Hyphen::Language
- Defined in: lib/text/hyphen/language.rb
Overview
Language scaffolding support for Text::Hyphen. Language hyphenation patterns are defined as instances of this class, and only this class. This deliberately breaks with Ruby's duck typing to signal that the patterns have been converted from TeX encodings to other encodings (e.g., latin1 or UTF-8) that are better suited to general text manipulation.
Constant Summary
- WORD_START_RE = %r{^\.}
- WORD_END_RE = %r{\.$}
- DIGIT_RE = %r{\d}
- NONDIGIT_RE = %r{\D}
- DASH_RE = %r{-}
- EXCEPTION_DASH0_RE = %r{[^-](?=[^-])}
- EXCEPTION_DASH1_RE = %r{[^-]-}
- EXCEPTION_NONUM_RE = %r{[^01]}
- ZERO_INSERT_RE = %r{(\D)(?=\D)}
- ZERO_START_RE = %r{^(?=\D)}
- DEFAULT_ENCODING =
    if RUBY_VERSION < "1.9.1"
      "latin1"
    else
      "utf-8"
    end
Instance Attribute Summary
- (Object) isocode
    The ISO language code for this language.
- (Object) left
    The minimum number of letters that can be on the left side of a hyphenation for this language.
- (Object) right
    The minimum number of letters that can be on the right side of a hyphenation for this language.
Class Method Summary
+ (Object) aliases_for(mapping)
    Creates language constant aliases for the language.
Instance Method Summary
- (Object) both
    Patterns anchored at both the beginning and the end of a word.
- (Object) encoding(enc = nil)
    The encoding of the hyphenation definitions.
- (Object) exceptions(exc = nil)
    Exceptions to the hyphenation patterns.
- (Object) hyphen
    Patterns that hyphenate mid-word.
- (Language) initialize(language = nil) {|_self| ... } (constructor)
    Creates a new language implementation.
- (Object) patterns(pats = nil)
    The hyphenation patterns for this language.
- (Object) scan_re
    The character scan regular expression to use.
- (Object) start
    Patterns that match the beginning of a word.
- (Object) stop
    Patterns that match the end of a word.
Constructor Details
- (Language) initialize(language = nil) {|_self| ... }
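Creates a new language implementation. If a language object is provided, its values are used as the defaults; otherwise the defaults are the default encoding, empty patterns and exceptions, left and right minimums of 2, and no ISO code. A RuntimeError is raised if a value is provided for language that is not an instance of Text::Hyphen::Language. If a block is given, the new object is yielded to it for further configuration.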
    # File 'lib/text/hyphen/language.rb', line 141
    def initialize(language = nil)
      if language.nil?
        self.encoding DEFAULT_ENCODING
        self.patterns ""
        self.exceptions ""
        self.left = 2
        self.right = 2
        self.isocode = nil
      elsif language.kind_of? Text::Hyphen::Language
        self.encoding language.encoding
        self.patterns language.instance_variable_get(:@pattern_text)
        self.exceptions language.instance_variable_get(:@exception_text)
        self.left = language.left
        self.right = language.right
        self.isocode = language.isocode
      else
        raise "Languages can only be created from descendants of Text::Hyphen::Language."
      end

      yield self if block_given?
    end
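The constructor combines three idioms: defaults, copy-from-another-instance, and block configuration. The following is a minimal sketch of that shape using a hypothetical stand-in class, MiniLanguage, which is not part of Text::Hyphen and omits patterns and exceptions; it is meant only to illustrate the control flow shown above.

```ruby
# MiniLanguage is a hypothetical stand-in used for illustration only.
class MiniLanguage
  attr_accessor :left, :right, :isocode

  # Combined getter/setter, in the same style as Language#encoding.
  def encoding(enc = nil)
    return @encoding if enc.nil?
    @encoding = enc
  end

  def initialize(language = nil)
    if language.nil?
      # No source object: fall back to fixed defaults.
      encoding "utf-8"
      self.left    = 2
      self.right   = 2
      self.isocode = nil
    elsif language.kind_of?(MiniLanguage)
      # Source object given: copy its settings as the new defaults.
      encoding language.encoding
      self.left    = language.left
      self.right   = language.right
      self.isocode = language.isocode
    else
      raise "Languages can only be created from MiniLanguage instances."
    end

    # Let the caller adjust the new object before it is returned.
    yield self if block_given?
  end
end

base = MiniLanguage.new do |lang|
  lang.isocode = "en_us"
  lang.left    = 3
end

# Derive a variant: starts from base's values, then overrides one of them.
variant = MiniLanguage.new(base) { |lang| lang.right = 3 }
```

Because the block runs after the defaults are assigned, a derived object only needs to state what differs from its source.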
Instance Attribute Details
- (Object) isocode
The ISO language code for this language. Generally only used when there are multiple hyphenation tables available for a language.
    # File 'lib/text/hyphen/language.rb', line 135
    def isocode
      @isocode
    end
- (Object) left
The minimum number of letters that can be on the left side of a hyphenation for this language. Defaults to 2.
    # File 'lib/text/hyphen/language.rb', line 128
    def left
      @left
    end
- (Object) right
The minimum number of letters that can be on the right side of a hyphenation for this language. Defaults to 2.
    # File 'lib/text/hyphen/language.rb', line 131
    def right
      @right
    end
Class Method Details
+ (Object) aliases_for(mapping)
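Creates language constant aliases for the language. The mapping associates the name of an existing language constant with one alias name or an array of alias names. A language that has not been defined is skipped with a warning, and an alias whose constant already exists is left untouched.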
    # File 'lib/text/hyphen/language.rb', line 164
    def self.aliases_for(mapping)
      mapping.each do |language, alias_names|
        unless const_defined? language
          warn "Aliases not created for #{language}; it has not been defined."
          next
        end
        language = const_get(language)
        [ alias_names ].flatten.each do |alias_name|
          next if const_defined? alias_name
          const_set(alias_name, language)
        end
      end
    end
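To see the aliasing behaviour in isolation, here is a small sketch that reuses the same logic inside a hypothetical module, DemoLanguages. The module, the EN_US placeholder value, and the alias names are assumptions made for the example and do not come from the library.

```ruby
# DemoLanguages is a hypothetical container used for illustration only.
module DemoLanguages
  # Placeholder standing in for a real language object.
  EN_US = Object.new

  def self.aliases_for(mapping)
    mapping.each do |language, alias_names|
      unless const_defined?(language)
        warn "Aliases not created for #{language}; it has not been defined."
        next
      end
      target = const_get(language)
      # Accept either a single alias name or an array of names.
      [alias_names].flatten.each do |alias_name|
        next if const_defined?(alias_name) # never overwrite an existing constant
        const_set(alias_name, target)
      end
    end
  end
end

# :EN_US exists, so both aliases are created; :FR does not, so a warning
# is written to standard error and :FRENCH is never defined.
DemoLanguages.aliases_for(:EN_US => [:EN, :ENGLISH], :FR => :FRENCH)
```

Each alias is a second constant bound to the same object, so DemoLanguages::EN and DemoLanguages::EN_US are identical rather than copies.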
Instance Method Details
- (Object) both
Patterns anchored at both the beginning and the end of a word, i.e. those written with a leading and a trailing dot.
    # File 'lib/text/hyphen/language.rb', line 41
    def both
      @patterns[:both]
    end
- (Object) encoding(enc = nil)
The encoding of the hyphenation definitions. The text to be hyphenated must use the same encoding. Called without an argument, returns the current encoding; called with an argument, sets it.
    # File 'lib/text/hyphen/language.rb', line 35
    def encoding(enc = nil)
      return @encoding if enc.nil?
      @encoding = enc
    end
- (Object) exceptions(exc = nil)
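Exceptions to the hyphenation patterns. Called without an argument, returns the parsed exceptions; called with a string of whitespace-separated words in which dashes mark the permitted break points, parses and stores them.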
    # File 'lib/text/hyphen/language.rb', line 110
    def exceptions(exc = nil)
      return @exceptions if exc.nil?

      @exception_text = exc.dup
      @exceptions = {}

      @exception_text.split.each do |word|
        tag = word.gsub(DASH_RE,'')
        value = "0" + word.gsub(EXCEPTION_DASH0_RE, '0').gsub(EXCEPTION_DASH1_RE, '1')
        value.gsub!(EXCEPTION_NONUM_RE, '0')
        @exceptions[tag] = value.scan(self.scan_re).map { |c| c.to_i }
      end

      true
    end
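The conversion of one exception word into its lookup key and break-point array can be traced with a standalone sketch. The regular expressions are copied from the Constant Summary; the helper name exception_points is hypothetical, and the sketch scans with a plain %r{.} rather than calling scan_re.

```ruby
DASH_RE            = %r{-}
EXCEPTION_DASH0_RE = %r{[^-](?=[^-])}
EXCEPTION_DASH1_RE = %r{[^-]-}
EXCEPTION_NONUM_RE = %r{[^01]}

# Hypothetical helper mirroring the body of the loop in #exceptions.
def exception_points(word)
  # The lookup key is the word with its dashes removed.
  tag = word.gsub(DASH_RE, '')

  # A letter followed by another letter becomes 0 (no break after it);
  # a letter followed by a dash becomes 1 (break allowed after it).
  value = "0" + word.gsub(EXCEPTION_DASH0_RE, '0').gsub(EXCEPTION_DASH1_RE, '1')

  # Anything left over, such as the final letter, becomes 0.
  value.gsub!(EXCEPTION_NONUM_RE, '0')

  [tag, value.scan(%r{.}).map { |c| c.to_i }]
end

tag, points = exception_points("ta-ble")
# tag    => "table"
# points => [0, 0, 1, 0, 0, 0]
```

The array has one more entry than the word has letters: entry 0 is the position before the first letter, and the 1 at index 2 marks the break between "ta" and "ble".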
- (Object) hyphen
Patterns that hyphenate mid-word.
    # File 'lib/text/hyphen/language.rb', line 56
    def hyphen
      @patterns[:hyphen]
    end
- (Object) patterns(pats = nil)
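The hyphenation patterns for this language. Called without an argument, returns the parsed patterns; called with a string of whitespace-separated patterns, parses and stores them. Text from a % to the end of a line is treated as a comment, a leading dot anchors a pattern to the start of a word, and a trailing dot anchors it to the end.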
    # File 'lib/text/hyphen/language.rb', line 61
    def patterns(pats = nil)
      return @patterns if pats.nil?

      @pattern_text = pats.dup

      @patterns = {
        :both   => {},
        :start  => {},
        :stop   => {},
        :hyphen => {}
      }

      plist = @pattern_text.split($/).map { |ln| ln.gsub(%r{%.*$}, '') }
      plist.each do |line|
        line.split.each do |word|
          next if word.empty?

          start = stop = false

          start = true if word.sub!(WORD_START_RE, '')
          stop  = true if word.sub!(WORD_END_RE, '')

          # Insert zeroes and start with some digit
          word.gsub!(ZERO_INSERT_RE) { "#{$1}0" }
          word.gsub!(ZERO_START_RE, "0")

          # This assumes that the pattern lists are already in lowercase
          # form only.
          tag   = word.gsub(DIGIT_RE, '')
          value = word.gsub(NONDIGIT_RE, '')

          if start and stop
            set = :both
          elsif start
            set = :start
          elsif stop
            set = :stop
          else
            set = :hyphen
          end

          @patterns[set][tag] = value
        end
      end

      true
    end
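How a single pattern is split into a set, a letter tag, and a digit string may be easier to follow on concrete input. The sketch below copies the regular expressions from the Constant Summary and repeats the per-word steps in a hypothetical helper, parse_pattern; the sample patterns are made up for illustration and are not taken from any real pattern file.

```ruby
WORD_START_RE  = %r{^\.}
WORD_END_RE    = %r{\.$}
DIGIT_RE       = %r{\d}
NONDIGIT_RE    = %r{\D}
ZERO_INSERT_RE = %r{(\D)(?=\D)}
ZERO_START_RE  = %r{^(?=\D)}

# Hypothetical helper mirroring the per-word steps in #patterns.
def parse_pattern(raw)
  word = raw.dup

  # sub! returns nil when nothing was replaced, so the result doubles
  # as a flag for whether the anchor dot was present.
  start = !word.sub!(WORD_START_RE, '').nil?
  stop  = !word.sub!(WORD_END_RE, '').nil?

  # Put an explicit 0 between adjacent letters and before a leading letter.
  word.gsub!(ZERO_INSERT_RE) { "#{$1}0" }
  word.gsub!(ZERO_START_RE, "0")

  tag   = word.gsub(DIGIT_RE, '')    # letters only: the lookup key
  value = word.gsub(NONDIGIT_RE, '') # digits only: the break weights

  set = if start && stop then :both
        elsif start      then :start
        elsif stop       then :stop
        else                  :hyphen
        end

  [set, tag, value]
end

parse_pattern(".ach4")   # => [:start, "ach", "0004"]
parse_pattern("1ba")     # => [:hyphen, "ba", "10"]
parse_pattern(".ab1c.")  # => [:both, "abc", "001"]
```

Note that no zero is appended after a trailing letter, so the digit string can be as short as the tag (as with "1ba") or one character longer (as with ".ach4").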
- (Object) scan_re
The character scan regular expression to use.
    # File 'lib/text/hyphen/language.rb', line 26
    def scan_re #:nodoc:
      if RUBY_VERSION < '1.9.1'
        return %r{.}u if @encoding =~ /utf-?8/i
      end
      return %r{.}
    end
- (Object) start
Patterns that match the beginning of a word.
    # File 'lib/text/hyphen/language.rb', line 46
    def start
      @patterns[:start]
    end
- (Object) stop
Patterns that match the end of a word.
    # File 'lib/text/hyphen/language.rb', line 51
    def stop
      @patterns[:stop]
    end