Module: Nuggets::Array::HistogramMixin

Included in:
Array
Defined in:
lib/nuggets/array/histogram_mixin.rb

Defined Under Namespace

Classes: HistogramItem

Constant Summary

FORMATS =

Provides some default formats for #formatted_histogram.

Example:

(default)         ab  [==]  2
(percent)         xyz [===] 3 (37.50%)
(numeric)          42 [==]  2
(numeric_percent) 123 [=]   1 (12.50%)

The “numeric” variants format the item as a (decimal) number.

{
  :default         => '%-*s [%s]%*s %*d',
  :percent         => '%-*s [%s]%*s %*d (%.2f%%)',
  :numeric         =>  '%*d [%s]%*s %*d',
  :numeric_percent =>  '%*d [%s]%*s %*d (%.2f%%)'
}
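
The difference between the plain and the "numeric" variants comes down to the first conversion: `%-*s` left-aligns the item as a String, while `%*d` right-aligns it as a decimal Integer. A minimal sketch using only Kernel#format behaviour; the constant names and the width 3 are illustrative assumptions, not taken from the library:

```ruby
# Illustrative constants only; the library keeps these pieces inside FORMATS.
STRING_ITEM  = '%-*s'  # left-aligned String, width taken from the argument list
NUMERIC_ITEM = '%*d'   # right-aligned decimal Integer, width from the argument list

puts '[' + (STRING_ITEM  % [3, 'ab']) + ']'
puts '[' + (NUMERIC_ITEM % [3, 42])   + ']'
```

The `*` in each directive consumes one extra argument (the width) ahead of the value itself.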


Instance Method Details

- (Object) annotated_histogram

call-seq:

array.annotated_histogram => anArray
array.annotated_histogram { |hist_item| ... } => aHash

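Calculates the #histogram for array and wraps each entry in a histogram item (see HistogramItem). With a block, yields each histogram item to the block and returns the histogram itself: an Array of [item, frequency] pairs if the items could be sorted, otherwise the unsorted Hash. Without a block, returns an Array of the histogram items.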



# File 'lib/nuggets/array/histogram_mixin.rb', line 97

def annotated_histogram
  hist, items = histogram, []

  percentage = size / 100.0

  max_freq = hist.values.max
  max_freq_length = max_freq.to_s.length

  max_item_length = hist.keys.map { |item| item.to_s.length }.max

  # try to sort the histogram hash
  begin
    hist = hist.sort
  rescue ::ArgumentError
  end

  hist.each { |item, freq|
    hist_item = HistogramItem.new(
      item, freq, max_freq, max_freq_length, max_item_length, freq / percentage
    )

    block_given? ? yield(hist_item) : items << hist_item
  }

  block_given? ? hist : items
end
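
The "try to sort" step above relies on Hash#sort raising ArgumentError when keys cannot be compared with each other. The following stand-alone sketch isolates that fallback; `sorted_or_original` is a hypothetical helper name used only for illustration and is not part of Nuggets:

```ruby
# Hypothetical helper mirroring the begin/rescue block in annotated_histogram.
def sorted_or_original(hist)
  hist.sort               # Hash#sort returns an Array of [key, value] pairs
rescue ArgumentError
  hist                    # keys are not mutually comparable: keep the Hash
end

comparable = { 'xyz' => 3, 'ab' => 2 }
mixed      = { 'ab' => 2, 42 => 1 }

p sorted_or_original(comparable)  # an Array of pairs, ordered by key
p sorted_or_original(mixed)       # the original Hash, insertion order kept
```

This is why the block form may return either an Array of pairs or a Hash, depending on the contents of the array.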

- (Object) formatted_histogram(format = :default, indicator = '=')

call-seq:

array.formatted_histogram([format[, indicator]]) => aString

Returns the #histogram of array as a formatted String according to format, using indicator to draw the frequency bar.

format may be a Symbol indicating one of the provided default formats (see FORMATS) or a format String (see Kernel#sprintf) that will receive the following arguments (in order):

  1. max_item_length (Integer)

  2. item (String)

  3. “frequency_bar” (String)

  4. “padding” (an Integer width followed by an empty String, both consumed by a single %*s)

  5. max_freq_length (Integer)

  6. freq (Integer)

  7. percentage (Float, optional)

See HistogramItem for further details on the individual arguments.
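
To make the argument order concrete, the sketch below feeds hand-built argument lists to two of the default format strings. The values are assumptions chosen to mirror the "(default)" and "(percent)" sample lines (longest item 3 characters, highest frequency 3, 8 elements in total); in real use the library assembles these arguments itself.

```ruby
# Format strings copied from FORMATS; the argument arrays are built by hand.
DEFAULT_FORMAT = '%-*s [%s]%*s %*d'
PERCENT_FORMAT = '%-*s [%s]%*s %*d (%.2f%%)'

# Order: max_item_length, item, frequency bar, padding width, padding string,
#        max_freq_length, freq[, percentage]
default_line = DEFAULT_FORMAT % [3, 'ab',  '==',  1, '', 1, 2]
percent_line = PERCENT_FORMAT % [3, 'xyz', '===', 0, '', 1, 3, 37.5]

puts default_line
puts percent_line
```

A custom format String has to consume the same arguments in the same order, and must contain a literal `%%` if it wants to receive the percentage.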

Raises:

  • (::TypeError): if format is neither a key of FORMATS nor a String


# File 'lib/nuggets/array/histogram_mixin.rb', line 143

def formatted_histogram(format = :default, indicator = '=')
  format = FORMATS[format] if FORMATS.key?(format)
  raise ::TypeError, "String expected, got #{format.class}" unless format.is_a?(::String)

  include_percentage = format.include?('%%')
  indicator_length   = indicator.length

  lines = []

  annotated_histogram { |hist|
    arguments = [
      hist.max_item_length, hist.item,                     # item (padded)
      indicator * hist.freq,                               # indicator bar
      (hist.max_freq - hist.freq) * indicator_length, '',  # indicator padding
      hist.max_freq_length, hist.freq                      # frequency (padded)
    ]

    arguments << hist.percentage if include_percentage     # percentage (optional)

    lines << format % arguments
  }

  lines.join("\n")
end
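
The padding width is multiplied by the indicator's length so that bars stay aligned even when indicator is longer than one character. A small sketch of just that step, using a hypothetical helper `bar_segment` that is not part of Nuggets:

```ruby
# Hypothetical helper isolating the bar-drawing arithmetic shown above.
def bar_segment(freq, max_freq, indicator = '=')
  bar       = indicator * freq                          # e.g. '<>' * 2 => '<><>'
  pad_width = (max_freq - freq) * indicator.length      # fill up to the longest bar
  format('[%s]%*s', bar, pad_width, '')
end

# Every segment has the same width, so the trailing '|' lines up.
[3, 1, 2].each { |freq| puts bar_segment(freq, 3, '<>') + '|' }
```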

- (Object) histogram

call-seq:

array.histogram => aHash
array.histogram { |x| ... } => aHash

Calculates the frequency histogram of the values in array. Returns a Hash that maps each value (or, if a block is given, the block's result for that value) to its frequency.



# File 'lib/nuggets/array/histogram_mixin.rb', line 69

def histogram
  hist = ::Hash.new(0)
  each { |x| hist[block_given? ? yield(x) : x] += 1 }
  hist
end
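
A usage sketch that does not require the gem: the method body above is copied into a stand-in module (`HistogramSketch`, a hypothetical name) and attached to a single array with `extend`. With Nuggets loaded, the same calls would presumably work on any Array.

```ruby
# Stand-in module; the method body is the one shown above.
module HistogramSketch
  def histogram
    hist = ::Hash.new(0)
    each { |x| hist[block_given? ? yield(x) : x] += 1 }
    hist
  end
end

words = %w[apple avocado banana cherry apple].extend(HistogramSketch)

p words.histogram               # counts per distinct word
p words.histogram { |w| w[0] }  # counts per first letter
```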

- (Object) probability_mass_function(&block)

Also known as: pmf

call-seq:

array.probability_mass_function => aHash
array.probability_mass_function { |x| ... } => aHash

Calculates the probability mass function (normalized histogram) of the values in array. Returns a Hash that maps each value (or, if a block is given, the block's result for that value) to its probability, i.e. its frequency from #histogram divided by the size of array.



# File 'lib/nuggets/array/histogram_mixin.rb', line 83

def probability_mass_function(&block)
  hist, n = histogram(&block), size.to_f
  hist.each { |k, v| hist[k] = v / n }
end
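
As an illustration of the normalization, the sketch below reproduces the same steps in a free-standing method. `pmf_sketch` is a hypothetical name; it takes the array as an argument instead of being mixed into Array.

```ruby
# Hypothetical stand-alone version of the logic shown above.
def pmf_sketch(values)
  hist = Hash.new(0)
  values.each { |x| hist[block_given? ? yield(x) : x] += 1 }

  n = values.size.to_f
  hist.each { |k, v| hist[k] = v / n }  # Hash#each returns the Hash itself
end

p pmf_sketch([1, 1, 2, 3])                   # 1 occurs in half of the elements
p pmf_sketch([1, 2, 3, 4]) { |x| x.even? }   # keyed by the block's result
```

The probabilities always sum to 1 (up to floating-point rounding), since every element contributes 1/n to exactly one key.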