Class: Jobs::Analysis::SingleTermVectors

Inherits:
Base
Defined in:
lib/jobs/analysis/single_term_vectors.rb

Overview

Write out the term vectors for a single document

Instance Attribute Summary

Attributes inherited from Base

#dataset_id, #user_id

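Class Method Summary

  • + (Boolean) download?: We don't want users to download the YAML file

Instance Method Summary

  • - (undefined) perform: Export the term vectors for a one-document dataset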

Methods inherited from Base

#error, job_list, view_path

Methods inherited from Base

#==, #attributes, #error, #initialize, #max_attempts

Constructor Details

This class inherits a constructor from Jobs::Base

Class Method Details

+ (Boolean) download?

We don't want users to download the YAML file

Returns:

  • (Boolean)


# File 'lib/jobs/analysis/single_term_vectors.rb', line 55

def self.download?
  false
end
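
The override above can be sketched in isolation. The class names ExampleBase and ExampleSingleTermVectors, and the default of true in the parent class, are assumptions made for illustration; the real Jobs::Analysis::Base is not shown on this page and may behave differently.

```ruby
# Hypothetical stand-in for the parent job class.
class ExampleBase
  # Assumed default: job results may be downloaded.
  def self.download?
    true
  end
end

# Hypothetical stand-in for a job whose result file should stay hidden.
class ExampleSingleTermVectors < ExampleBase
  # Override the class-level predicate, as SingleTermVectors does.
  def self.download?
    false
  end
end

# A caller can query any job class without knowing its concrete type.
downloadable = [ExampleBase, ExampleSingleTermVectors].select(&:download?)
```

Because the predicate lives on the class rather than on an instance, it can be checked before any job object exists.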

Instance Method Details

- (undefined) perform

Export the term vectors for a one-document dataset

This job writes out the term vector array to YAML, and will only run on a dataset containing a single document.

Examples:

Start a job for exporting term vectors

Delayed::Job.enqueue Jobs::Analysis::SingleTermVectors.new(
  :user_id => current_user.to_param, 
  :dataset_id => dataset.to_param)
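
Sketch of the YAML step on its own

This is a minimal sketch, not the job's actual data. The structure returned by Document#term_vectors is not shown on this page, so the hash below (terms mapped to a 'tf' count and a 'positions' array) is a hypothetical stand-in. It only illustrates that to_yaml produces text that loads back into an equal structure.

```ruby
require 'yaml'

# Hypothetical term vector data; the real shape of
# Document#term_vectors is an assumption here.
term_vectors = {
  'whale' => { 'tf' => 3, 'positions' => [4, 17, 42] },
  'sea'   => { 'tf' => 1, 'positions' => [9] }
}

# Serialize to a YAML string, as the job does before writing the file.
yaml = term_vectors.to_yaml

# Load it back; safe_load accepts plain hashes, arrays, strings and integers.
restored = YAML.safe_load(yaml)
```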

Returns:

  • (undefined)

Raises:

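  • (ArgumentError): if the user ID or dataset ID is not valid, if the dataset does not contain exactly one entry, or if the document has no term vectors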


# File 'lib/jobs/analysis/single_term_vectors.rb', line 20

def perform
  # Fetch the user based on ID
  user = User.find(user_id)
  raise ArgumentError, 'User ID is not valid' unless user

  # Fetch the dataset based on ID
  dataset = user.datasets.find(dataset_id)
  raise ArgumentError, 'Dataset ID is not valid' unless dataset
  
  # Make sure the dataset has one entry (you shouldn't
  # be able to start this task unless that's true)
  raise ArgumentError, 'Dataset has too many entries' unless dataset.entries.count == 1
  
  # Make a new analysis task
  @task = dataset.analysis_tasks.create(:name => "Term frequency information", :job_type => 'SingleTermVectors')
  
  # Get the document
  doc = Document.find_with_fulltext dataset.entries[0].shasum
  
  # Get the term vectors
  term_vectors = doc.term_vectors
  raise ArgumentError, 'Document does not have any term vectors' unless term_vectors
  
  # Write them out
  @task.result_file = Download.create_file('single_term_vectors.yml') do |file|
    file.write(term_vectors.to_yaml)
    file.close
  end
  
  # Make sure the task is saved, setting 'finished_at'
  @task.finished_at = DateTime.current
  @task.save
end