Class: RDF::NTriples::Writer

Inherits:
Writer
  • Object
Includes:
Util::Logger
Defined in:
lib/rdf/ntriples/writer.rb

Overview

N-Triples serializer.

Output is serialized as UTF-8 by default. To serialize as ASCII with Unicode escapes, pass encoding: Encoding::ASCII as an option to #initialize.

Examples:

Obtaining an NTriples writer class

RDF::Writer.for(:ntriples)     #=> RDF::NTriples::Writer
RDF::Writer.for("etc/test.nt")
RDF::Writer.for(file_name:      "etc/test.nt")
RDF::Writer.for(file_extension: "nt")
RDF::Writer.for(content_type:   "application/n-triples")

Serializing RDF statements into an NTriples file

RDF::NTriples::Writer.open("etc/test.nt") do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

Serializing RDF statements into an NTriples string

RDF::NTriples::Writer.buffer do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

Serializing RDF statements into an NTriples string with escaped UTF-8

RDF::NTriples::Writer.buffer(encoding: Encoding::ASCII) do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

Direct Known Subclasses

RDF::NQuads::Writer

Constant Summary

ESCAPE_PLAIN =
/\A[\x20-\x21\x23-\x26\x28#{Regexp.escape '['}#{Regexp.escape ']'}-\x7E]*\z/m.freeze
ESCAPE_PLAIN_U =
/\A(?:#{Reader::IRI_RANGE}|#{Reader::UCHAR})*\z/.freeze

Constants included from Util::Logger

Util::Logger::IOWrapper

Instance Attribute Summary

Attributes inherited from Writer

#options

Class Method Summary

Instance Method Summary

Methods included from Util::Logger

#log_debug, #log_depth, #log_error, #log_fatal, #log_info, #log_recover, #log_recovering?, #log_statistics, #log_warn, #logger

Methods inherited from Writer

accept?, #base_uri, buffer, #canonicalize?, dump, each, #encoding, #flush, for, format, #format_list, #format_term, #node_id, open, options, #prefix, #prefixes, #prefixes=, #puts, #quoted, #to_sym, to_sym, #uri_for, #validate?, #version, #write_epilogue, #write_statement, #write_triples

Methods included from Util::Aliasing::LateBound

#alias_method

Methods included from Writable

#<<, #insert, #insert_graph, #insert_reader, #insert_statement, #insert_statements, #writable?

Methods included from Util::Coercions

#coerce_statements

Constructor Details

#initialize(output = $stdout, validate: true, **options) {|writer| ... } ⇒ Writer

Initializes the writer.

Parameters:

  • output (IO, File) (defaults to: $stdout)

    the output stream

  • validate (Boolean) (defaults to: true)

    (true) whether to validate terms when serializing

  • options (Hash{Symbol => Object})

    ({}) any additional options. See Writer#initialize

Yields:

  • (writer)

    self

Yield Parameters:

  • writer (RDF::Writer)

Yield Returns:

  • (void)


# File 'lib/rdf/ntriples/writer.rb', line 211

def initialize(output = $stdout, validate: true, **options, &block)
  super
end

Class Method Details

.escape(string, encoding = nil) ⇒ String

Escape Literal and URI content. If encoding is ASCII, all unicode is escaped, otherwise only ASCII characters that must be escaped are escaped.

Parameters:

  • string (String)
  • encoding (Encoding) (defaults to: nil)

Returns:

  • (String)

# File 'lib/rdf/ntriples/writer.rb', line 57

def self.escape(string, encoding = nil)
  ret = case
    when string.match?(ESCAPE_PLAIN) # a shortcut for the simple case
      string
    when string.ascii_only?
      StringIO.open do |buffer|
        buffer.set_encoding(Encoding::ASCII)
        string.each_byte { |u| buffer << escape_ascii(u, encoding) }
        buffer.string
      end
    when encoding && encoding != Encoding::ASCII
      # Not encoding UTF-8 characters
      StringIO.open do |buffer|
        buffer.set_encoding(encoding)
        string.each_char do |u|
          buffer << case u.ord
          when (0x00..0x7F)
            escape_ascii(u, encoding)
          when (0xFFFE..0xFFFF)
            # NOT A CHARACTER
            # @see https://corp.unicode.org/~asmus/proposed_faq/private_use.html#history1
            escape_uchar(u)
          else
            u
          end
        end
        buffer.string
      end
    else
      # Encode ASCII && UTF-8 characters
      StringIO.open do |buffer|
        buffer.set_encoding(Encoding::ASCII)
        string.each_codepoint { |u| buffer << escape_unicode(u, encoding) }
        buffer.string
      end
  end
  encoding ? ret.encode(encoding) : ret
end
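
The difference between the two output modes can be tried without the library. The following is a simplified standalone sketch, assuming only that ASCII mode escapes every non-ASCII character while other encodings keep them; `escape_sketch` and its `ascii:` keyword are hypothetical names, and the special handling of U+FFFE and U+FFFF in the listing above is left out.

```ruby
# Simplified standalone sketch (hypothetical helper, not the RDF.rb API).
def escape_sketch(string, ascii: false)
  # Short escapes used for specific ASCII characters
  echar = { 0x08 => "\\b", 0x09 => "\\t", 0x0A => "\\n", 0x0C => "\\f",
            0x0D => "\\r", 0x22 => "\\\"", 0x5C => "\\\\" }
  string.each_char.map { |ch|
    code = ch.ord
    if echar.key?(code)
      echar[code]
    elsif code.between?(0x20, 0x7E)
      ch                          # printable ASCII passes through
    elsif code > 0x7F && !ascii
      ch                          # non-ASCII output keeps the character as is
    elsif code <= 0xFFFF
      format("\\u%04X", code)     # control characters, DEL, and BMP characters in ASCII mode
    else
      format("\\U%08X", code)     # supplementary-plane characters in ASCII mode
    end
  }.join
end

puts escape_sketch("café\n")               # café\n
puts escape_sketch("café\n", ascii: true)  # caf\u00E9\n
```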

.escape_ascii(u, encoding) ⇒ String

Standard ASCII escape sequences. In the listing below the encoding argument is not consulted: the same sequences are emitted for every encoding, including the \b and \f escapes that the N-Triples recommendation defines. The remaining control characters and DEL are written as \uXXXX escapes.

Within STRING_LITERAL_QUOTE, only the characters U+0022, U+005C, U+000A, U+000D are encoded using ECHAR. ECHAR must not be used for characters that are allowed directly in STRING_LITERAL_QUOTE.

Parameters:

  • u (Integer, #ord)

Returns:

  • (String)

Raises:

  • (ArgumentError)

    if u is not an ASCII codepoint in (0x00..0x7F)

# File 'lib/rdf/ntriples/writer.rb', line 128

def self.escape_ascii(u, encoding)
  case (u = u.ord)
  when (0x08)       then "\\b"
  when (0x09)       then "\\t"
  when (0x0A)       then "\\n"
  when (0x0C)       then "\\f"
  when (0x0D)       then "\\r"
  when (0x22)       then "\\\""
  when (0x5C)       then "\\\\"
  when (0x00..0x1F) then escape_uchar(u)
  when (0x7F)       then escape_uchar(u)  # DEL
  when (0x20..0x7E) then u.chr
  else
    raise ArgumentError.new("expected an ASCII character in (0x00..0x7F), but got 0x#{u.to_s(16)}")
  end
end
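
The mapping above can be exercised in isolation. The following is a minimal standalone sketch that mirrors the listing with a lookup table; `escape_ascii_sketch` is a hypothetical name and the code does not call RDF.rb.

```ruby
# Hypothetical standalone sketch of the ASCII escaping shown above; not the RDF.rb API.
def escape_ascii_sketch(u)
  code = u.ord # Integer#ord returns self, String#ord the first codepoint
  unless code.between?(0x00, 0x7F)
    raise ArgumentError, format("expected an ASCII character, but got 0x%X", code)
  end
  echar = { 0x08 => "\\b", 0x09 => "\\t", 0x0A => "\\n", 0x0C => "\\f",
            0x0D => "\\r", 0x22 => "\\\"", 0x5C => "\\\\" }
  return echar[code] if echar.key?(code)
  # Remaining control characters and DEL fall back to a \uXXXX escape
  return format("\\u%04X", code) if code < 0x20 || code == 0x7F
  code.chr
end

puts escape_ascii_sketch("\n")  # \n (two characters: backslash and n)
puts escape_ascii_sketch("a")   # a
puts escape_ascii_sketch(0x7F)  # \u007F
```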

.escape_uchar(u) ⇒ String

Parameters:

  • u (Integer, #ord)

Returns:

  • (String)

Since:

  • 3.4.4



# File 'lib/rdf/ntriples/writer.rb', line 150

def self.escape_uchar(u)
  case u.ord
  when (0x00..0xFFFF)
    sprintf("\\u%04X", u.ord)
  else
    sprintf("\\U%08X", u.ord)
  end
end
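
The choice between the 4-digit and 8-digit forms can be checked without the library. The following is a minimal standalone sketch that mirrors the listing above; `uchar_sketch` is a hypothetical name, not part of RDF.rb.

```ruby
# Standalone sketch mirroring the listing above (hypothetical helper, not the RDF.rb API).
def uchar_sketch(u)
  code = u.ord # Integer#ord returns self, String#ord the first codepoint
  if code <= 0xFFFF
    format("\\u%04X", code)  # 4 hex digits for the Basic Multilingual Plane
  else
    format("\\U%08X", code)  # 8 hex digits for supplementary planes
  end
end

puts uchar_sketch(0xE9)     # \u00E9
puts uchar_sketch(0x1F600)  # \U0001F600
```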

.escape_unicode(u, encoding) ⇒ String

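Escapes ASCII and Unicode characters. ASCII codepoints are delegated to escape_ascii; all other codepoints are written as UCHAR escapes. Leaving non-ASCII characters unescaped for non-ASCII encodings is decided by the caller, .escape.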

Parameters:

  • u (Integer, #ord)
  • encoding (Encoding)

Returns:

  • (String)

Raises:

  • (ArgumentError)

    if u is not a valid Unicode codepoint

# File 'lib/rdf/ntriples/writer.rb', line 105

def self.escape_unicode(u, encoding)
  case (u = u.ord)
    when (0x00..0x7F)        # ECHAR
      escape_ascii(u, encoding)
    when (0x80..0x10FFFF)    # UCHAR; inclusive, so U+10FFFF is accepted as the error message states
      escape_uchar(u)
    else
      raise ArgumentError.new("expected a Unicode codepoint in (0x00..0x10FFFF), but got 0x#{u.to_s(16)}")
  end
end

.escape_utf16(u) ⇒ String

Deprecated.

Use escape_uchar instead; this name is non-intuitive.

Parameters:

  • u (Integer, #ord)

Returns:

  • (String)

# File 'lib/rdf/ntriples/writer.rb', line 165

def self.escape_utf16(u)
  sprintf("\\u%04X", u.ord)
end

.escape_utf32(u) ⇒ String

Deprecated.

Use escape_uchar instead; this name is non-intuitive.

Parameters:

  • u (Integer, #ord)

Returns:

  • (String)

# File 'lib/rdf/ntriples/writer.rb', line 174

def self.escape_utf32(u)
  sprintf("\\U%08X", u.ord)
end

.serialize(value) ⇒ String

Returns the serialized N-Triples representation of the given RDF value.

Parameters:

  • value (RDF::Statement, RDF::Term)

Returns:

  • (String)

Raises:

  • (ArgumentError)

    if value is not an RDF::Statement or RDF::Term



# File 'lib/rdf/ntriples/writer.rb', line 185

def self.serialize(value)
  writer = (@serialize_writer_memo ||= self.new)
  case value
    when nil then nil
    when FalseClass then value.to_s
    when RDF::Statement
      writer.format_statement(value) + "\n"
    when RDF::Term
      writer.format_term(value)
    else
      raise ArgumentError, "expected an RDF::Statement or RDF::Term, but got #{value.inspect}"
  end
end

Instance Method Details

#format_literal(literal, **options) ⇒ String

Returns the N-Triples representation of a literal.

Parameters:

  • literal (RDF::Literal, String, #to_s)
  • options (Hash{Symbol => Object})

    ({})

Returns:

  • (String)


# File 'lib/rdf/ntriples/writer.rb', line 338

def format_literal(literal, **options)
  case literal
    when RDF::Literal
      # Note, escaping here is more robust than in Term
      text = quoted(escaped(literal.value))
      text << "@#{literal.language}" if literal.language?
      text << "--#{literal.direction}" if literal.direction?
      text << "^^<#{uri_for(literal.datatype)}>" if literal.datatype?
      text
    else
      quoted(escaped(literal.to_s))
  end
end
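
The order in which the suffixes are appended can be illustrated without the library. The following sketch assumes a plain Struct as a hypothetical stand-in for RDF::Literal, treats a nil field as "not set", and omits escaping of the value; `format_literal_sketch` is likewise a hypothetical name.

```ruby
# Hypothetical stand-in for RDF::Literal so the sketch runs without the library.
literal_sketch = Struct.new(:value, :language, :direction, :datatype)

# Assemble the pieces in the same order as the listing above.
def format_literal_sketch(lit)
  text = "\"#{lit.value}\"".dup                    # escaping of the value is omitted here
  text << "@#{lit.language}" if lit.language       # language tag
  text << "--#{lit.direction}" if lit.direction    # base direction
  text << "^^<#{lit.datatype}>" if lit.datatype    # datatype IRI
  text
end

hello = literal_sketch.new("Hello", "en", "ltr", nil)
count = literal_sketch.new("42", nil, nil, "http://www.w3.org/2001/XMLSchema#integer")

puts format_literal_sketch(hello)  # "Hello"@en--ltr
puts format_literal_sketch(count)  # "42"^^<http://www.w3.org/2001/XMLSchema#integer>
```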

#format_node(node, unique_bnodes: false, **options) ⇒ String

Returns the N-Triples representation of a blank node.

Parameters:

  • node (RDF::Node)
  • unique_bnodes (Boolean) (defaults to: false)

    (false) Serialize node using unique identifier, rather than any used to create the node.

  • options (Hash{Symbol => Object})

    ({})

Returns:

  • (String)


# File 'lib/rdf/ntriples/writer.rb', line 285

def format_node(node, unique_bnodes: false, **options)
  unique_bnodes ? node.to_unique_base : node.to_s
end

#format_statement(statement, **options) ⇒ String

Returns the N-Triples representation of a statement.

Parameters:

  • statement (RDF::Statement)
  • options (Hash{Symbol => Object})

    ({})

Returns:

  • (String)


# File 'lib/rdf/ntriples/writer.rb', line 251

def format_statement(statement, **options)
  format_triple(*statement.to_triple, **options)
end

#format_triple(subject, predicate, object, **options) ⇒ String

Returns the N-Triples representation of a triple.

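Parameters:

  • subject (RDF::Resource)
  • predicate (RDF::URI)
  • object (RDF::Term)
  • options (Hash{Symbol => Object})

    ({})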

Returns:

  • (String)


# File 'lib/rdf/ntriples/writer.rb', line 273

def format_triple(subject, predicate, object, **options)
  "%s %s %s ." % [subject, predicate, object].map { |value| format_term(value, **options) }
end

#format_tripleTerm(statement, **options) ⇒ String

Returns the N-Triples representation of an RDF 1.2 triple term.

Parameters:

  • statement (RDF::Statement)
  • options (Hash{Symbol => Object})

    ({})

Returns:

  • (String)


# File 'lib/rdf/ntriples/writer.rb', line 261

def format_tripleTerm(statement, **options)
  "<<( %s %s %s )>>" % statement.to_a.map { |value| format_term(value, **options) }
end
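
The two format strings used by #format_triple and #format_tripleTerm can be compared side by side. The following standalone sketch assumes the terms are already formatted N-Triples strings; the helper names and the example.org IRIs are hypothetical.

```ruby
# Sketch only: each argument is assumed to be an already formatted term.
def triple_term_sketch(subject, predicate, object)
  "<<( %s %s %s )>>" % [subject, predicate, object]
end

def triple_sketch(subject, predicate, object)
  "%s %s %s ." % [subject, predicate, object]
end

# A triple term nested as the object of an ordinary triple.
inner = triple_term_sketch("<http://example.org/a>", "<http://example.org/b>", '"c"')
puts triple_sketch("_:x", "<http://example.org/says>", inner)
# _:x <http://example.org/says> <<( <http://example.org/a> <http://example.org/b> "c" )>> .
```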

#format_uri(uri, **options) ⇒ String

Returns the N-Triples representation of a URI reference using write encoding.

Parameters:

  • uri (RDF::URI)
  • options (Hash{Symbol => Object})

    ({})

Returns:

  • (String)


# File 'lib/rdf/ntriples/writer.rb', line 295

def format_uri(uri, **options)
  string = uri.to_s
  iriref = case
    when string.match?(ESCAPE_PLAIN_U) # a shortcut for the simple case
      string
    when string.ascii_only? || (encoding && encoding != Encoding::ASCII)
      StringIO.open do |buffer|
        buffer.set_encoding(encoding)
        string.each_char do |u|
          buffer << case u.ord
            when (0x00..0x20) then self.class.escape_uchar(u)
            when 0x22, 0x3c, 0x3e, 0x5c, 0x5e, 0x60, 0x7b, 0x7c, 0x7d # "<>\^`{|}
              self.class.escape_uchar(u)
            else u
          end
        end
        buffer.string
      end
    else
      # Encode ASCII && UTF-8/16 characters
      StringIO.open do |buffer|
        buffer.set_encoding(Encoding::ASCII)
        # Iterate over codepoints rather than bytes so that a multi-byte
        # character is escaped as a single UCHAR
        string.each_codepoint do |u|
          buffer << case u
            when (0x00..0x20) then self.class.escape_uchar(u)
            when 0x22, 0x3c, 0x3e, 0x5c, 0x5e, 0x60, 0x7b, 0x7c, 0x7d # "<>\^`{|}
              self.class.escape_uchar(u)
            when (0x80..0x10FFFF) then self.class.escape_uchar(u)
            else u.chr # plain ASCII: append the character, not the integer
          end
        end
        buffer.string
      end
  end
  encoding ? "<#{iriref}>".encode(encoding) : "<#{iriref}>"
end
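
The IRI escaping rule can be tried on its own. The following is a simplified standalone sketch, assuming that characters up to U+0020 and the listed punctuation are always escaped and that non-ASCII characters are escaped only for ASCII output; `format_iri_sketch` and its `ascii:` keyword are hypothetical names.

```ruby
# Hypothetical sketch of the IRI escaping rule above; not the RDF.rb API.
def format_iri_sketch(iri, ascii: false)
  forbidden = [0x22, 0x3C, 0x3E, 0x5C, 0x5E, 0x60, 0x7B, 0x7C, 0x7D] # "<>\^`{|}
  body = iri.each_codepoint.map { |cp|
    needs_escape = cp <= 0x20 || forbidden.include?(cp) || (ascii && cp >= 0x80)
    if !needs_escape
      [cp].pack("U")            # keep the character unchanged
    elsif cp <= 0xFFFF
      format("\\u%04X", cp)
    else
      format("\\U%08X", cp)
    end
  }.join
  "<#{body}>"
end

puts format_iri_sketch("http://example.org/a b")             # <http://example.org/a\u0020b>
puts format_iri_sketch("http://example.org/é", ascii: true)  # <http://example.org/\u00E9>
```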

#write_comment(text)

This method returns an undefined value.

Outputs an N-Triples comment line.

Parameters:

  • text (String)


# File 'lib/rdf/ntriples/writer.rb', line 230

def write_comment(text)
  puts "# #{text.chomp}" # TODO: correctly output multi-line comments
end
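
As the TODO notes, the listing prefixes only the first line, so text with embedded newlines would leave later lines without a leading "#". One possible approach, sketched here as a standalone helper and not as the library's behavior, is to prefix every line; `write_comment_sketch` and its `io` parameter are hypothetical.

```ruby
require "stringio"

# Hypothetical helper: prefix every line of the text with "# " so that no
# line of a multi-line comment is left unmarked.
def write_comment_sketch(io, text)
  text.each_line(chomp: true) { |line| io.puts "# #{line}" }
end

out = StringIO.new
write_comment_sketch(out, "first line\nsecond line\n")
puts out.string
# # first line
# # second line
```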

#write_prologue ⇒ self

This method is abstract.

Outputs the VERSION directive if a version is specified and the writer is not canonicalizing.

Returns:

  • (self)


# File 'lib/rdf/ntriples/writer.rb', line 219

def write_prologue
  puts %(VERSION #{version.inspect}) if version && !canonicalize?
  @logged_errors_at_prolog = log_statistics[:error].to_i
  super
end
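
The directive line is built with String#inspect, which wraps the version in double quotes. A minimal standalone sketch of that condition and format follows; `version_directive_sketch` and its `canonicalize:` keyword are hypothetical, and "1.2" is only an example value.

```ruby
# Hypothetical sketch of the VERSION line built in the listing above.
def version_directive_sketch(version, canonicalize: false)
  return nil if version.nil? || canonicalize  # nothing is written in these cases
  %(VERSION #{version.inspect})
end

puts version_directive_sketch("1.2")  # VERSION "1.2"
```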

#write_triple(subject, predicate, object)

This method returns an undefined value.

Outputs the N-Triples representation of a triple.

Parameters:

  • subject (RDF::Resource)
  • predicate (RDF::URI)
  • object (RDF::Term)



# File 'lib/rdf/ntriples/writer.rb', line 241

def write_triple(subject, predicate, object)
  puts format_triple(subject, predicate, object, **@options)
end