Module: MultiXML Private

Extended by:
Helpers, Options, ParseSupport, ParserResolution
Defined in:
lib/multi_xml.rb,
lib/multi_xml/errors.rb,
lib/multi_xml/parser.rb,
lib/multi_xml/helpers.rb,
lib/multi_xml/options.rb,
lib/multi_xml/version.rb,
lib/multi_xml/constants.rb,
lib/multi_xml/file_like.rb,
lib/multi_xml/deprecated.rb,
lib/multi_xml/parsers/ox.rb,
lib/multi_xml/concurrency.rb,
lib/multi_xml/parsers/oga.rb,
lib/multi_xml/parse_support.rb,
lib/multi_xml/parsers/rexml.rb,
lib/multi_xml/parsers/libxml.rb,
lib/multi_xml/parsers/nokogiri.rb,
lib/multi_xml/parser_resolution.rb,
lib/multi_xml/parsers/dom_parser.rb,
lib/multi_xml/parsers/libxml_sax.rb,
lib/multi_xml/parsers/sax_handler.rb,
lib/multi_xml/parsers/nokogiri_sax.rb,
lib/multi_xml/options_normalization.rb,
sig/multi_xml.rbs

Overview

This module is part of a private API. You should avoid using this module if possible, as it may be removed or be changed in the future.

Deprecated public API kept around for one major release

Each method here emits a one-time deprecation warning on first call and delegates to its current-API counterpart. The whole file is loaded by MultiXML so the deprecation surface stays out of the main module definition.

Defined Under Namespace

Modules: Concurrency, FileLike, Helpers, Options, OptionsNormalization, ParseSupport, Parser, ParserResolution, Parsers, _Parser
Classes: DisallowedTypeError, FileIO, NoParserError, ParseError, ParserLoadError

Constant Summary

DEPRECATION_WARNINGS_SHOWN =

Tracks which deprecation warnings have already been emitted so each one fires at most once per process. Stored as a Set rather than a Hash so presence checks have unambiguous semantics for mutation tests.

Returns:

  • (Set[Symbol])
Set.new
VERSION =

The current version of MultiXML

Returns:

  • (Gem::Version)

    the gem version

Gem::Version.create("0.9.1")
TEXT_CONTENT_KEY =

Hash key for storing text content within element hashes

Examples:

Accessing text content

result = MultiXML.parse('<name>John</name>')
result["name"] #=> "John" (simplified, but internally uses __content__)

Returns:

  • (String)

    the key "__content__" used for text content

"__content__".freeze
RUBY_TYPE_TO_XML =

Maps Ruby class names to XML type attribute values

Examples:

Check XML type for a Ruby class

RUBY_TYPE_TO_XML["Integer"] #=> "integer"

Returns:

  • (Hash{String => String})

    mapping of Ruby class names to XML types

{
  "Symbol" => "symbol",
  "Integer" => "integer",
  "BigDecimal" => "decimal",
  "Float" => "float",
  "TrueClass" => "boolean",
  "FalseClass" => "boolean",
  "Date" => "date",
  "DateTime" => "datetime",
  "Time" => "datetime",
  "Array" => "array",
  "Hash" => "hash"
}.freeze
DISALLOWED_TYPES =

XML type attributes disallowed by default for security

These types are blocked to prevent code execution vulnerabilities.

Examples:

Check default disallowed types

DISALLOWED_TYPES #=> ["symbol", "yaml"]

Returns:

  • (Array<String>)

    list of disallowed type names

%w[symbol yaml].freeze
FALSE_BOOLEAN_VALUES =

Values that represent false in XML boolean attributes

Examples:

Check false values

FALSE_BOOLEAN_VALUES.include?("0") #=> true

Returns:

  • (Set<String>)

    values considered false

Set.new(%w[0 false]).freeze
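
A minimal sketch of how this set drives the "boolean" entry listed under TYPE_CONVERTERS. The names FALSE_VALUES and TO_BOOLEAN are local stand-ins introduced for illustration so the snippet runs without the gem.

```ruby
require "set"

# Local copy of the constant shown above.
FALSE_VALUES = Set.new(%w[0 false]).freeze

# Same shape as the "boolean" converter: anything not in the set is true.
TO_BOOLEAN = ->(s) { !FALSE_VALUES.include?(s.strip) }

TO_BOOLEAN.call(" 0 ")   # false, surrounding whitespace is stripped first
TO_BOOLEAN.call("false") # false
TO_BOOLEAN.call("False") # true, the membership check is case-sensitive
TO_BOOLEAN.call("1")     # true
```

Note that this lambda treats every string outside the set as true, so "False" and "no" both come back as true.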
NAMESPACE_MODES =

Supported values for the :namespaces parse option

Examples:

Parse with namespace preservation

MultiXML.parse(xml, namespaces: :preserve)

Returns:

  • (Array<Symbol>)

    the valid namespace handling modes

%i[strip preserve].freeze
DEFAULT_OPTIONS =

Default parsing options

Examples:

View defaults

DEFAULT_OPTIONS[:symbolize_names] #=> false

Returns:

  • (Hash)

    default options for parse method

{
  typecast_xml_value: true,
  disallowed_types: DISALLOWED_TYPES,
  symbolize_names: false,
  namespaces: :strip
}.freeze
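
A short sketch of the precedence that results when defaults are merged with global and call-site options, as .parse does with DEFAULT_OPTIONS.merge(global, call_site). The constants below are illustrative stand-ins, not the library's own.

```ruby
DEFAULTS = {
  typecast_xml_value: true,
  disallowed_types: %w[symbol yaml].freeze,
  symbolize_names: false,
  namespaces: :strip
}.freeze

# Hypothetical global configuration and per-call options.
GLOBAL_OPTIONS = {symbolize_names: true}.freeze
CALL_SITE_OPTIONS = {namespaces: :preserve, symbolize_names: false}.freeze

# Hash#merge applies its arguments left to right, so later hashes win.
EFFECTIVE = DEFAULTS.merge(GLOBAL_OPTIONS, CALL_SITE_OPTIONS)

EFFECTIVE[:symbolize_names]    # false, the call site overrides the global setting
EFFECTIVE[:namespaces]         # :preserve
EFFECTIVE[:typecast_xml_value] # true, untouched default
```

Because merge returns a new Hash, the frozen defaults are never mutated.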
PARSER_PREFERENCE =

This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.

Array of [library_name, parser_symbol] pairs

Returns:

  • (Array[Array[String | Symbol]])
if RUBY_ENGINE == "truffleruby"
  [
    ["ox", :ox],
    ["rexml/document", :rexml],
    ["libxml-ruby", :libxml],
    ["oga", :oga],
    ["nokogiri", :nokogiri]
  ].freeze
else
  [
    ["ox", :ox],
    ["libxml-ruby", :libxml],
    ["nokogiri", :nokogiri],
    ["oga", :oga],
    ["rexml/document", :rexml]
  ].freeze
end
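
A sketch of how a preference list like this can be walked, under the assumption that the first library that loads wins. The helper names and the two-entry list are hypothetical; the real logic lives in .find_available_parser and .try_require, whose bodies are not shown on this page.

```ruby
# Hypothetical list: the first library does not exist, the second ships with Ruby.
PREFERENCE_SKETCH = [
  ["surely_not_an_installed_library", :missing],
  ["stringio", :stringio]
].freeze

# Returns true when the library loads, false when it is not installed.
def try_require_sketch(library)
  require library.to_s
  true
rescue LoadError
  false
end

# Walk the list in order and return the symbol of the first loadable library.
def first_available_sketch(preference)
  preference.each do |library, parser_name|
    return parser_name if try_require_sketch(library)
  end
  nil
end

first_available_sketch(PREFERENCE_SKETCH) # :stringio
```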
PARSE_DATETIME =

This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.

Parses datetime strings, trying Time first then DateTime

Returns:

  • (Proc)

    lambda that parses datetime strings

lambda do |string|
  Time.parse(string).utc
rescue ArgumentError
  begin
    DateTime.parse(string).to_time.utc
  rescue ArgumentError, NoMethodError
    MultiXML.send(:parse_iso_week_datetime, string)
  end
end
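
The fallback chain above can be exercised in isolation. In this sketch the last step is replaced by a stand-in lambda that returns nil, because parse_iso_week_datetime is a private helper whose behavior is not documented here.

```ruby
require "time"
require "date"

# Stand-in for the private ISO week fallback (assumption: returns nil here).
LAST_RESORT_SKETCH = ->(_string) {}

PARSE_DATETIME_SKETCH = lambda do |string|
  Time.parse(string).utc
rescue ArgumentError
  begin
    DateTime.parse(string).to_time.utc
  rescue ArgumentError, NoMethodError
    LAST_RESORT_SKETCH.call(string)
  end
end

PARSE_DATETIME_SKETCH.call("2020-01-01T12:00:00+02:00") # 2020-01-01 10:00:00 UTC
PARSE_DATETIME_SKETCH.call("???")                       # nil in this sketch
```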
ISO_WEEK_DATE =

This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.

Regex matching ISO week dates like YYYY-Www or YYYY-Www-d.

Returns:

  • (Regexp)
/\A(?<year>\d{4})-W(?<week>\d{2})(?:-(?<day>\d))?\z/
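
An illustrative use of the pattern with Date.commercial. The helper name and the choice to default a missing day to Monday are assumptions made for this sketch; the library's own conversion is not shown on this page.

```ruby
require "date"

ISO_WEEK_SKETCH = /\A(?<year>\d{4})-W(?<week>\d{2})(?:-(?<day>\d))?\z/

# Hypothetical helper: build a Date from the named captures,
# using day 1 (Monday) when the day part is absent.
def iso_week_to_date_sketch(string)
  match = ISO_WEEK_SKETCH.match(string)
  return nil unless match

  Date.commercial(match[:year].to_i, match[:week].to_i, (match[:day] || 1).to_i)
end

iso_week_to_date_sketch("2024-W01")   # 2024-01-01, Monday of ISO week 1
iso_week_to_date_sketch("2024-W01-3") # 2024-01-03
iso_week_to_date_sketch("2024-01-01") # nil, not a week date
```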
FILE_CONVERTER =

This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.

Lambda that creates a file-like StringIO from base64 content. Uses untyped for content because unpack1 returns various types. Uses untyped for entity because hash values are xmlValue but only specific String keys are accessed.

Returns:

  • (^(untyped, untyped) -> StringIO)
lambda do |content, entity|
  StringIO.new(content.unpack1("m")).tap do |io|
    io.extend(FileLike)
    file_io = io # : FileIO
    file_io.original_filename = entity["name"]
    file_io.content_type = entity["content_type"]
  end
end
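
A runnable approximation of the converter. FileLikeSketch is a hypothetical stand-in for FileLike that provides only the two accessors the lambda assigns; the real module may do more.

```ruby
require "stringio"

# Hypothetical stand-in for MultiXML::FileLike.
module FileLikeSketch
  attr_accessor :original_filename, :content_type
end

FILE_CONVERTER_SKETCH = lambda do |content, entity|
  # unpack1("m") decodes base64 into the raw bytes.
  StringIO.new(content.unpack1("m")).tap do |io|
    io.extend(FileLikeSketch)
    io.original_filename = entity["name"]
    io.content_type = entity["content_type"]
  end
end

UPLOAD_SKETCH = FILE_CONVERTER_SKETCH.call(
  "aGVsbG8=",
  {"name" => "hi.txt", "content_type" => "text/plain"}
)

UPLOAD_SKETCH.string            # "hello"
UPLOAD_SKETCH.original_filename # "hi.txt"
UPLOAD_SKETCH.content_type      # "text/plain"
```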
TYPE_CONVERTERS =

This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.

Type converters keyed by XML type attribute string. Uses an untyped key because the hash returns xmlValue, and Hash#[] with a non-String key returns nil.

Returns:

  • (Hash[untyped, Proc | Method])
{
  "symbol" => ->(s) { s.to_sym },
  "string" => :to_s.to_proc,
  "integer" => :to_i.to_proc,
  "float" => :to_f.to_proc,
  "double" => :to_f.to_proc,
  "decimal" => ->(s) { BigDecimal(s) },
  "boolean" => ->(s) { !FALSE_BOOLEAN_VALUES.include?(s.strip) },
  "date" => Date.method(:parse),
  "datetime" => PARSE_DATETIME,
  "dateTime" => PARSE_DATETIME,
  "base64Binary" => ->(s) { s.unpack1("m") },
  "binary" => ->(s, entity) { (entity["encoding"] == "base64") ? s.unpack1("m") : s },
  "file" => FILE_CONVERTER,
  "yaml" => lambda do |string|
    YAML.safe_load(string, permitted_classes: [Symbol, Date, Time])
  rescue ArgumentError, Psych::SyntaxError
    string
  end
}.freeze
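
Most converters take one argument, while "binary" and "file" also receive the entity hash. One way a caller could handle both shapes is to dispatch on arity. This is only a sketch under that assumption; the library's own dispatch is in Helpers.apply_converter and is not shown here. The table below is a reduced copy of the one above.

```ruby
CONVERTERS_SKETCH = {
  "integer" => :to_i.to_proc,
  "float" => :to_f.to_proc,
  "binary" => ->(s, entity) { (entity["encoding"] == "base64") ? s.unpack1("m") : s }
}.freeze

# Hypothetical dispatcher: pass the entity only to two-argument converters
# and return the raw value when the type is unknown.
def convert_sketch(type, value, entity = {})
  converter = CONVERTERS_SKETCH[type]
  return value unless converter

  (converter.arity == 2) ? converter.call(value, entity) : converter.call(value)
end

convert_sketch("integer", "42")                               # 42
convert_sketch("binary", "aGVsbG8=", {"encoding" => "base64"}) # "hello"
convert_sketch("binary", "raw")                               # "raw"
convert_sketch("unknown", "x")                                # "x"
```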

Constants included from ParserResolution

ParserResolution::LOADED_PARSER_CHECKS

Constants included from Options

Options::EMPTY_OPTIONS

Class Method Summary

Methods included from Helpers

apply_converter, convert_hash, convert_text_content, disallowed_type?, empty_value?, extract_array_entries, find_array_entries, self?.apply_converter, self?.convert_hash, self?.convert_text_content, self?.disallowed_type?, self?.empty_value?, self?.extract_array_entries, self?.find_array_entries, self?.symbolize_keys, self?.transform_keys, self?.typecast_array, self?.typecast_children, self?.typecast_hash, self?.typecast_xml_value, self?.undasherize_keys, self?.unwrap_file_if_present, self?.unwrap_if_simple, self?.wrap_and_typecast, symbolize_keys, transform_keys, typecast_array, typecast_children, typecast_hash, typecast_xml_value, undasherize_keys, unwrap_file_if_present, unwrap_if_simple, wrap_and_typecast

Methods included from Options

parse_options, parse_options=

Class Method Details

.apply_postprocessing ⇒ xmlHash

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Apply typecasting and key-symbolization as configured

Parameters:

  • result (xmlHash)
  • options (Hash[Symbol, untyped])

Returns:

  • (xmlHash)


# File 'sig/multi_xml.rbs', line 164

def self.apply_postprocessing: (xmlHash result, Hash[Symbol, untyped] options) -> xmlHash

.camelize ⇒ String

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Convert snake_case to CamelCase

Parameters:

  • name (String)

Returns:

  • (String)


# File 'sig/multi_xml.rbs', line 125

def self.camelize: (String name) -> String
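
Only the signature is documented, so the following is a guess at the simplest conversion that matches the description. The method name is hypothetical, and the real implementation may treat acronyms or edge cases differently.

```ruby
# Hypothetical snake_case to CamelCase conversion.
def camelize_sketch(name)
  name.split("_").map(&:capitalize).join
end

camelize_sketch("nokogiri_sax") # "NokogiriSax"
camelize_sketch("rexml")        # "Rexml"
```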

.detect_parser ⇒ Symbol, String

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Detect the best available parser

Returns:

  • (Symbol, String)


# File 'sig/multi_xml.rbs', line 128

def self.detect_parser: () -> (Symbol | String)

.find_available_parser ⇒ String, ...

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Try to find an available parser by requiring libraries

Returns:

  • (String, Symbol, nil)


# File 'sig/multi_xml.rbs', line 134

def self.find_available_parser: () -> (String | Symbol | nil)

.find_loaded_parser ⇒ Symbol?

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Find an already-loaded parser library

Returns:

  • (Symbol, nil)


# File 'sig/multi_xml.rbs', line 131

def self.find_loaded_parser: () -> Symbol?

.load ⇒ xmlHash

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Deprecated alias for parse

Parameters:

  • xml (String, StringIO)
  • options (Hash[Symbol, untyped])

Returns:

  • (xmlHash)


# File 'sig/multi_xml.rbs', line 112

def self.load: (String | StringIO xml, ?Hash[Symbol, untyped] options) -> xmlHash

.load_parser ⇒ Module

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Load a parser module by name

Parameters:

  • name (Symbol, String)

Returns:

  • (Module)


# File 'sig/multi_xml.rbs', line 122

def self.load_parser: (Symbol | String name) -> Module

.normalize_input ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Convert a String to a StringIO and pass IO-like objects through. Uses respond_to?(:read) duck typing: the input is returned unchanged if it is IO-like.

Parameters:

  • xml (String, StringIO)

Returns:

  • (Object)


# File 'sig/multi_xml.rbs', line 145

def self.normalize_input: (String | StringIO xml) -> untyped
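
A sketch of the duck-typed normalization described above, written from the description rather than from the source. The method name is hypothetical.

```ruby
require "stringio"

# Wrap Strings in a StringIO; return anything that responds to #read as-is.
def normalize_input_sketch(xml)
  xml.respond_to?(:read) ? xml : StringIO.new(xml)
end

io = StringIO.new("<a/>")
normalize_input_sketch(io).equal?(io) # true, IO-like input passes through
normalize_input_sketch("<a/>").read   # "<a/>"
normalize_input_sketch("").eof?       # true
```

The last line shows why .parse can return {} early for empty input: an empty StringIO is already at end of file.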

.parse(xml, options = {}) ⇒ xmlHash

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Public API: Parse XML into a Ruby Hash. Uses untyped for options because values vary by key (:parser, :symbolize_keys, :disallowed_types, :typecast_xml_value).

Parameters:

  • xml (String, StringIO)
  • options (Hash[Symbol, untyped]) (defaults to: {})

Returns:

  • (xmlHash)


# File 'lib/multi_xml.rb', line 119

def parse(xml, options = {})
  call_site = OptionsNormalization.normalize_symbolize_option(options)
  global = OptionsNormalization.normalize_symbolize_option(parse_options(call_site))
  options = DEFAULT_OPTIONS.merge(global, call_site)
  namespaces = validate_namespaces_mode(options.fetch(:namespaces))
  io = normalize_input(xml)
  return {} if io.eof?

  result = parse_with_error_handling(io, xml, resolve_parse_parser(options), namespaces)
  apply_postprocessing(result, options)
end

.parse_with_error_handling ⇒ xmlHash

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Parse with error handling and key normalization. xml_parser implements the _Parser interface; original_input uses respond_to? duck typing.

Parameters:

  • io (IO, StringIO)
  • original_input (Object)
  • xml_parser (Object)
  • namespaces (Symbol)

Returns:

  • (xmlHash)


# File 'sig/multi_xml.rbs', line 149

def self.parse_with_error_handling: ((IO | StringIO) io, untyped original_input, untyped xml_parser, Symbol namespaces) -> xmlHash

.parse_with_namespaces_compatibility ⇒ xmlHash?

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Dispatch to parsers that may or may not accept namespaces.

Parameters:

  • io (IO, StringIO)
  • xml_parser (Object)
  • namespaces (Symbol)

Returns:

  • (xmlHash, nil)


# File 'sig/multi_xml.rbs', line 152

def self.parse_with_namespaces_compatibility: ((IO | StringIO) io, untyped xml_parser, Symbol namespaces) -> xmlHash?

.parser ⇒ Module

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Public API: Get the current XML parser module

Returns:

  • (Module)


# File 'lib/multi_xml.rb', line 77

def parser
  override = Fiber[:multi_xml_parser]
  return override if override

  @parser ||= resolve_parser(detect_parser)
end

.parser=(new_parser) ⇒ Module

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Public API: Set the XML parser to use

Parameters:

  • new_parser (Symbol, String, Module)

Returns:

  • (Module)


# File 'lib/multi_xml.rb', line 96

def parser=(new_parser)
  @parser = resolve_parser(new_parser)
end

.parser_supports_namespaces_keyword? ⇒ Boolean

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Check whether a parser accepts the namespaces: keyword

Parameters:

  • xml_parser (Object)

Returns:

  • (Boolean)


# File 'sig/multi_xml.rbs', line 161

def self.parser_supports_namespaces_keyword?: (untyped xml_parser) -> bool
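
One plausible way to perform such a check is to inspect Method#parameters. The sketch below assumes parsers expose a parse class method, which this page does not confirm; both parser modules and the helper name are hypothetical.

```ruby
# Hypothetical parser without the keyword.
module LegacyParserSketch
  def self.parse(io)
    {}
  end
end

# Hypothetical parser that accepts namespaces:.
module ModernParserSketch
  def self.parse(io, namespaces: :strip)
    {}
  end
end

# True if parse declares a namespaces keyword or accepts arbitrary keywords.
def accepts_namespaces_keyword_sketch?(parser)
  parser.method(:parse).parameters.any? do |kind, name|
    kind == :keyrest || (%i[key keyreq].include?(kind) && name == :namespaces)
  end
end

accepts_namespaces_keyword_sketch?(LegacyParserSketch) # false
accepts_namespaces_keyword_sketch?(ModernParserSketch) # true
```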

.raise_no_parser_error ⇒ bot

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Raise NoParserError - never returns

Returns:

  • (bot)


# File 'sig/multi_xml.rbs', line 141

def self.raise_no_parser_error: () -> bot

.resolve_parse_parser ⇒ Module

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Resolve which parser this parse call should use, honoring the :parser option

Parameters:

  • options (Hash[Symbol, untyped])

Returns:

  • (Module)


# File 'sig/multi_xml.rbs', line 158

def self.resolve_parse_parser: (Hash[Symbol, untyped] options) -> Module

.resolve_parser ⇒ Module

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Resolve a parser specification (Symbol, String, or Module) to a parser

Parameters:

  • spec (Symbol, String, Module)

Returns:

  • (Module)


# File 'sig/multi_xml.rbs', line 116

def self.resolve_parser: (Symbol | String | Module spec) -> Module

.try_require ⇒ Boolean

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Attempt to require a library, returning success or failure. Kernel#require accepts a String; library may be a Symbol from PARSER_PREFERENCE (coerced at runtime).

Parameters:

  • library (Object)

Returns:

  • (Boolean)


# File 'sig/multi_xml.rbs', line 138

def self.try_require: (untyped library) -> bool

.validate_namespaces_mode ⇒ Symbol

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Validate the :namespaces option value

Parameters:

  • mode (Object)

Returns:

  • (Symbol)


# File 'sig/multi_xml.rbs', line 155

def self.validate_namespaces_mode: (untyped mode) -> Symbol
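
A sketch of validation against the NAMESPACE_MODES list. The error class and message are assumptions; the page documents only that a Symbol is returned.

```ruby
NAMESPACE_MODES_SKETCH = %i[strip preserve].freeze

# Return the mode when it is supported; otherwise raise (ArgumentError is assumed).
def validate_mode_sketch(mode)
  return mode if NAMESPACE_MODES_SKETCH.include?(mode)

  raise ArgumentError,
    "invalid :namespaces mode #{mode.inspect}, expected one of #{NAMESPACE_MODES_SKETCH.inspect}"
end

validate_mode_sketch(:preserve) # :preserve
# validate_mode_sketch("strip") would raise: a String is not one of the Symbols
```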

.validate_parser! ⇒ Module

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Validate that a parser satisfies the contract

Parameters:

  • parser (Module)

Returns:

  • (Module)


# File 'sig/multi_xml.rbs', line 119

def self.validate_parser!: (Module parser) -> Module

.warn_deprecation_once(key, message) ⇒ void

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

This method returns an undefined value.

Emit a deprecation warning at most once per process

Parameters:

  • key (Symbol)
  • message (String)


# File 'lib/multi_xml.rb', line 50

def self.warn_deprecation_once(key, message)
  Concurrency.synchronize(:deprecation_warnings) do
    return if DEPRECATION_WARNINGS_SHOWN.include?(key)

    Kernel.warn(message, category: :deprecated)
    DEPRECATION_WARNINGS_SHOWN.add(key)
  end
end
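
The same once-only pattern can be tried outside the gem. Here a plain Mutex stands in for Concurrency.synchronize(:deprecation_warnings), and the true/false return values are added purely so the behavior is observable; the original method returns no meaningful value.

```ruby
require "set"

SHOWN_SKETCH = Set.new
SHOWN_LOCK_SKETCH = Mutex.new # stand-in for Concurrency.synchronize

# Warn the first time a key is seen; return whether a warning was emitted.
def warn_once_sketch(key, message)
  SHOWN_LOCK_SKETCH.synchronize do
    return false if SHOWN_SKETCH.include?(key)

    Kernel.warn(message, category: :deprecated)
    SHOWN_SKETCH.add(key)
    true
  end
end

warn_once_sketch(:load, "load is deprecated, use parse") # true
warn_once_sketch(:load, "load is deprecated, use parse") # false, already shown
```

Warnings in the :deprecated category are hidden by default in current Ruby versions; set Warning[:deprecated] = true or run with -W:deprecated to see them.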

.with_parser(new_parser) ⇒ void

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

This method returns an undefined value.

Public API: Execute a block with a temporarily-swapped parser



# File 'lib/multi_xml.rb', line 145

def self.with_parser(new_parser)
  previous_override = Fiber[:multi_xml_parser]
  Fiber[:multi_xml_parser] = resolve_parser(new_parser)
  yield
ensure
  Fiber[:multi_xml_parser] = previous_override
end
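
A sketch of the save, override, restore pattern used above. The source stores the override in Fiber[] (fiber storage, Ruby 3.2+); this sketch substitutes Thread.current[], which is fiber-local but not inherited by child fibers, so that it also runs on older Rubies. All names are hypothetical.

```ruby
OVERRIDE_KEY_SKETCH = :parser_override_sketch

# Return the active override, or the given default when none is set.
def current_parser_sketch(default)
  Thread.current[OVERRIDE_KEY_SKETCH] || default
end

# Swap the override for the duration of the block, restoring it even on error.
def with_parser_sketch(new_parser)
  previous = Thread.current[OVERRIDE_KEY_SKETCH]
  Thread.current[OVERRIDE_KEY_SKETCH] = new_parser
  yield
ensure
  Thread.current[OVERRIDE_KEY_SKETCH] = previous
end

with_parser_sketch(:rexml) do
  current_parser_sketch(:default)   # :rexml
  with_parser_sketch(:ox) do
    current_parser_sketch(:default) # :ox, overrides nest
  end
  current_parser_sketch(:default)   # :rexml again
end
current_parser_sketch(:default)     # :default
```

Restoring the previous value in ensure, rather than clearing it, is what makes nested calls behave correctly.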