class Nokogiri::HTML4::SAX::Parser

Parent:
Nokogiri::XML::SAX::Parser

💡 Nokogiri::HTML::SAX::Parser is an alias for this class (Nokogiri::HTML4::SAX::Parser) as of v1.12.0.

This class lets you perform SAX-style (event-driven) parsing on HTML, with HTML error correction applied to malformed markup.

Here is a basic usage example:

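require 'nokogiri'

# Handler whose callbacks fire as the parser encounters markup.
class MyDoc < Nokogiri::XML::SAX::Document
  # Called once for each opening tag.
  def start_element(name, attributes = [])
    puts "found a #{name}"
  end
end

# Parse the file named by the first command-line argument, read as binary.
parser = Nokogiri::HTML4::SAX::Parser.new(MyDoc.new)
parser.parse(File.read(ARGV[0], mode: 'rb'))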

For more information on SAX parsers, see Nokogiri::XML::SAX.

Public Instance Methods

parse_file(filename, encoding = "UTF-8") { |ctx| ... }
# File lib/nokogiri/html4/sax/parser.rb, line 50
def parse_file(filename, encoding = "UTF-8")
  raise ArgumentError unless filename
  raise Errno::ENOENT unless File.exist?(filename)
  raise Errno::EISDIR if File.directory?(filename)
  ctx = ParserContext.file(filename, encoding)
  yield ctx if block_given?
  ctx.parse_with(self)
end

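Parse the HTML file at filename using encoding. Raises ArgumentError if filename is nil, Errno::ENOENT if the file does not exist, and Errno::EISDIR if the path is a directory. If a block is given, it receives the ParserContext before parsing starts.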

parse_io(io, encoding = "UTF-8") { |ctx| ... }
# File lib/nokogiri/html4/sax/parser.rb, line 40
def parse_io(io, encoding = "UTF-8")
  check_encoding(encoding)
  @encoding = encoding
  ctx = ParserContext.io(io, ENCODINGS[encoding])
  yield ctx if block_given?
  ctx.parse_with(self)
end

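Parse HTML read from io using encoding. The encoding is validated with check_encoding and looked up in ENCODINGS before the ParserContext is created. If a block is given, it receives the ParserContext before parsing starts.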

parse_memory(data, encoding = "UTF-8") { |ctx| ... }
# File lib/nokogiri/html4/sax/parser.rb, line 30
def parse_memory(data, encoding = "UTF-8")
  raise ArgumentError unless data
  return if data.empty?
  ctx = ParserContext.memory(data, encoding)
  yield ctx if block_given?
  ctx.parse_with(self)
end

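Parse HTML stored in the string data using encoding. Raises ArgumentError if data is nil, and returns nil without parsing if data is empty. If a block is given, it receives the ParserContext before parsing starts.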

© 2008–2018 Aaron Patterson, Mike Dalessio, Charles Nutter, Sergio Arbeo,
Patrick Mahoney, Yoko Harada, Akinori MUSHA, John Shahid, Lars Kanis
Licensed under the MIT License.
https://nokogiri.org/rdoc/Nokogiri/HTML4/SAX/Parser.html