class Nokogiri::HTML::SAX::Parser


This class lets you perform SAX-style parsing on HTML, with HTML error correction.

Here is a basic usage example:

class MyDoc < Nokogiri::XML::SAX::Document
  def start_element name, attributes = []
    puts "found a #{name}"
  end
end

parser = Nokogiri::HTML::SAX::Parser.new(MyDoc.new)
parser.parse(File.read(ARGV[0], mode: 'rb'))

For more information on SAX parsers, see Nokogiri::XML::SAX.

Public Instance Methods

parse_file(filename, encoding = 'UTF-8') { |ctx| ... }

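Parse the file at filename using encoding. If a block is given, it is yielded the parser context before parsing begins. Raises ArgumentError if filename is nil, Errno::ENOENT if the file does not exist, and Errno::EISDIR if filename is a directory.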

# File lib/nokogiri/html/sax/parser.rb, line 41
def parse_file filename, encoding = 'UTF-8'
  raise ArgumentError unless filename
  raise Errno::ENOENT unless File.exist?(filename)
  raise Errno::EISDIR if File.directory?(filename)
  ctx = ParserContext.file(filename, encoding)
  yield ctx if block_given?
  ctx.parse_with self
end

parse_memory(data, encoding = 'UTF-8') { |ctx| ... }

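Parse HTML stored in data using encoding. Returns without parsing if data is empty, and raises ArgumentError if data is nil. If a block is given, it is yielded the parser context before parsing begins.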

# File lib/nokogiri/html/sax/parser.rb, line 31
def parse_memory data, encoding = 'UTF-8'
  raise ArgumentError unless data
  return unless data.length > 0
  ctx = ParserContext.memory(data, encoding)
  yield ctx if block_given?
  ctx.parse_with self
end

