💡 This class is an alias for Nokogiri::HTML4::SAX::Parser as of v1.12.0.
This class lets you perform SAX-style parsing on HTML, with HTML error correction.
Here is a basic usage example:
class MyDoc < Nokogiri::XML::SAX::Document
  def start_element name, attributes = []
    puts "found a #{name}"
  end
end

parser = Nokogiri::HTML4::SAX::Parser.new(MyDoc.new)
parser.parse(File.read(ARGV[0], mode: 'rb'))
For more information on SAX parsers, see Nokogiri::XML::SAX.
# File lib/nokogiri/html4/sax/parser.rb, line 50
def parse_file(filename, encoding = "UTF-8")
  raise ArgumentError unless filename
  raise Errno::ENOENT unless File.exist?(filename)
  raise Errno::EISDIR if File.directory?(filename)

  ctx = ParserContext.file(filename, encoding)
  yield ctx if block_given?
  ctx.parse_with(self)
end
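Parse the file at filename using encoding (default "UTF-8"). Raises ArgumentError if filename is nil, Errno::ENOENT if the file does not exist, and Errno::EISDIR if the path is a directory. If a block is given, it receives the ParserContext before parsing starts.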
# File lib/nokogiri/html4/sax/parser.rb, line 40
def parse_io(io, encoding = "UTF-8")
  check_encoding(encoding)
  @encoding = encoding
  ctx = ParserContext.io(io, ENCODINGS[encoding])
  yield ctx if block_given?
  ctx.parse_with(self)
end
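Parse the given io using encoding (default "UTF-8"). The encoding name is passed to check_encoding and looked up in ENCODINGS before parsing. If a block is given, it receives the ParserContext before parsing starts.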
# File lib/nokogiri/html4/sax/parser.rb, line 30
def parse_memory(data, encoding = "UTF-8")
  raise ArgumentError unless data
  return if data.empty?

  ctx = ParserContext.memory(data, encoding)
  yield ctx if block_given?
  ctx.parse_with(self)
end
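Parse the HTML stored in data using encoding (default "UTF-8"). Raises ArgumentError if data is nil, and returns nil without parsing if data is empty. If a block is given, it receives the ParserContext before parsing starts.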
© 2008–2018 Aaron Patterson, Mike Dalessio, Charles Nutter, Sergio Arbeo,
Patrick Mahoney, Yoko Harada, Akinori MUSHA, John Shahid, Lars Kanis
Licensed under the MIT License.
https://nokogiri.org/rdoc/Nokogiri/HTML4/SAX/Parser.html