Sourabh Raje

In this post I will show three ways to parse HTML documents and extract data from them, which is useful for scraping websites and for analyzing or converting offline documents.

  • Nokogiri gem -

    The nokogiri gem is a popular Ruby HTML/XML parser that uses libxml2 (a software library for parsing XML documents). Parse HTML with nokogiri using the Nokogiri::HTML method:


 require 'nokogiri'
 # input is a String (or an IO object) containing the HTML to parse
 document = Nokogiri::HTML(input)
  • Oga gem -

    The oga gem is a Ruby XML/HTML parser with a small...

What do you do when you need to run the same test multiple times, but with different parameters? If you copy and paste the test, you end up with a hard-to-read test file. You can’t easily tell how the tests differ from one another. Worse, when you need to change one, you need to change them all. Take the following simple test file:

def double_it(number)
 number * 2
end

describe '#double_it' do
 it 'doubles 1 into 2' do
  expect(double_it(1)).to eq(2)
 end
end

