Parsing with Mechanize
Hello! In my previous article, we discussed Nokogiri, a very useful HTML and XML parser that helps in parsing or scraping the information available on a web page for use in a Ruby program.
But sometimes we just want to trigger an action, click a button, or automate a process on a website from a Ruby program. Examples include scheduling a social media post, automating feedback (yes, we are forced to give feedback on the faculty through a web portal), submitting a form, and many other tasks that the website offers no shortcut for, so you have to do them manually. This is where Mechanize comes to the rescue. It is one of the coolest Ruby libraries, and it is very useful for scraping as well as for triggering actions on a web page.
What's Mechanize?
According to The Bastards Book of Ruby,
"Sometimes, the challenge is navigating to the right page.
This is the case with many websites that require you to fill out a form. It's not enough to know the URL of a remote script and pass it parameters using the RestClient gem, as you would for a public API. The program that backs a website might perform a variety of checks – such as the existence and state of a browser cookie – before letting you submit a request. These kind of checks are handled invisibly through your browser, but the simple fetching scripts I've written so far don't.
This is where the Mechanize gem comes in. It leverages Nokogiri (or another parser of your choice) to parse a page for the relevant forms and buttons and provides a simplified interface for manipulating a webform."
How to use it?
1. Install the gem 'mechanize' by typing "gem install mechanize".
2. Include the library in your program by typing "require 'mechanize'".
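Once the gem is installed and required, a typical session is: create an agent, open a page, fill in a form, and submit it. The snippet below is only a minimal sketch of that flow. The helper names `build_agent` and `fill_and_submit` are made up for this article, and it assumes the form you want is the first one on the page, which will not be true for every site.

```ruby
# Create a Mechanize agent. The gem is loaded here, on first use, so the
# helper below can also be driven by any object that responds to #get.
def build_agent
  require 'mechanize'
  Mechanize.new
end

# Open +url+, fill the first form on the page with +fields+
# (a hash of field name => value) and return the response to the submission.
# Assumption: the form we care about is the first one on the page.
def fill_and_submit(agent, url, fields)
  page = agent.get(url)
  form = page.forms.first
  raise ArgumentError, "no form found at #{url}" if form.nil?

  fields.each { |name, value| form[name] = value }
  form.submit
end
```

With a real site, a call such as `fill_and_submit(build_agent, 'https://example.com/search', 'q' => 'ruby')` would return the page the server sends back. The URL and the field name here are placeholders, so replace them with the ones used by the form you are automating.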
Example
A few days ago, one of my friends asked how to parse the results of his whole class 12 batch at once. We are going to use the same example here.
We are going to write a Ruby program that fetches the results for a range of roll numbers and writes them to an HTML file. The scraped data is raw HTML, so writing it to an HTML file makes it look more presentable.
Here is the code, with details as comments:
Code @ https://gist.github.com/AnkDos/aba5b45d3f7806aa9ed0ff816ca2b9fd
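If you just want the general shape of such a program, here is a rough sketch. It is not the code from the gist. The portal URL, the `rollno` field name, and all the method names are hypothetical, and it assumes the result page has a single form that takes the roll number.

```ruby
require 'cgi'

# Hypothetical values: replace them with the real portal URL and field name.
RESULT_URL = 'https://results.example.com/class12'
ROLL_FIELD = 'rollno'

# Create the agent; Mechanize is loaded here, on first use.
def build_agent
  require 'mechanize'
  Mechanize.new
end

# Submit one roll number through the first form on the result page
# and return the raw HTML of the response.
def fetch_result_html(agent, roll_number)
  form = agent.get(RESULT_URL).forms.first
  form[ROLL_FIELD] = roll_number.to_s
  form.submit.body
end

# Build one HTML section per roll number. A failure for one roll number
# is recorded in the output instead of stopping the whole batch.
def collect_results(agent, roll_numbers)
  roll_numbers.map do |roll|
    html = begin
      fetch_result_html(agent, roll)
    rescue StandardError => e
      "<p>Could not fetch result: #{CGI.escapeHTML(e.message)}</p>"
    end
    "<h2>Roll number #{roll}</h2>\n#{html}"
  end
end

# Write all sections into a single HTML file, separated by rules.
def write_results(path, sections)
  File.write(path, "<html><body>\n#{sections.join("\n<hr>\n")}\n</body></html>\n")
end
```

With the gem installed and the placeholders filled in, something like `write_results('results.html', collect_results(build_agent, 1001..1050))` would produce one file that you can open in a browser. The roll number range is only an example.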
Thank you!