For my educational benefit and as part of the process at the Flatiron School, I recently wrote a CLI gem in Ruby. This was a great learning experience and a bit of a challenge for a newbie like myself. That said, it was also rewarding, and I have gained a much better understanding of all things Ruby. Mission accomplished.
My idea was simple, since this was my first effort: a CLI that searches for a list of doctors in a particular zipcode and then allows the user to get more details about a particular doctor. I cleverly named this app "doctor_finder." Why? Because it finds doctors!
At first, I thought I would use one of the large, well-known sites, like WebMD or HealthGrades. I also looked at insurance companies' websites and government websites. All of them posed challenges that I could not overcome. The commercial sites blocked scrapers, and I wasn't savvy enough to get around their blocks. Other sites required the user to fill out forms that I could not easily get past in an automated way.
I finally settled on Zocdoc. This website met my needs and allowed me to search for a doctor by zipcode using a customized URL. It returned a list of 20 doctors that I could parse with Nokogiri.
But first things first. The first step in this whole process was setting up properly for a Ruby gem. This process is made simple thanks to Bundler. Bundler includes a gem command that creates the directories and basic files you need to get started. It also includes some automation in your gemspec file so you don't have to manually maintain your list of files. Finally, it sets up a git repo for you. Basically it gets you off to a good start:
bundler gem doctor_finder
Bundler will even ask you if you want to include a license and a code of conduct. The basic structure of a gem includes a bin directory for storing executable files, a lib directory for storing the main code, modules, classes, etc., and a spec or test directory for your testing tools. I didn't use any tests for this project, but I'm sure I will in the future.
With that done, the next step is to figure out how you will set up your dependencies and files to get things working. Since I was building a CLI, it made sense to start there. I decided that my executable would create a CLI object and run a method on that object, and that the CLI object would take care of things from then on. To get that going, I needed to create a file in the bin directory called 'doctor_finder.rb' which would serve as my main executable file. I also needed to run chmod +x doctor_finder.rb to make it executable. Then, I added this code to the new file:
#!/usr/bin/env ruby
#
require_relative "../lib/doctor_finder"
DoctorFinder::CLI.new.call
So, the key line is DoctorFinder::CLI.new.call. First, I assume a module (namespace) called "DoctorFinder" that is actually created by Bundler in the version.rb file. Then I assume a class called CLI, create a new instance of it, and call the method call on that instance.
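To make the namespace idea concrete, here is a minimal sketch of how the module and the class fit together. The file names in the comments and the version string are assumptions for illustration; the one firm point is that the compact form class DoctorFinder::CLI only works if the DoctorFinder module has already been defined, otherwise Ruby raises a NameError.

```ruby
# lib/doctor_finder/version.rb (the kind of file Bundler generates).
# This is where the DoctorFinder namespace first comes into existence.
module DoctorFinder
  VERSION = "0.1.0" # placeholder value, not the gem's real version
end

# lib/doctor_finder/cli.rb (hypothetical file name).
# The compact form below requires DoctorFinder to be defined already.
class DoctorFinder::CLI
  def call
    puts "Welcome to Doctor Finder (version #{DoctorFinder::VERSION})."
  end
end

# What the executable in bin/ boils down to:
DoctorFinder::CLI.new.call
```

In the real gem, the executable only requires lib/doctor_finder, and that file in turn loads the version file and the class files in the right order.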
So I need to create the CLI class and include a method called call. I'm pasting in the completed class below. This file would be found in the doctor_finder/lib/doctor_finder directory.
# CLI Main file defining the CLI class
#
class DoctorFinder::CLI
  def call
    puts "\nWelcome to Hooper's Doctor Finder."
    puts "\nWith HDF you can retrieve a list of doctors by zipcode and then
get more details about a particular doctor on that list. It's easy!"
    show_list(get_zipcode)
    get_choice_from_list
    farewell
  end

  def get_zipcode
    # Gets a valid zip code from the user
    zip = ""
    while !iszipcode?(zip)
      puts "\nPlease enter a valid zipcode:"
      zip = gets.chomp[0..4]
    end
    zip
  end

  def iszipcode?(zipcode)
    # Provides a basic level of validation for user input of zipcode.
    if zipcode.length == 5 && zipcode.scan(/\D/).empty?
      true
    else
      false
    end
  end

  def show_list(zipcode)
    # Calls scraper and prints a list of doctors based on the zip code entered by the user.
    DoctorFinder::Doctor.clear
    docs = DoctorFinder::Scraper.scrape_by_zipcode(zipcode)
    docs.each.with_index(1) do |doc, i|
      puts "#{i}. #{doc.name} - #{doc.speciality} - #{doc.city}, #{doc.state} #{doc.zip}"
    end
  end

  def get_choice_from_list
    # Gets a valid choice from the list of Doctors.
    choice = nil
    while choice != "exit" && choice != "q"
      puts "\n[1..#{DoctorFinder::Doctor.all.length}] Select Doctor |
[zip] Start over with new zipcode | [exit] To quit"
      choice = gets.chomp
      if choice.to_i > 0 && choice.to_i < DoctorFinder::Doctor.all.length+1
        doc = DoctorFinder::Scraper.scrape_for_details(DoctorFinder::Doctor.all[choice.to_i-1])
        puts "======================================\n"
        puts doc.name
        puts doc.street
        puts doc.city + ', ' + doc.state + ' ' + doc.zip
        puts "--------------------------------------\n"
        puts "Areas of Specialty:"
        puts doc.areas
        puts doc.details
      elsif choice == "zip"
        show_list(get_zipcode)
      end
    end
  end

  def farewell
    # Tells the user goodbye.
    puts "\n\nThank you for using Hooper's Doctor Finder.
This was an educational experiment, and I learned a lot.
At first it seemed hard, but then it got easier.\n\nSee you next time.\n\n\n\n"
  end
end
Let me explain the purpose of each method:
call: This is the initial method that is called from the executable itself. All it does is print some introductory content and then call other methods. First I want to get a list of doctors based on a zipcode the user enters. So, I call the method show_list and pass it the result of get_zipcode. After that, I want to run the main menu loop, and then, when the user is done, say farewell.
get_zipcode: This method simply asks the user for a zipcode and then returns that zipcode.
iszipcode?: This method runs some very basic validation on the user input. The good news is that Zocdoc is very tolerant of user input - so even if they don't enter a proper zipcode Zocdoc will do its best to return a list of doctors to us.
show_list: Prints out a list of doctors based on a zipcode passed to it. In order to accomplish this, it makes calls to two other classes, Scraper and Doctor. I'll show you those below.
get_choice_from_list: This is my poorly named main menu loop. Until the user enters exit (or q), it shows the menu prompt and reads a choice. If they choose a valid doctor number, it scrapes and prints that doctor's details, using the list stored in the Doctor class. If they enter zip, it starts over with a new zipcode.
farewell: Says goodbye when the user exits.
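The input handling described in the list above can be sketched in isolation. This is not the code from the gem: the method names valid_zipcode?, prompt_for_zipcode and valid_choice? are hypothetical, and the choice check is stricter than my original (which relies on to_i, so an entry like "3abc" counts as 3). Passing the input and output streams in as arguments is an assumption I'm making so the loop can be tried without a terminal.

```ruby
require "stringio"

# True only when the text is exactly five digits.
def valid_zipcode?(text)
  text.match?(/\A\d{5}\z/)
end

# Keeps asking until the first five characters of a line form a valid zipcode.
# Returns nil if the input stream ends before that happens.
def prompt_for_zipcode(input: $stdin, output: $stdout)
  loop do
    output.puts "Please enter a valid zipcode:"
    line = input.gets
    return nil if line.nil? # input closed, nothing more to read

    zip = line.chomp[0..4]  # same truncation as in get_zipcode
    return zip if valid_zipcode?(zip)
  end
end

# True when the menu entry is a whole number between 1 and list_size.
def valid_choice?(text, list_size)
  text.match?(/\A\d+\z/) && text.to_i.between?(1, list_size)
end

# Simulated session: one bad entry, then an entry longer than five digits.
fake_input = StringIO.new("abc\n787019999\n")
puts prompt_for_zipcode(input: fake_input, output: StringIO.new)
```

Because the entry is cut to five characters before it is checked, the second line of the simulated input is accepted as 78701, which mirrors how get_zipcode behaves.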
Next, let's take a look at the Doctor class. All this class does is store data for a doctor and maintain an array of all the doctors:
# The Doctor class
#
class DoctorFinder::Doctor
  attr_accessor :name, :url, :speciality, :street, :city, :state, :zip, :details, :areas

  @@all = []

  def initialize
    @@all << self
  end

  def self.all
    @@all
  end

  def self.clear
    @@all = []
  end
end
Not much going on here. But if this were part of a larger application, having the Doctor class as a separate class might prove useful. One thing to note - I made a clear method to empty out the array of doctors so I could do a new search in a new zipcode without creating another instance. For the rest of the app, the action is in the Scraper class:
# The Scraper class
#
require 'nokogiri'
require 'open-uri'

class DoctorFinder::Scraper
  BASE_URL = "https://www.zocdoc.com/"

  def self.scrape_by_zipcode(zipcode)
    # The search URL is built as one string; a line break inside it would make the URL invalid.
    url = "#{BASE_URL}search?address=#{zipcode}&insurance_carrier=-1&day_filter=AnyDay&gender=-1" \
          "&language=-1&offset=0&insurance_plan=-1&reason_visit=75&after_5pm=false&before_10am=false" \
          "&sees_children=false&sort_type=Default&dr_specialty=153&"
    # URI.open is the open-uri entry point; a bare open(url) no longer fetches URLs on Ruby 3.
    html = Nokogiri::HTML(URI.open(url))
    slice = html.css('.js-prof-row-container')
    slice.each do |doctor| # will go through the HTML and create new doctor instances
      doc = DoctorFinder::Doctor.new
      doc.name = doctor.css('.js-profile-link').text.strip.gsub("\n", ' ').squeeze(' ')
      doc.speciality = doctor.css('.ch-prof-row-speciality').text.strip
      doc.url = BASE_URL + doctor.css('.js-profile-link')[0]['href']
      address = doctor.css('.js-search-prof-row-address').text.strip
      doc.street = address.slice(/^\d+[ ][\w+[ ]]+/)
      # To format the text correctly, had to use some regex
      doc.city = address[/[ ][ ]+[\w+[.]*[ ]]*[,]/].strip.chop
      doc.state = address[/[A-Z][A-Z]/]
      doc.zip = address[/\d{5}/]
    end
    DoctorFinder::Doctor.all
  end

  def self.scrape_for_details(doctor)
    html = Nokogiri::HTML(URI.open(doctor.url))
    doctor.details = html.css('.profile-professional-statement').text.squeeze(' ')
    if doctor.details.strip == ""
      doctor.details = "No further details were available."
    end
    doctor.areas = html.css('li.specialty').text.squeeze(" ").gsub("\r\n \r\n ", "\r\n").lstrip
    doctor
  end
end
The Scraper class has two methods and one constant. The BASE_URL constant was just a helpful way to designate the website I was scraping. The two methods:
scrape_by_zipcode(zipcode) takes a zipcode and scrapes Zocdoc for a list of doctors based on that zipcode. I embed the zipcode in a search string and use that for my open-uri request. I slice a chunk of HTML out of the whole page to narrow in on just the doctor info. Then I iterate through the array from Nokogiri, tossing the data I want into a new Doctor instance. I use a number of regular expressions and string methods to get only the data I want in the object. I return the array of all the doctors.
scrape_for_details(doctor) takes a doctor and fills in more data about that doctor by scraping the doctor's detail page. I use the URL I scraped before when I constructed the doctor list and pull down the details. Then, like before, I use text methods and regular expressions to get the data I want in the right place. I return the doctor, but now with more details.
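Two of the steps above, building the search string and pulling an address apart, can be tried without touching the network. The sketch below is an alternative take, not the gem's code: search_url and parse_address are hypothetical helpers, only a few of the query parameters from my real URL are included, and the sample address layout (street, a line break plus some spaces, then city, state and zip) is an assumption about what the scraped text looks like.

```ruby
require "uri"

# Builds a search URL from a hash of parameters instead of one long string.
# Only a subset of the real query parameters is shown here.
def search_url(zipcode, base: "https://www.zocdoc.com/")
  query = URI.encode_www_form(address: zipcode, offset: 0, sort_type: "Default")
  "#{base}search?#{query}"
end

# One anchored pattern with named groups, assuming the layout
# "<number> <street>", two or more whitespace characters, "<city>, <ST> <zip>".
ADDRESS_PATTERN = /\A(?<street>\d+ [^\n]+?)\s{2,}(?<city>[^,]+),\s*(?<state>[A-Z]{2})\s+(?<zip>\d{5})/

# Returns a hash of the address parts, or nil when the text does not fit the layout.
def parse_address(text)
  match = ADDRESS_PATTERN.match(text.strip)
  return nil unless match

  match.named_captures.transform_values(&:strip)
end

puts search_url("78701")

sample = "123 Main St\n  Springfield, IL 62704" # made-up address for illustration
p parse_address(sample)
```

Matching the whole address in one pass avoids a weakness of separate patterns like /[A-Z][A-Z]/ and /\d{5}/, which grab the first match anywhere in the text and so can be thrown off by a street such as "12345 NW Oak".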
So, that's basically how this thing works. As I said at the beginning, I learned a lot playing with this. Now, onward!