Import Comments Into Disqus with Ruby and Sinatra

Jim Mulholland

This is part 3 of our blog migration to Webby. In part 1, Wynn blogged about the 12 reasons Webby makes him happy. In part 2, I blogged about importing our blog posts into Webby via RSS feeds.

In this post, I will walk through the process of migrating our comments over from our old blogging system into Disqus via RSS feeds.

To do the import, I created a simple single-page Sinatra web application. For those of you unfamiliar with Sinatra, it is a very lightweight Ruby web framework, well suited to simple micro-apps where Rails and/or Merb would be overkill. Sinatra fit the bill perfectly for my Disqus importer, since it is only a single form without a database backend.

The devver blog has an excellent post about creating an iPhone web application in under 50 lines using Sinatra.

For this Disqus import application, we require 4 Ruby gems:

  • Sinatra – Simple Ruby web framework
  • FeedTools – Tool to parse RSS feeds
  • REST Client – REST client for Ruby, used to make the Disqus REST API calls
  • json – Parses the JSON responses returned from Disqus

This micro-app has a form with the following 5 fields:

  • Disqus API Key – The API key generated by Disqus. It can be accessed at http://disqus.com/api/get_my_key/
  • Forum Short Name – The short name you chose when creating a new Disqus “website”
  • Existing Blog Comment RSS URL – The RSS URL of the comments you will be importing into Disqus
  • The Blog Article RSS URL – The RSS URL of the articles on the blog that will be using Disqus for its comments. This is needed so that the imported comments are tied to the correct blog URL in Disqus.
  • Number of comments to import – Used for testing, to verify that a small subset of comments looks good in Disqus after importing. Leave it blank to import everything. As far as I can tell, there is no way via the Disqus API to “remove” comments once they are added. So I would copy over 1 or 2 comments, log into my Disqus account to verify they look okay, remove those comments there, and then import all of them again.
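
The "number of comments to import" field amounts to an optional limit on the feed items. Here is a minimal sketch of that logic in plain Ruby. The helper name take_comments is hypothetical (it is not part of the importer), and treating non-numeric input the same as a blank field is my assumption:

```ruby
# Hypothetical helper, not part of import.rb: apply an optional limit
# taken from a form field to a list of feed items.
def take_comments(items, limit_param)
  text = limit_param.to_s.strip

  # Blank input means "import all", matching the form's label.
  # Assumption: anything that is not a whole number is treated the same way.
  return items unless text =~ /\A\d+\z/

  limit = text.to_i
  # Assumption: a limit of 0 also means "import all".
  return items if limit.zero?

  items.first(limit)
end

comments = %w[first second third fourth]
p take_comments(comments, "2")
p take_comments(comments, "")
```

Using first(limit) avoids the off-by-one arithmetic of a range such as [0..n-1] and never returns more items than the feed contains.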

To get this application to work locally, you would do the following:

  • Pull the code from the GitHub Project
  • “sudo gem install” the required gems
  • Go to the new “disqus-sinatra-importer” directory in your terminal
  • IMPORTANT: As of this post, you will probably have to manually update the regular expressions in the import.rb file that extract the article title (line 56 in the listing below). Here, I am assuming that all of your comments have a title of “Comment on” + blog title + “by (comment author)”. This matters because the “blog title” is what ties each comment to the blog article it was written for. If your comment titles follow a different pattern, these regular expressions will have to be updated so that only the blog article title remains.
  • Start the application by typing “ruby import.rb”
  • In your browser, go to http://localhost:4567/new
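
The title pattern described in the IMPORTANT step above can be sketched in isolation, which makes it easier to adapt to a different comment feed. The constant and helper names below are hypothetical, and the sample titles are made up for illustration. The greedy capture keeps everything up to the last " by ", so article titles that themselves contain "by" (or a word such as "Ruby") survive intact:

```ruby
# Hypothetical helper, not part of import.rb: pull the article title out of
# a comment title of the form "Comment on <article title> by <author>".
COMMENT_TITLE = /\AComment on (.+) by .+\z/

def article_title_from(comment_title)
  match = COMMENT_TITLE.match(comment_title.to_s.strip)
  # Return nil when the title does not follow the expected pattern.
  match && match[1].strip
end

p article_title_from("Comment on Importing Posts via RSS by Jane Doe")
p article_title_from("Comment on Learning Ruby by Example by Jane Doe")
p article_title_from("Some unrelated title")
```

Returning nil for titles that do not match makes it easy to report those comments instead of silently mismatching them.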

Other points:

  • Again, the article title extracted from each comment's “title” field has to match the title field of the blog article (the comparison ignores case). The extraction is done on line 56 of the listing below. We should probably just add 2 fields to the form to allow this to be configured without updating code.
  • This import is only as good as your RSS feeds. If your RSS feed does not include email addresses, they will not be imported into Disqus, and neither will any Gravatars tied to those emails.
  • Most RSS feeds only include the last 10 or so articles. You will have to remove this limitation (if possible) in order to import comments for all articles. Comments whose article is not in the feed are skipped and do not show up in the list of failed imports.
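
Since comments without a matching article are easy to lose track of, it can help to check the title matching before importing anything. The sketch below is one way to do that. The Article struct and both helper names are hypothetical stand-ins for the feed items, and the example.com URL is a placeholder:

```ruby
# Hypothetical stand-in for a feed item; real FeedTools items expose more.
Article = Struct.new(:title, :link)

# Build a lookup from normalized title to article, so each comment is
# matched with one hash lookup instead of a scan over every article.
def index_by_title(articles)
  articles.each_with_object({}) do |article, index|
    index[article.title.strip.downcase] ||= article
  end
end

# Split article titles (already extracted from the comment titles) into
# those that have a matching article and those that do not.
def partition_titles(titles, index)
  titles.partition { |title| index.key?(title.strip.downcase) }
end

articles = [Article.new("Why Webby", "http://example.com/why-webby")]
index = index_by_title(articles)
matched, unmatched = partition_titles(["why webby", "An Older Post"], index)
puts "matched:   #{matched.inspect}"
puts "unmatched: #{unmatched.inspect}"
```

Anything in the unmatched list points at either a title pattern that needs adjusting or an article that has dropped out of the RSS feed.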

You can view the code in its current state below. The most up-to-date version will be on GitHub.

Please let me know if you have any comments, questions, or ridicule via the Disqus comment form below!

 1         require 'rubygems'
 2         require 'feed_tools'
 3         require 'sinatra'
 4         require 'rest_client'
 5         require 'json'
 6 
 7         # New action creates the import form with the following fields:
 8         # * Disqus API Key (http://disqus.com/api/get_my_key/)
 9         # * Forum Short Name (from Disqus)
10         # * Existing Blog Comment RSS URL (The url for the comments rss feed of the blog we are importing into Disqus)
11         # * The blog article RSS url of the blog that will be using Disqus (We need this to tie the blog url with Disqus comments)
12         # * Number of comments to import (this is used for testing to verify a small subset of comments look good in Disqus after importing)
13         
14         get '/new' do
15           body <<-eos
16             <h3>Import comments into Disqus</h3>
17             <form action='/create' method='post'>
18             <label for='api_key'>Disqus API Key</label><br />
19             <input type='text' name='api_key' /><br />
20             <label for='short_name'>Forum Short Name</label><br />
21             <input type='text' name='short_name' /><br />
22             <label for='comment_rss_url'>Existing Blog Comment RSS URL</label><br />
23             <input type='text' name='comment_rss_url' /><br />
24             <label for='blog_rss_url'>New Blog Article RSS URL</label><br />
25             <input type='text' name='blog_rss_url' /><br />
26             <label for='number_of_comments'>Number Of Comments To Import (Leave blank to import all)</label><br />
27             <input type='text' name='number_of_comments' /><br /><br />
28             <input type='submit' name='submit' value='Import Into Disqus' />
29             </form>
30             eos
31         end
32 
33         post '/create' do
34           disqus_url = 'http://disqus.com/api'
35           user_api_key, old_blog_comment_rss, forum_shortname, current_blog_rss = params[:api_key], params[:comment_rss_url], params[:short_name], params[:blog_rss_url]
36           number_of_comments = params[:number_of_comments]
37 
38           resource = RestClient::Resource.new disqus_url
39           forums = JSON.parse(resource['/get_forum_list?user_api_key='+user_api_key].get)
40           forum_id = forums["message"].select {|forum| forum["shortname"]==forum_shortname}[0]["id"]
41           forum_api_key = JSON.parse(resource['/get_forum_api_key?user_api_key='+user_api_key+'&forum_id='+forum_id].get)["message"]
42 
43           # Get all of the comments from the old blog site
44           comments = FeedTools::Feed.open(old_blog_comment_rss)
45 
46           # Get all of the articles from the current blog site
47           articles = FeedTools::Feed.open(current_blog_rss)
48 
49           comment_text = ""
50           failed_imports = ""
51           successful_imports = ""
52 
53           comments_to_import = number_of_comments.blank? ? comments.items : comments.items[0..number_of_comments.to_i-1]
54 
55           comments_to_import.each do |comment|
56             comment_article_title = comment.title.sub(/^Comment on /, "").sub(/(.*)\sby\s.*$/, '\1').strip
57 
58             # Get the blog article for the current comment thread
59             article = articles.items.select {|a| a.title.downcase == comment_article_title.downcase}[0]
60 
61             if article
62               article_url = article.link  
63 
64               thread = JSON.parse(resource['/get_thread_by_url?forum_api_key='+forum_api_key+'&url='+Rack::Utils.escape(article_url)].get)["message"]
65 
66               # If a Disqus thread is not found with the current url, create a new thread and add the url.
67               if thread.nil?  
68                 thread = JSON.parse(resource['/thread_by_identifier'].post(:forum_api_key => forum_api_key, :identifier => comment.title, :title => comment.title))["message"]["thread"]
69 
70                 # Update the Disqus thread with the current article url
71                 resource['/update_thread'].post(:forum_api_key => forum_api_key, :thread_id => thread["id"], :url => article_url) 
72               end
73 
74               # Import posts here
75               begin
76                 post = resource['/create_post'].post(:forum_api_key => forum_api_key, :thread_id => thread["id"], :message => comment.description, :author_name => comment.author.name, :author_email => comment.author.email, :created_at => comment.time.strftime("%Y-%m-%dT%H:%M"))
77               rescue
78                 failed_imports += "<li>"+comment.description[0..100]+" <a href=#{article_url}>link</a>" + "</li>"
79               else
80                 successful_imports += "<li>"+comment.description[0..100]+" <a href=#{article_url}>link</a>" + "</li>"
81               end
82             end
83           end
84 
85           failed_import_message = "<h2>The following comments failed to import</h2><ul>" + failed_imports + "</ul>"
86           successful_import_message = "<h2>The following comments were imported successfully</h2><ul>" + successful_imports + "</ul>"
87 
88           output_message =  successful_import_message + "<br />"
89           output_message += failed_import_message unless failed_imports.blank?
90 
91           body output_message
92         end
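
One detail worth calling out from the listing: when a value such as an article URL is placed in a query string, as in the get_thread_by_url call, characters like & and ? need to be percent-encoded so they are not read as part of the outer URL. Here is a minimal sketch using Ruby's standard library. The api_path helper is hypothetical, and the key and URL are placeholders:

```ruby
require 'uri'

# Hypothetical helper, not part of import.rb: build an endpoint path with a
# properly escaped query string from a hash of parameters.
def api_path(endpoint, params)
  "#{endpoint}?#{URI.encode_www_form(params)}"
end

puts api_path('/get_thread_by_url',
              'forum_api_key' => 'KEY',
              'url' => 'http://example.com/post?id=1&lang=en')
# prints /get_thread_by_url?forum_api_key=KEY&url=http%3A%2F%2Fexample.com%2Fpost%3Fid%3D1%26lang%3Den
```

The resulting string can be used wherever the listing concatenates a path by hand.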