Combinatorics for fun and profit

When I program for fun, I keep running into problems that require generating combinations without repetitions from a set of data (numbers, objects, strings, letters, ...).

I have never been able to solve that problem in a way that satisfied me, and the languages I used didn’t ship a combinatorics library that could generate combinations without repetitions.

Yesterday a good friend of mine introduced me to the Set Puzzle and, as I did for Word Challenge, I had to try and attack the problem from a programming angle.

To do that I had to find, once and for all, a reusable way to generate combinations without repetitions. This time I did, and SetSolver was born.

You can check it out on GitHub, but I can’t resist posting the combinatorics code on this blog :)

module ArrayExtensions
  def combinations_without_repetitions(k)
    combine(self, k)
  end
 
  private
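  def combine(array, k)
    # Base cases: the whole array is the only combination of its own size,
    # and every element on its own is a combination of size one.
    return [array] if k == array.size
    return array.collect {|e| [e]} if k == 1
    results = []
    # Pick each possible first element, then combine the elements after it.
    array[0..(array.size - k)].each_with_index do |val, idx|
      # [val] + e keeps elements that are themselves arrays intact,
      # which [val, e].flatten would not.
      results += combine(array[(idx+1)..-1], k - 1).collect {|e| [val] + e}
    end
    results
  end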
end
 
Array.class_eval do
  include ArrayExtensions
end
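
As a quick sanity check, the sketch below runs the same recursion as a standalone function and compares its output with Ruby’s built-in Array#combination, available in Ruby 1.8.7 and later. The name combos is mine, chosen so the snippet runs without patching Array; it is an illustration, not the SetSolver code.

```ruby
# Standalone version of the recursion above; `combos` is a hypothetical name.
def combos(array, k)
  return [array] if k == array.size
  return array.map { |e| [e] } if k == 1

  results = []
  array[0..(array.size - k)].each_with_index do |val, idx|
    # Fix val as the first element, then choose k - 1 from what follows it.
    results += combos(array[(idx + 1)..-1], k - 1).map { |rest| [val] + rest }
  end
  results
end

letters = %w[a b c d]
pairs = combos(letters, 2)
builtin = letters.combination(2).to_a

puts pairs.map(&:join).join(" ")
puts "same as Array#combination: #{pairs.sort == builtin.sort}"
```

Both results are sorted before comparing, so the check does not depend on the order in which either method yields its combinations.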

Word Cheat, a simple anagram engine

Those of you who have followed this blog since its inception will be aware of my obsession with word games, and specifically with algorithms that solve word games.

Yesterday I was playing Word Challenge on Facebook, a simple game where you have to find all the words that can be composed with six random letters. After a couple of tries I decided to solve it with a Ruby program, and since I didn’t want to reuse my previous code I tried a new approach.

First of all I took an extensive list of Italian words and filtered it to suit my needs, keeping only the words from three to six letters long.

Once I had my wordlist I had to build a data structure for fast anagram retrieval, and so the WordHash class was born.

class WordHash
  attr_reader :words
 
  def initialize(wordlist)
    @words = {}
    wordlist.each do |word|
      if @words.has_key?(w = word.strip.signature)
        @words[w] << word.strip
      else
        @words[w] = [word.strip]
      end
    end
  end
 
  def get_anagrams(word)
    result = []
    # Collect every group of words whose letters can all be found in word.
    @words.each do |k,w|
      result << w if word.contains?(k)
    end
    result
  end
end

This class makes use of two helper methods I added to String, which I think really should be part of the Ruby standard library:

module StringExtensions
  def signature
    self.split('').sort.join
  end
 
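  def contains?(search)
    # Builds a pattern like "a.*c.*o" from the sorted letters of search and
    # matches it against the sorted letters of self.
    !Regexp.new(search.signature.split("").join(".*")).match(self.signature).nil?
  end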
end
 
String.class_eval do
  include StringExtensions
end

The way it all works is quite simple: I build a Hash whose keys are the signatures of the words in the wordlist and whose values are arrays of the words sharing that signature, so fetching all the exact anagrams of a word just means looking up word_hash.words[word.signature].

The get_anagrams method goes further and also returns anagrams shorter than the original word.
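
To make the two lookups concrete, here is a minimal standalone sketch of the same idea. It uses a tiny made-up word list and a lambda instead of the String extensions, so the names signature, exact and partial are assumptions of this example rather than part of the project.

```ruby
# A tiny made-up word list; any list of strings works.
words = %w[arco caro orca oca]

# Same idea as String#signature above: the sorted letters of a word.
signature = ->(word) { word.chars.sort.join }

# Group words by signature, so anagrams share one key.
index = words.group_by { |w| signature.call(w) }

# Exact anagrams are a single hash lookup.
exact = index.fetch(signature.call("roca"), [])

# Shorter anagrams: keep every key whose letters appear, in order,
# inside the sorted letters of the input (the contains? trick).
letters = signature.call("roca")
partial = index.select do |key, _|
  Regexp.new(key.chars.map { |c| Regexp.escape(c) }.join(".*")).match?(letters)
end.values.flatten

puts "exact:   #{exact.join(', ')}"
puts "partial: #{partial.join(', ')}"
```

Sorting both the key and the input is what makes the ".*" pattern work: once the letters are in the same order, a shorter word fits only if its letters appear as a subsequence of the longer one.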

The whole project, complete with tests and a helper script, is available on GitHub. Feel free to contribute.

Also, big thanks to Stefano Cobianchi, who contributed the contains? method, which I really couldn’t code myself :)

Radiant iPhone Extension 0.0.1

Fellow Mikamai-er, all-around nice guy, great guitarist and top-notch developer Andrea “Pilu” Franz has just released an incredible extension for the Radiant CMS: the iPhone Extension. It lets you access your Radiant admin interface from an iPhone through a nice iPhone-optimized GUI. Please check the original article on his blog.

Radiant iPhone extension 0.0.1

After some work on the iPhone I decided to create an extension for Radiant that adds an iPhone-tailored UI for the admin panel. It’s the first version, and for now it only lets you edit existing pages and add new page parts.

You can find more info in the README file on GitHub.

Two improvements to your Capfiles

I have lately started using a pattern that has become quite common among Capistrano users: setting the server names and locations in a task. This lets you have multiple deployment environments, like development, staging, production, and so on.

desc "deploy to development environment"
task :development do
  set :deploy_to, "/var/apps/#{application}"
 
  role :web, "servername.mikamai.com", :primary => true
  role :app, "servername.mikamai.com", :primary => true
  role :db, "servername.mikamai.com", :primary => true
 
  set :user, "username"
  set :password, "secr3t"
  set :remote_mysqldump, "/usr/bin/mysqldump"
 
  set :db_user, "username"
  set :db_password, "secre7"
  set :db_name, "db_name"
end

This technique has proven itself to be really useful, especially when clients start to ask for deployments on their test servers, and you still want to be able to deploy to your development servers.

While refactoring my Capfiles I also took the time to rewrite the drupal:db namespace, adding the much-needed tasks that let you dump the remote databases and download them to your development box.

  namespace :db do    
    namespace :dump do
      desc "Deletes old database dumps, leaves only the latest on the server"
      task :cleanup, :roles => :db do
        dumps = capture("ls -xt #{shared_path}/dumps").split.reverse
        run "cd #{shared_path}/dumps; rm #{dumps[0..-2].join(" ")}"
      end
 
      desc "Dumps the local database"
      task :local, :roles => :db do
        raise RuntimeError.new("failed dump") unless system "#{local_mysqldump} -u #{local_db_user} --password=#{local_db_password} #{local_db_name} > dump.sql"
      end
 
      namespace :remote do
        desc "Dumps the remote database"
        task :default, :roles => :db do
          filename = "#{Time.now.to_i.to_s}.dump.sql"
          run "cd #{shared_path}/dumps; #{remote_mysqldump} -u #{db_user} --password=#{db_password} #{db_name} > #{filename}"
          run "cd #{shared_path}/dumps; bzip2 #{filename}"
        end          
 
        namespace :download do
          desc "Dumps and downloads the remote database"
          task :default do
            drupal::db::dump::remote::default
            latest
          end
 
          desc "Downloads the latest database dump"
          task :latest, :roles => :db do
            dumps = capture("ls -xt #{shared_path}/dumps").split.reverse
            get("#{shared_path}/dumps/#{dumps.last}", "./#{dumps.last}")
          end
 
        end
 
      end
 
    end
  end
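
The ordering logic shared by cleanup and latest is easy to get backwards, so here is a small sketch of it in plain Ruby. The listing string is made up and stands in for what capture would return from ls -xt, which lists the newest file first.

```ruby
# Hypothetical output of `ls -xt`: newest dump first.
listing = "1230000300.dump.sql.bz2  1230000200.dump.sql.bz2  1230000100.dump.sql.bz2"

# Reversing puts the oldest dump first and the newest last.
dumps = listing.split.reverse

stale  = dumps[0..-2]  # everything except the newest: what cleanup removes
newest = dumps.last    # what download:latest fetches

puts "remove: #{stale.join(' ')}"
puts "keep:   #{newest}"
```

Note that with a single dump on the server stale is empty, so the rm in cleanup would be called without arguments; a guard for that case may be worth adding.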

Oh yeah!

It’s been a while since the last post. I’ve been super busy working on many projects, unfortunately not always with Ruby. I worked with Python and Django, with PHP and Drupal, and with other technologies. What I enjoyed the most was developing with Cocoa for the iPhone. But obviously my favorite language is still Ruby :)

Interviewed!

A couple of weeks ago the nice guys at Vodafone Lab interviewed me and my friend Lorenzo for the work we did for Montalbano.tv.

This is not the first time I have written about Montalbano (as you can see from my previous Drupal-centered post). Last time I wanted to show a clever approach to deploying Drupal; this time I just want to share a happy moment with you.

Joining Mikamai has proved to be one of the wisest decisions of my adult life: it gave me the opportunity to work with smart people on interesting projects.

If you’re a developer and are interested in working with us, drop us a line. Maybe we’ll find a way to do something fun together :)

Legacy Path Handler, a Radiant Extension

We’re preparing to deploy the new Mikamai site (not up at the time of this post), which runs on the wonderful Rails-based Radiant CMS.

The VPS we’re deploying to runs on Phusion Passenger, and that means we can’t use mod_alias or mod_rewrite to 301-redirect the old URLs, already indexed by Google, to their new locations.

To solve this problem I wrote a little Radiant extension called LegacyPathHandler. It reads a simple list of URLs from a text file and issues a 301 redirect for them before handing control to Radiant’s default SiteController.
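
The sketch below is not the extension’s code, only a minimal illustration of the idea under an assumed file format: one old path and one new path per line, separated by whitespace. The names parse_redirects and redirect_for are hypothetical.

```ruby
# Assumed file format: "old_path new_path" per line; "#" starts a comment.
LEGACY_PATHS = <<~LIST
  # old URL            new URL
  /old/about.html      /about
  /old/contact.php     /contact
LIST

# Build a Hash mapping each legacy path to its new location.
def parse_redirects(text)
  text.each_line.each_with_object({}) do |line, map|
    old_path, new_path = line.split
    next if old_path.nil? || new_path.nil? || old_path.start_with?("#")
    map[old_path] = new_path
  end
end

# Return a [status, headers] pair for a legacy path, or nil to signal
# that the request should fall through to the normal page lookup.
def redirect_for(path, redirects)
  target = redirects[path]
  target ? [301, { "Location" => target }] : nil
end

redirects = parse_redirects(LEGACY_PATHS)
p redirect_for("/old/about.html", redirects)
p redirect_for("/news", redirects)
```

Loading the list into a Hash once keeps the check to a single lookup per request, so paths that are not in the list pay almost nothing.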

It works fine for us, but it has no specs, tests or documentation. Please feel free to contribute to the project if you think you can improve it.

No more del.icio.us on Tempe.st

I have decided to remove the daily del.icio.us post with my links: it created too much noise and diluted the good content.

How to ease Drupal development with Capistrano

Drupal is a great piece of software. Unfortunately it stores so much in the database that people struggle to keep their development box in sync with the staging server they use to show customers how the work is proceeding.

Today I will share the Capistrano tasks I use to sync my development box with the staging server. Basically I dump the development database, send the dump to the server via Capistrano, and then use it to replace the server’s database.

The following tasks should be used together with the tasks in my Deploying Drupal with Capistrano article. I took advantage of deploy:cold not being needed with Drupal and added a callback to it, so if you want a deploy that also updates the database you should use deploy:cold.

You should also have two settings files (usually stored in drupal_root/sites/default): settings.development.php, with your local database setup, and settings.production.php, with the remote database setup. The Capistrano tasks take care of choosing the correct one.

# Callbacks
before 'deploy:start', 'drupal:db:import:production'
before 'deploy:restart', 'drupal:configure:production'
before 'deploy:start', 'drupal:configure:production'
before 'deploy:cold', 'drupal:db:dump:development'
 
# DB Stuff
set :mysqldump, "/path/to/mysqldump"
set :local_db_user, "local_mysql_username"
set :local_db_password, "local_mysql_password"
set :local_db_name, "local_db_name"
set :db_user, "remote_mysql_username"
set :db_password, "remote_mysql_password"
set :db_name, "remote_db_name"
 
namespace :drupal do
  namespace :configure do
    task :production do
      sudo "cp #{latest_release}/sites/default/settings.production.php #{latest_release}/sites/default/settings.php"
    end
 
    task :development do
      sudo "cp #{latest_release}/sites/default/settings.development.php #{latest_release}/sites/default/settings.php"
    end
  end
 
  namespace :db do
    namespace :dump do
      task :development do
        raise RuntimeError.new("failed dump") unless system "#{mysqldump} -u #{local_db_user} --password=#{local_db_password} #{local_db_name} > dump.sql"
      end
    end
 
    namespace :import do
      task :production do
        ENV["FILES"] = "dump.sql"
        deploy::upload
        run "mysql -u #{db_user} --password=#{db_password} #{db_name} < #{latest_release}/dump.sql"
      end
    end
  end
end

Deploying Drupal with Capistrano

Mikamai, the company I work for, has just released Montalbano.tv, the companion site to one of the most successful TV shows in Italy.

I was the technical director of this Drupal-based project. I was happy we chose Drupal, because it allowed us to deliver all the features the client needed on time, but I almost panicked when they told us the production setup would have two servers, both with database and web serving duties.

The database replication was a standard MySQL master-master setup, but I had to develop a strategy to keep the code bases on the two servers synchronized.

Being a Ruby programmer at heart, I selected the only tool that never fails me in circumstances like the one we had: Capistrano.

Unfortunately, while Capistrano is easy to use with Rails, for Drupal I had to write a custom Capfile.

Here it is, in its entirety, in case you ever need to deploy Drupal with cap (now I always deploy Drupal with cap, since I have the recipe ready :)):

load 'deploy' if respond_to?(:namespace) # cap2 differentiator
 
# Standard configuration
set :user, "username"
set :password, "password"
set :application, "application.name"
 
# I like to deploy the code in /var/apps
# and then link it to the webserver directory
set :deploy_to, "/var/apps/#{application}"
 
# SCM Stuff configure to taste, just remember the repository
# here I used github as main repository
set :repository,  "git@github.com:username/project.git"
set :scm, :git
set :branch, "master"
set :repository_cache, "git_master"
set :deploy_via, :remote_cache
set :scm_verbose,  true
 
# Two servers, double fun
# You really don't need app, web and db here,
# but I used all of them just to be sure.
# Usually only web is ok.
role :app, "first.server.address.com"
role :app, "second.server.address.com", :primary => true
role :web, "first.server.address.com"
role :web, "second.server.address.com", :primary => true
role :db, "first.server.address.com"
role :db, "second.server.address.com", :primary => true
 
after 'deploy:setup', 'drupal:setup' # Here we setup the shared files directory
after 'deploy:symlink', 'drupal:symlink' # After symlinking the code we symlink the shared dirs
 
# Before restarting the webserver we fix all the 
# permissions and then symlink it to production
before 'deploy:restart', 'mikamai:permissions:fix', 'mikamai:symlink:application'
 
 
namespace :drupal do
  # shared directories
  task :setup, :except => { :no_release => true } do
    sudo "mkdir -p #{shared_path}/files"
    sudo "chown -R #{user}:#{user} #{deploy_to}"
  end
 
  # symlink shared directories
  task :symlink, :except => { :no_release => true } do
    sudo "ln -s #{shared_path}/files #{latest_release}"
  end
end
 
namespace :deploy do
  # adjusted finalize_update, removed non rails stuff
  task :finalize_update, :except => { :no_release => true } do
    sudo "chmod -R g+w #{latest_release}" if fetch(:group_writable, true)
  end
 
  task :restart do
    # nothing to do here since we're on mod-php
  end
end
 
namespace :mikamai do
  # symlinking to production
  namespace :symlink do
    task :application, :except => { :no_release => true } do
      sudo "rm -rf /var/www/montalbano"
      sudo "ln -s #{latest_release} /var/www/montalbano"
    end
  end
 
  # change ownership
  namespace :permissions do
    task :fix, :except => { :no_release => true } do
      sudo "chown -R www-data:www-data #{latest_release}"
    end
  end
 
end

Getting Exif data using ImageScience

In my current Rails project I need to upload photos and save some Exif data taken from them. I use attachment_fu as the upload system, which lets me choose which image processor to use. With rmagick and mini_magick I can extract Exif data with the following code:

# rmagick
image = Magick::ImageList.new(filename).first
puts image['EXIF:Model'] # The camera model used to take the picture

# mini_magick
image = MiniMagick::Image.from_file(filename)
puts image["EXIF:Model"]

The problem is that I can’t do the same thing with image_science, because it has no methods that return Exif data, so I want to add one to the ImageScience class.
Looking through the FreeImage documentation I found two helpful functions, FreeImage_GetMetadata and FreeImage_TagToString. With them I can get an Exif tag and convert it to a readable string. Each of the available tags belongs to one of the following metadata models:

FI_ENUM(FREE_IMAGE_MDMODEL) {
  FIMD_NODATA         = -1,
  FIMD_COMMENTS       = 0,	// single comment or keywords
  FIMD_EXIF_MAIN      = 1,	// Exif-TIFF metadata
  FIMD_EXIF_EXIF      = 2,	// Exif-specific metadata
  FIMD_EXIF_GPS       = 3,	// Exif GPS metadata
  FIMD_EXIF_MAKERNOTE = 4,	// Exif maker note metadata
  FIMD_EXIF_INTEROP   = 5,	// Exif interoperability metadata
  FIMD_IPTC           = 6,	// IPTC/NAA metadata
  FIMD_XMP            = 7,	// Adobe XMP metadata
  FIMD_GEOTIFF        = 8,	// GeoTIFF metadata
  FIMD_ANIMATION      = 9,	// Animation metadata
  FIMD_CUSTOM         = 10	// Used to attach other metadata types to a dib
};

Ok, now I can extract the model of the camera:

FreeImage_GetMetadata(FIMD_EXIF_MAIN, bitmap, "Model", &tag);
printf("%s\n", FreeImage_TagToString(FIMD_EXIF_MAIN, tag, NULL));

As you can see, I need to pass the metadata model that the “Model” tag belongs to. If I don’t know which model to use, I can loop through all of them until FreeImage_GetMetadata returns TRUE:

for(model = 0; model < 11; model++) {
  if(FreeImage_GetMetadata(model, bitmap, tagName, &tag))
    return rb_str_new2(FreeImage_TagToString(model, tag, NULL));
}
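
The same first-match search can be sketched in plain Ruby, with an array of hashes standing in for FreeImage’s metadata models. The data and the name find_tag are made up for illustration; the real lookup happens in C through FreeImage_GetMetadata.

```ruby
# Stand-ins for the metadata models, in the order they are searched.
# The tag values here are sample data, not read from a real image.
MODELS = [
  {},                                               # comments
  { "Make" => "NIKON", "Model" => "COOLPIX S3" },   # Exif main
  { "Flash" => "No flash" }                         # Exif-specific
]

# Return the value from the first model that knows the tag, or nil.
def find_tag(tag_name, models = MODELS)
  models.each do |tags|
    return tags[tag_name] if tags.key?(tag_name)
  end
  nil
end

puts find_tag("Model")
p find_tag("Unknown")
```

The trade-off is the same as in the C loop: the caller no longer needs to know which model a tag lives in, at the cost of trying each model in turn.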

Finally I can write a Ruby module that extends ImageScience and adds the ability to get an Exif tag:

module ImageScienceExifData

  def [](key)
    if key =~ /^EXIF:(\w+)/
      get_exif($1)
    end
  end

  inline do |builder|
    if test ?d, "/opt/local" then
      builder.add_compile_flags "-I/opt/local/include"
      builder.add_link_flags "-L/opt/local/lib"
    end
    builder.add_link_flags "-lfreeimage"
    builder.add_link_flags "-lstdc++" # only needed on PPC for some reason. lame
    builder.include '"FreeImage.h"'

    builder.prefix <<-"END"
      #define GET_BITMAP(name) FIBITMAP *(name); Data_Get_Struct(self, FIBITMAP, (name)); if (!(name)) rb_raise(rb_eTypeError, "Bitmap has already been freed")
    END

    builder.c <<-"END"
      VALUE get_exif(char *tagName) {
        GET_BITMAP(bitmap);
        FITAG *tag = NULL;
        const char *value;
        int model;

        for(model = 0; model < 11; model++) {
          if(FreeImage_GetMetadata(model, bitmap, tagName, &tag))
            return rb_str_new2(FreeImage_TagToString(model, tag, NULL));
        }

        return Qnil;
      }
    END

  end
end
ImageScience.send(:include, ImageScienceExifData)

ImageScience.with_image(filename) do |img|
  puts img["EXIF:Model"]
end

The output with the picture I used (printing the make, model and date tags) is the following:

NIKON
COOLPIX S3
2006:12:10 12:09:17

It doesn’t work with all the Exif tag names, but for now it’s OK for my needs. The next step is to add the code above to my Rails application and use it with attachment_fu. I’ll write another post about that soon.

Moving to github

The GitHub revolution has hit me too. I’m moving all my Radiant extensions to my account on GitHub. Follow me and fork my projects :)!

Radiant Newsletter extension has stats

Some weeks ago Casper Fabricius sent me a patch for the Radiant Newsletter extension. He added a statistics system to track how many times sent emails are opened. I have finally found the time to apply it and make a commit to my repository. Thank you very very much for your work Casper! I’ll write an article about this extension as soon as possible.

WillPaginate with ajax and unobtrusive js

Every day I use the will_paginate plugin to paginate lists of records. Today I needed to paginate with Ajax, and by googling I only found a patch for the plugin that adds some code inside the generated links. So I decided to write a few lines of JavaScript to get the same behaviour:

var Pagination = {  

  initLinks: function() {
    $('container').select('div.pagination a').invoke('observe', 'click', Pagination.linkHandler);
  },

  linkHandler: function(event) {
    event.stop();
    new Ajax.Updater('container', event.element().getAttribute('href'),{
      method: 'get',
      onComplete: Pagination.initLinks
    });
  }

}

document.observe('dom:loaded', Pagination.initLinks);

Obviously the code is not optimized, it’s just an example, but it works. You only need to create a list of records inside a div with the id ‘container’, and that div will be updated with the content loaded by the Ajax request.

[UPDATE] I like the Prototype OO way of writing JS, so I wrote the same thing with a class:

var Pagination = Class.create({ 

  initialize: function() {
    this.options = Object.extend({
      container: 'container'
    }, arguments[0] || {});
    this.initLinks();
  },  

  initLinks: function() {
    $(this.options.container).select('div.pagination a').invoke('observe', 'click', this.linkHandler.bind(this));
  },  

  linkHandler: function(event) {
    event.stop();
    new Ajax.Updater(this.options.container, event.element().getAttribute('href'),{
      method: 'get',
      onComplete: this.initLinks.bind(this)
    });
  }

});

document.observe('dom:loaded', function() {
  new Pagination();
});

This latter version is more customizable because the constructor can receive a hash of options. For now there is only one, the container that will be updated, but you could add more, like the name of the spinner to show during requests.