This is something different, at least for me.

I wanted to build a prototype framework that made use of Sikuli with Ruby… which turned into a JRuby + Sikuli + Cucumber + Webdriver framework. One idea I had was to make automation tests that could validate spots in a video stream to verify that the video is:

  • Playing back
  • Playing in sync with the time stamp
  • Showing a clean image (no distortion)

Something like this could be run on a schedule against many different video streams to track quality around the clock. With the addition of PESQ scores for the expected audio at each point, the audio in the video could be validated for quality as well.

This type of framework could have other useful applications where image recognition is required. Below is a screencast of a working demo of this on YouTube:

To do this, I used Watir-webdriver to drive the browser to a series of time stamps on a YouTube video, then used Sikuli’s computer vision API to validate that the image expected at each time stamp was correct.

Here’s the code/walkthrough….


The JRuby part, on a Mac, involved using RVM to install JRuby… easy enough. Once it was installed, I created a new Ruby project and made sure it was using the JRuby SDK. The reason for JRuby is that I want to make use of the Sikuli gem.


This was pretty tricky. It was easier to do in Groovy than it was in JRuby. Basically I want to call Sikuli via a scripting language so I can use logic when necessary in the validation process. To get Sikuli set up on JRuby, I found a tutorial online that actually worked (most don’t work at all). So hats off to the author there, for actually providing a working example of JRuby and Sikuli.

I copied sikuli-java.jar into the project itself and referenced it in my imports:

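require 'java'
# Load the Sikuli jar that was copied into the project
require "#{Dir.pwd}/features/support/lib/sikuli-java.jar"
# Pull the Sikuli classes used by the steps into Ruby
java_import 'org.sikuli.script.App'
java_import 'org.sikuli.script.Screen'
java_import 'org.sikuli.script.Pattern'
java_import 'org.sikuli.script.KeyModifier'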

s =

The Cucumber steps call Sikuli to validate the video stream dynamically, in a step definition like this:

Then(/^that point should correspond to (.*)$/) do |image|
   image_path = "images/video/#{image}"
   p image_path  # log which reference image is being checked
   # Sikuli then looks for image_path on the screen
end
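
As a rough sketch of what the image check inside a step like this might look like: Sikuli's Screen has an exists(target, timeout) call that returns a match, or nothing if the image never appears within the timeout. The helper name image_on_screen?, the 10 second default, and the FakeScreen stand-in are assumptions for illustration, not the code from this project. The fake lets the sketch run without Sikuli installed; under JRuby a real Screen object would be passed in instead.

```ruby
# Returns true when the reference image appears on screen within the timeout.
# `screen` is expected to behave like org.sikuli.script.Screen, whose
# exists(target, timeout) returns a Match, or nil when nothing matched.
def image_on_screen?(screen, image_path, timeout_seconds = 10.0)
  !screen.exists(image_path, timeout_seconds).nil?
end

# Hypothetical stand-in for a Sikuli Screen, so the sketch runs anywhere.
class FakeScreen
  def initialize(visible_images)
    @visible_images = visible_images
  end

  # Mimics Screen#exists: a truthy match when the image is "visible", else nil.
  def exists(image_path, _timeout_seconds)
    @visible_images.include?(image_path) ? :match : nil
  end
end

screen = FakeScreen.new(['images/video/10s.png'])
puts image_on_screen?(screen, 'images/video/10s.png')
puts image_on_screen?(screen, 'images/video/30s.png')
```

In a step definition, a false result would typically be turned into a failure, for example by raising an error that names the missing image.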


Why Cucumber?  To manage the tests. I love Cucumber.  I love how the features are so easy to understand:

Scenario Outline: We validate several points in a video for expected results
  Given a user goes to <point> on a known video at YouTube
  Then that point should correspond to <image>
  And we close the browser

  Examples:
    | point | image   |
    | 0s    | 1s.png  |
    | 8s    | 10s.png |
    | 28s   | 30s.png |
    | 1m3s  | 1m5s.png |

The above is one of the scenarios I’m running in test. It goes to a dynamic point (time stamp) in the video and validates that what is on screen matches the image expected in the video stream at that time stamp. I did roll the entry point back a second or two, to give the video time to buffer up. So we go to 8 seconds in the video and match against the screenshot of the video at 10 seconds. As playback rolls forward, Sikuli waits a few seconds to see if it catches the expected result.
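
The roll-back arithmetic can be sketched in plain Ruby. The helper names (to_seconds, to_stamp, entry_point) and the fixed two-second buffer are assumptions for illustration; the entry points in the scenario were chosen by hand. With the result clamped at zero, a two-second buffer happens to reproduce all four rows of the table.

```ruby
# Parse a time stamp such as "1m5s" or "30s" into a number of seconds.
def to_seconds(stamp)
  match = /\A(?:(\d+)m)?(?:(\d+)s)?\z/.match(stamp)
  raise ArgumentError, "bad time stamp: #{stamp.inspect}" if match.nil? || stamp.empty?
  match[1].to_i * 60 + match[2].to_i
end

# Format a number of seconds back into the "1m3s" / "8s" style.
def to_stamp(seconds)
  minutes, secs = seconds.divmod(60)
  minutes.zero? ? "#{secs}s" : "#{minutes}m#{secs}s"
end

# Roll the expected time stamp back by a buffer, never going below zero.
def entry_point(expected_stamp, buffer_seconds = 2)
  to_stamp([to_seconds(expected_stamp) - buffer_seconds, 0].max)
end

%w[1s 10s 30s 1m5s].each do |expected|
  puts "#{entry_point(expected)} -> #{expected}.png"
end
```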

If you want to see this fail, simply change a time stamp to one that the expected image does not correspond to.


Watir-webdriver was used to automate the browser.   This time I didn’t use Sikuli for browser automation.  It’s much faster to bypass Sikuli and it’s better in the long run to limit the use of computer vision, especially for web automation.

The browser is defined in the classic way: browser = Watir::Browser.new :ff. Then we call the URL with browser.goto "http://…."

I’m dynamically dropping in the URL parameter that scrubs to different times in the video, so that was done with a method like this:

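Given(/^a user goes to (.*) on a known video at YouTube$/) do |point|
   # Start Firefox through Watir-webdriver
   @browser = Watir::Browser.new :ff
   # The video URL is elided here; point is interpolated as its time parameter
   @browser.goto "http://…#{point}"
end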
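
One way to build the URL that scrubs to a given time, as a sketch only: it assumes the video page accepts a t query parameter in the 1m3s style, and both the helper name url_at and the VIDEO_ID placeholder are hypothetical rather than taken from the project.

```ruby
require 'uri'

# Hypothetical helper: set the start-time parameter `t` on a video URL,
# replacing any `t` that is already present.
def url_at(video_url, point)
  uri = URI.parse(video_url)
  params = URI.decode_www_form(uri.query || '')
  params.reject! { |key, _value| key == 't' }
  params << ['t', point]
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

# VIDEO_ID is a placeholder, not a real video.
puts url_at('https://www.youtube.com/watch?v=VIDEO_ID', '1m3s')
```

The result can then be handed to @browser.goto in the step above, which keeps the step itself free of string handling.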

