HTML Compressor, Closure Compiler, and Ant.

by Jason on June 6, 2012

The other day a friend of mine posted on Facebook that she was 10,000 days old.

About an hour later, I had created the Days Since Birth utility.

Still wanting to code more, I decided to make this project a bit more mature.

So, using Ant, I was able to set up a simple target that compresses the HTML and runs the JavaScript through Google’s Closure Compiler.

I am considering other options, such as linters (even though the code is already JSLint-compatible), but would love others’ thoughts on it.
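For the curious, the Ant target looks roughly like this. This is only a sketch — the jar locations, file names, and directory layout are my assumptions, and the Closure Compiler and HTMLCompressor jars are downloaded separately:

```xml
<target name="compress" description="Minify the HTML and compile the JavaScript">
    <!-- Run the JavaScript through Google's Closure Compiler -->
    <java jar="lib/compiler.jar" fork="true" failonerror="true">
        <arg value="--js" />
        <arg value="src/days.js" />
        <arg value="--js_output_file" />
        <arg value="build/days.min.js" />
    </java>

    <!-- Compress the HTML with HTMLCompressor -->
    <java jar="lib/htmlcompressor.jar" fork="true" failonerror="true">
        <arg value="-o" />
        <arg value="build/index.html" />
        <arg value="src/index.html" />
    </java>
</target>
```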

Source

Python Function Timer Decorator

by Jason on March 26, 2012

I find myself often timing how long code blocks take to execute so I’m able to discover bottlenecks and compare performance between iterations.

A quick & simple way to do this is with a handy Python decorator.

import functools
import time
import logging as log

def log_timing():
    '''Decorator generator that logs the time it takes a function to execute'''
    def decorator(func_to_decorate):
        @functools.wraps(func_to_decorate)  # preserves __name__ and __doc__
        def wrapper(*args, **kwargs):
            start = time.time()
            result = func_to_decorate(*args, **kwargs)
            elapsed = time.time() - start

            log.debug("[TIMING]:%s - %s" % (func_to_decorate.__name__, elapsed))

            return result
        return wrapper
    return decorator
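Since log_timing is a decorator *generator*, remember to call it with parentheses when decorating. A quick self-contained example (the function name and the sleep are just for illustration, and the decorator is repeated so the snippet runs standalone):

```python
import functools
import logging as log
import time

log.basicConfig(level=log.DEBUG)

def log_timing():
    '''Decorator generator that logs the time it takes a function to execute'''
    def decorator(func_to_decorate):
        @functools.wraps(func_to_decorate)  # preserves __name__ and __doc__
        def wrapper(*args, **kwargs):
            start = time.time()
            result = func_to_decorate(*args, **kwargs)
            log.debug("[TIMING]:%s - %s" % (func_to_decorate.__name__, time.time() - start))
            return result
        return wrapper
    return decorator

@log_timing()  # note the parentheses
def slow_add(a, b):
    '''Adds two numbers, slowly.'''
    time.sleep(0.05)
    return a + b

print(slow_add(2, 3))
```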

Automating Fixture Instantiation in Mocha using Async

by Jason on January 31, 2012

My latest Node.js project has been entirely test-driven, and I always strive to write very simple, concise tests. Convoluted test code is far less likely to be maintained.

So when test-driving my latest model, I found myself writing hundreds of lines of code just to instantiate models from pre-built fixtures in the before function. Not ideal.

I much prefer this type of code:


var my_model_fixtures = require(app.root + '/spec/models/fixtures/my_model_fixtures.js');

describe('My Model', function() {
  before(function(done) {
    var that = this;
    this.my_models = [];

    async.map(['fixture1', 'fixture2'],
      function(my_fixture, iterator_callback) {
        var my_model = new MyModel(my_model_fixtures[my_fixture]);
        my_model.save(function(err, data) {
          if (!err) {
            iterator_callback(null, my_model);
          } else {
            iterator_callback(err);
          }
        });
      },
      function(err, results) {
        if (err) {
          console.log("Error instantiating fixtures: " + err);
        }
        that.my_models = results;
        done(err);
      });
  });
});

Self Updating Solr Stopwords

by Jason on December 17, 2011

If you are looking to edge out a bit of performance from Solr, one of the many things you can do is optimize your Solr Stopwords file. The more entries you have in this file, the fewer terms end up in the Solr index.

Creating a Stopwords File

When developing a Stopwords file, it’s good to initially think about what terms you might want to ignore. A good starting point is browsing the community-maintained Stopwords lists.

Self Updating Stopwords File

Once your system has been running for a while and you have a decent index, you should regularly examine the indexed content for data that shouldn’t be indexed.

A great way to do this is to use the TermsComponent. Using the TermsComponent, you can easily determine the top keywords (and the number of indexed documents each is associated with) for a given field.

Example:
http://url.to.solr/solr/terms?terms.fl=MY_FIELD&terms.limit=1000

This will return the top indexed keywords (sorted by frequency descending). The example is returning the top 1000 indexed keywords for MY_FIELD.

Note: It is wise to perform this operation on a Tokenized field. Doing so on a non-tokenized field will simply give you the frequency of entire sentences (or more), not the frequency of keywords as you might expect.

Once you have this list in front of you, comb through it and look at all of the indexed keywords that have absolutely no value to you. Immediately add these terms to your Stopwords file and re-index.

Some smart (but perhaps not wise) person could quite easily create a script to automatically perform a TermsComponent query, extract the top terms, and auto-update the Stopwords file. But, this would require some really tight controls so as not to nullify your entire index.
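As a sketch of that idea, here is what the extraction half might look like in Python. The function name, threshold, and sample data are mine, and the actual TermsComponent query and the review step are deliberately left manual:

```python
import json

def extract_stopword_candidates(terms_json, field, min_doc_freq):
    """Parse a Solr TermsComponent JSON response and return the terms
    whose document frequency is at or above the given threshold."""
    # With the default response layout, terms come back as a flat list:
    # ["term1", count1, "term2", count2, ...]
    data = json.loads(terms_json)
    flat = data["terms"][field]
    pairs = zip(flat[::2], flat[1::2])
    return [term for term, count in pairs if count >= min_doc_freq]

# Hypothetical response for MY_FIELD:
response = json.dumps({"terms": {"MY_FIELD": ["the", 9000, "a", 8500, "solr", 120]}})
print(extract_stopword_candidates(response, "MY_FIELD", 1000))  # ['the', 'a']
```

Per the caution above, any candidates this produces should be reviewed by hand before they are appended to the Stopwords file and the index is rebuilt.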

Setting RailwayJS Environment

by Jason on December 9, 2011

Recently I’ve been working with RailwayJS and integrating Mocha for testing. However, I wasn’t able to figure out how to explicitly set the environment. As such, it was always defaulting to “development,” which is not ideal.

For those wondering, you want to do this:

process.env.NODE_ENV = 'test'; // change to whatever environment you need

You will place this before:
var app = module.exports = require('railway').createServer();

Apache Bench – Setting Multiple Cookies

by Jason on October 12, 2011

Recently I’ve been doing some load testing on some web applications I’ve developed.

The documentation says you can set multiple cookies using multiple “-C” flags. However, this simply DID NOT WORK for me, despite trying everything.

Since I’m sure others have run into this, I’m offering an alternative: simply set the cookies in a header.

Example:

ab -c 5 -n 100 -H "Cookie: PHPSESSID=115821732; another_cookie=asdfasdfasdf" http://localhost/

Happy load testing!

Zend_Service_Spotify

by Jason on October 11, 2011

I just posted the initial beta release of Zend_Service_Spotify to my Github account.

I’ve always really appreciated the Zend Framework components due to their coding style, code comments (not to be confused with real documentation), and interoperability.

So, I decided to add to the mix with Zend_Service_Spotify. Zend_Service_Spotify implements the Zend_Service abstract and provides hooks into all of the functionality of the Spotify Metadata API.

Things that will be added in the near future:

  • Common Iterable Interface – the XML and JSON data structures are totally different. For instance, with XML you iterate over “album” when searching for an album, whereas with JSON you iterate over “albums”. I believe I should make a common iterable interface and have the response format simply be a setting.
  • Include Tests – I currently have 58 tests written in SimpleTest. I plan on including these shortly.
  • Implement Last-Modified and Expires Headers – you know… so we don’t anger Spotify’s server admins :)

Rails Flash Notices in Zend Framework

by Jason on June 7, 2011

Anyone who has used Zend Framework, especially Ruby on Rails developers, has most likely implemented the FlashMessenger Controller Action Helper and wished they could organize the messages into different types.

As a Ruby on Rails developer I have been spoiled with the fantastic Flash Notices system in Rails.

I always wanted this functionality in Zend, so I took it upon myself to create it. Now I give it to you. Feedback is welcome!

Zend Framework RailsMessenger

Updated jQuery Form Field Default Value Plugin

by Jason on May 27, 2011

Hi everyone,

I finally got around to updating my jQuery plugin.

I’ve cleaned up the code a bit and also ensured full compatibility with jQuery 1.6.

A few new features:

Option for property-based default value
Now, if you choose, you can add a label="my default value" attribute to your input field and the plugin will utilize this value for the default. You can still utilize the old method as well.

Option to allow form submission to send default values
By default, if default values are detected when a form is submitted, the values are emptied so an empty value is sent through. Now, you have the option of allowing the default value to be submitted.

Simply use the following code:

jQuery.fn.DefaultValue.settings.clear_defaults_on_submit = false;

You can now find the latest version of the jQuery plugin on Github!

Creating a Tag Cloud with Solr and PHP

by Jason on May 26, 2011

Tag clouds are a really fantastic way to summarize the prominent keywords/tags/words utilized in a system. There really isn’t a better way, visually, to represent this data.

So, how do we create a Tag Cloud using Solr?

The first step is to create a field of the type “text”.

The default Solr schema configures the text type with a whitespace tokenizer, so later on, when we query our data, it will be set up to return individual words.

Let’s assume that our text field is named “product_description” and it contains paragraphs of text.

Now, we use a Facet Query to gather the most prominent keywords.

Solr Facet Queries can be compared to a relational database’s GROUP BY aggregate query.

Our query parameters:

  • q: *:*
  • facet: true
  • facet.field: product_description
  • json.nl: map
  • wt: json

Full example query:
http://my-solr-server:8983/solr/select?q=*:*&facet=true&facet.field=product_description&json.nl=map&wt=json

This example query will select all records (*:*), facet on the product_description field, and return the data set in JSON format. The json.nl=map parameter tells Solr to return each facet field as a keyword-to-count map (rather than a flat list), which is what the normalization code below expects after json_decode.

Normalize the Facet Data

The first thing we need to do when creating a tag cloud is decide the maximum font size and minimum font size.

I personally prefer my maximum font to be 41px and minimum to be 14px, so we’ll go with that.

The hit count for our returned results can be extremely high and/or extremely low, so we’ll need to make sure we normalize the hit counts to be between 14 and 41. We’ll do this with a simple ratio.

Note: For simplicity, I’m going to assume you know how to use cURL, so I will leave out the code that actually executes the query and assume the raw JSON response lives inside $data.


/* interpret raw JSON data returned from Solr */
$data = json_decode($data);

/* define minimum and maximum font size */
$max_font = 41.0;
$min_font = 14.0;

// extract facet information (with json.nl=map this decodes as keyword => count)
$tags = (array) $data->facet_counts->facet_fields->product_description;

// solr returns the results sorted by the facet count in descending order,
// meaning the most prominent keyword is first;
// extract the first hit count to determine the weight ratio
$keyword_weight_ratio = $max_font / (float) reset($tags);

// loop through the returned results and normalize keyword hit counts,
// enforcing the minimum font size as a floor
foreach ($tags as $keyword => $weight) {
    $tags[$keyword] = max($min_font, round($weight * $keyword_weight_ratio));
}

// return the normalized array
return $tags;
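The same normalization can be expressed compactly in any language. Here is a Python sketch (the function name and sample data are mine, not from the post) that also enforces the minimum font size as a floor:

```python
def normalize_tag_weights(tags, min_font=14, max_font=41.0):
    """Scale raw facet counts into font sizes between min_font and max_font.

    Assumes tags maps keyword -> hit count, with the largest count
    determining the scaling ratio (as in the PHP above)."""
    if not tags:
        return {}
    ratio = max_font / max(tags.values())
    return {kw: max(min_font, round(count * ratio)) for kw, count in tags.items()}

print(normalize_tag_weights({"guitar": 200, "drums": 50, "rare": 1}))
# {'guitar': 41, 'drums': 14, 'rare': 14}
```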

Create the Tag Cloud

Now that we’ve queried Solr and Normalized the Facet Data, we are ready to create our tag cloud. The approach I’m taking to display the Tag Cloud is the same approach used by Last.fm.


<div class="tag_cloud" style="width: 500px; min-height: 400px; line-height: 1.5;">
<?php foreach($tags as $keyword=>$font_size): ?>
<span style="font-size: <?= $font_size ?>px">
<a href="#"><?= $keyword ?></a>
</span>
<?php endforeach; ?>
</div>

With a little styling, the end result will look like this.