<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Stream Hacker</title>
	<atom:link href="http://streamhacker.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://streamhacker.wordpress.com</link>
	<description>Weotta be Hacking</description>
	<lastBuildDate>Wed, 30 Mar 2011 13:30:06 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='streamhacker.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Stream Hacker</title>
		<link>http://streamhacker.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://streamhacker.wordpress.com/osd.xml" title="Stream Hacker" />
	<atom:link rel='hub' href='http://streamhacker.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Building a NLTK FreqDist on Redis</title>
		<link>http://streamhacker.wordpress.com/2009/05/20/building-a-nltk-freqdist-on-redis/</link>
		<comments>http://streamhacker.wordpress.com/2009/05/20/building-a-nltk-freqdist-on-redis/#comments</comments>
		<pubDate>Wed, 20 May 2009 20:27:43 +0000</pubDate>
		<dc:creator>Jacob</dc:creator>
				<category><![CDATA[python]]></category>
		<category><![CDATA[nltk]]></category>
		<category><![CDATA[probability]]></category>
		<category><![CDATA[redis]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://streamhacker.wordpress.com/?p=251</guid>
		<description><![CDATA[Say you want to build a frequency distribution of many thousands of samples with the following characteristics: fast to build persistent data network accessible (with no locking requirements) can store large sliceable index lists The only solution I know that meets those requirements is Redis. NLTK&#8217;s FreqDist is not persistent , shelve is far too [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=streamhacker.wordpress.com&amp;blog=5393284&amp;post=251&amp;subd=streamhacker&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Say you want to build a <a title="Frequency Distribution" href="http://en.wikipedia.org/wiki/Frequency_distribution">frequency distribution</a> of many thousands of samples with the following characteristics:</p>
<ul>
<li>fast to build</li>
<li>persistent data</li>
<li>network accessible (with no locking requirements)</li>
<li>can store large sliceable index lists</li>
</ul>
<p>The only solution I know that meets those requirements is <a title="Redis Key-Value Database" href="http://code.google.com/p/redis/">Redis</a>. <a title="NLTK Probability module" href="http://nltk.googlecode.com/svn/trunk/doc/api/toc-nltk.probability-module.html">NLTK&#8217;s</a> <a title="NLTK FreqDist class" href="http://nltk.googlecode.com/svn/trunk/doc/api/nltk.probability.FreqDist-class.html">FreqDist</a> is not persistent, <a title="Python object persistence" href="http://docs.python.org/library/shelve.html">shelve</a> is far too slow, <a title="What BerkeleyDB is not" href="http://doc.gnu-darwin.org/intro/dbisnot.html">BerkeleyDB is not network accessible</a> (and is generally a PITA to manage), and AFAIK there&#8217;s no other key-value store that makes sliceable lists really easy to create &amp; access. So far I&#8217;ve been quite pleased with <a title="Redis README" href="http://code.google.com/p/redis/wiki/README">Redis</a>, especially given how new it is. It&#8217;s quite <a title="Introducing Redis: a fast key-value database" href="http://antoniocangiano.com/2009/03/11/introducing-redis-a-key-value-database/">fast</a>, is network accessible, offers atomic operations that make locking unnecessary, supports <a title="Sorting in a key-value data model" href="http://antirez.com/post/Sorting-in-key-value-data-model.html">sortable</a> and <a title="Redis not just another key-value store" href="http://highscalability.com/product-redis-not-just-another-key-value-store">sliceable list structures</a>, and is very easy to configure.</p>
<h3>Classification</h3>
<p>Building a <a title="NLTK FreqDist class" href="http://nltk.googlecode.com/svn/trunk/doc/api/nltk.probability.FreqDist-class.html">FreqDist</a> allows you to create a <a title="NLTK ProbDist interface" href="http://nltk.googlecode.com/svn/trunk/doc/api/nltk.probability.ProbDistI-class.html">ProbDist</a>, which in turn can be used for <a title="NLTK Classify module" href="http://nltk.googlecode.com/svn/trunk/doc/api/toc-nltk.classify-module.html">classification</a>. Having it be persistent lets you examine the data later. And the ability to create sliceable lists allows you to make sorted indexes for paging thru your samples.</p>
<p>Here are some more concrete use cases for persistent frequency distributions:</p>
<ul>
<li><a title="Frequency Analysis" href="http://en.wikipedia.org/wiki/Frequency_analysis">frequency analysis</a></li>
<li><a title="Bayesian Spam Filtering" href="http://en.wikipedia.org/wiki/Bayesian_spam_filtering">spam classification</a></li>
<li><a title="Learning to Classify Text with NLTK" href="http://nltk.googlecode.com/svn/trunk/doc/book/ch06.html">text categorization</a></li>
</ul>
<h3>RedisFreqDist</h3>
<p>I put the code I&#8217;ve been using to build <a title="how you would develop a frequency-sorted list of the ten thousand most-used words in the English language" href="http://asserttrue.blogspot.com/2009/05/one-of-toughest-job-interview-questions.html">frequency distributions over large sets of words</a> up at <a title="Extra modules for NLTK" href="http://bitbucket.org/japerk/nltk-extras/">BitBucket</a>. <a title="Redis NLTK probability classes" href="http://bitbucket.org/japerk/nltk-extras/src/tip/probability.py">probability.py</a> contains <code>RedisFreqDist</code>, which works just like the <a title="NLTK FreqDist class" href="http://nltk.googlecode.com/svn/trunk/doc/api/nltk.probability.FreqDist-class.html">NLTK FreqDist</a>, except it stores samples and frequencies as keys and values in <a title="Redis Key-Value Database" href="http://code.google.com/p/redis/">Redis</a>. That means <strong>samples must be strings</strong>. Internally, <code>RedisFreqDist</code> also stores a set of all the samples under the key <em>__samples__</em> for efficient lookup and sorting. Here&#8217;s some example code for using it. For more info, check out the <a title="NLTK Extra Modules wiki" href="http://bitbucket.org/japerk/nltk-extras/wiki/Home">wiki</a>, or read the <a title="Redis NLTK probability classes" href="http://bitbucket.org/japerk/nltk-extras/src/tip/probability.py">code</a>.</p>
<pre class="brush: python;">
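# RedisFreqDist is defined in probability.py from nltk-extras
def make_freq_dist(samples, host='localhost', port=6379, db=0):
	freqs = RedisFreqDist(host=host, port=port, db=db)

	# each call increments the count stored in Redis for that sample
	for sample in samples:
		freqs.inc(sample)

	return freqs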
</pre>
<p>Unfortunately, I had to muck about with some of <a title="NLTK FreqDist class" href="http://nltk.googlecode.com/svn/trunk/doc/api/nltk.probability.FreqDist-class.html">FreqDist&#8217;s</a> internal implementation to remain compatible, so I can&#8217;t promise the code will work beyond <a title="Natural Language ToolKit" href="http://www.nltk.org/">NLTK</a> version <a title="NLTK 0.9.9" href="http://nltk.googlecode.com/files/nltk-0.9.9.zip">0.9.9</a>. <a title="Redis NLTK probability classes" href="http://bitbucket.org/japerk/nltk-extras/src/tip/probability.py">probability.py</a> also includes <code>ConditionalRedisFreqDist</code> for creating <a title="NLTK ConditionalProbDist class" href="http://nltk.googlecode.com/svn/trunk/doc/api/nltk.probability.ConditionalProbDist-class.html">ConditionalProbDists</a>.</p>
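<p>To make the storage layout described above more concrete, here is a minimal sketch of a frequency distribution kept in a key-value mapping. It is not the code from nltk-extras: a plain Python dict stands in for Redis so the example runs without a server, and the class name <code>DictBackedFreqDist</code> and its methods are made up for illustration.</p>

```python
class DictBackedFreqDist:
    """Toy frequency distribution over a key-value mapping.

    Assumption: a plain dict plays the role of Redis. Samples are
    keys, counts are values, and one reserved key holds the set of
    all samples seen so far.
    """

    SAMPLES_KEY = '__samples__'

    def __init__(self):
        self._store = {self.SAMPLES_KEY: set()}

    def inc(self, sample, count=1):
        # keys in the store are strings, so samples must be too
        if not isinstance(sample, str):
            raise TypeError('samples must be strings')
        if sample == self.SAMPLES_KEY:
            raise ValueError('sample collides with the reserved key')
        self._store[self.SAMPLES_KEY].add(sample)
        self._store[sample] = self._store.get(sample, 0) + count

    def __getitem__(self, sample):
        # unseen samples have a frequency of zero
        return self._store.get(sample, 0)

    def samples(self):
        # most frequent first, ties broken alphabetically
        seen = self._store[self.SAMPLES_KEY]
        return sorted(seen, key=lambda s: (-self[s], s))


fd = DictBackedFreqDist()
for word in 'the cat and the hat'.split():
    fd.inc(word)

print(fd['the'])      # count for a single sample
print(fd.samples())   # all samples, most frequent first
```

<p>Keeping the set of samples under its own key is what makes sorting possible without scanning every key in the store.</p>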
<h3>Lists</h3>
<p>How you create lists of samples depends very much on your use case, but here&#8217;s some example code for doing so. <code>r</code> is a <a title="Redis python interface" href="http://bitbucket.org/japerk/nltk-extras/src/tip/redis.py">redis</a> object, <code>key</code> is the index key for storing the list, and <code>samples</code> is assumed to be a sorted list. The <code>get_samples</code> function demonstrates how to get a slice of samples from the list.</p>
<pre class="brush: python;">
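def index_samples(r, key, samples):
	# start from an empty list so stale entries don't linger
	r.delete(key)

	# append in order so the list keeps the sort order of samples
	for sample in samples:
		r.push(key, sample, tail=True)

def get_samples(r, key, start, end):
	# lrange treats both start and end as inclusive indexes
	return r.lrange(key, start, end)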
</pre>
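<p>Redis list ranges include both the start and the end index, which is easy to trip over when paging. The sketch below mimics that behaviour on an ordinary Python list so it can be run without a server; <code>lrange</code> and <code>get_page</code> are illustrative helpers, not part of any Redis client.</p>

```python
def lrange(items, start, end):
    """Mimic Redis LRANGE on a Python list.

    Both indexes are inclusive, and negative indexes count from the
    tail (-1 is the last element).
    """
    n = len(items)
    if start < 0:
        start = max(n + start, 0)
    if end < 0:
        end = n + end
    if end < 0 or start > end:
        return []
    return items[start:end + 1]


def get_page(items, page, per_page=3):
    # page 0 covers indexes 0..per_page-1, page 1 the next block, etc.
    start = page * per_page
    return lrange(items, start, start + per_page - 1)


words = sorted(['redis', 'nltk', 'freqdist', 'python', 'sample', 'index', 'list'])

print(get_page(words, 0))
print(get_page(words, 2))     # last, partial page
print(lrange(words, -2, -1))  # final two entries
```

<p>With a real Redis list the same arithmetic applies: pass <code>start</code> and <code>start + per_page - 1</code> to <code>get_samples</code>.</p>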
<p>Yes, <a title="Redis Key-Value Database" href="http://code.google.com/p/redis/">Redis</a> is still fairly alpha, so I wouldn&#8217;t use it for critical systems. But I&#8217;ve had very few issues so far, especially compared to dealing with <a title="Oracle Berkeley DB" href="http://www.oracle.com/technology/products/berkeley-db/index.html">BerkeleyDB</a>. I highly recommend it for your non-critical computational needs <img src='http://s2.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://streamhacker.wordpress.com/2009/05/20/building-a-nltk-freqdist-on-redis/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/348a20c3a576b2cb26674f1bc9eaf012?s=96&#38;d=identicon" medium="image">
			<media:title type="html">jacob</media:title>
		</media:content>
	</item>
		<item>
		<title>Deploying Django with Mercurial, Fab and Nginx</title>
		<link>http://streamhacker.wordpress.com/2009/04/26/deploying-django-with-mercurial-fab-and-nginx/</link>
		<comments>http://streamhacker.wordpress.com/2009/04/26/deploying-django-with-mercurial-fab-and-nginx/#comments</comments>
		<pubDate>Sun, 26 Apr 2009 18:00:11 +0000</pubDate>
		<dc:creator>Jacob</dc:creator>
				<category><![CDATA[python]]></category>
		<category><![CDATA[django]]></category>
		<category><![CDATA[fab]]></category>
		<category><![CDATA[fastcgi]]></category>
		<category><![CDATA[mercurial]]></category>
		<category><![CDATA[nginx]]></category>

		<guid isPermaLink="false">http://streamhacker.wordpress.com/?p=226</guid>
		<description><![CDATA[Writing web apps with Django can be a lot of fun, but deploying them can be a chore, even if you&#8217;re using Apache. Here&#8217;s a setup I&#8217;ve been using that makes deployment fast and easy. This all assumes you&#8217;ve got sudo access on a remote server running Ubuntu or something similar. Mercurial This setup assumes [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=streamhacker.wordpress.com&amp;blog=5393284&amp;post=226&amp;subd=streamhacker&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Writing web apps with <a title="Django Python Web Framework" href="http://www.djangoproject.com/">Django</a> can be a lot of fun, but <a title="Deploying Django" href="http://gnuvince.wordpress.com/2008/01/10/deploying-django/">deploying</a> them can be a chore, even if you&#8217;re using <a title="Deploying Django with Apache and mod_wsgi" href="http://docs.djangoproject.com/en/dev/howto/deployment/modwsgi/#howto-deployment-modwsgi">Apache</a>. Here&#8217;s a setup I&#8217;ve been using that makes <a title="Deploying Django" href="http://www.djangobook.com/en/beta/chapter21/">deployment</a> fast and easy. This all assumes you&#8217;ve got <code>sudo</code> access on a remote server running <a title="Ubuntu Linux" href="http://www.ubuntu.com/">Ubuntu</a> or something similar.</p>
<h3>Mercurial</h3>
<p>This setup assumes you&#8217;ve got 2 <a title="Mercurial Python Version Control" href="http://www.selenic.com/mercurial/wiki/">mercurial</a> repositories: 1 on your local machine, and 1 on the remote server you&#8217;re deploying to. In the remote repository, add the following to <a title="hgrc man page" href="http://www.selenic.com/mercurial/hgrc.5.html">.hg/hgrc</a>:</p>
<pre><code>[hooks]
changegroup = hg up</code></pre>
<p>This makes mercurial run <code>hg up</code> whenever you push new code. Then in your local repo&#8217;s <a title="hgrc man page" href="http://www.selenic.com/mercurial/hgrc.5.html">.hg/hgrc</a>, make sure the default path points to your remote repo. Here&#8217;s an example:</p>
<pre>[paths]
default = ssh://user@domain.com/repo</pre>
<p>Now when you run <code>hg push</code>, you don&#8217;t need to include the path to the repo, and your code will be updated immediately.</p>
<h3>FastCGI</h3>
<p>Since I&#8217;m using <a title="nginx" href="http://nginx.net/">nginx</a> instead of <a title="The Apache HTTPD Server Project" href="http://httpd.apache.org/">Apache</a>, we&#8217;ll be <a title="How to use Django with FastCGI" href="http://docs.djangoproject.com/en/dev/howto/deployment/fastcgi/#howto-deployment-fastcgi">deploying Django with FastCGI</a>. Here&#8217;s an example script you can use to start and restart your Django FastCGI server. Add this script to your mercurial repo as <code>run_fcgi.sh</code>.</p>
<pre><code>#!/bin/bash
PIDFILE="/tmp/django.pid"
SOCKET="/tmp/django.sock"

# kill current fcgi process if it exists
if [ -f $PIDFILE ]; then
    kill `cat -- $PIDFILE`
    rm -f -- $PIDFILE
fi

python manage.py runfcgi socket=$SOCKET pidfile=$PIDFILE method=prefork</code></pre>
<p><strong>Important note:</strong> the FastCGI socket file will need to be readable &amp; writable by nginx worker processes, which run as the <em>www-data</em> user in Ubuntu. This will be handled by the <code>fab restart</code> command below, or you could add <code>chmod a+w $SOCKET</code> to the end of the above script.</p>
<h3>Nginx</h3>
<p>Nginx is a great <a title="Apache vs Nginx Webserver Performance Deathmatch" href="http://www.joeandmotorboat.com/2008/02/28/apache-vs-nginx-web-server-performance-deathmatch/">high performance</a> web server with <a title="Ubuntu Nginx Configuration" href="http://articles.slicehost.com/2008/5/15/ubuntu-hardy-nginx-configuration/">simple configuration</a>. Here&#8217;s a simple example server config for proxying to your Django <a title="Nginx Http Fcgi Module" href="http://wiki.nginx.org/NginxHttpFcgiModule">FastCGI</a> process. Add this config to your mercurial repo as <code>django.nginx</code>.</p>
<pre><code>server {
    listen 80;
    # change to your FQDN
    server_name YOUR.DOMAIN.COM;

    location / {
        # must be the same socket file as in the above fcgi script
        fastcgi_pass unix:/tmp/django.sock;
    }
}
</code></pre>
<p>On the remote server, make sure the following lines are in the <code>http</code> section of <code>/etc/nginx/nginx.conf</code>:</p>
<pre><code>include /etc/nginx/sites-enabled/*;
# fastcgi_params should contain a lot of fastcgi_param variables
include /etc/nginx/fastcgi_params;</code></pre>
<p>You must also make sure there is a link in <code>/etc/nginx/sites-enabled</code> to your <code>django.nginx</code> config. Don&#8217;t worry if <code>django.nginx</code> doesn&#8217;t exist yet, it will once you run <code>fab nginx</code> the first time.</p>
<pre><code>you@remote.ubuntu$ cd /etc/nginx/sites-enabled
you@remote.ubuntu$ sudo ln -s ../sites-available/django.nginx django.nginx</code></pre>
<h3>Fab</h3>
<p>Fab, or properly <a title="Fabric: simple pythonic deployment" href="http://www.nongnu.org/fab/">Fabric</a>, is my favorite new tool. It&#8217;s designed specifically for making remote deployment simple and easy. You create a <code>fabfile</code> where each function is a <a title="Fabric user guide" href="http://www.nongnu.org/fab/user_guide.html">fab command</a> that can run <a title="Fab remote command" href="http://www.nongnu.org/fab/api.html#operations_run">remote</a> and <a title="Fab sudo command" href="http://www.nongnu.org/fab/api.html#operations_sudo">sudo</a> commands on one or more remote hosts. So let&#8217;s <a title="Deploying Django with Fabric" href="http://blog.thescoop.org/archives/2008/12/02/deploying-django-with-fabric/">deploy Django using fab</a>. Here&#8217;s an example <code>fabfile</code> with 2 commands: <code>restart</code> and <code>nginx</code>. These commands should only be run after you&#8217;ve done a <code>hg push</code>.</p>
<pre class="brush: python;">
config.fab_hosts = ['YOUR.DOMAIN.COM']
config.projdir = '/PATH/TO/YOUR/REMOTE/HG/REPO'

def restart():
    sudo('cd %(projdir)s; bash run_fcgi.sh', user='www-data', fail='abort')

def nginx():
    sudo('cp %(projdir)s/django.nginx /etc/nginx/sites-available/', fail='abort')
    sudo('killall -HUP nginx', fail='abort')
</pre>
<h4>restart</h4>
<p>You only need to run <code>fab restart</code> if you&#8217;ve changed the actual python code. Changes to templates or static files don&#8217;t require a restart and will be used automatically (because of the <code>hg up</code> changegroup hook). Executing <code>run_fcgi.sh</code> as the <em>www-data</em> user ensures that nginx can read &amp; write the socket.</p>
<h4>nginx</h4>
<p>If you&#8217;ve changed your nginx server config, you can run <code>fab nginx</code> to install and reload the new server config without restarting the nginx server.</p>
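<p>The <code>%(projdir)s</code> placeholders in the fabfile use Python&#8217;s ordinary mapping-based string formatting, with the values coming from the config. The standalone sketch below shows that substitution in plain Python. It assumes nothing about Fabric itself: the <code>settings</code> dict, the <code>render</code> helper, and the example path are all hypothetical.</p>

```python
# Hypothetical settings, standing in for values set on the fab config.
settings = {
    'projdir': '/srv/example/repo',  # made-up path, for illustration only
}


def render(command, values):
    """Fill %(name)s placeholders in a command template from a dict."""
    return command % values


copy_cmd = render('cp %(projdir)s/django.nginx /etc/nginx/sites-available/', settings)
print(copy_cmd)

# A placeholder with no matching setting fails loudly instead of
# silently producing a broken command.
try:
    render('cd %(missing)s', settings)
except KeyError as exc:
    print('unknown setting:', exc)
```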
<h3>Wrap Up</h3>
<p>Now that everything is set up, the next time you want to deploy some new code, it&#8217;s as simple as <code>hg push &amp;&amp; fab restart</code>. And if you&#8217;ve only changed templates, all you need to do is <code>hg push</code>. I hope this helps make your Django development life easier. It has certainly done so for me <img src='http://s2.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://streamhacker.wordpress.com/2009/04/26/deploying-django-with-mercurial-fab-and-nginx/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/348a20c3a576b2cb26674f1bc9eaf012?s=96&#38;d=identicon" medium="image">
			<media:title type="html">jacob</media:title>
		</media:content>
	</item>
		<item>
		<title>Django Datetime Snippets</title>
		<link>http://streamhacker.wordpress.com/2009/04/13/django-datetime-snippets/</link>
		<comments>http://streamhacker.wordpress.com/2009/04/13/django-datetime-snippets/#comments</comments>
		<pubDate>Mon, 13 Apr 2009 16:12:20 +0000</pubDate>
		<dc:creator>Jacob</dc:creator>
				<category><![CDATA[python]]></category>
		<category><![CDATA[datetime]]></category>
		<category><![CDATA[dateutil]]></category>
		<category><![CDATA[django]]></category>
		<category><![CDATA[forms]]></category>
		<category><![CDATA[json]]></category>
		<category><![CDATA[templates]]></category>
		<category><![CDATA[utc]]></category>

		<guid isPermaLink="false">http://streamhacker.wordpress.com/?p=214</guid>
		<description><![CDATA[I&#8217;ve started posting over at Django snippets, which is a great resource for finding useful bits of functionality. My first set of snippets is focused on datetime conversions. The Snippets FuzzyDateTimeField is a drop in replacement for the standard DateTimeField that uses dateutil.parser with fuzzy=True to clean the value, allowing the parser to be more [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=streamhacker.wordpress.com&amp;blog=5393284&amp;post=214&amp;subd=streamhacker&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve started posting over at <a title="Django Snippets" href="http://www.djangosnippets.org/users/japerk/">Django snippets</a>, which is a great resource for finding useful bits of functionality. My first set of snippets is focused on datetime conversions.</p>
<h3>The Snippets</h3>
<p><a title="Django Snippets FuzzyDateTimeField" href="http://www.djangosnippets.org/snippets/1422/">FuzzyDateTimeField</a> is a drop-in replacement for the standard <a href="http://docs.djangoproject.com/en/dev/ref/forms/fields/#datetimefield">DateTimeField</a> that uses <a href="http://labix.org/python-dateutil#head-a23e8ae0a661d77b89dfb3476f85b26f0b30349c">dateutil.parser</a> with <code>fuzzy=True</code> to clean the value, allowing the parser to be more liberal with the input formats it accepts.</p>
<p>The <a title="Django Snippets isoutc template filter" href="http://www.djangosnippets.org/snippets/1424/">isoutc template filter</a> produces an ISO format UTC datetime string from a <a href="http://docs.python.org/library/datetime.html#datetime.tzinfo">timezone aware</a> <a href="http://docs.python.org/library/datetime.html#datetime.datetime">datetime</a> object.</p>
<p>The <a title="Django Snippets timeto template filter" href="http://www.djangosnippets.org/snippets/1426/">timeto template filter</a> is a more compact version of django&#8217;s <a href="http://docs.djangoproject.com/en/dev/ref/templates/builtins/#timeuntil">timeuntil</a> filter that only shows hours &amp; minutes, such as &#8220;1hr 30min&#8221;.</p>
<p><a title="Django Snippets JSON encode ISO UTC datetime" href="http://www.djangosnippets.org/snippets/1435/">JSON encode ISO UTC datetime</a> is a way to encode <a title="Python datetime objects" href="http://docs.python.org/library/datetime.html#datetime-objects">datetime objects</a> as ISO strings just like the <a title="Django Snippets isoutc template filter" href="http://www.djangosnippets.org/snippets/1424/">isoutc template filter</a>.</p>
<p><a title="Django Snippets JSON decode datetime" href="http://www.djangosnippets.org/snippets/1436/">JSON decode datetime</a> is a <a title="simplejson" href="http://code.google.com/p/simplejson/">simplejson</a> object hook for converting the <code>datetime</code> attribute of a JSON object to a <a title="Python datetime objects" href="http://docs.python.org/library/datetime.html#datetime-objects">python datetime object</a>. This is especially useful if you have a list of objects that all have <code>datetime</code> attributes that need to be decoded.</p>
<h3>Use Case</h3>
<p>Imagine you&#8217;re making a time based search engine for movies and/or events. Because your data will span many timezones, you decide that all dates &amp; times should be stored on the server as UTC. This pushes local timezone conversion to the client side, where it belongs, simplifying the server side data structures and search operations. You want your search engine to be AJAX enabled, but you don&#8217;t like XML because it&#8217;s so verbose, so you go with JSON for serialization. You also want users to be able to input their own range based queries without being forced to use specific datetime formats. Leaving out all the hard stuff, the above snippets can be used for communication between a django webapp and a time based search engine.</p>
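<p>As a rough illustration of that round trip, here is a sketch using only the Python 3 standard library. It is not the code from the snippets linked above: <code>iso_utc</code>, <code>DatetimeEncoder</code>, and <code>decode_datetime</code> are names invented for this example, and the fixed UTC-7 offset and the movie title are made-up sample data.</p>

```python
import json
from datetime import datetime, timedelta, timezone

ISO_UTC_FORMAT = '%Y-%m-%dT%H:%M:%SZ'


def iso_utc(dt):
    """Convert a timezone-aware datetime to an ISO 8601 string in UTC."""
    if dt.tzinfo is None:
        raise ValueError('expected a timezone-aware datetime')
    return dt.astimezone(timezone.utc).strftime(ISO_UTC_FORMAT)


class DatetimeEncoder(json.JSONEncoder):
    """JSON encoder that serializes datetimes as ISO UTC strings."""

    def default(self, obj):
        if isinstance(obj, datetime):
            return iso_utc(obj)
        return super().default(obj)


def decode_datetime(obj):
    """object_hook: turn the 'datetime' attribute back into a datetime."""
    if 'datetime' in obj:
        parsed = datetime.strptime(obj['datetime'], ISO_UTC_FORMAT)
        obj['datetime'] = parsed.replace(tzinfo=timezone.utc)
    return obj


# Sample data: a 7:30 PM showing in a fixed UTC-7 zone.
local_zone = timezone(timedelta(hours=-7))
showtime = datetime(2009, 4, 13, 19, 30, tzinfo=local_zone)

payload = json.dumps([{'title': 'Example Movie', 'datetime': showtime}],
                     cls=DatetimeEncoder)
events = json.loads(payload, object_hook=decode_datetime)

print(payload)
print(events[0]['datetime'] == showtime)  # same instant, now in UTC
```

<p>Because the object hook runs for every JSON object, a whole list of events gets its <code>datetime</code> attributes decoded in one pass.</p>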
]]></content:encoded>
			<wfw:commentRss>http://streamhacker.wordpress.com/2009/04/13/django-datetime-snippets/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/348a20c3a576b2cb26674f1bc9eaf012?s=96&#38;d=identicon" medium="image">
			<media:title type="html">jacob</media:title>
		</media:content>
	</item>
		<item>
		<title>Dates and Times in Python and Javascript</title>
		<link>http://streamhacker.wordpress.com/2009/04/02/dates-and-times-in-python-and-javascript/</link>
		<comments>http://streamhacker.wordpress.com/2009/04/02/dates-and-times-in-python-and-javascript/#comments</comments>
		<pubDate>Thu, 02 Apr 2009 16:06:53 +0000</pubDate>
		<dc:creator>Jacob</dc:creator>
				<category><![CDATA[javascript]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[datejs]]></category>
		<category><![CDATA[datetime]]></category>
		<category><![CDATA[dateutil]]></category>
		<category><![CDATA[parsing]]></category>
		<category><![CDATA[timezones]]></category>

		<guid isPermaLink="false">http://streamhacker.wordpress.com/?p=191</guid>
		<description><![CDATA[If you are dealing with dates &#38; times in python and/or javascript, there are two must have libraries. Datejs python-dateutil Datejs Datejs, being javascript, is designed for parsing and creating human readable dates &#38; times. It&#8217;s powerful parse() function can handle all the dates &#38; times you&#8217;d expect, plus fuzzier human readable date words. Here [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=streamhacker.wordpress.com&amp;blog=5393284&amp;post=191&amp;subd=streamhacker&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>If you are dealing with dates &amp; times in python and/or javascript, there are two must have libraries.</p>
<ol>
<li><a title="Datejs - A Javascript Date Library" href="http://www.datejs.com/">Datejs</a></li>
<li><a title="python-dateutil Labix" href="http://labix.org/python-dateutil">python-dateutil</a></li>
</ol>
<h3>Datejs</h3>
<p><a title="Datejs - A Javascript Date Library" href="http://www.datejs.com/">Datejs</a>, being javascript, is designed for parsing and creating human readable dates &amp; times. Its powerful <a title="Datejs Parse Function" href="http://code.google.com/p/datejs/wiki/APIDocumentation#parse">parse()</a> function can handle all the dates &amp; times you&#8217;d expect, plus fuzzier human readable date words. Here are some examples from their site.</p>
<pre class="brush: jscript;">
Date.parse(&quot;February 20th 1973&quot;);
Date.parse(&quot;Thu, 1 July 2004 22:30:00&quot;);
Date.parse(&quot;today&quot;);
Date.parse(&quot;next thursday&quot;);
</pre>
<p>And if you are programmatically creating <a title="Javascript Date Object" href="http://www.w3schools.com/jsref/jsref_obj_date.asp">Date objects</a>, here are a few functions I find myself using frequently.</p>
<pre class="brush: jscript;">
// get a new Date object set to local date
var dt = Date.today();
// get that same Date object set to current time
var dt = Date.today().setTimeToNow();

// set the local time to 10:30 AM
var dt = Date.today().set({hour: 10, minute: 30});
// produce an ISO formatted datetime string converted to UTC
dt.toISOString();
</pre>
<p>There&#8217;s plenty more in the <a title="Datejs API Documentation" href="http://code.google.com/p/datejs/wiki/APIDocumentation">documentation</a>; pretty much everything you need for <a title="Datejs add Function" href="http://code.google.com/p/datejs/wiki/APIDocumentation#add">manipulation</a>, <a title="Datejs compareTo Function" href="http://code.google.com/p/datejs/wiki/APIDocumentation#compareTo">comparison</a>, and <a title="Datejs toString Function" href="http://code.google.com/p/datejs/wiki/APIDocumentation#toString">string conversion</a>. Datejs cleanly extends the default <a title="Javascript Date Object" href="http://www.w3schools.com/jsref/jsref_obj_date.asp">Date object</a>, has been integrated into a couple <a title="Date Range Picker using jQuery 1.7" href="http://www.filamentgroup.com/lab/date_range_picker_using_jquery_ui_16_and_jquery_ui_css_framework/">date</a> <a title="Sliding Date Picker" href="http://ajaxorized.com/introducing-the-sliding-date-picker/">pickers</a>, and supports culture specific parsing for <a title="Internationalization and Localization" href="http://en.wikipedia.org/wiki/Internationalization_and_localization">i18n</a>.</p>
<h3>python-dateutil</h3>
<p>Like Datejs, <a title="python dateutil" href="http://labix.org/python-dateutil">dateutil</a> also has a powerful <a title="python dateutil parse function" href="http://labix.org/python-dateutil#head-c0e81a473b647dfa787dc11e8c69557ec2c3ecd2">parse()</a> function. While it can&#8217;t handle words like &#8220;today&#8221; or &#8220;tomorrow&#8221;, it can handle nearly every (American) date format that exists. Here are a few examples.</p>
<pre class="brush: python;">
&gt;&gt;&gt; from dateutil import parser
&gt;&gt;&gt; parser.parse(&quot;Thu, 4/2/09 09:00 PM&quot;)
datetime.datetime(2009, 4, 2, 21, 0)
&gt;&gt;&gt; parser.parse(&quot;04/02/09 9:00PM&quot;)
datetime.datetime(2009, 4, 2, 21, 0)
&gt;&gt;&gt; parser.parse(&quot;04-02-08 9pm&quot;)
datetime.datetime(2008, 4, 2, 21, 0)
</pre>
<p>An option that comes especially in handy is to pass in <em>fuzzy=True</em>. This tells <a title="python dateutil parse function" href="http://labix.org/python-dateutil#head-c0e81a473b647dfa787dc11e8c69557ec2c3ecd2">parse()</a> to ignore unknown tokens while parsing. This next example would raise a <em>ValueError</em> without <em>fuzzy=True</em>.</p>
<pre class="brush: python;">
&gt;&gt;&gt; parser.parse(&quot;Thurs, 4/2/09 09:00 PM&quot;, fuzzy=True)
</pre>
<p>I don&#8217;t know how well it works for international date formats, but <a title="python dateutil parse function" href="http://labix.org/python-dateutil#head-c0e81a473b647dfa787dc11e8c69557ec2c3ecd2">parse()</a> does have options for reading days first and years first, so I&#8217;m guessing it can be made to work.</p>
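<p>As a rough sketch of those options (assuming python-dateutil is installed, and using made-up date strings), <em>dayfirst</em> and <em>yearfirst</em> change how an ambiguous all-numeric date gets read:</p>

```python
import datetime
from dateutil import parser

# Default reading is month first, as in the American examples above.
us = parser.parse("02/04/09")
# dayfirst=True reads the same string as day/month/year.
eu = parser.parse("02/04/09", dayfirst=True)
# yearfirst=True reads the leading two-digit field as the year.
year_first = parser.parse("09/04/02", yearfirst=True)

print(us.date(), eu.date(), year_first.date())
```

<p>This only covers numeric dates; I haven&#8217;t tried it with month or day names in other languages.</p>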
<p><a title="python dateutil" href="http://labix.org/python-dateutil">dateutil</a> also provides some great <a title="Working with dates, times, and timezones in python" href="http://code.davidjanes.com/blog/2008/12/22/working-with-dates-times-and-timezones-in-python/">timezone</a> support. I&#8217;ve always been <a title="Relativity of time - shortcomings in Python datetime, and workaround" href="http://blog.redinnovation.com/2008/06/30/relativity-of-time-shortcomings-in-python-datetime-and-workaround/">surprised</a> at python&#8217;s lack of concrete <a title="python datetime tzinfo" href="http://docs.python.org/library/datetime.html#datetime.tzinfo">tzinfo</a> classes, but <a title="python dateutil tzinfo" href="http://labix.org/python-dateutil#head-587bd3efc48f897f55c179abc520a34330ee0a62">dateutil.tz</a> more than makes up for it (there&#8217;s also <a title="pytz" href="http://pytz.sourceforge.net/">pytz</a>, but I haven&#8217;t figured out why I need it instead of or in addition to <a title="python dateutil tzinfo" href="http://labix.org/python-dateutil#head-587bd3efc48f897f55c179abc520a34330ee0a62">dateutil.tz</a>). Here&#8217;s a function for parsing a string and returning a UTC datetime object.</p>
<pre class="brush: python;">
from dateutil import parser, tz

def parse_to_utc(s):
    dt = parser.parse(s, fuzzy=True)
    # assume local time only if the string carried no timezone of its own
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=tz.tzlocal())
    return dt.astimezone(tz.tzutc())
</pre>
<p><a title="python dateutil" href="http://labix.org/python-dateutil">dateutil</a> does a lot more than provide tzinfo objects and parse datetimes; it can also calculate <a title="python dateutil relativedelta" href="http://labix.org/python-dateutil#head-ba5ffd4df8111d1b83fc194b97ebecf837add454">relative deltas</a> and handle iCal <a title="python dateutil rrule" href="http://labix.org/python-dateutil#head-470fa22b2db72000d7abe698a5783a46b0731b57">recurrence rules</a>. I&#8217;m sure a whole calendar application could be built based on dateutil, but my interest is in parsing and converting datetimes to and from UTC, and in that respect dateutil excels.</p>
<br />Posted in javascript, python Tagged: datejs, datetime, dateutil, parsing, timezones]]></content:encoded>
			<wfw:commentRss>http://streamhacker.wordpress.com/2009/04/02/dates-and-times-in-python-and-javascript/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/348a20c3a576b2cb26674f1bc9eaf012?s=96&#38;d=identicon" medium="image">
			<media:title type="html">jacob</media:title>
		</media:content>
	</item>
		<item>
		<title>mapfilter</title>
		<link>http://streamhacker.wordpress.com/2009/03/03/mapfilter/</link>
		<comments>http://streamhacker.wordpress.com/2009/03/03/mapfilter/#comments</comments>
		<pubDate>Wed, 04 Mar 2009 01:06:56 +0000</pubDate>
		<dc:creator>Jacob</dc:creator>
				<category><![CDATA[erlang]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[filter]]></category>
		<category><![CDATA[functional programming]]></category>
		<category><![CDATA[map]]></category>
		<category><![CDATA[mapfilter]]></category>

		<guid isPermaLink="false">http://streamhacker.wordpress.com/?p=179</guid>
		<description><![CDATA[Have you ever mapped a list, then filtered it? Or filtered first, then mapped? Why not do it all in one pass with mapfilter? mapfilter? mapfilter is a function that combines the traditional map &#38; filter of functional programming by using the following logic: if your function returns false, then the element is discarded any [...]]]></description>
			<content:encoded><![CDATA[<p>Have you ever <a title="Python MapReduce" href="http://memojo.com/~sgala/blog/2007/09/29/Python-Erlang-Map-Reduce">mapped a list, then filtered it</a>? Or <a title="Enumerate, Map, Filter, Accumulate" href="http://gensym.org/2007/4/7/enumerate-map-filter-accumulate">filtered first, then mapped</a>? Why not do it all in one pass with <em>mapfilter</em>?</p>
<h3>mapfilter?</h3>
<p><em>mapfilter</em> is a function that combines the traditional <a title="Introduction to Functional Programming" href="http://www.freenetpages.co.uk/hp/alan.gauld/tutfctnl.htm">map &amp; filter of functional programming</a> by using the following logic:</p>
<ol>
<li>if your function returns false, then the element is discarded</li>
<li>any other return value is mapped into the list</li>
</ol>
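<p>To make the two rules concrete, here&#8217;s a minimal Python sketch of the same idea. The name <em>half_if_even</em> is a hypothetical example function, not something from a library.</p>

```python
def mapfilter(func, items):
    """Map func over items, dropping elements for which func returns False."""
    results = []
    for item in items:
        value = func(item)
        # Only the literal False discards an element; 0, None and '' are kept.
        if value is not False:
            results.append(value)
    return results


def half_if_even(n):
    # Hypothetical example function: halve even numbers, reject odd ones.
    return n // 2 if n % 2 == 0 else False


print(mapfilter(half_if_even, [1, 2, 3, 4, 6]))
```

<p>The check is against <em>False</em> itself rather than general truthiness, so a function that legitimately maps something to 0 or an empty string still gets its result kept.</p>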
<h3>Why?</h3>
<p>Doing a map and then a filter takes two passes over the list, roughly <em>2N</em> steps, whereas <em>mapfilter</em> takes one pass of <em>N</em> steps. Both are <a title="Big O Notation" href="http://en.wikipedia.org/wiki/Big_O_notation"><em>O(N)</em></a>, but halving the constant factor can make it up to twice as fast! If you are dealing with large lists, this can be a huge time saver. And for the case where a large list contains small IDs for looking up a larger data structure, then using <em>mapfilter</em> can result in half the number of database lookups.</p>
<p>Obviously, <em>mapfilter</em> won&#8217;t work if you want to produce a list of boolean values, as it would filter out all the false values. But why would you want to map to a list of booleans?</p>
<h3>Erlang Code</h3>
<p>Here&#8217;s some <a title="Erlang" href="http://www.erlang.org/">erlang</a> code I&#8217;ve been using for a while:</p>
<pre>mapfilter(F, List) -&gt; lists:reverse(mapfilter(F, List, [])).

mapfilter(_, [], Results) -&gt;
        Results;
mapfilter(F, [Item | Rest], Results) -&gt;
        case F(Item) of
                false -&gt; mapfilter(F, Rest, Results);
                Term -&gt; mapfilter(F, Rest, [Term | Results])
        end.</pre>
<p>Has anyone else done this for themselves? Does <em>mapfilter</em> exist in any <a title="Can Your Programming Language Do This?" href="http://www.joelonsoftware.com/items/2006/08/01.html">programming language</a>? If so, please leave a comment. I think <em>mapfilter</em> is a very simple &amp; useful concept that should be included in the standard library of every (functional) programming language. Erlang already has <a title="Erlang stdlib lists module" href="http://www.erlang.org/doc/man/lists.html">mapfoldl</a> (map-reduce in one pass), so why not also have <em>mapfilter</em>?</p>
<br />Posted in erlang, programming Tagged: filter, functional programming, map, mapfilter]]></content:encoded>
			<wfw:commentRss>http://streamhacker.wordpress.com/2009/03/03/mapfilter/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/348a20c3a576b2cb26674f1bc9eaf012?s=96&#38;d=identicon" medium="image">
			<media:title type="html">jacob</media:title>
		</media:content>
	</item>
		<item>
		<title>Chunk Extraction with NLTK</title>
		<link>http://streamhacker.wordpress.com/2009/02/23/chunk-extraction-with-nltk/</link>
		<comments>http://streamhacker.wordpress.com/2009/02/23/chunk-extraction-with-nltk/#comments</comments>
		<pubDate>Tue, 24 Feb 2009 00:02:27 +0000</pubDate>
		<dc:creator>Jacob</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[chunking]]></category>
		<category><![CDATA[information extraction]]></category>
		<category><![CDATA[nlp]]></category>
		<category><![CDATA[nltk]]></category>
		<category><![CDATA[parsing]]></category>
		<category><![CDATA[tagging]]></category>

		<guid isPermaLink="false">http://streamhacker.wordpress.com/?p=164</guid>
		<description><![CDATA[Chunk extraction is a useful preliminary step to information extraction, that creates parse trees from unstructured text. Once you have a parse tree of a sentence, you can do more specific information extraction, such as named entity recognition and relation extraction. Chunking is basically a 3 step process: Tag a sentence Chunk the tagged sentence [...]]]></description>
			<content:encoded><![CDATA[<p>Chunk extraction is a useful preliminary step to <a title="Information Extraction" href="http://en.wikipedia.org/wiki/Information_extraction">information extraction</a> that creates parse trees from unstructured text. Once you have a <a title="Parse Tree" href="http://en.wikipedia.org/wiki/Parse_tree">parse tree</a> of a sentence, you can do more specific information extraction, such as <a title="Named Entity Recognition" href="http://en.wikipedia.org/wiki/Named_entity_recognition">named entity recognition</a> and <a title="Relationship Extraction" href="http://en.wikipedia.org/wiki/Relationship_extraction">relation extraction</a>.</p>
<p><a title="Chunking" href="http://nltk.googlecode.com/svn/trunk/doc/book/ch07.html#sec-chunking">Chunking</a> is basically a 3 step process:</p>
<ol>
<li>Tag a sentence</li>
<li>Chunk the tagged sentence</li>
<li>Analyze the parse tree to extract information</li>
</ol>
<p>I&#8217;ve already written about how to train a <a title="Part of Speech Tagging with NLTK - Part 1" href="/2008/11/03/part-of-speech-tagging-with-nltk-part-1/">part</a> of <a title="Part of Speech Tagging with NLTK - Part 2" href="/2008/11/10/part-of-speech-tagging-with-nltk-part-2/">speech</a> <a title="Part of Speech Tagging with NLTK - Part 3" href="/2008/11/03/part-of-speech-tagging-with-nltk-part-1/">tagger</a> and a <a title="How to Train a NLTK Chunker" href="/2008/12/29/how-to-train-a-nltk-chunker/">chunker</a>, so I&#8217;ll assume you&#8217;ve already done the training, and now you want to use your tagger and chunker to do something useful.</p>
<h3>Tag Chunker</h3>
<p>The previously trained chunker is actually a <em>chunk tagger</em>. It&#8217;s a <a title="NLTK Tag Module" href="http://nltk.googlecode.com/svn/trunk/doc/api/toc-nltk.tag-module.html">Tagger</a> that assigns IOB chunk tags to part-of-speech tags. In order to use it for proper chunking, we need some extra code to convert the <a title="Tags vs Trees" href="http://nltk.googlecode.com/svn/trunk/doc/book/ch07.html#representing-chunks-tags-vs-trees">IOB chunk tags</a> into a parse tree. I&#8217;ve created a wrapper class that complies with the <a title="Chunk Parser Interface" href="http://nltk.googlecode.com/svn/trunk/doc/api/nltk.chunk.api.ChunkParserI-class.html">nltk ChunkParserI</a> interface and uses the <a title="How to Train a NLTK Chunker" href="/2008/12/29/how-to-train-a-nltk-chunker/">trained chunk tagger</a> to get IOB tags and convert them to a proper parse <a title="NLTK Tree Module" href="http://nltk.googlecode.com/svn/trunk/doc/api/toc-nltk.tree-module.html">tree</a>.</p>
<pre class="brush: python;">
import nltk.chunk
import itertools

class TagChunker(nltk.chunk.ChunkParserI):
    def __init__(self, chunk_tagger):
        self._chunk_tagger = chunk_tagger

    def parse(self, tokens):
        # split words and part of speech tags
        (words, tags) = zip(*tokens)
        # get IOB chunk tags
        chunks = self._chunk_tagger.tag(tags)
        # join words with chunk tags
        wtc = itertools.izip(words, chunks)
        # w = word, t = part-of-speech tag, c = chunk tag
        lines = [' '.join([w, t, c]) for (w, (t, c)) in wtc if c]
        # create tree from conll formatted chunk lines
        return nltk.chunk.conllstr2tree('\n'.join(lines))
</pre>
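<p>If you&#8217;re curious what turning IOB tags into chunks involves, here&#8217;s a simplified plain-Python sketch. It is not how NLTK implements <em>conllstr2tree</em>; the function name <em>iob_to_chunks</em> and the sample sentence are made up for illustration, and it returns flat (chunk type, words) pairs instead of a Tree.</p>

```python
def iob_to_chunks(triples):
    """Group (word, pos, iob) triples into (chunk_type, [(word, pos), ...])."""
    chunks = []
    current = None
    for word, pos, iob in triples:
        prefix, _, chunk_type = iob.partition('-')
        if not chunk_type:
            # 'O' (outside) has no chunk type, so it closes any open chunk.
            current = None
            continue
        # An I- tag continues the open chunk only if the types match.
        continues = (prefix == 'I' and current is not None
                     and current[0] == chunk_type)
        if not continues:
            current = (chunk_type, [])
            chunks.append(current)
        current[1].append((word, pos))
    return chunks


# Hypothetical, hand-tagged sentence.
sample = [
    ('the', 'DT', 'B-NP'), ('little', 'JJ', 'I-NP'), ('dog', 'NN', 'I-NP'),
    ('barked', 'VBD', 'O'),
    ('at', 'IN', 'B-PP'),
    ('the', 'DT', 'B-NP'), ('cat', 'NN', 'I-NP'),
]
for chunk_type, words in iob_to_chunks(sample):
    print(chunk_type, words)
```

<p>B- starts a chunk, I- continues it, and O sits outside any chunk, which is all the information the tree conversion needs.</p>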
<h3>Chunk Extraction</h3>
<p>Now that we have a proper chunker, we can use it to extract chunks. Here&#8217;s a simple example that tags a sentence, chunks the tagged sentence, then prints out each noun phrase.</p>
<pre class="brush: python;">
# sentence should be a list of words
tagged = tagger.tag(sentence)
tree = chunker.parse(tagged)
# for each noun phrase sub tree in the parse tree
for subtree in tree.subtrees(filter=lambda t: t.node == 'NP'):
    # print the noun phrase as a list of part-of-speech tagged words
    print subtree.leaves()
</pre>
<p>Each <a title="NLTK Tree Class" href="http://nltk.googlecode.com/svn/trunk/doc/api/nltk.tree.Tree-class.html">sub tree</a> has a phrase tag, and the leaves of a sub tree are the tagged words that make up that chunk. Since we trained the chunker on IOB tags, the phrase tag is a chunk type such as NP, which stands for Noun Phrase. As noted before, the results of this <a title="Natural Language Processing" href="http://en.wikipedia.org/wiki/Natural_language_processing">natural language processing</a> are heavily dependent on the training data. If your input text isn&#8217;t similar to your training data, then you probably won&#8217;t be getting many chunks.</p>
<br />Posted in programming, python Tagged: chunking, information extraction, nlp, nltk, parsing, tagging]]></content:encoded>
			<wfw:commentRss>http://streamhacker.wordpress.com/2009/02/23/chunk-extraction-with-nltk/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/348a20c3a576b2cb26674f1bc9eaf012?s=96&#38;d=identicon" medium="image">
			<media:title type="html">jacob</media:title>
		</media:content>
	</item>
		<item>
		<title>Test Driven Development in Python</title>
		<link>http://streamhacker.wordpress.com/2009/02/05/test-driven-development-in-python/</link>
		<comments>http://streamhacker.wordpress.com/2009/02/05/test-driven-development-in-python/#comments</comments>
		<pubDate>Thu, 05 Feb 2009 16:46:35 +0000</pubDate>
		<dc:creator>Jacob</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[buildbot]]></category>
		<category><![CDATA[doctest]]></category>
		<category><![CDATA[make]]></category>
		<category><![CDATA[nose]]></category>
		<category><![CDATA[tdd]]></category>
		<category><![CDATA[testing]]></category>
		<category><![CDATA[unittest]]></category>

		<guid isPermaLink="false">http://streamhacker.wordpress.com/?p=151</guid>
		<description><![CDATA[One of my favorite aspects of Python is that it makes practicing TDD very easy. What makes it so frictionless is the doctest module. It allows you to write a test at the same time you define a function. No setup, no boilerplate, just write a function call and the expected output in the docstring. [...]]]></description>
			<content:encoded><![CDATA[<p>One of my favorite aspects of <a title="Python Programming Language" href="http://www.python.org/">Python</a> is that it makes practicing <a title="Test Driven Development" href="http://en.wikipedia.org/wiki/Test-driven_development">TDD</a> very easy. What makes it so frictionless is the <a title="doctest" href="http://docs.python.org/library/doctest.html">doctest module</a>. It allows you to write a test at the same time you define a function. No setup, no boilerplate, just write a function call and the expected output in the <a title="Python Docstring Conventions" href="http://www.python.org/dev/peps/pep-0257/">docstring</a>. Here&#8217;s a quick example of a <a title="Fibonacci Number" href="http://en.wikipedia.org/wiki/Fibonacci_number">fibonacci</a> function.</p>
<pre class="brush: python;">
def fib(n):
        '''Return the nth fibonacci number.
        &gt;&gt;&gt; fib(0)
        0
        &gt;&gt;&gt; fib(1)
        1
        &gt;&gt;&gt; fib(2)
        1
        &gt;&gt;&gt; fib(3)
        2
        &gt;&gt;&gt; fib(4)
        3
        '''
        if n == 0:
                return 0
        elif n == 1:
                return 1
        else:
                return fib(n - 1) + fib(n - 2)
</pre>
<p>If you want to run your doctests, just add the following three lines to the bottom of your module.</p>
<pre class="brush: python;">
if __name__ == '__main__':
        import doctest
        doctest.testmod()
</pre>
<p>Now you can run your module to run the doctests, like <em>python fib.py</em>.</p>
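<p>If you&#8217;d rather check the results from code than read the console output, something like the following sketch should work. It uses doctest&#8217;s own <em>DocTestFinder</em> and <em>DocTestRunner</em> classes on a single function; the shortened docstring is just for illustration.</p>

```python
import doctest


def fib(n):
    '''Return the nth fibonacci number.
    >>> fib(0)
    0
    >>> fib(1)
    1
    >>> fib(4)
    3
    '''
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n - 1) + fib(n - 2)


# Collect the examples from fib's docstring and run them directly,
# instead of letting doctest.testmod() scan the whole module.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(fib, name='fib', module=False, globs={'fib': fib}):
    runner.run(test)

# summarize() returns a (failed, attempted) result.
results = runner.summarize(verbose=False)
print(results.failed, results.attempted)
```

<p>For everyday use the three-line <em>testmod()</em> version is simpler; this form is mostly handy when you want the counts for a script or a build step.</p>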
<p>So how well does this fit in with the <a title="By Example" href="http://www.amazon.com/gp/product/0321146530?ie=UTF8&amp;tag=cloudshadows-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0321146530">TDD philosophy</a>? Here are the basic <a title="Test Driven Development" href="http://c2.com/cgi/wiki?TestDrivenDevelopment">TDD practices</a>.</p>
<ol>
<li>Think about what you want to test</li>
<li>Write a small test</li>
<li>Write just enough code to fail the test</li>
<li>Run the test and watch it fail</li>
<li>Write just enough code to pass the test</li>
<li>Run the test and watch it pass (if it fails, go back to step 5)</li>
<li>Go back to step 1 and repeat until done</li>
</ol>
<p>And now a step-by-step breakdown of how to do this with doctests, in excruciating detail.</p>
<h4>1. Define a new empty method</h4>
<pre class="brush: python;">
def fib(n):
        '''Return the nth fibonacci number.'''
        pass

if __name__ == '__main__':
        import doctest
        doctest.testmod()
</pre>
<h4>2. Write a doctest</h4>
<pre class="brush: python;">
def fib(n):
        '''Return the nth fibonacci number.
        &gt;&gt;&gt; fib(0)
        0
        '''
        pass
</pre>
<h4>3. Run the module and watch the doctest fail</h4>
<pre>python fib.py
**********************************************************************
File "fib.py", line 3, in __main__.fib
Failed example:
    fib(0)
Expected:
    0
Got nothing
**********************************************************************
1 items had failures:
   1 of   1 in __main__.fib
***Test Failed*** 1 failures.</pre>
<h4>4. Write just enough code to pass the failing doctest</h4>
<pre class="brush: python;">
def fib(n):
        '''Return the nth fibonacci number.
        &gt;&gt;&gt; fib(0)
        0
        '''
        return 0
</pre>
<h4>5. Run the module and watch the doctest pass</h4>
<pre>python fib.py</pre>
<h4>6. Go back to step 2 and repeat</h4>
<p>Now you can start filling in the rest of the function, one test at a time. In practice, you may not write code exactly like this, but the point is that doctests provide a really easy way to test your code as you write it.</p>
<h3>Unit Tests</h3>
<p>Ok, so doctests are great for simple tests. But what if your tests need to be a bit more complex? Maybe you need some external data, or <a title="A Mock Object Framework for Python" href="http://google-opensource.blogspot.com/2008/07/check-out-mox-our-mock-object-framework.html">mock objects</a>. In that case, you&#8217;ll be better off with more traditional <a title="unittest" href="http://docs.python.org/library/unittest.html">unit tests</a>. But first, take a little time to see if you can decompose your code into a set of smaller functions that can be tested individually. I find that code that is <a title="An easy way to make your code more testable" href="http://webmat.wordpress.com/2007/12/13/an-easy-way-to-make-your-code-more-testable/">easier to test</a> is also <a title="Why Your Code Sucks" href="http://www.artima.com/weblogs/viewpost.jsp?thread=71730">easier to understand</a>.</p>
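<p>For comparison, here&#8217;s a small sketch of the same fibonacci checks written with the standard <em>unittest</em> module. The class name <em>FibTest</em> and the list of expected values are my own illustration; in a real test the <em>setUp</em> method is where external data or mock objects would get prepared.</p>

```python
import unittest


def fib(n):
    '''Return the nth fibonacci number.'''
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n - 1) + fib(n - 2)


class FibTest(unittest.TestCase):
    def setUp(self):
        # Stand-in for test data that might come from a file or a mock.
        self.expected = [0, 1, 1, 2, 3, 5, 8]

    def test_known_values(self):
        for n, value in enumerate(self.expected):
            self.assertEqual(fib(n), value)

    def test_recurrence(self):
        self.assertEqual(fib(10), fib(9) + fib(8))


# Build and run the suite explicitly so the result can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FibTest)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

<p>It&#8217;s more ceremony than a doctest, but you get setup and teardown hooks and a result object in return.</p>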
<h3>Running Tests</h3>
<p>For running my tests, I use <a title="Python Nose" href="http://code.google.com/p/python-nose/">nose</a>. I have a <em>tests/</em> directory with a simple configuration file, <em>nose.cfg</em>:</p>
<pre>[nosetests]
verbosity=3
with-doctest=1</pre>
<p>Then in my <a title="How to write a Makefile" href="http://www.hsrl.rutgers.edu/ug/make_help.html">Makefile</a>, I add a test command so I can run <em>make test</em>.</p>
<pre>test:
        @nosetests --config=tests/nose.cfg tests PACKAGE1 PACKAGE2</pre>
<p><em>PACKAGE1</em> and <em>PACKAGE2</em> are optional paths to your code. They could point to unit test packages and/or production code containing doctests.</p>
<p>And finally, if you&#8217;re looking for a <a title="Continuous Integration" href="http://martinfowler.com/articles/continuousIntegration.html">continuous integration</a> server, try <a title="Buildbot" href="http://buildbot.net/trac">Buildbot</a>.</p>
<br />Posted in programming, python Tagged: buildbot, doctest, make, nose, tdd, testing, unittest]]></content:encoded>
			<wfw:commentRss>http://streamhacker.wordpress.com/2009/02/05/test-driven-development-in-python/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/348a20c3a576b2cb26674f1bc9eaf012?s=96&#38;d=identicon" medium="image">
			<media:title type="html">jacob</media:title>
		</media:content>
	</item>
		<item>
		<title>Programming As Information Architecture</title>
		<link>http://streamhacker.wordpress.com/2009/01/21/programming-as-information-architecture/</link>
		<comments>http://streamhacker.wordpress.com/2009/01/21/programming-as-information-architecture/#comments</comments>
		<pubDate>Wed, 21 Jan 2009 21:29:05 +0000</pubDate>
		<dc:creator>Jacob</dc:creator>
				<category><![CDATA[design]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[documentation]]></category>
		<category><![CDATA[hacking]]></category>
		<category><![CDATA[labeling]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[navigation]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[strategy]]></category>

		<guid isPermaLink="false">http://streamhacker.wordpress.com/?p=114</guid>
		<description><![CDATA[Code = Information. [1] Therefore, Software Architecture can be approached as Information Architecture. Information Architecture can be defined as The structural design of shared information environments. The art and science of shaping information products. The above definitions and much of the inspiration for this article comes from the book Information Architecture for the World Wide [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Code = Information</strong>. [1] Therefore, <strong>Software Architecture</strong> can be approached as <strong>Information Architecture</strong>. <a title="Information Architecture" href="http://en.wikipedia.org/wiki/Information_architecture">Information Architecture</a> can be defined as</p>
<ol>
<li>The structural <strong>design</strong> of <strong>shared information environments</strong>.</li>
<li>The <strong>art</strong> and <strong>science</strong> of <strong>shaping information products</strong>.</li>
</ol>
<p>The above definitions and much of the inspiration for this article come from the book <a title="Information Architecture for the World Wide Web" href="http://www.amazon.com/gp/product/0596527349?ie=UTF8&amp;tag=cloudshadows-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0596527349">Information Architecture for the World Wide Web</a>. My goal is to explain some of what an <a title="Four Essential Skills for Information Architects" href="http://www.uie.com/articles/ia_essential/">Information Architect</a> does, and to argue that software developers, especially the lead developer, should approach their code as an <a title="Information System" href="http://en.wikipedia.org/wiki/Information_system">information system</a>, applying the principles of <a title="Creating Order out of Chaos" href="http://www.treith.com/ia_presentation/index.html">Information Architecture</a>. Why? Because it will lead to more organized, better structured, easier to understand code, which will <a title="The Benefits of Software Architecting" href="http://www.ibm.com/developerworks/rational/library/may06/eeles/">reduce maintenance costs</a>, <a title="Usability" href="http://en.wikipedia.org/wiki/Usability">decrease training time</a>, and generally make it easier for you and your team to <a title="Smart and Gets Things Done" href="http://www.amazon.com/gp/product/1590598385?ie=UTF8&amp;tag=cloudshadows-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1590598385">get things done</a>.</p>
<p>So, what are the core focus areas for an <a title="Information Architect Pioneer" href="http://www.llrx.com/columns/guide52.htm">Information Architect</a>?</p>
<ul>
<li><strong>Organization Systems</strong></li>
<li><strong>Labeling Systems</strong></li>
<li><strong>Navigation and Search Systems</strong></li>
<li><strong>Controlled Vocabularies </strong>and <strong>Metadata</strong></li>
<li><strong>Research</strong></li>
<li><strong>Strategy</strong></li>
<li><strong>Design</strong> and <strong>Documentation</strong></li>
</ul>
<h2>Organization Systems</h2>
<p><a title="Knowledge Organization" href="http://en.wikipedia.org/wiki/Knowledge_organization">Organization Systems</a> are exactly what you think they are: <strong>systems to organize information</strong>. Imagine if you had to come into your current code base completely fresh, knowing nothing about it. Does that thought horrify you? If your code isn&#8217;t organized, then it can be very hard for new developers to come in and figure out what&#8217;s going on. Think of your <a title="Code Base" href="http://en.wikipedia.org/wiki/Code_base">code repository</a> as a <strong>shared information environment</strong>. If you are the only one that can navigate it, let alone modify it, then you&#8217;ll always be stuck maintaining it. Hopefully your goal is not <a title="Fear of Changing Code" href="http://agileotter.blogspot.com/2009/01/fear-of-changing-code.html">job security</a>, but to provide an <a title="Agile Cost of Change Curve" href="http://www.agilemodeling.com/essays/costOfChange.htm">environment conducive to change</a>.</p>
<p>So how should you organize your code? Unfortunately, that&#8217;s not something that&#8217;s really taught anywhere. My general practice is to follow the recommendations of the language/platform. If they say all code should go in a directory called <em>src/</em>, then that&#8217;s where I put it. If every class is supposed to be in its own file, then that&#8217;s what I do. And if the platform documentation doesn&#8217;t specify how to do something, I&#8217;ll find a major open source project and see how they do things. The key to an organization system is to maintain <strong>logical consistency</strong>. Then, as long as you know the logic, you can figure out where things are or where something should go.</p>
<h2>Labeling Systems</h2>
<p>Labeling Systems are basically <a title="Naming Conventions" href="http://en.wikipedia.org/wiki/Identifier_naming_convention">standard naming practices</a>. In IA, a labeling system specifies what <strong>label</strong> goes with each <strong>element</strong> in every <strong>context</strong>. For programming, you&#8217;ll want a <strong>consistent naming scheme</strong> so that all your code objects are clearly labeled. Good labels are simple, make sense in context, and hint at the details of the labeled object. The goal is to <strong>communicate information efficiently</strong>. You are not just writing code for yourself; you&#8217;re writing code for the team. The best code is not only functional, it&#8217;s <strong>readable</strong>, <strong>concise</strong>, and even <a title="Leading Programmers Explain How They Think" href="http://www.amazon.com/gp/product/0596510047?ie=UTF8&amp;tag=cloudshadows-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0596510047">beautiful</a>. Clear labeling goes a long way towards achieving that ideal.</p>
<h2>Navigation and Search Systems</h2>
<p>Navigation and Search Systems are very important to information focused websites, but they don&#8217;t apply much to code by itself. However, good <a title="Website Navigation Models" href="http://www.webdesignfromscratch.com/ia-models.php">Navigation and Search Systems</a> are essential for <strong>API documentation</strong>. I believe that the quality of the API documentation has a huge effect on the <a title="Is Documentation Holding Open Source Back" href="http://www.devx.com/opensource/Article/11839">adoption rate</a> of libraries and platforms. [2] Good API docs can be a great resource for quickly looking up a function and understanding how to use it. But if a developer can&#8217;t navigate and search your API documentation, then how will they figure out how that function works? Luckily for us programmers, navigation is usually provided for free with a <a title="Documentation Generator" href="http://en.wikipedia.org/wiki/Documentation_generator">documentation generator</a>. And <a title="Google Site Search Taps Power of Cloud" href="http://googleblog.blogspot.com/2008/06/google-site-search-taps-power-of-cloud.html">Google</a> can handle the search for you.</p>
<h2>Controlled Vocabularies and Metadata</h2>
<p>While all programming languages have a <a title="What is a Controlled Vocabulary" href="http://www.boxesandarrows.com/view/what_is_a_controlled_vocabulary_">controlled vocabulary</a>, in IA this refers to <strong>domain knowledge</strong>. The principle is to use words and <a title="The Jargon File" href="http://www.catb.org/jargon/html/index.html">jargon</a> that are common to whatever domain you are developing for.</p>
<p><a title="Metadata" href="http://en.wikipedia.org/wiki/Metadata">Metadata</a>, in this case, is information about the code, such as <strong>comments</strong> and <strong>documentation</strong>. Just as an <a title="Introduction to Information Architecture" href="http://www.mainss.com/index.php/2008/11/20/introduce-to-information-architecture/">Information Architect</a> is in charge of the language use within a system, the lead developer should be in charge of the <a title="Understanding Information Taxonomy Helps Build Better Apps" href="http://articles.techrepublic.com.com/5100-10878_11-5055268.html">domain language</a> and how to use it.</p>
<h2>Research</h2>
<p>The goal of <a title="Information Architecture Research" href="http://www.semanticstudios.com/publications/semantics/000030.php">IA Research</a> is to understand what needs to be designed and built before doing the work. In programming, you are often presented with problems you&#8217;ve never solved before. <a title="How to Become a Hacker" href="http://enjoyhack.blogspot.com/2008/11/how-to-become-hacker.html">Hacking</a>, or <a title="Exploratory Programming" href="http://en.wikipedia.org/wiki/Exploratory_programming">exploratory programming</a>, is a way to figure out and evaluate possible solutions. <strong>Hacking is Research</strong>. The goal of <a title="Doctor of Hacking" href="http://scottaaronson.com/blog/?p=18">research oriented hacking</a> is to figure out possible solutions, evaluate platforms and technologies, and understand the constraints that come with each technology and solution. The knowledge you gather from research is used to drive your strategic choices.</p>
<h2>Strategy</h2>
<p><a title="Planning an Information Architecture Strategy" href="http://www.slideshare.net/cfox74/making-ia-real-planning-an-information-architecture-strategy-presentation">IA Strategy</a> is about <strong>platform</strong>, <strong>process</strong> and <strong>design</strong>. What <a title="If Programming Languages were Celebrities" href="http://www.brandnoo.com/2008/04/07/programming-languages-and-their-celebrity-equivalents/">programming language(s)</a> will you use? What are the core <strong>design patterns</strong> and <strong>architectural choices</strong>? What <a title="7 Version Control Systems Reviewed" href="http://www.smashingmagazine.com/2008/09/18/the-top-7-open-source-version-control-systems/">version control system</a> will the team use? How will you <a title="The Pivotal Tracker Story" href="http://pivotallabs.com/users/chris/blog/articles/657-the-tracker-story-">track progress</a>? Your strategic decisions will set the <strong>design constraints</strong> of the implementation and <strong>drive the development process</strong>.</p>
<h2>Design and Documentation</h2>
<p>In software development, the <a title="What is Software Design?" href="http://www.developerdotstar.com/mag/articles/reeves_design.html">code is the design</a>, but not everyone will want to read your code to understand how things work. You may need to <strong>communicate the design</strong> in other ways, such as with <strong>diagrams</strong>, <strong>comments</strong>, and <strong>documentation</strong>. And if the code is being written by someone else, then it&#8217;s your job to communicate how their code will fit in to the rest of the system. <a title="Information Design" href="http://www.digital-web.com/articles/the_age_of_information_architecture/">Design documents</a> aren&#8217;t for you, they&#8217;re for the <a title="Coding for Violent Psychopaths" href="http://www.codinghorror.com/blog/archives/001137.html">other people on the team</a>. You do want other team members to understand your code, right? And if your diagrams and documentation are good enough, you might even get business people to think they understand your software too <img src='http://s2.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<h2>Conclusion</h2>
<p><a title="The Age of Information Architecture" href="http://www.digital-web.com/articles/the_age_of_information_architecture/">Information Architecture</a> provides a <strong>top-down</strong> view of your <strong>software system</strong>. If you are a lead developer or software architect, IA principles and practices can help you make sure that your system is <a title="Quality with a Name" href="http://jamesshore.com/Articles/Quality-With-a-Name.html">well designed</a> and that the design is <a title="Soft Skill for Information Architecture" href="http://www.digital-web.com/articles/soft_skills_for_information_architecture/">communicated clearly</a> to all team members. For further reading, I recommend <a title="Information Architecture for the World Wide Web" href="http://www.amazon.com/gp/product/0596527349?ie=UTF8&amp;tag=cloudshadows-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0596527349">Information Architecture for the World Wide Web</a> and <a title="Documenting Software Architectures Views and Beyond" href="http://www.amazon.com/gp/product/0201703726?ie=UTF8&amp;tag=cloudshadows-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0201703726">Documenting Software Architectures</a>.</p>
<h3>Notes</h3>
<p>[1] Code = Data, Data = Information, Code = Information.</p>
<p>[2] I wish I had some data to back this up, but it&#8217;s certainly how I behave. Lack of clear documentation = fail.</p>
<br />Posted in design, programming Tagged: architecture, documentation, hacking, labeling, metadata, navigation, research, strategy]]></content:encoded>
			<wfw:commentRss>http://streamhacker.wordpress.com/2009/01/21/programming-as-information-architecture/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/348a20c3a576b2cb26674f1bc9eaf012?s=96&#38;d=identicon" medium="image">
			<media:title type="html">jacob</media:title>
		</media:content>
	</item>
		<item>
		<title>Static Analysis of Erlang Code with Dialyzer</title>
		<link>http://streamhacker.wordpress.com/2009/01/15/static-analysis-of-erlang-code-with-dialyzer/</link>
		<comments>http://streamhacker.wordpress.com/2009/01/15/static-analysis-of-erlang-code-with-dialyzer/#comments</comments>
		<pubDate>Fri, 16 Jan 2009 03:21:35 +0000</pubDate>
		<dc:creator>Jacob</dc:creator>
				<category><![CDATA[erlang]]></category>
		<category><![CDATA[dialyzer]]></category>
		<category><![CDATA[make]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://streamhacker.wordpress.com/?p=120</guid>
		<description><![CDATA[Dialyzer is a tool that does static analysis of your erlang code. It&#8217;s great for identifying type errors and unreachable code. Here&#8217;s how to use it from the command line. dialyzer -r PATH/TO/APP -I PATH/TO/INCLUDE Pretty simple! PATH/TO/APP should be an erlang application directory containing your ebin/ and/or src/ directories. PATH/TO/INCLUDE should be a path [...]]]></description>
			<content:encoded><![CDATA[<p><a title="Dialyzer Reference Manual" href="http://www.erlang.org/doc/apps/dialyzer/index.html">Dialyzer</a> is a tool that performs static analysis of your <a title="Erlang" href="http://www.erlang.org/">Erlang</a> code. It&#8217;s great for identifying type errors and unreachable code. Here&#8217;s how to use it from the <a title="Using Dialyzer from the Command Line" href="http://www.erlang.org/doc/apps/dialyzer/dialyzer_chapter.html#1.3">command line</a>.</p>
<blockquote><p>dialyzer -r PATH/TO/APP -I PATH/TO/INCLUDE</p></blockquote>
<p>Pretty simple! <em>PATH/TO/APP</em> should be an Erlang application directory containing your <em>ebin/</em> and/or <em>src/</em> directories. <em>PATH/TO/INCLUDE</em> should be a path to a directory that contains any <em>.hrl</em> files that need to be included. The -I is optional if you have no include files. You can have as many -r and -I options as you need. If you add -q, then dialyzer runs more quietly, succeeding silently or reporting any errors found.</p>
<p>If you have a test/ directory with <a title="Unit Testing with Common Test" href="/2008/11/26/unit-testing-with-erlangs-common-test-framework/">Common Test suites</a>, then you&#8217;ll want to add &#8220;-I /usr/lib/erlang/lib/test_server*/include/&#8221; and &#8220;-I /usr/lib/erlang/lib/common_test*/include/&#8221;. I&#8217;ve actually set this up in my <a title="How to write a Makefile" href="http://www.hsrl.rutgers.edu/ug/make_help.html">Makefile</a> to run as <strong>make check</strong>. It&#8217;s been great for catching bad return types, misspellings, and wrong function parameters.</p>
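<p>If you&#8217;d rather assemble that command line from a script than from a Makefile, here is a minimal sketch in Python. The <em>dialyzer_command</em> helper and its parameter names are hypothetical, made up for illustration; it only builds the -r, -I, and -q options described above and does not run dialyzer itself.</p>

```python
def dialyzer_command(app_dirs, include_dirs=(), quiet=False):
    """Build a dialyzer argument list (hypothetical helper, not part of Dialyzer)."""
    args = ["dialyzer"]
    for app_dir in app_dirs:
        # One -r per application directory (containing ebin/ and/or src/).
        args += ["-r", app_dir]
    for include_dir in include_dirs:
        # One -I per directory holding .hrl include files.
        args += ["-I", include_dir]
    if quiet:
        # -q: succeed silently, report only the errors found.
        args.append("-q")
    return args


if __name__ == "__main__":
    # Placeholder paths, same as in the command shown earlier.
    print(" ".join(dialyzer_command(["PATH/TO/APP"], ["PATH/TO/INCLUDE"], quiet=True)))
```

<p>The returned list could be handed to something like <em>subprocess.run</em>, but whether that fits your build setup is an assumption you&#8217;d need to check.</p>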
<br />Posted in erlang Tagged: dialyzer, make, testing]]></content:encoded>
			<wfw:commentRss>http://streamhacker.wordpress.com/2009/01/15/static-analysis-of-erlang-code-with-dialyzer/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/348a20c3a576b2cb26674f1bc9eaf012?s=96&#38;d=identicon" medium="image">
			<media:title type="html">jacob</media:title>
		</media:content>
	</item>
		<item>
		<title>Programming as Design</title>
		<link>http://streamhacker.wordpress.com/2009/01/07/programming-as-design/</link>
		<comments>http://streamhacker.wordpress.com/2009/01/07/programming-as-design/#comments</comments>
		<pubDate>Wed, 07 Jan 2009 20:19:43 +0000</pubDate>
		<dc:creator>Jacob</dc:creator>
				<category><![CDATA[design]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[alignment]]></category>
		<category><![CDATA[consistency]]></category>
		<category><![CDATA[contrast]]></category>
		<category><![CDATA[patterns]]></category>
		<category><![CDATA[proximity]]></category>

		<guid isPermaLink="false">http://streamhacker.wordpress.com/?p=93</guid>
		<description><![CDATA[Some say programming is engineering, others call it an art. A few might (mistakenly) think it&#8217;s a science. But both the art and engineering can be encapsulated under the umbrella of design. The best design is functional art, and a huge part of the artistic beauty of a product is a result of carefully engineered [...]]]></description>
			<content:encoded><![CDATA[<p>Some say programming is <a title="Software Engineering" href="http://en.wikipedia.org/wiki/Software_engineering">engineering</a>, others call it an <a title="http://www.amazon.com/gp/product/0201485419?ie=UTF8&amp;tag=cloudshadows-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0201485419" href="http://www.amazon.com/Art-Computer-Programming-Volumes-Boxed/dp/0201485419">art</a>. A few might (mistakenly) think it&#8217;s a <a title="Computer Science vs Programming" href="http://www.zenperfect.com/2007/07/30/computer-science-vs-programming-software-engineering/">science</a>. But both the <a title="Programming is an Art" href="http://www.wisdump.com/web-programming/programming-is-an-art/">art</a> and <a title="Software Craftsmenship vs Software Engineering" href="http://andymaleh.blogspot.com/2008/12/software-craftsmanship-vs-software.html">engineering</a> can be encapsulated under the umbrella of <a title="It Comes from Software Design Rather than Software Tools" href="http://blog.scottbellware.com/2008/12/productivity-it-comes-from-software.html">design</a>. The best design is <a title="Design, Art and Usability" href="http://www.edicy.com/blog/design-art-and-usability">functional art</a>, and a huge part of the artistic beauty of a <a title="Beautiful and Original Product Designs" href="http://www.smashingmagazine.com/2008/05/26/beautiful-and-original-product-designs/">product</a> is a result of <a title="A Practical Handbook of Software Construction" href="http://www.amazon.com/gp/product/0735619670?ie=UTF8&amp;tag=cloudshadows-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0735619670">carefully engineered</a> functionality. Products that are not carefully designed and engineered generally suck to use, and that applies to everything from cameras to <a title="Beautiful API" href="http://fromapitosolution.blogspot.com/2008/11/beautiful-api.html">software APIs</a>.</p>
<h3>Design Principles</h3>
<p>There are <a title="The Four Basic Design Principles" href="http://designerside.com/article/four-basic-design-principles">4</a> <a title="The CRAP Principles of Design" href="http://www.johnmonte.com/2008/07/crap-principles-of-design.html">major</a> <a title="Graphic Design Principles" href="https://www.msu.edu/~glazered/tc801/graphic.html">principles</a> of <a title="Graphic Design" href="http://en.wikipedia.org/wiki/Graphic_design">graphic design</a>.</p>
<ol>
<li><a title="The Importance of Alignment" href="http://www.presentationzen.com/presentationzen/2008/11/think-graphic-design-doesnt-matter.html">alignment</a></li>
<li><a title="Proximity and Alignment" href="http://designguyshow.blogspot.com/2008/12/design-guy-episode-35-proximity-and.html">proximity</a></li>
<li><a title="Contrast" href="http://www.bluemoonwebdesign.com/art-lessons-6.asp">contrast</a></li>
<li><a title="Design Consistency" href="http://lrs.ed.uiuc.edu/students/srutledg/goodsites10.html">consistency</a> / <a title="Good Design Principles - Repetition" href="http://www.swinburne.edu.au/design/tutorials/design/design/#four">repetition</a></li>
</ol>
<h4>Alignment</h4>
<p>The principle of <strong>alignment</strong> is that <strong>everything on a page should be connected to something else on the page</strong>. The goal of <a title="Alignmnent in Graphic Design" href="http://www.webdesignfromscratch.com/alignment.php">alignment in graphic design</a> is to create visual associations, often using a <a title="Designing with Grids" href="http://www.smashingmagazine.com/2007/04/14/designing-with-grid-based-approach/">grid based layout</a>. In software, we can apply this principle to the connectedness of data, such as the object inheritance hierarchy and <a title="Fundamentals of Relational Database Design" href="http://www.deeptraining.com/litwin/dbdesign/FundamentalsOfRelationalDatabaseDesign.aspx">relational</a> <a title="Handbook of Relational Database Design" href="http://www.amazon.com/gp/product/0201114348?ie=UTF8&amp;tag=cloudshadows-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0201114348">data</a> structures. Ideally, all your objects fit nicely into a well-defined <a title="Visualizing Hierarchical Data" href="http://www.timshowers.com/2008/12/visualization-strategies-hierarchical-data/">hierarchy</a> and your <a title="Database Modeling and Design" href="http://www.amazon.com/gp/product/0126853525?ie=UTF8&amp;tag=cloudshadows-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0126853525">data</a> structures relate to each other in an <a title="Relational Database Design Clearly Explained" href="http://www.amazon.com/gp/product/1558608206?ie=UTF8&amp;tag=cloudshadows-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=1558608206">intuitive fashion</a>. Of course, the real world of programming is never as clean as you&#8217;d like, but keep this principle in mind whenever you create a new object, add a new dependency, or modify relational structures.</p>
<ul>
<li>Does the object cleanly fit within the existing hierarchy? If not, do you need to change the new object, or re-align the hierarchy?</li>
<li>How does this data structure relate to that other data structure? What will happen if the relations change? Will you need to re-align the relational structures?</li>
</ul>
<p>Keeping your objects and data structures neatly aligned will result in easier to understand relations and hierarchies.</p>
<h4>Proximity</h4>
<p>The principle of <strong>proximity</strong> is that <strong>related items should be grouped together</strong>. <a title="Use Grouping to Associate Elements" href="http://www.webdesignfromscratch.com/grouping.php">Grouping</a> things together is a simple way to show relatedness. In software development, that generally means putting related functions into the same module, and related modules into the same package. Helper functions should be located near the functions that call them. Basically, group blocks of code in a logically consistent manner. And if possible, put documentation and tests close to the code too (Python&#8217;s <a title="Python Doctest" href="http://docs.python.org/library/doctest.html">doctest</a> module provides a great way to do that). One of the major benefits of following this principle is that it reduces the amount of time you&#8217;ll spend searching through and understanding your own code. If all your code is organized in logical groups, and related functions are near each other in the same file, then it&#8217;s much easier to find a particular block of code.</p>
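<p>As a small illustration of keeping documentation and tests next to the code, here is a sketch using doctest. The <em>word_count</em> function is a made-up example, not something from a real project; the point is only that the usage examples live in the docstring, right beside the implementation they describe.</p>

```python
import doctest


def word_count(text):
    """Count the whitespace-separated words in text.

    The examples below are both documentation and tests:

    >>> word_count("related items grouped together")
    4
    >>> word_count("")
    0
    """
    return len(text.split())


if __name__ == "__main__":
    # Runs every example found in this module's docstrings.
    doctest.testmod()
```

<p>Running the file directly executes the docstring examples, so the documentation can&#8217;t silently drift away from the behavior.</p>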
<h4>Contrast</h4>
<p>The principle of <strong>contrast</strong> is that <strong>if two things are not the same, then make them very different</strong>. The goal with <a title="Good Designs Have Strong Contrast" href="http://www.idratherbewriting.com/2009/01/03/contrast-is-prominent-in-good-design/">contrast</a> is to make different things distinctive from each other. <em>Naming</em> is a great place to apply this principle. Names are all you have to distinguish between objects, functions, variables, and modules, so make sure that your names are distinctive and descriptive. Good names can tell you exactly what something is, and even imply its properties and behavior. Use different naming styles for different types of things. Private variables could be prefixed with an underscore, like <em>_private</em>, versus public variables like <em>public</em>, and <a title="CamelCase" href="http://en.wikipedia.org/wiki/CamelCase">CamelCase</a> class names, as in <em>MyClassName</em>. Having a distinctive naming style lets you know at a glance whether something is a class, variable, or function, making your code much more readable. Whatever naming style you choose, use it consistently.</p>
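<p>Here is one way those naming styles might look together in Python. The <em>RequestCounter</em> class and its members are invented purely for illustration; the styles follow the conventions mentioned above, and your language or team may prefer different ones.</p>

```python
class RequestCounter:              # CamelCase: this is a class
    def __init__(self):
        self._counts = {}          # leading underscore: private state
        self.total = 0             # plain lowercase: public attribute

    def record(self, path):        # lowercase verb: a method that does something
        self._counts[path] = self._counts.get(path, 0) + 1
        self.total += 1

    def count_for(self, path):
        # Unknown paths have simply never been recorded, so report zero.
        return self._counts.get(path, 0)
```

<p>Even without reading the bodies, the names alone hint at which parts are meant to be used from outside and which are internal details.</p>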
<h4>Consistency</h4>
<p>The principle of <strong>consistency</strong>, or <strong>repetition</strong>, is that you <strong>repeat design elements</strong>. Repetition helps patterns become internalized and instantly recognizable. For programming, that means keeping a consistent <a title="Coding Conventions" href="http://en.wikipedia.org/wiki/Coding_conventions">code</a> <a title="Style Guide for Python Code" href="http://www.python.org/dev/peps/pep-0008/">style</a>, with consistent <a title="Naming Conventions" href="http://en.wikipedia.org/wiki/Naming_conventions_(programming)">naming practices</a>. Also, try to use well known standard conventions and protocols, shared libraries, and <a title="Elements of Reusable Object-Oriented Software" href="http://www.amazon.com/gp/product/0201633612?ie=UTF8&amp;tag=cloudshadows-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0201633612">design patterns</a>. Your code should make sense, or at least be readable, to those familiar with the language and domain. You&#8217;re not just writing code for yourself, you might be writing code for other programmers, maybe your manager, but most importantly, you&#8217;re writing code for your future self. It always sucks coming back to code you haven&#8217;t touched in months and not knowing what the hell is going on. Consistent <a title="Minimalist Coding Style" href="http://www.notesfromatooluser.com/2008/07/minimalist-coding-style.html">style</a> and software <a title="Spartan Programming" href="http://www.codinghorror.com/blog/archives/001148.html">design</a> can save you from that headache.</p>
<h3>Programming as Design</h3>
<p>If all this seems like obvious <a title="Common Sense" href="http://en.wikipedia.org/wiki/Common_sense">common sense</a> to you, then great! But common sense isn&#8217;t always so common. The point of this article is to make you aware that everyday software programming is filled with <a title="The Non-Designer's Design Book" href="http://www.amazon.com/gp/product/0321534042?ie=UTF8&amp;tag=cloudshadows-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0321534042">design choices</a>. Naming a variable is a design choice. Creating a new module is a design choice. The layout of your working directory is a design choice. Be conscious of these choices and use the above principles to inform your <a title="Be Inflexible" href="http://www.wilshipley.com/blog/2007/05/pimp-my-code-part-14-be-inflexible.html">decisions</a>. The choices you make <a title="Graphic Design is not the Process, It's the Communication" href="http://ragingfx.com/2008-08/graphic-design-isnottheprocess-its-the-communication/">communicate</a> how the software works and how the code fits together. Make every choice <a title="Deliberate Design Decisions" href="http://bbrathwaite.wordpress.com/2008/09/08/deliberate-design-decisions/">deliberate</a> and justifiable. Use <a title="Improving the Design of Existing Code" href="http://www.amazon.com/gp/product/0201485672?ie=UTF8&amp;tag=cloudshadows-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0201485672">refactoring</a> to improve the design without affecting the functionality.</p>
<br />Posted in design, programming Tagged: alignment, consistency, contrast, patterns, proximity]]></content:encoded>
			<wfw:commentRss>http://streamhacker.wordpress.com/2009/01/07/programming-as-design/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/348a20c3a576b2cb26674f1bc9eaf012?s=96&#38;d=identicon" medium="image">
			<media:title type="html">jacob</media:title>
		</media:content>
	</item>
	</channel>
</rss>
