
Agile – I know it when I see it

This is the start of a post I’ve added to my Pivotal Labs blog

What does an agile software development team look like?

At core, software engineers turn ideas into results. Of course we are not the only ones with this job description. We share it with many other creative professions. Writers, for example, turn ideas into words that inspire, educate and inform. That’s pretty much what we do too: we turn ideas into words that instruct a computer system to perform a desired behaviour.

Focusing on writers for a moment… there is a wide range of writing environments and styles. On one side of the spectrum are novelists who shut themselves away in a quiet room, where they can find the time and space to breathe life into their intricate vision. On the other are journalists on the newsroom floor, sampling from an avalanche of information, responding quickly to what’s new and what’s important.

At Pivotal Labs, we work with many fast-moving companies, helping them to fashion the software engineering capability they need to succeed. Our approach is to create an environment that resembles a newsroom more than a writer’s hideaway.



Automate Gmail Email to Remember the Milk Task

Every once in a while I spend a bit of time reviewing and streamlining my GTD process. This time I hit on a pretty big win — automating a connection between Gmail and Remember the Milk so that collecting actions from across all of my Gmail accounts is one-click easy. This automation makes it a breeze to process email on my phone. Woohoo – GTD on the can!

Remember the Milk has been at the core of my GTD stack for several years now. I’ve looked into other systems, even trying Astrid for a month, but RTM still wins for my requirements set. It’s got a scriptable API, a command line interface, a solid Android app, and a nice web interface that’s keyboard friendly. Of course it has its quirks and annoyances (why can’t I move tasks between lists using the keyboard?), but it’s the best that I’ve found.

For email, Gmail has wormed its way into being my primary mechanism. It too has its quirks and annoyances, but the network effects are strong and the price is right (well, was right for business accounts). I still use Thunderbird too, but now primarily as a local Gmail client.

For this round of streamlining, the challenge I set was to enable keyboard-based creation of an RTM task from a Gmail email, with the task including a link back to the original thread for quick action. This is now possible using a Google Apps Script. Here’s the code:

/*
 * This script watches a Gmail account for any thread labeled with 'rtm'. When such a thread is found it
 * sends a task-creation email to Remember the Milk that includes a link back to the original thread.
 *
 * To follow the link back to the email you have to be logged in to the originating Gmail account (which
 * is only an issue if you have multiple Gmail accounts). Otherwise Google claims the email cannot be
 * found.
 *
 * To install it go into Google Drive and create a new "Script" (the link is a bit hidden in the create submenu).
 * Copy this code into the new project and set a time-based trigger.
 *
 * Copyright 2013, Jesse Heitler <jesseh>
 *
 * Permission to use, copy, modify, and/or distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 */
var LABEL_NAME = 'rtm'; // This string is the Gmail label that triggers a task to be created.
var TARGET_EMAIL = ''; // From your RTM settings.
var RTM_LIST = ''; // Use to post the message to a specific RTM list
var RTM_TAGS = ''; // Use to add tags to the task in RTM
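/**
 * Process all the threads with a given label, then remove that label
 * and archive each thread so that it is not processed again.
 */
function processLabel(labelName) {
  var label = GmailApp.getUserLabelByName(labelName);
  var threads;
  var thread;
  if (label) {
    threads = label.getThreads();
    for (var i = 0; i < threads.length; i++) {
      thread = threads[i];
      handleThread(thread);       // Send the task email for this thread.
      thread.removeLabel(label);  // Drop the trigger label...
      thread.moveToArchive();     // ...and archive the thread, as described above.
    }
  }
}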
/**
 * Get the sender of the most recent unread message, or of the latest message in the thread.
 */
function threadFrom(thread) {
  var message;
  var messages = thread.getMessages();
  // Default to the sender of the last message in the thread.
  var from = messages[messages.length - 1].getFrom();
  for (var i = 0; i < messages.length; i++) {
    message = messages[i];
    if (message.isUnread()) {
      from = message.getFrom();
    }
  }
  return from;
}
/**
 * Email the task to Remember the Milk. The body lines set the task's
 * priority, tags, URL and list.
 */
function emailTask(details) {
  var recipient = TARGET_EMAIL;
  var subject = "Reply to '" + details['subject'] + "' " + details['from'] + " (" + details['count'] + ')';
  var body = "";
  if (details['priority']) {
    body += "Priority: 3\n";
  }
  if (RTM_TAGS) {
    body += "Tags: " + RTM_TAGS + "\n";
  }
  // body += "URL:" + details['permalink'] + "\n";
  // The URL prefix that precedes the thread id is missing from this listing ("…").
  body += "URL: …" + details['threadId'] + "&search=all\n";
  if (RTM_LIST) {
    body += "List: " + RTM_LIST + "\n";
  }
  Logger.log("Sent email with subject: " + subject + "\nBody:\n" + body);
  GmailApp.sendEmail(recipient, subject, body);
}
/**
 * Collect the details of a thread and email them as a task.
 */
function handleThread(thread) {
  var details = {
    threadId: thread.getId(),
    subject: thread.getFirstMessageSubject(),
    count: thread.getMessageCount(),
    permalink: thread.getPermalink(),
    priority: thread.hasStarredMessages(),
    from: threadFrom(thread)
  };
  emailTask(details);
}
function main() {
  var labelName = LABEL_NAME;
  processLabel(labelName);
}


Please feel free to fork the gist and improve it!


Data Analysis and Visualization

Going from data to action is a recurring challenge in a startup. The process has never been easier, thanks to the wealth of amazing open source tools including Python (pandas, numpy, matplotlib), IPython Notebook, and D3.js.

I’ve recently worked on a project in the container shipping industry where we had a large database of information about repairs to shipping containers. The challenge was to find actionable opportunities based on insights gleaned from the data. Here’s how I went about the data analysis.

Mungeing and Probing

I started the project by flexing the data this way and that using pandas and the IPython Notebook (both amazing tools you should get to know). This took a few passes. First I got it loaded into a DataFrame. Then I altered the structure to make it easier to understand, such as replacing coded names with full text. With that out of the way it was time to explore. The most helpful chart I made was this Pareto chart, which reveals the relative significance of various drivers in the data. Below is the code to generate the chart for any data series.

import pandas as pd
import matplotlib.pyplot as plt


def combined_label(perc, tot):
    """Format a label to include both Euros and %."""
    return "{0:,.0f}k EUR, {1:.0f}%".format(perc * tot / 1000, perc * 100)


def cost_cum(data, focus, subject):
    """
    Accumulate the stats.
    - data is a DataFrame,
    - focus is the column to group by,
    - subject is the column to aggregate.
    """
    # Set up the data frame: total per group, largest first.
    # sort_values replaces DataFrame.sort, which newer pandas releases no longer have.
    parts = data[[focus, subject]].groupby(focus).sum().sort_values(subject, ascending=False)
    # Use a fixed column name so cost_pareto does not depend on `subject`.
    parts = parts.rename(columns={subject: 'cost'})
    parts['percent'] = parts['cost'] / parts['cost'].sum()
    parts['cum_percent'] = parts['percent'].cumsum()
    return parts


def cost_pareto(data, focus_name, limit_percent=0.75):
    """Draw a Pareto chart from the output of cost_cum."""
    # Filter and organize the data frame
    top_parts = data[data['cum_percent'] < limit_percent]
    # Draw the plots
    fig = plt.figure(figsize=(10, 7))
    fig.subplots_adjust(bottom=0.4, left=0.15)
    ax = fig.add_subplot(1, 1, 1)
    top_parts['cum_percent'].plot(ax=ax, color="k", drawstyle="steps-post")
    top_parts['percent'].plot(ax=ax, kind="bar", color="k", alpha=0.5)
    ax.set_ylim(bottom=0, top=1)
    tick_nums = [x / float(100) for x in range(0, 101, 20)]
    tot_cost = top_parts['cost'].sum()
    # Fix the tick positions before labelling them.
    ax.set_yticks(tick_nums)
    ax.set_yticklabels([combined_label(x, tot_cost) for x in tick_nums])
    ax.set_title("Top %s%% of Cost Split By %s" % (int(limit_percent * 100), focus_name))
    return ax


# `data` is the DataFrame of repairs loaded earlier.
accumulated = cost_cum(data, 'Part', 'Cost')
chart = cost_pareto(accumulated, 'Part', 0.9)


Using these Pareto charts, plus a variety of histograms and scatter plots, I was able to give the team an initial window into the data, which we used to identify an avenue worthy of further investigation.
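To make the munging and accumulation steps concrete, here is a minimal sketch on a toy data set. The column names (`part_code`, `cost`), the codes and the amounts are all made up for illustration; they are not the project's real data. It replaces coded names with full text and then computes each part's share of the total cost and the running share, which is what a Pareto chart plots.

```python
import pandas as pd

# Hypothetical repair records; the column names and codes are invented for this sketch.
repairs = pd.DataFrame({
    "part_code": ["DR", "FL", "DR", "RF", "FL", "DR"],
    "cost": [400.0, 250.0, 350.0, 100.0, 150.0, 250.0],
})

# Replace coded names with full text so the charts are readable.
part_names = {"DR": "Door", "FL": "Floor", "RF": "Roof"}
repairs["part"] = repairs["part_code"].map(part_names)

# Total cost per part, largest first, then each part's share and the running share.
by_part = repairs.groupby("part")["cost"].sum().sort_values(ascending=False)
summary = pd.DataFrame({"cost": by_part})
summary["percent"] = summary["cost"] / summary["cost"].sum()
summary["cum_percent"] = summary["percent"].cumsum()

print(summary)
```

In this toy example the doors account for two thirds of the cost on their own, which is the kind of concentration a Pareto chart makes obvious at a glance.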


With a clearer destination in mind, my goal became creating a visualization that would reveal the opportunity within the data. The tool for this is D3.js. D3 is a little bit confusing to get one’s head around at first, but it is well worth figuring out because the things that you can do with it are amazing.

In our case, I wanted to let our team explore the impact of various interventions to curtail types of damage or to protect various parts of the containers. While the Pareto chart (above) provides insight into the cost of various damage types or container parts, it falls short when the two dimensions need to be considered together.

My solution is this interactive visualization (view full size). With it, our team has been able to explore the data set without having to write more code. They are no longer dependent on me to “run the numbers”. And it didn’t even take too long to make.
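For a quick static look at the two dimensions together, a pandas pivot table is one possible starting point. This is only a sketch of the idea, not the D3 visualization itself, and the column names (`part`, `damage`, `cost`) and values are invented for illustration.

```python
import pandas as pd

# Hypothetical repair records; the column names and values are invented for this sketch.
repairs = pd.DataFrame({
    "part": ["Door", "Door", "Floor", "Floor", "Roof", "Door"],
    "damage": ["Dent", "Rust", "Hole", "Rust", "Dent", "Dent"],
    "cost": [400.0, 350.0, 250.0, 150.0, 100.0, 250.0],
})

# Rows are parts, columns are damage types, cells are total repair cost.
cost_grid = repairs.pivot_table(
    index="part", columns="damage", values="cost", aggfunc="sum", fill_value=0
)

# Share of the overall cost that falls in each (part, damage) cell.
share_grid = cost_grid / cost_grid.to_numpy().sum()

print(cost_grid)
print(share_grid.round(2))
```

A grid like this shows which part and damage combinations dominate the cost, something that neither single-dimension Pareto chart can reveal.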


I highly recommend adding data analysis and visualization tools to your toolkit. They aren’t hard to learn and they are amazingly powerful.


Is Google becoming evil? Default to Duck Duck Go

Recently Google has made a few announcements that tarnish its image as a company that looks out for my interests. As a result I’ve changed all my default search mechanisms to use DuckDuckGo. While I generally like DuckDuckGo’s results, the main reason for switching is that they don’t track me. And tracking based on search, particularly with Google’s new privacy policy, worries me.

At the bottom of this post are instructions if you want to do the same:

Big Data about people = stereotyping and prejudice

Recently in my work for LevelBusiness I’ve been learning about big data. It’s powerful, amazing and fun tech, and it’s all about stereotyping. As we all learned in primary school, stereotyping and its flip side, prejudice, are generally bad things that often lead to bad ends.

Big data involves boiling down vast data sets into actionable conclusions. Big data gets dicey when the topic at hand is people rather than things. The conclusions about people are of the form:

Then the actions are respectively:

  • promote baby products to this group.
  • only show certain search results to this group.
  • decline to insure this group.

The critical but often overlooked point is that these groupings are always just probabilities based on the underlying data set. For example, Amazon infers your religion based on the wrapping paper you buy. They don’t know for sure that you are Christian, Jewish, Muslim or Sikh, but they think that they have enough evidence to make it worth their while to treat you as if their assumption is true. This is the definition of prejudice:

Prejudice (or foredeeming) is making a judgment or assumption about someone or something before having enough knowledge to be able to do so with guaranteed accuracy, or “judging a book by its cover”.
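A small worked example shows how much uncertainty can hide behind such a grouping. The numbers below are made up purely for illustration and do not describe any real company's data: suppose 10% of customers belong to a group, 60% of them show some signal, and 5% of everyone else shows it too.

```python
def posterior(base_rate, hit_rate, false_alarm_rate):
    """Probability that a flagged person really belongs to the group (Bayes' rule)."""
    true_flags = base_rate * hit_rate
    false_flags = (1 - base_rate) * false_alarm_rate
    return true_flags / (true_flags + false_flags)


# Made-up numbers: 10% of customers are in the group, 60% of them show the
# signal, and 5% of everyone else shows it as well.
p = posterior(base_rate=0.10, hit_rate=0.60, false_alarm_rate=0.05)
print("Chance a flagged customer is really in the group: {:.0%}".format(p))
```

Under these assumed numbers, only about 57% of the people flagged actually belong to the group, so acting on the flag means treating the remaining 43% as something they are not.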


Prejudice + Hearsay = No Good

Every company needs to get to know its customers. But when every internet-using person is your customer you have to take care to be responsible about what conclusions you draw from your vast data sets and what actions you take based on them. Google may or may not be up to this task; I think this remains an open question. What bothers me, though, is the following clause in the new Google Terms and Conditions:

We have a good faith belief that access, use, preservation or disclosure of such information is reasonably necessary to (a) satisfy any applicable law, regulation, legal process or enforceable governmental request…

Google’s new privacy policy

Adam Levin’s analysis in the Huffington Post clarifies the risk:

Hold on, Bucky.

What exactly constitutes an “enforceable governmental request?” This sentence should read: “We will share information with a Governmental entity only when presented with a valid search warrant issued by a court of competent jurisdiction.” Such a provision would make it obvious that by giving information to Google, you do not intend to waive your constitutional rights, and it would make it clear that despite the fact that your information was shared willingly with a private sector entity, you reasonably retained an expectation of privacy against Government intrusion.

In other words, Google is stereotyping you, and then not only are they acting on that prejudice but they are saying that if a government comes calling then they will happily share what they think they know about you. If you have even the slightest distrust of government, your own or any other in the world, then this should worry you.

I know, this data-driven stereotyping and prejudice is happening all over the place, but that does not mean it’s good or safe. And, it certainly doesn’t mean that you have to be a willing sheep in the process. That’s why I’m switching away from Google as my default search service. I don’t want to feed more of my data into their prejudice machine.

Here are instructions if you want to do the same.

How to make DuckDuckGo your Default

Instructions from

  • Chrome: Right-click the Chrome Omnibox » in the last entry fill the search engine name and keyword and copy/paste the URL » click on ‘make default’. (Alternatively you may add DuckDuckGo with suggestions to your search engines’ list and make it your default engine).
  • IE: Enter DuckDuckGo homepage » click on the left arrow and select DDG from the sub-menu ‘Add Search Provider’ » check ‘Make this my default search provider’ » click on ‘Add’. (As in the previous example you may set DDG with suggestions as your default engine).
  • Firefox: Type ‘about:config’ in the awesome bar and press ‘enter’ » confirm the declaration » type ‘Keyword.URL’ in the filter box » copy and paste the following URL » click ‘OK’ and close this tab or window. You can also install the DuckDuckGo search plugin here.
  • Opera: Right-click the DDG search box » Select ‘Create Search’ » type your preferred keyword » check ‘Use as default’ » click ‘OK’. (Note: Adding DDG and suggestions is more complicated and described here).
  • Safari: Enter DDG homepage » click on ‘Add to Safari’ » follow the instructions.
