Category Archives: Customer Joy

Better Game Playing using Parallel Algorithms

Twitter Summary: Innovations in playing the game “Go” using Monte Carlo Tree Search and parallel algorithms.

I have a new post up at blog@CACM, “Better Game Playing using Parallel Algorithms.” The post was inspired by a friend of mine who visited from New York and was working on IBM’s Blue Fuego. Blue Fuego is their “Go” playing system, similar to Deep Blue, the chess-playing software and machine that defeated Kasparov in 1997. The interesting part was that this work was an incremental innovation in game evaluation that emerged in 2006, led to a major leap in the performance of “Go” playing software, and would lead to the defeat of a human professional by 2009.

I for one welcome our computational overlords.

Hope vs Fear: Privacy challenges in online health communities.

Twitter Summary: The difficulty in creating privacy controls in online health communities represents the struggle between hope and fear.

I have a new post up at blog@CACM, “Hope vs Fear: Privacy challenges in online health communities.” The post was inspired by all the controversy surrounding Facebook privacy updates, which reminded me of my previous startup, which also struggled with defining how much privacy our customers wanted in the website. My advice is to not post anything you wouldn’t want your mom to see, and at the same time to absolutely post the things that show your mom you are your own person.

The Joy of Building

Twitter Summary: The joy of building and making something successful in any size of organization.

I recently signed on to work for Google as an Engineering Director in its Seattle offices. This move was a little surprising to everyone, including myself. I had been consulting with various startups around Seattle, with an eye towards finding a new thing to build. When Google approached me this time, I was compelled by the opportunity to work with its teams on improving its core technologies. There was also the opportunity to explore what else you could do if you had an “infinite” amount of computing resources. What I was happiest to hear in talking with members of the leadership team is that they all wanted to improve how agile the company was while improving their current systems and creating new products.

Generally, I write about lessons I have learned at previous companies for use in a future startup. The lessons for that startup can be summarized as, “Get focused. Find a great team, customers, vendors and investors. Build something great and have your customers buy it.” Funnily enough, the same lessons are applicable at larger companies. In a startup, you have resource constraints in figuring out how many people and how much money and time it will take to build your product. In a large organization, you have the exact same constraints. The key difference is that in a large organization the scale of the constraints gives you more flexibility in what you can deliver. Organizations like Google have made it their business to build tools like BigTable and MapReduce to give their employees better tools to improve search. Amazon has been investing in web services and working to figure out how to make them work better for both external and internal groups. The constraints in these types of larger organizations are more often building the right team and setting expectations of when projects can be delivered.

Innovating within a large company has the advantage of being able to leverage the whole organization’s people, time and money if you have a runaway success. Apple recently reported that 40% of its revenue comes from the iPhone, where only a few years ago it was making only desktops and laptops. Amazon’s Kindle was its top seller in electronics this past year and has been successful enough that it is drawing engineers from other groups within the company who want to work on the “next cool thing.” If you have an innovation within a startup, you are much more susceptible to having it starved by a lack of funding to promote it, a lack of time for it to gain traction, or too few team members to make it excellent.

I am looking forward to working at Google as I will be able to ask the interview question: “What would you build if you had an infinite amount of CPU/RAM/Disk?” The big reward will be that if the candidate is hired and joins the company, I will be able to give them access to an infrastructure that will help them build what they imagine and see if it helps the customer.

Better GPS Software through user feedback

I have a new post up at blog@CACM, “Better GPS Software through user feedback,” about some of my experiences and frustrations in using the software for my myriad GPS devices.

I love my GPS devices and use them because they provide great utility in either tracking my running and biking or informing me of the location of the closest coffee house. However, in using them it’s easy to see how they could be massively improved if they leveraged all the other people who use the devices similarly and anonymously shared the aggregate information. Amazon and Google improved their search engine relevance by getting lots of traffic and aggregating user behavior in web space. I believe that software like Garmin’s Connect and Google Maps could benefit from aggregating people’s behavior in physical space.

Smartphones and Health Systems Research at Intel Seattle

I have a new post up at blog@CACM, “Smartphones and Health Systems Research at Intel Seattle,” about a recent open house I attended at Intel Seattle research labs.

One of the biggest factors in successful weight loss and management is whether you are able to simply write down what you eat and have a support group help you along the way. If you can make a tool that makes it trivial to do both, you would be more successful in your weight loss goals. The iPhone has some applications for this, but so far the applications I have seen aren’t as seamless as I think they should be in computing food eaten and sharing results with people who are looking to help you.

[For those who like the science behind the statements, the research article was published in August 2008 in the American Journal of Preventive Medicine.]

Other people’s postings

Two posts came out in the past few months that I would like to highlight as they both served to remind me that just about every organization is striving to make great environments for their employees and their customers.

The first is the Netflix “Reference Guide on our Freedom & Responsibility Culture” slide presentation outlining how Netflix chooses to distinguish itself from other organizations. So many great comments are available about it that I will merely add this would be a great resource to crib from if you are starting a new organization or looking for a successful culture to model.

The second, from Kate Roth, highlights how a high-service organization like the Ritz-Carlton allocates money per employee and encourages them to use it to impress their guests. I liked this one because it made me think of what employees at a web startup could do to “wow” their customers. In cash-strapped startups, the “wow” may not be directly monetary, but can certainly be budgeted into the time it takes to deliver a piece of functionality. This extra functionality would hopefully be something that amuses, astounds or makes the customer’s experience uniquely satisfying in their use of the product.

Information Aggregators: A marketer’s dream

Twitter Summary:  Information aggregators have the best chance at getting the right advertisement to the right person at the right time.

Prior to advertising on the web, advertisers were primarily paying for repeated impressions over broadcast technologies, such as radio, billboard and TV advertising. The amount they paid was based on the number of people that saw the shows, heard the radio, or passed by the billboard while the advertisement was displayed. The metrics that the marketers used to target their customers were viewer age, sex, and zip code, in an attempt to influence the viewers when they made their next purchase.

The web changed the nature of advertising by being much more precise than any of the previous methods. My first “favorite” search engine was Altavista. It returned results quickly and it was relatively complete for its time. It made money by selling banner advertising to whoever would pay for the space above the search results. Advertisers were still paying primarily for impressions, following the same strategy as for broadcast technologies, but this time they could get complete information about how many times their advertisements were actually seen by customers.

Google also began with marketing deals to display sponsored results at the top of a search results page based on their customers’ keyword searches. Their innovation was to eventually charge only if the customer clicked on the advertisement to go to the resulting page, instead of just viewing the advertisement. This new scheme, branded as Google AdWords, improved advertising results by scoring advertisements and presenting only the ones that were good enough to earn their customers’ clicks. Cost per click (CPC) advertising ensured the quality of the ads was high, and it created a multi-billion dollar business that replaced the impression-based advertisements that appeared in the Google search results.

The challenge for marketers in this environment is getting the right advertisement to the right person at the right time. With customers jumping from website to website and searching for disjoint topics, it’s difficult for marketers to know who the customer is, which advertisements the customer has seen already, and whether they are spending their money effectively to reach everyone they can.
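To make the "scoring advertisements" idea concrete, here is a minimal sketch. It assumes a simplified score of bid multiplied by an estimated click-through rate, plus a minimum quality cutoff. The function name `rank_ads`, the field names, and the threshold are all hypothetical and chosen for illustration; this is not a description of how AdWords actually scores ads.

```python
def rank_ads(ads, min_ctr=0.01):
    """Rank ads by expected revenue per impression (bid * estimated CTR).

    Assumption for illustration: each ad is a dict with 'name', 'bid'
    (dollars per click) and 'est_ctr' (estimated click-through rate).
    Ads below min_ctr are dropped, so a high bid alone cannot buy a slot.
    """
    eligible = [ad for ad in ads if ad["est_ctr"] >= min_ctr]
    return sorted(eligible, key=lambda ad: ad["bid"] * ad["est_ctr"], reverse=True)


ads = [
    {"name": "A", "bid": 2.00, "est_ctr": 0.02},   # expected value 0.04
    {"name": "B", "bid": 0.50, "est_ctr": 0.10},   # expected value 0.05
    {"name": "C", "bid": 5.00, "est_ctr": 0.001},  # filtered: rarely clicked
]
ranked_ads = rank_ads(ads)
```

Under these assumptions the cheaper but more frequently clicked ad B outranks A, and C never appears, which is the sense in which charging per click rewards ads that customers actually want.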

My prediction is that “information aggregation” utilities such as Google Wave, Gist, or the customer’s web browser with an integrated email client will be the next best source for complete marketing information. These aggregators could solicit customers for biographical data both explicitly (customer surveys) and implicitly (observing the customer as they search and click on web links), and selectively share the biographical data with advertisers. This complete biographical view of the customer will allow advertisers to more effectively target fewer and better ads to everyone.

The challenge will be convincing the customers that the advertising they get is to their benefit as well as the advertiser’s benefit. There are concerns that this may be considered too invasive to the privacy of the customer as they navigate the web. Fortunately, the trends on the web indicate that younger audiences are more willing to share information via YouTube, Twitter, and Facebook. This would imply they are more likely to share their information with marketers provided they are included in the value proposition and actually get a benefit from the advertisements they see. The success of Hulu bears this out, since it is explicit about the tradeoff of getting on-demand video viewing while being required to sit through commercial breaks. As people get used to the idea of providing more information to marketers, they will have the ability to get more focused marketing information that could help them make informed decisions in the future.

Search Engine Relevance: User Feedback Loop

Twitter Summary: User relevant search results can only be achieved by incorporating a user-centered feedback loop.

The largest improvement in the relevancy of modern search engines has been the incorporation of the user feedback loop. Historically, search results were evaluated only on the basis of the contents of the documents: words in the title were thought to be more valuable than words mentioned in the body of the document, a name in the author field was more important than the same name appearing in the subject, and words mentioned earlier in the document were more important than words mentioned later in the document. The net effect was that the search results were limited to the weight and score of each document. A user had no input to the scoring function that determined if the document was a good match to their search query.
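A content-only scoring function of the kind described above might look like the following sketch. The field weights and the position decay are assumptions made up for illustration, and `content_score` is a hypothetical name, not any particular engine's formula.

```python
# Hypothetical field weights: a title match counts more than a body match.
FIELD_WEIGHTS = {"title": 3.0, "body": 1.0}


def content_score(query, doc):
    """Score a document from its text alone; no user behavior is involved."""
    terms = set(query.lower().split())
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = doc.get(field, "").lower().split()
        for position, word in enumerate(words):
            if word in terms:
                # Earlier words earn more: 1, 1/2, 1/3, ... by position.
                score += weight / (position + 1)
    return score


docs = [
    {"title": "Weeknight dinners", "body": "A chapter on thai curry"},
    {"title": "Thai cooking basics", "body": "Recipes for curry and noodles"},
]
by_content = sorted(docs, key=lambda d: content_score("thai", d), reverse=True)
```

Note that nothing in this function can learn whether readers found either document useful, which is the limitation the feedback loop addresses.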

The most successful search engines learned early on that by using feedback loops to leverage human feedback you could add additional information into an index and get a better search result. Google implemented a “user feedback loop” through the incorporation of PageRank. PageRank took the network of outbound links from websites and weighted the pages that they referred to higher in search results. The search engine still treated words in the title, subject, and headers as important, but also included an additional factor that leveraged the human behavior of linking pages to websites that people thought were valuable. The result was a search engine that has become the de facto standard for finding information on the web.
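The link-weighting idea can be sketched with the textbook power-iteration form of PageRank. This is a simplified illustration, not Google's production system; the damping factor of 0.85 and the handling of pages without outbound links are conventional choices assumed here.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power iteration. links maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Pages "a" and "b" both link to "c"; "c" links back to "a"; nobody links to "b".
link_graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(link_graph)
```

In this toy graph "c" ends up with the highest rank because two pages vouch for it, and "b" the lowest because no page links to it.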

Amazon.com was able to incorporate an even stronger feedback loop by tying together three human activities: the words the customers used to search for an item, the items they clicked on, and the items they eventually bought. Their algorithms are exceedingly strong because Amazon’s scoring function involves money. Nothing says something is valuable to a person more than their willingness to pay money for it. If you present results to a customer that don’t produce a sale, that is a sign the algorithm could potentially be improved.

Yelp includes a different user feedback loop in the way it refines a search for distance, specifically by including an option for walking distance. If you are picking out a “Thai” restaurant in Seattle, the search results are biased towards good places that are within walking distance of where Yelp currently understands the user to be. The presumption is that being close by is an important factor in where a user decides to eat. The effectiveness of the algorithm is evident from the fact that the company continues to use it as a default in search results.
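One way to picture tying searches, clicks, and purchases together is the sketch below. The event weights, the log format, and the function names are all assumptions for illustration; Amazon's actual algorithms are not described in this post.

```python
from collections import defaultdict

# Hypothetical weights: a purchase is a much stronger signal than a click.
EVENT_WEIGHTS = {"click": 1.0, "purchase": 10.0}


def build_feedback(events):
    """Aggregate (query, item, event_type) log tuples into per-query item scores."""
    scores = defaultdict(float)
    for query, item, event_type in events:
        scores[(query.lower(), item)] += EVENT_WEIGHTS.get(event_type, 0.0)
    return scores


def rerank(query, candidates, feedback):
    """Order candidate items by the behavior previously observed for this query."""
    return sorted(
        candidates,
        key=lambda item: feedback.get((query.lower(), item), 0.0),
        reverse=True,
    )


log = [
    ("usb cable", "cable-a", "click"),
    ("usb cable", "cable-a", "click"),
    ("usb cable", "cable-b", "click"),
    ("usb cable", "cable-b", "purchase"),
]
feedback = build_feedback(log)
results = rerank("usb cable", ["cable-a", "cable-b", "cable-c"], feedback)
```

Here "cable-b" moves to the top even though "cable-a" drew more clicks, because someone was willing to pay for it.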
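A distance-biased ranking can be sketched as follows. The walking-distance cutoff, the scoring formula, and the field names are invented for illustration and are not Yelp's actual method; distances are assumed to be precomputed from the user's estimated location.

```python
def rank_nearby(places, max_walk_km=1.0):
    """Keep places within walking distance, then favor good ratings close by.

    Assumption for illustration: each place is a dict with 'name',
    'rating' (0 to 5) and 'distance_km' from the user's estimated location.
    """
    walkable = [p for p in places if p["distance_km"] <= max_walk_km]
    return sorted(
        walkable,
        key=lambda p: p["rating"] / (1.0 + p["distance_km"]),
        reverse=True,
    )


places = [
    {"name": "Far Thai", "rating": 5.0, "distance_km": 3.0},     # too far to walk
    {"name": "Corner Thai", "rating": 4.5, "distance_km": 0.2},
    {"name": "Uphill Thai", "rating": 5.0, "distance_km": 0.9},
]
nearby = rank_nearby(places)
```

With these made-up numbers the slightly lower-rated restaurant around the corner ranks first, and the best-rated one across town is not shown at all.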

In all these cases, a machine could clearly make a decision as to what is a better search result based on just the name of a restaurant, or title of a book.  However, a useful search result for a machine doesn’t require user relevant feedback as it would be meaningless for a machine to walk to lunch or decide whether it wants to buy a book.

If you are building your own search engine, think about what user feedback loops you should incorporate to return a better result. Great user feedback loops should consider both positive results (a sale being made, a link being clicked on) and negative results (what a customer does after an empty search result, or after being presented with results where they don’t click on any of the links). The end product will be more relevant search results for your customers, and faster navigation through the website to the customer’s desired pages.
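Negative signals can be collected with very little machinery. The sketch below assumes a search log where each entry records the query, the number of results returned, and whether anything was clicked; the log format and the name `find_problem_queries` are hypothetical.

```python
from collections import Counter


def find_problem_queries(search_log):
    """Count queries that returned nothing and queries where nothing was clicked.

    Assumption for illustration: each log entry is a dict with
    'query', 'num_results' and 'clicked' keys.
    """
    empty = Counter()
    abandoned = Counter()
    for entry in search_log:
        query = entry["query"].strip().lower()
        if entry["num_results"] == 0:
            empty[query] += 1
        elif not entry["clicked"]:
            abandoned[query] += 1
    return empty, abandoned


search_log = [
    {"query": "Thai resturant", "num_results": 0, "clicked": False},
    {"query": "thai resturant", "num_results": 0, "clicked": False},
    {"query": "thai restaurant", "num_results": 12, "clicked": True},
    {"query": "late night food", "num_results": 8, "clicked": False},
]
empty, abandoned = find_problem_queries(search_log)
```

The most frequent entries in each counter are a reasonable first list of queries to investigate, for example a common misspelling that returns nothing.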

You are what you index

Twitter Summary:  Search engines only return results for items indexed. The more garbage they add, the more garbage their customers see.

Garbage In, garbage out

Search engines can only return results based on the content they index. The items indexed need to (a) match the user’s expectation, and (b) be relevant, high-quality results. This seems really obvious, but the implications for building high-quality search results are tremendous in that web sites are quickly limited to the type of information they can return. In addition, the nature of the items indexed creates requirements for information quality that most search engines can’t completely control.

Prime examples of this are:

Search Engine | Items Indexed | Relevancy
Amazon.com | Things people can buy | Sales rank
Google | Web pages | PageRank
Twitter search | Tweets | Chronological order

These search engines will break the customer’s expectations if they begin returning results that the customer did not intend to see. If Amazon.com were to start returning web pages or Twitter “tweets” as part of its search results, it would entirely miss the expectations of its customers as to what the search results should provide for them. If Google began returning a page full of “tweet” results and product information, it would certainly capture more information to index, but it would be unlikely to return results that would be satisfying to the customer.

The relevancy of the results also becomes a limiting factor for these search engines, as they need to spend effort controlling for quality when their basic algorithms return inadequate results. In Amazon’s case, quality of results is impacted by manufacturers creating product names that have typos, or music band names that were intentionally misspelled. It would be easier to just let the misspellings float through the system, but by returning a poor result or not showing the customer a potential match they can lose a potential sale. In Google’s case, they have to deal with malicious websites that create inflated page ranks for pages through the use of “link farms”, or websites that are mirror images of other websites and vary only by the URL at the top of the page. Rather than display the same information repeatedly on the same result page, Google spends effort removing duplicate web pages and eliminating rank inappropriately created by a link farm. In Twitter’s case, returning results in chronological order is simple, but searches for “milk chocolate” and “chocolate milk” are actually two distinct searches, and search.twitter.com can’t improve the results without breaking its default time ordering. Twitter also suffers from its users’ typos (or simple pluralization mistakes) that could be remedied, but the way it returns and displays search results makes them difficult to fix.

At various companies where I have worked, there have been multiple attempts to return search results that mixed a variety of types of data. The user interface challenge was enormous, since that type of result requires the user to know what type of result they should expect to see prior to clicking on the link. In the end, it was simpler to design, create and explain to the customers that depending on which search box they used, they would get the type of result they were expecting. Managing customer expectations prior to the search made the page easier to design and easier for the customer to use.
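As a small illustration of the duplicate-removal problem, the sketch below drops mirrored pages whose body text is identical after normalizing case and whitespace, so pages that differ only by URL collapse to one entry. This is an assumption-laden simplification with hypothetical function names; real near-duplicate detection is considerably more involved than exact fingerprint matching.

```python
import hashlib


def content_fingerprint(text):
    """Hash the page body after normalizing case and whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def drop_mirrors(pages):
    """pages: list of (url, body) pairs in rank order; keeps the first copy of each body."""
    seen = set()
    unique = []
    for url, body in pages:
        fingerprint = content_fingerprint(body)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append((url, body))
    return unique


pages = [
    ("http://example.com/guide", "How to  brew coffee at home"),
    ("http://mirror.example.net/guide", "how to brew coffee at home"),
    ("http://example.org/tea", "How to brew tea at home"),
]
deduped = drop_mirrors(pages)
```

Because the input is assumed to be in rank order, the highest-ranked copy is the one that survives.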

Improving data quality is challenging regardless of domain and frequently requires human judgment as the data is typically made by humans for humans to consume. If the search engines were made for just machines to use, we wouldn’t need relevancy as a machine could process all the results.  People using a search engine require assistance in helping figure out which search result matches their query.

In the end, the old rule of “Garbage In, Garbage Out” or in this case, “You are what you index” is tremendously meaningful in figuring out what it takes to return a great search result.

Startups Are Like Dating

I frequently joke that working at a startup is a great analogy for dating.

It generally starts hot and heavy and you don’t know if this is going to be a great relationship or the ex from hell. Having experienced both extremes with startups, I recommend that you take each of them with a light attitude and a determination to enjoy the ride. If you find that one isn’t working out, there is no shame in being honest, ending it quickly, and moving on to the next startup. Once you have seen enough startups, you begin to find the patterns of the ones that appeal to your sensibilities and that will keep you engaged to the best of your abilities.

Recently the analogy showed a little more depth when I was discussing with friends how most Web 2.0 websites try to use the same techniques to get customers that Pickup Artists (wikipedia) use to get dates.

  • Quickly identify who in a crowd you would like to meet.
    • Send lots of email to people who could use your website.
  • Introduce yourself and quickly demonstrate that you’re something special/have unusual value
    • Create a special welcome landing page for first time visitors telling them about yourself. Make sure website design communicates correct image.
  • Get them to state an opinion about anything to see if they are interested in talking. If not, move on.
    • Collect any information from the customer via clickstream, keyword analysis, or customer provided poll information
  • Tell them something about themselves that they don’t know and imply you can  tell them more if they hang around.
    • Create personalization based off collected interactions and tell them how much better it would be if they keep using the website.
  • Take them to somewhere private and create trust
    • Communicate private information using SSL protocol
  • Get their email address.
    • Well……yeah.
  • Set their expectations you will contact them later to meet again.
    • Create permission based marketing email to entice them to interact with your website some more.
  • Ask them to invite their friends to meet you.
    • Provide incentives about how much better the website is by encouraging all their friends to join.

I don’t know what is scarier: that the analogues are so close, or that nobody has done a more systematic analysis of which of these techniques are successful.

Don’t hate the player. Hate the game.