Trying to cope with data overload

As data grows, we need a better relationship with software

At first you can handle things: you don’t get that much email. But as the volume of email increases, the obvious response is to categorize messages into folders, which most email programs support. This works fine for a while, and you can see all your emails about a specific subject in a single folder. Then the folders get out of control and you can’t find anything anymore. Then comes search. Gmail took the search approach: don’t use folders, just search. Eventually even search gets out of control, and the results are no longer useful. That is when personalization and behavioral techniques come into play. The only way to cope with data overload is to get some intelligence into the software we use, some sort of intimate relationship.

One example of this is what Google did with Gmail: priority email. Priority email is an attempt to cope with the increasing load of email. By learning from your behavior, Gmail learns which emails you find most useful. Messages that you just delete and never respond to must be less important than messages you respond to. This is a sort of reverse spam filtering: a widely known and used technology, applied to surface the mail you want rather than to hide the mail you don't.
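The idea of learning importance from replies and deletions can be sketched in a few lines of Python. This is only a toy illustration, not how Gmail actually works: the class name PriorityInbox, the per-sender counting, and the smoothed reply ratio are all assumptions made for the example.

```python
from collections import defaultdict


class PriorityInbox:
    """Toy model (hypothetical): learn per-sender importance from behavior."""

    def __init__(self):
        # sender -> [times replied, times deleted without replying]
        self.counts = defaultdict(lambda: [0, 0])

    def record(self, sender, action):
        """Record what the user did with one message from this sender."""
        if action == "reply":
            self.counts[sender][0] += 1
        elif action == "delete":
            self.counts[sender][1] += 1
        else:
            raise ValueError(f"unknown action: {action!r}")

    def score(self, sender):
        """Smoothed share of messages replied to; unknown senders get 0.5."""
        replied, deleted = self.counts.get(sender, (0, 0))
        return (replied + 1) / (replied + deleted + 2)

    def rank(self, senders):
        """Order senders from most to least important."""
        return sorted(senders, key=self.score, reverse=True)


inbox = PriorityInbox()
for _ in range(3):
    inbox.record("boss@example.com", "reply")
    inbox.record("news@example.com", "delete")

print(inbox.rank(["news@example.com", "stranger@example.com", "boss@example.com"]))
```

The smoothing means a sender you have never dealt with starts in the middle, and only repeated replies or deletions move them up or down. A real system would presumably look at far more signals than the sender alone.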

This step in the evolution of email is very similar to the general response to information overload. When the web was young, you could pretty much keep a good overview of the important web pages. As new pages came along, it became increasingly difficult to keep up with what was available. Yahoo! was an answer to that: it categorized web pages into directories, much like email folders. But Yahoo! became cluttered and complicated. So search became important, with AltaVista and other early engines. Google became popular by applying a general ranking to search results, making the important things appear first. As the amount of searchable content grows, personalized and behavioral search becomes important.

Google has already implemented some logic that takes the searcher into account. When a colleague of mine at work was trying different searches for his SEO work, he found that an article of mine popped up on the first page. I was of course pleased to hear this, happy to become an important figure in the world of technology. However, when he repeated the search in another country, my brilliant input was nowhere to be found. Either Google's results are very unstable and change quickly, or being in another country has some relevance. Whatever the logic used, it is clear that search companies must try to find intelligent ways to rank results. And the obvious outcome is for the search engine to develop an understanding of the user.

Communication is a killer business and always has been. As data increases, technology innovations that solve the overload will emerge. Data analysis software will become real-time, trying to interpret messages and their relevance. Simple statistics are not enough; the software has to learn the behavior of the user. We are seeing the rise of user-software relationships.