Monday, June 23, 2014

Real Time Insights using Apache Storm

In the previous post on real time monitoring and alerts, I discussed the importance of real time optimization for Digital Businesses.

The open source community now provides Apache Storm, a powerful solution for this purpose.

What is Apache Storm?

As per the Apache Software Foundation: Apache Storm is a free and open source distributed real time computation system. Storm makes it easy to reliably process unbounded streams of data, doing for real time processing what Hadoop did for batch processing. Storm is simple, can be used with any programming language, and is a lot of fun to use!







Monday, June 2, 2014

Real Time Alerts, monitoring and optimization needs of Digital Businesses

Digital businesses spend millions each day on online advertising and product promotions. Real time alerts are important for monitoring the performance of online ad campaigns and promotions, and for making adjustments in real time to optimize the return on spend.

In addition to alerts on the performance of online advertising, there are other specific needs for alerts, such as:
  • E-Commerce businesses want to know in real time how traffic and conversions for different product categories are performing.
  • Almost all digital businesses want to understand which traffic segments and which customers to target at a given point of time.
  • Similarly, they want to know which offers to provide at a given point of time.


New Analytics Architectures and Toolkits, such as those based on Apache Spark, Apache Storm and NoSQL datastores, promise to provide real time alerting capabilities on the performance of specific KPIs and metrics. Large volumes of data in many varieties, including streaming data in motion, are processed on Apache Hadoop/YARN and similar platforms using clusters of distributed computing nodes. Anomalies in specific metrics can be detected and reported in real time by sending alerts to business decision makers. Specific algorithms are written to run on these advanced Analytics platforms in programming languages like Java, Python or R. The real need is to develop packaged applications around these algorithms, which end users can install and run on their own client machines, connected over the cloud to these scalable, highly available, low latency, real-time data processing platforms.
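To make the idea of detecting anomalies in a metric concrete, here is a minimal Python sketch. It is only an illustration and not how Spark or Storm implement alerting: the class name MetricAlert, the window of 20 readings and the threshold of 3 standard deviations are all assumptions chosen for the example.

```python
from collections import deque
from statistics import mean, stdev


class MetricAlert:
    """Flags a metric reading that deviates sharply from its recent history."""

    def __init__(self, window=20, threshold=3.0):
        self.history = deque(maxlen=window)  # keeps only the most recent readings
        self.threshold = threshold           # allowed deviation, in standard deviations

    def observe(self, value):
        """Return True if `value` is anomalous relative to the readings seen so far."""
        is_anomaly = False
        if len(self.history) >= 5:  # wait for a few readings before judging
            mu = mean(self.history)
            sigma = stdev(self.history)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
        if not is_anomaly:
            self.history.append(value)  # keep outliers out of the baseline
        return is_anomaly


def scan(readings, window=20, threshold=3.0):
    """Return the positions of anomalous readings in a sequence."""
    alert = MetricAlert(window, threshold)
    return [i for i, value in enumerate(readings) if alert.observe(value)]
```

In a real deployment the readings would arrive from a stream rather than a list, and a True result would trigger a message to the decision maker instead of being collected.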


Signature: Roopkumar T.V.

Wednesday, May 21, 2014

Application of Machine Learning algorithms in Digital Marketing

Some of the digital marketing challenges that can be addressed through machine learning algorithms include: 

  • Helping digital media planners and buyers choose each media or ad position based on qualitative and quantitative research that extends beyond your own website or app. This expanded view may encourage media planners and buyers to focus on buying quality ad inventory, reducing the cost of untargeted or unqualified ad inventory.
  • Enabling digital marketers to anticipate, identify and qualify audiences at the point of entry, and personalize content to maximize conversions or session outcomes. This will make content much more relevant and tailored to consumers. Also this personalization can be extended to multiple devices the audience uses. 
  • Providing content feeds to digital users based on their interests, past interactions or other important factors. Social media platforms have some limited capacity to prioritize the order of newsfeed content based on viewing, timings, and likelihood to consume. Many top tier online experiences have trained consumers to expect a custom experience, so not providing prioritized content will disappoint them.
  • Ranking visitors by value to the business. Not all traffic has equal value. Algorithms that can assess the expected value of each visitor enable rapid website testing and greatly improve the personalization of page layouts, offers, messages, creative and content.
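As a rough illustration of ranking visitors by expected value, the Python sketch below multiplies an assumed conversion probability by an assumed order value. Both inputs are hypothetical and would normally come from a trained model or historical data; the Visitor fields are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Visitor:
    visitor_id: str
    conversion_probability: float  # assumed output of an upstream model, between 0 and 1
    expected_order_value: float    # assumed average order value for this visitor's segment


def expected_value(visitor):
    """Expected revenue from one visit: probability of converting times order value."""
    return visitor.conversion_probability * visitor.expected_order_value


def rank_by_value(visitors):
    """Return visitors sorted from highest to lowest expected value."""
    return sorted(visitors, key=expected_value, reverse=True)
```

A visitor with a low chance of converting can still outrank a likely converter if the potential order is large enough, which is why the two factors are combined rather than used separately.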

Sunday, October 27, 2013

Twitter IPO

With Twitter planning to go public with its IPO in early November 2013, it becomes another strong and consistently growing Internet / Digital brand to seek public investors. Twitter will get listed by mid-November, and irrespective of popular sentiment on whether to invest in Twitter or not, it is one of the top Digital Brands and has proven its popularity with 300 billion tweets to date on its online micro-blogging platform. Contrary to general public sentiment, strong and consistently growing Internet / Digital brands have always delivered the best Return on Investment to investors. Of course, a lot of patience is needed.

Amazon - 24,126.00% return since its IPO in May, 1997 
eBay - 2,661.50% return since its IPO in Sept, 1998
Google - 837.31% return since its IPO in Aug, 2004
Baidu (Chinese) - 1,200.47%  return since its IPO in Aug, 2005

Amazon has by far given the best ROI to long term investors compared to any other strong and popular Internet / Digital Brand. During the 2000-2003 period, however, Amazon's shareholders needed a lot of patience, as all Internet related companies were punished during those times. Those long term Amazon shareholders who stayed with its stock through bad times got all these brilliant returns by 2013. Of course, Amazon listed long before Google, and this also means Google's stock still has lots of room to grow further, given that its brands like Google, YouTube, Gmail and Blogspot are the most popular among Digital users.

Even Digital brands without any decent usage or popularity in the US have given brilliant returns to investors, as long as they have been strong, consistently growing in popularity and user focused. Take Baidu, the largest search engine in China, which is hardly used outside the Chinese speaking population.

It is too early to decide about Facebook's stock, since it was listed only 17 months ago in May 2012. Given the popularity and strong brand value of Facebook as a super Digital brand, investors with patience may be celebrating a big party in a decade, around 2023, just like Amazon's long term shareholders have done now in 2013. With 1 billion active users, Facebook is no small turkey. Same with Twitter. Time is a great leveler for patient investors in Internet / Digital brands.

Disclaimer: All returns shown above are absolute returns from the date of each company's IPO to the date of this blog post.
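For readers unfamiliar with the term, an absolute return compares only the starting and ending prices and ignores how long the stock was held. The Python sketch below shows the arithmetic. The annualized version is an addition of mine, not part of the figures above, and is included only because the IPOs listed are of very different ages; any prices passed in are the reader's own inputs.

```python
def absolute_return_pct(ipo_price, current_price):
    """Percentage gain since the IPO, ignoring the time elapsed."""
    return (current_price - ipo_price) / ipo_price * 100.0


def annualized_return_pct(absolute_pct, years):
    """Constant yearly growth rate that compounds to the given absolute return."""
    growth_multiple = 1.0 + absolute_pct / 100.0
    return (growth_multiple ** (1.0 / years) - 1.0) * 100.0
```

For example, a stock that goes from 10 to 25 has an absolute return of 150%, and an absolute return of 300% earned over 2 years corresponds to 100% per year.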


Signature: Roopkumar T.V.


Sunday, September 15, 2013

Delivering Targeted Ads and Communications to customers in real time using Big Data Algorithms.







Customers interact with various sales channels such as the website, call center, field sales force and mobile applications.

Each channel generates big volumes of data on customer interactions, in varieties such as



  1. Web clickstream data which is semi-structured and contains data on visitors, segments, traffic sources, user navigation, abandons, bounces, conversions like purchases or leads, etc.
  2. Call center data which is a mix of unstructured such as customer feedback and structured such as transactional data.
  3. Sales force data which again is a mix of unstructured such as customer feedback and structured such as transactional data in CRM.
  4. Mobile data which is semi-structured in form of weblogs, clickstream, applogs and contains data on visitors, locations, user navigation, abandons, conversions, etc. 
All these varieties of data, both at rest and in motion, are continuously stored in the HDFS big data landing zone of the Big Data Refinery, built on the Apache Hadoop architecture.

Targeted online Ads (Web and Mobile Web) and targeted marketing communications are delivered across the WWW based on each customer’s profile.



  1. Customer profiles have been built and stored in HDFS, based on all their interactions across different channels.
  2. Algorithms written using the map-reduce programming method are executed by the Big Data Refinery to churn out targeted online Ads (Web and Mobile Web) and targeted marketing communications in real time.
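The map-reduce method mentioned above can be sketched in a few lines of plain Python. This runs in a single process purely to show the map, group and reduce steps; it does not use Hadoop, and the event fields (customer_id, channel, action) and the profile layout are assumptions made for the example.

```python
from collections import defaultdict


def map_interaction(event):
    """Map step: emit (customer_id, (channel, action)) for one raw interaction."""
    return event["customer_id"], (event["channel"], event["action"])


def reduce_profile(customer_id, values):
    """Reduce step: fold one customer's interactions into a simple profile."""
    profile = {"customer_id": customer_id, "channels": set(), "actions": defaultdict(int)}
    for channel, action in values:
        profile["channels"].add(channel)
        profile["actions"][action] += 1
    return profile


def build_profiles(events):
    """Run map, group by key (the shuffle) and reduce, all in one process."""
    grouped = defaultdict(list)
    for event in events:
        key, value = map_interaction(event)
        grouped[key].append(value)
    return {key: reduce_profile(key, values) for key, values in grouped.items()}
```

On a cluster, the map and reduce functions would run in parallel across nodes and the grouping step would be handled by the framework.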



Targeted and Re-targeted content (Ads and marketing communications) are viewed and interacted with by customers while browsing across the WWW. 


Signature: Roopkumar T.V. 

Monday, September 9, 2013

Algorithm driven Online Ad Optimization

Those who have been doing SEM since it became popular last decade will understand the importance of Pay Per Click Bid Optimization: SEM budget allocation and CPC bidding tactics had to be applied in near real time across hundreds of thousands of Keywords, to maintain aggressive CPAs even while trying to meet Conversion targets. During the initial years of SEM, roughly 2004-2010:
  • Competition was not tight 
  • CPC prices were lower and
  • Generating Leads was much easier.  
Now in 2013, all 3 of these points have changed:
  • Competition has become bigger, both in breadth and depth.
  • CPC prices have increased, and have skyrocketed in some Industry Verticals.
  • Generating qualified Leads has become even more difficult.
So manual, intuitive bidding tactics and Off-the-shelf bidding software are no longer useful.
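The link between CPC and CPA that drives bid optimization is simple arithmetic: CPA equals CPC divided by conversion rate, so the highest bid that still meets a target CPA is the target CPA multiplied by the conversion rate. The Python sketch below applies this per keyword. The function names and the keyword statistics layout are hypothetical, and real bidding would also account for competition, ad position and quality.

```python
def max_cpc_bid(target_cpa, conversion_rate):
    """Highest cost per click that still meets the target cost per acquisition."""
    if not 0.0 < conversion_rate <= 1.0:
        raise ValueError("conversion_rate must be in (0, 1]")
    return target_cpa * conversion_rate


def adjust_bids(keywords, target_cpa):
    """Return a new bid per keyword, set to its break-even CPC rounded to cents."""
    return {
        name: round(max_cpc_bid(target_cpa, stats["conversion_rate"]), 2)
        for name, stats in keywords.items()
    }
```

With a target CPA of 50 and a keyword converting 4% of clicks, the break-even bid works out to 2 per click; a keyword converting at 1.5% supports only 0.75.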

Other forms of Online Marketing, like Online Display Ads and Email Marketing, have also seen huge improvements in Conversion Rates (%) over the years, due to the application of successful Ad Re-Targeting algorithms. Just a few years ago, SEM & SEO were the undisputed champions of Online Marketing. Now, however, Algorithm driven Re-Targeted Online Display Ads are increasingly eating away Conversions from SEM & SEO. Google itself is promoting the use of Re-Targeted Online Display Ads to its Adwords accounts, offering multiple Ad formats where a few years ago it offered mainly Search Ads.

New Online Channels like Twitter and Facebook have also become more popular, though they are yet to prove their impact in Online Advertising, as their Online Ad share is still minuscule compared to Google Search Ads, Display Ads or even Email Marketing.

To summarize the above points
  1. Changing Economic Cycles
  2. Increasingly intense competition for generating qualified Leads or Customers Online
  3. Ever rising CPC prices in SEM
  4. Re-Birth of Online Display Ads and even Email Marketing as very reliable, high Conversion Online Channels (both had been written off a few years ago) and
  5. Finally, proliferation of new Online Channels like Twitter or Facebook.
Together, these have all increased the sophistication of Online Marketing. It is no longer efficient or effective to manage Online Ad Campaigns by throwing some internal teams and external agencies at them.

Welcome to the world of Algorithm driven Online Ad Optimization. The need today is to develop Algorithms which can process massive volumes of Online Marketing content and data in real time, to effectively allocate Online Ad budgets across Online Channels and to identify the right mix and delivery of targeted Online Ads to prospects at the right time, so as to meet Conversion targets and maintain ever more aggressive CPAs.


Just as savvy Hedge Funds in the Financial Services sector use sophisticated Algorithms to allocate clients' budgets optimally across stock or asset portfolios, the Online Ad sector needs to create sophisticated Algorithms to allocate Account budgets optimally across Online Channels or Ad portfolios. It's just the same.
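One very simplified way to picture budget allocation across channels is a greedy loop that hands out the budget in small increments, each time to the channel expected to gain the most conversions from it. The Python sketch below assumes, purely for illustration, that conversions grow with the square root of spend (diminishing returns) and that each channel has a known scale factor. Neither assumption comes from real campaign data, and this is not how any particular ad platform or hedge fund allocates money.

```python
import math


def conversions(scale, spend):
    """Assumed diminishing-returns curve: conversions grow with the square root of spend."""
    return scale * math.sqrt(spend)


def allocate_budget(channel_scales, total_budget, step=100.0):
    """Greedily give each budget increment to the channel with the best marginal gain."""
    allocation = {name: 0.0 for name in channel_scales}
    remaining = total_budget
    while remaining >= step:
        best = max(
            channel_scales,
            key=lambda name: conversions(channel_scales[name], allocation[name] + step)
            - conversions(channel_scales[name], allocation[name]),
        )
        allocation[best] += step
        remaining -= step
    return allocation
```

Because each extra unit of spend on a channel yields less than the previous one, the loop naturally spreads the budget instead of pouring everything into the single strongest channel.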

Signature: Roopkumar T.V.

Sunday, September 8, 2013

Google's Online Ad Re-Targeting based on Post-Search Behavior of Users

What is common to the 4 screenshots of my web sessions below?









  • In the 1st browser window, I am searching in YouTube for Euro Cup top 10 goals.
  • In the 2nd  window, I am watching a video on Euro Cup top 10 goals in YouTube.

Then I shut down my PC, start it again after an hour, open my web browser and start browsing some newspapers online.
  • In the 3rd window, I am reading an article in Eenadu, a very popular Telugu language newspaper online.
  • In the 4th window, just out of curiosity, I open a popular Tamil language daily magazine online, Dinamalar.com

If we look at all these web browsing sessions carefully, something is very strange. Every website I had opened that evening was displaying a particular Ad on “San Francisco Car Rentals”.

And I was in Bangalore, India on that day.

What’s wrong here?
I am in Bangalore, so why should I get an online Display Ad on “San Francisco Car Rentals” on almost every website or YouTube video that I browsed to or watched on that day? Car rentals in San Francisco are definitely not relevant for a Bangalore, India based browser who is thousands of miles away.

Now I open DeccanHerald.com, a very popular English language newspaper among old timers of Bangalore, and we can see a similar Display Ad, “Car Hire SFO Airport”, even on this website.



Definitely Ads on “San Francisco Car Rentals” or “SFO Airport Car Hire” are not relevant in either of these cases:
  1. While I am watching a YouTube video on Euro Cup soccer top 10 goals. If Display Ads for Bus timings in Paris or Amsterdam were displayed while watching this video on European soccer, we could understand, but what is a faraway San Francisco, USA Car Rentals company’s online Ad doing on a YouTube video about Euro Cup soccer top 10 goals?
  2. While I am reading newspapers online. India speaks many different languages, fine, but every language newspaper online, whether Telugu, Tamil or English, was displaying this Ad by the San Francisco, USA Car Rentals company. Again, this online Ad is not relevant on these newspaper websites.

So what’s wrong here? If YouTube alone were doing a poor job of matching Ads to its videos, we could understand; but how can many other websites in different languages, belonging to totally different media companies, also make the same mistake of showing the same irrelevant Ad?

Actually nothing is wrong here. Let me give a clue: all these different websites that I had visited, from YouTube to regional language news websites, were opened using the same web browser, Google Chrome.

Welcome to the home of Ad Re-Targeting on Google’s Online Advertising platform.  

Actually on that same day, early in the morning, I had searched for "San Francisco Car Rentals" in Google.com using the Google Chrome web browser, since I was planning a business trip to San Francisco 2 weeks later. I had clicked on a Google Search Ad of a Car Rentals company, browsed to its website, viewed different models of cars and, after selecting a model, decided to book it. Then, after clicking on “Book this car” on this Car Rentals website, I had decided to abandon the transaction midway without actually booking, as I wanted to check out car hire deals on other Car Rental websites and also wanted to book closer to my trip date.

Google had tracked this entire activity of mine early that morning, from the time I had searched for "San Francisco Car Rentals" till I had abandoned the booking on that Car Rentals website, and everything in between. I am not sure, but Google might have even tracked the models of cars I had viewed on this Car Rentals website, the dates selected for car hire and even the car pickup location. I am sure about the last one, since later that day I had seen an Ad on DeccanHerald.com on “Car Hire SFO Airport”. For Google’s Online Ad algorithms, therefore, this was a strong user intent towards a conversion, as I had almost booked a rental car before abandoning midway.

Hence these algorithms decided to Re-Target me with online Display Ads on “San Francisco Car Rentals“ every time I was browsing through other websites or watching YouTube videos using the Google Chrome browser. If I used a different web browser the next day or next week, I wouldn't see these Re-Targeted ads. As long as I am using Google’s Chrome web browser, for the next few days or perhaps a few weeks, I am going to frequently see this SFO Car Rental Ad in almost all my web browsing sessions, as I have become a well qualified lead for SFO car rentals.

I had even searched for "Hotels in San Francisco", on that same morning but did not try to book any hotel (as my hotel booking has to be done through my Employer’s Contractor only) and only wanted to check out the prices. For Google’s Online Ad algorithms, this was not a strong user intent towards a conversion – since I had not tried to book a hotel on the Hotel Booking website after clicking through the Google Search Ad. Therefore in case of San Francisco hotel booking, these algorithms have decided not to Re-Target me with online Ads as I am not a good prospect for SFO hotel booking.

Using Visitor Identification and Browse behavior (including the Keywords searched in Google.com and the specific intent expressed by users on various websites visited post-search), Google’s powerful Online Ad algorithms are able to Re-Target Ads to Visitors through the cookies stored in Google Chrome browsers. These algorithms typically execute on a Big Data Analytics Architecture, as discussed previously.
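The difference between my car rental and hotel searches can be pictured as a simple intent score. The Python sketch below is only a guess at the kind of logic involved and is not Google's actual algorithm: the event names, weights and threshold are all invented for illustration.

```python
# Hypothetical event names and weights, chosen only for illustration.
INTENT_WEIGHTS = {
    "search": 1,
    "ad_click": 2,
    "product_view": 2,
    "booking_started": 5,
}
RETARGET_THRESHOLD = 8


def intent_score(events):
    """Sum the weights of the events seen for one topic in one session."""
    return sum(INTENT_WEIGHTS.get(event, 0) for event in events)


def should_retarget(events, converted=False):
    """Retarget visitors who showed strong intent but did not complete the conversion."""
    return not converted and intent_score(events) >= RETARGET_THRESHOLD
```

Under these made-up weights, my car rental session (search, ad click, product views and an abandoned booking) crosses the threshold, while my hotel session (search and ad click only) does not.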

Signature: Roopkumar T.V.