Showing posts with label Big Data Analytics. Show all posts

Monday, June 2, 2014

Real Time Alerts, monitoring and optimization needs of Digital Businesses

Digital businesses spend millions each day on online advertising and product promotions. Real time alerts are important for monitoring the performance of online ad campaigns and promotions, and for making adjustments in real time to optimize the return on spend.

In addition to alerts on the performance of online advertising, there are other specific needs for alerts, such as:
  • E-Commerce businesses want to know in real time how traffic and conversions for different product categories are performing.
  • Almost all digital businesses want to understand which traffic segments to target at a given point of time: which customers to target, and when?
  • Similarly, which offers should be provided at a given point of time.


Fortunately, new Analytics Architectures and Toolkits, such as those based on Apache Spark, Apache Storm and NoSQL datastores, provide real time alerting capabilities on the performance of specific KPIs and metrics. Large volumes of data in many varieties, including streaming data in motion, are processed on Apache Hadoop/YARN and similar platforms, using clusters of distributed computing nodes. Any anomalies in the data for specific metrics can be detected and reported in real time by sending alerts to business decision makers. Specific algorithms are written to run on these advanced Analytics platforms using programming languages like Java, Python or R. The real need is to develop packaged applications around these algorithms, which end users can install and run on their own client machines, connected over the cloud to these scalable, highly available, low latency, real time data processing platforms.
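As a rough illustration of the anomaly detection and alerting idea described above, here is a minimal Python sketch. It flags a metric value when it deviates strongly from the mean of a trailing window (a simple z-score rule). The function name, the window size, the threshold and the sample data are all assumptions made for illustration; a production system would run similar logic on a streaming platform rather than on an in-memory list.

```python
from statistics import mean, stdev


def detect_anomalies(values, window=5, threshold=3.0):
    """Return (index, value, z_score) for points far from the trailing-window mean.

    window and threshold are illustrative defaults, not recommended settings.
    """
    alerts = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            continue  # flat history: no spread to measure deviation against
        z = (values[i] - mu) / sigma
        if abs(z) >= threshold:
            alerts.append((i, values[i], round(z, 2)))
    return alerts


# Hypothetical hourly conversion counts with one sudden drop.
hourly_conversions = [100, 102, 98, 101, 99, 100, 103, 40, 101, 100]

for index, value, z in detect_anomalies(hourly_conversions):
    # In a real deployment this line would send an email, SMS or dashboard alert.
    print(f"ALERT: hour {index} conversions={value} (z={z})")
```

The trailing window keeps the rule cheap enough to evaluate on every new data point, which is what matters for alerting on data in motion.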


Signature: Roopkumar T.V.

Wednesday, May 21, 2014

Application of Machine Learning algorithms in Digital Marketing

Some of the digital marketing challenges that can be addressed through machine learning algorithms include: 

  • Helping digital media planners and buyers evaluate each media or ad position based on qualitative and quantitative research that goes beyond their own website or app. This expanded view may encourage media planners and buyers to focus on buying quality ad inventory, reducing the cost of untargeted or unqualified ad inventory.
  • Enabling digital marketers to anticipate, identify and qualify audiences at the point of entry, and to personalize content to maximize conversions or session outcomes. This makes content much more relevant and tailored to consumers. This personalization can also be extended across the multiple devices the audience uses.
  • Providing content feeds to digital users based on their interests, past interactions or other important factors. Social media platforms have some limited capacity to prioritize newsfeed content based on viewing, timing and likelihood to consume. Many top tier online experiences have trained consumers to expect a custom experience, so not prioritizing content will disappoint them.
  • Ranking visitors by value to the business. Not all traffic has equal value. Rapid website testing can be enabled, and personalization of page layouts, offers, messages, creative and content can be greatly improved, by algorithms that can assess the expected value of each visitor.
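The last point, ranking visitors by value, can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the expected value of a visitor is taken as conversion probability multiplied by expected order value, and both inputs are assumed to come from some upstream model. The class, field names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Visitor:
    visitor_id: str
    conversion_probability: float  # assumed output of an upstream model
    expected_order_value: float    # assumed average order value for the visitor's segment


def expected_value(visitor):
    """Expected revenue from one visit: probability of converting times order value."""
    return visitor.conversion_probability * visitor.expected_order_value


def rank_visitors(visitors):
    """Sort visitors from highest to lowest expected value."""
    return sorted(visitors, key=expected_value, reverse=True)


# Hypothetical visitors: a frequent low-spender, a likely buyer, a rare big-ticket buyer.
visitors = [
    Visitor("v1", 0.02, 50.0),
    Visitor("v2", 0.10, 30.0),
    Visitor("v3", 0.01, 400.0),
]

ranked = rank_visitors(visitors)
print([v.visitor_id for v in ranked])
```

A ranking like this could then decide which visitors see a premium offer or enter a website test first.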

Sunday, September 15, 2013

Delivering Targeted Ads and Communications to customers in real time using Big Data Algorithms.







Customers interact with various sales channels such as the website, call center, field sales force and mobile applications.

Each channel generates large volumes of data on customer interactions, in varieties such as:



  1. Web clickstream data which is semi-structured and contains data on visitors, segments, traffic sources, user navigation, abandons, bounces, conversions like purchases or leads, etc.
  2. Call center data, which is a mix of unstructured data such as customer feedback and structured data such as transactions.
  3. Sales force data, which again is a mix of unstructured data such as customer feedback and structured data such as transactions in the CRM.
  4. Mobile data, which is semi-structured in the form of weblogs, clickstream and app logs, and contains data on visitors, locations, user navigation, abandons, conversions, etc.

All these varieties of data, both at rest and in motion, are continuously stored in the HDFS big data landing zone of the Big Data Refinery, which is built using the Apache Hadoop architecture.

Targeted online Ads (Web and Mobile Web) and targeted marketing communications are delivered across the WWW for each customer’s profile.



  1. Customer profiles are built and stored in HDFS, based on all of a customer's interactions across the different channels.
  2. Algorithms written using the map-reduce programming method are executed by the Big Data Refinery to produce targeted online Ads (Web and Mobile Web) and targeted marketing communications in real time.
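To make the profile-building step more concrete, here is a small Python sketch written in the map-reduce style. It runs in memory rather than on a Hadoop cluster, and the event fields (customer_id, channel, action) as well as the function names are assumptions chosen for illustration, not the format of any real system.

```python
from collections import defaultdict


def map_event(event):
    """Map step: emit (customer_id, (channel, action)) for one interaction record."""
    return event["customer_id"], (event["channel"], event["action"])


def reduce_profile(customer_id, interactions):
    """Reduce step: fold all of one customer's interactions into a single profile."""
    profile = {"customer_id": customer_id, "channels": set(), "actions": defaultdict(int)}
    for channel, action in interactions:
        profile["channels"].add(channel)
        profile["actions"][action] += 1
    return profile


def build_profiles(events):
    """Group mapped values by key (the shuffle), then reduce each group."""
    grouped = defaultdict(list)
    for event in events:
        key, value = map_event(event)
        grouped[key].append(value)
    return {cid: reduce_profile(cid, values) for cid, values in grouped.items()}


# Hypothetical interaction records from different channels.
events = [
    {"customer_id": "c1", "channel": "web", "action": "view_product"},
    {"customer_id": "c1", "channel": "call_center", "action": "complaint"},
    {"customer_id": "c2", "channel": "mobile", "action": "purchase"},
    {"customer_id": "c1", "channel": "web", "action": "view_product"},
]

profiles = build_profiles(events)
print(sorted(profiles["c1"]["channels"]), dict(profiles["c1"]["actions"]))
```

On a real cluster the map and reduce functions would run in parallel across nodes, with the framework doing the grouping by customer.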



Targeted and Re-targeted content (Ads and marketing communications) are viewed and interacted with by customers while browsing across the WWW. 


Signature: Roopkumar T.V. 

Monday, September 9, 2013

Algorithm driven Online Ad Optimization

Those who have been doing SEM since it became popular last decade will understand the importance of Pay Per Click Bid Optimization: how SEM budgets and CPC bidding tactics had to be applied in near real time across hundreds of thousands of Keywords, to maintain aggressive CPAs even while trying to meet Conversion targets. During the initial years of SEM (2004-2010):
  • Competition was not tight 
  • CPC prices were lower and
  • Generating Leads was much easier.  
Now in 2013, all 3 of these points have changed:
  • Competition has grown in both breadth and depth.
  • CPC prices have increased, and have skyrocketed in some Industry Verticals.
  • Generating qualified Leads has become much more difficult.
So manual, intuitive bidding tactics and Off-the-shelf bidding software are no longer enough.
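The relationship between CPC bids and CPA targets mentioned above is simple arithmetic: since CPA equals CPC divided by the conversion rate, the highest CPC that still meets a target CPA is the target CPA multiplied by the conversion rate. The Python sketch below applies this per keyword. The keyword data, the target CPA and the function names are illustrative assumptions, and real bid optimization would also account for factors such as position and volume.

```python
def conversion_rate(clicks, conversions):
    """Fraction of clicks that converted; 0.0 when there were no clicks."""
    return conversions / clicks if clicks else 0.0


def max_cpc_bid(target_cpa, rate):
    """Highest CPC that keeps CPA at or below target, since CPA = CPC / conversion rate."""
    return target_cpa * rate


def recommend_bids(keyword_stats, target_cpa):
    """Return {keyword: recommended maximum CPC} from per-keyword click data."""
    return {
        keyword: round(max_cpc_bid(target_cpa, conversion_rate(clicks, conversions)), 2)
        for keyword, (clicks, conversions) in keyword_stats.items()
    }


# Hypothetical keyword performance: keyword -> (clicks, conversions).
keyword_stats = {
    "car rental sfo": (200, 10),
    "cheap car hire": (500, 5),
    "rental cars": (0, 0),
}

bids = recommend_bids(keyword_stats, target_cpa=20.0)
for keyword, bid in bids.items():
    print(f"{keyword}: max CPC {bid}")
```

Recomputing such bids continuously across hundreds of thousands of keywords is exactly the kind of work that needs algorithms rather than manual effort.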

Other forms of Online Marketing, like Online Display Ads and Email Marketing, have also seen huge improvements in Conversion Rates (%) over the years, due to the application of successful Ad Re-Targeting algorithms. Just a few years ago, SEM & SEO were the undisputed champions of Online Marketing. Now, however, Algorithm driven Re-Targeted Online Display Ads are increasingly eating away Conversions from SEM & SEO. Google itself is promoting the use of Re-Targeted Online Display Ads to its Adwords accounts, offering multiple Ad formats where a few years ago it offered mainly Search Ads.

New Online Channels like Twitter and Facebook have also become more popular, though they are yet to prove their impact in Online Advertising, as their Online Ad share is still minuscule compared to Google Search Ads, Display Ads or even Email Marketing.

To summarize the above points:
  1. Changing Economic Cycles
  2. Increasingly intense competition for generating qualified Leads or Customers Online
  3. Ever rising CPC prices in SEM
  4. Re-Birth of Online Display Ads and even Email Marketing as very reliable, high Conversion Online Channels (both had been written off a few years ago) and
  5. Finally, the proliferation of new Online Channels like Twitter and Facebook
have all increased the sophistication of Online Marketing. It is no longer efficient or effective to manage Online Ad Campaigns just by assigning some internal teams and external agencies to them.

Welcome to the world of Algorithm driven Online Ad Optimization. The need today is to develop Algorithms that can process massive volumes of Online Marketing content and data in real time, in order to allocate Online Ad budgets effectively across Online Channels and to identify the right mix of targeted Online Ads to deliver to prospects at the right time, so as to meet Conversion targets and maintain ever more aggressive CPAs.


Just as savvy Hedge Funds in the Financial Services sector use sophisticated Algorithms to allocate Clients' budgets optimally across stock or asset portfolios, the Online Ad sector needs to create sophisticated Algorithms to allocate Account budgets optimally across Online Channels or Ad portfolios. The problem is essentially the same.
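As a toy illustration of budget allocation across channels, the Python sketch below spends the budget in fixed steps, each time giving the next step to the channel with the highest marginal gain in conversions. It assumes, purely for illustration, that conversions in each channel grow with the square root of spend (diminishing returns) and that each channel has a known scale factor. Neither the response curve nor the numbers come from real campaign data.

```python
import math


def marginal_gain(scale, current_spend, step):
    """Extra conversions from spending one more step, under an assumed sqrt response curve."""
    return scale * (math.sqrt(current_spend + step) - math.sqrt(current_spend))


def allocate_budget(channel_scales, total_budget, step=100.0):
    """Greedy allocation: give each budget step to the channel with the best marginal gain."""
    allocation = {name: 0.0 for name in channel_scales}
    remaining = total_budget
    while remaining >= step:
        best = max(
            channel_scales,
            key=lambda name: marginal_gain(channel_scales[name], allocation[name], step),
        )
        allocation[best] += step
        remaining -= step
    return allocation


# Hypothetical effectiveness of each channel (higher means more conversions per unit spend).
channel_scales = {"search": 3.0, "display": 2.0, "email": 1.0}

allocation = allocate_budget(channel_scales, total_budget=1000.0)
for channel, spend in allocation.items():
    print(f"{channel}: {spend:.0f}")
```

Because returns diminish as spend grows, even the weakest channel can earn a share of the budget, which is the same intuition behind diversifying an asset portfolio.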

Signature: Roopkumar T.V.

Sunday, September 8, 2013

Google's Online Ad Re-Targeting based on Post-Search Behavior of Users

What is common to the 4 screenshots of my web sessions below?









  • In the 1st browser window, I am searching in YouTube for Euro Cup top 10 goals.
  • In the 2nd  window, I am watching a video on Euro Cup top 10 goals in YouTube.

Then I shut down my PC. An hour later I started it again, opened my web browser and started browsing some newspapers online.
  • In the 3rd window, I am reading an article in Eenadu, a very popular Telugu language newspaper, online.
  • In the 4th window, just out of curiosity, I open a popular Tamil language daily online, Dinamalar.com.

If we look at all these web browsing sessions carefully, something is very strange. Every website I had opened that evening was displaying a particular Ad on “San Francisco Car Rentals”.

And I was in Bangalore, India on that day.

What’s wrong here?
I am in Bangalore, so why should I get an online Display Ad on “San Francisco Car Rentals” on almost every website that I browsed or YouTube video that I watched on that day? Car rentals in San Francisco are definitely not relevant for a browser based in Bangalore, India, who is thousands of miles away.

Now I open DeccanHerald.com, an English language newspaper very popular among old timers of Bangalore, and we can see a similar Display Ad, “Car Hire SFO Airport”, even on this website.



Definitely Ads on “San Francisco Car Rentals” or “SFO Airport Car Hire” are not relevant in either of these cases:
  1. While watching a YouTube video on Euro Cup soccer top 10 goals. If Display Ads for bus timings in Paris or Amsterdam had been displayed with this video on European soccer, we could understand; but what is the online Ad of a faraway San Francisco, USA Car Rentals company doing on a YouTube video about Euro Cup soccer top 10 goals?
  2. While reading Indian newspapers online. India speaks many different languages, fine, but every newspaper, whether Telugu, Tamil or English, was displaying this Ad by the San Francisco, USA Car Rentals company. Again, this online Ad is not relevant on these newspaper websites.

So what’s wrong here? If YouTube alone were not doing a good job of matching Ad relevancy to its videos, we could understand; but how can many other websites in different languages, belonging to totally different media companies, also make the same mistake of showing the same irrelevant Ad?

Actually nothing is wrong here. Let me give a clue: all these different websites that I had visited, from YouTube to regional language news websites, were opened using the same web browser, Google Chrome.

Welcome to the home of Ad Re-Targeting on Google’s Online Advertising platform.  

Actually, early in the morning on that same day, I had searched for "San Francisco Car Rentals" on Google.com using the Google Chrome web browser, since I was planning a business trip to San Francisco 2 weeks later. I had clicked on a Google Search Ad of a Car Rentals company, browsed to its website, viewed different models of cars and, after selecting a model, decided to book it. Then, after clicking on “Book this car” on this Car Rentals website, I decided to abandon the transaction midway without actually booking, as I wanted to check out car hire deals on other Car Rental websites and also wanted to book closer to my trip date.

Google had tracked this entire activity of mine done early in the morning on that day, from the time I had searched for "San Francisco Car Rentals"  till I had abandoned the booking on that Car Rentals website, and everything in between. I am not sure, but Google might have even tracked the models of cars I had viewed on this Car Rentals website, the dates selected for car hire and even the car pickup location. I am sure about the last one, since later in that day I had seen an Ad on DeccanHerald.com on “Car Hire SFO Airport”. Therefore for Google’s Online Ad algorithms, this was a strong user intent towards a conversion, as I had tried to almost book a rental car before abandoning midway.

Hence these algorithms decided to Re-Target me with online Display Ads on “San Francisco Car Rentals” every time I browsed other websites or watched YouTube videos using the Google Chrome browser. If I used a different web browser the next day or the next week, I wouldn't see these Re-Targeted ads, since the cookies are stored in the browser. As long as I am using Google’s Chrome web browser, for the next few days and perhaps for a few weeks, I am going to see this SFO Car Rental Ad frequently in almost all my web browsing sessions, as I have become a well qualified lead for SFO car rentals.

I had even searched for "Hotels in San Francisco" on that same morning, but I did not try to book any hotel (my hotel booking has to be done through my Employer’s Contractor only) and only wanted to check the prices. For Google’s Online Ad algorithms, this was not a strong signal of intent towards a conversion, since I had not tried to book a hotel on the Hotel Booking website after clicking through the Google Search Ad. Therefore, in the case of San Francisco hotel booking, these algorithms decided not to Re-Target me with online Ads, as I am not a good prospect for SFO hotel booking.

Using Visitor Identification and Browse behavior (including the Keywords searched on Google.com and the specific intent expressed by users on the various websites visited post-search), Google’s powerful Online Ad algorithms can Re-Target Ads to Visitors through the cookies stored in their Google Chrome browsers. These algorithms typically execute on a Big Data Analytics Architecture, as discussed previously.
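The difference between my car rental session and my hotel session can be illustrated with a very simple intent score. To be clear, this is not Google's actual algorithm, which is not public; it is a hypothetical Python sketch in which each observed event adds an assumed weight, and a visitor is re-targeted only when the total crosses an assumed threshold without a completed conversion.

```python
# Hypothetical weights for post-search behavior; real systems would learn these from data.
INTENT_WEIGHTS = {
    "searched": 1,
    "clicked_ad": 2,
    "viewed_product": 2,
    "started_booking": 5,
}
RETARGET_THRESHOLD = 8  # hypothetical cut-off for "strong intent"


def intent_score(events):
    """Sum the weights of all observed events; unknown events count as zero."""
    return sum(INTENT_WEIGHTS.get(event, 0) for event in events)


def should_retarget(events, converted=False):
    """Re-target visitors who showed strong intent but did not complete the conversion."""
    return not converted and intent_score(events) >= RETARGET_THRESHOLD


# The two sessions described in this post, expressed as event lists.
car_rental_session = ["searched", "clicked_ad", "viewed_product", "started_booking"]
hotel_session = ["searched", "clicked_ad"]

print("car rental:", should_retarget(car_rental_session))
print("hotel:", should_retarget(hotel_session))
```

An abandoned booking carries the largest weight here because it is the closest a visitor gets to converting without actually doing so.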

Signature: Roopkumar T.V.

Tuesday, September 3, 2013

Big Data Analytics solutions for Online Marketing - Use Case 1

A sample Online Marketing application deployed in the Big Data Architecture, is shown below.



Online users search for products, services, topics of interest etc. not only in Google and other search engines but also, more importantly, on the site itself (for example, on the eCommerce site Amazon.com, search is the top product finding method used by site visitors). Providing relevant search results is something that online search providers like Google and Bing, and also site search providers, continuously optimize and calibrate.

From an Online Marketing perspective, once searchers click through the search results they arrive at the website (if coming through an external search engine like Google) or at the product or topic page they were searching for internally on the site. That page of arrival from a search result, called a landing page in Online Marketing terminology, is very important for:
  • Improving Conversion Rate (%) of the site.
  • Traffic dispersion to subsequent stages of the site.
  • Improving site engagement for the users 

As already discussed in a previous post, delivering dynamic and search relevant landing pages is very important, particularly for large websites like eCommerce stores, Music & Movie download sites, Travel websites etc. Delivering keyword or search relevant landing pages dynamically across thousands of keywords, perhaps hundreds of thousands for large websites, is itself a big challenge. An even bigger challenge is to deliver these dynamic, search relevant landing pages targeted to each of the different user segments. Luckily, Big Data Analytics solutions are now available to solve these Big Data challenges in Online Marketing.

Large websites generate and also need to process, huge volumes of different varieties of data as below:

  • Website clickstream data collected through Web Analytics applications like Omniture and from webserver logs.
  • The website content such as product content, marketing content, navigation etc. in various formats like text, images, videos etc. which is available in the web content management systems.
  • External web content typically collected by web crawlers, which includes content such as
    • Product content from competitor websites
    • Marketing collaterals from external industry websites etc.
  • User generated content such as product reviews, user survey feedback, social media posts, online discussions, tweets, blog posts, online comments, Wiki articles etc.

Most of the above varieties of data are unstructured or semi-structured, and hence are difficult to collect and process in traditional RDBMS databases like Oracle or MySQL.

For large websites, it is not just important to collect large volumes of the varieties of data shown above; it is also important to handle the velocity at which all this data is generated online, particularly clickstream data and user generated content.

This is where Big Data Analytics solutions come in. In the above example, a typical architecture to support Big Data Analytics is built on the open source Apache Hadoop framework. In a Hadoop architecture, big volumes, varieties and velocities of online data are collected and then stored in the HDFS file system. The Hadoop ecosystem also provides HBase, a column-oriented NoSQL database for table-style access to big data, and Hive, whose SQL-like query language is particularly useful for beginners and new users coming from traditional databases. As we can see in this example, a big data landing zone is set up on a Hadoop cluster to collect big data, which is then stored in HDFS.

Using the Map-Reduce programming method, Online Marketing Analysts or Big Data Scientists develop and deploy various algorithms on a Hadoop cluster for performing Big Data Analytics. These algorithms can be implemented in Java, the language in which Hadoop itself and its services for collecting, storing and analysing big data are written. Higher-level tools and languages such as Pig, Hive, Python or R can be used to implement the same algorithms in fewer lines of code. Pig and Hive scripts are translated into Map-Reduce jobs that run on the cluster's JVMs, while Python or R code typically runs through Hadoop Streaming, which passes records to external programs over standard input and output; none of them is converted into Java source code.

Some of the use cases of Online Marketing Algorithms which can be implemented on Hadoop Architecture for deriving Analytics are shown in the same example. All these algorithms are deployed using the Map-Reduce programming method.

  • Keyword Research: Counting the number of occurrences, in content and in searches, of hundreds of thousands of keywords across the diverse variety of data collected into Hadoop and stored in HDFS. This algorithm helps identify the top keywords by volume, and also the long tail of hundreds of thousands of keywords searched by users. Even new hidden gems among keywords can be discovered using this algorithm, for deployment in SEM/SEO campaigns.
  • Content Classifications / Themes: Classifying user generated content and web content into specific themes. Thanks to the huge processing capabilities of the Hadoop Architecture, huge volumes of content can be processed and classified into dozens of major themes and hundreds of sub themes.
  • User Segmentation: Individual user behavior available in web clickstream data is combined with online user generated content, and further combined with user targeted content available in web content management systems, to generate dozens of user segments, both major and minor. This algorithm then identifies the top keywords and the right content themes for each of these user segments, by combining the output of the Keyword Research and Content Classification algorithms.
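The Keyword Research use case above is essentially a counting job, which is the classic fit for the Map-Reduce method. Below is a minimal in-memory Python sketch of the idea: the map step emits a count per keyword per document, and the reduce step sums the counts. The documents, the keyword list and the function names are made up for illustration; on a Hadoop cluster the same two steps would run in parallel over files in HDFS.

```python
import re
from collections import Counter


def map_keywords(document, keywords):
    """Map step: yield (keyword, occurrences) for each keyword found in one document."""
    text = document.lower()
    for keyword in keywords:
        pattern = r"\b" + re.escape(keyword.lower()) + r"\b"
        count = len(re.findall(pattern, text))
        if count:
            yield keyword, count


def reduce_counts(pairs):
    """Reduce step: sum the per-document counts for each keyword."""
    totals = Counter()
    for keyword, count in pairs:
        totals[keyword] += count
    return totals


def count_keywords(documents, keywords):
    """Run map over every document, then reduce all emitted pairs."""
    pairs = (pair for doc in documents for pair in map_keywords(doc, keywords))
    return reduce_counts(pairs)


# Hypothetical content and search keywords.
documents = [
    "Cheap car rental in San Francisco. Car rental deals daily.",
    "San Francisco hotels and car rental offers",
]
keywords = ["car rental", "san francisco", "hotels"]

totals = count_keywords(documents, keywords)
for keyword, count in totals.most_common():
    print(f"{keyword}: {count}")
```

Sorting the totals gives the top keywords by volume, while the low-count entries form the long tail.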

Also, since the Hadoop Architecture runs on clusters of computers, all the above algorithms can not only process huge volumes and varieties of data, but can also handle data in motion, which keeps arriving in the Hadoop Big Data landing zone in near real time. This enables Online Marketing Campaigns to be tweaked in near real time to derive better ROIs from Online Marketing spends. In the example illustrated above, the output from the 3 algorithms running in parallel is a set of dynamic, Keyword Relevant, Content Rich, User Targeted Landing Pages generated in near real time, for hundreds of thousands of keywords, across dozens of content themes and dozens of user segments. This output would be integrated with eCommerce platforms, Web Content Management Systems or Web Portals for the creation, production and delivery of these Landing Pages in near real time.


Signature: Roopkumar T.V.

Sunday, August 18, 2013

Big Data Architectures are a Big Boon for Online Marketing

As discussed in the previous posts, Big Data Architectures are a big boon for Online Marketing, and provide us with the capabilities to develop innumerable applications for:

  1. Web UI Analytics or Web Analytics.
  2. Online Marketing Analytics and Optimization.
  3. Web UI Testing and Optimization.
  4. Web Visitor Segmentation 
  5. Customer Segmentation and Customer Analytics
  6. Sentiment Analysis
  7. etc.
However, the use cases provided above are just a sample list. Big Data Architectures help in developing applications which in turn provide benefits across industries ranging from Agriculture to Medicine & Healthcare to Defense & Intelligence to Internet eCommerce.

One important point is that Online Marketing was the earliest domain to benefit from Big Data Architectures. Online companies like Google, Yahoo, Facebook and Amazon were the original pioneers in using Big Data Architectures, and are also the biggest contributors of the Programming Methods (the Map-Reduce method, from Google), Frameworks (Hadoop, developed largely at Yahoo), Tools (Pig from Yahoo, Hive from Facebook) and even Infrastructure (Amazon Web Services) needed for developing applications on Big Data Architectures.

Provided below are small tutorials on using Big Data Architectures for applications in the Online Marketing and Web Analytics domains.

The original videos below are from HortonWorks, a pioneering start-up in Big Data Applications Development.







Signature: Roopkumar T.V.

Monday, August 12, 2013

Web UI Testing and Optimization

One important piece which was not discussed in the previous posts was Web UI Testing and Optimization. I had, however, mentioned the necessity of extensively executing UI optimization tests in a previous post.


When do we need to do UI optimization for the Online Marketing Campaigns? 
  1. To optimize page effectiveness of specific pages like Home page, Landing page, Sale page, Product Listings page etc.
  2. Optimize Lead Generation Conversion Funnel performance
  3. Optimize Cart-Checkout performance
  4. Optimize effectiveness and usage of Web Forms.
  5. Optimize eCommerce Purchase Funnels.
  6. Optimize Search Effectiveness on the site
  7. and many more
As seen, there are innumerable opportunities to optimize the performance of the Web UI through continuous user testing.


Why do we need to do UI optimization for the Online Marketing Campaigns? 

An optimized UI will benefit Online Marketing Campaigns on many dimensions like
  • Increased Conversion Rate(%) of Clicks to Leads, Purchases etc.
  • Increased traffic dispersion across the site. Improved Conversion(%) performance of Micro-conversions and mini-goals on the site.
  • Improved customer engagement with the Web channel.
  • Higher RPVs (Revenue per Visit) and higher RPUs(Revenue per Unique Visitor).
  • Increased Share of Customer Wallet. Increase in Average Order Size, Product Attach Rates etc.
  • Reduced Customer Churn Rate.
  • Overall higher ROI from Marketing Campaigns due to increased Conversions.


The methodology for UI optimization testing, involves both Qualitative and Quantitative Analytics. 




More details on each of the above methodologies for executing UI Optimization Testing will be discussed in future posts.



1 Big Data Mining for UI Optimization Testing involves generating insights on the user experience by data mining Call Center Transcripts, Email Messages, Social Media Messages etc. using Big Data Analytics platforms.

2 Design of Experiments for UI Optimization Testing is a Quantitative technique for optimizing the UI by executing A/B Tests or Multi-Variate Tests on 2 or more recipes of the UI.
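For the A/B Tests mentioned in the second note, a common way to decide whether one recipe really converts better than another is a two-proportion z-test. The Python sketch below is one possible way to do this and is not necessarily the method used in any particular testing tool; the visitor and conversion counts are invented for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of recipes A and B; return (z, two-sided p-value)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both recipes convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided p-value from the normal distribution: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical test: recipe A is the current page, recipe B is the new design.
z, p_value = two_proportion_z_test(conv_a=200, n_a=10000, conv_b=260, n_b=10000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("The difference in Conversion Rate is statistically significant at the 5% level.")
else:
    print("No significant difference detected; keep testing or collect more data.")
```

Multi-Variate Tests follow the same principle but compare more than two recipes, which requires correcting for multiple comparisons.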


Signature: Roopkumar T.V.

Saturday, August 10, 2013

Dynamic and Search Keyword Relevant Landing Pages

One of the biggest challenges faced by Large Dynamic websites like eCommerce Stores, Music or Movie download websites, Travel & Hospitality websites etc. is that they don't always deliver the most relevant landing pages to the Visitors who arrived at their website from Google and other search engines.

Recalling a much simpler earlier post on this topic, the disadvantages of not delivering the most relevant landing pages to users who arrived from Google or other search engines would include
  • Lost sales or lead generation opportunity
  • Lost opportunity to build engaging long term customer relationships or customer loyalty
  • Bad reputation and negative feedback, even negative reviews
  • Lost investments on the website
A landing page is the first touch point on the website for users coming from search engines. Users will rarely spend more than about 10 seconds on a website which has an irrelevant landing page. Capturing the user's interest in those first 10 seconds is very important, and this is only possible by consistently delivering the most relevant content to Visitors.

The large dynamically changing websites would be searched and found in search engines like Google, across thousands of keywords. The top searched keywords would keep changing for large dynamic websites each month, or perhaps even each week. Also there would be a long tail of thousands of keywords, in some cases hundreds of thousands of long tail keywords for large websites. Hence delivering Search Keyword Relevant Landing pages across thousands, perhaps hundreds of thousands of keywords is always a challenge for large dynamic websites.

The good news is that solutions are now available to help large dynamic websites always deliver Search Keyword Relevant Landing pages across hundreds of thousands of keywords. All these solutions leverage Big Data Analytics Platforms.

Big Data Analytics Solutions help organizations deliver Dynamic and Search Keyword Relevant Landing Pages for every search, across hundreds of thousands of keywords.

Big Data Analytics Platforms help Organizations discover potential keywords for SEM and SEO by:
  1. Scanning or crawling all their content found anywhere on the internet, discovering hundreds of thousands of potential keywords for which their content (websites, images, videos, mobile apps, social apps, pages, blogs, Facebook fan pages etc.) could be discovered in Google search. 
  2. Scanning or crawling the content of direct competitors to discover additional potential thousands of keywords.
  3. Scanning similar content across other websites, blogs, social media, mobile or social apps, images, videos etc. to discover the long tail of potentially hundreds of thousands of keywords.


By integrating these Big Data Analytics Platforms with their Web Portals, E-Commerce Platforms, Content Management Systems, Business Process Management systems etc., Organizations can deliver Dynamic and Search Keyword Relevant Landing Pages for every search, across hundreds of thousands of keywords. This methodology will be discussed in more detail in future posts.
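As a very simplified picture of what "Search Keyword Relevant" means in practice, the Python sketch below picks, for an incoming search keyword, the landing page whose content shares the most words with it (Jaccard similarity). The page URLs, the page content and the fallback to the home page are assumptions for illustration; a real solution would use much richer relevance signals and run on a Big Data Analytics Platform.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity: shared words divided by all distinct words."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_landing_page(query, pages, default_url="/"):
    """Return the URL whose content best matches the query, or default_url if none match."""
    query_tokens = tokenize(query)
    best_url, best_score = default_url, 0.0
    for url, content in pages.items():
        score = jaccard(query_tokens, tokenize(content))
        if score > best_score:
            best_url, best_score = url, score
    return best_url


# Hypothetical landing pages and a short description of their content.
pages = {
    "/cars/sfo": "car rental san francisco airport sfo",
    "/hotels/sf": "hotels in san francisco downtown",
}

print(best_landing_page("san francisco car rentals", pages))
print(best_landing_page("euro cup goals", pages))
```

The second query shares no words with any page, so the sketch falls back to the default page instead of forcing an irrelevant match.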

Signature: Roopkumar T.V.