A sample Online Marketing
application, deployed on a Big Data architecture, is shown
below.
Online users search for products,
services, topics of interest etc. not only on Google and other search engines
but also, more importantly, on the site itself (for example, on the eCommerce
site Amazon.com, search is the top product-finding method used by site visitors).
Providing relevant search results is something that search providers like Google
and Bing, as well as site search providers, continuously optimize and
calibrate.
From an Online Marketing perspective,
searchers who click through a search result either arrive at the website (if
coming through an external search engine like Google) or arrive at the product
or topic page they were searching for internally on the site. That page of
arrival, called the landing page in Online Marketing terminology, is very
important for:
- Improving Conversion Rate (%) of the site.
- Traffic dispersion to subsequent stages of the site.
- Improving site engagement for the users.
As already discussed in a previous post, delivering dynamic, search-relevant landing pages is very important,
particularly for large websites like eCommerce stores, music and movie
download sites, travel websites etc. Delivering keyword- or search-relevant
landing pages dynamically across thousands of keywords, perhaps hundreds of
thousands for large websites, is a big challenge in itself; an even bigger
challenge is to deliver these dynamic, search-relevant landing pages targeted
to each of the different user segments. Fortunately, as discussed previously,
Big Data Analytics solutions are now available to solve these Big Data
challenges in Online Marketing.
Large websites generate, and also
need to process, huge volumes of different varieties of data, as listed below:
- Website clickstream data collected through Web Analytics applications like Omniture and from webserver logs.
- The website content such as product content, marketing content, navigation etc. in various formats like text, images, videos etc. which is available in the web content management systems.
- External web content, typically collected by web crawlers, which includes content such as:
  - Product content from competitor websites
  - Marketing collateral from external industry websites etc.
- User generated content such as product reviews, user survey feedback, social media posts, online discussions, tweets, blog posts, online comments, Wiki articles etc.
Most of the above varieties of
data are unstructured or semi-structured, and hence are difficult to collect
and process in traditional RDBMS databases like Oracle or MySQL, which expect
data to fit a fixed schema.
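
Webserver logs are a typical example of semi-structured data: each line follows a loose pattern, but the interesting part (what the visitor searched for) is buried inside the URL. The sketch below parses one log line into a record. The log line itself, the Common Log Format layout and the `q` query parameter for site search are all assumptions made for illustration, not details taken from the example application.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed layout: Common Log Format (host, identity, user, time, request,
# status, size). Real webserver logs may use a different layout.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_log_line(line):
    """Turn one raw log line into a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # Assumption: the site search box sends its query in a "q" parameter.
    query = parse_qs(urlparse(record["url"]).query)
    record["search_terms"] = query.get("q", [""])[0].lower().split()
    return record


# Hypothetical log line, made up for this sketch.
sample = ('203.0.113.7 - - [10/Oct/2013:13:55:36 +0000] '
          '"GET /search?q=Running+Shoes HTTP/1.1" 200 2326')
record = parse_log_line(sample)
```

Lines that do not match the pattern come back as `None`, so malformed input can be counted or skipped rather than stopping the whole load.
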
For large websites, it is
important not just to collect large volumes of the varieties of data shown
above, but also to handle the velocity at which all this data is being
generated online, particularly clickstream data and user generated content.
This is where Big Data Analytics solutions
come in. In the above example, a typical architecture to support Big Data
Analytics is built using the open source Apache Hadoop framework. In a Hadoop
architecture, the big volumes, variety and velocity of online data are
collected and then stored in the HDFS file system. The Hadoop ecosystem also
provides databases such as HBase, a column-oriented NoSQL store that runs on
top of HDFS. HBase is not a relational database, but its table-style access to
big data can be a more familiar starting point for beginners and new users of
these Big Data architectures. As we can see in this example, a
big data landing zone is set up on a Hadoop cluster to collect big data, which
is then stored in the HDFS file system.
Using the Map-Reduce programming
method, Online Marketing Analysts or Big Data Scientists develop
and deploy various algorithms on a Hadoop cluster for performing Big Data
Analytics. These algorithms can be implemented in Java, the language in which
Hadoop itself is written and in which its services for collecting, storing and
analysing big data run. Higher-level languages like Pig and Hive, or
general-purpose languages like Python or R, can be used to implement the same
algorithms with fewer lines of code. Pig and Hive scripts are translated into
Map-Reduce jobs that run on the Java Virtual Machine, while Python or R code
typically runs through Hadoop Streaming, which passes data to external programs
over standard input and output.
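
To make the Map-Reduce programming method more concrete, here is a minimal single-machine sketch of its three steps: map each record to key/value pairs, group the values by key (the shuffle), and reduce each group to a result. It is not Hadoop code; on a real cluster the framework distributes these steps across machines. The function names and the clickstream records are hypothetical.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and stream out (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle_phase(pairs):
    """Group all values that share a key, as the framework does between steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def run_job(records, mapper, reducer):
    """Run map, shuffle and reduce in sequence on one machine."""
    return reduce_phase(shuffle_phase(map_phase(records, mapper)), reducer)


# Example job: count visits per landing page from clickstream records.
def page_view_mapper(click):
    yield click["page"], 1


def sum_reducer(key, values):
    return sum(values)


# Hypothetical clickstream records, made up for this sketch.
clicks = [
    {"user": "u1", "page": "/shoes"},
    {"user": "u2", "page": "/shoes"},
    {"user": "u1", "page": "/boots"},
]
views_per_page = run_job(clicks, page_view_mapper, sum_reducer)
```

Because the mapper looks at one record at a time and the reducer at one key at a time, the same two functions could in principle be run on many machines in parallel, which is the property Hadoop relies on.
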
Some of the use cases of Online
Marketing algorithms which can be implemented on a Hadoop architecture for
deriving analytics are shown in the same example. All these algorithms are deployed
using the Map-Reduce programming method.
- Keyword Research: Counting the number of occurrences, in content and in searches, of hundreds of thousands of keywords across the diverse variety of data collected into Hadoop and stored in HDFS. This algorithm helps identify the top keywords by volume, and also the long tail of hundreds of thousands of keywords searched by users. It can even uncover new hidden gems among keywords to deploy in SEM/SEO campaigns.
- Content Classifications / Themes: Classify the user generated content and also the web content into specific themes. Thanks to the large processing capacity of a Hadoop architecture, huge volumes of content can be processed and classified into dozens of major themes and hundreds of sub-themes.
- User Segmentation: Individual user behavior available in web clickstream data is combined with online user generated content, and further combined with user targeted content available in web content management systems, to generate dozens of user segments, both major and minor. This algorithm then identifies the top keywords and the right content themes to target for each of these user segments, by combining the output of the Keyword Research and Content Classifications algorithms.
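
The Keyword Research use case, and the per-segment keyword ranking that User Segmentation builds on it, can be sketched on a single machine as follows. This is an in-memory stand-in for what would be a Map-Reduce job on a cluster. It treats every one-word and two-word phrase as a candidate keyword, and it assumes each search has already been labelled with a user segment; both choices, along with all names and sample data, are assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict


def ngrams(text, max_len=2):
    """Yield every phrase of 1..max_len consecutive words as a keyword."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def count_keywords(texts, max_len=2):
    """Count keyword occurrences across a collection of texts or searches."""
    counts = Counter()
    for text in texts:
        counts.update(ngrams(text, max_len))
    return counts


def split_head_and_tail(counts, head_size):
    """Separate the top keywords by volume from the long tail."""
    ranked = counts.most_common()
    return ranked[:head_size], ranked[head_size:]


def top_keywords_by_segment(searches, top_n=1):
    """Rank keywords within each segment; searches are (segment, query) pairs."""
    per_segment = defaultdict(Counter)
    for segment, query in searches:
        per_segment[segment].update(ngrams(query))
    return {
        segment: [kw for kw, _ in counts.most_common(top_n)]
        for segment, counts in per_segment.items()
    }


# Hypothetical site searches, made up for this sketch.
texts = ["red running shoes", "running shoes sale", "trail running shoes"]
counts = count_keywords(texts)
head, tail = split_head_and_tail(counts, head_size=3)

# Hypothetical segment labels; in practice these would come from the
# User Segmentation algorithm.
searches = [
    ("bargain_hunters", "shoes sale"),
    ("bargain_hunters", "boots sale"),
    ("runners", "running shoes"),
]
segment_keywords = top_keywords_by_segment(searches)
```

The `tail` list is where rarely searched phrases end up, which is the place to look for the "hidden gem" keywords mentioned above.
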
Also, since the Hadoop
architecture runs on clusters of computers, all the above algorithms can
not only process huge volumes and varieties of data, but can also handle
data in motion, which keeps arriving in the Hadoop Big Data landing zone in near
real time. This would enable Online Marketing campaigns to be tweaked in
near real time to derive a better ROI from Online Marketing spend.
In the example illustrated above, the output
of the 3 algorithms running in parallel is a set of dynamic, keyword relevant,
content rich, user targeted landing pages generated in near real time for
hundreds of thousands of keywords, across dozens of content themes and dozens
of user segments. This output would be integrated with eCommerce
platforms, Web Content Management Systems or Web Portals for the creation,
production and delivery of these landing
pages in near real time.
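
One way to picture how the three outputs come together at delivery time is a lookup keyed by content theme and user segment, with fallbacks when no targeted page exists. The sketch below is only an illustration under that assumption; the data structures, names and URLs are hypothetical and do not describe how any particular eCommerce platform or Web Content Management System integrates such output.

```python
def select_landing_page(keyword, segment, keyword_themes, pages, default_page="/"):
    """Pick the most specific landing page available for a keyword and segment.

    keyword_themes maps a keyword to its content theme (Keyword Research plus
    Content Classifications output); pages maps (theme, segment) to a URL,
    with (theme, None) as the theme-wide fallback.
    """
    theme = keyword_themes.get(keyword)
    if theme is None:
        return default_page
    for key in ((theme, segment), (theme, None)):
        if key in pages:
            return pages[key]
    return default_page


# Hypothetical outputs of the three algorithms, made up for this sketch.
keyword_themes = {"running shoes": "footwear", "trail shoes": "footwear"}
pages = {
    ("footwear", "bargain_hunters"): "/footwear/deals",
    ("footwear", None): "/footwear",
}

targeted = select_landing_page("running shoes", "bargain_hunters", keyword_themes, pages)
generic = select_landing_page("running shoes", "runners", keyword_themes, pages)
unknown = select_landing_page("garden hose", "runners", keyword_themes, pages)
```

Falling back from segment-specific to theme-wide to the default page means every search still gets a landing page, even for keywords or segments the algorithms have not covered yet.
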
Signature: Roopkumar T.V.