Proof of Concept (POC) to analyze the huge application log files using HADOOP Cluster on IBM Cloud Platform
Analyzing application log files generated in a production environment is very challenging. Log data is unstructured, so it cannot be stored in an RDBMS and queried until it has been converted to a structured (row, column) format. As a result, finding a specific piece of information in a large log file, possibly hundreds of terabytes in size, is nearly impossible when troubleshooting an application that misbehaves for only a very short duration.
In our case study, we found that an E-Commerce application running on the Oracle Web Commerce platform (ATG) intermittently failed to establish asynchronous communication with a third-party vendor for order fulfillment. JMS messaging was responsible for delivering the order submission message from ATG to the third-party vendor and vice versa, but it sometimes failed to do so. Using a Hadoop cluster with a customized Map-Reduce program, we extracted the exact warnings and errors recorded in the log files produced by the out-of-the-box ATG components. After a detailed analysis of the framework components, based on the reports produced by Hadoop, we concluded that the issue lay within the ATG framework itself. We communicated this to the software vendor and subsequently received a patch from them.
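The extraction step described above can be illustrated with a minimal sketch of the Map-Reduce pattern in plain Python. It is not the program used in the case study: the log layout ("<date> <time> <LEVEL> <component> <message>"), the component names, and the function names map_line, reduce_counts and run_job are all assumptions made for illustration, and the real ATG log format may differ. On a cluster, the same map and reduce logic would run distributed across nodes rather than in a single process.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Assumed (hypothetical) log layout, one record per line:
#   "<date> <time> <LEVEL> <component> <message...>"
LEVELS_OF_INTEREST = {"WARNING", "ERROR"}

Key = Tuple[str, str]  # (level, component)


def map_line(line: str) -> Iterator[Tuple[Key, int]]:
    """Map phase: emit ((level, component), 1) for each warning or error line."""
    parts = line.split(maxsplit=4)
    if len(parts) < 4:
        return  # skip lines that do not match the assumed layout
    level, component = parts[2].upper(), parts[3]
    if level in LEVELS_OF_INTEREST:
        yield (level, component), 1


def reduce_counts(pairs: Iterable[Tuple[Key, int]]) -> Dict[Key, int]:
    """Reduce phase: sum the emitted counts for each (level, component) key."""
    totals: Dict[Key, int] = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def run_job(lines: Iterable[str]) -> Dict[Key, int]:
    """Run the map phase over every line, then reduce the emitted pairs."""
    pairs = (pair for line in lines for pair in map_line(line))
    return reduce_counts(pairs)


# Made-up sample records; the component paths are hypothetical.
SAMPLE_LOG = [
    "2020-01-01 10:00:01 INFO /app/OrderManager order 1 submitted",
    "2020-01-01 10:00:02 ERROR /app/MessageSender delivery failed",
    "2020-01-01 10:00:03 WARNING /app/MessageSender retrying delivery",
    "2020-01-01 10:00:04 ERROR /app/MessageSender delivery failed",
    "malformed line",
]

if __name__ == "__main__":
    for (level, component), count in sorted(run_job(SAMPLE_LOG).items()):
        print(level, component, count)
```

Grouping by both level and component is a design choice for this sketch: it shows at a glance which component produces the most errors, which is the kind of evidence that pointed to the framework in the case study.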
Facebook data extraction using R & data processing in Data Lake
Without detailed visibility into social media data, it is extremely tough for brands to decide on their products and marketing strategy. Almost 94 per cent of buying decisions are influenced by the rapidly growing user participation on social media (mainly on Facebook). Facebook plays a critical role in increasing brand awareness. Businesses need to transform digitally native customers into brand advocates, and this can only be done if the relationship has been nurtured. The key for brands is to encourage consumers to endorse the brand and play a real part within the business.
Facebook data mining is becoming a major factor in making accurate business decisions, through analysis of user posts, comments, likes, shares, etc., as well as sentiment analysis on the business page. To analyse this data, we need a proper mechanism to extract it from the business page. This E-Book describes in detail how to perform Facebook data mining by making effective use of R, a programming language for statistical analysis, and the Hadoop Distributed File System (HDFS) with an ELT (Extraction, Loading and Transformation) approach.