Architecture

A few intrinsics of Apache ZooKeeper and their importance

From a bird's-eye view, Apache ZooKeeper provides coordination services for managing distributed applications. It is responsible for providing configuration information, naming, synchronization, and group services over large clusters in distributed systems. For example, Apache Kafka uses ZooKeeper to choose the leader node for its topic partitions. Please click here if you want to read about how to set up a multi-node Apache ZooKeeper cluster on Ubuntu/Linux. zNodes: The key concept of ZooKeeper is the znode, which can be acted...

Read more...

Why Kappa Architecture for processing streaming data? Can it supersede Lambda Architecture?

Data is quickly becoming the new currency of the digital economy, but it is useless if it can't be processed. Processing data is essential for subsequent decision-making or executable actions, whether by the human brain or by various devices and applications. There are two primary ways of processing data: batch processing and stream processing. Typically, batch processing is adopted for very large data sets and for projects that need deeper data analysis, on the...

Read more...

Processing and Analysis of Big Telecom Data to minimize crime and combat terrorism, antisocial activities, etc.

Telecom providers have a treasure trove of captive data, such as customer data, CDRs (call detail records), call center interactions, and tower logs, and are metaphorically "sitting on a gold mine". Ideally, each category of generated data holds the following information.
⦁ Customer data consolidates customer ID, plan details, demographics, subscribed services, and spending patterns
⦁ Service data consolidates customer types, customer history, complaint categories, resolved queries, etc.
⦁ Usually for the smart mobile phone subscriber,...

Read more...

Why Lambda Architecture in Big Data Processing

Due to the exponential growth of digitization, the world creates at least 2.5 quintillion (2,500,000,000,000 million) bytes of data every day, which we can call Big Data. Data is generated everywhere: social media sites, various sensors, satellites, purchase transactions, mobile devices, GPS signals, and much more. With the advancement of technology, data generation shows no sign of slowing down; instead, it will keep growing in massive volume. All the major organizations, retailers,...

Read more...

Fog Computing

Fog computing is also referred to as edge computing. Cisco Systems introduced the term "Fog Computing", and it is not a replacement for cloud computing. Cloud computing refers to storing and accessing data and programs over the Internet instead of on a local computer's hard drive or storage. The cloud is simply a metaphor for the Internet. In fog computing, data, processing, and applications are concentrated in devices at the network edge. Here devices communicate peer-to-peer so that data storage and share...

Read more...

Essentiality of Data Wrangling

To roll out a new software product commercially, in any market domain, a 360-degree quality check with test data is mandatory. We can compare this to a new vehicle. After the vehicle is manufactured, fuel has to be injected into the engine to make it operational. Once the vehicle starts moving, all the quality checks and tests begin, such as brake performance, mileage, and comfort, along with thousands of other factors which are decided/concluded during...

Read more...

Performance of Hadoop Map-Reduce

The performance of a Hadoop Map-Reduce job can be increased appreciably without investing more in hardware, simply by tuning some parameters according to the cluster specifications, input data size, and processing complexity. Here are a few general tips to improve Map-Reduce job performance:
- Always use compression when writing intermediate data (mapper output) to disk before shuffling.
- Include a combiner in the appropriate position.
- The LongWritable data type is the wrong choice for output when the output values fit in the Integer range. IntWritable...

Read more...
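
To make the combiner tip from the Hadoop Map-Reduce excerpt concrete, here is a minimal sketch in plain Python. It does not use the Hadoop API; the function names (`map_words`, `combine`, `reduce_counts`, `run_job`) and the sample input are hypothetical and chosen only for illustration. It assumes a simple word count job and counts how many records would cross the shuffle with and without a combiner.

```python
from collections import Counter, defaultdict


def map_words(split):
    """Mapper: emit one (word, 1) pair per word in the input split."""
    return [(word, 1) for word in split.split()]


def combine(pairs):
    """Combiner: pre-aggregate a single mapper's output before the shuffle."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def reduce_counts(shuffled):
    """Reducer: sum all values that arrive for each key."""
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals)


def run_job(splits, use_combiner):
    """Run the toy job; return (result, number of shuffled records)."""
    shuffled = []
    for split in splits:
        pairs = map_words(split)
        if use_combiner:
            pairs = combine(pairs)
        shuffled.extend(pairs)
    return reduce_counts(shuffled), len(shuffled)


# Two hypothetical input splits, one per mapper.
splits = ["a b a a", "b b a c"]
plain_result, plain_records = run_job(splits, use_combiner=False)
combined_result, combined_records = run_job(splits, use_combiner=True)

# Same answer either way; the combiner only shrinks the shuffled data.
print(plain_result == combined_result)
print(plain_records, combined_records)
```

The result is identical in both runs because summation is associative and commutative, which is the condition under which a combiner is safe to apply. In a real job, the saving depends on how often keys repeat within each mapper's output.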