Tag - HDFS

Using Kafka to Manage Large Messages

Architecture to leverage Apache Kafka for sharing large messages (GB scale)

In today's data-driven world, the ability to transport and circulate large amounts of data, especially video files, in real time is crucial for news media companies. For example, suppose an incident occurs at some location and a news reporter promptly films the entire situation; the complete video must then be distributed for broadcast across the company's studios in geographically distant locations. To build a comprehensive solution for this problem statement, we can utilize Apache Kafka in conjunction with...
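
The excerpt stops short of the full design, so here is only a hedged sketch of one common ingredient: a producer that splits a large file into ordered Kafka records. The broker address, topic name, file path, and 10 MB request cap are illustrative assumptions, not the article's actual choices.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class VideoChunkProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-1:9092"); // assumed broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
        // The default per-request cap is ~1 MB; raising it here must be matched by
        // message.max.bytes on the broker and max.message.bytes on the topic.
        props.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, 10 * 1024 * 1024);
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");

        Path video = Paths.get("/tmp/incident.mp4"); // hypothetical video file
        byte[] buf = new byte[1024 * 1024];          // 1 MB chunks
        try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props);
             InputStream in = Files.newInputStream(video)) {
            int read;
            while ((read = in.read(buf)) != -1) {
                // Same key for every chunk -> same partition -> chunks stay in
                // order, so a consumer can reassemble the file on the other side.
                producer.send(new ProducerRecord<>("video-chunks",
                        video.getFileName().toString(), Arrays.copyOf(buf, read)));
            }
        }
    }
}
```

For payloads far beyond what brokers comfortably handle, a common alternative is to send only a reference through Kafka and keep the bytes in external storage.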

Read more...

Why Kappa Architecture for processing streaming data? Is it competent to supersede Lambda Architecture?

Data is quickly becoming the new currency of the digital economy, but it is useless if it can’t be processed. Processing data is essential for subsequent decision-making or executable actions, whether by the human brain or by various devices and applications. There are two primary ways of processing data: batch processing and stream processing. Typically, batch processing has been adopted for very large data sets and projects where there is a necessity for deeper data analysis, on the...
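
As a minimal sketch of the Kappa idea, under assumed topic names and a stand-in transformation: one Kafka Streams pipeline serves both live and historical processing, because "reprocessing" just means rewinding the replayable log and running the same code again, instead of maintaining Lambda's separate batch layer.

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class KappaPipeline {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "kappa-pipeline");   // assumed app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-1:9092"); // assumed broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        // Reading the log from the earliest offset replays all history through the
        // same code path -- Kappa's substitute for Lambda's separate batch layer.
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, String>stream("raw-events")   // assumed input topic
               .mapValues(v -> v.toUpperCase())        // stand-in transformation
               .to("processed-events");                // assumed output topic
        new KafkaStreams(builder.build(), props).start();
    }
}
```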

Read more...

Data Governance & Security Mechanism in Distributed Data Storage System

We are aware that the traditional data storage mechanism is incapable of holding the massive volume of data now generated at lightning speed, even if we perform vertical scaling, and that DATA is the one fuel expected to accelerate rapid growth across all sectors, from business and natural resources to medicine. But the question is how to persist this massive volume of data for processing. The answer is storing the data...

Read more...

Basic Understanding Of Stateful Data Streaming Supported By Apache Flink

Technologies in the Big Data processing space are maturing to execute streaming data efficiently, and streaming has become a major focus for making instant business decisions, especially in the telecom and retail sectors. Data collected continuously from sensors fitted to heavy industrial equipment, clickstreams from an e-commerce application’s navigation, and so on can be considered streaming data sources. By leveraging a streaming application, we can process/analyze this continuous flow of data without...
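
A minimal sketch of what "stateful" means in Flink's DataStream API, assuming a simple per-key running sum over made-up sensor readings (Flink 1.x; all names are illustrative):

```java
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

public class StatefulSum {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements(Tuple2.of("sensor-1", 5L), Tuple2.of("sensor-2", 3L), Tuple2.of("sensor-1", 7L))
           .keyBy(t -> t.f0)          // partition the stream by sensor id
           .flatMap(new RunningSum()) // stateful operator below
           .print();
        env.execute("stateful-sum");
    }

    // Keeps one Long of state per key; Flink scopes the ValueState to the current key.
    static class RunningSum extends RichFlatMapFunction<Tuple2<String, Long>, Tuple2<String, Long>> {
        private transient ValueState<Long> sum;

        @Override
        public void open(Configuration cfg) {
            sum = getRuntimeContext().getState(new ValueStateDescriptor<>("sum", Long.class));
        }

        @Override
        public void flatMap(Tuple2<String, Long> in, Collector<Tuple2<String, Long>> out) throws Exception {
            long updated = (sum.value() == null ? 0L : sum.value()) + in.f1;
            sum.update(updated);
            out.collect(Tuple2.of(in.f0, updated)); // emit the running total per sensor
        }
    }
}
```

With checkpointing enabled, Flink snapshots this per-key state, so the running sum survives failures; that durability is what separates stateful from stateless streaming.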

Read more...

Steering the number of mappers (MapReduce) in Sqoop for parallel data ingestion into the Hadoop Distributed File System (HDFS)

To import data from most data sources, such as an RDBMS, Sqoop internally uses mappers. Before delegating the responsibility to the mappers, Sqoop performs a few initial operations in sequence once we execute the command on a terminal on any node in the Hadoop cluster. Ideally, in a production environment, Sqoop is installed on a separate node, and the .bashrc file is updated to append Sqoop's binary and configuration paths, which allows Sqoop commands to be executed from anywhere in the multi-node cluster. Most of the...
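
To make the mapper steering concrete, below is a hedged sketch that drives Sqoop 1.x from Java via its Sqoop.runTool entry point (the JDBC URL, credentials, and table are placeholders): --num-mappers sets how many parallel map tasks run, and --split-by names the column whose value range is divided among them.

```java
import org.apache.sqoop.Sqoop;

public class ParallelImport {
    public static void main(String[] args) {
        String[] sqoopArgs = {
            "import",
            "--connect", "jdbc:mysql://db-host:3306/sales", // placeholder source database
            "--username", "etl_user",
            "--password-file", "/user/etl/.pwd",            // placeholder credentials
            "--table", "orders",
            "--split-by", "order_id", // column whose min/max range is divided among mappers
            "--num-mappers", "8",     // 8 parallel map tasks instead of the default 4
            "--target-dir", "/data/raw/orders"
        };
        System.exit(Sqoop.runTool(sqoopArgs));
    }
}
```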

Read more...

Transfer structured data from Oracle to Hadoop storage system

Using Apache Sqoop, we can transfer structured data from a Relational Database Management System (RDBMS) to the Hadoop Distributed File System (HDFS). Because of HDFS's distributed storage mechanism, we can store data of any format in huge volumes. In an RDBMS, data persists in row-and-column format (known as structured data). To process huge volumes of enterprise data, we can leverage HDFS as a basic data lake. In this...
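
For the Oracle case specifically, essentially only the connect string changes; here is a short sketch with an assumed host, service name, and table (same hypothetical Sqoop.runTool approach as above):

```java
String[] sqoopArgs = {
    "import",
    "--connect", "jdbc:oracle:thin:@//ora-host:1521/ORCLPDB1", // assumed host/service
    "--username", "scott",
    "--password-file", "/user/etl/.ora_pwd",
    "--table", "EMPLOYEES",
    "--target-dir", "/datalake/raw/employees", // imported rows land as files under this HDFS dir
    "--as-textfile"
};
int exit = org.apache.sqoop.Sqoop.runTool(sqoopArgs);
```

Note that Sqoop needs the Oracle JDBC driver (the ojdbc jar) in its lib directory for this connect string to work.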

Read more...

Data Ingestion phase for migrating enterprise data into Hadoop Data Lake

Big Data solutions help us derive valuable information to iron out accurate strategic business decisions. The exponential growth of digitalization, social media, telecommunications, etc. is fueling enormous data generation everywhere. Before processing huge volumes of data, we need an efficient, distributed data storage mechanism that can hold any form of data, from structured to unstructured. The Hadoop Distributed File System (HDFS) can be leveraged efficiently as a data lake by installing it on a multi-node cluster....
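
As a minimal sketch of the ingestion step itself, the HDFS FileSystem API can land a locally staged file in the lake; the NameNode address and paths below are purely illustrative assumptions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LandFile {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode-host:8020"); // assumed NameNode endpoint
        try (FileSystem fs = FileSystem.get(conf)) {
            // Copy a locally staged extract into the raw zone of the data lake
            fs.copyFromLocalFile(new Path("/staging/orders.csv"),
                                 new Path("/datalake/raw/orders.csv"));
        }
    }
}
```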

Read more...