Data Engineering

Data Engineering Specialist

A person who passes the Data Engineering Specialist examination will have at least the following competencies:

  1. Data Warehouse Concepts
    This module concentrates on data warehouse deliverables independent of any specific method, but within the framework of best practices. It focuses on understanding the deliverables that may be produced throughout the data warehouse process and the reasons for producing them. The module closes by exploring practical next steps that students can take: steps to further develop knowledge and skills, to position oneself for success, and to get started with data warehousing.
  2. Introduction to SQL/PL-SQL
    This competency includes storing, retrieving, updating, and displaying data using Structured Query Language (SQL), and integrating SQL into stored procedures, functions, packages, and triggers (PL/SQL programming). PL/SQL is designed specifically to process SQL commands. Students will learn how it works and why it’s secure, robust, and portable.
  3. Data Integration
    The volume, variety, and velocity of data are increasing rapidly. Organizations need fast and easy-to-use tools to harness data for actionable insight. One of the biggest challenges facing organizations today is the requirement to provide a consistent, single version of the truth across all sources of information in an analytics-ready format. With powerful data extract, transform, and load (ETL) capabilities, an intuitive and rich graphical design environment, and an open, standards-based architecture, Pentaho Data Integration is increasingly chosen over proprietary and homegrown data integration tools. The tools used in this training are Pentaho and Talend.

Data Engineering Qualified

A person who passes the Data Engineering Qualified examination will have at least the following competencies:

  1. Big Data Essential
    This course is designed for data analysts, business intelligence specialists, developers, system architects, and database administrators. Knowledge of SQL is assumed, as is basic Linux command-line familiarity.
  2. Hadoop Administration
    Hadoop enables the distributed storage and processing of large datasets across clusters of computers. Learning how to administer Hadoop is crucial to exploiting its unique features. In this course, you will learn to overcome common problems encountered in Hadoop administration. The course begins by laying the foundation, showing you the steps needed to set up a Hadoop cluster and its various nodes. You will get a better understanding of how to maintain a Hadoop cluster, especially at the HDFS layer and when using YARN and MapReduce. Further on, you will explore durability and high availability of a Hadoop cluster. You’ll get a better understanding of the schedulers in Hadoop and how to configure and use them for your tasks. You will also get hands-on experience with the backup and recovery options and the performance tuning aspects of Hadoop. Finally, you will get a better understanding of troubleshooting, diagnostics, and best practices in Hadoop administration.
  3. NoSQL Database
    This course is designed for software professionals who want to learn the MongoDB database in simple and easy steps. It covers MongoDB concepts, and after completing this course you will be at an intermediate level of expertise, from which you can advance to higher levels of expertise.
  4. Spark Administration
    This course focuses on how to analyze large and complex sets of data. Starting with installing and configuring Apache Spark with various cluster managers, you will cover setting up development environments. You will then work through recipes for performing interactive queries using Spark SQL and real-time streaming from sources such as Twitter Stream and Apache Kafka.

Data Engineering Professional

A person who passes the Data Engineering Professional examination will have at least the following competencies:

  1. Data Streaming
    This accelerated specialization gives participants a hands-on introduction to designing and building data streaming processes using Apache Kafka. Participants will learn how to design data processing systems, build end-to-end data pipelines, and perform advanced ETL (Extract, Transform, Load) processing.