Overview

This module aims to equip the professional with skills ranging from data wrangling in Python, through SQL systems such as MySQL and NoSQL systems such as MongoDB, to big data processing with Apache Spark and machine learning with scikit-learn (sklearn).

After the course, the professional will be able to confidently
  • carry out exploratory data analysis using Python;
  • design and create both SQL and NoSQL databases;
  • create ETL (Extract Transform Load) data pipelines in Apache Spark;
  • create supervised machine learning models in sklearn; and
  • perform unsupervised techniques such as clustering.

The objective of the module is to prepare the professional for an entry level position in data engineering.

What You’ll Learn

  • Data Wrangling. Participants will be able to apply data wrangling techniques, using libraries such as Numpy and Pandas to transform data from one form to another.
  • SQL. Participants will be taught basic SQL and they will be able to write and debug simple SQL queries on a database for CRUD (Create Retrieve Update Delete) operations. Participants will also be taught principles of database design (normal forms).
  • NoSQL. Participants will be taught the difference between SQL and NoSQL databases and be able to contrast the situations where each should be used. They will also learn to carry out CRUD operations on a NoSQL database such as MongoDB.
  • Apache Spark. Participants will be taught how to create a simple data pipeline consisting of data ingestion, data preparation and generating views / queries.
  • Supervised Machine Learning. Participants will be able to use tools such as sklearn to create machine learning models using a range of techniques such as decision trees or neural networks.
  • Unsupervised Machine Learning. Participants will be exposed to unsupervised learning techniques such as k-means and hierarchical clustering.
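
As an illustration of the CRUD operations named in the SQL item above, the following is a minimal sketch only. It assumes Python's built-in sqlite3 module as a stand-in for a MySQL server, and the table name (sensors) and its columns are hypothetical, chosen purely for the example; they are not taken from the course materials.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a MySQL server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create: define a (hypothetical) table and insert two rows.
cur.execute(
    "CREATE TABLE sensors ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, reading REAL)"
)
cur.executemany(
    "INSERT INTO sensors (name, reading) VALUES (?, ?)",
    [("temp_a", 21.5), ("temp_b", 19.0)],
)

# Retrieve: read the rows back in insertion order.
cur.execute("SELECT name, reading FROM sensors ORDER BY id")
rows_after_insert = cur.fetchall()

# Update: change one reading, using placeholders rather than string formatting.
cur.execute("UPDATE sensors SET reading = ? WHERE name = ?", (22.0, "temp_a"))

# Delete: remove one row.
cur.execute("DELETE FROM sensors WHERE name = ?", ("temp_b",))

# Retrieve again to see the effect of the update and delete.
cur.execute("SELECT name, reading FROM sensors ORDER BY id")
rows_final = cur.fetchall()

conn.commit()
conn.close()

print(rows_after_insert)
print(rows_final)
```

The same four statements (INSERT, SELECT, UPDATE, DELETE) carry over to MySQL, although connection handling and placeholder syntax depend on the driver used.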

Who Should Attend

  • Engineers
  • Software Developers
  • Professionals who have programming experience and are interested in finding out more about data engineering

SITizens Learning Credits (SLC) - Eligible Course

SIT Alumni: Before registering for courses, please activate your SITizens Learning Credits via the email sent by SIT Alumni Team, on behalf of SITLEARN Professional Development.


Certificate and Assessment

A Certificate of Participation will be issued to participants who:
  • Attend 75% of the course;
  • Undertake the non-credit-bearing assessment during the course.

A Certificate of Attainment will be issued to participants who:

  • Complete the course and pass all the credit-bearing assessments;
  • Have a bachelor's degree in engineering or another relevant discipline.

Schedule

Every Friday from 4 September 2020

4 September 2020
  • Welcome and Registration
  • Introduction to data programming and Python
  • Pandas and Numpy basics
  • Tea Break
  • Pandas data structure and plotting
  • Hands-on practices
  • Lunch
  • Assembling data and handling missing data
  • Hands-on practices
  • Tea-break
  • Applying functions
  • Hands-on practices
  • Quiz/ Q&A
  • End of Day

11 September 2020
  • Welcome and Registration
  • Overview of database systems
  • Tea Break
  • SQL basics – data definition language, data manipulation language
  • Hands-on practices
  • Lunch
  • Relational database and SQL for relational data
  • Hands-on practices
  • Tea-break
  • Advanced topics on database and discussion
  • Quiz/ Q&A
  • End of Day

18 September 2020
  • Welcome and Registration
  • Introduction to NoSQL
  • Introduction to REST
  • Tea Break
  • Introduction to MongoDB CRUD
  • Lunch
  • Building MongoDB for a Python Application
  • Tea-break
  • NoSQL for big data
  • Quiz/ Q&A
  • End of Day


25 September 2020
  • Welcome and Registration
  • Introduction to Big Data
  • Introduction to Hadoop
  • Tea Break
  • Introduction to Apache Spark
  • Introduction to RDD
  • Lunch
  • Introduction to Functional Programming
  • Tea-break
  • Introduction to Data Pipelines
  • Quiz/ Q&A
  • End of Day

2 October 2020
  • Welcome and Registration
  • Introduction to supervised machine learning algorithms
  • Introduction to classification through the K Nearest Neighbour algorithm
  • Tea Break
  • Theory behind Decision Trees in Supervised Learning 
  • Random Forest Algorithm 
  • Lunch
  • Introduction to machine learning programming 
  • Implementing classification with K Nearest Neighbour
  • Tea-break
  • Implementing classification with Decision Trees
  • Implementing classification with Random Forest
  • Quiz/ Q&A
  • End of Day


9 October 2020
  • Welcome and Registration
  • Introduction to unsupervised machine learning 
  • Understanding partitional clustering through K-Means algorithm
  • Tea Break
  • Understanding hierarchical clustering 
  • Lunch 
  • Implementing clustering techniques in Python
  • Tea-break
  • An introduction to genetic algorithm
  • A hands-on example of genetic algorithm programming
  • Quiz/ Q&A
  • End of Day

16 October 2020
  • Exam

Total course contact hours: 48 hours (min. of 8 hrs including assessment time)

Fees

Category | Full Fee | After SF Funding | After SF Mid-Career Enhanced Subsidy
Singapore Citizen (Below 40) / Singapore PR | $5,778.00 | $1,733.40 | Not Eligible
Singapore Citizen (40 & above) | $5,778.00 | $1,733.40 | $653.40
Non-Singaporeans | $5,778.00 | Not Eligible | Not Eligible
 
Note:
  • All figures include GST. GST applies to individuals and Singapore-registered companies.
  • You can opt for either SkillsFuture Funding or Mid-Career Enhanced Subsidy. Both cannot be combined.

Learn more about funding types available

Terms & Conditions:

SkillsFuture Funding

To be eligible for the 70% training grant awarded, applicants (and/or their sponsoring organisations where applicable) must:
  1. Be a Singapore Citizen or Singapore Permanent Resident
  2. Not receive any other funding from government sources in respect of the actual grant disbursed for the programme

SkillsFuture Mid-Career Enhanced Subsidy

To be eligible for the 90% enhanced subsidy awarded, applicants (and/or their sponsoring organisations where applicable) must:
  1. Be a Singapore Citizen
  2. Be at least 40 years old
  3. Not receive any other funding from government sources in respect of the actual grant disbursed for the programme

SIT reserves the right to collect the balance of the programme fees (i.e. the potential grant amount) directly from the applicants (and/or their sponsoring organisations where applicable) should the above requirements not be fulfilled.

SIT reserves the right to make changes to published course information, including dates, times, venues, fees and instructors without prior notice.

Course Series

Find out more about the Modular Certification Courses - Data Engineering and Smart Factory

These modules are stackable towards a Postgraduate Certificate in Data Engineering and Smart Factory (PGCert in DESF) or can be taken individually as a single module as follows:
 

Modules 
  • DSF6010 Application of Data Engineering
  • DSF6020 Operational Excellence for Smart Factory
  • DSF6030 Managing and Leading AI Projects for Smart Factory
  • DSF6040 Application of Robotics & Automation for Smart Factory
Candidates who pass all 4 modules (totalling 24 credits) with a CGPA of ≥ 2.5 will be awarded the Postgraduate Certificate in Data Engineering and Smart Factory by SIT.

Key Info

Venue: SIT@Dover, 10 Dover Drive S138683
Time: 09:00 AM to 06:00 PM
Date: 04 Sep 2020 (Fri) to 16 Oct 2020 (Fri)
Registration is Closed.
