SITizens Learning Credits

Overview

This module aims to equip the professional with skills in data wrangling with Python, SQL systems such as MySQL, NoSQL systems such as MongoDB, big data processing with Apache Spark, and machine learning with scikit-learn (sklearn).

After the course, the professional will be able to confidently:
  • Carry out exploratory data analysis using Python;
  • Design/create both SQL and NoSQL databases;
  • Create ETL (Extract Transform Load) data pipelines in Apache Spark;
  • Create supervised machine learning models in sklearn; and
  • Perform unsupervised techniques such as clustering.

The objective of the module is to prepare the professional for an entry-level position in data engineering.

What You’ll Learn

  • Data Wrangling. Participants will be able to apply data wrangling techniques, using libraries such as NumPy and Pandas to transform data from one form to another.
  • SQL. Participants will be taught basic SQL and will be able to write and debug simple SQL queries on a database for CRUD (Create, Retrieve, Update, Delete) operations. Participants will also be taught the principles of database design (normal forms).
  • NoSQL. Participants will be taught the difference between SQL and NoSQL databases and will be able to contrast the situations where each should be used. They will also be able to carry out CRUD operations on a NoSQL database such as MongoDB.
  • Apache Spark. Participants will be taught how to create a simple data pipeline consisting of data ingestion, data preparation and generating views / queries.
  • Supervised Machine Learning. Participants will be able to use tools such as sklearn to create machine learning models using a range of techniques such as decision trees or neural networks.
  • Unsupervised Machine Learning. Participants will be exposed to unsupervised learning techniques such as k-means and hierarchical clustering.
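
As a rough illustration of the CRUD operations named in the SQL item above, the sketch below runs one Create, Retrieve, Update and Delete cycle. It is not course material: it assumes Python's built-in sqlite3 module in place of a MySQL server so that it runs without any setup, and the table, column and participant names are hypothetical.

```python
import sqlite3


def crud_demo():
    """Run one Create, Retrieve, Update, Delete cycle on an in-memory database."""
    # Assumption: SQLite stands in for a server-based SQL system such as MySQL.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical table used only for this illustration.
    cur.execute(
        "CREATE TABLE participant ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, module TEXT NOT NULL)"
    )

    # Create: insert two rows using parameterised queries.
    cur.executemany(
        "INSERT INTO participant (name, module) VALUES (?, ?)",
        [("Alice", "DSF6010"), ("Bob", "DSF6020")],
    )

    # Retrieve: names of participants enrolled in one module.
    cur.execute("SELECT name FROM participant WHERE module = ?", ("DSF6010",))
    retrieved = [row[0] for row in cur.fetchall()]

    # Update: move Bob to a different module.
    cur.execute(
        "UPDATE participant SET module = ? WHERE name = ?", ("DSF6010", "Bob")
    )

    # Delete: remove Alice's row.
    cur.execute("DELETE FROM participant WHERE name = ?", ("Alice",))
    conn.commit()

    # Read back whatever is left after the update and delete.
    cur.execute("SELECT name, module FROM participant ORDER BY id")
    remaining = cur.fetchall()
    conn.close()
    return retrieved, remaining


if __name__ == "__main__":
    print(crud_demo())
```

The same four statement types (INSERT, SELECT, UPDATE, DELETE) apply to other SQL systems, although connection handling and placeholder syntax vary by driver.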

Who Should Attend

  • Engineers
  • Software Developers
  • Professionals who have programming experience and are interested in finding out more about data engineering


SITizens Learning Credits (SLC) - Eligible Course

This course is SITizens Learning Credits (SLC) eligible. Please refer to the user guide on how to register for courses utilising your SLC.

Find out more about SITizens Learning Credits (SLC).
 

Certificate and Assessment

A Certificate of Participation will be issued to participants who:
  • Attend at least 75% of the course; and
  • Undertake the non-credit-bearing assessment held during the course.

A Certificate of Attainment will be issued to participants who:
  • Attend at least 75% of the course; and
  • Complete the course and pass all credit-bearing assessments.

Schedule

Every Friday from 5 May 2023

5 May 2023
  • Welcome and Registration
  • Introduction to data programming and Python
  • Pandas and NumPy basics
  • Tea Break
  • Pandas data structure and plotting
  • Hands-on practices
  • Lunch
  • Assembling data and handling missing data
  • Hands-on practices
  • Tea Break
  • Applying functions
  • Hands-on practices
  • Quiz/ Q&A
  • End of Day

12 May 2023
  • Welcome and Registration
  • Overview of database systems
  • Tea Break
  • SQL basics – data definition language, data manipulation language
  • Hands-on practices
  • Lunch
  • Relational database and SQL for relational data
  • Hands-on practices
  • Tea Break
  • Advanced topics on database and discussion
  • Quiz/ Q&A
  • End of Day

19 May 2023
  • Welcome and Registration
  • Introduction to NoSQL
  • Introduction to REST
  • Tea Break
  • Introduction to MongoDB CRUD
  • Lunch
  • Building MongoDB for a Python Application
  • Tea Break
  • NoSQL for big data
  • Quiz/ Q&A
  • End of Day

26 May 2023
  • Welcome and Registration
  • Introduction to Big Data
  • Introduction to Hadoop
  • Tea Break
  • Introduction to Apache Spark
  • Introduction to RDD
  • Lunch
  • Introduction to Functional Programming
  • Tea Break
  • Introduction to Data Pipelines
  • Quiz/ Q&A
  • End of Day

2 Jun 2023
  • Welcome and Registration
  • Introduction to supervised machine learning algorithms
  • Introduction to classification through the K Nearest Neighbour algorithm
  • Tea Break
  • Theory behind Decision Trees in Supervised Learning 
  • Random Forest Algorithm 
  • Lunch
  • Introduction to machine learning programming 
  • Implementing classification with K Nearest Neighbour
  • Tea Break
  • Implementing classification with Decision Trees
  • Implementing classification with Random Forest
  • Quiz/ Q&A
  • End of Day


9 Jun 2023
  • Welcome and Registration
  • Introduction to unsupervised machine learning 
  • Understanding partitional clustering through K-Means algorithm
  • Tea Break
  • Understanding hierarchical clustering 
  • Lunch 
  • Implementing clustering techniques in Python
  • Tea Break
  • An introduction to genetic algorithms
  • A hands-on example of genetic algorithm programming
  • Quiz/ Q&A
  • End of Day

23 Jun 2023
  • Exam (SIT@NYP)

Total course contact hours: 48 hours (min. of 8 hrs including assessment time)

Fees

Category                          Full Fee     After SF Funding
Singapore Citizen (Below 40)      $5,778.00    $1,733.40
Singapore Citizen (40 & above)    $5,778.00    $653.40
Singapore PR/ LTVP+ Holder        $5,832.00    $1,749.60
Non-Singaporeans                  $5,832.00    Not Eligible
 
Note:
  • All figures include GST. GST applies to individuals and Singapore-registered companies.
  • You can opt for either SkillsFuture Funding or Mid-Career Enhanced Subsidy. Both cannot be combined.

Learn more about funding types available

Terms & Conditions:

SkillsFuture Funding

To be eligible for the 70% training grant awarded, applicants (and/or their sponsoring organisations where applicable) must:
  1. Be a Singapore Citizen, Singapore Permanent Resident or LTVP+ Holder
  2. Not receive any other funding from government sources in respect of the actual grant disbursed for the programme

SkillsFuture Mid-Career Enhanced Subsidy

To be eligible for the 90% enhanced subsidy awarded, applicants (and/or their sponsoring organisations where applicable) must:
  1. Be a Singapore Citizen
  2. Be at least 40 years old
  3. Not receive any other funding from government sources in respect of the actual grant disbursed for the programme

SIT reserves the right to collect the balance of the programme fees (i.e. the potential grant amount) directly from the applicants (and/or their sponsoring organisations where applicable) should the above requirements not be fulfilled.

SIT reserves the right to make changes to published course information, including dates, times, venues, fees and instructors without prior notice.

Course Series

Find out more about the Modular Certification Courses - Data Engineering and Smart Factory

These modules are stackable towards a Postgraduate Certificate in Data Engineering and Smart Factory (PGCert in DESF) or can be taken individually as a single module as follows:
 

Modules 
  • DSF6010 Application of Data Engineering
  • DSF6020 Operational Excellence for Smart Factory
  • DSF6030 Managing and Leading AI Projects for Smart Factory
  • DSF6040 Application of Robotics & Automation for Smart Factory

Candidates who pass all 4 modules (totalling 24 credits) with a CGPA of ≥ 2.5 will be awarded the Postgraduate Certificate in Data Engineering and Smart Factory by SIT.

Key Info

Venue SIT@NYP, 172A Ang Mo Kio Avenue 8, S567739
Time 09:00 AM to 06:00 PM
Date 05 May 2023 (Fri) to 23 Jun 2023 (Fri)
Registration is Closed.
