Advanced Data Analytics

COS20083

Duration
One Semester or equivalent

Prerequisites
COS10081 Introduction to Data Science AND
COS10009 Introduction to Programming

Corequisites
Nil

Contact hours
48 Hours

Credit Points
12.5

Aims and learning outcomes

This unit introduces the foundations of probability and statistics and explores the key concepts and tools of data mining and machine learning. Students will learn why these advanced data analytics methods matter and how different algorithms can be applied to real-world analytics problems. Students will also demonstrate problem solving and critical thinking.

After successfully completing this unit, you should be able to:

  1. Recognise the basics of probability and statistics, and of linear algebra
  2. Describe the concepts and principles of data mining and their applications
  3. Demonstrate critical thinking in solving complex problems
  4. Differentiate between descriptive and predictive data mining and their roles in supporting decision making
  5. Analyse data sets by applying appropriate advanced data analytics methods and algorithms

Unit information

Learning and teaching structures

2 hours of lectures and a 2-hour tutorial/laboratory per week.

In a semester, you should normally expect to spend an average of 12.5 hours per week in total (formal contact time plus independent study time) on a 12.5 credit point unit of study.

Content

  • Introduction to probability and statistics, and linear algebra
  • Introduction to data mining
  • Data analysis
  • Data transformation
  • Association rules
  • Classification and regression
  • Clustering
  • Text and graph mining

General skills outcomes

Key Generic Skills:

  • Teamwork
  • Analysis
  • Problem Solving
  • Ability to tackle unfamiliar problems
  • Ability to work independently

Assessment

  1. Weekly lab tasks (Individual) 40%
  2. Assignment 1: Statistics (Individual) 20%
  3. Assignment 2: Case study and algorithm implementation (Group of 2) 20%
  4. Test (Individual) 20%

Minimum requirements to pass this unit of study

In order to achieve a pass in this unit of study, you must achieve:

  • an aggregated mark of 50% or more, and
  • at least 50% in the final exam
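
As an illustration of how the two requirements combine, the following Python sketch computes a weighted aggregate from the assessment weights listed under Assessment. It assumes that the aggregated mark is the weighted sum of the four component marks and that the "final exam" refers to the Test component. Neither assumption is stated in this outline, and the function name, dictionary keys and sample marks are hypothetical.

```python
# Assessment weights taken from the Assessment section (fractions of the final mark).
WEIGHTS = {
    "lab_tasks": 0.40,
    "assignment_1": 0.20,
    "assignment_2": 0.20,
    "test": 0.20,
}


def unit_result(marks):
    """Return (aggregate, passed) for component marks given as percentages (0-100).

    Assumptions (not confirmed by the outline): the aggregate is a weighted
    sum of the components, and the 50% hurdle applies to the "test" component.
    """
    aggregate = sum(WEIGHTS[name] * marks[name] for name in WEIGHTS)
    passed = aggregate >= 50 and marks["test"] >= 50
    return aggregate, passed


# Hypothetical marks, for illustration only.
example = {"lab_tasks": 80, "assignment_1": 60, "assignment_2": 70, "test": 45}
aggregate, passed = unit_result(example)
print(f"aggregate = {aggregate:.1f}%, passed = {passed}")
```

With these sample marks the aggregate is 67%, yet under the stated assumptions the result is not a pass, because the test mark is below 50%.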

Study Resources

Resources and reference material

  • Data Science from Scratch – First Principles with Python
    Joel Grus
    2015 O’Reilly Media
  • An Introduction to Statistical Learning
    Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani
    2014 Springer
    ISBN 978-1-4614-7137-0 ISBN 978-1-4614-7138-7 (eBook)
  • The Elements of Statistical Learning
    Trevor Hastie, Robert Tibshirani, Jerome Friedman
    2009, 2nd Edition Springer
    ISBN 978-0-387-84858-7
  • Think Stats – Exploratory Data Analysis
    Allen B. Downey
    2014, 2nd Edition O’Reilly
  • Think Stats – Probability and Statistics for Programmers
    Allen B. Downey
    2011 Green Tea Press (Free Download)
  • Think Bayes – Bayesian Statistics Made Simple
    Allen B. Downey
    2012 Green Tea Press (Free Download)
  • Principles of Data Science
    Sinan Ozdemir
    2016 Packt Publishing
  • Python Data Science Essentials
    Alberto Boschetti and Luca Massaron
    2016 Packt Publishing
  • Doing Math with Python: Use Programming to Explore Algebra, Statistics, Calculus, and More!
    Amit Saha
    2015 No Starch Press, Inc.