Data Wrangling - £835

£835.00

Description

This module is compulsory on the MSc Data Science.
This module runs only in trimester 2 (January) each year.
This module fee applies to Overseas fee payers only.

 

Detailed Description

Data Wrangling is the process of transforming and mapping data from "raw" formats into other formats in order to make it more appropriate and valuable for downstream purposes such as analytics. These purposes may include further data processing, visualisation, aggregation and training a statistical model, among many others. Data Wrangling involves several steps: extracting the data in raw form from the data source, processing the raw data using specialised algorithms (e.g. NLP approaches for text processing), storing it in appropriate data structures (e.g. lists or matrices), and finally loading the resulting content into a data sink for storage and future use, such as training machine learning models.
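
As an illustration only, the steps described above (extract, process, store, load into a sink) might look like the following minimal Python sketch. The sample CSV text, the column names and the function names are all hypothetical and are not taken from the module materials. An in-memory buffer stands in for a real file or database sink.

```python
import csv
import io
import json

# Hypothetical raw input: a small CSV export with inconsistent formatting.
RAW_CSV = """city,temperature_c
 Leeds ,12.5
york,
Hull,9.0
"""


def extract(raw_text):
    """Extract: read raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def process(rows):
    """Process: tidy city names, convert readings to floats, drop empty readings."""
    cleaned = []
    for row in rows:
        reading = row["temperature_c"].strip()
        if not reading:
            continue  # skip rows where the reading is missing
        cleaned.append(
            {"city": row["city"].strip().title(), "temperature_c": float(reading)}
        )
    return cleaned


def load(records, sink):
    """Load: write the cleaned records to a file-like sink as JSON."""
    json.dump(records, sink)


# Store the cleaned data in a list of dictionaries, then write it to the sink.
sink = io.StringIO()  # stands in for a file or database (assumption)
records = process(extract(RAW_CSV))
load(records, sink)
print(sink.getvalue())
```

Keeping each step in its own function means the source or the sink can be swapped (for example, an API response in place of CSV text) without touching the cleaning logic.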

Contemporary data acquisition and analysis have to address several challenges, including the variety of data sources, the volume of data and its validity. These challenges require the use of specialised data storage, aggregation and processing techniques. This module introduces a range of tools and techniques necessary for working with data in a variety of formats with a view to developing data-driven applications. The module focuses primarily on developing applications using the Python scripting language and associated libraries, and also introduces a range of associated data processing technologies and techniques.

The module covers the following topics:

• Data types and formats: numerical and time series, textual, unstructured
• Data sources and interfaces: open data, APIs, social media, web-based
• Techniques for dealing with text data such as vectorisation, bag of words, word embeddings
• Supervised Machine Learning approaches
• Developing and evaluating Data-Driven Applications in Python
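
To give a flavour of the text techniques listed above, the following is a minimal bag-of-words sketch in plain Python. It is an illustrative assumption rather than the module's own material: the tokeniser is deliberately simple, the function names and example sentences are hypothetical, and in practice a library implementation would normally be used instead.

```python
from collections import Counter


def tokenise(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    cleaned = "".join(ch if ch.isalpha() else " " for ch in text.lower())
    return cleaned.split()


def bag_of_words(documents):
    """Return a sorted vocabulary and one word-count vector per document."""
    token_lists = [tokenise(doc) for doc in documents]
    vocabulary = sorted({tok for tokens in token_lists for tok in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)  # missing words count as 0
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Hypothetical example documents.
docs = ["Data wrangling cleans data.", "Clean data helps models."]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)
print(vectors)
```

Each document becomes a fixed-length numeric vector whose positions correspond to vocabulary words, which is the form that supervised learning algorithms generally expect as input.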

The Benchmark Statement for Computing specifies the range of skills and knowledge that should be incorporated in computing courses. This module covers cognitive skills in Computational Thinking, Modelling, Methods and Tools, and Requirements Analysis. It also covers practical skills in specification, development and testing, in the deployment and use of tools, and in critical evaluation, and it provides useful generic skills for employment.
