DT 332 - Applied Data Science
Data science is the craft that data scientists use to answer meaningful business questions with the available data. Mastery of the subject is essential both for managers who need to understand the insights that can be obtained from data and for people who want to apply the scientific method to data.
In this course, students will start to question how to link business and data. The course focuses on the basics of applied data science, such as domain knowledge, visualisation, data engineering, database management and statistical modelling. Among its most salient goals is an in-depth understanding of the most effective ways to communicate data insights as they are created.
In particular, the content covered during the semester is: Introduction to Python, Knowledge Discovery in Databases, ETL (Extract, Transform, Load), Data Exploration, Feature Engineering, Delivering Actionable Insights, Visualisation and Data Storytelling.
Intended Learning Outcomes
On successful completion of the course, students should be able to transform raw data into knowledge by formulating and testing adequate working hypotheses to answer real business questions. This aim will be developed through the following intended learning outcomes:
· ILO 1: To understand the data science process by defining and exploring popular terms such as data mining, machine learning and big data to provide an overall framework for applied data science.
· ILO 2: To identify business key facts by deploying data paradigms, such as Kimball’s data warehouse modelling, to structure the data in a meaningful way.
· ILO 3: To extract, transform and load raw data following the appropriate data paradigms by using state-of-the-art frameworks to populate the databases.
· ILO 4: To conduct initial exploratory analysis by applying a variety of statistical techniques to identify the appropriate features for more advanced modelling.
· ILO 5: To formulate adequate hypotheses by analysing and transforming real business questions to understand what is needed from the data.
· ILO 6: To apply supervised and unsupervised modelling by using advanced machine learning algorithms to obtain knowledge from data.
· ILO 7: To apply different visualisation techniques by using state-of-the-art visual suites (such as Tableau) to appropriately report data insights to business managers.
Attendance Policy
Regular class attendance is mandatory. Please refer to the official IES Abroad Madrid attendance policy, available in the orientation booklet and on Moodle.
Weekly Assignments
The assignments relate to the materials delivered during the course and to the development of a data science project, which each student selects in the 8th week. No late assignments will be accepted. Assignments are due at the beginning of the lecture and must also be uploaded to the course website. The lowest homework score will be dropped for grading purposes.
Class Preparation and Participation
In-class participation, including answering questions, contributing to discussions and taking part in group activities, will be evaluated.
Midterm Exam and Final Exam
The exam format can vary: it may include multiple-choice, fill-in-the-blank or short-answer questions, or it may require students to write code to solve a problem.
The exams are designed to evaluate students' understanding of the concepts and their ability to apply them to data science problems. They can combine theoretical and practical questions, such as a question that describes a scenario and asks the student to select a specific technique or method to analyse or visualise the data, or a question that asks the student to write code to perform a specific data science task.
The final exam is comprehensive and will include a bonus question worth 10 extra points (out of 100), which allows students to make up for a low midterm score.