Applied Data Science and Data Visualisation

Title
Applied Data Science and Data Visualisation
Semester
F2026
Master programme in
Molecular Health Science / Chemical Biology / Environmental Science / Mathematical Bioscience
Type of activity

Course

Mandatory or elective

Mandatory: Molecular Health Science and Chemical Biology
Elective: Mathematical Bioscience

Teaching language
English
Study regulation

Read about the Master Programme and find the Study Regulations at ruc.dk

REGISTRATION AND STUDY ADMINISTRATION
Registration

Sign up for study activities at STADS Selvbetjening (self-service) within the announced registration period, which is listed on the Study Administration homepage.

When signing up for study activities, please be aware of potential conflicts between study activities or exam dates.

The planning of activities at Roskilde University is based on the recommended study programmes, which do not overlap. However, if you choose optional courses and/or study plans that go beyond the recommended study programmes, lectures or exam dates may overlap, depending on which courses you choose.

Number of participants
ECTS
5
Responsible for the activity
John Shorter (johnsh@ruc.dk)
Head of study
Lotte Jelsbak (ljelsbak@ruc.dk)
Teachers
Study administration
INM Registration & Exams (inm-exams@ruc.dk)
Exam code(s)
U60176
ACADEMIC CONTENT
Overall objective

The overall objective of this course is to introduce the concepts of data science and data visualization, enabling students within the experimental sciences to design, perform, visualize, evaluate, interpret and communicate experiments in which many parameters are measured, including so-called big data (‘omics’) experiments.

Furthermore, the aim is to provide students with the methodological and data analysis skills needed to evaluate the validity and quality of methods and data related to the analysis of large datasets.

Detailed description of content

The course consists of lectures combined with hands-on exercises, and projects where the students can work on their own data or other data from their own field.

No previous programming experience is required, but students will be expected to learn basic programming (R and RStudio) for visualization and statistical analysis during this course.

Course material and Reading list

No textbooks are needed. Course material will be specified on Moodle.

Overall plan and expected work effort

The course is composed of 8 lessons. Each lesson is concluded with a written report. The reports demonstrate the students' active participation. The reports can be written individually or in groups of 2-3 students. The final mini-project report is written in groups of 2-3 students.

  • lectures 8 hrs
  • pc lab practical exercises 24 hrs
  • preparation for lectures and exercises 38 hrs
  • problem solution and report writing 65 hrs

total 135 hrs

Format
Evaluation and feedback

The course includes formative evaluation based on dialogue between the students and the teacher(s).

Students are expected to provide constructive critique, feedback and viewpoints during the course where this can improve the quality of the course. Every other year, at the end of the course, there will also be an evaluation through a questionnaire in SurveyXact. The Study Board will handle all evaluations along with any comments from the course responsible teacher.

Furthermore, students can, in accordance with RUC's ‘feel free to state your views’ strategy, send evaluations, comments or insights from the course to the study board through their representatives, during or after the course.

Programme

Most classes will begin with a short lecture / introduction to a concept within data science followed by time for discussion and work with R programming exercises.

Students will then work in small teams to analyze data based on concepts covered in the introduction. Selected groups will then present their data analysis to the class.

The students will write a report for each lesson, where an emphasis should be on explaining the analyses used, the implication of the results, and on the visualization of the data.

These reports will be turned in at the end of the class or before the next class. Each report must include the names of the group members and the code used for the analysis and visualization.

The topic of the mini-project is to present a visual and statistical analysis of an approved dataset. You will use what you learned during the semester to create an R script that walks step by step through an analysis, and you will present this analysis and script at the last class.

You are expected to bring a laptop computer to class.

ASSESSMENT
Overall learning outcomes

After completing the course, the students will be able to:

  • describe and explain the concepts of multivariable data processing and visualization

  • handle multivariable data using relevant software

  • identify and extract relevant parameters from large datasets

  • implement appropriate descriptive statistics on high-complexity and big data

  • describe and analyze the intrinsic structure of a large multivariable dataset using relevant methods, such as clustering methods, principal component analysis (PCA) or partial least-squares (PLS) analysis

  • analyze multivariable data using basic linear models with covariate adjustments and interpret and discuss results from these

  • describe simple machine learning algorithms and explain their differences regarding purpose of use, strengths and weaknesses, as well as use selected machine learning algorithms for tasks such as selecting the variable with the best predictive power, and interpret the results from these

  • explain the results from these methods to both lay people and specialists

  • be aware of the assumptions and limitations of the chosen statistical tests

  • visualize the results in an informative and rigorous way

  • design complex experiments, including ‘omics’ experiments, based on the methodological considerations of the ensuing data analysis

  • write documents describing methodological considerations regarding the analysis of big (omics) data

  • communicate the knowledge and understanding gained from the course in a precise and scientific way

Prerequisites
Form of examination
Active, regular attendance, and satisfactory participation

Active participation is defined as:
The student must participate in course-related activities (e.g., workshops, seminars, field excursions, process study groups, working conferences, supervision groups, and feedback sessions).

Regular attendance is defined as:
- The student must be present for a minimum of 75 percent of the lessons.

Satisfactory participation is defined as:
- e.g., oral presentations (individually or in a group), peer reviews, mini projects, tests, and planning of a course session.

Assessment: Pass/Fail
Form of Re-examination
Individual written take-home assignment

The character limit of the assignment is: 2,400-19,200 characters, including spaces.
The character limit includes the cover, table of contents, bibliography, figures and other illustrations, but excludes appendices.

The duration of the take-home assignment is 7 days and may include weekends and public holidays.

Assessment: Pass/Fail
Type of examination in special cases
Examination and assessment criteria (implemented)

Exam: Active participation is defined as: The student must participate actively in lectures, discussions and problem-solving classes. Students may be selected to present their report to the class at the end of a lecture. Active participation means that students must work on data analysis during the class and present if they are called upon.

Regular attendance is defined as: The student must be present for a minimum of 75 percent of the lessons. This includes arriving on time and staying until the end of the class.

Satisfactory participation is defined as: The student must write and submit reports following every class. The student must work on a mini-project and present results at the final class.

Assessment criteria: in relation to satisfactory participation in the exercises and mini-project, students will be assessed on their ability to:

  • explain the analyses used
  • account for how the choice of analysis has implications for the results, the visualization of the data, and the programming code for analysis and visualization
  • communicate the knowledge and understanding gained from the lesson in a precise way within the submitted reports

Reexam: in relation to the re-exam, students will be assessed on their ability to:

  • Create working code in the report that is reproducible
  • Explain the analyses used
  • Account for how the choice of analysis has implications for the results, the visualization of the data, and the programming code for analysis and visualization
  • Communicate the knowledge and understanding gained from the lesson in a precise way within the submitted reports

Regarding the use of generative AI at the exam and reexam

In this course, generative AI tools (GAI) are allowed in the work on the exam if their use is declared. You must clearly indicate how you have used generative artificial intelligence (GAI). This can, for example, be included as part of a methodology section or as a brief statement at the end of your exam paper or submitted as an appendix to your assignment. This means that you must describe how you have used GAI, for example, for preparatory work on the assignment, to ask questions, search and process information, receive feedback and critique on your text, perform proofreading, or improve language and readability. It is important that you actively consider your choice of tools in this way, as it is part of the entire creation process of the assignment and thus part of your scientific method and academic communication.

The use of any specific text that is GAI-generated requires citation, just like the use of any other sources from which direct quotes are taken.

The use of generative artificial intelligence (GAI) must always take place within the framework of Roskilde University's ‘Guidelines for using generative artificial intelligence in exams’. In the library's guide, you can see more about how to cite AI, how you can declare your use of GAI, and read Roskilde University’s Guidelines: https://libguides.ruc.dk/AI.

Regular spell check and other language suggestions, as known from Word or other word processing programs, as well as programs for writing minutes and transcription, are allowed in all written exams and do not need to be declared.

Last changed 02/12/2025

lecture list:

Wednesday 11-02-2026 12:15 - 16:00 in week 07
Applied Data Science and Visualisation

Monday 16-02-2026 12:15 - 16:00 in week 08
Applied Data Science and Visualisation

Wednesday 18-02-2026 12:15 - 16:00 in week 08
Applied Data Science and Visualisation

Wednesday 25-02-2026 12:15 - 16:00 in week 09
Applied Data Science and Visualisation

Monday 02-03-2026 12:15 - 16:00 in week 10
Applied Data Science and Visualisation

Wednesday 11-03-2026 12:15 - 16:00 in week 11
Applied Data Science and Visualisation

Wednesday 18-03-2026 12:15 - 16:00 in week 12
Applied Data Science and Visualisation

Wednesday 25-03-2026 12:15 - 16:00 in week 13
Applied Data Science and Visualisation

Tuesday 23-06-2026 10:00 - Tuesday 30-06-2026 10:00 in week 26 and week 27
Applied Data Science and Visualisation
Reexam