Module: Dataprocessing
Field | Value |
---|---|
Osiris code | BFVH4DPS1 |
ECTS | 3 |
Assessment type | Assignment |
Minimum grade | 5.5 |
Lecturer(s) | FEFE |
Contact person | FEFE |
Language of instruction | Dutch |
Course objectives (learning outcomes)
- Student can identify and design the workflow steps and the input and output requirements of a real-life "biodata analysis" research question
- Student can use Snakemake to automate an analysis in which multiple software tools are applied sequentially to input data
- Student includes the log of the analysis and a graphical representation of the analysis in the automated workflow
- Student schedules analysis steps on a server or a computer cluster given the data size and computation time requirements
- Student explains design choices
Content
Bioinformatics research often needs high-performance computing resources to carry out data management and analysis tasks at large scale. The analysis needs to run automatically according to a defined workflow, especially when multiple types of data sources pass through multiple steps, such as mapping and filtering, and produce multiple types of results. Snakemake is an MIT-licensed workflow management system that aims to structure complex data analyses. It can run shell, Python and R scripts, and it supports parallelisation and logging. It can schedule jobs according to the available CPU, GPU and memory resources, and it can access input and output files via HTTP, HTTPS, Amazon S3, Google Storage, Dropbox, FTP and SFTP. In this module you will work with Snakemake to answer a research question from a real case.
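To make the idea of a workflow defined by inputs and outputs concrete, here is a minimal sketch in plain Python. It is not Snakemake and does not use its API; it only illustrates the general principle that the order of the steps can be derived from which step produces the files another step needs. The rule names and file names (`map_reads`, `reads.fastq`, and so on) are hypothetical and chosen for illustration only.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow steps, deliberately listed out of order.
# Each step declares the files it reads and the files it writes.
RULES = {
    "report": {"input": ["filtered.bam"], "output": ["report.html"]},
    "filter_reads": {"input": ["mapped.bam"], "output": ["filtered.bam"]},
    "map_reads": {"input": ["reads.fastq", "genome.fa"], "output": ["mapped.bam"]},
}


def execution_order(rules):
    """Return step names ordered so every input exists before it is used."""
    # Look up which step produces each output file.
    producer = {
        out: name for name, rule in rules.items() for out in rule["output"]
    }
    # A step depends on the producers of its inputs. Files without a
    # producer (here: reads.fastq, genome.fa) are treated as raw data.
    graph = {
        name: {producer[f] for f in rule["input"] if f in producer}
        for name, rule in rules.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = execution_order(RULES)
print(order)  # prints ['map_reads', 'filter_reads', 'report']
```

The steps form a single chain, so only one order is valid. A real workflow manager adds much more on top of this idea, such as logging, parallel execution and resource-aware scheduling.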
Literature and other sources
Web
- Blackboard course Theme 11 and http://bioinf.nl/~fennaf/snakemake
Competencies
-
Teaching methods
During the first 3-4 lectures, Snakemake will be introduced and students will practise with small case studies. After the introductory lectures, students will work on the final assignment. The scheduled lectures 5-8 will be used as feedback sessions to discuss design choices and code.
Entry requirements
-
Entry requirements for the assessment
-
Prior knowledge
- Knowledge and skills to program with Bash/Python and to use bioinformatics tools
Prior knowledge can be acquired through
- previous informatics and bioinformatics courses
Self-study resources
Required material
-
Recommended material
- Tutorials and presentations available at https://bioinf.nl/~fennaf/snakemake