Description |
The applied workshops are preceded by a related Summer School activity (10-21 July 2017) that will focus on growing competence in accessing, analyzing, visualizing, and publishing data. It is open to participants from all disciplines and backgrounds, from the sciences to the humanities. The three applied workshops (24-28 July 2017) focus on Extreme sources of data; Bioinformatics; IoT/Big Data Analytics. TOPICS: • Summer School: principles and practice of Open Science; research data management and curation; use of a range of research compute infrastructures; large scale analysis, statistics, visualisation and modeling techniques; automation and scripting. • Workshop on Extreme sources of data: Introduction to ATLAS Open Data Platforms/Tools, tutorials and CERN LHC. • Workshop on Bioinformatics: computational methods for the management and analysis of genomic and sequencing data. • Workshop on IoT/Big Data Analytics: Big Data tools and technology; real time event processing; low latency query; analyzing social media and customer sentiment. |
Shuttle service from the Adriatico entrance to the Enrico Fermi Building: all financially supported participants lodging at the ICTP Guesthouse should go to the Operations and Travel Unit at the Enrico Fermi Building to complete all financial procedures. Please bring your passport and travel receipts. Registration at Adriatico lower level: only for participants lodging outside ICTP premises and for faculty.
Location: | Adriatico Guest House - Office 3 |
Abstract: In this module, we'll explore various techniques used in Medical Genetics to find the genomic variations responsible for complex and Mendelian traits. We'll start with association analysis of complex traits using genotyping data from thousands of individuals and conclude with the analysis of exome sequences in families with a Mendelian pathology. For each technique, we'll illustrate the related biological and statistical problems, the rationale and the potential pitfalls. In the afternoon, all students will apply the techniques in hands-on tutorials.
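As a rough illustration of the association analysis mentioned in the abstract, the sketch below runs a basic allelic chi-square test for a single variant in a case/control design. It is not taken from the course material: the function name `allelic_chi_square` and the allele counts are hypothetical, and real studies typically add quality control, covariates and multiple-testing correction.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Rows are cases/controls, columns are alternate/reference alleles.
    Returns (chi2, p_value, odds_ratio). All four counts are assumed to be
    positive; zero counts would need a continuity or exact-test treatment.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = sum(sum(row) for row in table)
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio


# Hypothetical counts: 60 alternate / 40 reference alleles in cases,
# 40 alternate / 60 reference alleles in controls.
chi2, p_value, odds_ratio = allelic_chi_square(60, 40, 40, 60)
print(f"chi2={chi2:.2f} p={p_value:.4f} OR={odds_ratio:.2f}")
```

In a genome-wide setting this test would be repeated for every genotyped variant, which is why a much stricter significance threshold than 0.05 is normally applied.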
Location: | Adriatico Guest House - Informatics Laboratory |
Speaker: | Luca Bortolussi (University of Trieste) |
Speaker: | Pio d'Adamo (University of Trieste/Children's Hospital Burlo Garofolo) |
Location: | Adriatico Guest House - Lundqvist Lecture Hall |
Material: | Slides |
E-mail: alberto.sartori@sissa.it
Speaker: | Alberto Sartori (SISSA, Italy) |
Material: | Repository |
Speaker: | Alberto Sartori (SISSA, Italy) |
Location: | Adriatico Guest House - Denardo Lecture Hall |
Speaker: | Ekpe Okorafor (Big Data Academy) |
Material: | Slides Slides |
Speaker: | Ekpe Okorafor (Big Data Academy) |
Material: | Exercises Slides |
Abstract: Next Generation Sequencing (NGS) technologies have led to discoveries of new diagnostic, prognostic and therapeutic targets. Despite these discoveries, treatment of cancer patients, detection of cancer biomarkers and prediction of therapy response remain largely unsolved problems. These difficulties are hindering the realisation of effective approaches to personalized medicine, and data need to be better exploited to systematically elucidate the mechanisms and causes underlying cancer origination and development. Cancers accumulate genetic mutations that allow their cells to proliferate out of control. Mutations occur randomly, are inherited through cell divisions, and orchestrate cancer initiation and development with accumulation patterns differing between individuals. NGS technologies are routinely used to detect mutations in tumoral biopsies, and large, freely accessible collections of cancer datasets are now available. Cancer mutation profiles are incredibly heterogeneous, and we observe few common mutations across patients even if their cancers have similar histological classification. Tumor Heterogeneity (TH) is intimately related to Cancer Evolution, and is considered to lead to the emergence of drug-resistance mechanisms, relapse and failure of treatments. Quantification of TH across cancer types and patients is of the utmost importance in modern cancer research. I will present a causal framework to infer, from DNA sequencing data, Graphical Models that recapitulate the progression of tumors (i.e., evolutionary models). This inference problem has several formulations, according to the type of NGS data that we have access to. I will discuss an approach that combines Statistics, Machine Learning and Formal Methods to infer models from single-sample data, and then I will move on to the problem of studying Cancer Evolution from multiple samples of the same individual.
These two problems are orthogonal, and I will discuss attempts at defining a single framework to study Cancer Evolution. Example applications with real data will be presented and discussed.
Location: | Adriatico Guest House - Informatics Laboratory |
Speaker: | Giulio Caravagna (The University of Edinburgh) |
Location: | Adriatico Guest House - Denardo Lecture Hall |
Speaker: | Ekpe Okorafor (Big Data Academy) |
Material: | Exercises |
Speaker: | Ekpe Okorafor (Big Data Academy) |
Speaker: | Giancarlo Panizzo (University of Trieste) |
Material: | Slides |
Speaker: | Leonid Serkin (ICTP) |
Material: | Slides |
Speaker: | Leonid Serkin (ICTP) |
Material: | Slides |
Speaker: | Leonid Serkin (ICTP) |
Material: | Slides |
Speaker: | Arturo Rodolfo Sanchez Pineda (ICTP) |
All participants are cordially invited
Location: | Adriatico Guest House - Terrace |
Location: | Adriatico Guest House - UN Room |
Material: | Slides |
Abstract: DNA aligners (such as BLAST, Bowtie or BWA) are very fast tools that search for occurrences of (short) DNA sequences in one or more (big) genomes. The idea behind these tools is to pre-process the genome file and build an index; such an index makes it possible to search for a DNA sequence in time proportional to its length, rather than to the length of the genome. Indexing accelerates DNA alignment by millions of times, but it introduces a problem: the index can be several times bigger than the text, exceeding the computer's RAM size. This is particularly concerning in view of recent developments in DNA sequencing technologies: projects such as the 1000 Genomes Project are producing thousands of sequenced genomes, which should be indexed in order to quickly align DNA sequences against them. Not all hope is lost, however. Two genomes from the same species are 99.99% identical, so compression techniques can be exploited to greatly reduce the index size. In this lecture I will introduce a famous compression and indexing technique that is having a huge impact in bioinformatics: the Burrows-Wheeler transform (BWT). We will see, both in theory and in practice, how BWT-based aligners can achieve extremely high search speeds while taking (up to) thousands of times less space than the input collection of genomes.
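The indexing idea in the abstract can be sketched in a few lines of Python: build the BWT of a text, derive two small tables from it, and count or locate a pattern with a backward search whose number of steps depends only on the pattern length. This is a toy illustration, not the lecture's code: the function names and the example sequence are hypothetical, the suffix sort is naive, and the occurrence table is stored uncompressed, whereas practical aligners use sampled or compressed structures.

```python
from collections import Counter


def bwt_and_suffix_array(text):
    """Return (BWT, suffix array) of text + '$' using naive suffix sorting."""
    text += "$"  # terminator, lexicographically smaller than A, C, G, T
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT collects the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in suffix_array)
    return bwt, suffix_array


def build_fm_tables(bwt):
    """Build the two tables needed for backward search.

    first[c]  = number of characters in the text smaller than c
    occ[c][i] = number of occurrences of c in bwt[:i]
    """
    counts = Counter(bwt)
    first, running = {}, 0
    for ch in sorted(counts):
        first[ch] = running
        running += counts[ch]
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return first, occ


def locate(pattern, first, occ, suffix_array):
    """Return sorted start positions of pattern, via backward search."""
    lo, hi = 0, len(suffix_array)  # current suffix-array interval [lo, hi)
    for ch in reversed(pattern):   # one step per pattern character
        if ch not in first:
            return []
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(suffix_array[i] for i in range(lo, hi))


genome = "ACGTACGTTACG"  # hypothetical toy sequence
bwt, suffix_array = bwt_and_suffix_array(genome)
first, occ = build_fm_tables(bwt)
print(bwt)
print(locate("ACG", first, occ, suffix_array))
```

The search loop never scans the genome itself; it only consults the two tables, which is what makes the query time proportional to the pattern length. Runs of equal characters in the BWT are what compression schemes exploit when many similar genomes are indexed together.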
Location: | Adriatico Guest House - Informatics Laboratory |
Speaker: | Nicola Prezza (Technical University of Denmark) |
Material: | Slides |
Speaker: | Nicola Prezza (Technical University of Denmark) |
Location: | Adriatico Guest House - Denardo Lecture Hall |
Speaker: | Ekpe Okorafor (Big Data Academy) |
Material: | Exercises |
Speaker: | Ekpe Okorafor (Big Data Academy) |
Location: | Adriatico Guest House - Informatics Laboratory |
Speaker: | Omer Ayoub (King Abdul Aziz University, Saudi Arabia) |
Material: | Slides |
Abstract: High-throughput data sets from next-generation sequencing provide a rich but highly complex picture of the biological processes assayed. Statistical challenges abound, arising from high dimensionality, strong heterogeneity and generally low replication of the data. In this talk, I will describe how techniques from machine learning and computational statistics can be effectively used to address some of these challenges. I will focus on the issues of statistical testing for epigenomic data such as ChIP- and BS-Seq, and determining isoform proportions/splicing ratios from low coverage RNA-Seq data.
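For orientation only, a very simple baseline for statistical testing on BS-Seq data is Fisher's exact test on methylated versus unmethylated read counts at one site in two conditions. The sketch below is an assumption-laden illustration of that baseline, not the machine learning approach described in the talk; the function name and the read counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Here a, b are methylated/unmethylated reads in condition 1 and
    c, d are methylated/unmethylated reads in condition 2.
    The p-value sums the probabilities of all tables with the same margins
    that are at most as likely as the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x methylated reads in condition 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


# Hypothetical low-coverage site: 3 of 4 reads methylated in condition 1,
# 1 of 4 reads methylated in condition 2.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))
```

With so few reads the test has little power, which illustrates the low-replication and low-coverage difficulty the abstract points to.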
Location: | Adriatico Guest House - Informatics Laboratory |
Speaker: | Guido Sanguinetti (The University of Edinburgh) |
Material: | Slides |
Speaker: | Guido Sanguinetti (The University of Edinburgh) |
Material: | Slides |
Speakers: | Alberto Policriti (University of Udine), Guido Sanguinetti (The University of Edinburgh) |
Location: | Adriatico Guest House - Lundqvist Lecture Hall |
Location: | Adriatico Guest House - Denardo Lecture Hall |
Speaker: | Ekpe Okorafor (Big Data Academy) |
Material: | Exercises |
Speaker: | Ekpe Okorafor (Big Data Academy) |
Location: | Adriatico Guest House - Informatics Laboratory |
Speaker: | Omer Ayoub (King Abdul Aziz University, Saudi Arabia) |
Material: | Slides |
Location: | Adriatico Guest House - Lundqvist Lecture Hall |
Material: | Slides |
Material: | Link |
Material: | Slides |