Data Science (DSC)

Why study Data Science?

Data abounds: social media, manufacturing systems, medical devices, logistics services, and countless other sources generate petabytes of data on a daily basis. With this wealth of data available, we are at a point in history where we can conduct analyses to detect, discover, and, ultimately, better understand the world around us.

What are the career opportunities for Data Science graduates?

Prepare for a career in a highly innovative area: data science. The data scientist profession was hailed as the “Sexiest Job of the 21st Century” by Harvard Business Review in October 2012. The data scientist is a professional who simultaneously possesses breadth and depth in scalable data management, data analysis, and domain area expertise, and who is capable of solving real-world problems. This is an opportune time to pursue training in a new field that is both challenging and rewarding. Join us and embark on a journey of a lifetime!

Why Data Science at EIT Digital?

Getting value and meaning, and answering big questions, are the ultimate goals of learning Data Science at EIT Digital.

Mazen Aly
Data Science Master student
Eindhoven University of Technology

Study Data Science at EIT Digital, where education is provided by renowned universities and where the entrepreneurial data scientists of the future come into being!

Farideh Heidari
Data Science Coordinator

What is the Data Science Master at EIT Digital all about?

The newly established Data Science Master’s offers a unique academic programme, whereby students can study data science, innovation, and entrepreneurship at leading European universities. In this programme, students will learn about scalable data collection techniques, data analysis methods, and a suite of tools and technologies that address data capture, processing, storage, transfer, analysis, and visualization, and related concepts (e.g., data access, pricing data, and data privacy).

How is the programme structured?

The first year will be similar at all four DSc entry point universities: Universidad Politecnica de Madrid (UPM), Eindhoven University of Technology (TU/e), Universite Nice Sophia Antipolis (UNS) and Polytechnic University of Milan (Polimi). It consists of foundation courses, such as data handling, data analysis, advanced data analysis and data management, visualization, and applications. The second year will enable students to concentrate on one of five technical specialisation areas of their own choosing (see the specialisations below).

The Innovation and Entrepreneurship (I&E) courses are an important part of the programme. The I&E basics course provides an introduction to business & management.

Students participating in the DSc track are offered an internship with an industry partner or research centre of EIT Digital to work on their thesis project. Directly linked to the master thesis is the I&E minor thesis, which specifies the requirements, strategy and business plan for the selected thesis project.

Where can I study if I choose Data Science?

Entry - 1st year

  • Eindhoven University of Technology (TU/e)
  • Universidad Politecnica de Madrid (UPM)
  • Universite Nice Sophia Antipolis (UNS)
  • Polytechnic University of Milan (Polimi)

Exit - 2nd year, specialisation

  • Infrastructures for Large Scale Data Management and Analysis at UPM
  • Multimedia and Web Science for Big Data at UNS
  • Business Process Intelligence at TU/e
  • Distributed Systems and Data Mining for Really Big Data at KTH
  • Design, Implementation, and Usage of Data Science Instruments at TUB

What can I study at the entry and exit points?

Entry - 1st year

Eindhoven University of Technology

Eindhoven University of Technology (TU/e)
Program Coordinator: Farideh Heidari

Entry Program (each course with 5 ECTS)

Technical Common Base
2IMI35 Introduction to process mining
2IMW15 Web information retrieval and data mining
2IMA10 Advanced Algorithms
2IMV20 Visualization
2DMT00 Applied Statistics

Core Electives (2 out of 4)
2IMV10 Visual computing project
2IMA20 Algorithms for geographic data
2IMM20 Statistical learning theory
2IMI20 Advanced process mining

Suggested Electives (on Top of Program)
2MMS10 Probability and stochastics 1
2IMV25 Interactive virtual environments
2IMS25 Principles of Data Protection
2MMS30 Probability and stochastics 2
2IMI30 Business process simulation
2IMW30 Foundations of data mining
2IMW10 Data engineering
2IMV15 Simulation in Computer Graphics
2IMW20 Database technology
2DD23 Time-series analysis & forecasting

Technical University of Madrid

Universidad Politecnica de Madrid (UPM)
Program Coordinator: Marta Patino

First Semester 30 ECTS
- I&E 6 ECTS
- Cognitive systems 4.5 ECTS
- Intelligent Data Analysis 4.5 ECTS
- Cloud Computing and Big Data Ecosystems Design 4.5 ECTS
- Big Data 6 ECTS
- Intelligent Systems 4.5 ECTS

Second Semester 30 ECTS
- I&E 18 ECTS
- Information Retrieval, Extraction and Integration 4.5 ECTS
- Deep learning 3 ECTS
- Data Science Seminars 4.5 ECTS

University of Nice Sophia-Antipolis

Universite Nice Sophia Antipolis (UNS)

Program Coordinator: Francoise Baude

Technical Courses (34-36 ECTS)

Semester 1

Big data handling (6 ECTS)
- Technologies for big data
- Data mining

Data Transmission (choose from 4 up to 6 ECTS)
- Networking and traffic analysis
- Virtualized cloud computing
- Internet & network programming
- Wireless networking
- Cryptography & security

Data Analysis (choose from 4 up to 6 ECTS)
- Problem solving
- Analysis & indexing of images & videos in big size systems
- Graph & linear programming
- Probability & Statistics
- Web semantic & reasoning

Elective (choose from 0 up to 6)
- Group project in Data Science
- Winter School on Complex networks

Semester 2

Parallelism and Big Data distributed Systems (choose from 2 up to 6)
- Concepts of concurrency
- Concepts of parallelism
- Distributed systems and databases

Intelligent data analysis
- Statistical machine learning
- Data Valorization

Elective (choose from 1 up to 5)
- Web Science seminar
- Machine learning for computer vision
- Advanced programming (C++)

Innovation and Entrepreneurial courses (24-26 ECTS)

Semester 1

- Specific Scientific Writing (2 ECTS)
- French as Foreign Language (2 ECTS)
- Innovation & Entrepreneurship = mini BDL (2 ECTS)
- Basic concepts in I&E (2 ECTS)

Semester 2

- EIT Digital summer school (4 ECTS)
- Shared on-line business courses on Data Science (5 ECTS)
- Business Dev. Lab (7 ECTS)

Polytechnic University of Milan

Program Coordinator: Paolo Cremonesi (paolo.cremonesi@polimi.comit)

Technical Common Base
89183 DATA BASES 2 (Semester 1, 5 ECTS)

Core Electives (3 out of 6)

Suggested Electives (on Top of Program)

Innovation and Entrepreneurship Part I (1 out of 2)

Innovation and Entrepreneurship Part II (1 out of 2)

Exit - 2nd year, specialisation

Infrastructures for Large Scale Data Management and Analysis at UPM

Marta Patiño

Marta Patiño is a professor at UPM. She is Distributed Systems co-director and a co-founder of LeanXcale, a startup for Real-Time Big Data analytics. She is also a founding member of the research center Center for Open Middleware. She is co-inventor of 3 patent applications. She has coordinated several national projects and the EU-funded projects LeanBigData, CoherentPaaS, and CumuloNimbo. She is co-author of the book Database Replication and has published over 100 papers in international conferences and journals such as SIGMOD, VLDB Journal, ACM Trans. on Database Systems, ACM Trans. on Computer Systems, IEEE Trans. on Parallel and Distributed Systems, etc. Her research areas include scalable transactional processing, scalable complex event processing, online analytical processing, cloud computing, big data, and fault tolerance.

The specialization focuses on how to use large-scale data management and big data infrastructures for processing, storing and analyzing huge amounts of data and for building new applications on top of them. Students will learn how to use data streaming systems, persistent queues, batch processing for large clusters, and large distributed databases, among other technologies. They will also learn how to combine these tools to build ecosystems in which applications can deal with the large amount of data that is being produced today and that is increasing at a high pace as more devices become available, connected and producing data. Students will also gain experience with data analytics in order to get new insights and value from the produced data.

Students will be able to do internships and cooperate with large companies such as Telefonica, Indra, and Atos, and also with UPM startups in the area of Big Data: Localidata and LeanXcale. Localidata focuses on the data value chain. LeanXcale provides a leading-edge Real-Time Big Data Analytics platform.

Contact: Marta Patino

Further information on EIT Digital Master Program in Data Science at UPM can be found here.

Courses (24 ECTS)
- Data Analysis (4.5 ECTS)
- Large Scale Systems Project (3 ECTS)
- Open Linked Big Data (6 ECTS)
- Large Scale Data Management (4.5 ECTS)
- Deep Learning (3 ECTS)
- Massively Parallel Machine Learning (3 ECTS)

Multimedia and Web Science for Big Data at UNS

Françoise Baude

Françoise Baude has been a Full Professor since 2010 at the engineering school Polytech’Nice-Sophia, part of the University of Nice-Sophia Antipolis, and has belonged to a joint research group with Inria and CNRS I3S since 1995. She has been the official contact person for UNS in EIT Digital since 2010. She has been involved in numerous research activities encompassing large-scale distributed systems and middleware, focusing recently on big data analytics (for instance, in the EU-funded FP7 project PLAY). She has published more than 70 peer-reviewed contributions in international journals and conferences.

The Multimedia & Web Science for Big Data specialization targets data analysts in multimedia, given the huge amount of multimedia data available on the web, and in particular on social media. The explosion of multimedia data (image, video, 3D, etc.) from mobile image captures, social sharing, the web, TV shows and movies, and the availability of large amounts of metadata have created unprecedented opportunities and fundamental challenges for multimedia analytics. These data are not just big in volume, but also unstructured and multi-modal. Students will acquire strong technical skills to design and implement new approaches and algorithms to process and make sense of the volumes of information that people and organizations need to deal with. A particular focus will be given to multimedia content available on the web, making the Web platform the place to mine for relevant information. Consequently, the specialization courses also cover advanced web technologies that allow the future data scientist to access, understand, categorize, and reason upon the web of data, including non-multimedia data whenever relevant.

Certified as a World Competitiveness Cluster in July 2005 and as a Regional Joint Innovation and Economic Development Cluster in 2007, the Secured Communicating Solutions (SCS) Cluster brings together players in the fields of microelectronics, software, telecommunications, services and uses of Information and Communications Technologies in the Provence-Alpes-Côte d'Azur region of France. A new workgroup including international companies such as HP and SAP was created within the SCS Cluster in 2013 to deal with Big Data, including Data Science opportunities, and UNS contributes to it. Students in this new Data Science master will benefit from UNS's closest industrial collaborators, located either in Europe’s leading technology park, Sophia-Antipolis, or abroad: groups such as Orange and its research branch Orange Labs, Amadeus, Akamai, and Thalès, all very active in web and multimedia-content delivery and applications; and SMEs such as ActiveEon, Alcméon, and Mnemotix, which offer technologies supporting data analytics.

Contact: Françoise Baude

Further information on EIT-Digital Master Program in Data Science at UNS can be found here.

UNS co-coordinators for the DSc exit point: Pr Lionel Fillatre, Pr Frédéric Précioso

I&E local coordinator: Cedric Ulmer

Mandatory courses
- Web of Data & Semantic Web (4 ECTS)
- Multimedia Data Processing (4 ECTS)
- Analysis, Mining and Indexing of Structured or Semi-structured Web Data, Images, and Videos in Big Multimedia Systems (2 ECTS)
- Programmable Web: Client and Server Side (4 ECTS)

Elective Courses (10 ECTS in total)
- Knowledge Engineering (2 ECTS)
- Security of Web Applications (2 ECTS)
- Security and Privacy 3.0 (2 ECTS)
- An algorithmic approach to truly distributed systems (2 ECTS)
- Middleware of internet of things (2 ECTS)
- Large scale distributed systems (2 ECTS)
- Software architecture for the cloud (2 ECTS)
- Virtualized infrastructures in cloud computing (2 ECTS)
- Group project in multimedia data management (6 ECTS)

Business Process Intelligence at TU/e

Wil van der Aalst

Prof. Dr. ir. Wil van der Aalst – Process Mining. The Architecture of Information Systems (AIS) research group at TU/e investigates methods, techniques and tools for the design and analysis of Process-Aware Information Systems (PAIS), i.e., systems that support business processes (workflows) inside and between organizations. The AIS group is generally seen as one of the strongest Business Process Management (BPM) groups in the world. According to Google Scholar, the chair of AIS has the highest H-index of all European computer scientists (an H-index of 108 with more than 50,000 citations), illustrating the impact of the research. This is also reflected by the widespread use of its open-source tools such as ProM and YAWL. The AIS group is the main group responsible for the Master of Business Information Systems and the EIT Master of Service Design and Engineering.

This specialization focuses on technologies for business analysis and prediction specifically focusing on process mining and its application in the domain of high tech systems, healthcare, visual analytics, spatial data handling, and software analytics (Big Software). Process mining is a relatively young research discipline that sits between computational intelligence and data mining on the one hand, and process modeling and analysis on the other hand. The idea of process mining is to discover, monitor and improve real processes (as opposed to assumed processes) by extracting knowledge from event logs readily available in today's information systems. Process mining includes automated process discovery, conformance checking, social network and organizational mining, automated construction of simulation models, model extension, model repair, case prediction, and history-based recommendations. In this specialisation students will acquire breadth and depth in the design, implementation, and use of data science instruments, with the emphasis on business problem solving in the context of business processes.

TU/e has excellent research groups in the data science area, which have combined forces in the Data Science Center Eindhoven (DSC/e). A central task of the DSC/e is to educate new generations of data scientists. The Netherlands, and specifically the Brainport region, is home to a number of leading technology companies. Our graduates have taken part in numerous internships at leading technology companies such as ASML, Philips Healthcare, NXP, FEI Company, TomTom, and DAF Trucks, and also at innovative startups.

The mandatory courses listed below teach business analytics and predictive modeling techniques that enable students to detect structures and relationships in large data sets and to build predictive models. Common techniques for this from fields such as applied statistics, data mining and artificial intelligence are discussed, as is multi-objective optimization of operational processes through nature-inspired meta-heuristics. This includes evolutionary computation techniques such as genetic/memetic algorithms, particle swarm optimization, and ant-colony optimization. Process mining techniques will not be limited to control-flow and will also include other perspectives such as bottleneck analysis, social network analysis, and decision mining. The elective courses enable students to acquire greater depth in high-tech systems, healthcare applications, spatial data handling, visual analytics, or software evolution.

Besides learning theoretical concepts, students will be exposed to event data from a variety of domains, including hospitals, insurance companies, governments, high-tech systems, etc. Assignments will focus on the analysis of such data sets and on a particular process mining problem. Application areas include but are not limited to hospital logistics optimization, software repository mining, predictive maintenance of healthcare equipment, visualisation of genomics data, and visual analytics for epidemiologists. Upon completion of this programme, graduates will possess a sound foundation to begin a career as a data scientist with a specialisation in process mining.

Contact: Wil van der Aalst

Further information on EIT Master Program-Data Science at TU/e can be found here.

For questions related to coordination of the program at TU/e, please contact Farideh Heidari

Further information on Data Science Center Eindhoven can be found here.

Mandatory courses (5 ECTS Each)

  • 2IMI35 Introduction to process mining (if needed)
  • 2MMS10 Probability and stochastics 1 (if needed)
  • 2IMS25 Principles of Data Protection
  • 2IMI20 Advanced process mining
  • 2IMI00 Seminar architecture of information systems, OR
  • 2IMW00 Seminar web engineering
  • 1ZS30 Innovation and entrepreneurship thesis
  • 2IMC00 Master project

Alternatives for “if needed” courses

  • 2IMA10 Advanced algorithms
  • 2IMW15 Web information retrieval and data mining

Electives at TU/e

  • 2IMV10 Visual computing project
  • 2IMA20 Algorithms for geographic data
  • 2IMM20 Statistical learning theory
  • 2IMV25 Interactive virtual environments
  • 2MMS30 Probability and stochastics 2
  • 2IMI30 Business process simulation
  • 2IMW30 Foundations of data mining
  • 2IMW10 Data engineering
  • 2IMV15 Simulation in Computer Graphics
  • 2IMW20 Database technology
  • 2DD23 Time-series analysis & forecasting

Distributed Systems and Data Mining for Really Big Data at KTH

Vladimir Vlassov

Vladimir Vlassov is an associate professor in Computer Systems at KTH. He worked as a visiting scientist and researcher at UMASS (2004) and at MIT, USA (1998). His current research interests include (but are not limited to) Big Data and data-intensive computing; autonomic computing; distributed and parallel computing. His research foci are data-intensive computing and stream processing; Cloud resource management; self-management of cloud-based services and applications; large-scale distributed systems. He has participated in a number of European projects on P2P, Clouds, autonomic computing and multi-core systems, including PEPITO (FP5), CoreGRID (FP6), Grid4All (FP6), SELFMAN (FP6), ENCORE (FP7), CLOMMUNITY (FP7), PaPP (FP7); and in a number of research projects funded by Swedish funding agencies, such as VINNOVA and SSF, and a project funded by NSF USA. He is (was) a co-supervisor of a number of PhD students at KTH. He is one of the coordinators of the Erasmus Mundus Joint Doctorate in Distributed Computing (EMJD-DC). He teaches courses on Concurrent programming (for KTH and industry), Network programming, and Data Mining.

The Distributed Systems & Data Mining for Really Big Data specialization focuses on providing students with analytical and programming skills to be able to efficiently build systems that manipulate and process Big Data. After completing the courses at KTH, the student will be able to effectively design and implement systems to parse data at whatever stage in the pipeline, from batch-oriented to real-time stream processing. Students will also be able to write efficient programs that extract useful information from Big Data. The student will acquire deeper skills in a data-mining subfield of his/her choice in areas such as Graph-based data analysis (of particular interest to our partner Spotify AB), unsupervised machine learning with Deep Learning, or data mining for streaming data (of particular interest to our partner Ericsson). Students will work with platforms such as Hadoop, Flink, Spark, GraphLab, Mahout, and H2O. We will also have many guest lectures from the many companies active in Big Data in the Stockholm region (listed below).

For the 2nd semester industry placement programme, we can offer students practical industrial experience in cooperation with Stockholm-based companies such as Ericsson, Spotify AB, Ltd, and Oracle (MySQL). We can also help finding research-oriented projects at Swedish ICT - SICS and Ericsson Research, including at the KTH-SICS Cloud Innovation Center (C!C). We also work closely with fast-moving start-ups, to whom we supply a regular stream of interns, including Recorded Future AB, Gavagai AB, Peerialism AB, and Several Nines AB. These companies already cooperate with KTH on Big Data related projects, and, together, we envisage future cooperation in the context of this Master's Programme.

Contact: Vladimir Vlassov

Period 1 (September - October)

  • ID2221 Data-Intensive Computing (7.5 ECTS)
  • ID2224 Networks in Data Science (7.5 ECTS)
  • (elective) ID2203 Distributed Systems, Advanced Course (7.5 ECTS)

Period 2 (November - December)

  • ID2222 Data Mining (7.5 ECTS)
  • ID2223 Scalable Machine Learning and Deep Learning (7.5 ECTS)

Period 3-4 (January - June) Master thesis project.

Design, Implementation, and Usage of Data Science Instruments at TUB

Volker Markl

Volker Markl is a Full Professor and Chair of the Database Systems and Information Management (DIMA) Group at TUB, as well as an adjunct status-only professor at the University of Toronto. His research interests include IaaS, new hardware architectures for information management, information integration, big data analytics, query processing, query optimization, data warehousing, electronic commerce, and pervasive computing. He has presented over 100 invited talks worldwide, authored over 50 research papers, has seven patent awards, has transferred technology into several commercial products, and advises several companies and startups. He is the Speaker and Principal Investigator of the Stratosphere Research Project that resulted in the Apache Flink Big Data Analytics System. He is also Speaker of the Berlin Big Data Center (BBDC), one of the first competence centers in Europe researching innovative technologies and applications around big data, with strong ties to industry and startups. Additionally, he currently serves as the Secretary of the VLDB Endowment and was recently elected as one of Germany's leading “Digital Minds” (Digitale Köpfe) by the German Informatics Society (GI).

The Design, Implementation, and Usage of Data Science Instruments specialization offers students training in three key areas, namely, scalable data management, data analysis and machine learning, as well as applications. We provide students with solid knowledge of data science instruments (i.e., methods, technologies, and systems) to empower them to tackle data science problems in science or business, in important application domains, such as Industrie 4.0 (ICT-based manufacturing), healthcare, energy, smart cities and smart spaces, or logistics. One example is the context of information marketplaces, where the aim is to contribute to information economies by provisioning, transforming, analyzing, augmenting, and reselling data along data value chains (e.g., for text, speech, or video data analytics in the media sector, sensor data for Industry 4.0, and other data-driven business intelligence use-cases). Examples of these instruments include Hadoop, Flink, Spark, and GraphLab. Students will learn how to use and enhance these open-source technologies, in addition to working with various closed-source technologies, in varying settings, such as graph mining or text mining. Our curriculum is technology-focused, but also addresses other data science dimensions, such as business models, legal issues, and societal aspects.

Today, graduates with data science skills are in great demand. We are routinely asked to recommend our graduates for immediate employment. Indeed, many of them have conducted internships at leading technology companies, such as Google, IBM, Oracle, SAP, and Twitter. Our strong ties to industry offer students numerous opportunities to pursue rewarding internships, where they will acquire hands-on experience and put their knowledge and skills into practice. These opportunities include DFKI (the German Research Center for Artificial Intelligence), Deutsche Telekom, SAP, Siemens, Trumpf, and innovative start-ups, such as Blue Yonder, Data Artisans, DataMarket, Internet Memory Research, Parstream, and Vico Research.

The mandatory courses listed below offer students the opportunity to obtain training in data management systems and big data (analytics) technologies. Furthermore, elective courses enable them to acquire greater depth in machine intelligence, database technology, speech processing, or cloud operations and attend seminars in machine learning, parallel data processing, signal processing, or big data analytics. Upon completion of this programme, graduates will possess a sound foundation to begin a career as data scientists.

Contact: Volker Markl

For questions related to coordination of the program at TUB, please contact Ralf-Detlef Kutsche.

Further information on EIT Digital Master Program-Data Science at TU Berlin can be found here.


  • Scalable Data Science: Systems and Methods, AIM-3 SDSSM (6 ECTS - previously called: Large Scale Data Processing and Analytics)
  • BDAPRO - Big Data Analytics Project (9 ECTS)
  • I&E Study (6 ECTS)

Plus elective / 9 ECTS

  • Cloud Computing - CC (6 ECTS)
  • Database Technology - DBT (6 ECTS)
  • Database Technology Lab: Implementation of a Database Engine - IDB-PRA (6 ECTS)
  • Heterogeneous and Distributed Information Systems - AIM-1 HDIS (6 ECTS)
  • Machine Learning I & Classical Topics in Machine Learning - ML I (9 ECTS)
  • Management of Data Streams - AIM-2 MDS (6 ECTS)
  • Speech Signal Processing and Speech Technology - SSPSC (6 ECTS)
  • Big Data Analytics Seminar - BDASEM (3 ECTS)

Details about the classes and all rules and regulations can be found here.

© 2010-2018 EIT Digital IVZW. All rights reserved.