Online MS in Data Science Curriculum Overview
Our online MS in Data Science program offers a rigorous, interdisciplinary curriculum that equips you with marketable skills in data-centric problem-solving to drive strategic decision-making and optimize outcomes. The curriculum is delivered 100 percent online and can be completed in less than two years.
Explore the fundamental concepts of database management systems, including data models, SQL query language, implementation techniques, the management of unstructured and semi-structured data, and scientific data collections.
“Big data” refers to techniques for collecting, processing, analyzing, and acting on data at internet scale, with unprecedented speed, volume, and complexity. This course introduces the latest techniques and infrastructures developed for big data, including parallel and distributed database systems, map-reduce infrastructures, scalable platforms for complex data types, stream processing systems, and cloud-based computing. You’ll learn to apply common statistical and machine learning techniques to large data sets. Course content will be a blend of theory, algorithms, and practical, hands-on work.
This course focuses on the history, theory, and computational methods of artificial intelligence. Basic concepts covered include representation of knowledge and computational methods for reasoning. One or two application areas will be selected and studied from among these topics: expert systems, robotics, computer vision, natural language understanding, and planning.
This course provides an overview of methods by which computers can learn from data or experience and make decisions accordingly. Topics include supervised learning, unsupervised learning, reinforcement learning, and knowledge extraction from large databases with applications to science, engineering, and medicine. You’ll learn to recognize a problem as being appropriate for a machine learning solution and take steps to solve that problem with an applicable technique.
This course will focus on agents that must learn, plan, and act in complex, non-deterministic environments. We will cover the main theory and approaches of reinforcement learning (RL), along with common software libraries and packages used to implement and test RL algorithms. The course is a graduate seminar with assigned readings and discussions, and its content will be guided in part by the interests of the students. It will cover at least the first several chapters of the course textbook. Beyond that, we will move to more advanced and recent readings from the field (e.g., transfer learning and deep RL), with a focus on the practical successes and challenges of reinforcement learning.
A two-course, hands-on, project-based culmination to the program, in which students apply data science and analytic principles to the solution of a real-world problem. In the first course, students will perform requirements analysis, review available data sources, propose a solution strategy to the problem, and begin their analysis. The second course completes the analysis process and ends with a final report summarizing data gathered, analytic results, lessons learned, and opportunities for future study.
Advanced analysis in probabilistic systems with strong emphasis on theoretical methods. Development of analytical tools for the modeling and analysis of random phenomena with application to problems across a range of engineering and applied science disciplines. Probability theory, sample and event spaces, discrete and continuous random variables, conditional probability, expectations and conditional expectations, and derived distributions. Sums of random variables, moment generating functions, central limit theorem, and laws of large numbers. Statistical analysis methods including hypothesis testing, confidence intervals, and nonparametric methods.
A course in mathematical data science with an emphasis on theory. The course will also highlight important applications and students will have the opportunity to program some standard algorithms. The topics to be covered include principal component analysis, algorithms in numerical linear algebra, unsupervised clustering and density methods, nearest neighbor classifiers, supervised methods such as support vector machines and neural networks, and spectral graph theory, with applications in areas like image processing and network analysis.
A course on mathematical statistics. The emphasis is on theory, though there will also be many computations. Students will analyze problems of estimation, prediction, and inference given limited data. Major topics include parameter estimation, convergence of random variables, properties of estimators, statistical tests and confidence intervals, and nonparametric statistics.