Prerequisites
Data mining is a broad field that combines techniques from different areas in computer science and statistics. Our model curriculum assumes that students have basic background knowledge in the following areas:
Database Systems:
Data models, query languages, SQL, conceptual database design, query processing, and transaction processing.
Statistics:
Basic probability, expectation, distributions, hypothesis tests, ANOVA, and estimation of distribution parameters.
Linear Algebra:
Vectors and matrices, vector spaces, bases, matrix inversion, and solving systems of linear equations.
Algorithms and Data Structures:
We assume familiarity with basic data structures and enough maturity to understand algorithms written in pseudocode.
We believe that most computer science seniors have covered this material in previous courses or can pick up missing material through self-study; otherwise, the course instructor can introduce it as necessary.