Data mining is a field that uses advanced algorithms to reveal hidden yet valuable insights buried within large datasets. These algorithms are drawn from areas such as machine learning, artificial intelligence, pattern recognition, statistics, and database systems, which work together to support a deeper understanding and analysis of data.
This course, ISA5810: Data Mining: Concepts, Techniques, and Applications, is designed to equip you with the foundational knowledge and hands-on experience needed to delve into the expansive world of data mining. Whether you are looking to enhance your skill set or embark on a new career path, this course will serve as a stepping stone to achieving your goals.
The curriculum introduces the core concepts and techniques of data mining; the topics are outlined in the schedule below.
Supporting this course
She has offered the fundamental database course and advanced database courses for more than a decade. Her current research interests are social networks, data mining, emotion analysis, and web intelligence.
Orientation: 9/2 for 3 hours
During the orientation session, you will get to know the course structure, meet your instructor, and connect with fellow classmates, laying the ground for a collaborative and engaging learning environment. We will also give a comprehensive overview of the course content.
Overview and Data: 9/9, 9/16 for 6 hours
Understanding and preparing data is a pivotal phase of the data mining process. This session begins with an introduction to the attribute types and characteristics found in datasets, then moves on to the data preprocessing techniques that effective analysis depends on. Next, a range of similarity and distance measures is explored; these are the basic tools for discerning patterns and trends within the data. The session concludes with data visualization, a powerful aid for representing and interpreting complex data structures intuitively.
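To make the idea of similarity and distance measures concrete, here is a minimal sketch of three common ones: Euclidean distance, cosine similarity, and Jaccard similarity. It uses only the Python standard library, and the function names and toy records are assumptions made for illustration, not material from the course.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; close to 1 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def jaccard_similarity(s, t):
    """Share of items common to both sets, out of all items in either set."""
    s, t = set(s), set(t)
    if not s and not t:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(s & t) / len(s | t)

# Hypothetical toy records, invented for this example.
p = [1.0, 2.0, 3.0]
q = [2.0, 4.0, 6.0]
print(euclidean_distance(p, q))
print(cosine_similarity(p, q))
print(jaccard_similarity({"milk", "bread"}, {"bread", "eggs"}))
```

Note that `p` and `q` point in the same direction, so their cosine similarity is essentially 1 even though their Euclidean distance is not zero. Which measure is appropriate depends on whether magnitude or direction matters for the data at hand.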
Lab for Data Exploration and Management: 9/23
During this lab session, the emphasis is on using scientific computing libraries to process, transform, and manage data. Participants will also be introduced to common practices and current visualization tools for effective big data analysis.
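As a rough illustration of what processing and transforming data can involve, the sketch below fills missing values and rescales a numeric column. It deliberately uses only the Python standard library rather than a scientific computing library, and the helper names and sample values are assumptions made for this example, not the lab's actual material.

```python
import statistics

def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # a constant column carries no spread
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

# Hypothetical column with one missing entry.
ages = [22, None, 35, 41, 29]
clean = fill_missing(ages)
print(clean)
print(min_max_scale(clean))
print(z_score(clean))
```

In practice the same steps are usually a single call each in a data-frame library; the plain-Python version is shown only to keep the arithmetic visible.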
Classification: 9/30, 10/7 for 6 hours
Classification, a core form of supervised learning, is a focal point of data mining and machine learning. The primary objective is to assign input data to predefined classes. This session explores the key classification algorithms. It begins with Decision Trees, which use a tree-like structure of tests on attributes to reach a decision. It then turns to Bayesian Networks, which model the statistical relationships between variables in order to infer probabilities and make informed predictions. The focus then shifts to Neural Networks, which are adept at learning complex patterns, and the session concludes with an overview of Convolutional Neural Networks (CNNs), which are widely used in image and video recognition. The session aims to impart a solid understanding of the core principles and subtleties of classification, giving participants the skills they need in data mining projects.
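The way a decision tree chooses one of its tests can be sketched in a few lines. The example below scores every candidate threshold on a single numeric attribute by weighted Gini impurity and keeps the best one. Gini impurity is only one of several splitting criteria in common use, and the function names and toy data are assumptions made for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure group, larger for more mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' test.

    The threshold is None when no split improves on the unsplit data.
    """
    n = len(values)
    best = (None, gini(labels))
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best[1]:
            best = (threshold, weighted)
    return best

# Hypothetical toy data: hours studied and exam outcome.
hours = [1, 2, 3, 6, 7, 8]
passed = ["no", "no", "no", "yes", "yes", "yes"]
print(best_split(hours, passed))
```

A full tree learner would apply this search to every attribute and then recurse on the two resulting groups until they are pure or a stopping rule is met.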
Text Mining: 10/14, 10/21 for 6 hours
Text mining extracts essential insights from unstructured textual data, commonly using Natural Language Processing (NLP) techniques such as lexical analysis, syntactic analysis, and inference methods. This session discusses the Word2Vec algorithm, which maps words to vectors so that relationships between words are reflected in the vector space. It also introduces Transformers, which enable efficient sequence processing, and Large Language Models, known for their text generation and comprehension capabilities. A segment on ChatGPT illustrates its role in modern applications such as chatbots and content creation, underscoring current innovations in the text mining domain.
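One way to see where Word2Vec's word relationships come from is to look at its training data. In the skip-gram variant, each word is paired with the words that appear within a small window around it, and the model learns vectors that predict those neighbours. The sketch below generates such pairs only; it does not train any vectors, and the function name, window size, and sample sentence are assumptions made for illustration.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for tokens within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

# Hypothetical sentence, tokenized by whitespace.
sentence = "data mining reveals hidden patterns".split()
for pair in skipgram_pairs(sentence, window=1):
    print(pair)
```

Words that keep appearing with the same neighbours end up with similar training pairs, which is why they tend to receive nearby vectors.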
Lab for Deep Information Retrieval and Neural Word Embeddings: 10/28 for 3 hours
This lab session centers on hands-on practice, guiding participants through the use of information retrieval techniques for modeling, training, and classifying textual data. It offers practical exposure to neural embedding models such as word2vec, doc2vec, and FastText. Participants will also work with traditional text classification approaches such as KNN, SVM, and Naive Bayes, giving a practice-oriented understanding of the range of techniques used in the field.
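As a small example of one of the traditional approaches named above, the sketch below classifies a short text with k-nearest neighbours (KNN) over bag-of-words vectors compared by cosine similarity. It is a minimal standard-library illustration, not the lab's code; the function names, the whitespace tokenizer, and the tiny training set are all assumptions.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Count lowercase whitespace-separated tokens."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(train, text, k=3):
    """Label `text` by majority vote among its k most similar training texts."""
    query = bag_of_words(text)
    ranked = sorted(
        train,
        key=lambda item: cosine(bag_of_words(item[0]), query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled examples, invented for this sketch.
train = [
    ("great movie loved the acting", "positive"),
    ("wonderful film great story", "positive"),
    ("loved this wonderful cast", "positive"),
    ("terrible movie boring plot", "negative"),
    ("boring film awful acting", "negative"),
    ("awful story terrible cast", "negative"),
]
print(knn_predict(train, "great story loved the film"))
```

Swapping the bag-of-words vectors for learned embeddings would leave the voting logic unchanged, which is one reason KNN is a convenient baseline.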
DM Clustering & Project Progress Report: 11/4, 11/11 for 6 hours
Cluster analysis groups objects so that those within the same cluster are more similar to each other than to those in other clusters. Initially adopted in pattern recognition and signal processing, clustering has since spread to many other domains. This session takes a deep dive into a range of clustering techniques: K-Means for partitioning; Hierarchical Clustering, which forms a tree of clusters; Density-Based Clustering, which groups together points that lie sufficiently close to one another; and Cluster Validity, which assesses the quality and reliability of the clusters formed. The discussion aims to give attendees a robust understanding of these clustering algorithms and their practical applications.
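The partitioning idea behind K-Means can be sketched as two alternating steps: assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. The version below is a minimal standard-library illustration of that loop (often called Lloyd's algorithm). The starting centroids are passed in by hand to keep the run deterministic, and the function name and toy points are assumptions made for this example.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, clusters

# Hypothetical 2-D points forming two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

Real implementations usually pick the starting centroids randomly or with a seeding heuristic and run several restarts, since the result depends on the initial positions.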
Association Rules: 11/18, 11/25 for 3 hours
Association rule learning identifies meaningful relationships between variables in large datasets, using measures of interestingness, such as confidence, to pinpoint strong rules. This session provides a succinct introduction to the core concepts of association rules, along with an overview of the Frequent Pattern Growth algorithm and key techniques for Pattern Evaluation. Participants will be equipped to apply these techniques effectively in real-world scenarios.
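To show how the strength of a rule is measured, the sketch below computes three widely used metrics for a rule of the form antecedent → consequent: support, confidence, and lift. It evaluates a single given rule by direct counting and does not implement Frequent Pattern Growth; the function names and the toy transactions are assumptions made for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    base = support(transactions, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs, so confidence is undefined")
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base

def lift(transactions, antecedent, consequent):
    """Confidence relative to the consequent's baseline frequency.

    Values above 1 suggest the two sides occur together more often than
    they would if they were independent.
    """
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )

# Hypothetical market-basket data, invented for this example.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
rule = ({"diapers"}, {"beer"})
print(support(transactions, rule[0] | rule[1]))
print(confidence(transactions, *rule))
print(lift(transactions, *rule))
```

Frequent-pattern algorithms exist to find the itemsets worth scoring without enumerating every combination; once candidates are found, they are evaluated with measures like these.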
Examination: 12/2 for 3 hours
Time to evaluate. Unlike other examinations, this one is not meant to assess how much you remember; it is more important to know how much you understand. Hence, each student may bring one A4 page with any kind of notes into the classroom. Enjoy.
Student Presentation & Discussion: 12/9 for 3 hours
Participants will explore a specified paper together using the Jigsaw reading approach. Each student will be responsible for understanding one section of the paper in depth, with the goal of explaining it to the other group members. This encourages deep individual comprehension of the material and also fosters a collaborative learning environment in which helping group members grasp complex concepts is paramount. It’s a step towards a learning community where knowledge is shared and amplified through discussion.
Final Project Demo: 12/16 for 3 hours
The final project demonstration is the culmination of what you have learned, analyzed, and built, and it stands as a testament to your grasp of the data mining principles covered in this course. Through the project, participants will also gain valuable experience in collaborative teamwork.