This work is for cross-disciplinary researchers and practitioners from the data mining disciplines, the life sciences, and healthcare domains. It describes the latest research results and best practices in analyzing and converting biological, biomedical, and clinical data into useful knowledge through biological data mining. The data mining techniques described are designed to tackle data analysis challenges such as noisy and incomplete data and the integration of various data sources. Chapters are grouped into sections on sequence analysis; biological network mining; classification, trend analysis, and 3D medical images; and biomedical applications of text mining. Each chapter begins with an introduction to a specific class of data mining techniques, written in a tutorial style accessible to non-computational readers such as biologists and healthcare researchers. This is followed by a detailed case study on how to use data mining techniques in a real-world biological or clinical application. Specific topics include mining genomic sequence data, automated mining of disease-specific protein interaction networks based on biomedical literature, and indexing for similarity queries on biological networks. The book includes b&w images. Annotation ©2014 Ringgold, Inc., Portland, OR (protoview.com)
Biologists are stepping up their efforts to understand the biological processes that underlie disease pathways in clinical contexts. This has resulted in a flood of biological and clinical data, ranging from genomic and protein sequences, DNA microarrays, protein interactions, and biomedical images to disease pathways and electronic health records. Exploiting these data to discover new knowledge that can be translated into clinical applications requires overcoming fundamental data analysis difficulties. Practical issues such as handling noisy and incomplete data, processing compute-intensive tasks, and integrating various data sources are new challenges faced by biologists in the post-genome era. This book covers the fundamentals of state-of-the-art data mining techniques designed to handle such challenging data analysis problems, and demonstrates with real applications how biologists and clinical scientists can employ data mining to make meaningful observations and discoveries from a wide array of heterogeneous data, from molecular biology to the pharmaceutical and clinical domains.
Sample Chapter(s)
Chapter 1: Mining the Sequence Databases for Homology Detection: Application to Recognition of Functions of Trypanosoma brucei brucei Proteins and Drug Targets (453 KB)