This book reviews the latest techniques in exploratory data mining (EDM) for the analysis of data in the social and behavioral sciences to help researchers assess the predictive value of different combinations of variables in large data sets. Methodological findings and conceptual models that explain reliable EDM techniques for predicting and understanding various risk mechanisms are integrated throughout. Numerous examples illustrate the use of these techniques in practice. Contributors provide insight through hands-on experiences with their own use of EDM techniques in various settings. Readers are also introduced to the most popular EDM software programs. A related website at http://mephisto.unige.ch/pub/edm-book-supplement/ offers color versions of the book’s figures, a supplemental paper to chapter 3, and R commands for some chapters.
The results of EDM analyses can be perilous – they are often taken as predictions with little regard for cross-validating the results. This carelessness can be catastrophic in terms of money lost or patients misdiagnosed. This book addresses these concerns and advocates for the development of checks and balances for EDM analyses. Both the promises and the perils of EDM are addressed.
Editors McArdle and Ritschard taught the "Exploratory Data Mining" Advanced Training Institute of the American Psychological Association (APA). All contributors are top researchers from the US and Europe. The book is organized into two parts, methodology and applications. The techniques covered include decision, regression, and SEM tree models, growth mixture modeling, and time-based categorical sequential analysis. Some of the applications of EDM (and the corresponding data) explored include:
selection to college based on risky prior academic profiles
the decline of cognitive abilities in older persons
global perceptions of stress in adulthood
predicting mortality from demographics and cognitive abilities
risk factors during pregnancy and the impact on neonatal development
This book is intended as a reference for researchers, methodologists, and advanced students in the social and behavioral sciences, including psychology, sociology, business, econometrics, and medicine, who are interested in learning to apply the latest exploratory data mining techniques. Prerequisites include a basic class in statistics.
Table of Contents
Part I: Methodological Aspects
J.J. McArdle, Exploratory Data Mining Using Decision Trees in the Behavioral Sciences.
G. Ritschard, CHAID and Earlier Supervised Tree Methods.
J. Kopf, T. Augustin, C. Strobl, The potential of model-based recursive partitioning in the social sciences – Revisiting Ockham's Razor.
A.M. Brandmaier, Timo von Oertzen, J.J. McArdle, U. Lindenberger, Exploratory Data Mining with Structural Equation Model Trees.
G. Ritschard, F. Losa, P. Origoni, Validating Tree Descriptions of Women’s Labor Participation with Deviance-based Criteria.
G.A. Marcoulides, W. Leite, Exploratory Data Mining Algorithms for Conducting Searches in Structural Equation Modeling: A Comparison of Some Fit Criteria.
K.J. Grimm, N. Ram, M. P. Shiyko, L. L. Lo, A Simulation Study of the Ability of Growth Mixture Models to Uncover Growth Heterogeneity.
R. Piccarreta, C.H. Elzinga, Mining for Association between Life Course Domains.
G. Ritschard, R. Bürgin, M. Studer, Exploratory Mining of Life Event Histories.
Part II: Applications
C.A. Prescott, Clinical versus Statistical Prediction of Zygosity in Adult Twin Pairs: An Application of Classification Trees.
J.J. McArdle, Dealing with Longitudinal Attrition Using Logistic Regression and Decision Tree Analyses.
J.J. McArdle, Adaptive Testing of the Number Series Test Using Standard Approaches and a New Decision Tree Analysis Approach.
T. S. Paskus, Using EDM to Identify Academic Risk among College Student-Athletes in the United States.
S. B. Scott, B. R. Whitehead, C. S. Bergeman, and L. Pitzer, Understanding Global Perceptions of Stress in Adulthood through Tree-Based Exploratory Data Mining.
P. Ghisletta, Recursive Partitioning to Study Terminal Decline in the Berlin Aging Study.
Y. Zhou, K.M. Kadlec, J. J. McArdle, Predicting Mortality from Demographics and Specific Cognitive Abilities in the Hawaii Family Study of Cognition.
K. F. Widaman, K.J. Grimm, Exploratory Analysis of Effects of Prenatal Risk Factors on Intelligence in Children of Mothers with Phenylketonuria.
John J. McArdle is Senior Professor of Psychology at the University of Southern California where he heads the Quantitative Methods training program.
Gilbert Ritschard is Professor of Statistics and project leader at the Swiss National Center of Competence in Research LIVES.
"Data mining emerges from several tracks within quantitative methodology, and requires broad methodological background with outstanding computer skills. McArdle and Ritschard are exactly the right scholars to edit this volume, which includes fascinating and modern data mining research." – Joseph L. Rodgers, Vanderbilt University, USA
"The richness and volume of data available to behavioral scientists has increased dramatically, creating opportunities for new discoveries and improved prediction models. This timely and innovative volume describes and illustrates the use of new statistical strategies for probing large and complex data sets." – Rick H. Hoyle, Duke University, USA
"Deliberately ignoring the boundaries between separate quantitative traditions and different social and behavioural sciences, this book is an essential reading on the potential of 'big data' to change the way we study individuals, social relationships and societies." – Francesco C. Billari, University of Oxford, UK
“The combination between theoretical/methodological issues with the empirical applications is excellent. ... It offers a wide range of research examples cutting across disciplines, data types, and units of analysis. ... Readers will be able to grasp the problems presented, relate them to their own research ... and apply the tools ... to their own data sets. ... I am thinking about creating a course on exploratory data analysis and I can see adopting this volume for that course.” – Emilio Ferrer, University of California - Davis, USA
“[This] book will contribute significantly in making the field of Exploratory Data Mining more accessible to many researchers in the behavioral [and] ... social sciences, medicine, and business. ... Suitable for an advanced level research methods course ... I would strongly recommend it.” – Riyaz Sikora, University of Texas at Arlington, USA