An Introduction to Statistical Learning

with Applications in R

Author: Gareth James,Daniela Witten,Trevor Hastie,Robert Tibshirani

Publisher: Springer Science & Business Media

ISBN: 1461471389

Category: Mathematics

Page: 426

View: 9443


An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.

Encyclopedia of Bioinformatics and Computational Biology

ABC of Bioinformatics

Author: N.A

Publisher: Elsevier

ISBN: 0128114320

Category: Medical

Page: 3284

View: 3486


Encyclopedia of Bioinformatics and Computational Biology: ABC of Bioinformatics combines elements of computer science, information technology, mathematics, statistics and biotechnology, providing the methodology and in silico solutions to mine biological data and processes. The book covers Theory, Topics and Applications, with a special focus on Integrative –omics and Systems Biology. The theoretical, methodological underpinnings of BCB, including phylogeny are covered, as are more current areas of focus, such as translational bioinformatics, cheminformatics, and environmental informatics. Finally, Applications provide guidance for commonly asked questions. This major reference work spans basic and cutting-edge methodologies authored by leaders in the field, providing an invaluable resource for students, scientists, professionals in research institutes, and a broad swath of researchers in biotechnology and the biomedical and pharmaceutical industries. Brings together information from computer science, information technology, mathematics, statistics and biotechnology Written and reviewed by leading experts in the field, providing a unique and authoritative resource Focuses on the main theoretical and methodological concepts before expanding on specific topics and applications Includes interactive images, multimedia tools and crosslinking to further resources and databases

Probability and Statistics with R

Author: Maria Dolores Ugarte,Ana F. Militino,Alan T. Arnholt

Publisher: CRC Press

ISBN: 1466504404

Category: Mathematics

Page: 983

View: 6553


Cohesively Incorporates Statistical Theory with R Implementation Since the publication of the popular first edition of this comprehensive textbook, the contributed R packages on CRAN have increased from around 1,000 to over 6,000. Designed for an intermediate undergraduate course, Probability and Statistics with R, Second Edition explores how some of these new packages make analysis easier and more intuitive as well as create more visually pleasing graphs. New to the Second Edition Improvements to existing examples, problems, concepts, data, and functions New examples and exercises that use the most modern functions Coverage probability of a confidence interval and model validation Highlighted R code for calculations and graph creation Gets Students Up to Date on Practical Statistical Topics Keeping pace with today’s statistical landscape, this textbook expands your students’ knowledge of the practice of statistics. It effectively links statistical concepts with R procedures, empowering students to solve a vast array of real statistical problems with R. Web Resources A supplementary website offers solutions to odd exercises and templates for homework assignments while the data sets and R functions are available on CRAN.

Statistical Regression and Classification

From Linear Models to Machine Learning

Author: Norman Matloff

Publisher: CRC Press

ISBN: 1498710921

Category: Business & Economics

Page: 490

View: 6379


Statistical Regression and Classification: From Linear Models to Machine Learning takes an innovative look at the traditional statistical regression course, presenting a contemporary treatment in line with today's applications and users. The text takes a modern look at regression: * A thorough treatment of classical linear and generalized linear models, supplemented with introductory material on machine learning methods. * Since classification is the focus of many contemporary applications, the book covers this topic in detail, especially the multiclass case. * In view of the voluminous nature of many modern datasets, there is a chapter on Big Data. * Has special Mathematical and Computational Complements sections at ends of chapters, and exercises are partitioned into Data, Math and Complements problems. * Instructors can tailor coverage for specific audiences such as majors in Statistics, Computer Science, or Economics. * More than 75 examples using real data. The book treats classical regression methods in an innovative, contemporary manner. Though some statistical learning methods are introduced, the primary methodology used is linear and generalized linear parametric models, covering both the Description and Prediction goals of regression methods. The author is just as interested in Description applications of regression, such as measuring the gender wage gap in Silicon Valley, as in forecasting tomorrow's demand for bike rentals. An entire chapter is devoted to measuring such effects, including discussion of Simpson's Paradox, multiple inference, and causation issues. Similarly, there is an entire chapter of parametric model fit, making use of both residual analysis and assessment via nonparametric analysis. Norman Matloff is a professor of computer science at the University of California, Davis, and was a founder of the Statistics Department at that institution. His current research focus is on recommender systems, and applications of regression methods to small area estimation and bias reduction in observational studies. He is on the editorial boards of the Journal of Statistical Computation and the R Journal. An award-winning teacher, he is the author of The Art of R Programming and Parallel Computation in Data Science: With Examples in R, C++ and CUDA.

Probabilistic Graphical Models for Genetics, Genomics, and Postgenomics

Author: Christine Sinoquet,Raphaël Mourad

Publisher: OUP Oxford

ISBN: 0191019208

Category: Science

Page: 464

View: 2545


Nowadays bioinformaticians and geneticists are faced with myriad high-throughput data usually presenting the characteristics of uncertainty, high dimensionality and large complexity. These data will only allow insights into this wealth of so-called 'omics' data if represented by flexible and scalable models, prior to any further analysis. At the interface between statistics and machine learning, probabilistic graphical models (PGMs) represent a powerful formalism to discover complex networks of relations. These models are also amenable to incorporating a priori biological information. Network reconstruction from gene expression data represents perhaps the most emblematic area of research where PGMs have been successfully applied. However these models have also created renewed interest in genetics in the broad sense, in particular regarding association genetics, causality discovery, prediction of outcomes, detection of copy number variations, and epigenetics. This book provides an overview of the applications of PGMs to genetics, genomics and postgenomics to meet this increased interest. A salient feature of bioinformatics, interdisciplinarity, reaches its limit when an intricate cooperation between domain specialists is requested. Currently, few people are specialists in the design of advanced methods using probabilistic graphical models for postgenomics or genetics. This book deciphers such models so that their perceived difficulty no longer hinders their use and focuses on fifteen illustrations showing the mechanisms behind the models. Probabilistic Graphical Models for Genetics, Genomics and Postgenomics covers six main themes: (1) Gene network inference (2) Causality discovery (3) Association genetics (4) Epigenetics (5) Detection of copy number variations (6) Prediction of outcomes from high-dimensional genomic data. Written by leading international experts, this is a collection of the most advanced work at the crossroads of probabilistic graphical models and genetics, genomics, and postgenomics. The self-contained chapters provide an enlightened account of the pros and cons of applying these powerful techniques.

Modern Multivariate Statistical Techniques

Regression, Classification, and Manifold Learning

Author: Alan J. Izenman

Publisher: Springer Science & Business Media

ISBN: 9780387781891

Category: Mathematics

Page: 733

View: 5827


This is the first book on multivariate analysis to look at large data sets which describes the state of the art in analyzing such data. Material such as database management systems is included that has never appeared in statistics books before.

Statistical Methods for Quality Assurance

Basics, Measurement, Control, Capability, and Improvement

Author: Stephen B. Vardeman,J. Marcus Jobe

Publisher: Springer

ISBN: 038779106X

Category: Mathematics

Page: 437

View: 5282


This undergraduate statistical quality assurance textbook clearly shows with real projects, cases and data sets how statistical quality control tools are used in practice. Among the topics covered is a practical evaluation of measurement effectiveness for both continuous and discrete data. Gauge Reproducibility and Repeatability methodology (including confidence intervals for Repeatability, Reproducibility and the Gauge Capability Ratio) is thoroughly developed. Process capability indices and corresponding confidence intervals are also explained. In addition to process monitoring techniques, experimental design and analysis for process improvement are carefully presented. Factorial and Fractional Factorial arrangements of treatments and Response Surface methods are covered. Integrated throughout the book are rich sets of examples and problems that help readers gain a better understanding of where and how to apply statistical quality control tools. These large and realistic problem sets in combination with the streamlined approach of the text and extensive supporting material facilitate reader understanding. Second Edition Improvements Extensive coverage of measurement quality evaluation (in addition to ANOVA Gauge R&R methodologies) New end-of-section exercises and revised-end-of-chapter exercises Two full sets of slides, one with audio to assist student preparation outside-of-class and another appropriate for professors’ lectures Substantial supporting material Supporting Material Seven R programs that support variables and attributes control chart construction and analyses, Gauge R&R methods, analyses of Fractional Factorial studies, Propagation of Error analyses and Response Surface analyses Documentation for the R programs Excel data files associated with the end-of-chapter problem sets, most from real engineering settings