


Gerald Friedland

Information-Driven Machine Learning: Data Science as an Engineering Discipline




Availability: only 1 copy available, buy now!

Price: €78.10
NICEPRICE: €74.19
Discount: 5%



This product qualifies for FREE SHIPPING when the Corriere Veloce option is selected at checkout.


Also payable with the Carta della cultura giovani e del merito, the 18App Bonus Cultura, and the Carta del Docente.





Details

Genre: Book
Language: English
Publisher: Springer
Publication: 12/2023
Edition: 1st ed. 2024





Description

This book moves beyond traditional machine learning approaches by introducing information measurement methodologies into the practice of data science.

Stemming from a UC Berkeley seminar on experimental design for machine learning tasks, these techniques aim to overcome the 'black box' approach of machine learning by reducing conjectures such as magic numbers (hyperparameters) or model-type bias. Information-based machine learning enables data quality measurements, a priori task complexity estimations, and reproducible design of data science experiments. The benefits include significantly smaller models, increased explainability, and enhanced resilience, all of which advance the discipline's robustness and credibility.
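
As a toy illustration of the "data quality measurement" idea mentioned above, the Shannon entropy of a label column gives a simple a priori baseline for task complexity. This is a minimal sketch in Python under our own assumptions; the function name and framing are illustrative and not taken from the book.

```python
import math
from collections import Counter

def label_entropy(labels):
    """Shannon entropy (in bits per row) of a sequence of class labels.

    For a classification task, this is a simple a priori baseline:
    a model must supply at least this much information per row to
    reproduce the labels without any help from the features.
    """
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A balanced binary task carries about 1 bit of label information per row.
print(label_entropy(["spam", "ham", "spam", "ham"]))  # 1.0
```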

While bridging the gap between machine learning and disciplines such as physics, information theory, and computer engineering, this textbook maintains an accessible and comprehensive style, making complex topics digestible for a broad readership. Information-Driven Machine Learning explores the interplay among these disciplines to enhance our understanding of data science modeling. Instead of solely focusing on the "how," this text provides answers to the "why" questions that permeate the field, shedding light on the underlying principles of machine learning processes and their practical implications. By advocating for systematic methodologies grounded in fundamental principles, this book challenges industry practices that have often evolved from ideological or profit-driven motivations. It addresses a range of topics, including deep learning, data drift, and MLOps, using fundamental principles such as entropy, capacity, and high dimensionality.
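
To make the entropy-and-capacity framing concrete, here is a hypothetical back-of-the-envelope comparison between a dataset's label information and a model's memorization budget. The function names and the 4,000-bit budget are invented for illustration; this is not the book's memory-equivalent capacity (MEC) formula.

```python
def dataset_bits(n_rows: int, bits_per_label: float) -> float:
    """Total information needed to reproduce every label outright,
    e.g. n_rows * label_entropy(labels) from the sketch above."""
    return n_rows * bits_per_label

def must_generalize(data_bits: float, capacity_bits: float) -> bool:
    """True if the labels carry more information than the model can
    store: memorization is then impossible, so any low training error
    implies compression, i.e. generalization."""
    return data_bits > capacity_bits

# Toy numbers: 10,000 balanced binary labels (~1 bit each) versus an
# assumed 4,000-bit memorization budget.
print(must_generalize(dataset_bits(10_000, 1.0), 4_000.0))  # True
```

The point of the sketch is the direction of the inequality: once a model cannot memorize its training data, fitting it forces compression.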

Ideal for both academia and industry professionals, this textbook serves as a valuable tool for those seeking to deepen their understanding of data science as an engineering discipline. Its thought-provoking content stimulates intellectual curiosity and caters to readers who desire more than just code or ready-made formulas. The text invites readers to explore beyond conventional viewpoints, offering an alternative perspective that promotes a big-picture view for integrating theory with practice. Suitable for upper undergraduate or graduate-level courses, this book can also benefit practicing engineers and scientists in various disciplines by enhancing their understanding of modeling and helping them measure their data more effectively.






Table of Contents

Preface

1 Introduction

1.1 Science

1.2 Data Science

1.3 Information Measurements

1.4 Exercises

1.5 Further Reading

2 The Automated Scientific Process

2.1 The Role of the Human

2.1.1 Curiosity

2.1.2 Data Collection

2.1.3 The Data Table

2.2 Automated Model Building

2.2.1 The Finite State Machine

2.2.2 How Machine Learning Generalizes

2.3 Exercises

2.4 Further Reading

3 The (Black Box) Machine Learning Process

3.1 Types of Tasks

3.1.1 Unsupervised Learning

3.1.2 Supervised Learning

3.2 Black Box Machine Learning Process

3.2.1 Training/Validation Split

3.2.2 Independent but Identically Distributed

3.3 Types of Models

3.3.1 Nearest Neighbors

3.3.2 Linear Regression

3.3.3 Decision Trees

3.3.4 Random Forests

3.3.5 Neural Networks

3.3.6 Support Vector Machines

3.3.7 Genetic Programming

3.4 Error Metrics

3.4.1 Binary Classification

3.4.2 Detection

3.4.3 Multi-class Classification

3.4.4 Regression

3.5 The Information-based Machine Learning Process

3.6 Exercises

3.7 Further Reading

4 Information Theory

4.1 Probability, Uncertainty, Information

4.1.1 Chance and Probability

4.1.2 Probability Space

4.1.3 Uncertainty and Entropy

4.1.4 Information

4.2 Minimum Description Length

4.3 Information in Curves

4.4 Information in a Table

4.5 Exercises

4.6 Further Reading

5 Capacity

5.1 Intellectual Capacity

5.1.1 Minsky’s Criticism

5.1.2 Cover’s Solution

5.1.3 MacKay’s Viewpoint

5.2 Memory-equivalent Capacity of a Model

5.3 Exercises

5.4 Further Reading

6 The Mechanics of Generalization

6.1 Logic Definition of Generalization

6.2 Translating a Table into a Finite State Machine

6.3 Generalization as Compression

6.4 Resilience

6.5 Adversarial Examples

6.6 Exercises

6.7 Further Reading

7 Meta-Math: Exploring the Limits of Modeling

7.1 Algebra

7.1.1 Garbage In, Garbage Out

7.1.2 Randomness

7.1.3 Transcendental Numbers

7.2 No Rule without Exception

7.2.1 Compression by Association

7.3 Correlation vs Causality

7.4 No Free Lunch

7.5 All Models are Wrong

7.6 Exercises

7.7 Further Reading

8 Capacity of Neural Networks

8.1 Memory-equivalent Capacity of Neural Networks

8.2 Upper-bounding the MEC Requirement of a Neural Network given Training Data

8.3 Topological Concerns

8.4 MEC for Regression Networks

8.5 Exercises

8.6 Further Reading

9 Neural Network Architectures

9.1 Deep Learning and Convolutional Neural Networks

9.1.1 Convolutional Neural Networks

9.1.2 Residual Networks

9.2 Generative Adversarial Networks

9.3 Autoencoders

9.4 Transformers

9.4.1 Architecture

9.4.2 Self-Attention Mechanism

9.4.3 Positional Encoding

9.4.4 Example Transformation

9.4.5 Applications and Limitations

9.5 The Role of Neural Architectures

9.6 Exercises

9.7 Further Reading

10 Capacities of some other Machine Learning Methods

10.1 k-Nearest Neighbors

10.2 Support Vector Machines

10.3 Decision Trees

10.3.1 Converting a Table into a Decision Tree

10.3.2 Decision Trees

10.3.3 Generalization of Decision Trees

10.3.4 Ensembling

10.4 Genetic Programming

10.5 Unsupervised Methods

10.5.1 k-means Clustering

10.5.2 Hopfield Networks

10.6 Exercises

10.7 Further Reading

11 Data Collection and Preparation

11.1 Data Collection and Annotation

11.2 Task Definition

11.3 Well-Posedness

11.3.1 Chaos and how to avoid it

11.3.2 Forcing Well-Posedness

11.4 Tabularization

11.4.1 Table Data

11.4.2 Time-Series Data

11.4.3 Natural Language and other Varying-Dependency Data

11.4.4 Perceptual Data

11.4.5 Multimodal Data

11.5 Data Validation

11.5.1 Hard Conditions

11.5.2 Soft Conditions

11.6 Numerization

11.7 Imbalanced Data

11.7.1 Extension beyond simple Accuracy

11.8 Exercises

11.9 Further Reading

12 Measuring Data Sufficiency

12.1 Dispelling a Myth

12.2 Capacity Progression

12.3 Equilibrium Machine Learner

12.4 Data Sufficiency Using the Equilibrium Machine Learner

12.5 Exercises

12.6 Further Reading

13 Machine Learning Operations

13.1 What makes a predictor production-ready?

13.2 Quality Assurance for Predictors

13.2.1 Traditional Unit Testing

13.2.2 Synthetic Data Crash Tests

13.2.3 Data Drift Test

13.2.4 Adversarial Examples Test

13.2.5 Regression Tests

13.3 Measuring Model Bias

13.3.1 Where does the bias come from?

13.4 Security and Privacy

13.5 Exercises

13.6 Further Reading

14 Explainability

14.1 Explainable to Whom?

14.2 Occam’s Razor Revisited

14.3 Attribute Ranking: Finding what Matters

14.4 Heatmapping

14.5 Instance-based Explanations

14.6 Rule Extraction

14.6.1 Visualizing Neurons and Layers

14.6.2 Local Interpretable Model-agnostic Explanations (LIME)

14.7 Future Directions

14.7.1 Causal Inference

14.7.2 Interactive Explanations

14.7.3 Explainability Evaluation Metrics

14.8 Fewer Parameters

14.9 Exercises

14.10 Further Reading

15 Repeatability and Reproducibility

15.1 Traditional Software Engineering

15.2 Why Reproducibility Matters

15.3 Reproducibility Standards

15.4 Achieving Reproducibility

15.5 Beyond Reproducibility

15.6 Exercises

15.7 Further Reading

16 The Curse of Training and the Blessing of High Dimensionality

16.1 Training is Difficult

16.1.1 Common Workarounds

16.2 Training in Logarithmic Time

16.3 Building Neural Networks Incrementally

16.4 The Blessing of High Dimensionality

16.5 Exercises

16.6 Further Reading

17 Machine Learning and Society

17.1 Societal Reaction: The Hype Train, Worship, or Fear

17.2 Some Basic Suggestions from a Technical Perspective

17.2.1 Understand Technological Diffusion and Allow Society Time to Adapt

17.2.2 Measure Memory-Equivalent Capacity (MEC)

17.2.3 Focus on Smaller, Task-Specific Models

17.2.4 Organic Growth of Large-Scale Models from Small-Scale Models

17.2.5 Measure and Control Generalization to solve Copyright Issues

17.2.6 Leave Decisions to qualified Humans

17.3 Exercises

17.4 Further Reading

Appendix A Recap: The Logarithm

Appendix B More on Complexity

Appendix C Concepts Cheat Sheet

Appendix D A Review Form that Promotes Reproducibility

List of illustrations

Bibliography





About the Author

Gerald Friedland: Listed in the AI2000 Most Influential Scholar list as one of the top-cited research scholars in AI over the last decade, Friedland has made substantial and enduring contributions to machine learning since he started working in the field in 2001. His Simple Interactive Object Extraction algorithm has been part of open-source image editing and creation tools since 2005, and his cloud-less MOVI Speech Recognition board has been used by makers since 2015. Currently, he is adjunct faculty at the University of California, Berkeley, a Faculty Fellow of the Berkeley Institute of Data Science, and a Principal Scientist on the SageMaker team at Amazon AWS.

After earning his Ph.D. from Freie Universität Berlin in 2006, Gerald led a team of researchers in speech and multimedia content analysis as Director of Audio and Multimedia Research at the International Computer Science Institute in Berkeley. He then served as Principal Data Scientist at Lawrence Livermore National Lab from 2016 to 2019. In 2019, he co-founded Brainome, Inc., where he drew on his technical expertise to develop an automatic machine learning tool rooted in the information measurement techniques central to this book. He joined Amazon AWS in 2022 as a Principal Scientist for AutoML.

Beyond his industry and academic roles, Gerald is a seasoned author. His published work ranges from the textbooks Multimedia Computing (Cambridge University Press) and Multimodal Location Estimation of Videos and Images (Springer) to a programming book for young children published by Apress.












Additional Information

ISBN: 9783031394768
Condition: New
Dimensions: 235 x 155 mm, weight 606 g
Format: Hardcover
Illustration Notes: XXII, 267 p., 50 illus., 33 illus. in color.
Arabic-numbered Pages: 267
Roman-numbered Pages: xxii

