Managing Datasets and Models, PDF eBook

Please note: eBooks can only be purchased with a UK-issued credit card, and all our eBooks (ePub and PDF) are DRM protected.

Description

This book contains a fast-paced introduction to data-related tasks in preparation for training models on datasets.

It presents a step-by-step, Python-based code sample that uses the k-nearest neighbors (kNN) algorithm to manage a model on a dataset. Chapter One begins with an introduction to datasets and the issues that can arise with them, followed by Chapter Two on outliers and anomaly detection.

Chapter Three explores ways to handle missing and invalid data, and Chapter Four demonstrates how to train models with classification algorithms.

Chapter Five introduces visualization toolkits such as Sweetviz, Skimpy, Matplotlib, and Seaborn, along with some simple Python-based code samples that render charts and graphs.

An appendix includes some basics on using awk. Companion files with code, datasets, and figures are available for downloading.

Features:

Covers extensive topics related to cleaning datasets and working with models

Includes Python-based code samples and a separate chapter on Matplotlib and Seaborn

Features companion files with source code, datasets, and figures from the book
