
Scikit-learn Cookbook : over 50 recipes to incorporate scikit-learn into every step of the data science pipeline, from feature extraction to model building and model evaluation / Trent Hauck.

By: Hauck, Trent
Material type: Text
Publisher: Birmingham, U.K. : Packt Publishing, 2014
Description: 1 online resource (1 volume) : illustrations
Content type:
  • text
Media type:
  • computer
Carrier type:
  • online resource
ISBN:
  • 9781783989492
  • 1783989491
Subject(s):
Genre/Form:
Additional physical formats:
  • Print version: Scikit-learn cookbook : over 50 recipes to incorporate scikit-learn into every step of the data science pipeline, from feature extraction to model building and model evaluation.
DDC classification:
  • 641.5 23
LOC classification:
  • Q325.5
Online resources:
Contents:
Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface
Chapter 1: Premodel Workflow -- Introduction; Getting sample data from external sources; Creating sample data for toy analysis; Scaling data to the standard normal; Creating binary features through thresholding; Working with categorical variables; Binarizing label features; Imputing missing values through various strategies; Using Pipelines for multiple preprocessing steps; Reducing dimensionality with PCA; Using factor analysis for decomposition; Kernel PCA for nonlinear dimensionality reduction; Using truncated SVD to reduce dimensionality; Decomposition to classify with DictionaryLearning; Putting it all together with Pipelines; Using Gaussian processes for regression; Defining the Gaussian process object directly; Using stochastic gradient descent for regression
Chapter 2: Working with Linear Models -- Introduction; Fitting a line through data; Evaluating the linear regression model; Using ridge regression to overcome linear regression's shortfalls; Optimizing the ridge regression parameter; Using sparsity to regularize models; Taking a more fundamental approach to regularization with LARS; Using linear methods for classification -- logistic regression; Directly applying Bayesian ridge regression; Using boosting to learn from errors
Chapter 3: Building Models with Distance Metrics -- Introduction; Using KMeans to cluster data; Optimizing the number of centroids; Assessing cluster correctness; Using MiniBatch KMeans to handle more data; Quantizing an image with KMeans clustering; Finding the closest objects in the feature space; Probabilistic clustering with Gaussian Mixture Models; Using KMeans for outlier detection; Using k-NN for regression
Chapter 4: Classifying Data with scikit-learn -- Introduction; Doing basic classifications with Decision Trees; Tuning a Decision Tree model; Using many Decision Trees -- random forests; Tuning a random forest model; Classifying data with Support Vector Machines; Generalizing with multiclass classification; Using LDA for classification; Working with QDA -- a nonlinear LDA; Using Stochastic Gradient Descent for classification; Classifying documents with Naïve Bayes; Label propagation with semi-supervised learning
Chapter 5: Post-model Workflow -- Introduction; K-fold cross validation; Automatic cross validation; Cross validation with ShuffleSplit; Stratified k-fold; Poor man's grid search; Brute force grid search; Using dummy estimators to compare results; Regression model evaluation; Feature selection; Feature selection on L1 norms; Persisting models with joblib; Index
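Several of the recipes listed above compose naturally; as a purely illustrative sketch (not code from the book), scaling data to the standard normal, reducing dimensionality with PCA, chaining steps with a Pipeline, and scoring with k-fold cross validation might be combined as follows. Note this uses the modern `sklearn.model_selection` module layout; the 2014 edition predates it and imported cross-validation helpers from the older `sklearn.cross_validation` module.

```python
# Illustrative combination of Chapter 1 and Chapter 5 recipe themes:
# scaling, PCA, Pipelines, and k-fold cross validation.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Chaining preprocessing with the classifier means each cross-validation
# fold refits the scaler and PCA on its own training split only,
# avoiding leakage from the held-out fold.
pipe = Pipeline([
    ("scale", StandardScaler()),   # scale data to the standard normal
    ("pca", PCA(n_components=2)),  # reduce dimensionality with PCA
    ("clf", LogisticRegression(max_iter=1000)),
])

scores = cross_val_score(pipe, X, y, cv=5)  # 5-fold cross validation
print(f"mean accuracy: {scores.mean():.2f}")
```

The same pipeline object can be passed to a grid search or persisted with joblib, mirroring the Chapter 5 recipes.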
Summary: If you're a data scientist already familiar with Python but not Scikit-Learn, or are familiar with other programming languages like R and want to take the plunge with the gold standard of Python machine learning libraries, then this is the book for you.
Holdings
  • Item type: Electronic-Books
  • Home library: Electronic-Books OPJGU Sonepat-Campus
  • Collection: E-Books EBSCO
  • Status: Available

"Quick answers to common problems."

Online resource; title from cover (Safari, viewed November 17, 2014).

Includes index.


eBooks on EBSCOhost EBSCO eBook Subscription Academic Collection - Worldwide


O.P. Jindal Global University, Sonepat-Narela Road, Sonepat, Haryana (India) - 131001

