The data wrangler's handbook : simple tools for powerful results / Kyle Banerjee.

Material type: Text
Publisher: Chicago : ALA Neal-Schuman, [2019]
Copyright date: ©2019
Description: 1 online resource (xx, 164 pages) : illustrations
Content type:
  • text
Media type:
  • computer
Carrier type:
  • online resource
ISBN:
  • 9780838919132
  • 0838919138
  • 0838919103
  • 9780838919118
  • 0838919111
  • 9780838919101
Additional physical formats: Print version: The data wrangler's handbook
DDC classification:
  • 005.74/3 23
LOC classification:
  • QA76.9.D26 B36 2019eb
Contents:
Cover -- Title Page -- Copyright Page -- Contents -- List of Figures and Tables -- Acknowledgments -- Introduction -- Chapter 1. Getting Started with the Command Line -- Finding the Command Line -- Mac -- Windows -- Meet the Command Line -- Chapter 2. Command Line Concepts -- Two Powerful Symbols -- Direct Output to a File (Greater than Symbol) -- Direct Output to Another Program (Pipe Symbol) -- Command Substitution -- Regular Expressions-The Swiss Army Knife for Data -- Literal Characters -- Special Characters -- Wildcard Characters -- Logical Operators -- Grouping -- Scripting
Chapter 3. Understanding Formats, by David Forero -- Chapter 4. Simplify Complicated Problems -- Isolating Specific Data Elements -- Converting Data into Formats That Are Easier to Work With -- Chapter 5. Delimited Text -- CSV (Comma Separated Values) -- Commas and Quotation Marks in CSV Files -- Multiline Fields in CSV Files -- Multivalued Fields in Delimited Files -- Chapter 6. XML -- So What Is XML, Really? -- What Makes XML So Useful? -- Why Is XML So Easy? -- DOM (Document Object Model) -- XPath -- XSLT (eXtensible Stylesheet Language Transformations) -- Working with Large XML Files
Working with Complex XML Files -- XmlStarlet -- Installing XmlStarlet -- Converting XML Documents -- Chapter 7. JSON (JavaScript Object Notation) -- Chapter 8. Scripting -- Variables -- Arguments -- Conditional Execution -- Loops -- Chapter 9. Solving Common Problems -- Viewing Large Files -- Locating Files That Contain Particular Data -- Finding Files with Specific Characteristics -- Working with Internal Metadata -- Working with APIs -- Combining Data from Different Sources -- Other Tasks -- Chapter 10. Conclusions -- One-Line Wonders -- Locating, Viewing, and Performing Basic File Operations
Combine Information from Multiple Files into a Single File -- Combine Three Files, Each Consisting of a Single Column, into a Three-Column Table -- Extract 1,000 Random Lines or Records from a File -- Find Files with Specific Characteristics -- Find All Lines in All Files in the Current Directory as Well as All Subdirectories Containing a Regular Expression -- Identify All Files in Current Directories and Subdirectories That Contain a Value -- List All Files in Current Directory and Subdirectories over a 100 MB in Order of Decreasing Size
List the Names, Pixel Dimensions, and File Sizes of All Files in the Current Directory and Subdirectories in Tab Delimited Format -- Print Line Number of File That Match Occurred On -- Split Large Files into Smaller Chunks with Each File Breaking on a Line -- View 200 Characters Starting at Position 385621 in a File -- View Lines 4369-4374 of a File -- Retrieving and Sending Information over a Network -- Retrieve a Document from the Web and Send It to a File -- Send an XML Document to an API Requiring HTTP Authentication -- Sorting, Counting, Deduplication, and File Comparison
Summary: "Data manipulation and analysis are far easier than you might imagine - in fact, using tools that come standard with your desktop computer, you can learn how to extract, manipulate, and analyze data (and metadata) of any size and complexity. In this handbook, data wizard Banerjee will familiarize you with easily digestible but powerful concepts that will enable you to feel confident working with data. With his expert guidance, you'll learn how to use a single-word command to sort files of any size by any criteria, identify duplicates, and perform numerous other common library tasks; understand data formats, delimited text and CSV files, XML, JSON, scripting, and other key components of data; undertake more sophisticated tasks such as comparing files, converting data from one format to another, reformatting values, combining data from multiple files, and communicating with APIs (Application Programming Interfaces); and save time and stress through simple techniques for transforming text, recognizing symbols that perform important tasks, a Regular Expression cheat sheet, a glossary, and other tools"-- Provided by publisher.
Holdings
Item type Home library Collection Call number Materials specified Status Date due Barcode
Electronic-Books Electronic-Books OPJGU Sonepat- Campus E-Books EBSCO Available

Includes bibliographical references and index.

Description based on print version record and CIP data provided by publisher; resource not viewed.

eBooks on EBSCOhost EBSCO eBook Subscription Academic Collection - Worldwide

O.P. Jindal Global University, Sonepat-Narela Road, Sonepat, Haryana (India) - 131001
