Elasticsearch for Hadoop : integrate Elasticsearch into Hadoop to effectively visualize and analyze your data /

Shukla, Vishal,

Elasticsearch for Hadoop : integrate Elasticsearch into Hadoop to effectively visualize and analyze your data / Vishal Shukla. - 1 online resource (1 volume) : illustrations. - Community experience distilled.

Includes index.

Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Setting Up Environment; Setting up Hadoop for Elasticsearch; Setting up Java; Setting up a dedicated user; Installing SSH and setting up the certificate; Downloading Hadoop; Setting up environment variables; Configuring Hadoop; Configuring core-site.xml; Configuring hdfs-site.xml; Configuring yarn-site.xml; Configuring mapred-site.xml; The format distributed filesystem; Starting Hadoop daemons; Setting up Elasticsearch; Downloading Elasticsearch; Configuring Elasticsearch; Installing Elasticsearch's Head plugin; Installing the Marvel plugin; Running and testing; Running the WordCount example; Getting the examples and building the job JAR file; Importing the test file to HDFS; Running our first job; Exploring data in Head and Marvel; Viewing data in Head; Using the Marvel dashboard; Exploring the data in Sense; Summary; Chapter 2: Getting Started with ES-Hadoop; Understanding the WordCount program; Understanding Mapper; Understanding the reducer; Understanding the driver; Using the old API -- org.apache.hadoop.mapred; Going real -- network monitoring data; Getting and understanding the data; Knowing the problems; Solution approaches; Approach 1 -- Preaggregate the results; Approach 2 -- Aggregate the results at query-time; Writing the NetworkLogsMapper job; Writing the mapper class; Writing Driver; Building the job; Getting the data into HDFS; Running the job; Viewing the Top N results; Getting data from Elasticsearch to HDFS; Understanding the Twitter dataset; Trying it yourself; Creating the MapReduce job to import data from Elasticsearch to HDFS; Writing the Tweets2Hdfs mapper; Running the example; Testing the job execution output; Summary; Chapter 3: Understanding Elasticsearch; Knowing Search and Elasticsearch; The paradigm mismatch; Index; Type; Document; Field; Talking to Elasticsearch; CRUD with Elasticsearch; Creating the document request;
Mappings; Data types; Create mapping API; Index templates; Controlling the indexing process; What is an inverted index?; The input data analysis; Removing stop words; Case insensitive; Stemming; Synonyms; Analyzers; Elastic searching; Writing search queries; The URI search; Matching all queries; The term query; The boolean query; The match query; The range query; The wildcard query; Filters; Aggregations; Executing the aggregation queries; The terms aggregation; Histograms; The range aggregation; The geo distance; Sub-aggregations; Try it yourself; Summary; Chapter 4: Visualizing Big Data Using Kibana; Setting up and getting started; Setting up Kibana; Setting up datasets; Try it out; Getting started with Kibana; Discovering data; Visualizing the data; The pie chart; The stacked bar chart; The date histogram with the stacked bar chart; The area chart; The split pie chart; The sun burst chart; The geographical chart; Trying it out; Creating dynamic dashboards; Summary

ISBN: 9781785282249 (electronic bk.); 1785282247 (electronic bk.)

Source: CL0500000675 Safari Books Online

Subjects--Uniform Titles:
Apache Hadoop.

Subjects--Topical Terms:
Information visualization.
Data mining.
Visualisation de l'information.
Exploration de données (Informatique)
COMPUTERS / Databases / Data Mining

Index Terms--Genre/Form:
Electronic books.

LC Class. No.: QA76.9.D5

Dewey Class. No.: 004.36