Natural language processing with Java and LingPipe cookbook : over 60 effective recipes to develop your natural language processing (NLP) skills quickly and effectively / Breck Baldwin, Krishna Dayanidhi.

Material type: Text
Series: Quick answers to common problems
Publisher: Birmingham : Packt Publishing, [2014]
Copyright date: ©2014
Description: 1 online resource
Content type:
  • text
Media type:
  • computer
Carrier type:
  • online resource
ISBN:
  • 1322348537
  • 9781322348537
  • 9781783284689
  • 1783284684
  • 1783284676
  • 9781783284672
Additional physical formats: Print version: Natural Language Processing with Java and LingPipe Cookbook.
DDC classification:
  • 006.35 23
LOC classification:
  • QA76.9.N38
Online resources:
Contents:
Cover; Copyright; Credits; About the Authors; About the Reviewers; www.PacktPub.com; Table of Contents; Preface;
Chapter 1: Simple Classifiers; Introduction; Deserializing and running a classifier; Getting confidence estimates from a classifier; Getting data from the Twitter API; Applying a classifier to a .csv file; Evaluation of classifiers -- the confusion matrix; Training your own language model classifier; How to train and evaluate with cross validation; Viewing error categories -- false positives; Understanding precision and recall; How to serialize a LingPipe object -- classifier example; Eliminate near duplicates with the Jaccard distance; How to classify sentiment -- simple version;
Chapter 2: Finding and Working with Words; Introduction; Introduction to tokenizer factories -- finding words in a character stream; Combining tokenizers -- lowercase tokenizer; Combining tokenizers -- stop word tokenizers; Using Lucene/Solr tokenizers; Using Lucene/Solr tokenizers with LingPipe; Evaluating tokenizers with unit tests; Modifying tokenizer factories; Finding words for languages without white spaces;
Chapter 3: Advanced Classifiers; Introduction; A simple classifier; Language model classifier with tokens; Naïve Bayes; Feature extractors; Logistic regression; Multithreaded cross validation; Tuning parameters in logistic regression; Customizing feature extraction; Combining feature extractors; Classifier-building life cycle; Linguistic tuning; Thresholding classifiers; Train a little, learn a little -- active learning; Annotation;
Chapter 4: Tagging Words and Tokens; Introduction; Interesting phrase detection; Foreground- or background-driven interesting phrase detection; Hidden Markov Models (HMM) -- part-of-speech; N-best word tagging; Confidence-based tagging; Training word tagging; Word-tagging evaluation; Conditional random fields (CRF) for word/token tagging; Modifying CRFs;
Chapter 5: Finding Spans in Text -- Chunking; Introduction; Sentence detection; Evaluation of sentence detection; Tuning sentence detection; Marking embedded chunks in a string -- sentence chunk example; Paragraph detection; Simple noun phrases and verb phrases; Regular expression-based chunking for NER; Dictionary-based chunking for NER; Translating between word tagging and chunks -- BIO codec; HMM-based NER; Mixing the NER sources; CRFs for chunking; NER using CRFs with better features;
Chapter 6: String Comparison and Clustering; Introduction; Distance and proximity -- simple edit distance; Weighted edit distance; The Jaccard distance; The Tf-Idf distance; Using edit distance and language models for spelling correction; The case restoring corrector; Automatic phrase completion; Single-link and complete-link clustering using edit distance; Latent Dirichlet allocation (LDA) for multitopic clustering;
Chapter 7: Finding Coreference Between Concepts/People; Introduction; Named entity coreference with a document; Adding pronouns to coreference
Summary: This book is for experienced Java developers with NLP needs, whether academics, industrialists, or hobbyists. A basic knowledge of NLP terminology will be beneficial.
Holdings
Item type: Electronic-Books
Home library: OPJGU Sonepat- Campus
Collection: E-Books EBSCO
Status: Available

Includes bibliographical references and index.

eBooks on EBSCOhost EBSCO eBook Subscription Academic Collection - Worldwide

O.P. Jindal Global University, Sonepat-Narela Road, Sonepat, Haryana (India) - 131001
