Natural language annotation for machine learning / (Record no. 44303)

MARC details
000 -LEADER
fixed length control field	06287cam a22002177a 4500
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
ISBN	9781449306663 (pbk.)
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
ISBN	9789351103738
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number	006.35
Item number	PUS
100 1# - MAIN ENTRY--AUTHOR NAME
Personal name	Pustejovsky, J.
245 10 - TITLE STATEMENT
Title	Natural language annotation for machine learning /
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT)
Place of publication	Mumbai :
Name of publisher	Shroff Publishers & Distr,
Year of publication	2013.
300 ## - PHYSICAL DESCRIPTION
Number of Pages	xiv, 324 p. ;
500 ## - GENERAL NOTE
General note	Includes bibliographical references (p. 306-315) and index.
505 ## - FORMATTED CONTENTS NOTE
Formatted contents note	Machine generated contents note: The Importance of Language Annotation -- The Layers of Linguistic Description -- What Is Natural Language Processing? -- A Brief History of Corpus Linguistics -- What Is a Corpus? -- Early Use of Corpora -- Corpora Today -- Kinds of Annotation -- Language Data and Machine Learning -- Classification -- Clustering -- Structured Pattern Induction -- The Annotation Development Cycle -- Model the Phenomenon -- Annotate with the Specification -- Train and Test the Algorithms over the Corpus -- Evaluate the Results -- Revise the Model and Algorithms -- Summary -- Defining Your Goal -- The Statement of Purpose -- Refining Your Goal: Informativity Versus Correctness -- Background Research -- Language Resources -- Organizations and Conferences -- NLP Challenges -- Assembling Your Dataset -- The Ideal Corpus: Representative and Balanced -- Collecting Data from the Internet -- Eliciting Data from People -- The Size of Your Corpus -- Existing Corpora -- Distributions Within Corpora -- Summary -- Basic Probability for Corpus Analytics -- Joint Probability Distributions -- Bayes Rule -- Counting Occurrences -- Zipf's Law -- N-grams -- Language Models -- Summary -- Some Example Models and Specs -- Film Genre Classification -- Adding Named Entities -- Semantic Roles -- Adopting (or Not Adopting) Existing Models -- Creating Your Own Model and Specification: Generality Versus Specificity -- Using Existing Models and Specifications -- Using Models Without Specifications -- Different Kinds of Standards -- ISO Standards -- Community-Driven Standards -- Other Standards Affecting Annotation -- Summary -- Metadata Annotation: Document Classification -- Unique Labels: Movie Reviews -- Multiple Labels: Film Genres -- Text Extent Annotation: Named Entities -- Inline Annotation -- Stand-off Annotation by Tokens -- Stand-off Annotation by Character Location -- Linked Extent Annotation: Semantic Roles -- ISO Standards and You -- Summary -- The Infrastructure of an Annotation Project -- Specification Versus Guidelines -- Be Prepared to Revise -- Preparing Your Data for Annotation -- Metadata -- Preprocessed Data -- Splitting Up the Files for Annotation -- Writing the Annotation Guidelines -- Example 1: Single Labels-Movie Reviews -- Example 2: Multiple Labels-Film Genres -- Example 3: Extent Annotations-Named Entities -- Example 4: Link Tags-Semantic Roles -- Annotators -- Choosing an Annotation Environment -- Evaluating the Annotations -- Cohen's Kappa (K) -- Fleiss's Kappa (K) -- Interpreting Kappa Coefficients -- Calculating K in Other Contexts -- Creating the Gold Standard (Adjudication) -- Summary -- What Is Learning? -- Defining Our Learning Task -- Classifier Algorithms -- Decision Tree Learning -- Gender Identification -- Naive Bayes Learning -- Maximum Entropy Classifiers -- Other Classifiers to Know About -- Sequence Induction Algorithms -- Clustering and Unsupervised Learning -- Semi-Supervised Learning -- Matching Annotation to Algorithms -- Testing Your Algorithm -- Evaluating Your Algorithm -- Confusion Matrices -- Calculating Evaluation Scores -- Interpreting Evaluation Scores -- Problems That Can Affect Evaluation -- Dataset Is Too Small -- Algorithm Fits the Development Data Too Well -- Too Much Information in the Annotation -- Final Testing Scores -- Summary -- Revising Your Project -- Corpus Distributions and Content -- Model and Specification -- Annotation -- Training and Testing -- Reporting About Your Work -- About Your Corpus -- About Your Model and Specifications -- About Your Annotation Task and Annotators -- About Your ML Algorithm -- About Your Revisions -- Summary -- The Goal of TimeML -- Related Research -- Building the Corpus -- Model: Preliminary Specifications -- Times -- Signals -- Events -- Links -- Annotation: First Attempts -- Model: The TimeML Specification Used in TimeBank -- Time Expressions -- Events -- Signals -- Links -- Confidence -- Annotation: The Creation of TimeBank -- TimeML Becomes ISO-TimeML -- Modeling the Future: Directions for TimeML -- Narrative Containers -- Expanding TimeML to Other Domains -- Event Structures -- Summary -- The TARSQI Components -- GUTime: Temporal Marker Identification -- EVITA: Event Recognition and Classification -- GUTenLINK -- Slinket -- SputLink -- Machine Learning in the TARSQI Components -- Improvements to the TTK -- Structural Changes -- Improvements to Temporal Entity Recognition: BTime -- Temporal Relation Identification -- Temporal Relation Validation -- Temporal Relation Visualization -- TimeML Challenges: TempEval-2 -- TempEval-2: System Summaries -- Overview of Results -- Future of the TTK -- New Input Formats -- Narrative Containers/Narrative Times -- Medical Documents -- Cross-Document Analysis -- Summary -- Crowdsourcing Annotation -- Amazon's Mechanical Turk -- Games with a Purpose (GWAP) -- User-Generated Content -- Handling Big Data -- Boosting -- Active Learning -- Semi-Supervised Learning -- NLP Online and in the Cloud -- Distributed Computing -- Shared Language Resources -- Shared Language Applications -- And Finally ... -- Appendices.
520 ## - SUMMARY, ETC.
Summary, etc	Create your own natural language training corpus for machine learning. This example-driven book walks you through the annotation cycle, from selecting an annotation task and creating the annotation specification to designing the guidelines, creating a "gold standard" corpus, and then beginning the actual data creation with the annotation process.
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical Term	Natural language processing (Computer science)
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical Term	Corpora (Linguistics)
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical Term	Machine learning.
700 1# - ADDED ENTRY--PERSONAL NAME
Personal name	Stubbs, Amber.
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Koha item type	Lending Books

Holdings
Collection code	Home library	Current library	Shelving location	Date acquired	Source of acquisition	Cost, normal purchase price	Full call number	Accession Number	Koha item type
	Main Library	Main Library	Stacks	29/12/2017	Purchased	1820.00	006.35 PUS	015570	Lending Books