Natural language annotation for machine learning / (Record no. 44303)
000 - LEADER | |
---|---|
fixed length control field | 06287cam a22002177a 4500 |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER | |
ISBN | 9781449306663 (pbk.) |
020 ## - INTERNATIONAL STANDARD BOOK NUMBER | |
ISBN | 9789351103738 |
082 04 - DEWEY DECIMAL CLASSIFICATION NUMBER | |
Classification number | 006.35 |
Item number | PUS |
100 1# - MAIN ENTRY--AUTHOR NAME | |
Personal name | Pustejovsky, J. |
245 10 - TITLE STATEMENT | |
Title | Natural language annotation for machine learning / |
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT) | |
Place of publication | Mumbai : |
Name of publisher | Shroff Publishers & Distr, |
Year of publication | 2013. |
300 ## - PHYSICAL DESCRIPTION | |
Number of Pages | xiv, 324 p. ; |
500 ## - GENERAL NOTE | |
General note | Includes bibliographical references (p.306-315) and index. |
505 ## - FORMATTED CONTENTS NOTE | |
Formatted contents note | Machine generated contents note: The Importance of Language Annotation -- The Layers of Linguistic Description -- What Is Natural Language Processing? -- A Brief History of Corpus Linguistics -- What Is a Corpus? -- Early Use of Corpora -- Corpora Today -- Kinds of Annotation -- Language Data and Machine Learning -- Classification -- Clustering -- Structured Pattern Induction -- The Annotation Development Cycle -- Model the Phenomenon -- Annotate with the Specification -- Train and Test the Algorithms over the Corpus -- Evaluate the Results -- Revise the Model and Algorithms -- Summary -- Defining Your Goal -- The Statement of Purpose -- Refining Your Goal: Informativity Versus Correctness -- Background Research -- Language Resources -- Organizations and Conferences -- NLP Challenges -- Assembling Your Dataset -- The Ideal Corpus: Representative and Balanced -- Collecting Data from the Internet -- Eliciting Data from People -- The Size of Your Corpus -- Existing Corpora -- Distributions Within Corpora -- Summary -- Basic Probability for Corpus Analytics -- Joint Probability Distributions -- Bayes Rule -- Counting Occurrences -- Zipf's Law -- N-grams -- Language Models -- Summary -- Some Example Models and Specs -- Film Genre Classification -- Adding Named Entities -- Semantic Roles -- Adopting (or Not Adopting) Existing Models -- Creating Your Own Model and Specification: Generality Versus Specificity -- Using Existing Models and Specifications -- Using Models Without Specifications -- Different Kinds of Standards -- ISO Standards -- Community-Driven Standards -- Other Standards Affecting Annotation -- Summary -- Metadata Annotation: Document Classification -- Unique Labels: Movie Reviews -- Multiple Labels: Film Genres -- Text Extent Annotation: Named Entities -- Inline Annotation -- Stand-off Annotation by Tokens -- Stand-off Annotation by Character Location -- Linked Extent Annotation: Semantic Roles -- ISO Standards and You -- Summary -- The Infrastructure of an Annotation Project -- Specification Versus Guidelines -- Be Prepared to Revise -- Preparing Your Data for Annotation -- Metadata -- Preprocessed Data -- Splitting Up the Files for Annotation -- Writing the Annotation Guidelines -- Example 1: Single Labels-Movie Reviews -- Example 2: Multiple Labels-Film Genres -- Example 3: Extent Annotations-Named Entities -- Example 4: Link Tags-Semantic Roles -- Annotators -- Choosing an Annotation Environment -- Evaluating the Annotations -- Cohen's Kappa (K) -- Fleiss's Kappa (K) -- Interpreting Kappa Coefficients -- Calculating K in Other Contexts -- Creating the Gold Standard (Adjudication) -- Summary -- What Is Learning? -- Defining Our Learning Task -- Classifier Algorithms -- Decision Tree Learning -- Gender Identification -- Naive Bayes Learning -- Maximum Entropy Classifiers -- Other Classifiers to Know About -- Sequence Induction Algorithms -- Clustering and Unsupervised Learning -- Semi-Supervised Learning -- Matching Annotation to Algorithms -- Testing Your Algorithm -- Evaluating Your Algorithm -- Confusion Matrices -- Calculating Evaluation Scores -- Interpreting Evaluation Scores -- Problems That Can Affect Evaluation -- Dataset Is Too Small -- Algorithm Fits the Development Data Too Well -- Too Much Information in the Annotation -- Final Testing Scores -- Summary -- Revising Your Project -- Corpus Distributions and Content -- Model and Specification -- Annotation -- Training and Testing -- Reporting About Your Work -- About Your Corpus -- About Your Model and Specifications -- About Your Annotation Task and Annotators -- About Your ML Algorithm -- About Your Revisions -- Summary -- The Goal of TimeML -- Related Research -- Building the Corpus -- Model: Preliminary Specifications -- Times -- Signals -- Events -- Links -- Annotation: First Attempts -- Model: The TimeML Specification Used in TimeBank -- Time Expressions -- Events -- Signals -- Links -- Confidence -- Annotation: The Creation of TimeBank -- TimeML Becomes ISO-TimeML -- Modeling the Future: Directions for TimeML -- Narrative Containers -- Expanding TimeML to Other Domains -- Event Structures -- Summary -- The TARSQI Components -- GUTime: Temporal Marker Identification -- EVITA: Event Recognition and Classification -- GUTenLINK -- Slinket -- SputLink -- Machine Learning in the TARSQI Components -- Improvements to the TTK -- Structural Changes -- Improvements to Temporal Entity Recognition: BTime -- Temporal Relation Identification -- Temporal Relation Validation -- Temporal Relation Visualization -- TimeML Challenges: TempEval-2 -- TempEval-2: System Summaries -- Overview of Results -- Future of the TTK -- New Input Formats -- Narrative Containers/Narrative Times -- Medical Documents -- Cross-Document Analysis -- Summary -- Crowdsourcing Annotation -- Amazon's Mechanical Turk -- Games with a Purpose (GWAP) -- User-Generated Content -- Handling Big Data -- Boosting -- Active Learning -- Semi-Supervised Learning -- NLP Online and in the Cloud -- Distributed Computing -- Shared Language Resources -- Shared Language Applications -- And Finally ... -- Appendices. |
520 ## - SUMMARY, ETC. | |
Summary, etc | Create your own natural language training corpus for machine learning. This example-driven book walks you through the annotation cycle, from selecting an annotation task and creating the annotation specification to designing the guidelines, creating a "gold standard" corpus, and then beginning the actual data creation with the annotation process. |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM | |
Topical Term | Natural language processing (Computer science) |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM | |
Topical Term | Corpora (Linguistics) |
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM | |
Topical Term | Machine learning. |
700 1# - ADDED ENTRY--PERSONAL NAME | |
Personal name | Stubbs, Amber. |
942 ## - ADDED ENTRY ELEMENTS (KOHA) | |
Koha item type | Lending Books |

Collection code | Home library | Current library | Shelving location | Date acquired | Source of acquisition | Cost, normal purchase price | Full call number | Accession Number | Koha item type |
---|---|---|---|---|---|---|---|---|---|
 | Main Library | Main Library | Stacks | 29/12/2017 | Purchased | 1820.00 | 006.35 PUS | 015570 | Lending Books |