Text mining : a guidebook for the social sciences / Gabe Ignatow, University of North Texas, Rada Mihalcea, University of Michigan.

LIBRA H61.3 .I395 2017

Available from offsite location. This item is stored in our repository but can be checked out.

Format:
Book
Author/Creator:
Ignatow, Gabe, author.
Mihalcea, Rada, author.
Contributor:
Esther F. Kantrowitz & Lionel Kantrowitz Collection Endowment Fund.
Language:
English
Subjects (All):
Social sciences--Research--Methodology.
Social sciences.
Discourse analysis--Data processing.
Discourse analysis.
Communication--Network analysis.
Communication.
Natural language processing (Computer science).
Data mining.
Physical Description:
xvi, 188 pages : illustrations ; 23 cm
Place of Publication:
Los Angeles : SAGE, [2017]
Summary:
Online communities generate massive volumes of natural language data, and the social sciences continue to learn how to best make use of this new information and the technology available for analyzing it. Text Mining brings together a broad range of contemporary qualitative and quantitative methods to provide strategic and practical guidance on analyzing large text collections. This accessible book, written by a sociologist and a computer scientist, surveys the fast-changing landscape of data sources, programming languages, software packages, and methods of analysis available today. Suitable for novice and experienced researchers alike, this book helps readers use text mining techniques more efficiently and productively. Book jacket.
Contents:
Part I Digital Texts, Digital Social Science 1
1 Social Science and the Digital Text Revolution 2
History of Text Analysis 3
Risks and Rewards of Text Mining for the Social Sciences 5
Social Data From Digital Environments 6
Theory and Metatheory 10
Ethics of Text Mining 12
Participant Consent, Privacy, and Anonymity 12
Prompted and Unprompted Data 13
Organization of This Volume 13
2 Research Design Strategies 16
Levels of Analysis 18
The Textual Level 18
The Contextual Level 18
The Sociological Level 18
Strategies for Document Selection and Sampling 19
Case Selection 19
Text Sampling 20
Types of Inferential Logic 22
Inductive Logic 23
Deductive Logic 24
Abductive Logic 25
Approaches to Research Design 27
Analysis of Discourse Positions 27
Conversation Analysis 28
Critical Discourse Analysis 28
Content Analysis 29
Foucauldian Intertextuality 30
Analysis of Texts as Social Information 31
Part II Text Mining Fundamentals 33
3 Web Crawling and Scraping 34
Web Statistics 36
Web Crawling 37
Process Steps in Crawling 37
Traversal Strategies 38
Crawler Politeness 38
Web Scraping 39
Software for Web Crawling and Scraping 41
4 Lexical Resources 42
WordNet 43
WordNet-Affect 45
Roget's Thesaurus 46
Linguistic Inquiry and Word Count 46
General Inquirer 48
Wikipedia 48
Wiktionary 51
Downloadable Lexical Resources and Application Program Interfaces 51
5 Basic Text Processing 52
Tokenization 54
Stop Word Removal 55
Stemming and Lemmatization 55
Text Statistics 56
Language Models 59
Other Text Processing 60
Part of Speech Tagging 60
Collocation Identification 60
Syntactic Parsing 61
Named Entity Tagging 61
Word Sense Disambiguation 61
Software for Text Processing 61
6 Supervised Learning 62
Feature Representation and Weighting 65
Feature Weighting 65
Supervised Learning Algorithms 66
Decision Trees 67
Instance-Based Learning 68
Support Vector Machines 69
Evaluation of Supervised Learning 71
Software for Supervised Learning 71
Part III Text Analysis Methods from the Humanities and Social Sciences 73
7 Thematic Analysis, Qualitative Data Analysis Software, and Visualization 74
Thematic Analysis 75
Qualitative Data Analysis Software 77
Visualization Tools 83
Word Clouds 84
Word Trees and Phrase Nets 84
Matrices and Maps 85
Key Word in Context 86
Software for Thematic Analysis, Qualitative Data Analysis and Visualization 86
8 Narrative Analysis 88
Conceptual Foundations 90
Structural Approaches to Narrative 90
Functionalist Approaches to Narrative 91
Sociological Approaches to Narrative 92
Mixed Methods of Narrative Analysis 92
Automated Methods of Narrative Analysis 93
Future Directions 93
Software for Narrative Analysis 94
9 Metaphor Analysis 96
Theoretical Foundations 98
Qualitative Metaphor Analysis 99
Anthropology 99
Educational Research 99
Political Science 100
Psychology 100
Sociology 101
Mixed Methods of Metaphor Analysis 101
Management Research 101
Psychology 102
Sociology 102
Automated Metaphor Identification Methods 103
Software for Metaphor Analysis 103
Part IV Text Mining Methods from Computer Science 105
10 Word and Text Relatedness 106
Theoretical Foundations 107
Corpus-Based and Knowledge-Based Measures of Relatedness 108
Corpus-Based Measures of Word Relatedness 108
Knowledge-Based Measures of Word Relatedness 110
Measures of Text Relatedness 112
Software and Data Sets for Word and Text Relatedness 114
11 Text Classification 116
A Brief History of Text Classification 118
Applications of Text Classification 119
Topic Classification 119
E-Mail Spam Detection 120
Sentiment Analysis/Opinion Mining 120
Gender Classification 120
Deception Detection 122
Other Applications 122
Representing Texts for Supervised Text Classification 122
Feature Weighting and Selection 123
Text Classification Algorithms 124
Naive Bayes 124
Rocchio Classifier 125
Bootstrapping in Text Classification 126
Evaluation of Text Classification 127
Software and Data Sets for Text Classification 127
12 Information Extraction 130
Entity Extraction 132
Relation Extraction 133
Web Information Extraction 134
Template Filling 135
Software and Data Sets for Information Extraction and Text Mining 135
13 Information Retrieval 136
Theoretical Foundations 138
Components of an Information Retrieval System 138
Information Retrieval Models 140
The Vector Space Model 142
Evaluation of Information Retrieval Models 144
Web-Based Information Retrieval 145
Software and Data Sets for Information Retrieval 147
14 Sentiment Analysis 148
Theoretical Foundations 150
Lexicons 151
Corpora 152
Tools 153
Software and Data Sets for Sentiment Analysis 154
15 Topic Models 156
Digital Humanities 160
Political Science 160
Sociology 161
Software for Topic Modeling 161
Part V Conclusions 163
16 Text Mining, Text Analysis, and the Future of Social Science 164
Social and Computer Science Collaboration 166
Notes:
Includes bibliographical references (pages 168-182) and index.
Local Notes:
Acquired for the Penn Libraries with assistance from the Esther F. Kantrowitz & Lionel Kantrowitz Collection Endowment Fund.
ISBN:
9781483369341
148336934X
OCLC:
933765455
Publisher Number:
99968910078
