Machine learning techniques for text : apply modern techniques with Python for text processing, dimensionality reduction, classification, and evaluation / Nikos Tsourakis.

Ebook Central College Complete Available online

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:
Book
Author/Creator:
Tsourakis, Nikos, author.
Language:
English
Subjects (All):
Text data mining.
Machine learning.
Machine learning--Computer programs.
Python (Computer program language).
Physical Description:
1 online resource (448 pages)
Edition:
1st ed.
Place of Publication:
Birmingham, England : Packt Publishing Ltd., [2022]
Summary:
Machine learning and Python offer unique opportunities to process text data. This book will equip you with the skills you need to undertake a role in the field. The content keeps the right balance between need-to-know theory and hands-on practice, grounding the discussion around different case studies.
Contents:
Cover
Title Page
Copyright and Credits
Acknowledgments
Contributors
Table of Contents
Preface
Chapter 1: Introducing Machine Learning for Text
The language phenomenon
The data explosion
The era of AI
Relevant research fields
The machine learning paradigm
Taxonomy of machine learning techniques
Supervised learning
Unsupervised learning
Semi-supervised learning
Reinforcement learning
Visualization of the data
Evaluation of the results
Summary
Chapter 2: Detecting Spam Emails
Technical requirements
Understanding spam detection
Explaining feature engineering
Extracting word representations
Using label encoding
Using one-hot encoding
Using token count encoding
Using tf-idf encoding
Executing data preprocessing
Tokenizing the input
Removing stop words
Stemming the words
Lemmatizing the words
Performing classification
Getting the data
Creating the train and test sets
Preprocessing the data
Extracting the features
Introducing the Support Vector Machines algorithm
Understanding Bayes' theorem
Measuring classification performance
Calculating accuracy
Calculating precision and recall
Calculating the F-score
Creating ROC and AUC
Creating precision-recall curves
Chapter 3: Classifying Topics of Newsgroup Posts
Understanding topic classification
Performing exploratory data analysis
Executing dimensionality reduction
Understanding principal component analysis
Understanding linear discriminant analysis
Putting PCA and LDA into action
Introducing the k-nearest neighbors algorithm
Performing feature extraction
Performing cross-validation
Comparison to the baseline model
Introducing the random forest algorithm
Contracting a decision tree
Extracting word embedding representation
Understanding word embedding
Performing vector arithmetic
Using the fastText tool
Chapter 4: Extracting Sentiments from Product Reviews
Understanding sentiment analysis
Using the Software dataset
Exploiting the ratings of products
Extracting the word count of reviews
Exploiting the helpfulness score
Introducing linear regression
Putting linear regression into action
Introducing logistic regression
Understanding gradient descent
Using logistic regression
Creating training and test sets
Applying regularization
Introducing deep neural networks
Understanding logic gates
Understanding perceptrons
Understanding artificial neurons
Creating artificial neural networks
Training artificial neural networks
Chapter 5: Recommending Music Titles
Understanding recommender systems
Cleaning the data
Extracting information from the data
Understanding the Pearson correlation
Introducing content-based filtering
Extracting music recommendations
Introducing collaborative filtering
Using memory-based collaborative recommenders
Applying SVD
Clustering handwritten text
Applying t-SNE
Using model-based collaborative systems
Introducing autoencoders
Chapter 6: Teaching Machines to Translate
Understanding machine translation
Introducing rule-based machine translation
Using direct machine translation
Using transfer-based machine translation
Using interlingual machine translation
Introducing example-based machine translation
Introducing statistical machine translation
Modeling the translation problem
Creating the models
Introducing sequence-to-sequence learning
Deciphering the encoder/decoder architecture
Understanding long short-term memory units
Putting seq2seq in action
Measuring translation performance
Chapter 7: Summarizing Wikipedia Articles
Understanding text summarization
Introducing web scraping
Scraping popular quotes
Scraping book reviews
Scraping Wikipedia articles
Performing extractive summarization
Performing abstractive summarization
Introducing the attention mechanism
Introducing transformers
Putting the transformer into action
Measuring summarization performance
Chapter 8: Detecting Hateful and Offensive Language
Introducing social networks
Understanding BERT
Pre-training phase
Fine-tuning phase
Putting BERT into action
Introducing boosting algorithms
Understanding AdaBoost
Understanding gradient boosting
Understanding XGBoost
Creating validation sets
Learning the myth of Icarus
Extracting the datasets
Treating imbalanced datasets
Classifying with BERT
Training the classifier
Applying early stopping
Understanding CNN
Adding pooling layers
Including CNN layers
Chapter 9: Generating Text in Chatbots
Understanding text generation
Creating a retrieval-based chatbot
Understanding language modeling
Understanding perplexity
Building a language model
Creating a generative chatbot
Using a pre-trained model
Creating the GUI
Creating the web chatbot
Fine-tuning a pre-trained model
Chapter 10: Clustering Speech-to-Text Transcriptions
Technical requirements
Understanding text clustering
Using speech-to-text
Introducing the K-means algorithm
Putting K-means into action
Introducing DBSCAN
Putting DBSCAN into action
Assessing DBSCAN
Introducing the hierarchical clustering algorithm
Putting hierarchical clustering into action
Introducing the LDA algorithm
Putting LDA into action
Index
Other Books You May Enjoy
Notes:
Includes index.
Description based on print version record.
ISBN:
9781803236292
1803236299
OCLC:
1350182782
