Train Word embeddings from scratch with Nessvec and PyTorch.
- Format:
- Video
- Language:
- English
- Subjects (All):
- Natural language processing (Computer science).
- Machine learning.
- Physical Description:
- 1 online resource (1 video file (41 min.)) : sound, color.
- Edition:
- [First edition].
- Place of Publication:
- [Place of publication not identified] : Manning Publications, [2022]
- Summary:
- Hobson and his colleagues try to figure out how to train word embeddings from scratch using the WikiText2 dataset in PyTorch. The WikiText2 dataset contains redacted words, but they were unable to find the "labels" that reveal the words masked with the symbol `<unk>`. If you try to use the `Wikipedia` package to retrieve Wikipedia pages directly, you may hit the `suggest` bug. There are more than 100 unanswered issues on the project, and the maintainer has not pushed any changes for many years. The Tangible AI fork on GitLab fixes this search suggestion bug, so they could easily crawl Wikipedia. Unfortunately, the Wikipedia-API package is not very useful for searching and crawling Wikipedia to retrieve text.
- Notes:
- OCLC-licensed vendor bibliographic record.
- OCLC:
- 1312645700
- Publisher Number:
- 10000MNHV202264