Crawling Wikipedia to create the training dataset for a text generation model.

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:: Video
Contributor:: Lane, Hobson, presenter.; Manning (Firm), publisher.
Language:: English
Subjects (All):: Data mining.; Information retrieval.
Physical Description:: 1 online resource (1 video file (55 min.)) : sound, color.
Edition:: [First edition].
Place of Publication:: [Place of publication not identified] : Manning Publications, [2022]
Summary:: In this video, Hobson shows how to download Wikipedia article text to create a large corpus of grammatically correct sentences on which to train a text generation model.
Notes:: OCLC-licensed vendor bibliographic record.
OCLC:: 1312604187
Publisher Number:: 10000MNHV202267
