MADCAT phase 3 training set.

LIBRA
Available from offsite location. This item is stored in our repository but can be checked out.
Format:
Datafile
Contributor:
Lee, David.
Linguistic Data Consortium.
Language:
Arabic
Subjects (All):
Arabic language--Written Arabic--Data processing.
Arabic language.
Arabic language--Machine translating.
Arabic language--Translating into English.
Machine translating.
Arabic language--Written Arabic.
Genre:
Dictionaries.
Physical Description:
1 DVD ; 4 3/4 in.
Other Title:
Multilingual automatic document classification analysis and translation phase 3 training set
Place of Publication:
[Philadelphia, PA] : Linguistic Data Consortium, [2013]
Language Note:
Arabic
System Details:
data file
Summary:
"MADCAT (Multilingual Automatic Document Classification Analysis and Translation) Phase 3 Training Set contains all training data created by the Linguistic Data Consortium (LDC) to support Phase 3 of the DARPA MADCAT Program. The data in this release consists of handwritten Arabic documents, scanned at high resolution and annotated for the physical coordinates of each line and token. Digital transcripts and English translations of each document are also provided, with the various content and annotation layers integrated in a single MADCAT XML output. The goal of the MADCAT program is to automatically convert foreign text images into English transcripts." -- LDC online catalogue.
Notes:
Title from disc label.
Data type: Text.
Data sources: Newsgroups, newswire, weblogs.
Applications: Handwriting recognition, machine translation.
"LDC2013T16".
Authors: David Lee, Safa Ismael, Dave Doermann, Stephanie Strassel, Zhiyi Song, Stephen Grimes.
ISBN:
1585636517
9781585636518
OCLC:
863541408
Access Restriction:
Restricted for use by site license.
