Amharic WaC [2013 + 2015 + 2016]

Amharic web corpus. Crawled by SpiderLing in August 2013, October 2015, and January 2016. Encoded in UTF-8, cleaned, and deduplicated. Tagged by TreeTagger trained on the Amharic WIC corpus.

Counts
Tokens: 20,287,250
Words: 17,320,000
Sentences: 1,208,926
Paragraphs: 341,327
Documents: 33,542

General info
Corpus description: Document
Language: Amharic
Encoding: UTF-8
Compiled: 05/05/2017 20:44:44
Tagset: Description
Word sketch grammar: Definition

Lexicon sizes
word
tag
sera