Corpus Tigrinya WaC [2016] – statistics and info

Tigrinya web corpus. Crawled by SpiderLing in January 2016. Encoded in UTF-8, cleaned, deduplicated.

Counts
Tokens: 2531443
Words: 2087613
Sentences: 139357
Paragraphs: 28552
Documents: 1907
General info
Corpus description: Document
Language: Tigrinya
Encoding: UTF-8
Compiled: 05/23/2017 15:20:40
Tagset: Description
Word sketch grammar: Definition
Lexicon sizes
word: 225132
tag: 15
sera: 220935
Tags legend (tagset)
adjective: ADJ
adverb: ADV
conjunction: CONJ
determiner: DET
interjection: INTJ
noun: NOUN
proper noun: PROPN
numeral: NUM
particle: PART
preposition: ADP
pronoun: PRON
verb: VERB

Structures and attributes
