Czech Web (czTenTen16) [2016, 2015]

Czech web corpus crawled by SpiderLing in November and December 2015 and October to December 2016. Encoded in UTF-8, cleaned, deduplicated.

Counts
Tokens: 9,307,649,368
Words: 7,795,495,171
Sentences: 591,133,106
Paragraphs: 185,928,405
Documents: 32,739,566
General info
Corpus description: Document
Language: Czech
Encoding: UTF-8
Compiled: 06/01/2017 12:31:26
Tagset: Description
Word sketch grammar: Definition
Lexicon sizes
word
tag
lemma
lc
lemma_lc
Tags legend
noun: k1.*
adjective: k2.*
pronoun: k3.*
numeral: k4.*
verb: k5.*
adverb: k6.*
preposition: k7.*
conjunction: k8.*
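
The legend above pairs each part of speech with a regular expression over the tag attribute. As a rough illustration of how those patterns can be applied outside the corpus interface, the Python sketch below maps a tag string to its legend entry. The function name pos_from_tag and the sample tags are assumptions made for this example; they do not come from the corpus documentation, and the full tagset may contain categories the legend does not list.

```python
import re
from typing import Optional

# Patterns copied from the tags legend; the keys are the legend's own labels.
TAG_LEGEND = {
    "noun": r"k1.*",
    "adjective": r"k2.*",
    "pronoun": r"k3.*",
    "numeral": r"k4.*",
    "verb": r"k5.*",
    "adverb": r"k6.*",
    "preposition": r"k7.*",
    "conjunction": r"k8.*",
}


def pos_from_tag(tag: str) -> Optional[str]:
    """Return the legend label whose pattern matches the whole tag, or None.

    Hypothetical helper for illustration only.
    """
    for label, pattern in TAG_LEGEND.items():
        if re.fullmatch(pattern, tag):
            return label
    return None


if __name__ == "__main__":
    # Sample tag strings are assumed for illustration, not taken from the corpus.
    for sample in ["k1gMnSc1", "k5eAaImIp3nS", "k7c4", "k9"]:
        print(sample, "->", pos_from_tag(sample))
```

A tag that starts with a prefix outside the legend (here the assumed sample "k9") yields None, so callers can decide how to handle categories the legend leaves out.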