Thesaurus quality depends heavily on rich word sketches containing many collocates, which in turn depend on a high frequency of both the search word and its potential synonyms. A very large corpus is therefore needed. A size of around 100,000 words is the bare minimum for producing a usable result for high-frequency words; rare words need a much larger corpus to reach a sufficient frequency. The use of our multi-billion-word corpora is highly recommended.
The synonym list may contain words that should not be included. This is a consequence of automatic processing: Sketch Engine cannot determine similarity in meaning directly; it can only compare collocates. If two words share the same collocates, they will be listed as synonyms even when their meanings are not similar. Such occasional inaccuracies do not make the tool less useful, and using a larger corpus reduces them. A thesaurus for an extremely rare word (a frequency of just a few hundred occurrences or fewer) will inevitably give poor results, or may not be produced at all.
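The effect of comparing collocates rather than meanings can be illustrated with a minimal sketch. It is not Sketch Engine's actual scoring method: it assumes a plain Jaccard overlap between collocate sets, and the function names and the toy collocate lists are invented for illustration only.

```python
def collocate_overlap(a: set[str], b: set[str]) -> float:
    """Jaccard overlap of two collocate sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_candidates(word: str, collocates: dict[str, set[str]]) -> list[tuple[str, float]]:
    """Rank every other word by how much of its collocate set it shares with `word`."""
    target = collocates[word]
    scores = [
        (other, collocate_overlap(target, colls))
        for other, colls in collocates.items()
        if other != word
    ]
    # Highest overlap first; meaning is never consulted, only shared collocates.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical collocate sets (assumption: not taken from any real corpus).
collocates = {
    "hot": {"water", "weather", "day", "tea", "summer"},
    "cold": {"water", "weather", "day", "tea", "winter"},
    "warm": {"water", "weather", "welcome", "smile", "summer"},
}

for candidate, score in rank_candidates("hot", collocates):
    print(f"{candidate}: {score:.2f}")
```

In this toy data "cold" ranks above "warm" for the search word "hot", because it shares more collocates, even though it is closer to an antonym than a synonym. With very small collocate sets, as for rare words, a single shared or missing collocate shifts the score substantially, which is one way to see why low frequencies give unreliable lists.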