Classes

Corpus([doctype, author, freetext, ...])

Class representing a DHLAB Corpus

Chunks([urn, chunks])

Create chunks from a text.

Collocations([corpus, words, before, after, ...])

Create a collocations object

Concordance([corpus, query, window, limit])

Wrapper for the concordance function

Counts([corpus, words])

Provide counts for a corpus; the corpus should not be too large.

GeoData([urn, model])

Fetch place data from a text (book, newspaper or ...) identified by URN, using an appropriate and available spaCy model.

GeoNames(names[, feature_class, feature_code])

Fetch data from a list of names

NER([urn, model, start_page, to_page])

Provide named entity recognition (NER)

POS([urn, model, start_page, to_page])

Provide part-of-speech (POS) tags and a parse

Models()

Show the spaCy language models available

WildcardWordSearch(word[, factor, ...])

Find a class of words matching a wildcard string

Ngram([words, from_year, to_year, doctype, ...])

Top-level class for ngrams

NgramBook([words, title, publisher, city, ...])

Extract ngrams using metadata with functions to be inherited.

NgramNews([words, title, city, from_year, ...])

Ngram builder class.

WordParadigm(words[, lang])

Fetch inflection paradigms for a list of words, or just one word

WordLemma(words[, lang])

Fetch possible lemmas for a given word form

WordForm(words[, lang])

Fetch possible forms of a word or list of words
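
As an illustration of what the `query`, `window` and `limit` parameters listed for `Concordance` typically control, here is a minimal pure-Python sketch of a keyword-in-context search. It does not call the DHLAB API and is not the library's implementation. The helper name `concordance_sketch`, the sample text, and the choice to measure `window` in whitespace-separated tokens on each side of the match are all assumptions made for this example.

```python
def concordance_sketch(text, query, window=3, limit=5):
    """Return up to `limit` keyword-in-context rows for `query`.

    Hypothetical helper, not part of DHLAB. Assumes `window` counts
    whitespace-separated tokens on each side of the matched word.
    """
    tokens = text.split()
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == query.lower():
            # Collect up to `window` tokens before and after the match.
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            rows.append((left, token, right))
            if len(rows) >= limit:
                break
    return rows


# Made-up sample text, for illustration only.
sample = "the old library holds books and the new library holds newspapers"
hits = concordance_sketch(sample, "library", window=2, limit=10)
for left, keyword, right in hits:
    print(f"{left} [{keyword}] {right}")
```

A larger `window` gives more context around each hit, while `limit` caps the number of rows returned, which keeps the result manageable for frequent words.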