
dhlab_api

from dhlab.api import dhlab_api

Functions

collocation([corpusquery, word, before, after])

Make a collocation from a corpus query.

concordance([urns, words, window, limit])

Get a list of concordances from the National Library's database.

concordance_counts([urns, words, window, limit])

Count concordances (keyword in context) for a corpus query (used for collocation analysis).

document_corpus([doctype, author, freetext, ...])

Fetch a corpus based on metadata.

evaluate_documents([wordbags, urns])

Count and aggregate occurrences of topic wordbags for each document in a list of urns.

find_urns([docids, mode])

Return a list of URNs from a collection of docids.

geo_lookup(places[, feature_class, ...])

From a list of places, return their geolocations.

get_chunks([urn, chunk_size])

Get the text of the document (urn) as frequencies of chunks.

get_chunks_para([urn])

Fetch chunks and their frequencies from paragraphs in a document (urn).

get_dispersion([urn, words, window, pr])

Count occurrences of words in the given URN object.

get_document_corpus(**kwargs)

Wrapper around document_corpus().

get_document_frequencies([urns, cutoff, words])

Fetch frequency counts of words in documents (urns).

get_identifiers([identifiers])

Convert a list of identifiers (OAI-id, sesamid, URN, or ISBN-10) to dhlabids.

get_metadata([urns])

Get metadata for a list of URNs.

get_places(urn)

Look up placenames in a specific URN.

get_reference([corpus, from_year, to_year, ...])

Reference frequency list of the n most frequent words from a given corpus in a given period.

get_urn_frequencies([urns, dhlabid])

Fetch frequency counts for documents given as URNs or DH-lab ids.

get_word_frequencies([urns, cutoff, words])

Fetch frequency numbers for words in documents (urns).

images([text, part])

Retrieve images from Bokhylla. text is a full-text query expression for SQLite; if part is a number, the whole page is shown.

konkordans([urns, words, window, limit])

Wrapper for concordance().

ner_from_urn([urn, model, start_page, to_page])

Get NER annotations for a text (urn) using a SpaCy model.

ngram_book([word, title, period, publisher, ...])

Count occurrences of one or more words in books over a given time period.

ngram_news([word, title, period])

Get a time series of frequency counts for a word in newspapers.

ngram_periodicals([word, title, period, ...])

Get a time series of frequency counts for a word in periodicals.

pos_from_urn([urn, model, start_page, to_page])

Get part of speech tags and dependency parse annotations for a text (urn) with a SpaCy model.

query_imagination_corpus([category, author, ...])

Fetch data from the imagination corpus.

reference_words([words, doctype, from_year, ...])

Collect reference data for a list of words over a time period.

show_spacy_models()

Show available SpaCy model names.

totals([top_words])

Get aggregated raw frequencies of all words in the National Library's database.

urn_collocation([urns, word, before, after, ...])

Create a collocation from a list of URNs.

wildcard_search(word[, factor, freq_limit, ...])

Search for words matching a wildcard pattern.

word_concordance([urn, dhlabid, words, ...])

Get a list of concordances from the National Library's database.

word_form(word[, lang])

Look up the morphological feature specification of a word form.

word_form_many(wordlist[, lang])

Look up the morphological feature specifications for word forms in a wordlist.

word_lemma(word[, lang])

Find the list of possible lemmas for a given word form.

word_lemma_many(wordlist[, lang])

Find lemmas for a list of given word forms.

word_paradigm(word[, lang])

Find paradigms for a given word form.

word_paradigm_many(wordlist[, lang])

Find alternative forms for a list of words.

word_variant(word, form[, lang])

Find an alternative form for a given word form.