reference_words

from dhlab.api.dhlab_api import reference_words
reference_words(words=None, doctype='digibok', from_year=1800, to_year=2000)

Collect reference data for a list of words over a time period.

Reference data are the absolute and relative frequencies of the words across all documents of the given doctype published in the given time period (from_year to to_year).
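To make the two kinds of frequency concrete, here is a minimal sketch on a toy token list. It does not call the dhlab API. It assumes that "relative frequency" means a word's count divided by the total number of tokens, which the description above does not spell out. The helper name `frequency_table` and the sample sentence are hypothetical.

```python
from collections import Counter


def frequency_table(tokens, words):
    """Return {word: (absolute_count, relative_frequency)} for each word.

    Assumption: relative frequency = count / total number of tokens.
    The real reference data may be normalised differently.
    """
    counts = Counter(tokens)
    total = len(tokens)
    return {
        word: (counts[word], counts[word] / total if total else 0.0)
        for word in words
    }


# Toy corpus of six tokens; "dronning" does not occur in it.
toy_tokens = "det var en gang en konge".split()
table = frequency_table(toy_tokens, ["en", "konge", "dronning"])
```

Here `table["en"]` holds an absolute count of 2 and a relative frequency of 2/6, while a word absent from the corpus gets a count of 0.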

Parameters:
  • words (list) – list of word strings.

  • doctype (str) –

    type of reference document. Can be "digibok" or "digavis". Defaults to "digibok".

    Note

    If any other string is given as the doctype, the resulting data is equivalent to what you get with doctype="digavis".

  • from_year (int) – first year of publication

  • to_year (int) – last year of publication

Returns:

a DataFrame with the words’ frequency data

Return type:

DataFrame
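Because an unrecognised doctype is silently treated like "digavis", a typo such as "digbok" would return newspaper data without any warning. The sketch below shows one possible guard: a small wrapper that checks the arguments before delegating to reference_words. The names `checked_reference_args` and `fetch_reference` are hypothetical and not part of dhlab, and the checks for an empty word list and for the year order are choices made for this sketch, not documented behaviour.

```python
VALID_DOCTYPES = ("digibok", "digavis")


def checked_reference_args(words, doctype="digibok", from_year=1800, to_year=2000):
    """Validate the arguments and return them as a keyword dictionary."""
    if not words:
        raise ValueError("words must be a non-empty list of strings")
    if doctype not in VALID_DOCTYPES:
        # The API would accept this string and behave as for "digavis".
        raise ValueError(f"doctype must be one of {VALID_DOCTYPES}, got {doctype!r}")
    if from_year > to_year:
        raise ValueError("from_year must not be later than to_year")
    return {
        "words": list(words),
        "doctype": doctype,
        "from_year": from_year,
        "to_year": to_year,
    }


def fetch_reference(words, **kwargs):
    """Call reference_words with validated arguments.

    dhlab is imported lazily so the validation helper can be used
    without the package installed.
    """
    from dhlab.api.dhlab_api import reference_words

    return reference_words(**checked_reference_args(words, **kwargs))
```

With dhlab installed and network access available, `fetch_reference(["hus", "hytte"], doctype="digavis", from_year=1950, to_year=1960)` would return the DataFrame described above, while a misspelled doctype raises a ValueError before any request is made.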