get_dispersion

from dhlab.api.dhlab_api import get_dispersion
get_dispersion(urn=None, words=None, window=300, pr=100)

Count occurrences of words in the given URN object.

Call the /dispersion endpoint of the API at BASE_URL.

Parameters:
  • urn (str) – Uniform resource name, for example URN:NBN:no-nb_digibok_2011051112001.

  • words (list) – List of words to count. Defaults to a list of punctuation marks.

  • window (int) – The number of tokens to search through per row. Defaults to 300.

  • pr (int) – Defaults to 100.

Returns:

A pandas.Series with frequency counts of the words in the URN object.

Return type:

Series
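
As a minimal usage sketch of the signature documented above: the helpers `dispersion_kwargs` and `fetch_dispersion` are hypothetical names introduced only for this example and are not part of dhlab. The type checks are an assumption about reasonable input, not documented behavior. The actual call needs the dhlab package installed and network access to the API.

```python
def dispersion_kwargs(urn, words=None, window=300, pr=100):
    """Bundle keyword arguments matching the documented signature.

    Hypothetical helper, not part of dhlab. The checks below are
    assumptions about sensible input, not rules enforced by the API.
    """
    if not isinstance(urn, str) or not urn:
        raise ValueError("urn must be a non-empty string")
    if not isinstance(window, int) or not isinstance(pr, int):
        raise TypeError("window and pr must be integers")

    kwargs = {"urn": urn, "window": window, "pr": pr}
    if words is not None:
        # Leave 'words' out entirely when it is None so that
        # get_dispersion falls back to its documented default.
        kwargs["words"] = list(words)
    return kwargs


def fetch_dispersion(urn, words=None, window=300, pr=100):
    """Call get_dispersion with the bundled arguments.

    Hypothetical wrapper. The import is done lazily so that this
    module can be loaded without dhlab installed.
    """
    from dhlab.api.dhlab_api import get_dispersion

    return get_dispersion(**dispersion_kwargs(urn, words, window, pr))
```

Calling `fetch_dispersion("URN:NBN:no-nb_digibok_2011051112001", words=[".", ","])` should, per the documentation above, return a pandas.Series of counts for the two punctuation marks. The values depend on the live API, so none are shown here.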