evaluate_documents
from dhlab.api.dhlab_api import evaluate_documents
- evaluate_documents(wordbags=None, urns=None)
Count and aggregate occurrences of topic wordbags for each document in a list of urns.
- Parameters:
wordbags (dict) – a dictionary of topic keywords and lists of associated words. Example:
{"natur": ["planter", "skog", "fjell", "fjord"], ... }
urns (list) – uniform resource names, for example:
["URN:NBN:no-nb_digibok_2008051404065", "URN:NBN:no-nb_digibok_2010092120011"]
- Returns:
a pandas.DataFrame with the topics as columns, indexed by the dhlabids of the documents.
- Return type:
DataFrame
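To make the shape of the result concrete, the following is a minimal local sketch of the counting idea, not the library's implementation. The real function looks documents up by URN; here the helper `count_wordbags`, the in-memory `documents` dict, and the ids `100001` and `100002` are hypothetical stand-ins, and exact, case-sensitive token matching is an assumption.

```python
from collections import Counter


def count_wordbags(wordbags, documents):
    """Sum, per document, the occurrences of every word in each topic's word list.

    wordbags:  dict mapping a topic keyword to a list of associated words
               (same shape as the ``wordbags`` parameter above).
    documents: dict mapping a document id to a list of tokens
               (hypothetical local stand-in for texts identified by URN).
    """
    result = {}
    for doc_id, tokens in documents.items():
        freq = Counter(tokens)  # token -> number of occurrences in this document
        # One aggregated count per topic: add up the counts of its words.
        result[doc_id] = {
            topic: sum(freq[word] for word in words)
            for topic, words in wordbags.items()
        }
    return result


wordbags = {"natur": ["planter", "skog", "fjell", "fjord"]}

# Made-up ids and texts, for illustration only.
documents = {
    100001: "skog og fjell og en fjord ved skog".split(),
    100002: "byen har ingen planter".split(),
}

counts = count_wordbags(wordbags, documents)
print(counts)  # {100001: {'natur': 4}, 100002: {'natur': 1}}
```

The nested dict mirrors the documented return value: one row per document id and one column per topic. With the real function, the equivalent call follows the signature above, `evaluate_documents(wordbags=wordbags, urns=urns)`, and returns the counts as a DataFrame.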