Dictionary
This page describes the Dictionary class.
class artm.Dictionary(name=None, dictionary_path=None, data_path=None)

__init__(name=None, dictionary_path=None, data_path=None)
Parameters:
- name (str) – name of the dictionary
- dictionary_path (str) – if given, load() is called with this path in the constructor
- data_path (str) – if given, gather() is called with this path in the constructor
Note: all parameters are optional
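The constructor shortcuts can be spelled out as explicit calls. The sketch below is an illustration, not the library's implementation: `open_dictionary` is a hypothetical helper, the `artm` package is passed in as an argument rather than imported, and the precedence when both paths are given is an assumption (this page does not describe it).

```python
def open_dictionary(artm_module, name=None, dictionary_path=None, data_path=None):
    """Create a Dictionary and call load() or gather() explicitly.

    artm_module is expected to be the imported `artm` package.
    """
    dictionary = artm_module.Dictionary(name=name)
    if dictionary_path is not None:
        # Same effect as passing dictionary_path to the constructor.
        dictionary.load(dictionary_path=dictionary_path)
    elif data_path is not None:
        # Same effect as passing data_path to the constructor.
        # Assumption: loading wins when both paths are supplied.
        dictionary.gather(data_path=data_path)
    return dictionary
```

With the real package this would be called as `open_dictionary(artm, name='my_dictionary', data_path='my_batches')`, where `my_batches` is a hypothetical folder of batches.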

copy()
Description: returns a copy of the dictionary loaded in the lib, under another name.

create(dictionary_data)
Description: creates the dictionary from a DictionaryData object
Parameters: dictionary_data (DictionaryData instance) – configuration of the dictionary

filter(class_id=None, min_df=None, max_df=None, min_df_rate=None, max_df_rate=None, min_tf=None, max_tf=None)
Description: filters the BigARTM dictionary of the collection that was already loaded into the lib
Parameters:
- dictionary_name (str) – name of the dictionary in the lib to filter
- dictionary_target_name (str) – name for the new filtered dictionary in the lib
- class_id (str) – class_id to filter
- min_df (float) – min df value to pass the filter
- max_df (float) – max df value to pass the filter
- min_df_rate (float) – min df rate to pass the filter
- max_df_rate (float) – max df rate to pass the filter
- min_tf (float) – min tf value to pass the filter
- max_tf (float) – max tf value to pass the filter
Note: the current dictionary will be replaced with the filtered one
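The effect of the thresholds can be illustrated in plain Python. This is a sketch of the filtering idea only, not BigARTM code: `TokenStats` and `filter_tokens` are hypothetical names, the bounds are assumed to be inclusive, the df rate is assumed to be df divided by the number of documents, and tokens of other class_ids are assumed to pass through unchanged.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class TokenStats:
    """Hypothetical per-token statistics; not a BigARTM type."""
    token: str
    class_id: str
    df: float  # number of documents that contain the token
    tf: float  # total number of occurrences in the collection


def _within(value: float, low: Optional[float], high: Optional[float]) -> bool:
    """True if value lies in [low, high]; a bound of None is ignored."""
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


def filter_tokens(entries: Iterable[TokenStats], num_docs: int,
                  class_id: Optional[str] = None,
                  min_df=None, max_df=None,
                  min_df_rate=None, max_df_rate=None,
                  min_tf=None, max_tf=None) -> List[TokenStats]:
    kept = []
    for entry in entries:
        # Assumption: tokens of other class_ids are left untouched.
        if class_id is not None and entry.class_id != class_id:
            kept.append(entry)
            continue
        df_rate = entry.df / num_docs  # assumption: rate = df / document count
        if (_within(entry.df, min_df, max_df)
                and _within(df_rate, min_df_rate, max_df_rate)
                and _within(entry.tf, min_tf, max_tf)):
            kept.append(entry)
    return kept


stats = [
    TokenStats('the', 'text', df=10, tf=50),
    TokenStats('topic', 'text', df=3, tf=5),
    TokenStats('rare', 'text', df=1, tf=1),
]
# Drop tokens seen in fewer than 2 documents or in more than half of them.
survivors = filter_tokens(stats, num_docs=10, min_df=2, max_df_rate=0.5)
print([s.token for s in survivors])  # ['topic']
```

On a real dictionary the corresponding call would be `dictionary.filter(min_df=2, max_df_rate=0.5)`, which replaces the dictionary contents in place as noted above.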

gather(data_path, cooc_file_path=None, vocab_file_path=None, symmetric_cooc_values=False)
Description: creates the BigARTM dictionary of a collection represented as batches and loads it into the lib
Parameters:
- data_path (str) – full path to the batches folder
- cooc_file_path (str) – full path to the file with co-occurrence (cooc) info
- vocab_file_path (str) – full path to the file with the vocabulary. If given, the dictionary tokens will have the same order as in this file; otherwise the order will be random
- symmetric_cooc_values (bool) – whether the cooc matrix should be considered symmetric
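A minimal sketch of calling gather() with only the optional arguments that are actually needed. `gather_dictionary` is a hypothetical wrapper, and the `dictionary` argument is assumed to be an `artm.Dictionary` instance (or any object with the same `gather` signature); the paths are placeholders.

```python
def gather_dictionary(dictionary, batches_dir, vocab_path=None,
                      cooc_path=None, symmetric=False):
    """Call dictionary.gather() with keyword arguments named as on this page."""
    kwargs = {'data_path': batches_dir}
    if vocab_path is not None:
        # Fixes the token order to the order of the vocabulary file.
        kwargs['vocab_file_path'] = vocab_path
    if cooc_path is not None:
        kwargs['cooc_file_path'] = cooc_path
        # Only meaningful together with a cooc file.
        kwargs['symmetric_cooc_values'] = symmetric
    dictionary.gather(**kwargs)
    return dictionary
```

Passing a vocabulary file is the way to get a reproducible token order, since without it the order is described as random.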

load(dictionary_path)
Description: loads the BigARTM dictionary of the collection into the lib
Parameters: dictionary_path (str) – full file name of the dictionary

load_text(dictionary_path, encoding='utf-8')
Description: loads the BigARTM dictionary of the collection from disk in human-readable text format
Parameters:
- dictionary_path (str) – full file name of the text dictionary file
- encoding (str) – encoding of the text in the dictionary file

save(dictionary_path)
Description: saves the BigARTM dictionary of the collection to disk
Parameters: dictionary_path (str) – full file name for the dictionary
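One common way to combine gather(), save() and load() is to gather the dictionary once and reuse the saved file on later runs. The following is a sketch under assumptions: `load_or_gather` is a hypothetical helper, `dictionary` is assumed to be an `artm.Dictionary` instance, and `saved_file` exists because the file name that save() actually writes may differ from the path passed in (for example if an extension is appended), which this page does not specify.

```python
import os


def load_or_gather(dictionary, batches_dir, save_path, saved_file=None):
    """Load the dictionary from disk if present, otherwise gather and save it.

    saved_file is the file expected on disk after save(); it defaults to
    save_path, which is an assumption about how save() names its output.
    """
    saved_file = saved_file or save_path
    if os.path.isfile(saved_file):
        dictionary.load(dictionary_path=saved_file)
        return 'loaded'
    dictionary.gather(data_path=batches_dir)
    dictionary.save(dictionary_path=save_path)
    return 'gathered'
```

The return value only reports which branch ran, which helps when checking that the cached file is being picked up.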

save_text(dictionary_path, encoding='utf-8')
Description: saves the BigARTM dictionary of the collection to disk in human-readable text format
Parameters:
- dictionary_path (str) – full file name for the text dictionary file
- encoding (str) – encoding of the text in the dictionary file
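The text format makes it possible to write a dictionary out, inspect the file, and read it back. A minimal sketch of that round trip, assuming `source` and `target` are `artm.Dictionary` instances (or objects with the same `save_text` and `load_text` signatures); `roundtrip_text` is a hypothetical helper name.

```python
def roundtrip_text(source, target, text_path, encoding='utf-8'):
    """Save `source` as text, then load that file into `target`."""
    source.save_text(dictionary_path=text_path, encoding=encoding)
    # The file at text_path can be inspected here before it is loaded again.
    target.load_text(dictionary_path=text_path, encoding=encoding)
    return target
```

Using the same `encoding` value for both calls avoids decoding errors when the tokens contain non-ASCII characters.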