pythonbasictools.experiment_utils package¶
Submodules¶
pythonbasictools.experiment_utils.metadata_file module¶
- class pythonbasictools.experiment_utils.metadata_file.MetadataFile(output_dir: str | Path, filename: str = 'METADATA', data: dict | None = None, save_every_set: bool = True, **kwargs)¶
Bases: RunOutputFile
- DEFAULT_FILENAME: str = 'METADATA'¶
- EXT: str = '.meta.json'¶
- __init__(output_dir: str | Path, filename: str = 'METADATA', data: dict | None = None, save_every_set: bool = True, **kwargs)¶
- property env¶
pythonbasictools.experiment_utils.output_folder module¶
- class pythonbasictools.experiment_utils.output_folder.ExperimentState(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases: Enum
- FAILED = 'FAILED'¶
- FINISHED = 'FINISHED'¶
- RUNNING = 'RUNNING'¶
- UNKNOWN = 'UNKNOWN'¶
- WAITING = 'WAITING'¶
- class pythonbasictools.experiment_utils.output_folder.ExperimentStateFile(folder: str | Path, state: ExperimentState = ExperimentState.UNKNOWN)¶
Bases: object
- FILENAME = 'state'¶
- __init__(folder: str | Path, state: ExperimentState = ExperimentState.UNKNOWN)¶
- append(new_text: str)¶
- property path: Path¶
- read() str¶
- property state: ExperimentState¶
- write(new_text: str)¶
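The state-file pattern above (an `ExperimentState` enum persisted to a plain-text file named `state` in each experiment folder) can be sketched with the standard library only. This is a hypothetical re-implementation for illustration, not the real `ExperimentStateFile` API; the `write_state`/`read_state` helper names are invented here.

```python
from enum import Enum
from pathlib import Path


class ExperimentState(Enum):
    WAITING = "WAITING"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"
    UNKNOWN = "UNKNOWN"


def write_state(folder, state: ExperimentState) -> None:
    # Persist the enum value as plain text in a file named "state"
    # inside the experiment folder (mirrors FILENAME = 'state').
    path = Path(folder) / "state"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(state.value)


def read_state(folder) -> ExperimentState:
    # Read the state back; fall back to UNKNOWN on unrecognized content.
    text = (Path(folder) / "state").read_text().strip()
    try:
        return ExperimentState(text)
    except ValueError:
        return ExperimentState.UNKNOWN
```

Storing the enum's string value (rather than pickling the object) keeps the file human-readable and robust across library versions.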
- class pythonbasictools.experiment_utils.output_folder.OutputFolder(path: str | Path, metadata_file: MetadataFile | None = None, data_file: RunOutputFile | None = None)¶
Bases: object
- __init__(path: str | Path, metadata_file: MetadataFile | None = None, data_file: RunOutputFile | None = None)¶
- property data_file: RunOutputFile¶
- gather_files(pattern: str = '*') List[Path]¶
Gather files in the output folder matching the given pattern.
- Parameters:
pattern – The glob pattern to match files.
- Returns:
A list of file paths matching the pattern.
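The glob-based gathering described above can be sketched in a few lines of standard-library Python. This is an assumed equivalent of `gather_files`, not the library's implementation; sorting and the files-only filter are choices made here for determinism.

```python
from pathlib import Path
from typing import List


def gather_files(folder, pattern: str = "*") -> List[Path]:
    # Glob the folder for entries matching the pattern and keep
    # only regular files, sorted for reproducible ordering.
    return sorted(p for p in Path(folder).glob(pattern) if p.is_file())
```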
- classmethod gather_output_folders(root_folder: str | Path, metadata_ext: str = '.meta.json', data_ext: str = '.out.json') List[OutputFolder]¶
- property metadata_file: MetadataFile¶
- property path: Path¶
- classmethod root_folder_to_dataframe(root_folder: str | Path, metadata_ext: str = '.meta.json', data_ext: str = '.out.json') Tuple[DataFrame, DataFrame]¶
Gather all the metadata files and data files and build two dataframes: one for the metadata and one for the data. Each row in a dataframe corresponds to the data of one file. The two dataframes can be merged using the _output_folder column.
- Example:
>>> metadata_df, data_df = OutputFolder.root_folder_to_dataframe("./data/root")
>>> full_df = pd.merge(metadata_df, data_df, on="_output_folder")
- property state: ExperimentState¶
- property state_file: ExperimentStateFile¶
- update_data(other: dict | None = None, **kwargs)¶
- update_metadata(other: dict | None = None, **kwargs)¶
Updates the instance's metadata with the values from the provided dictionary and any additional keyword arguments. If no dictionary is provided, an empty dictionary is used and only the keyword arguments are applied.
- Parameters:
other (Optional[dict]) – Optional initial dictionary of metadata to update. Defaults to None.
kwargs – Additional metadata key-value pairs to update.
- Returns:
None
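The dict-merging behavior described for `update_metadata` can be sketched as a plain function. This is a minimal sketch of the documented semantics, assuming keyword arguments are applied after (and can override) the `other` dictionary; it is not the library's code.

```python
from typing import Optional


def update_metadata(metadata: dict, other: Optional[dict] = None, **kwargs) -> None:
    # Fall back to an empty dict when no dictionary is provided,
    # then apply the positional dict first and keyword args second.
    metadata.update(other or {})
    metadata.update(kwargs)
```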
pythonbasictools.experiment_utils.run_output_file module¶
- class pythonbasictools.experiment_utils.run_output_file.RunOutputFile(output_dir: str | Path, filename: str = 'run_output', data: dict | None = None, save_every_set: bool = True, **kwargs)¶
Bases: object
This object saves and loads the data of a script run to a JSON file. The data is stored as a dictionary and can be accessed as attributes of the object; it is written to a JSON file in the script's output directory. It is useful when you want to store some data or state of a script run to a file and load it later.
Example
```python
from run_output_file import RunOutputFile

output_file = RunOutputFile("output_dir", save_every_set=True)
output_file.update({"status": "STARTING"})
print("Doing some work...")
work = 1 + 1
output_file.update({"status": "WORKING", "work": work})
print("Doing some other stuff...")
stuff = 2 + 2
output_file.update({"status": "WORKING", "stuff": stuff})
print("Done working.")
output_file.update({"status": "DONE"})
```
- Parameters:
output_dir (Union[str, Path]) – The directory where the output file will be saved.
filename (str) – The name of the output file. Default is “run_output”.
data (dict) – The initial data to be saved to the output file.
save_every_set (bool) – If True, the data will be saved to the output file every time an item is added or updated.
kwargs – Additional keyword arguments
- DEFAULT_FILENAME: str = 'run_output'¶
- EXT: str = '.out.json'¶
- RAVEL_DICT_KEY_SEP = '.'¶
- __init__(output_dir: str | Path, filename: str = 'run_output', data: dict | None = None, save_every_set: bool = True, **kwargs)¶
- as_dataframe(add_path: bool = True) DataFrame¶
- as_series(add_path: bool = True) Series¶
- property exists: bool¶
- classmethod from_file(path: str | Path, **kwargs)¶
- get(key, default=None)¶
- get_raveled_state(key_sep: str | None = None)¶
- legacy_load()¶
- load()¶
- load_if_exists()¶
- log(msg: str, level=20, print_msg: bool = True, **kwargs)¶
- classmethod parse_results_from_dir_to_dataframe(root_dir: str | Path, requires_columns: List[str] | None = None, file_column: str = '_file', mp_kwargs: Dict[str, Any] | None = None) DataFrame¶
Parse the results from a root directory. If requires_columns is not None, the rows that do not contain all the required columns will be removed.
- Parameters:
root_dir (str) – The directory containing the results.
requires_columns (Optional[List[str]]) – The columns that are required in the DataFrame. If None, no columns are required. If a row does not contain all the required columns, it will be removed from the final DataFrame.
file_column (str) – The name of the column that contains the file path.
mp_kwargs (Optional[Dict[str, Any]]) – The keyword arguments to pass to the multiprocessing function.
- Returns:
The DataFrame containing the results.
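The gather-then-filter logic described above can be sketched without pandas or multiprocessing. This is a hypothetical stdlib-only approximation returning a list of row dicts instead of a DataFrame; the function name, the `*.out.json` glob, and the row-dict return type are assumptions made here.

```python
import json
from pathlib import Path
from typing import List, Optional


def parse_results_from_dir(
    root_dir,
    requires_columns: Optional[List[str]] = None,
    file_column: str = "_file",
) -> List[dict]:
    # Load every *.out.json found under root_dir into a row dict,
    # recording the source path under file_column.
    rows = []
    for path in sorted(Path(root_dir).rglob("*.out.json")):
        row = json.loads(path.read_text())
        row[file_column] = str(path)
        rows.append(row)
    # Drop rows missing any required column, mirroring the documented filter.
    if requires_columns:
        rows = [r for r in rows if all(c in r for c in requires_columns)]
    return rows
```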
- property path: Path¶
- print_logs(level=20, sep: str = '\n')¶
- property raveled_state¶
- classmethod raveled_state_from_file(path: str | Path, raise_on_error: bool = False, **kwargs) dict¶
Get the raveled state of the data in the file.
- Parameters:
path (str) – The path to the file.
raise_on_error (bool) – If True, raise an error if an error occurs while loading the file. If False, return an empty dictionary if an error occurs.
kwargs – Additional keyword arguments.
- Returns:
The raveled state of the data in the file.
- Return type:
dict
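"Raveling" here appears to mean flattening a nested dictionary into a single level, joining keys with `RAVEL_DICT_KEY_SEP` (`'.'`). A minimal sketch of that flattening, assuming recursive descent into dict values only, might look like this (not the library's implementation):

```python
def ravel_dict(d: dict, key_sep: str = ".", prefix: str = "") -> dict:
    # Flatten nested dicts into a single level, joining the key path
    # with key_sep, e.g. {"a": {"b": 1}} -> {"a.b": 1}.
    flat = {}
    for k, v in d.items():
        key = f"{prefix}{key_sep}{k}" if prefix else str(k)
        if isinstance(v, dict):
            flat.update(ravel_dict(v, key_sep, key))
        else:
            flat[key] = v
    return flat
```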
- save()¶
- save_if_save_every_set()¶
- update(other: Dict[str, Any], print_updated: bool = True, print_header: str = 'New Data')¶
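The save-on-update behavior described for `RunOutputFile` (a dict persisted to `<output_dir>/<filename>.out.json` whenever `save_every_set` is true) can be sketched with the standard library. This `MiniRunOutputFile` is a hypothetical, simplified stand-in for illustration, not the real class; it omits attribute access, logging, and the dataframe helpers.

```python
import json
from pathlib import Path
from typing import Optional


class MiniRunOutputFile:
    # Sketch of the RunOutputFile idea: a dict that writes itself to a
    # JSON file in the output directory on every update.
    EXT = ".out.json"

    def __init__(self, output_dir, filename: str = "run_output",
                 data: Optional[dict] = None, save_every_set: bool = True):
        self.path = Path(output_dir) / (filename + self.EXT)
        self.data = dict(data or {})
        self.save_every_set = save_every_set

    def update(self, other: dict) -> None:
        self.data.update(other)
        if self.save_every_set:
            self.save()

    def save(self) -> None:
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(self.data))

    def load(self) -> dict:
        self.data = json.loads(self.path.read_text())
        return self.data
```

With `save_every_set=True`, the file always reflects the latest state, so a crashed run leaves its most recent status on disk for later inspection.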