PythonBasicTools

pythonbasictools.experiment_utils package

Submodules

pythonbasictools.experiment_utils.metadata_file module

class pythonbasictools.experiment_utils.metadata_file.MetadataFile(output_dir: str | Path, filename: str = 'METADATA', data: dict | None = None, save_every_set: bool = True, **kwargs)

Bases: RunOutputFile

DEFAULT_FILENAME: str = 'METADATA'
EXT: str = '.meta.json'
__init__(output_dir: str | Path, filename: str = 'METADATA', data: dict | None = None, save_every_set: bool = True, **kwargs)
property env
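The listing above gives only the class shape, so here is a minimal stdlib sketch of what a metadata file of this form might look like on disk. The `METADATA.meta.json` name follows `DEFAULT_FILENAME` and `EXT` above; the helper names and the dict-to-JSON round trip are assumptions, not the library's implementation.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in: write a metadata dict to "<output_dir>/METADATA.meta.json",
# mirroring DEFAULT_FILENAME + EXT from the class above.
def write_metadata(output_dir, data: dict) -> Path:
    path = Path(output_dir) / "METADATA.meta.json"
    path.write_text(json.dumps(data, indent=2))
    return path

def read_metadata(path) -> dict:
    return json.loads(Path(path).read_text())

with tempfile.TemporaryDirectory() as tmp:
    p = write_metadata(tmp, {"seed": 42, "model": "baseline"})
    assert p.name == "METADATA.meta.json"
    assert read_metadata(p) == {"seed": 42, "model": "baseline"}
```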

pythonbasictools.experiment_utils.output_folder module

class pythonbasictools.experiment_utils.output_folder.ExperimentState(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

FAILED = 'FAILED'
FINISHED = 'FINISHED'
RUNNING = 'RUNNING'
UNKNOWN = 'UNKNOWN'
WAITING = 'WAITING'
class pythonbasictools.experiment_utils.output_folder.ExperimentStateFile(folder: str | Path, state: ExperimentState = ExperimentState.UNKNOWN)

Bases: object

FILENAME = 'state'
__init__(folder: str | Path, state: ExperimentState = ExperimentState.UNKNOWN)
append(new_text: str)
property path: Path
read() str
property state: ExperimentState
write(new_text: str)
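The `ExperimentStateFile` API above (a plain-text file named `state` with `read`, `write`, `append`, and a `state` property) can be pictured with a self-contained stdlib sketch. This mirrors the documented method names but is an assumption about the behaviour, including the fallback to `UNKNOWN` for unparseable content.

```python
import enum
import tempfile
from pathlib import Path

# Hypothetical stand-in mirroring the API above: a plain-text "state" file
# whose content names one ExperimentState member.
class ExperimentState(enum.Enum):
    FAILED = "FAILED"
    FINISHED = "FINISHED"
    RUNNING = "RUNNING"
    UNKNOWN = "UNKNOWN"
    WAITING = "WAITING"

class StateFile:
    FILENAME = "state"

    def __init__(self, folder, state=ExperimentState.UNKNOWN):
        self.path = Path(folder) / self.FILENAME
        if not self.path.exists():
            self.write(state.value)

    def write(self, new_text: str):
        self.path.write_text(new_text)

    def append(self, new_text: str):
        with self.path.open("a") as f:
            f.write(new_text)

    def read(self) -> str:
        return self.path.read_text()

    @property
    def state(self) -> ExperimentState:
        # Assumed fallback: unreadable content maps to UNKNOWN.
        try:
            return ExperimentState(self.read().strip())
        except ValueError:
            return ExperimentState.UNKNOWN

with tempfile.TemporaryDirectory() as tmp:
    sf = StateFile(tmp)
    sf.write(ExperimentState.RUNNING.value)
    assert sf.state is ExperimentState.RUNNING
```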
class pythonbasictools.experiment_utils.output_folder.OutputFolder(path: str | Path, metadata_file: MetadataFile | None = None, data_file: RunOutputFile | None = None)

Bases: object

__init__(path: str | Path, metadata_file: MetadataFile | None = None, data_file: RunOutputFile | None = None)
property data_file: RunOutputFile
gather_files(pattern: str = '*') List[Path]

Gather files in the output folder matching the given pattern.

Parameters:

pattern – The glob pattern to match files.

Returns:

A list of file paths matching the pattern.
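The behaviour described for `gather_files` can be reproduced with `pathlib.Path.glob`; the sketch below assumes that equivalence rather than quoting the library's implementation.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "run.out.json").touch()
    (folder / "METADATA.meta.json").touch()
    (folder / "notes.txt").touch()

    # Equivalent of gather_files(pattern="*.json"): glob inside the folder.
    matches = sorted(p.name for p in folder.glob("*.json"))
    assert matches == ["METADATA.meta.json", "run.out.json"]
```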

classmethod gather_output_folders(root_folder: str | Path, metadata_ext: str = '.meta.json', data_ext: str = '.out.json') List[OutputFolder]
property metadata_file: MetadataFile
property path: Path
classmethod root_folder_to_dataframe(root_folder: str | Path, metadata_ext: str = '.meta.json', data_ext: str = '.out.json') Tuple[DataFrame, DataFrame]

Gather all the metadata files and data files and build two dataframes: one for the metadata and one for the data. Each row in a dataframe corresponds to the contents of one file. The two dataframes can be merged using the _output_folder column.

Example:

>>> metadata_df, data_df = OutputFolder.root_folder_to_dataframe("./data/root")
>>> full_df = pd.merge(metadata_df, data_df, on="_output_folder")
property state: ExperimentState
property state_file: ExperimentStateFile
update_data(other: dict | None = None, **kwargs)
update_metadata(other: dict | None = None, **kwargs)

Updates the instance's metadata with the entries of the provided dictionary and any additional keyword arguments. If no dictionary is provided, an empty dictionary is used.

Parameters:
  • other (Optional[dict]) – Optional initial dictionary of metadata to update. Defaults to None.

  • kwargs – Additional metadata key-value pairs to update.

Returns:

None
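The merge of `other` plus keyword arguments described above follows plain `dict.update` semantics; a small sketch (the helper name is hypothetical):

```python
def merged_update(current: dict, other=None, **kwargs) -> dict:
    # Mirror the documented behaviour: fall back to an empty dict,
    # then apply `other` first and the keyword arguments second.
    updated = dict(current)
    updated.update(other or {})
    updated.update(kwargs)
    return updated

meta = merged_update({"run": 1}, {"status": "RUNNING"}, seed=7)
assert meta == {"run": 1, "status": "RUNNING", "seed": 7}
```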

pythonbasictools.experiment_utils.run_output_file module

class pythonbasictools.experiment_utils.run_output_file.RunOutputFile(output_dir: str | Path, filename: str = 'run_output', data: dict | None = None, save_every_set: bool = True, **kwargs)

Bases: object

This object is used to save and load data of a script run to a JSON file. The data is saved as a dictionary and can be accessed as an attribute of the object. The data is saved to a JSON file in the output directory of the script. It is useful when you want to store some data or state of the script run to a file and load it later.

Example

```python
from pythonbasictools.experiment_utils.run_output_file import RunOutputFile

output_file = RunOutputFile("output_dir", save_every_set=True)
output_file.update({"status": "STARTING"})
print("Doing some work...")
work = 1 + 1
output_file.update({"status": "WORKING", "work": work})
print("Doing some other stuff...")
stuff = 2 + 2
output_file.update({"status": "WORKING", "stuff": stuff})
print("Done working.")
output_file.update({"status": "DONE"})
```

Parameters:
  • output_dir (Union[str, Path]) – The directory where the output file will be saved.

  • filename (str) – The name of the output file. Default is "run_output".

  • data (dict) – The initial data to be saved to the output file.

  • save_every_set (bool) – If True, the data will be saved to the output file every time an item is added or updated.

  • kwargs – Additional keyword arguments

DEFAULT_FILENAME: str = 'run_output'
EXT: str = '.out.json'
RAVEL_DICT_KEY_SEP = '.'
__init__(output_dir: str | Path, filename: str = 'run_output', data: dict | None = None, save_every_set: bool = True, **kwargs)
as_dataframe(add_path: bool = True) DataFrame
as_series(add_path: bool = True) Series
property exists: bool
classmethod from_file(path: str | Path, **kwargs)
get(key, default=None)
get_raveled_state(key_sep: str | None = None)
legacy_load()
load()
load_if_exists()
log(msg: str, level=20, print_msg: bool = True, **kwargs)
classmethod parse_results_from_dir_to_dataframe(root_dir: str | Path, requires_columns: List[str] | None = None, file_column: str = '_file', mp_kwargs: Dict[str, Any] | None = None) DataFrame

Parse the results from a root directory. If requires_columns is not None, the rows that do not contain all the required columns will be removed.

Parameters:
  • root_dir (str) – The directory containing the results.

  • requires_columns (Optional[List[str]]) – The columns that are required in the DataFrame. If None, no columns are required. If a row does not contain all the required columns, it will be removed from the final DataFrame.

  • file_column (str) – The name of the column that contains the file path.

  • mp_kwargs (Optional[Dict[str, Any]]) – The keyword arguments to pass to the multiprocessing function.

Returns:

The DataFrame containing the results.
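The required-columns filtering described above can be pictured with a stdlib sketch over plain dicts instead of a DataFrame. The function name and row representation are hypothetical; only the drop-rows-missing-required-columns rule comes from the documentation.

```python
def filter_rows(rows, requires_columns=None):
    # Keep only the rows that contain every required column, as described above.
    if not requires_columns:
        return list(rows)
    return [r for r in rows if all(c in r for c in requires_columns)]

rows = [
    {"_file": "a.out.json", "loss": 0.1, "acc": 0.9},
    {"_file": "b.out.json", "loss": 0.2},  # missing "acc": dropped
]
kept = filter_rows(rows, requires_columns=["loss", "acc"])
assert [r["_file"] for r in kept] == ["a.out.json"]
```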

property path: Path
print_logs(level=20, sep: str = '\n')
property raveled_state
classmethod raveled_state_from_file(path: str | Path, raise_on_error: bool = False, **kwargs) dict

Get the raveled state of the data in the file.

Parameters:
  • path (str) – The path to the file.

  • raise_on_error (bool) – If True, raise an error if an error occurs while loading the file. If False, return an empty dictionary if an error occurs.

  • kwargs – Additional keyword arguments.

Returns:

The raveled state of the data in the file.

Return type:

dict
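Given `RAVEL_DICT_KEY_SEP = '.'` above, the "raveled state" is presumably the nested data dict flattened into dotted keys. A sketch of that transformation (the function name is hypothetical; the library's actual flattening may differ):

```python
def ravel_dict(data: dict, key_sep: str = ".", prefix: str = "") -> dict:
    # Flatten nested dicts into a single level of dotted keys,
    # e.g. {"a": {"b": 1}} -> {"a.b": 1}.
    flat = {}
    for key, value in data.items():
        full_key = f"{prefix}{key_sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(ravel_dict(value, key_sep, full_key))
        else:
            flat[full_key] = value
    return flat

state = {"metrics": {"loss": 0.1, "acc": 0.9}, "status": "DONE"}
assert ravel_dict(state) == {"metrics.loss": 0.1, "metrics.acc": 0.9, "status": "DONE"}
```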

save()
save_if_save_every_set()
update(other: Dict[str, Any], print_updated: bool = True, print_header: str = 'New Data')

Module contents