Base Wrapper
Overview Information Here: boa.wrappers
- class boa.wrappers.base_wrapper.BaseWrapper(config_path: Optional[PathLike] = None, config: Optional[BOAConfig] = None, setup=True, *args, **kwargs)
Bases: object
- Parameters
config_path (PathLike) –
config (BOAConfig) –
- property metric_names
List of metric names associated with this experiment
- property metric_params: dict
Dictionary mapping each metric name to the list of parameter names associated with that metric
- property experiment_dir
- property working_dir
- property output_dir
- load_config(config_path: PathLike, *args, **kwargs) → BOAConfig
load_config takes the path to a JSON or YAML configuration file and returns your configuration dataclass.
Unless overridden in a subclass, load_config applies some basic “normalizations” to your configuration for convenience. See BOAConfig and its __init__ method for more information about how the normalization works and which config options you can control.
This default implementation should work for most JSON or YAML files, but it can be overridden in subclasses if need be.
- Parameters
config_path (PathLike) – File path for the experiment configuration file
- Returns
loaded_config
- Return type
BOAConfig
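As a rough illustration of the "JSON or YAML" dispatch described above, a subclass that overrides load_config might start from something like the sketch below. The helper name load_raw_config is hypothetical, it returns a plain dict rather than a BOAConfig, and it skips the normalization step entirely; it is not boa's own implementation.

```python
import json
from pathlib import Path


def load_raw_config(config_path):
    """Read a JSON or YAML file into a plain dict, chosen by file suffix.

    Hypothetical helper for illustration only; it does not build a BOAConfig
    and performs none of the normalization described above.
    """
    path = Path(config_path)
    suffix = path.suffix.lower()
    text = path.read_text()
    if suffix == ".json":
        return json.loads(text)
    if suffix in (".yaml", ".yml"):
        # PyYAML is imported lazily so JSON-only use needs no extra install.
        import yaml

        return yaml.safe_load(text)
    raise ValueError(f"Unsupported config extension: {suffix!r}")
```

Dispatching on the suffix keeps the two formats interchangeable for callers, which only ever pass a path.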
- mk_experiment_dir(experiment_dir: PathLike = None, output_dir: PathLike = None, experiment_name: str = None, append_timestamp: bool = None, **kwargs) → Path
Make the experiment directory that boa will write all of its trials and logs to.
All parameters can also be set in your configuration file:
experiment_dir -> optimization_options -> experiment_dir
experiment_name -> optimization_options -> experiment -> name
append_timestamp -> script_options -> append_timestamp
- Parameters
experiment_dir (PathLike) – Path to the directory for the output of the experiment. You may specify this or output_dir in your configuration file instead. (Defaults to the value in your configuration file, then None)
output_dir (PathLike) – Output directory of the project. If you specify output_dir, output will be saved in output_dir / experiment_name. Because of this, only one of experiment_dir or output_dir may be specified. (If neither is specified, output_dir defaults to the current working directory, i.e. whatever pwd returns, or its equivalent on Windows)
experiment_name (str) – Name of the experiment, used together with output_dir to build the path to the experiment directory. (Defaults to the value in your configuration file, then boa_runs)
append_timestamp (bool) – Whether to append a timestamp to the end of the experiment directory name to ensure uniqueness. (Defaults to the value in your configuration file, then True)
- Return type
Path
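The path rules above can be sketched as a small standalone function. This is only an illustration of the documented precedence, not boa's code: the function name resolve_experiment_dir, the timestamp format, and the choice to append the timestamp to an explicitly given experiment_dir as well are all assumptions. It also only computes the path and does not create the directory.

```python
from datetime import datetime
from pathlib import Path


def resolve_experiment_dir(experiment_dir=None, output_dir=None,
                           experiment_name="boa_runs", append_timestamp=True):
    """Compute (but do not create) an experiment directory path."""
    if experiment_dir is not None and output_dir is not None:
        raise ValueError("Specify experiment_dir or output_dir, not both.")
    if experiment_dir is not None:
        base = Path(experiment_dir)
    else:
        # Fall back to the current working directory when no output_dir is given.
        root = Path(output_dir) if output_dir is not None else Path.cwd()
        base = root / experiment_name
    if append_timestamp:
        # Assumed timestamp format; the real format may differ.
        stamp = datetime.now().strftime("%Y%m%dT%H%M%S")
        base = base.with_name(f"{base.name}_{stamp}")
    return base
```

For example, resolve_experiment_dir(output_dir="out", experiment_name="exp", append_timestamp=False) gives the path out/exp.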
- setup(*args, **kwargs)
Method for subclasses to override to run any setup code they need, either on class init (which happens by default unless you pass setup=False) or after init by calling this method directly.
By default, this method runs mk_experiment_dir. If you override this method to do more setup, either include a call to mk_experiment_dir (the default version or your own implementation) or call super().setup(*args, **kwargs), which runs the original version and in turn calls mk_experiment_dir.
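The override pattern described above looks roughly like this. To keep the sketch runnable without boa installed, StandInWrapper is a hypothetical stand-in that mimics only the documented behavior (setup runs on init unless setup=False, and the default setup creates the experiment directory); the load_observations step is likewise made up for illustration.

```python
class StandInWrapper:
    """Hypothetical stand-in for BaseWrapper so this sketch runs without boa."""

    def __init__(self, setup=True):
        self.calls = []  # records which steps ran, for demonstration only
        if setup:
            self.setup()

    def mk_experiment_dir(self):
        self.calls.append("mk_experiment_dir")

    def setup(self, *args, **kwargs):
        self.mk_experiment_dir()


class MyWrapper(StandInWrapper):
    def setup(self, *args, **kwargs):
        # Let the default setup create the experiment directory first.
        super().setup(*args, **kwargs)
        # Then run model-specific setup (a hypothetical step).
        self.calls.append("load_observations")
```

Calling super().setup first means the experiment directory already exists by the time the extra setup code runs.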
- write_configs(trial: Trial) → None
This function is usually used to write out the configuration files used in an individual optimization trial run, or to dynamically write a run script to start an optimization trial run.
- Parameters
trial (Trial) –
- Return type
None
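A minimal sketch of what a write_configs body might do: dump the trial's parameters to a per-trial JSON file. The helper name write_trial_config, the trial_<index> directory layout, and the params.json file name are assumptions for illustration; only trial.index and trial.arm.parameters come from the documentation on this page. The helper returns the written path for convenience, whereas write_configs itself returns None.

```python
import json
from pathlib import Path


def write_trial_config(trial, experiment_dir):
    """Write the trial's parameters to <experiment_dir>/trial_<index>/params.json."""
    trial_dir = Path(experiment_dir) / f"trial_{trial.index}"  # assumed layout
    trial_dir.mkdir(parents=True, exist_ok=True)
    config_file = trial_dir / "params.json"  # assumed file name
    config_file.write_text(json.dumps(trial.arm.parameters, indent=2))
    return config_file
```

A model that reads its inputs from a file can then be pointed at this path when the trial is run.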
- run_model(trial: Trial) → None
Runs a model by deploying a given trial.
- Parameters
trial (Trial) –
- Return type
None
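One possible way to deploy a trial is to start the model as a background process and send its output to a log file that can be inspected later. The sketch below assumes the model is launched from a command line; the helper name launch_model and the log-file approach are illustrative choices, not something boa prescribes.

```python
import subprocess


def launch_model(command, log_path):
    """Start a model run in the background, sending its output to log_path.

    Returns the Popen handle without waiting for the run to finish.
    """
    with open(log_path, "w") as log_file:
        # The child process receives its own handle to the log file, so the
        # parent can close its copy as soon as the process has started.
        process = subprocess.Popen(
            command, stdout=log_file, stderr=subprocess.STDOUT
        )
    return process
```

Returning immediately, instead of waiting for the process, lets the trial's status be checked later when it is polled.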
- set_trial_status(trial: Trial) → None
Marks the status of a trial to reflect the status of the model run for the trial.
Each trial is polled periodically to determine its status (completed, failed, still running, etc.). This function defines the criteria for deciding which of those states the model run for a trial is in, and the trial status is updated accordingly when the trial is polled.
The approach for determining the trial status will depend on the structure of the particular model and its outputs. One example is checking the log files of the model.
Todo
Add examples/links of different approaches
- Parameters
trial (Trial) –
- Return type
None
Examples
trial.mark_completed()
trial.mark_failed()
trial.mark_abandoned()
trial.mark_early_stopped()
You can also do:
from ax.core.base_trial import TrialStatus
trial.mark_as(TrialStatus.COMPLETED)
or:
trial.mark_as(3)  # TrialStatus is an ENUM with COMPLETED being equivalent to 3
Relevant ENUM list
You can set the status using either the text version or its numerical equivalent (status name – numerical equivalent):
FAILED – 2
COMPLETED – 3
RUNNING – 4 (you don’t need to set it to running, it is already set to running)
ABANDONED – 5
EARLY_STOPPED – 7
See also
# TODO add sphinx link to ax trial status
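As one example of the log-file approach mentioned above, a set_trial_status body could scan the model's log for marker strings. The helper name update_status_from_log and the markers "ERROR" and "Run complete" are assumptions about a hypothetical model's output; trial.mark_failed() and trial.mark_completed() are the calls listed in the examples above.

```python
from pathlib import Path


def update_status_from_log(trial, log_path):
    """Mark the trial based on marker strings found in the model's log file."""
    path = Path(log_path)
    if not path.exists():
        return  # no output yet, so leave the trial as running
    text = path.read_text()
    if "ERROR" in text:  # assumed failure marker
        trial.mark_failed()
    elif "Run complete" in text:  # assumed success marker
        trial.mark_completed()
    # Otherwise the run is still in progress; leave the status untouched.
```

Checking the failure marker first means a run that logs an error and then a completion message is still treated as failed.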
- fetch_trial_data(*, parameters: Dict[str, Optional[Union[str, bool, float, int]]], metric_name: str, metric_properties: dict, trial: Trial, param_names: list[str] = None, **kwargs) → dict
Retrieves the trial data for either the one metric that is specified in metric_name or all metrics at once.
For example, for a case where you are minimizing the error between a model and observations, using RMSE as a metric, this function would load the model output and the corresponding observation data that will be passed to the RMSE metric.
The return value of this function is a dictionary of dictionaries. Each key is the name of a metric, and each sub-dictionary holds the keyword arguments to pass to that metric's function. If you are returning just one metric, you do not need the nested dictionary and can return the dictionary of keyword arguments directly.
Among these keyword arguments, you can also specify the key "sem" for the standard error of this metric on this trial.
- Parameters
parameters (Dict[str, Optional[Union[str, bool, float, int]]]) – The parameters for the current trial. Format is a dictionary of key value pairs. This is a convenience argument, as these can also be accessed as trial.arm.parameters.
metric_name (str) – The name of the metric that the arguments are being fetched for, if you choose to return only one metric at a time
metric_properties (dict) – Collection of all metric properties for all metrics as a nested dictionary. The properties of a specific metric can be accessed as metric_properties["metric_name1"]
trial (Trial) – The current trial. parameters can be accessed as trial.arm.parameters and trial index can be accessed by trial.index
param_names (list[str]) – A list of names of parameters to restrict this metric_name metric to. Useful for filtering out parameters before those parameters are passed to your metric. Defaults to [].
- Returns
A dictionary whose keys are the names of specific metrics and whose values are dictionaries of keyword arguments to pass to that metric's function. For example, Mean uses np.mean, which expects the parameter a (an array-like object), so you could return {"Mean": {"a": [1, 2, 3, 4]}}. You can also include a key "sem", the standard error of the mean for this trial's metric value.
Example return values:
{
    "Mean": {"a": trial.arm.parameters, "sem": 4.5},
    "RMSE": {
        "y_true": [1.12, 1.25, 2.54, 4.52],
        "y_pred": trial.arm.parameters,
    },
}
{"Mean": {"a": trial.arm.parameters}}
{"a": trial.arm.parameters, "sem": 1}
- Return type
dict
Examples
This example returns all the metrics at once. In practice, you would replace the hard-coded values with your own calculation (a “calc_stuff” of whatever you need to pass to each metric).
>>> def fetch_trial_data(self, parameters, metric_name, metric_properties, trial, param_names=None, **kwargs):
...     values = list(parameters.values())  # a list, so array functions can consume it
...     return {
...         "Mean": {"a": values, "sem": 4.5},
...         "RMSE": {
...             "y_true": [1.12, 1.25, 2.54, 4.52],
...             "y_pred": values,
...         },
...     }
This one returns only one metric at a time. It is fragile in that it will break if you change the names of the metrics in the config, but for quick and dirty things it can be great.
>>> def fetch_trial_data(self, parameters, metric_name, metric_properties, trial, param_names=None, **kwargs):
...     values = list(parameters.values())  # a list, so array functions can consume it
...     if metric_name == "Mean":
...         return {"a": values, "sem": 4.5}
...     elif metric_name == "RMSE":
...         return {
...             "y_true": [1.12, 1.25, 2.54, 4.52],
...             "y_pred": values,
...         }
This one is a little more complicated. It assumes that, in your config, you define a properties section for each metric, which allows arbitrary information to be passed. You can then associate a particular metric with a function and look that function up at runtime in a dictionary (a hashmap, if you are coming from other languages).
>>> import numpy as np
>>> def func_a(array):
...     return np.mean(np.exp(array))
>>> def func_b(array):
...     return np.exp(np.mean(array))
>>> funcs = {func_a.__name__: func_a, func_b.__name__: func_b}
>>> def fetch_trial_data(self, parameters, metric_name, metric_properties, trial, param_names=None, **kwargs):
...     # we define in our config the names of functions to associate with certain metrics
...     # and look them up at run time
...     func = funcs[metric_properties[metric_name]["function"]]
...     return {"a": func(list(parameters.values()))}
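Two details from the parameter descriptions above, filtering by param_names and supplying "sem", can be sketched with small helpers. Both helper names are hypothetical. Treating an empty or missing param_names as "no filtering" is an assumption, and so is the idea that the model produced several replicate outputs from which a standard error can be estimated.

```python
import statistics


def filter_parameters(parameters, param_names=None):
    """Keep only the parameters named in param_names (no filtering if empty)."""
    if not param_names:
        return dict(parameters)
    return {name: value for name, value in parameters.items() if name in param_names}


def mean_kwargs_with_sem(replicate_outputs):
    """Build keyword arguments for a Mean-style metric, with its standard error.

    replicate_outputs is assumed to be a list of repeated model outputs.
    """
    kwargs = {"a": list(replicate_outputs)}
    if len(replicate_outputs) > 1:
        # Standard error of the mean: sample standard deviation / sqrt(n).
        sem = statistics.stdev(replicate_outputs) / len(replicate_outputs) ** 0.5
        kwargs["sem"] = sem
    return kwargs
```

A fetch_trial_data implementation could call filter_parameters(parameters, param_names) before handing values to the model, and return mean_kwargs_with_sem(outputs) as the single-metric dictionary.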