.. # Copyright (C) 2020-2023 Intel Corporation
.. # SPDX-License-Identifier: Apache-2.0

.. _log_metric_callback:

***********************
Metric Logging Callback
***********************

By default, both the director-based flow and the taskrunner API support TensorBoard for logging metrics. Once the experiment is over, the logs can be viewed from the workspace with :code:`tensorboard --logdir logs`. The metrics that are logged by default are:

- Aggregated model validation accuracy (Aggregator/aggregated_model_validate/acc, validate_agg/aggregated_model_validate/acc)
- Locally tuned model validation accuracy (Aggregator/locally_tuned_model_validate/acc, validate_local/locally_tuned_model_validate/acc)
- Train loss (Aggregator/train/train_loss, trained/train/train_loss)

You can also use a custom metric logging function for each task via the Python\*\ API or the command line interface. This function is called on the aggregator node.

Python API
==========

To log metrics through TensorBoard, call :code:`fl_experiment.stream_metrics()` from the frontend API; it saves the logs in the TensorBoard format. After the experiment has finished, the logs can be viewed from the workspace with :code:`tensorboard --logdir logs`.

You can also add your own metric logging function by defining a function with the following signature:

.. code-block:: python

    def callback_name(node_name, task_name, metric_name, metric, round_number):
        """
        Write metric callback

        Args:
            node_name (str): Name of the node that generated the metric
            task_name (str): Name of the task
            metric_name (str): Name of the metric
            metric (np.ndarray): Metric value
            round_number (int): Round number
        """
        # Your code here

Example of MLflow's Metric Callback
===================================

This example shows how to use the MLflow logger to log metrics:

.. code-block:: python

    import mlflow

    def callback_name(node_name, task_name, metric_name, metric, round_number):
        """
        Write metric callback

        Args:
            node_name (str): Name of the node that generated the metric
            task_name (str): Name of the task
            metric_name (str): Name of the metric
            metric (np.ndarray): Metric value
            round_number (int): Round number
        """
        # Log the metric under a "node/task/metric" key, together with the round number
        mlflow.log_metrics({f'{node_name}/{task_name}/{metric_name}': float(metric),
                            'round_number': round_number})

You can view the logged results either interactively in the UI by running :code:`mlflow ui`, or programmatically through :code:`MlflowClient`. By default, only the last logged value of a metric is returned. To retrieve all the values of a given metric, use the :code:`MlflowClient.get_metric_history` method.

.. code-block:: python

    import mlflow

    client = mlflow.tracking.MlflowClient()
    # The first argument is the run ID; the second is the metric key
    print(client.get_metric_history("", "validate_local/locally_tuned_model_validation/accuracy"))

Command Line Interface
======================

To log through TensorBoard, set the parameter :code:`write_logs : true` in the aggregator's plan settings:

.. code-block:: yaml

    aggregator :
      template : openfl.component.Aggregator
      settings :
        write_logs : true

Follow the steps below to write your own callback function instead. As an example, a full implementation can be found in Federated_Pytorch_MNIST_Tutorial.ipynb and in the **torch_cnn_mnist** workspace.

1. Define the callback function, as described in the Python API section, in the **src** directory in your workspace.

2. Provide a path to your function with the ``log_metric_callback`` key in the ``aggregator`` section of the **plan.yaml** file in your workspace.

   .. code-block:: yaml

       aggregator :
         defaults : plan/defaults/aggregator.yaml
         template : openfl.component.Aggregator
         settings :
           init_state_path     : save/torch_cnn_mnist_init.pbuf
           best_state_path     : save/torch_cnn_mnist_best.pbuf
           last_state_path     : save/torch_cnn_mnist_last.pbuf
           rounds_to_train     : 10
           write_logs          : true
           log_metric_callback :
             template : src.mnist_utils.callback_name

Example of TensorBoard's Metric Callback
========================================

The following is an example of a log metric callback that writes metric values to TensorBoard.

.. code-block:: python

    from torch.utils.tensorboard import SummaryWriter

    # Event files are written under ./logs/cnn_mnist and flushed every 5 seconds
    writer = SummaryWriter('./logs/cnn_mnist', flush_secs=5)

    def write_metric(node_name, task_name, metric_name, metric, round_number):
        writer.add_scalar("{}/{}/{}".format(node_name, task_name, metric_name),
                          metric, round_number)
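
The callback signature described in the Python API section does not depend on any particular logging backend. As a dependency-free illustration, the sketch below appends each metric to a CSV file using only the standard library. The function name ``write_metric_csv``, the default file name ``metrics.csv``, and the optional ``path`` argument are hypothetical choices made for this example and are not part of OpenFL. It also assumes the metric is a scalar-like value that ``float()`` can convert.

```python
import csv
import os

# Hypothetical default output file; not an OpenFL setting.
DEFAULT_CSV_PATH = "metrics.csv"
FIELDS = ["node_name", "task_name", "metric_name", "metric", "round_number"]


def write_metric_csv(node_name, task_name, metric_name, metric, round_number,
                     path=None):
    """Append one metric record to a CSV file.

    The first five parameters follow the callback signature described above.
    ``path`` is an extra, optional argument added only for this sketch.
    """
    path = path or DEFAULT_CSV_PATH
    # Write the header only when the file is missing or still empty.
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "node_name": node_name,
            "task_name": task_name,
            "metric_name": metric_name,
            # Assumes a scalar-like metric (for example, a 0-d array).
            "metric": float(metric),
            "round_number": int(round_number),
        })
```

Because ``path`` is optional, the function can still be called with the five documented arguments. A plain file like this is easy to inspect or load into a spreadsheet, but it offers none of the run tracking that MLflow or TensorBoard provide.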