Fairness indicator metrics do not show up #138

Open
zywind opened this issue Aug 4, 2021 · 5 comments

@zywind
zywind commented Aug 4, 2021

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow Model Analysis): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Chromium OS 12.0_pre408248_p20201125-r7
  • TensorFlow Model Analysis installed from (source or binary): binary
  • TensorFlow Model Analysis version (use command below): 0.26
  • Python version: 3.7.9
  • Jupyter Notebook version: 6.1.3
  • Exact command to reproduce:
    First method using TFX
import tensorflow_model_analysis as tfma
from tfx.components import Evaluator

eval_config = tfma.EvalConfig(
  model_specs=[
    tfma.ModelSpec(label_key='toxicity', signature_name='serve_tfexample')
  ],
  metrics_specs=[
    tfma.MetricsSpec(
      metrics=[
        tfma.MetricConfig(class_name='ExampleCount'),
        tfma.MetricConfig(class_name='BinaryAccuracy')
      ]
    )
  ],
  slicing_specs=[
    tfma.SlicingSpec(),
    tfma.SlicingSpec(feature_keys=['race']),
  ]
)

evaluator = Evaluator(
  examples=example_gen.outputs['examples'],
  schema=schema_gen.outputs['schema'],
  model=trainer.outputs['model'],
  fairness_indicator_thresholds=[0.3, 0.5, 0.7],
  eval_config=eval_config,
)

context.run(evaluator)
eval_result_uri = evaluator.outputs['evaluation'].get()[0].uri
eval_result = tfma.load_eval_result(eval_result_uri)

from tensorflow_model_analysis.addons.fairness.view import widget_view
widget_view.render_fairness_indicator(eval_result=eval_result)

Second method without TFX

tfma_eval_result_path = './bert/tfma_eval_result'
import tensorflow_model_analysis as tfma
# Importing this module makes tfma.post_export_metrics.fairness_indicators available.
import tensorflow_model_analysis.addons.fairness.post_export_metrics.fairness_indicators
from google.protobuf import text_format

metrics_callbacks = [
    tfma.post_export_metrics.fairness_indicators(thresholds=[0.3, 0.5, 0.7]),
]

eval_config = tfma.EvalConfig(
  model_specs=[
    tfma.ModelSpec(label_key=LABEL)
  ],
  metrics_specs=[
    tfma.MetricsSpec(
      metrics=[
        tfma.MetricConfig(class_name='ExampleCount'),
        tfma.MetricConfig(class_name='BinaryAccuracy')
      ]
    )
  ],
  slicing_specs=[
    # An empty slice spec means the overall slice, i.e. the whole dataset.
    tfma.SlicingSpec(),
    tfma.SlicingSpec(feature_keys=['race']),
  ]
)

eval_shared_model = tfma.default_eval_shared_model(
  eval_saved_model_path='./checkpoints',
  add_metrics_callbacks=metrics_callbacks,
  eval_config=eval_config)

eval_result = tfma.run_model_analysis(
  eval_config=eval_config,
  eval_shared_model=eval_shared_model,
  data_location=validate_tf_file,
  output_path=tfma_eval_result_path)

from tensorflow_model_analysis.addons.fairness.view import widget_view
widget_view.render_fairness_indicator(eval_result=eval_result)

Describe the problem

I am trying to display fairness metrics with a TF2-based model, but for some reason the fairness metrics (false discovery rate, false positive rate, etc.) do not show up in eval_result or in the Fairness Indicators widget (see screenshot below). This happens both when I use TFX's Evaluator component and when I run TFMA directly. Is it a bug, or am I doing something wrong?

[Screenshot: Screen Shot 2021-08-04 at 3 35 17 PM]

@kumarpiyush
Contributor
kumarpiyush commented Aug 23, 2021

Hi zywind,
Looks like TF2-based models are incompatible with add_metrics_callbacks. Could you try using Fairness Indicators as a metric (https://www.tensorflow.org/responsible_ai/fairness_indicators/guide)? In your case:

metrics_specs=[
  tfma.MetricsSpec(
    metrics=[
      tfma.MetricConfig(class_name='ExampleCount'),
      tfma.MetricConfig(class_name='BinaryAccuracy'),
      tfma.MetricConfig(class_name='FairnessIndicators', config='{"thresholds":[0.3,0.5,0.7]}'),
    ]
  )
]

This syntax should work with both the TFX and non-TFX examples you've pasted.
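
For the non-TFX path, a rough sketch of how that wires together (adapted from your second snippet, so LABEL, validate_tf_file, and the paths are the placeholders from your code; I haven't run this exact block):

eval_config = tfma.EvalConfig(
  model_specs=[
    tfma.ModelSpec(label_key=LABEL)
  ],
  metrics_specs=[
    tfma.MetricsSpec(
      metrics=[
        tfma.MetricConfig(class_name='ExampleCount'),
        tfma.MetricConfig(class_name='BinaryAccuracy'),
        # FairnessIndicators as a regular metric; no add_metrics_callbacks needed.
        tfma.MetricConfig(class_name='FairnessIndicators',
                          config='{"thresholds": [0.3, 0.5, 0.7]}'),
      ]
    )
  ],
  slicing_specs=[
    tfma.SlicingSpec(),
    tfma.SlicingSpec(feature_keys=['race']),
  ]
)

# No add_metrics_callbacks here; FairnessIndicators comes from the EvalConfig above.
eval_shared_model = tfma.default_eval_shared_model(
  eval_saved_model_path='./checkpoints',
  eval_config=eval_config)

eval_result = tfma.run_model_analysis(
  eval_config=eval_config,
  eval_shared_model=eval_shared_model,
  data_location=validate_tf_file,
  output_path=tfma_eval_result_path)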

Also, could you point me to the guide you have been using for Fairness Indicators, so that I can capture this there?

@zywind
Author
zywind commented Aug 23, 2021

Thank you @kumarpiyush for the information. I tried your suggestion, but I got a different error:
IndexError: arrays used as indices must be of integer (or boolean) type [while running 'ExtractEvaluateAndWriteResults/ExtractAndEvaluate/EvaluateMetricsAndPlots/ComputeMetricsAndPlots()/ComputePerSlice/ComputeUnsampledMetrics/CombinePerSliceKey/WindowIntoDiscarding']

Could this be because I'm using an outdated TFMA (v0.26)? My company will update our TFMA soon, so I will test this again later.

As for the guide, I was just following the standard guide here: https://www.tensorflow.org/tfx/guide/fairness_indicators#compute_fairness_metrics. It suggested using the add_metrics_callbacks parameter. For TFX, there is no guide. I just found the fairness_indicator_thresholds parameter in the Evaluator's API. Adding a guide for TFX's Evaluator would be very useful.

@DirkjanVerdoorn

@zywind are you by any chance applying transformations to your label? I experienced a similar issue when transforming string labels into one-hot encoded arrays. In your case, you are using the output of the ExampleGen component rather than the Transform component. If your labels are strings, then your error is probably raised by the following line of code, from the one_hot function in tensorflow_model_analysis/metrics/metric_util.py:
tensor = np.delete(np.eye(target.shape[-1] + 1)[tensor], -1, axis=-1)

To solve this, you should use the output of the Transform component rather than the ExampleGen component. Again, I don't know your full situation, but this might be why your pipeline is failing.
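
For illustration, a standalone numpy sketch (not TFMA code itself) of how string labels trip that indexing:

import numpy as np

# Fancy-indexing an identity matrix, as one_hot does internally.
labels_int = np.array([0, 1, 1])
print(np.eye(3)[labels_int])   # fine: integer indices

labels_str = np.array(['0', '1', '1'])
np.eye(3)[labels_str]          # IndexError: arrays used as indices must be
                               # of integer (or boolean) type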

@zywind
Author
zywind commented Sep 24, 2021

@DirkjanVerdoorn Thanks for the suggestion. The labels are of integer type, so that's not the problem.

@zywind
Author
zywind commented Nov 1, 2021

Just to follow up on this: our environment is now updated to TFMA 0.31, and I can confirm that the fairness_indicator_thresholds parameter in Evaluator still doesn't work, but @kumarpiyush's method does.
@kumarpiyush You may want to update the guide here: https://www.tensorflow.org/tfx/guide/fairness_indicators#compute_fairness_metrics
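
For anyone landing here, this is roughly the Evaluator setup that ended up working for us (reconstructed from the config at the top of the issue, not pasted verbatim from our pipeline):

eval_config = tfma.EvalConfig(
  model_specs=[
    tfma.ModelSpec(label_key='toxicity', signature_name='serve_tfexample')
  ],
  metrics_specs=[
    tfma.MetricsSpec(
      metrics=[
        tfma.MetricConfig(class_name='ExampleCount'),
        tfma.MetricConfig(class_name='BinaryAccuracy'),
        tfma.MetricConfig(class_name='FairnessIndicators',
                          config='{"thresholds": [0.3, 0.5, 0.7]}'),
      ]
    )
  ],
  slicing_specs=[
    tfma.SlicingSpec(),
    tfma.SlicingSpec(feature_keys=['race']),
  ]
)

evaluator = Evaluator(
  examples=example_gen.outputs['examples'],
  schema=schema_gen.outputs['schema'],
  model=trainer.outputs['model'],
  # Note: no fairness_indicator_thresholds here.
  eval_config=eval_config,
)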
