K-Folds and TF.Datasets

Hello everyone,

I’m looking for a way to make k-folds or stratified k-folds with TF.Datasets. Is it possible?

I don’t know all the tools in TF.Datasets, only some basic functions, so maybe there is a function that would help me do this.

Thanks.

Hi @Andre_Galhardo,

Here is a demo for this. Replace 'your_dataset_name' with the name of the dataset you’re using from TensorFlow Datasets. While TensorFlow Datasets (TFDS) primarily provides ready-to-use datasets for machine learning tasks, you can use it in conjunction with other TensorFlow components to implement k-fold cross-validation.

Additionally, you’ll need to customize the model definition, data preprocessing, and training/validation procedures according to your specific task and dataset.

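import tensorflow as tf
import tensorflow_datasets as tfds

# Load dataset as (features, label) pairs; file order is deterministic
# because shuffle_files defaults to False
dataset = tfds.load('your_dataset_name', split='train', as_supervised=True)

# Define number of folds
k = 5

# Examples per fold; len() works when the dataset cardinality is known,
# as it is for TFDS splits. Any remainder stays in the training data.
fold_size = len(dataset) // k

# Perform k-fold cross-validation
for fold in range(k):
    # Create training and validation datasets:
    # the validation fold is one contiguous block of fold_size examples,
    # the training data is everything before and after that block
    validation_data = dataset.skip(fold * fold_size).take(fold_size)
    training_data = dataset.take(fold * fold_size).concatenate(
        dataset.skip((fold + 1) * fold_size))

    # Create data pipelines
    # Define preprocessing steps, batching, shuffling, etc.
    # (buffer and batch sizes here are placeholder values)
    training_data = training_data.shuffle(1000).batch(32)
    validation_data = validation_data.batch(32)

    # Define and compile your model
    # (a fresh model per fold, compiled with metrics=['accuracy'])
    model = ...

    # Train the model
    model.fit(training_data)

    # Evaluate the model on validation set
    validation_loss, validation_accuracy = model.evaluate(validation_data)

    print(f'Fold {fold+1}: Validation Accuracy = {validation_accuracy}')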
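
For the stratified case, I’m not aware of a built-in tf.data function that does it, so here is a minimal sketch under some assumptions: the labels fit in memory as a plain Python list, and each example is assigned to a fold round-robin within its class so that every fold gets roughly the same class proportions. The names `stratified_fold_ids` and `split_fold` are hypothetical helpers written for this example, not part of TensorFlow or TFDS.

```python
from collections import defaultdict


def stratified_fold_ids(labels, k):
    """Give each example a fold id in [0, k) so each class is spread evenly."""
    seen_per_class = defaultdict(int)
    fold_ids = []
    for label in labels:
        # The i-th example of a class goes to fold i % k (round-robin per class).
        fold_ids.append(seen_per_class[label] % k)
        seen_per_class[label] += 1
    return fold_ids


def split_fold(features, labels, fold_ids, fold):
    """Build (training, validation) tf.data.Dataset objects for one fold."""
    # Imported here so the helper above has no TensorFlow dependency.
    import tensorflow as tf

    train_idx = [i for i, f in enumerate(fold_ids) if f != fold]
    val_idx = [i for i, f in enumerate(fold_ids) if f == fold]

    def subset(indices):
        # Assumes features and labels are indexable and small enough for memory.
        return tf.data.Dataset.from_tensor_slices(
            ([features[i] for i in indices], [labels[i] for i in indices]))

    return subset(train_idx), subset(val_idx)


# Toy labels: six examples of class 0 and four of class 1.
labels = [0, 0, 0, 0, 1, 1, 0, 1, 0, 1]
fold_ids = stratified_fold_ids(labels, k=2)
print(fold_ids)
```

With these toy labels each of the two folds ends up with three examples of class 0 and two of class 1. If scikit-learn is available, `sklearn.model_selection.StratifiedKFold` can produce the train and validation indices instead, and the same `from_tensor_slices` step would apply.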

Thank you!