Split a dataset created by the TensorFlow Dataset API into Train and Test?

天命终不由人 2020-12-08 02:12

Does anyone know how to split a dataset created with the Dataset API (tf.data.Dataset) in TensorFlow into train and test sets?

8 answers
  •  暗喜 (original poster)
    2020-12-08 02:42

    @ted's answer will cause the splits to overlap. Try this instead:

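    # full_ds_size is the total number of elements in full_ds (must be known).
    train_ds_size = int(0.64 * full_ds_size)
    valid_ds_size = int(0.16 * full_ds_size)

    # take(n) keeps the first n elements; skip(n) drops them and keeps the rest.
    train_ds = full_ds.take(train_ds_size)
    remaining = full_ds.skip(train_ds_size)
    valid_ds = remaining.take(valid_ds_size)
    # Whatever is left after train and valid (about 20%) becomes the test set.
    test_ds = remaining.skip(valid_ds_size)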
    

    Use the code below to test it.

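    import tensorflow as tf

    # TF 1.x only: enables eager execution so the datasets can be iterated directly.
    # TF 2.x runs eagerly by default, so omit this call there.
    tf.enable_eager_execution()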
    
    dataset = tf.data.Dataset.range(100)
    
    train_size = 20
    valid_size = 30
    test_size = 50
    
    train = dataset.take(train_size)
    remaining = dataset.skip(train_size)
    valid = remaining.take(valid_size)
    test = remaining.skip(valid_size)
    
    # Elements 0..19
    for i in train:
        print(i)

    # Elements 20..49
    for i in valid:
        print(i)

    # Elements 50..99
    for i in test:
        print(i)
    
