Invalid .lst file in SageMaker

Posted by 落花浮王杯 on 2019-12-24 10:48:16

Question


Folder structure for my S3 bucket is:

Bucket
    ->training-set
           ->medium
                 ->    img1.jpeg
                 ->    img2.jpeg
                 ->    img3.PNG

My training-set.lst file looks like this:

1  \t 1  \t medium/img1.jpeg
2  \t 1  \t medium/img2.jpeg
3  \t 1  \t medium/img3.PNG

I created this file using an Excel spreadsheet.

Error: Training failed with the following error: ClientError: Invalid lst file: training-set.lst

   "InputDataConfig": [
        {
          "ChannelName": "train",
          "CompressionType": "None",
          "ContentType": "application/x-image",
          "DataSource": {
            "S3DataSource": {
              "S3DataDistributionType": "FullyReplicated",
              "S3DataType": "S3Prefix",
              "S3Uri": 's3://{}/training-set/'.format(bucket)
            }
          },
          "RecordWrapperType": "None"
        },
        {
          "ChannelName": "validation",
          "CompressionType": "None",
          "ContentType": "application/x-image",
          "DataSource": {
            "S3DataSource": {
              "S3DataDistributionType": "FullyReplicated",
              "S3DataType": "S3Prefix",
              "S3Uri": 's3://{}/test-set/'.format(bucket)
            }
          },
          "RecordWrapperType": "None"
        },
        {
          "ChannelName": "train_lst",
          "CompressionType": "None",
          "ContentType": "application/x-image",
          "DataSource": {
            "S3DataSource": {
              "S3DataDistributionType": "FullyReplicated",
              "S3DataType": "S3Prefix",
              "S3Uri": "s3://bucket/training-set/training-set.lst"
            }
          },
          "RecordWrapperType": "None"
        },
        {
          "ChannelName": "validation_lst",
          "CompressionType": "None",
          "ContentType": "application/x-image",
          "DataSource": {
            "S3DataSource": {
              "S3DataDistributionType": "FullyReplicated",
              "S3DataType": "S3Prefix",
              "S3Uri": "s3://bucket/test-set/test-set.lst"
            }
          },
          "RecordWrapperType": "None"
        }
    ]

I am trying to use this configuration in Amazon SageMaker, but the training job fails with the error above. Can someone please help?


Answer 1:


Could you please post the .lst files you are using? According to the documentation, you need a tab-delimited file placed at the top of the folder hierarchy in your S3 bucket. Here is an example of a train_set.lst file from a flower classification example I built:

1   0   daisy/754296579_30a9ae018c_n.jpg
2   1   dandelion/18089878729_907ed2c7cd_m.jpg
3   1   dandelion/284497199_93a01f48f6.jpg
4   1   dandelion/3554992110_81d8c9b0bd_m.jpg
5   0   daisy/4065883015_4bb6010cb7_n.jpg

Please note that the sequence index (the first column) is required, and that the classes for your classification problem must be encoded as numbers, starting at zero.
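
Since spreadsheet exports can introduce stray characters, one option is to generate the file from a script instead. The following is only a minimal sketch of the format described above: the helper name `build_lst`, the sample paths, and the label 0 for the "medium" class are assumptions for illustration, not something taken from the question or the SageMaker documentation.

```python
def build_lst(rows):
    """Build .lst text from (label, relative_path) pairs.

    Each output line is: sequence index <TAB> class label <TAB> image path.
    The index starts at 1, as in the example above.
    """
    lines = []
    for index, (label, path) in enumerate(rows, start=1):
        # Join with a real tab character, not the two characters "\" and "t".
        lines.append(f"{index}\t{label}\t{path}")
    return "\n".join(lines) + "\n"


# Hypothetical rows mirroring the folder layout in the question;
# the single class "medium" is coded as 0 here (an assumption).
rows = [
    (0, "medium/img1.jpeg"),
    (0, "medium/img2.jpeg"),
    (0, "medium/img3.PNG"),
]
lst_text = build_lst(rows)
print(lst_text, end="")
```

The resulting string can then be written to training-set.lst with `open(..., "w", encoding="utf-8")` and uploaded to the bucket.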

Hope this helps!




Answer 2:


Your question doesn't say so explicitly, but based on your description of the problem, am I right in assuming that you are trying to use the SageMaker Image Classification algorithm (https://docs.aws.amazon.com/sagemaker/latest/dg/image-classification.html)?

Can you please double-check by downloading "s3://bucket/training-set/training-set.lst" (don't use the local copy you have) and checking the contents of this file? Don't use Excel to open it; open it with a text editor and check that the format conforms to the specification documented above. In particular, I'd make sure the file is not saved in a non-standard encoding (it should be UTF-8) and that there are no extra tabs or spaces.
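
If inspecting the file by eye is error-prone, the same checks can be roughly automated. This is a sketch under stated assumptions: `check_lst` is a hypothetical helper, and its rules (valid UTF-8, exactly three tab-separated fields, no stray whitespace, numeric first two columns) only mirror the advice in this thread; they are not SageMaker's actual validation logic.

```python
def check_lst(data: bytes):
    """Return a list of problems found in the raw bytes of a .lst file."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8: {exc}"]

    problems = []
    if text.startswith("\ufeff"):
        problems.append("file starts with a byte order mark (BOM)")
        text = text[1:]

    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore empty lines in this sketch
        fields = line.split("\t")
        if len(fields) != 3:
            problems.append(
                f"line {number}: expected 3 tab-separated fields, found {len(fields)}"
            )
            continue
        if any(field != field.strip() for field in fields):
            problems.append(f"line {number}: stray whitespace around a field")
        for name, value in (("index", fields[0]), ("label", fields[1])):
            try:
                float(value)
            except ValueError:
                problems.append(f"line {number}: {name} {value!r} is not numeric")
    return problems


# A line as it appears in the question: literal "\t" text plus extra spaces.
for problem in check_lst(b"1  \\t 1  \\t medium/img1.jpeg\n"):
    print(problem)
```

To run it against the downloaded copy, read the file in binary mode, for example `check_lst(open("training-set.lst", "rb").read())`.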

Also, have a look at your training job's logs; there may be additional clues there as to what went wrong.



Source: https://stackoverflow.com/questions/51670563/invalid-lst-file-in-sagemaker
