How do I load Python modules which are not available in SageMaker?

日久生厌 2020-12-07 05:43

I want to install spaCy, which is not available as part of the SageMaker platform. How can I pip install it?

2 answers
  • 2020-12-07 05:55

    When creating your model, you can specify the requirements.txt file through an environment variable.

    For example:

    from sagemaker.tensorflow import TensorFlowModel  # import path may differ between SDK versions

    # `role` and `sagemaker_session` are assumed to be defined earlier in the notebook.
    env = {
        'SAGEMAKER_REQUIREMENTS': 'requirements.txt',  # path relative to `source_dir` below
    }
    sagemaker_model = TensorFlowModel(model_data='s3://mybucket/modelTarFile',
                                      role=role,
                                      entry_point='entry.py',
                                      code_location='s3://mybucket/runtime-code/',
                                      source_dir='src',
                                      env=env,
                                      name='model_name',
                                      sagemaker_session=sagemaker_session,
                                     )
    

    This ensures that the packages listed in the requirements file are installed after the Docker container is created and before any of your code runs in it.
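
    To confirm from inside the entry point that the requirements were actually installed, a small check along these lines may help. This is only a sketch and not part of the SageMaker SDK: the helper names (`missing_packages`, `pip_install_command`, `ensure_installed`) are hypothetical, and it assumes the pip package name matches the importable module name, which holds for spacy but not for every package.

```python
import importlib.util
import subprocess
import sys


def missing_packages(module_names):
    """Return the top-level module names that cannot be imported."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


def pip_install_command(packages):
    """Build a pip command bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", *packages]


def ensure_installed(module_names, run=subprocess.check_call):
    """Install whatever is missing and return the list of missing names.

    `run` is injectable so the function can be dry-run without calling pip.
    """
    missing = missing_packages(module_names)
    if missing:
        run(pip_install_command(missing))
    return missing
```

    Calling `ensure_installed(["spacy"])` at the top of `entry.py` would then fall back to a runtime pip install only if the requirements file was not picked up. That fallback needs network access from the container.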

  • 2020-12-07 06:17

    Great answer from Raman. I wanted to add another way of specifying the required Python modules for the training instance, in case someone is looking.

    from sagemaker.tensorflow import TensorFlow

    tf_estimator = TensorFlow(entry_point='tf-train.py', role='SageMakerRole',
                              training_steps=10000, evaluation_steps=100,
                              train_instance_count=1,
                              source_dir='./',                       # directory uploaded with the entry point
                              requirements_file='requirements.txt',  # resolved inside source_dir
                              train_instance_type='ml.p2.xlarge')
    

    Both source_dir and requirements_file have to be defined for this to work. The path is relative to the notebook instance. If requirements.txt is in the same directory as the notebook, just use './' for source_dir.

    The docs are here.
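
    As a minimal sketch of the file layout this expects, the snippet below writes a requirements.txt into the directory that will be passed as source_dir. The helper name `write_requirements` is hypothetical and only uses the standard library. Listing spacy without a version pin is an assumption, so pin a version if your code needs one.

```python
from pathlib import Path


def write_requirements(source_dir, packages, filename="requirements.txt"):
    """Write one package per line to source_dir/filename and return the path."""
    target = Path(source_dir) / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text("\n".join(packages) + "\n")
    return target
```

    Running `write_requirements("./", ["spacy"])` in the notebook before creating the estimator places the file next to the notebook, matching source_dir='./' and requirements_file='requirements.txt' above.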
