amazon-sagemaker

Logistic regression in SageMaker

女生的网名这么多〃 submitted on 2019-12-21 21:29:00
Question: I am using AWS SageMaker for logistic regression. To validate the model on test data, the following code is used:

    runtime = boto3.client('runtime.sagemaker')
    payload = np2csv(test_X)
    response = runtime.invoke_endpoint(EndpointName=linear_endpoint,
                                       ContentType='text/csv',
                                       Body=payload)
    result = json.loads(response['Body'].read().decode())
    test_pred = np.array([r['score'] for r in result['predictions']])

The result contains the prediction values and the probability scores. I want to know
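A minimal sketch of going one step further and turning those scores into hard class labels, assuming the endpoint is a Linear Learner binary classifier whose JSON response carries a 'score' per record; the 0.5 threshold is illustrative:

    import json

    import boto3
    import numpy as np

    runtime = boto3.client('runtime.sagemaker')

    def predict_csv(endpoint_name, csv_payload, threshold=0.5):
        # Invoke the endpoint with CSV input and parse the JSON response.
        response = runtime.invoke_endpoint(EndpointName=endpoint_name,
                                           ContentType='text/csv',
                                           Body=csv_payload)
        result = json.loads(response['Body'].read().decode())
        scores = np.array([r['score'] for r in result['predictions']])
        # Derive hard 0/1 labels from the probability scores.
        labels = (scores > threshold).astype(int)
        return scores, labels

For example, predict_csv(linear_endpoint, np2csv(test_X)) would return both arrays in one call.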

How do I load Python modules which are not available in SageMaker?

生来就可爱ヽ(ⅴ<●) submitted on 2019-12-17 21:35:12
Question: I want to install spacy, which is not available as part of the SageMaker platform. How can I pip install it?

Answer 1: When creating your model, you can specify the requirements.txt as an environment variable. For example:

    env = {
        'SAGEMAKER_REQUIREMENTS': 'requirements.txt',  # path relative to `source_dir` below
    }
    sagemaker_model = TensorFlowModel(model_data='s3://mybucket/modelTarFile',
                                      role=role,
                                      entry_point='entry.py',
                                      code_location='s3://mybucket/runtime-code/',
                                      source_dir='src',
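As an alternative sketch, if the package is only needed inside a notebook or an entry-point script (rather than baked into the serving image), it can be installed at runtime with pip; spacy is the package from the question, everything else is standard library:

    import subprocess
    import sys

    # Install the package into the current Python environment at runtime.
    subprocess.check_call([sys.executable, '-m', 'pip', 'install', 'spacy'])

    import spacy  # importable from here on in this session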

Call a SageMaker endpoint using a Lambda function

為{幸葍}努か submitted on 2019-12-13 03:34:02
Question: I have some data in S3 and I want to create a Lambda function that predicts the output with my deployed AWS SageMaker endpoint and then puts the outputs in S3 again. Is it necessary in this case to create an API Gateway as described in this link? And what do I have to put in the Lambda function? I expect to specify where to find the data, how to invoke the endpoint, and where to put the results. Thanks

Answer 1: You definitely don't have to create an API in API Gateway. You can invoke the endpoint directly using
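A minimal Lambda handler sketch along those lines, assuming the endpoint accepts and returns CSV; the bucket names, keys, and endpoint name are placeholders:

    import boto3

    s3 = boto3.client('s3')
    runtime = boto3.client('runtime.sagemaker')

    def lambda_handler(event, context):
        # 1. Where to find the data: read the input object from S3.
        obj = s3.get_object(Bucket='my-input-bucket', Key='input/data.csv')
        payload = obj['Body'].read()

        # 2. How to invoke the endpoint: call it directly, no API Gateway needed.
        response = runtime.invoke_endpoint(EndpointName='my-endpoint',
                                           ContentType='text/csv',
                                           Body=payload)
        predictions = response['Body'].read()

        # 3. Where to put the data: write the predictions back to S3.
        s3.put_object(Bucket='my-output-bucket',
                      Key='output/predictions.csv',
                      Body=predictions)
        return {'statusCode': 200}

The Lambda's execution role would also need s3:GetObject, s3:PutObject, and sagemaker:InvokeEndpoint permissions.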

How to create a pipeline in SageMaker with PyTorch

浪尽此生 submitted on 2019-12-13 03:24:32
Question: I am dealing with a text-data classification problem in SageMaker. I first fit and transform the text into a structured format (say, by using TF-IDF in sklearn), then keep the result in an S3 bucket and use it for training my PyTorch model, for which I have written the code in my entry point. Notice that by the end of the above process I have two models: the sklearn TF-IDF model and the actual PyTorch model. So every time I need to predict on new text data, I need to separately process (transform
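One way to avoid the separate pre-processing step is to chain both stages behind a single endpoint with an inference pipeline; a sketch, assuming both models are already trained and packaged in S3 (all paths, script names, and the role are placeholders, and constructor arguments vary somewhat across SageMaker SDK versions):

    from sagemaker.pipeline import PipelineModel
    from sagemaker.sklearn.model import SKLearnModel
    from sagemaker.pytorch.model import PyTorchModel

    # Container 1: the fitted sklearn TF-IDF transformer.
    tfidf_model = SKLearnModel(model_data='s3://mybucket/tfidf/model.tar.gz',
                               role=role,
                               entry_point='tfidf_inference.py',
                               framework_version='0.23-1')

    # Container 2: the trained PyTorch classifier.
    pytorch_model = PyTorchModel(model_data='s3://mybucket/pytorch/model.tar.gz',
                                 role=role,
                                 entry_point='pytorch_inference.py',
                                 framework_version='1.5.0',
                                 py_version='py3')

    # Requests hit the TF-IDF container first; its output is fed to the
    # PyTorch container, so callers send raw text and get predictions back.
    pipeline = PipelineModel(name='text-clf-pipeline',
                             role=role,
                             models=[tfidf_model, pytorch_model])
    predictor = pipeline.deploy(initial_instance_count=1,
                                instance_type='ml.m5.large')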

AWS SageMaker RandomCutForest (RCF) vs scikit-learn RandomForest (RF)?

丶灬走出姿态 submitted on 2019-12-13 03:10:10
Question: Is there a difference between the two, or are they different names for the same algorithm?

Answer 1: RandomCutForest (RCF) is an unsupervised method primarily used for anomaly detection, while RandomForest (RF) is a supervised method that can be used for regression or classification. For RCF, see the documentation (here) and a notebook example (here).

Source: https://stackoverflow.com/questions/56728230/aws-sagemaker-randomcutforest-rcf-vs-scikit-lean-randomforest-rf
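A sketch contrasting the two call patterns, which makes the supervised/unsupervised split concrete; the data is random filler, the role is a placeholder, and the RCF argument names follow the v1-era SageMaker Python SDK:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sagemaker import RandomCutForest

    X = np.random.rand(100, 4).astype('float32')  # features
    y = np.random.randint(0, 2, size=100)         # labels (RF only)

    # Supervised: scikit-learn's RandomForest needs labels to fit.
    rf = RandomForestClassifier(n_estimators=100).fit(X, y)

    # Unsupervised: SageMaker's RandomCutForest fits on features alone
    # and scores each record for how anomalous it looks.
    rcf = RandomCutForest(role=role,
                          train_instance_count=1,
                          train_instance_type='ml.m4.xlarge',
                          num_samples_per_tree=256,
                          num_trees=50)
    rcf.fit(rcf.record_set(X))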

How can I preprocess input data before making predictions in SageMaker?

穿精又带淫゛_ submitted on 2019-12-12 11:50:24
Question: I am calling a SageMaker endpoint using the Java SageMaker SDK. The data that I am sending needs a little cleaning before the model can use it for prediction. How can I do that in SageMaker? I have a pre-processing function in the Jupyter notebook instance which cleans the training data before passing that data to train the model. Now I want to know if I can use that function while calling the endpoint, or is that function already being used? I can show my code if anyone wants. EDIT 1: Basically
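One common pattern is to move the cleaning into the endpoint itself via the inference script's input_fn hook, which SageMaker's framework containers call on every request before predict_fn; a sketch, assuming CSV input, with a hypothetical clean() standing in for the notebook's pre-processing function:

    # inference.py -- entry point used by the SageMaker framework container
    import io

    import numpy as np

    def clean(data):
        # Hypothetical stand-in for the notebook's pre-processing function.
        return data

    def input_fn(request_body, request_content_type):
        # Runs on every invocation, before the model sees the data,
        # so Java (or any other) callers can send the raw payload.
        if request_content_type == 'text/csv':
            data = np.genfromtxt(io.StringIO(request_body), delimiter=',')
            return clean(data)
        raise ValueError('Unsupported content type: %s' % request_content_type)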

XGBoost prediction always returning the same value - why?

心已入冬 submitted on 2019-12-12 07:05:28
Question: I'm using SageMaker's built-in XGBoost algorithm with the following training and validation sets: https://files.fm/u/pm7n8zcm. The prediction model that comes out of training with the above datasets always produces the exact same result. Is there something obvious in the training or validation datasets that could explain this behavior? Here is an example code snippet where I'm setting the hyperparameters:

    {
        {"max_depth", "1000"},
        {"eta", "0.001"},
        {"min_child_weight", "10"},
        {"
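One plausible explanation (the post is cut off before any answer): with eta at 0.001 and few boosting rounds, each tree shifts the prediction so little that every input can end up at essentially the base score, and a max_depth of 1000 is far outside the usual range. A sketch of more conventional starting values, assuming xgb is a sagemaker.estimator.Estimator already configured with the XGBoost image; the numbers are illustrative, not tuned:

    xgb.set_hyperparameters(max_depth=6,          # typical range is roughly 3-10
                            eta=0.1,              # large enough to move predictions per round
                            min_child_weight=10,
                            num_round=100,        # give boosting enough rounds to learn
                            objective='binary:logistic')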

Unable to get AWS SageMaker to read RecordIO files

不想你离开。 submitted on 2019-12-11 17:33:51
Question: I'm trying to convert an object-detection .lst file to a .rec file and train with it in SageMaker. My list looks something like this:

    10 2 5 9.0000 1008.0000 1774.0000 1324.0000 1953.0000 3.0000 2697.0000 3340.0000 948.0000 1559.0000 0.0000 0.0000 0.0000 0.0000 0.0000 IMG_1091.JPG
    58 2 5 11.0000 1735.0000 2065.0000 1047.0000 1300.0000 6.0000 2444.0000 2806.0000 1194.0000 1482.0000 1.0000 2975.0000 3417.0000 1739.0000 2139.0000 IMG_7000.JPG
    60 2 5 12.0000 1243.0000 1861.0000 1222.0000 1710.0000
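For context, in MXNet's pack-label .lst format each tab-separated record is: image index, header width (2), label width (5 here), then label-width values per object (class id plus box coordinates), then the image path. The detection tooling generally expects box coordinates normalized to [0, 1], whereas the values above look like raw pixels, which is a common reason training fails to read such files. A sketch of emitting one correctly normalized line (image sizes and boxes are illustrative):

    # Write one pack-label .lst record with boxes normalized by image size.
    def lst_line(index, boxes, img_w, img_h, path, label_width=5):
        # boxes: list of (class_id, xmin, ymin, xmax, ymax) in pixels.
        fields = [str(index), '2', str(label_width)]
        for cls, xmin, ymin, xmax, ymax in boxes:
            fields += ['%.4f' % cls,
                       '%.4f' % (xmin / img_w), '%.4f' % (ymin / img_h),
                       '%.4f' % (xmax / img_w), '%.4f' % (ymax / img_h)]
        fields.append(path)
        return '\t'.join(fields)

    print(lst_line(10, [(9, 1008, 1774, 1324, 1953)], 4032, 3024, 'IMG_1091.JPG'))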

Re-hosting a trained model on AWS SageMaker

↘锁芯ラ submitted on 2019-12-11 06:37:27
Question: I have started exploring AWS SageMaker, starting with these examples provided by AWS. I then made some modifications to this particular setup so that it uses the data from my use case for training. Now, as I continue to work on this model and its tuning, after I delete the inference endpoint I would like to be able to recreate the same endpoint -- even after stopping and restarting the notebook instance (so the notebook / kernel session is no longer valid) -- using the already trained model
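A sketch of re-hosting purely from the persisted artifacts, so nothing depends on the old kernel session; the image URI, S3 path, role, and endpoint name are placeholders, and older SDK versions spell image_uri as image:

    import sagemaker
    from sagemaker.model import Model

    session = sagemaker.Session()

    # Rebuild a deployable model from the artifacts the training job left in S3.
    model = Model(image_uri='<inference image URI for this algorithm>',
                  model_data='s3://mybucket/output/model.tar.gz',
                  role=role,
                  sagemaker_session=session)

    predictor = model.deploy(initial_instance_count=1,
                             instance_type='ml.m5.large',
                             endpoint_name='my-recreated-endpoint')

Recording the training job's model artifact path and container image at training time makes this reproducible from any fresh session.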

Upload data to S3 with SageMaker

人走茶凉 submitted on 2019-12-10 11:16:53
Question: I have a problem with SageMaker when I try to upload data into an S3 bucket. I get this error:

    NameError                                 Traceback (most recent call last)
    <ipython-input-26-d21b1cb0fcab> in <module>()
         19 download('http://data.mxnet.io/data/caltech-256/caltech-256-60-train.rec')
         20
    ---> 21 upload_to_s3('train', 'caltech-256-60-train.rec')

    <ipython-input-26-d21b1cb0fcab> in upload_to_s3(channel, file)
         13     data = open(file, "rb")
         14     key = channel + '/' + file
    ---> 15     s3.Bucket(bucket).put_object(Key=key, Body
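The traceback is cut off before naming the missing variable, but a NameError on that line usually means bucket (or s3) was never defined in the session. A sketch of the setup the function needs, with a placeholder bucket name:

    import boto3

    # These names must exist before upload_to_s3 runs; if either is
    # missing, the put_object line raises exactly this NameError.
    s3 = boto3.resource('s3')
    bucket = 'my-sagemaker-bucket'  # placeholder; use your own bucket

    def upload_to_s3(channel, file):
        data = open(file, 'rb')
        key = channel + '/' + file
        s3.Bucket(bucket).put_object(Key=key, Body=data)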