Cloud ML Feature methods

Question

The preprocessing page in the Cloud ML how-to guide (https://cloud.google.com/ml/docs/how-tos/preprocessing-data) says that you should see the SDK reference documentation for details about each type of feature and the

Can anyone point me to this documentation, or to a list of feature types and their methods? I'm trying to set up a discrete target, but I keep getting "data type int64 expected type: float" errors whenever I set my target to .discrete() rather than .continuous().

Answer 1:

You need to download the SDK reference documentation:

In the command line, navigate to the directory where you want to install the docs. If you used ~/google-cloud-ml to download the samples as recommended in the setup guide, that is a good place.
Copy the documentation archive to your chosen directory using gsutil:
```
gsutil cp gs://cloud-ml/sdk/cloudml-docs.latest.tar.gz .
```
Unpack the archive:
```
tar -xf cloudml-docs.latest.tar.gz
```

This creates a docs directory inside the directory that you chose. The documentation is essentially a local website: open docs/index.html in your browser to open it at its root. You can find the transform references in there.

(This information is now in the setup guide as well. It's the final step under LOCAL: MAC/LINUX.)

Answer 2:

On the type-related errors, assume for a moment that your feature set is specified roughly as follows:

```
feature_set = {
    'target': features.target('category').discrete()
}
```

When a discrete target is specified as above, the data type of the target feature is int64 for one of the following reasons:

No vocab was generated for the target data column ('category' here) during the analysis of your data, so the generated metadata.yaml has an empty list for that column's vocab.
A vocab for 'category' was generated, and the data type of the very first item (or key) of this vocab was an int.

Under these circumstances, if a float is encountered, the transformation to the target feature's data-type will fail.
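
The two conditions above can be restated as a small check. This is only an illustration of the rule as this answer describes it, not code from the SDK: the function name described_as_int64 is hypothetical, and it assumes the vocab has already been loaded into a Python list or dict (how it is stored in metadata.yaml is not shown here).

```python
def described_as_int64(vocab):
    """Return True when the two conditions above say the target is int64.

    Illustration only: `vocab` stands in for the vocab entry of the
    target data column, loaded as a list or dict (an assumption).
    """
    if not vocab:  # condition 1: no vocab was generated
        return True
    # First item of a list, or first key of a dict.
    first_item = next(iter(vocab))
    # Condition 2: the first item is an int (bool is excluded on purpose).
    return isinstance(first_item, int) and not isinstance(first_item, bool)


print(described_as_int64([]))            # True
print(described_as_int64({1: 0, 2: 1}))  # True
print(described_as_int64(["a", "b"]))    # False
```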

Casting the entire data column ('category' in this case) to float should help avoid this.
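
A minimal sketch of such a cast, assuming the raw data is a CSV file with a header row and that the cast is done on the data before it reaches the Cloud ML preprocessing step. The helper name cast_column_to_float and the sample data are hypothetical and not part of the SDK; only the column name 'category' comes from the example above.

```python
import csv
import io


def cast_column_to_float(rows, column):
    """Return copies of the row dicts with `column` parsed as float."""
    return [{**row, column: float(row[column])} for row in rows]


# Hypothetical sample data standing in for the real input file.
sample_csv = "category,value\n1,a\n2,b\n0,c\n"
rows = list(csv.DictReader(io.StringIO(sample_csv)))

converted = cast_column_to_float(rows, "category")
print(converted[0]["category"])  # 1.0
```

The original rows are left untouched, so the converted copy can be compared against them before writing it back out.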

Source: https://stackoverflow.com/questions/40512481/cloud-ml-feature-methods

Tags

google-cloud-ml