This shows you the differences between two versions of the page.
googlecloud [2017/09/05 20:06] humphrey created |
googlecloud [2021/06/30 23:42] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | 1. Install the Google Cloud SDK on your local machine (I used the Windows Subsystem for Linux, so I chose the apt-get option). | ||
- | reference: https://cloud.google.com/sdk/downloads | ||
- | 2. Run "gcloud init" on the command line, then set the user account and the compute region. | ||
- | |||
- | 3. In the web console, click Storage and create a new storage bucket if one doesn't already exist. Let's call it byu_tf_ml for our example. | ||
- | |||
- | 4. In the local machine's console, run: gcloud ml-engine jobs submit training my_job --package-path ./trainer --module-name trainer.py_task --staging-bucket gs://byu_tf_ml --scale-tier BASIC | ||
- | reference: https://cloud.google.com/sdk/gcloud/reference/ml-engine/jobs/submit/training | ||
- | hints: this step is a bit tricky. The command "gcloud ml-engine jobs submit training" packages up our Python machine learning project, uploads it to the cloud platform, and runs it there. Four fields are required: | ||
- | a. job: the job ID (my_job in our example), which shows up in the web console after the job is submitted. | ||
- | b. package-path: the local directory that contains the Python source code. | ||
- | c. module-name: the main Python module to run, given in dotted form (trainer.py_task in our example). | ||
- | d. staging-bucket: the Cloud Storage bucket where the packaged training code is staged (uploaded) before the job runs. | ||
- | |||
- | optional: | ||
- | e. scale-tier: gives finer control over how much compute power the job uses. | ||
- | f. packages: paths to any extra packages the project imports that are not already included in the runtime version listed here: https://cloud.google.com/ml-engine/docs/concepts/runtime-version-list | ||
- | |||
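The fields from step 4 can be put together programmatically. The following is only a minimal sketch: the helper names build_submit_command and module_name_from_path are made up for illustration, and the values are the ones from the example above. It builds the argument list but does not call gcloud itself.

```python
import shlex
from pathlib import PurePosixPath


def module_name_from_path(script_path):
    """Turn a script path such as 'trainer/py_task.py' into 'trainer.py_task'."""
    path = PurePosixPath(script_path)
    return ".".join(path.with_suffix("").parts)


def build_submit_command(job, package_path, module_name, staging_bucket,
                         scale_tier=None):
    """Return the argument list for 'gcloud ml-engine jobs submit training'."""
    command = [
        "gcloud", "ml-engine", "jobs", "submit", "training", job,
        "--package-path", package_path,
        "--module-name", module_name,
        "--staging-bucket", staging_bucket,
    ]
    # scale-tier is optional, so only add it when a value was given.
    if scale_tier is not None:
        command += ["--scale-tier", scale_tier]
    return command


# Values taken from the example in step 4.
command = build_submit_command(
    job="my_job",
    package_path="./trainer",
    module_name=module_name_from_path("trainer/py_task.py"),
    staging_bucket="gs://byu_tf_ml",
    scale_tier="BASIC",
)
print(" ".join(shlex.quote(part) for part in command))
```

To actually submit the job, the list could be passed to subprocess.run(command, check=True), assuming gcloud is installed and "gcloud init" has already been run.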
- | 5. In the web console, go to ML Engine and click Jobs; you should see the submitted job there. | ||
- | |||
- | 6. After training finishes, you will be able to see the results and logs in the web console. | ||
- | |||
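The job status and logs from steps 5 and 6 can also be checked from the command line with "gcloud ml-engine jobs describe" and "gcloud ml-engine jobs stream-logs". A small sketch, assuming the job ID my_job from step 4; the helper names are hypothetical, and the commands are only printed, not run:

```python
def describe_command(job):
    """Argument list that shows the current state of a training job."""
    return ["gcloud", "ml-engine", "jobs", "describe", job]


def stream_logs_command(job):
    """Argument list that streams the logs of a training job to the console."""
    return ["gcloud", "ml-engine", "jobs", "stream-logs", job]


# Print the commands for the example job; paste them into a shell to run them.
for args in (describe_command("my_job"), stream_logs_command("my_job")):
    print(" ".join(args))
```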
- | Others: | ||
- | 7. If you want to reuse the trained weights of the model, export it as a SavedModel in the application. | ||
- | reference: https://cloud.google.com/ml-engine/docs/concepts/prediction-overview | ||
- | |||
- | 8. I haven't tried TensorBoard yet, but it doesn't look too hard to set up. | ||
- | reference: https://cloud.google.com/ml-engine/docs/how-tos/monitor-training#monitoring_with_tensorboard |