
googlecloud

This is an old revision of the document!


1. Install the gcloud SDK on your local machine (I personally used the Windows Subsystem for Linux, so I chose the apt-get option). reference: https://cloud.google.com/sdk/downloads

2. Run "gcloud init" on the command line, then set the user account and the region of the compute resources.

3. In the web console, click on Storage and create a new storage bucket if you don't already have one. Let's call it byu_tf_ml for our example.

4. On the local machine console, call:

gcloud ml-engine jobs submit training my_job --package-path ./trainer --module-name trainer.py_task --staging-bucket gs://byu_tf_ml --scale-tier BASIC

reference: https://cloud.google.com/sdk/gcloud/reference/ml-engine/jobs/submit/training

Hints: this step is a bit tricky. The command "gcloud ml-engine jobs submit training" is Google Cloud's way of packaging up our Python machine learning project, uploading it to the cloud platform, and running it there. There are four required fields:

a. job: in our example the value is my_job. It is the job ID that shows up in the web console after the job is submitted.

b. package-path: the local directory that contains the Python source code.

c. module-name: the main Python module to run, given as a dotted module name (trainer.py_task in our example).

d. staging-bucket: the Cloud Storage bucket where the packaged project is staged, written as a gs:// URI.

Optional:

e. scale-tier: gives finer control over how much computation power the job uses.

f. packages: the paths to any packages you import into the project that are not listed here: https://cloud.google.com/ml-engine/docs/concepts/runtime-version-list

5. In the web console, go to ML Engine and click on Jobs. You should be able to see the job you submitted.

6. After the training has finished, you will be able to see the results and logs in the web console.

Others:

7. If you want to reuse the trained weights of the model, export it as a SavedModel in the application. reference: https://cloud.google.com/ml-engine/docs/concepts/prediction-overview

8. I haven't tried out TensorBoard yet, but it doesn't look too hard to set up. reference: https://cloud.google.com/ml-engine/docs/how-tos/monitor-training#monitoring_with_tensorboard
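The submit command in step 4 is easy to mistype (double dashes, the gs:// prefix on the bucket), so here is a minimal Python sketch that assembles it from the four required fields plus the optional scale tier. The helper name build_submit_command is hypothetical and not part of gcloud. The values are the ones from our example, and the script only prints the command rather than running it.

```python
import shlex


def build_submit_command(job, package_path, module_name, staging_bucket,
                         scale_tier=None):
    """Return the argument list for `gcloud ml-engine jobs submit training`.

    Hypothetical helper for illustration only; it is not part of gcloud.
    """
    # Cloud Storage URIs use the gs:// scheme; add it if only a bucket
    # name was given.
    if not staging_bucket.startswith("gs://"):
        staging_bucket = "gs://" + staging_bucket

    cmd = [
        "gcloud", "ml-engine", "jobs", "submit", "training", job,
        "--package-path", package_path,
        "--module-name", module_name,
        "--staging-bucket", staging_bucket,
    ]
    # scale-tier is optional, so only append it when a value is supplied.
    if scale_tier is not None:
        cmd += ["--scale-tier", scale_tier]
    return cmd


# Values taken from the example in step 4.
cmd = build_submit_command(
    job="my_job",
    package_path="./trainer",
    module_name="trainer.py_task",
    staging_bucket="byu_tf_ml",
    scale_tier="BASIC",
)

# Dry run: print the command so it can be checked before submitting.
print(" ".join(shlex.quote(part) for part in cmd))
```

To actually submit the job, the same list could be passed to subprocess.run(cmd, check=True) on a machine where gcloud is installed and initialised (steps 1 and 2).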

googlecloud.1504641962.txt.gz · Last modified: 2021/06/30 23:40 (external edit)