Just want to know if this would be possible: you have your ClearML server inside your GCP environment, and you want to launch training jobs using Vertex AI. Would the training script be able to register to the server when there is no public IP? I guess it's more related to networking inside GCP, but just wanted to know if anyone has tried it.
I see, I can confirm that these packages (except for google_cloud_storage) are imported directly in the main script
So what changed?
We changed other bits of code, but not that one..
But maybe we are focusing on the wrong thing. The question now is why ClearML is only detecting these packages (I'm running a different experiment than Diego):
Pillow == 8.0.1
clearml == 0.17.5
google_cloud_storage == 1.40.0
joblib == 0.17.0
numpy == 1.19.5
pandas == 1.3.1
seaborn == 0.11.0
tensorflow_gpu == 2.3.1
tqdm == 4.54.1
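One way to sanity-check a detected list like the one above is to compare it against what is actually installed in the environment. This is only a rough sketch using the Python standard library (3.8+), not anything ClearML does internally; the helper names parse_pins, installed_version and compare are made up for illustration.

```python
from importlib import metadata

# The package list as reported above, copied verbatim.
DETECTED = """
Pillow == 8.0.1
clearml == 0.17.5
google_cloud_storage == 1.40.0
joblib == 0.17.0
numpy == 1.19.5
pandas == 1.3.1
seaborn == 0.11.0
tensorflow_gpu == 2.3.1
tqdm == 4.54.1
"""


def parse_pins(text):
    """Parse lines of the form 'name == version' into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        if "==" not in line:
            continue  # skip blank lines and anything that is not a pin
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if not found."""
    # The list above uses underscores; distributions are often registered
    # with hyphens, so both spellings are tried.
    for candidate in (name, name.replace("_", "-")):
        try:
            return metadata.version(candidate)
        except metadata.PackageNotFoundError:
            continue
    return None


def compare(pins, lookup=installed_version):
    """Return {name: (pinned, installed)} for every pin that does not match."""
    return {
        name: (pinned, lookup(name))
        for name, pinned in pins.items()
        if lookup(name) != pinned
    }


if __name__ == "__main__":
    for name, (pinned, found) in compare(parse_pins(DETECTED)).items():
        print(f"{name}: detected {pinned}, installed {found}")
```

Running it inside the same environment as the experiment prints only the packages whose detected version differs from the installed one (or that are not installed at all).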
I don't understand though... why doesn't this happen on my other experiments?
I need to wait 100 epochs 😅
but the reason I said the comparison could be an issue is because I'm not able to compare experiments
oh wait, I was using clearml == 0.17.5 and I also had this issue
the thing is that this runs before you create the virtual environment, so then in the new environment those settings are no longer there
and would it be possible to run it using the normal local agent?
great! thank you for such a quick response!
sorry, in my case it's the default mode
I see the correct confusion matrices in tensorboard
how quick is "very quickly"? we are talking about maybe 30 minutes to reach 100 epochs
I'm getting ValueError: Task object can only be updated if created or in_progress
Worked perfectly, thanks!
are you referring to extra_docker_shell_script, SuccessfulKoala55?
Awesome! I'll let you know if it works now
I'm afraid I'm still having the same issue..