
Can't remember, I just restarted everything so I don't have this info now
you can use pgrep -af "trains-agent"
AgitatedDove14 worked like a charm, thanks a lot!
Cool - what kind of objects are returned by .artifacts.__getitem__? I want to check their docs
I only found Project ID, which I'm not sure what this refers to - I have the project name
I was referring to the object returned by Task.artifacts['...'] - when I call .get I understand what I get, I'm asking because I want to see how the object I'm calling .get on behaves
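One way to see what kind of object the lookup returns, without guessing at the docs, is plain Python introspection. The sketch below is a minimal helper and not part of the trains/ClearML API: `describe_artifact` is a hypothetical name, and it assumes only that `task.artifacts` supports `[...]` lookup as in the message above.

```python
def describe_artifact(task, name):
    """Report the class, defining module and public attributes of task.artifacts[name].

    `describe_artifact` is a hypothetical helper; it relies only on the
    `task.artifacts[name]` lookup and on standard Python introspection.
    """
    artifact = task.artifacts[name]
    cls = type(artifact)
    # Public attributes are the ones a caller is expected to use (e.g. `get`).
    public = sorted(attr for attr in dir(artifact) if not attr.startswith("_"))
    return {
        "class": cls.__name__,
        "module": cls.__module__,
        "public_attributes": public,
    }
```

With a real task, the `module` entry points at the source file worth reading, and `help(task.artifacts['...'])` prints whatever docstrings that class ships with.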
I tried what you said in the previous response, setting sdk.aws.s3.key and sdk.aws.s3.secret to the ones in my MinIO. Yet when I try to download an object, I get the following:
>>> result = manager.get_local_copy(remote_url="s3://*******:9000/test-bucket/test.txt")
2020-10-15 13:24:45,023 - trains.storage - ERROR - Could not download s3://*****:9000/test-bucket/test.txt , err: SSL validation failed for https://*****:9000/test-bucket/test.txt [SSL: WRONG_VERSION_NU...
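An SSL `WRONG_VERSION_NUMBER` error commonly shows up when a client opens an HTTPS connection to an endpoint that is serving plain HTTP, which is a frequent setup for a local MinIO. Assuming that is the case here, the default trains.conf carries a commented example of a per-host entry under `sdk.aws.s3.credentials` with a `secure` flag. The sketch below only renders such an entry as text; `render_minio_credentials` and every value passed to it are hypothetical placeholders, and the key names should be checked against the comments in your own trains.conf.

```python
def render_minio_credentials(host, key, secret, secure=False):
    """Render a per-host entry for sdk.aws.s3.credentials as HOCON-style text.

    Assumption: the key names (host, key, secret, multipart, secure) follow the
    commented example in the default trains.conf; verify them for your version.
    """
    flag = "true" if secure else "false"
    return (
        "sdk.aws.s3.credentials: [\n"
        "    {\n"
        f'        host: "{host}"\n'
        f'        key: "{key}"\n'
        f'        secret: "{secret}"\n'
        "        multipart: false\n"
        f"        secure: {flag}\n"
        "    }\n"
        "]\n"
    )


if __name__ == "__main__":
    # Placeholder values only; substitute the MinIO host and keys you actually use.
    print(render_minio_credentials("my-minio-host:9000", "my-access-key", "my-secret-key"))
```

If the MinIO endpoint really does serve HTTPS, the same entry with `secure=True` applies and the cause of the error lies elsewhere.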
Any news on this? This is kind of creepy, it's something so basic that I can't trust my prediction pipeline because sometimes it fails randomly with no reason
and the machine I have is 10.2.
I also tried nvidia/cuda:10.2-base-ubuntu18.04 which is the latest
# Python 3.8.10 (default, Jun 2 2021, 10:49:15) [GCC 9.4.0]
clearml == 1.0.5
hyperopt == 0.2.5
matplotlib == 3.4.3
numpy == 1.21.2
pandas == 1.3.2
plotly == 5.3.0
python_dateutil == 2.8.2
scikit_learn == 0.24.2
statsmodels == 0.12.2
tqdm == 4.62.2
Detailed import analysis
**************************
IMPORT PACKAGE clearml
tasks/data_projection.py: 9
tasks/hp_optimization.py: 6
tasks/hpo_n_best_evaluation.py: 6
tasks/pipelines/monthly_predictions.py: 4
IMPORT PACKAGE hypero...
Confirmed working 😄
glad I managed to help back in some way
Okay so regarding the version - we are using 1.1.1
The thing with this error is that it only happens sometimes, and once it happens it never goes away...
I don't know what causes it, but we have one host where it works okay, then someone else checks out the repo and tries it and it fails with this error, while another guy can do the same and it will work for him
ClearML results page:
Launching step: 2019-09-03_2021-01-25_choose_best
Parameters:
{***}
Configurations:
None
Overrides:
None
Launching step: 2019-10-23_2021-01-15_choose_best
Parameters:
{********}
Configurations:
None
Overrides:
None
Launching step: 2019-05-26_2020-12-26_choose_best
Parameters:
{******}
Configurations:
None
Overrides:
None
Launching step: 2019-07-15_2021-01-05_choose_best
Parameters:
{************}
Configurations:
None
Overrides:
None
Launching step...
One sec I'll paste the relevant pieces of code
You should try: trains-agent daemon --gpus device=0,1 --queue dual_gpu --docker --foreground
and if it doesn't work, try quoting the device list: trains-agent daemon --gpus '"device=0,1"' --queue dual_gpu --docker --foreground
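The difference between the two commands is what the agent actually receives as the `--gpus` value once the shell has processed the quotes. Here is a minimal sketch using Python's `shlex`, which splits text with POSIX shell rules; the `gpus_argument` helper is a hypothetical name used only for illustration.

```python
import shlex


def gpus_argument(command_line):
    """Return the value that follows --gpus after POSIX-shell-style splitting."""
    argv = shlex.split(command_line)
    return argv[argv.index("--gpus") + 1]


plain = "trains-agent daemon --gpus device=0,1 --queue dual_gpu --docker --foreground"
quoted = "trains-agent daemon --gpus '\"device=0,1\"' --queue dual_gpu --docker --foreground"

print(gpus_argument(plain))   # device=0,1
print(gpus_argument(quoted))  # "device=0,1"
```

In the quoted form the double quotes survive as part of the value, which is the form Docker's own `--gpus` flag expects when it is given a comma-separated device list.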