SuccessfulKoala55
We are using the file server, which is configured in clearml.conf to point to a path on a network drive (i.e. the NAS):
files_server: file:///mnt/clearml_storage
Can you provide some more details, please? Do you intend to store your artifacts locally or remotely?
Does the manual reporting also fail?
It would also help if you could share your clearml package versions.
I store the artifacts on a minio server (in my LAN).
If I run the Python script locally (i.e. without execute_remotely()), it works fine.
I use the latest clearml, 1.6.2.
Did you by any chance save the checkpoint without any file extension? Or with a weird name containing sl...
CostlyOstrich36
I do get errors when launching the clearml images:
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
standard_init_linux.go:228: exec user process caused: exec format error
Well, after diving into this, it seems the clearml images were built for amd64 (on top of amd64 base images...)
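For what it's worth, the mismatch in that warning can be checked before launching anything. Below is a minimal sketch, assuming the images are published for linux/amd64 only (which is what the warning suggests). `docker_platform_args` and `host_docker_arch` are hypothetical helpers, not part of clearml; `--platform` is the standard `docker run` flag for requesting a specific platform. On an arm64 host this still depends on emulation being available, which is not guaranteed.

```python
import platform

# Assumption: the image is only available for this platform.
IMAGE_PLATFORM = "linux/amd64"

# Map the names reported by the OS to the names Docker uses.
_ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def host_docker_arch(machine=None):
    """Return the host CPU architecture using Docker's naming."""
    machine = (machine or platform.machine()).lower()
    return _ARCH_ALIASES.get(machine, machine)


def docker_platform_args(image_platform=IMAGE_PLATFORM, machine=None):
    """Return extra `docker run` arguments, or [] if the host already matches."""
    image_arch = image_platform.split("/")[1]
    if host_docker_arch(machine) == image_arch:
        return []
    return ["--platform", image_platform]
```

The returned list would be spliced into the `docker run` command line; an empty list means the host already matches the image.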
That would be a very useful feature.
What is the status of that issue? I haven't found it on GitHub.
AgitatedDove14
I'm not sure.
In my case I'm not trying to reproduce a local environment in the agent, but to run a script inside a Docker container which already has the environment built in.
The environment is conda based.
After signing in with Google, the login page is stuck at this
AgitatedDove14 ,
From the experiment’s console log:
` - boto3==1.16.2
- botocore==1.19.2 `
What does that actually mean?
2022-07-17 07:59:40,339 - clearml.storage - ERROR - Failed uploading: Parameter validation failed: Invalid type for parameter ContentType, value: None, type: <class 'NoneType'>, valid types: <class 'str'>
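The wording of that message is boto3's parameter validation: ContentType has to be a string, and here it was None. One plausible cause (an assumption, not something this log confirms) is that the content type is guessed from the file name and the guess fails when the name has no recognizable extension. A minimal sketch of that failure mode and a fallback, using only the standard library; `content_type_for` is a hypothetical helper, not part of clearml or boto3:

```python
import mimetypes

# Generic binary type, used when nothing better can be guessed.
DEFAULT_CONTENT_TYPE = "application/octet-stream"


def content_type_for(filename):
    """Guess a MIME type from the file name, never returning None."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or DEFAULT_CONTENT_TYPE
```

A file saved as `checkpoint` (no extension) gets no guess from `mimetypes`, so without the fallback the value passed on would be None.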
Not sure I understand the purpose of this.
Does it mean pip will look for wheels at this URL?
Thanks AgitatedDove14 !
I’ll use clearml 1.4.1 until the fix is out.
AgitatedDove14 it's running inside a Docker-based worker.
Are you interested in the full pip freeze of that docker?
The script is intended to be executed remotely.
Can I declare an absolute path in this case?
Oh wow AgitatedDove14 . Appreciate it!
Are you sure it’s just a matter of the python version?
The same experiment script was working on the exact same docker image in the past (with older clearml versions though…).
For example this experiment log:
CostlyOstrich36
Is that command evaluated prior to the task creation?
Or only after the task is executed remotely?
Why do you have this part? Isn’t it the same code? The script entry point is auto-detected.
Because I don’t always run the script locally from its directory, and I have additional modules in the same directory that I import.
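To illustrate the kind of thing that part does: a minimal sketch, assuming the goal is only to make sibling modules importable no matter which working directory the script is started from. `add_script_dir_to_path` is a hypothetical helper name, not the actual code being discussed.

```python
import sys
from pathlib import Path


def add_script_dir_to_path(script_file):
    """Put the directory containing script_file at the front of sys.path."""
    script_dir = str(Path(script_file).resolve().parent)
    if script_dir not in sys.path:
        sys.path.insert(0, script_dir)
    return script_dir
```

At the top of the script this would be called as `add_script_dir_to_path(__file__)`, before importing the modules that live next to it.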
Sure this will work
I’ll make sure to update it
Yes you are right.
This is the default docker image from clearml, and I was thinking that the agent would install conda if it's not already there (like it installs pip...). Doesn't it?
AgitatedDove14
Yes, I'd like to point to a specific binary, which is in a conda environment.
(By the way, how can I specify the Python version on the Task?)
AgitatedDove14
It's still failing.
I updated clearml-agent to 1.2.0rc7 and also set:
docker_setup_bash_script=["export PATH=/workspace/miniconda/bin:$PATH", "export LOCAL_PYTHON=/workspace/miniconda/bin/python3", "conda activate"])
But conda activate (base env) returns:
CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
I noticed that conda ...
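That CommandNotFoundError generally means the shell functions conda relies on were never loaded in that shell. A common workaround is to source conda's own `conda.sh` before activating. A sketch, assuming the Miniconda prefix from the snippet above and a standard conda layout (`etc/profile.d/conda.sh` under the prefix); `conda_setup_lines` is a hypothetical helper, and whether the agent then uses that interpreter is a separate question this does not address.

```python
def conda_setup_lines(conda_prefix="/workspace/miniconda", env_name="base"):
    """Build bash lines that make `conda activate` usable in a plain shell."""
    return [
        # Make the conda executable itself reachable.
        f"export PATH={conda_prefix}/bin:$PATH",
        # Load the shell functions that `conda activate` depends on.
        f"source {conda_prefix}/etc/profile.d/conda.sh",
        f"conda activate {env_name}",
    ]
```

The resulting list has the same shape as the value passed to docker_setup_bash_script above, so it could be tried in its place.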
Hi CostlyOstrich36,
The idea is indeed to control the object via the API, but in this particular case I don't want the seed to be specified through the API; I just want it set to the current timestamp.
Could you think of a better use?
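To make the seed case concrete, here is a minimal sketch of the behavior I mean, assuming an explicit value wins and the timestamp is only a fallback; `resolve_seed` is a hypothetical helper, not an existing clearml function.

```python
import time


def resolve_seed(seed=None):
    """Use the given seed if there is one, otherwise the current Unix timestamp."""
    if seed is None:
        return int(time.time())
    return int(seed)
```

Logging the resolved value afterwards would keep the run reproducible even when the timestamp fallback is used.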
AgitatedDove14 .
Note that the actual error is /workspace/miniconda/bin/python3: No module named clearml_agent
since all the packages (including clearml_agent) were already installed by the agent on the default (non conda) python binary.
CostlyOstrich36
from clearml import Task
from clearml.automation import HyperParameterOptimizer, UniformIntegerParameterRange, DiscreteParameterRange

task = Task.init(
    project_name="examples",
    task_name="HP optimizer",
    task_type=Task.TaskTypes.optimizer,
    reuse_last_task_id=False,
)
task.execute_remotely(queue_name="services")

an_optimizer = HyperParameterOptimizer(
    base_task_id="c7618e30ff5c4955b4942971b410f72d",
    ...
CostlyOstrich36, am I doing anything wrong here?
Thanks for the great explanation! Now it makes much more sense.
You are right that 'kwcoco' isn't being detected. I'm actually running this as a single script, and kwcoco is not imported directly (but from within another package).
I'll try running it from a repo and see how it works.
AgitatedDove14
It was installed by 'pip install kwcoco' while my conda env was active.
Not sure if it answers your question.
So I ran the same script as part of a git repo, but unfortunately the package is still missing.
I'm not sure if it matters but 'kwcoco' is being imported inside one of the repo's functions and not on the script's header.
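As a sanity check on my side, a static scan does see imports nested inside functions. The sketch below uses only the standard library and is not how clearml detects packages (I don't know its exact mechanism); `imported_top_level_packages` is a hypothetical helper that just lists what a source file imports, wherever the import statement sits.

```python
import ast


def imported_top_level_packages(source):
    """Return the top-level package names imported anywhere in the source."""
    packages = set()
    # ast.walk visits every node, including those inside function bodies.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                packages.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0); they refer to local modules.
            if node.level == 0 and node.module:
                packages.add(node.module.split(".")[0])
    return packages
```

Running it over the module that contains the function-level import would show whether 'kwcoco' is at least visible to this kind of scan.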
AgitatedDove14 , here's the log