AgitatedDove14
Yes, I'd like to point to a specific binary, which is in a conda environment.
(BTW, how can I specify the Python version on the Task?)
AgitatedDove14,
From the experiment’s console log:
` - boto3==1.16.2
- botocore==1.19.2 `
Why do you have this part? Isn’t it the same code? Isn’t the script entry point auto-detected?
Because I don’t always run the script locally from its directory, and I have additional modules in the same directory that I import.
Sure this will work
I’ll make sure to update it
Oh wow AgitatedDove14 . Appreciate it!
Are you sure it’s just a matter of the python version?
The same experiment script was working on the exact same docker image in the past (with older clearml versions, though…).
For example this experiment log:
what does that actually mean?
2022-07-17 07:59:40,339 - clearml.storage - ERROR - Failed uploading: Parameter validation failed: Invalid type for parameter ContentType, value: None, type: <class 'NoneType'>, valid types: <class 'str'>
Hi @<1523701070390366208:profile|CostlyOstrich36> ,
The idea is indeed to control the object via the API, but in that particular case I don't want the seed to be specified by the API; I just want to set it to the current timestamp.
Could you think of a better use?
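Roughly what I have in mind (a hypothetical sketch; make_seed and its argument are made up for illustration):
` import time

def make_seed(seed=None):
    # if the caller / API didn't supply a seed, fall back to the current timestamp
    return int(time.time()) if seed is None else seed `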
@<1523701070390366208:profile|CostlyOstrich36> am I doing anything wrong here?
Something like:
import torch

model = SomePytorchModel()  # placeholder for the actual model
checkpoint = {'model_state_dict': model.state_dict()}
torch.save(checkpoint, "model.tar")
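And loading it back later would be something like (sketch, same placeholder model class):
` model = SomePytorchModel()
checkpoint = torch.load("model.tar")
model.load_state_dict(checkpoint['model_state_dict']) `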
AgitatedDove14.
Note that the actual error is /workspace/miniconda/bin/python3: No module named clearml_agent
since all the packages (including clearml_agent) were already installed by the agent on the default (non-conda) Python binary.
AgitatedDove14
I'm not sure.
In my case I'm not trying to reproduce a local environment in the agent, but to run a script inside a docker which already has the environment built in.
The environment is conda based.
SuccessfulKoala55
We are using the fileserver, which is configured in clearml.conf to a path on a network drive (i.e. the NAS):
files_server: file:///mnt/clearml_storage
Because we want all our data to be stored on premises.
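Roughly, the relevant part of our clearml.conf looks like this (a simplified sketch; the mount path is just our setup):
` api {
    # artifacts and models are written straight to the NAS mount
    files_server: "file:///mnt/clearml_storage"
} `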
UnevenDolphin73 Thanks! I’ll look into it and reach out if needed.
Oh!
That was so silly on my side...
It didn't work either. Still same error.
The conda command was already in PATH and conda activate is executed, but it prompts to run conda init (i.e. conda wasn’t configured in that shell):
` You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.
CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run
$ conda init <SHELL_NAME>
Currently supported shells are:
- ba...
The script is intended to be executed remotely.
Can I declare an absolute path in this case?
Thanks ExasperatedCrab78
AgitatedDove14 - attached
AgitatedDove14
It's still failing.
I updated clearml-agent to 1.2.0rc7 and also:
docker_setup_bash_script = [
    "export PATH=/workspace/miniconda/bin:$PATH",
    "export LOCAL_PYTHON=/workspace/miniconda/bin/python3",
    "conda activate"
]
But the conda activate (base env) returns:
CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
I noticed that conda ...
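For reference, a variant I might try (untested sketch, assuming the standard Miniconda layout under /workspace/miniconda) is sourcing conda's shell hook before the activate:
` docker_setup_bash_script = [
    "export PATH=/workspace/miniconda/bin:$PATH",
    "export LOCAL_PYTHON=/workspace/miniconda/bin/python3",
    # sourcing conda.sh registers the 'conda' shell function,
    # so 'conda activate' works without running 'conda init' first
    "source /workspace/miniconda/etc/profile.d/conda.sh",
    "conda activate"
] `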
Not sure I understand the purpose of this.
Does it mean that pip will look for wheels at this URL?
AgitatedDove14 Yes, that’s correct.
It's in my local conda environment though.
Thanks for the great explanation! Now it makes much more sense.
You are right about the issue that 'kwcoco' isn’t being detected. I’m actually running this as a single script, and kwcoco is not imported directly (but from within another package).
I’ll try running it from a repo and see how it works.
Thanks @<1523701087100473344:profile|SuccessfulKoala55> @<1523701070390366208:profile|CostlyOstrich36> .
Fixed with the RC.
Thanks @<1523701070390366208:profile|CostlyOstrich36> .
What are considered as experiment objects?
No. But this isn't the only package that was installed using pip inside a conda env.
Take Pandas as an example: it was installed using pip, and it's actually installed inside the worker as well (using pip).
AgitatedDove14 , did you test it using a worker, or with local execution?
I just tested https://github.com/allegroai/clearml/blob/master/examples/frameworks/pytorch/pytorch_mnist.py with a (docker based) worker and it yields the same error
` 2022-07-17 07:59:40,330 - clearml.Task - INFO - Waiting to finish uploads
2022-07-17 07:59:40,330 - clearml.storage - INFO - Starting upload: /tmp/.clearml.upload_model_0_4d_ikk.tmp => tapsff.local:9000/clearml/examples/PyTorch MNIST train.02ed1df11bf54...
An absolute path on my hard drive?