
Hi SubstantialElk6, ClearML-Data doesn't actually "load" the data; it fetches it locally and returns a folder with all your data files. From that point onward, it's up to your code to load it into the framework. Makes sense?
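Something along these lines (the dataset project/name below are placeholders):
` from clearml import Dataset

# get the dataset and download a local (cached) copy of all its files
dataset = Dataset.get(dataset_project="my_project", dataset_name="my_dataset")  # placeholder names
local_folder = dataset.get_local_copy()
# from here on, your own code loads the files from local_folder into the framework `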
Meanwhile you can just sleep for 24 hours and put it all on the services queue. It should work 🙂
Example here:
https://github.com/allegroai/trains/blob/master/examples/services/cleanup/cleanup_service.py
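As a rough sketch of that pattern (the project/task names and the work inside the loop are placeholders):
` import time
from clearml import Task

task = Task.init(project_name="DevOps", task_name="periodic job")  # placeholder names
# enqueue this task on the services queue and stop the local run
task.execute_remotely(queue_name="services", exit_process=True)

while True:
    # ... do the actual periodic work here ...
    time.sleep(60 * 60 * 24)  # sleep for 24 hours between iterations `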
Hi JitteryCoyote63, I cannot reproduce it... when I call set_initial_iteration(0), it does what I'm expecting and resends the scalars. I tested with the clearml ignite example; any thoughts on how I can reproduce it?
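For reference, roughly what I was testing (a sketch; the project/task names are placeholders):
` from clearml import Task

task = Task.init(project_name="examples", task_name="ignite example",
                 continue_last_task=True)  # placeholder names
task.set_initial_iteration(0)  # scalars are reported starting from iteration 0 again `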
I'm not able to compare the tables of two experiments; is it a known issue?
How so? They should appear one next to the other. The content of the two tables is not "really" compared; the standard is too complicated 😞 (apparently this is far from trivial)
SubstantialElk6 (2) yes definitely will be fixed
Regarding (1), what do you mean by "via the code"? Do you mean like as a Task docker cmd?
CooperativeFox72 could you expand on "not working"?
If you have a yaml file, I would do:
` import yaml

# local_path = './my_config.yaml'
path = task.connect_configuration(local_path, name=name)
if task.running_locally():
    with open(local_path, "r") as config_file:
        my_params_dict = yaml.load(config_file, Loader=yaml.FullLoader)
    my_params_dict['change_me'] = 'new value'
    my_params_text = yaml.dump(my_params_dict)
    # store back the change; my_params_text is assumed to be the content of the param file (text) `
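The truncated part presumably just writes the text back to the same file, something like this (a sketch, not necessarily the original ending):
`     with open(local_path, "w") as config_file:
        config_file.write(my_params_text) `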
UnevenDolphin73
we'd like the remote task to be able to spawn new tasks,
Why is this an issue? This should work out of the box, shouldn't it?
For classification it's the F1 score, but for other tasks it may be something else, and I don't think that's a problem. We just have to log it, right?
Correct 🙂
Give me a few days; I will work on your suggestions and then let you know if I am not able to do this.
Sounds good!
BTW:
` previous_tasks = Task.get_tasks(task_filter={'tags': 'best'})
local_model_file = previous_tasks[0].artifacts['my_model'].get_local_copy() `
Thanks SubstantialElk6 !
I believe an initial fix was pushed 😉 A full one (merging the Task --env with the k8s template) will be added soon
Could you try running your code from outside the git repository?
I have a theory: you never actually added the entry point file to the git repo, so the agent never actually installed it, and it just did nothing (it should have reported an error; I'll look into it).
WDYT?
Maybe worth updating the main Readme.md on GitHub... if someone tries to follow the instructions there, it breaks.
Hmm, I thought we already did... Yes, you are absolutely correct, I'll make sure we do.
These are the prerequisites for the docker service installed on the host machine (where the agent is running)
Basically follow: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
https://docs.docker.com/compose/gpu-support/
Well, at this point I'm not sure it is still essential; we have 3 run modes (offline, local-server, cloud-server) and this option made it work for all of them... It could be that it is not required anymore and it's just legacy...
LOL, sure if you have so many setups, that makes sense 🙂
This is strange... you ran it with the dataclass config I added?
Yes, but I had to remove the: `from config_files import cfg`
and instead used:
` @hydra.main(config_path="config_files", config_name="confi...
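Presumably the full decorator usage looks roughly like this (the config name below is a placeholder, since the original message is cut off):
` import hydra
from omegaconf import DictConfig

@hydra.main(config_path="config_files", config_name="config")  # "config" is a placeholder name
def main(cfg: DictConfig) -> None:
    print(cfg)

if __name__ == "__main__":
    main() `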
Woot woot
ChubbyLouse32 when you get it working please PR it, this is very very cool!
(I'll be happy to help 🙂 )
quick update, still trying to reproduce ...
yes.
Obviously when you import the offline session, you will need to set it to point to your server with the correct credentials
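A minimal sketch of that import, assuming the Task.import_offline_session API (the zip path below is a placeholder):
` from clearml import Task

# make sure clearml.conf (or Task.set_credentials) points to your server first
Task.import_offline_session(session_folder_zip="/path/to/offline_session.zip")  # placeholder path `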
Yes, I think we just found out it breaks clearml 🙂
Could you test with the latest stable version, just in case?
(I'll make sure we have an RC that supports the hydra dev version)
SillyPuppy19, yes you are correct. Actually, I can promise you the callback will be called from a different thread (basically the monitoring thread), so it's on the user to make sure the callback can handle it.
How about we move this discussion to GitHub?
https://github.com/allegroai/clearml/blob/fcad50b6266f445424a1f1fb361f5a4bc5c7f6a3/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py#L86
you can just pass the instance of the OptunaOptimizer, you created, and continue the study
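For reference, a rough sketch of the standard wiring (the task ID, parameter range, and metric names below are placeholders); continuing an existing study would be a variation on this:
` from clearml.automation import HyperParameterOptimizer, UniformIntegerParameterRange
from clearml.automation.optuna import OptimizerOptuna

optimizer = HyperParameterOptimizer(
    base_task_id="<template task id>",  # placeholder
    hyper_parameters=[UniformIntegerParameterRange("General/epochs", min_value=1, max_value=10)],
    objective_metric_title="accuracy",     # placeholder metric
    objective_metric_series="validation",  # placeholder series
    objective_metric_sign="max",
    optimizer_class=OptimizerOptuna,
)
optimizer.start()
optimizer.wait()
optimizer.stop() `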
I think you have it on the Workers and Queues page; when you click on the worker you can see its details.
But I do not have anything linked correctly, since I rely on conda installing CUDA/cuDNN for me.
From the log it installed: cudatoolkit==11.1.1
based on the CUDA version it found on the host machine: agent.cuda_version = 110
But for some reason it installed PyTorch from the conda "pytorch" channel without CUDA support.
@<1523701868901961728:profile|ReassuredTiger98> in the UI can you see it in the "installed packages" section under the Execution Tab ?
Uninstall the current clearml-agent and reinstall this wheel, I hacked it to have ==, let's see if that works
Hi @<1689808977149300736:profile|CharmingKoala14> , let me double check that
@<1523701868901961728:profile|ReassuredTiger98> thank you so much for testing it!