assuming you have hparams.my_param
my suggestion is:
```python
import hydra
from omegaconf import DictConfig
from clearml import Task

@hydra.main(config_path="solver/config", config_name="config")
def train(hparams: DictConfig):
    task = Task.init(hparams.task_name, hparams.tag)
    overrides = {'my_param': hparams.value}
    task.connect(overrides, name='overrides')
    # when running remotely this will print the value set in "overrides/my_param"
    print(overrides['my_param'])
    # now we actually use overrides['my_param']
```
Make sense?
Not at all, we love ideas on improving ClearML.
I do not think there is a need to replace Feast; it seems to do a lot. I'm just thinking about integrating it into the ClearML workflow. Do you have a specific use case we can start to work on? Or maybe a workflow that would make sense to implement?
Can you post here the docker-compose.yml you are spinning up? Maybe it is the wrong one?
Step 4 here:
https://github.com/thepycoder/asteroid_example#deployment-phase
I pull all the parameters, and then manually filter on the HP keys (manually = I have to plug them in; they are not part of the optimizer object).
So is this an improvement to the optimizer._get_child_tasks_ids(...) interface?
e.g. return a structure like:
```python
[
    {
        'id': task_id,
        'hp1': value, 'hp2': value, 'hp3': value,
        'objective': dict(title='title', series='series', value=42),
    },
]
```
AttractiveCockroach17 could it be that Hydra actually kills these processes?
(I'm trying to figure out if we can fix something with the hydra integration so that it marks them as aborted)
ReassuredTiger98 could you provide more information? (versions, scenario, etc.)
Hi @<1524922424720625664:profile|TartLeopard58>
can't I embed scalars into Notion using the ClearML SDK?
I think that you need the hosted version for it (it needs some special CORS stuff on the server side to make it work)
Did you try in the clearml report? does that work?
Thanks! I think I was able to locate the issue, but I wanted to verify
Hi JitteryCoyote63
If you want to refresh the task object, call task.reload()
It will also refresh the artifacts.
The reason for not always doing so when accessing the .artifacts object is speed optimization (it might be slow compared to plain dict access, and we assume users expect it to behave like a dict).
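For instance, a minimal sketch (the task ID and artifact name below are hypothetical):
```python
from clearml import Task

task = Task.get_task(task_id='aabbcc')    # hypothetical ID of an existing task
task.reload()                             # refresh the task object, artifacts included
artifact = task.artifacts['my_artifact']  # hypothetical artifact name
local_copy = artifact.get_local_copy()    # download a local copy of the artifact
```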
but does that mean I have to unpack all the dictionary values as parameters of the pipeline function?
I was just suggesting a hack; the fix itself is transparent (I'm expecting it to be pushed tomorrow). Basically it will make sure the sample pipeline works as expected.
Regardless, and out of curiosity: if you only have one dict passed to the pipeline function, why not use named arguments?
Hi JitteryCoyote63
What do you have in agent.cuda_version?
(you can see it printed at the beginning of the log)
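If auto-detection gets it wrong, a minimal sketch of overriding it in ~/clearml.conf (the value below is just an example):
```
agent.cuda_version: "11.2"
```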
Hi @<1544128915683938304:profile|DepravedBee6>
You mean like backup the entire instance and restore it on another machine? Or are you referring to specific data you want to migrate?
BTW, if you are upgrading an old version of the server I would recommend upgrading through every version in between (there are migration scripts that need to be run in a few of them).
The issue only arises upon sending Images. (Both numpy, mpl and PIL)
BTW: they should appear under the Debug Samples tab in the Results section.
Can I change the parameters before executing the draft task?
Yes you can, after you clone the experiment everything becomes editable, so you can edit the config in the UI.
For example, let's assume I have config.yml, and in my code I do:
```python
my_file = task.connect_configuration('config.yml')
with open(my_file, 'rt') as f:
    ...
```
Then after I clone it in the UI and edit the configuration, when it is executed remotely, my_file will contain the content of the configuration as s...
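To make the pattern concrete, a minimal sketch assuming config.yml holds YAML (the project/task names are hypothetical):
```python
import yaml
from clearml import Task

task = Task.init(project_name='examples', task_name='config demo')
my_file = task.connect_configuration('config.yml')
with open(my_file, 'rt') as f:
    # running locally: the original file; running remotely: the UI-edited content
    config = yaml.safe_load(f)
```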
Weird that this code is also uploading to 'Plots'. I replicated the same thing as my main script, but the main script is still uploading to Debug Samples.
SmarmyDolphin68 are you saying the same code behaves differently ?
Hi GrotesqueDog77
and after some time I want to delete artifact with
You can simply upload with the same local file name and same artifact name, it will override the target storage. wdyt?
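For example, a minimal sketch (the artifact name and file are hypothetical):
```python
from clearml import Task

task = Task.init(project_name='examples', task_name='artifact demo')
task.upload_artifact(name='stats', artifact_object='stats.csv')
# ... later, uploading with the same local file name and artifact name
# overrides the copy in the target storage:
task.upload_artifact(name='stats', artifact_object='stats.csv')
```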
Hi BoredPigeon26
what do you mean by "reuse the task"? Is this manual execution (i.e. from code)?
How about archiving the old version?
You can also force Task.init to always create a new Task (which preserves the previous run alongside the execution tab)
Basically, what's the specific use case?
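A minimal sketch of forcing a new Task on every run (the project/task names are hypothetical):
```python
from clearml import Task

task = Task.init(
    project_name='examples',
    task_name='my experiment',
    reuse_last_task_id=False,  # always create a new Task instead of reusing the previous one
)
```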
JitteryCoyote63 with pleasure
BTW: the bug ElegantKangaroo44 found in the Ignite TrainsLogger will be fixed soon (I think it's already on a branch by SuccessfulKoala55); should be in the RC next week.
If this doesn't help, go to your ~/clearml.conf file; at the bottom you can add agent.python_binary and set it to the location of python3.6 (you can run which python3.6 to get the full path):
```
agent.python_binary: /full/path/to/python3.6
```
can we somehow choose the pool of ports for clearml-session to work with?
Yes, I think you can.
How do you spin up the worker nodes? Is it Kubernetes?
Hi @<1567321739677929472:profile|StoutGorilla30>
Is it necessary to serve a Keras model using the Triton engine?
It is not, but it is the most efficient way to serve Keras models, and this is why by default clearml-serving uses Nvidia Triton (we are talking about 10x factors).
I would start with the Keras example, see that it works, and then work your way into your own example (notice you always need to provide the input/output layers of the model).
https://github.com/allegroai/clearml-s...
PompousBeetle71 a few questions:
is this like using PyTorch distributed, only manually? Why don't you call trains.init
in all the subprocesses? We had a few threads on that; it seems like a recurring question, so I'll make sure we have an example on GitHub. Basically trains will take care of passing the arg-parser commands to the subprocesses, and also of the torch node settings. It will also make sure they all report to the same experiment. What do you think?
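A minimal sketch of what calling it in every subprocess could look like (the worker logic is hypothetical; trains is the former name of ClearML):
```python
import torch.multiprocessing as mp
from trains import Task

def worker(rank):
    # each subprocess calls Task.init; trains detects the parent process
    # and makes all workers report to the same experiment
    task = Task.init(project_name='examples', task_name='distributed train')
    # ... training code for this rank ...

if __name__ == '__main__':
    mp.spawn(worker, nprocs=2)
```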
Hi HealthyStarfish45
You can disable the entire TB logging:
```python
Task.init('examples', 'train', auto_connect_frameworks={'tensorflow': False})
```
(without having to execute it first on Machine C)
Someone somewhere has to create the definition of the environment...
The easiest way to go about it is to execute it once. You can add the following line to your code:
```python
task.execute_remotely(queue_name='default')
```
This will cause your code to stop running and enqueue itself on a specific queue.
Quite useful if you want to make sure everything works (like running a single step), then continue on another machine.
Notice that switching between cpu...
Hmm, what's the clearml version? What's the python version, what's the OS? And the PyTorch version?
Hi AdorableFrog70
I assume so; there's an API for everything, so you can always get the data. wdyt?
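For example, a minimal sketch of pulling task data programmatically (the project name is hypothetical):
```python
from clearml import Task

tasks = Task.get_tasks(project_name='examples')
for t in tasks:
    # print each task ID with its last reported scalar values
    print(t.id, t.get_last_scalar_metrics())
```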