
ok, but this happens in my local machine, not in the agent
resource monitoring is always running in the background, even on local machines. (of course you can turn it off)
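For reference, something along these lines should turn it off at Task.init time (a minimal sketch, the project/task names are placeholders):
```python
from clearml import Task

# disable the background CPU/GPU/memory monitoring for this run
task = Task.init(
    project_name="examples",
    task_name="no resource monitoring",
    auto_resource_monitoring=False,
)
```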
PricklyRaven28 did you set the IAM role support in the conf?
https://github.com/allegroai/clearml/blob/0397f2b41e41325db2a191070e01b218251bc8b2/docs/clearml.conf#L86
but does that mean I have to unpack all the dictionary values as parameters of the pipeline function?
I was just suggesting a hack 🙂 the fix itself is transparent (I'm expecting it to be pushed tomorrow), basically it will make sure the sample pipeline will work as expected.
Regardless, and out of curiosity, if you only have one dict passed to the pipeline function, why not use named arguments ?
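Something like this is what I mean (a rough sketch, the argument names are made up), using the pipeline decorator so each value shows up as its own tracked argument:
```python
from clearml.automation.controller import PipelineDecorator

# expose each value as a named argument instead of one opaque dict,
# so the pipeline UI can show and override them individually
@PipelineDecorator.pipeline(name="example pipeline", project="examples", version="1.0")
def run_pipeline(learning_rate: float = 0.01, batch_size: int = 32, epochs: int = 5):
    ...  # call pipeline components here


if __name__ == "__main__":
    PipelineDecorator.run_locally()
    run_pipeline(learning_rate=0.001, batch_size=64, epochs=10)
```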
Can you please tell me whether it is necessary to rewrite the Docker compose file?
not by default, it should basically work out of the box as long as you create the same data folders on the host machine (e.g. /opt/clearml)
This can be as simple as a pod or a more complete Helm chart.
True, and this could be good for batch processing, but if you want a REST API service then clearml-serving is probably a better fit
does that make sense ?
(I suspect you are correct, but I'm missing some information in order to understand where the problem is)
WackyRabbit7 can you send mock code that explains how you create the pipeline ?
I managed to do it by using logger.report_scalar, thanks!
Sure, but for future reference, where (in the ignite callbacks) did you add the report_scalar call?
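For reference, roughly what I mean (a sketch, assuming Task.init() was already called and `train_step` is your existing update function):
```python
from clearml import Logger
from ignite.engine import Engine, Events


def train_step(engine, batch):
    ...  # forward/backward pass
    return 0.0  # return the loss value


trainer = Engine(train_step)


# report the scalar from an ignite event handler, once per iteration
@trainer.on(Events.ITERATION_COMPLETED)
def log_training_loss(engine):
    Logger.current_logger().report_scalar(
        title="loss", series="train",
        value=engine.state.output, iteration=engine.state.iteration)
```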
requirements specified with git repo
you mean the requirements.txt is inside the git repo? or do you mean a link to the git repo as part of the requirements?
Can you also provide an example of the content? I think I have an idea
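Just to make sure we are talking about the same thing, a hypothetical example of the second case (a git link inside requirements.txt using pip's VCS syntax; the repo URL is made up):
```
numpy>=1.21
# package installed straight from a git repository
git+https://github.com/example-org/example-package.git@main#egg=example-package
```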
StaleMole4 you are printing the values before Task.init had the chance to populate them.
Basically try moving the print to after closing the Task (closing the Task waits for the async update)
Make sense ?
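Something like this (a minimal sketch, the names are placeholders):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="print after close")
# ... your actual code here ...

task.close()                  # close() waits for the pending async updates
print(task.get_parameters())  # only now the values are fully populated
```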
Hi @<1523704198338711552:profile|RoughTiger69>
From this scenario can we assume the "selection" will be tagging the model manually?
Also, how would a human operator decide on the best model, i.e. what is the input to base the decision on?
Then the only other option is that /tmp is out of space (pip uses it to uncompress the .whl files, then deletes them)
wdyt?
The problem is not really for the agents to wait (that is easily solved by an additional high-priority queue); the problem is whether you will have a "free" agent... you see my point ?
Looking at the supervisor method of the base AutoScaler class, where are the worker IDs kept? Is it in the class attribute queues ?
Actually the supervisor is passing a fixed prefix, and then it asks the clearml-server for workers whose names start with that prefix.
This way we can have a fixed init script for all agents, while we can still differentiate them from the other agent instances in the system. Make sense ?
CheerfulGorilla72 sounds like a great idea, I'll pass it along to the documentation people 🙂
I will take any suggestion 🙂
git remote -v could be a good start, but I'm not familiar with the output structure, is there a template for parsing ?
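To be concrete, a quick sketch of what I had in mind (just an assumption on my side, not an existing ClearML helper): run git remote -v and parse its fixed "<name> <url> (fetch|push)" layout:
```python
import re
import subprocess

output = subprocess.check_output(["git", "remote", "-v"], text=True)

remotes = {}
for line in output.splitlines():
    # each line looks like: "origin\thttps://github.com/user/repo.git (fetch)"
    match = re.match(r"^(\S+)\s+(\S+)\s+\((fetch|push)\)$", line)
    if match:
        name, url, direction = match.groups()
        remotes.setdefault(name, {})[direction] = url

print(remotes)  # e.g. {'origin': {'fetch': '...', 'push': '...'}}
```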
DistressedGoat23 check this example:
https://github.com/allegroai/clearml/blob/master/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py
aSearchStrategy = RandomSearch
It will collect everything on the main Task
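The gist of that example, as a short sketch (the task ID, metric names and ranges are placeholders you would replace with your own):
```python
from clearml.automation import (
    HyperParameterOptimizer, RandomSearch, UniformParameterRange)

optimizer = HyperParameterOptimizer(
    base_task_id="<your_base_task_id>",   # the Task the optimizer clones per trial
    hyper_parameters=[
        UniformParameterRange("General/learning_rate", min_value=1e-4, max_value=1e-1),
    ],
    objective_metric_title="validation",
    objective_metric_series="accuracy",
    objective_metric_sign="max",
    optimizer_class=RandomSearch,
    execution_queue="default",
    max_number_of_concurrent_tasks=2,
    total_max_jobs=10,
)

optimizer.start()  # all trial results are collected on the main (optimizer) Task
optimizer.wait()
optimizer.stop()
```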
This is a crucial point for using ClearML HPO, since comparing dozens of experiments in the UI and searching for the best is just not manageable.
You can of course do that (notice you can actually order them by scalars they report, and even do ...
Hurray 🙂
BTW: the next version will have a project-level "readme"-like markdown embedded in the UI, so hopefully you will be able to add all the graphs there :)
I had the same problem again, but within a remote pipeline setup.
Are you saying the issue is not fixed? Can you verify the pipeline & pipeline components are using at least version 1.104rc0?
@<1523720500038078464:profile|MotionlessSeagull22> you cannot have two graphs with the same title, the left side panel presents graph titles. That means that you cannot have a title=loss series=train & title=loss series=test on two diff graphs, they will always be displayed on the same graph.
That said, when comparing experiments, all graph pairs (i.e. title+series) will be displayed as a single graph, where the diff series are the experiments.
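For example (a minimal sketch), both series below share title="loss", so the UI draws them as two curves on the same graph:
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="scalar grouping")
logger = task.get_logger()

for i in range(10):
    logger.report_scalar(title="loss", series="train", value=1.0 / (i + 1), iteration=i)
    logger.report_scalar(title="loss", series="test", value=1.2 / (i + 1), iteration=i)
```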
RipeGoose2 models are automatically registered
i.e. added to the models artifactory, but the entry only points to where the files are stored
Only if you are passing the output_uri argument to Task.init will they actually be uploaded.
If you want to disable this behavior you can pass Task.init(..., auto_connect_frameworks={'pytorch': False})
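i.e. something along these lines (a sketch, the names and URI are placeholders):
```python
from clearml import Task

# output_uri=True uploads checkpoint files to the files server
# (any storage URI such as "s3://bucket/path" works as well)
task = Task.init(project_name="examples", task_name="train", output_uri=True)

# or, to stop ClearML from auto-registering PyTorch checkpoints altogether:
# task = Task.init(project_name="examples", task_name="train",
#                  auto_connect_frameworks={'pytorch': False})
```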
UPD: works on 1.7.0 as well, the bug was introduced in 1.8.0
Thanks JitteryCoyote63 , just to be clear, is this only in comparison or also on the individual Tasks ?
Thanks SmallDeer34 , I think you are correct: the 'output' model is returned properly, but the 'input' models are returned as model names, not model objects.
Let me check something
NastyFox63 ask SuccessfulKoala55 tomorrow, I think there is a way to change the default settings even with the current version.
(I.e. increase the default 100 entries limit)
this?
ids = [t.id for t in top_task]
No worries, it's always good to know what can be built later.
I would start with a static .env file (i.e. the same for everyone), or start by hacking the Python code to load the .env at the beginning.
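The second option could be as small as this (a sketch, assuming the python-dotenv package is installed):
```python
# pip install python-dotenv
from dotenv import load_dotenv

# call this at the very top of the entry script, before anything reads os.environ
load_dotenv(".env")
```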