You can definitely configure the watchdog to set the timeout to 15 min. It should not have any effect on running processes; they basically ping an alive message every 30 sec.
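To make the reasoning concrete, here is a toy sketch of a staleness check. It is not the server's watchdog code; `is_stale` and the constants are hypothetical names that only reuse the 30-second ping and 15-minute timeout mentioned above.

```python
from datetime import datetime, timedelta

PING_INTERVAL = timedelta(seconds=30)     # alive message interval from the reply above
WATCHDOG_TIMEOUT = timedelta(minutes=15)  # the timeout being discussed


def is_stale(last_ping, now, timeout=WATCHDOG_TIMEOUT):
    """Return True when no alive message arrived within `timeout` (hypothetical helper)."""
    return now - last_ping > timeout


now = datetime(2024, 1, 1, 12, 0, 0)
healthy_last_ping = now - PING_INTERVAL       # a running process pinged 30 seconds ago
dead_last_ping = now - timedelta(minutes=20)  # a dead process has been silent for 20 minutes

print(is_stale(healthy_last_ping, now))
print(is_stale(dead_last_ping, now))
```

A running process refreshes its last ping far more often than the timeout, so only processes that stopped pinging are affected.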
DilapidatedDucks58 if you have so many parameters, why don't you use the
task.connect_configuration(dict)
It will put it in the artifacts, as an editable JSON-like string.
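A minimal sketch of what that could look like, assuming clearml is installed and configured. The project name, task name, configuration name and parameter values are made up for illustration; `connect_params` is a hypothetical wrapper around `task.connect_configuration`.

```python
def connect_params(task, params):
    """Register a (possibly nested) dict as one configuration object on the task.

    `task` is expected to be the object returned by clearml.Task.init().
    The name "training_params" is an arbitrary choice for this sketch.
    """
    return task.connect_configuration(params, name="training_params")


def main():
    # Imported here so the helper above can be read and tested without clearml installed
    from clearml import Task

    task = Task.init(project_name="examples", task_name="many parameters")  # hypothetical names
    params = {
        "optimizer": {"name": "adam", "lr": 0.001},
        "data": {"batch_size": 32, "augment": True},
    }
    # Keep using the returned dict, so values edited in the UI are picked up
    # when the task is executed by an agent
    params = connect_params(task, params)
    print(params["optimizer"]["lr"])
```

Calling `main()` from the training script is enough; the whole dict then travels as a single editable object instead of many separate parameters.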
if I use automatic code analysis it will not find all packages because of importlib.
But you can manually add them with Task.add_requirements, no?
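A rough sketch of that idea, assuming clearml is installed. The package names below are placeholders for whatever gets loaded through importlib, and `declare_requirements` is a hypothetical helper; the calls have to happen before `Task.init()`.

```python
import importlib

# Hypothetical packages that the code only ever loads through importlib,
# so the automatic analysis has no import statement to detect
DYNAMIC_PACKAGES = ["my_plugin_a", "my_plugin_b"]


def declare_requirements(task_cls, package_names):
    """Call add_requirements once per package; must run before Task.init()."""
    for name in package_names:
        task_cls.add_requirements(name)


def main():
    # Imported here so the helper above can be read and tested without clearml installed
    from clearml import Task

    declare_requirements(Task, DYNAMIC_PACKAGES)
    Task.init(project_name="examples", task_name="dynamic imports")  # hypothetical names
    plugins = {name: importlib.import_module(name) for name in DYNAMIC_PACKAGES}
    print(sorted(plugins))
```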
It might be that the worker was killed before it unregistered. You will still see it there, but the last update will be stuck (after 10 min it will be automatically removed).
could be nice to have a direct "task comparison" link in the UI somewhere,
you mean like a "cart" for comparison ? or just to "save the state" so you can move between projects ?
Yes, it could, crontab uses the user it is running from (root if used with sudo)
what do you mean? the same env for all components ? if they are using/importing exactly the same packages, and using the same container, then yes it could
You can make reports on experiments with interactive graphs
Yes, I can totally see how this is a selling point. The closest is the Project Overview (full markdown capabilities, with the ability to embed links to specific experiments). You can also add a "leader metric", so you can track the project performance/progress over time.
I have to admit that creating a better reporting tool is always pushed down in priority as I think this is a good selling point to management but the actual ...
Okay, so I can't figure out why it would "kill" the new experiments, I mean it should run them, but is there any "smart stopping" that causes it to kill the process before it ends ?
BTW: can this be reproduced with the clearml hydra example ?
Hi NaughtyFish36
c++ module fails to import, anyone have any insight? required c++ compilers seem to be installed on the docker container.
Can you provide log for the failed Task?
BTW: if you need build-essential you can add it to the Task startup script: apt-get install build-essential
Hi @<1529633468214939648:profile|CostlyElephant1>
what seems to be the issue? I could not locate anything in the log
"Environment setup completed successfully
Starting Task Execution:"
Do you mean it takes a long time to setup the environment inside the container?
CLEARML_AGENT_SKIP_PIP_VENV_INSTALL and CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL,
It seems to be working, as you can see no virtual environment is created, the only thing that is installed is the clearml-agent that i...
That depends on the HPO algorithm. Basically they will be pushed based on the limit of "concurrent jobs", so you do not end up exploding the queue. It also might be a Bayesian process, i.e. based on the previous set of parameters and runs, like how hyper-band works (optuna/hpbandster)
Make sense ?
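As a toy illustration of the "concurrent jobs" limit (this is not the ClearML optimizer code; `schedule` and its arguments are hypothetical), new parameter sets are only pushed when a slot frees up:

```python
from collections import deque


def schedule(candidates, max_concurrent):
    """Toy loop that never keeps more than `max_concurrent` jobs in flight.

    Returns the peak number of jobs in flight and the order jobs were launched in.
    """
    pending = deque(candidates)
    running = deque()
    launched = []
    peak = 0
    while pending or running:
        # Top up the queue, but only up to the concurrency limit
        while pending and len(running) < max_concurrent:
            job = pending.popleft()
            running.append(job)
            launched.append(job)
        peak = max(peak, len(running))
        # Pretend the oldest running job finished, which frees one slot
        running.popleft()
    return peak, launched


learning_rates = (0.1, 0.01, 0.001, 0.0001, 0.00001)
peak, order = schedule([{"lr": lr} for lr in learning_rates], max_concurrent=2)
print(peak, len(order))
```

A Bayesian strategy would additionally choose the next candidate from the results of finished jobs instead of reading from a fixed list.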
Hi ElegantCoyote26
is there a way to get a Task's docker container id/name?
you mean like Task.get_task("task_id_here").get_base_docker() ?
ow a Task's results page also has a plot for this, but I guess it's at the machine level and not the task level?
This is actually on the container level, meaning checked from inside the container. It should be what you are looking for
Seems like someone is sitting in the middle and rerouting the request (maybe both https and port) ?!
Not intentional! When I launched the AMI it was running an older version
I think this is exactly the reason they decided to change the location, so you will have to manually upgrade. The reasoning is that we changed directory names (maybe a few more things).
Yes: shut down the current docker-compose, curl the new docker-compose file, rename the folder, and spin it up again. Full instructions here:
https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_server_aws_ec2_ami.html#upgrading
which part of the code?
the main script?!
but is not part of the package
is the repo itself a package ?
It's dead simple to install:
pip install trains-agent
then you can simply do:
trains-agent execute --id myexperimentid
PompousParrot44
you can always manually store/load models, example: https://github.com/allegroai/trains/blob/65a4aa7aa90fc867993cf0d5e36c214e6c044270/examples/reporting/model_config.py#L35
Sure, you can patch any framework with something similar to what we do in xgboost, any such PR will be greatly appreciated! https://github.com/allegroai/trains/blob/master/trains/binding/frameworks/xgboost_bind.py
Hi BroadMole64
'from X import Y', which says that there isn't such module X. any help? thanks.
can you see package X under the "Execution" tab "Installed Packages" section ?
(think of this section as requirements.txt section, in order for the agent to install the package on the remote machine it should have it listed there)
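If you copy the "Installed Packages" text out of the UI, a quick check could look like the sketch below. It assumes simple requirements.txt-style lines (`name==version`); `listed_packages` is a hypothetical helper and the sample text is made up. Keep in mind the listed name is the pip package name, which can differ from the module name (e.g. `import yaml` comes from `PyYAML`).

```python
import re


def listed_packages(requirements_text):
    """Collect normalized package names from simple requirements-style lines."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(match.group(0).lower().replace("_", "-"))
    return names


# Made-up example of what the section might contain
installed = """
# Python 3.8
clearml == 1.10.0
numpy>=1.21
"""

packages = listed_packages(installed)
print("numpy" in packages)
print("x" in packages)
```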
Hi UnevenBee3
the optuna study is stored on the optuna class
https://github.com/allegroai/clearml/blob/fcad50b6266f445424a1f1fb361f5a4bc5c7f6a3/clearml/automation/optuna/optuna.py#L186
And actually you could store and restore it
https://github.com/allegroai/clearml/blob/fcad50b6266f445424a1f1fb361f5a4bc5c7f6a3/clearml/automation/optuna/optuna.py#L104
I think we should improve the interface though, maybe also add get_study(), wdyt?
cannot schedule new futures after interpreter shutdown
This implies the process is shutting down.
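For context, this error comes from Python's `concurrent.futures`. The sketch below does not reproduce the interpreter-shutdown case itself; it shows the closely related situation where work is submitted to a pool that has already been shut down, which fails in the same way.

```python
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=1)

# Submitting works while the pool is alive
first_result = executor.submit(sum, [1, 2, 3]).result()
print(first_result)

executor.shutdown(wait=True)

# Submitting after shutdown raises RuntimeError
try:
    executor.submit(sum, [4, 5, 6])
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
print(error_message)
```

So if an upload is still being scheduled in the background when the process exits, it is worth waiting for pending uploads before the script ends.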
Where are you uploading the model? What is the clearml version you are using ? can you check with the latest version (1.10) ?
Hmm, I see the jump from 50 to 100, is that consistent with the last iteration on the aborted Task (before continuing )?
Hi FreshKangaroo33
clearml.conf is HOCON format, to parse you can use pyhocon:
https://github.com/chimpler/pyhocon
Or the built-in version in clearml:
from clearml.utilities.pyhocon import ConfigFactory
config_dict = ConfigFactory.parse_string(text).as_plain_ordered_dict()
You can also just get the parsed object:
from clearml.config import config_obj
Hi @<1541954607595393024:profile|BattyCrocodile47>
Has anyone used ClearML for this use case?
you mean as experiment management / model registry / data? I think this is the bread & butter of clearml
regarding the other options on the list, I think most of them are alternatives to metaflow, not covering the parts you mentioned, no?
if they're mission critical, but rather the clearml cache folder?
hmmm... they are important, but only when starting the process. any specific suggestion ?
(and they are deleted after the Task is done, so they are temp)
@<1523704157695905792:profile|VivaciousBadger56> regarding: None
Is this a discussion or PR ?
(general ranting is saved for our slack channel)