Yeah, the ultimate goal I'm trying to achieve is to run tasks flexibly. For example, before running, a task could have a claim saying how many resources it can use, and the agent would run it as soon as it finds there are enough resources.
Check out Task.execute_remotely()
You can put it anywhere in your code. When execution gets to it, if you are running without an agent, it will stop the process and re-enqueue it to be executed remotely. On the remote machine the call itself becomes a no-op,
I...
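A rough sketch of that pattern, under a few assumptions: the helper name maybe_run_remotely, the queue name "default", and the project/task names are placeholders that do not come from the thread. The SDK calls it relies on are Task.init and task.execute_remotely(queue_name=..., exit_process=True).

```python
def maybe_run_remotely(task, queue_name="default"):
    """Hand the run over to an agent listening on `queue_name`.

    Run locally (no agent): the call enqueues the task and, because of
    exit_process=True, terminates the local process.
    Run by an agent: the same call does nothing and execution continues.
    """
    task.execute_remotely(queue_name=queue_name, exit_process=True)


def main():
    # Imported here so the helper above can be tried without clearml installed.
    from clearml import Task

    # Placeholder project and task names.
    task = Task.init(project_name="examples", task_name="remote execution sketch")

    # ... cheap local setup (argument parsing, sanity checks) goes here ...

    maybe_run_remotely(task, queue_name="default")

    # With exit_process=True, code below this point only runs on the agent's machine.
    print("training starts here")
```

Calling main() from a script launched by hand would register the task, enqueue it, and exit; the agent that picks it up then runs the same script straight through.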
VexedCat68 I think this is the issue described here:
https://github.com/allegroai/clearml/issues/491
Can you test with the latest RC: pip install clearml==1.1.5rc1
VexedCat68 actually, a few users have already suggested that we auto-log the dataset ID used, as an additional configuration section. wdyt?
Thank you, I would love to make sure we fix it
hey, that worked! what library is being used that reads that configuration?
It's passed to boto3, but I guess the Python interface and the AWS CLI use different configurations, because otherwise it should have worked...
can we also put the path to the CA?
Yes :)
Sure thing :)
BTW could you maybe PR this argument (marked out) so that we know for next time?
Thanks!
In the conf file, I guess this will be where ppl will look for it.
Specifically for this one, this is the auto-generated docstring from the actual code, so PR to the
https://github.com/allegroai/clearml/blob/e53a76b713910adaf87578c69e86f8154d4ab4c1/clearml/logger.py#L152
Thank you GreasyPenguin14 , I think you are correct, in offline mode it should not check the "demo server" configuration (as it will not try to connect to a server anyhow).
Could you open a GitHub issue, so that this gets addressed quickly?
I'm kind of at a point where I don't know a lot of what to even search for.
we feel you 💗 , yes there still isn't a very good source of information on where to get started...
This is because the entire field is constantly changing and evolving, and one solution will usually only apply to a specific use case...
I would start with the MLOps community Slack channel and YouTube talks (specifically those by companies describing how they built their own internal infrastructure, i...
Hi EnthusiasticCow4
is there a way to get the date from the InputModel?
You should be able to with model._get_model_data()
But I think we should have it all exposed, wdyt?
ShallowCat10 Thank you for the kind words 🙂
so I'll be able to compare the two experiments over time. Is this possible?
You mean like match the loss based on "images seen" ?
I mean, you can manually get the results and rescale them, but not through the UI
So "General" would have created a General section instead of Args?
yes,
This is a must, you have to specify the hyperparameters section you are referencing.
https://github.com/allegroai/clearml/blob/5a9155b2039413280f13dfded1121470c4c4323d/examples/pipeline/step2_data_processing.py#L21
This is actually: task.connect(args, name='General')
Basically there is no "random_state", only "General/random_state"
Makes sense?
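To illustrate the naming only, here is a small sketch. section_keys is a hypothetical helper, not part of ClearML, and the test_size parameter is made up. It just shows how a parameter connected under a named section ends up being addressed as "<section>/<name>".

```python
def section_keys(params, section="General"):
    """Return the parameters keyed the way they are referenced: '<section>/<name>'."""
    return {f"{section}/{name}": value for name, value in params.items()}


# Illustrative values; with ClearML this dictionary would be connected as
# task.connect(args, name='General')
args = {"random_state": 42, "test_size": 0.2}

overrides = section_keys(args, section="General")
print(sorted(overrides))  # ['General/random_state', 'General/test_size']
```

So anything that refers to the parameter from outside the task (an override, a pipeline step, an optimizer) needs the full "General/random_state" key.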
I would clone the first experiment, then in the cloned experiment, I would change the initial weights (assuming there is a parameter storing that) to point to the latest checkpoint, i.e. provide the full path/link. Then enqueue it for execution. The downside is that the iteration counter will start from 0 and not the previous run.
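Something along these lines, as a sketch rather than a tested recipe. The parameter name "Args/initial_weights", the queue name, and the clone name are assumptions (use whatever parameter your code actually reads the weights from). Task.clone, set_parameter and Task.enqueue are the SDK calls; task_cls is expected to be clearml.Task and is passed in so the logic can be tried without a server.

```python
def clone_and_resume(task_cls, source_task_id, checkpoint_url,
                     weights_param="Args/initial_weights", queue_name="default"):
    """Clone an experiment, point its initial weights at a checkpoint, enqueue it."""
    # The clone starts as a draft, so its parameters can still be edited.
    cloned = task_cls.clone(source_task=source_task_id, name="resume from checkpoint")

    # Hypothetical parameter name; it must match the section/name used by the original run.
    cloned.set_parameter(weights_param, checkpoint_url)

    # Send the edited clone to an agent queue.
    task_cls.enqueue(cloned, queue_name=queue_name)
    return cloned
```

Usage would look like clone_and_resume(Task, "<task id>", "<full path or link to the checkpoint>") after "from clearml import Task". As noted above, the iteration counter of the new run still starts from 0.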
If you have the checkpoint (see output_uri for automatically uploading it) then you can always load it. Do you mean whether you can change the iteration/step counter? Or do you mean with trains-agent?
While I'll look into it, you can do:
from clearml import OutputModel
output_model = OutputModel()
output_model.update_weights("best_model.onnx")
Does it work if I launch the clearml-agent in a Docker container and pip doesn't know which packages to install?
Not sure I follow... the "detect_with_pip_freeze" flag (when set) will tell clearml (at runtime) to create the "installed packages" directly from pip freeze (instead of analyzing the code)
Should not be complicated, it's basically here
https://github.com/allegroai/clearml/blob/1eee271f01a141e41542296ef4649eeead2e7284/clearml/task.py#L2763
wdyt?
What happened in the server configuration that all of a sudden you have zero ports open?
Hmm, I would recommend passing it as an artifact, or returning its value from the decorated pipeline function. Wdyt?
PompousBeetle71 that actually brings me to another question, how do you feel about a "parent" experiment?
Hi VexedElephant56
Yes it is:
Define CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1
(if running in docker mode add -e CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=1 as container args)
https://clear.ml/docs/latest/docs/clearml_agent/clearml_agent_env_var
Yea the "-e ." seems to fit this problem the best.
👍
It seems like whatever I add to
docker_bash_setup_script
is having no effect.
If this is running with the k8s glue, the console output of the docker_bash_setup_script is currently not logged into the Task (this bug will be solved in the next version), but the code is being executed. You can see the full logs with kubectl, or test with a simple export test
docker_bash_setup_script
export MY...