
my experiment logic
you mean the actual code doing the training ?
so that it gets lazily executed and not at task definition time
Task definition time -> when creating the Pipeline Task? Remember, the base_task_factory at the end creates a Task object (it does not run the code itself).
BTW: if you have simple training logic you can use pipeline decorators, it might be a better fit?
https://clear.ml/docs/latest/docs/fundamentals/pipelines#pipeline-from-function-decorator
Hi SubstantialElk6
quick update, once clearml 1.1 is out, we will push the clearml-data improvement, supporting chunks per version (i.e. packaging the changeset into multiple zip files, instead of a single one as the current version does).
regarding (1) storage limit server.
Ideally, we should be able to specify the batch size that we want to download, or even better, tie this in with the training by parallelising the data download, data preprocessing and batch trains.
With the nex...
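To make the "chunks per version" idea concrete, here is a minimal sketch of splitting a changeset into several zip files instead of one. This is not the clearml-data implementation; `plan_chunks`, `write_chunks`, the `changeset_NNN.zip` naming and the greedy size limit are all assumptions made for illustration.

```python
import os
import zipfile


def plan_chunks(file_sizes, max_chunk_bytes):
    """Greedily group (path, size) pairs into chunks of at most max_chunk_bytes.

    A single file larger than the limit ends up in a chunk of its own.
    """
    chunks, current, current_size = [], [], 0
    for path, size in file_sizes:
        # Close the current chunk when adding this file would exceed the limit.
        if current and current_size + size > max_chunk_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        chunks.append(current)
    return chunks


def write_chunks(root, chunks, out_dir):
    """Write every chunk to its own zip archive and return the archive paths."""
    archives = []
    for index, paths in enumerate(chunks):
        archive = os.path.join(out_dir, f"changeset_{index:03d}.zip")
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            for rel_path in paths:
                zf.write(os.path.join(root, rel_path), arcname=rel_path)
        archives.append(archive)
    return archives
```

With separate archives, a consumer could in principle fetch and unpack one chunk while training on another, which is the kind of overlap the question above asks for.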
Does what you suggested here >
Yes, it is basically the same underlying mechanism, only instead of 1-to-1 it's 1-to-many
To clarify, there might be cases where we get a helm chart / k8s manifests to deploy an inference service. A black box to us.
I see, in that event, yes you could use clearml queues to do that, as long as you have the credentials the "Task" is basically just a deployment helm task.
You could also have a monitoring code there so that the same Task is pure logic, spinning the helm chart, monitoring the usage, and when it's done taking it down
Okay great, so we do have the Args section there.
What do you have in the "Execution" tab?
Does it work if I launch the clearml-agent on a docker and pip doesn't know the packages to install
Not sure I follow... the "detect_with_pip_freeze" flag (when set) will tell clearml (at runtime) to create the "installed packages" directly from pip freeze (instead of analyzing the code)
I want the task of human tagging a model to be "just another step in the pipeline"
That makes total sense.
Quick question, would you prefer the pipeline controller to "wait" for the tagging and then continue, or would it make more sense to create a trigger on the tagging ?
Hi @<1720249421582569472:profile|NonchalantSeaanemone34>
pipeline decorator where lambda function call another function (say xyz) and during pipeline execution, error is thrown that xyz is not defined?
Each pipeline function becomes a standalone "script", which I assume if the lambda function is defined outside of the decorated pipeline component function, would throw an undefined error.
My suggestion would be to define the lambda function as a nes...
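A small sketch of why the name goes missing, using plain Python rather than ClearML itself: only the function's own source is executed in an empty namespace, which roughly mimics a component being turned into a standalone script. The `run_standalone` helper and the `step` / `xyz` names are hypothetical, and this is an illustration of the scoping issue, not of how ClearML extracts components.

```python
# The lambda calls xyz, but xyz is defined somewhere outside the function body.
COMPONENT_USING_OUTER_NAME = '''
def step(value):
    transform = lambda v: xyz(v)
    return transform(value)
'''

# Same logic, but xyz is defined inside the function, so it travels with it.
COMPONENT_WITH_INNER_NAME = '''
def step(value):
    def xyz(v):
        return v * 2
    transform = lambda v: xyz(v)
    return transform(value)
'''


def run_standalone(source, argument):
    """Execute only the given function source in an empty namespace, then call step()."""
    namespace = {}
    exec(source, namespace)
    return namespace["step"](argument)


try:
    run_standalone(COMPONENT_USING_OUTER_NAME, 3)
    outcome = "ok"
except NameError as error:
    outcome = f"NameError: {error}"

inner_result = run_standalone(COMPONENT_WITH_INNER_NAME, 3)
```

The first variant fails with a NameError for `xyz`, while the second runs, because everything the lambda needs is part of the function body.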
somehow set docker_args and docker_bash_setup_script equivalent??
task.set_base_docker(...)
# somehow setup repo and branch to download to remote instance before running
This is automatically detected based on your local commit/branch as well as uncommitted changes
I find it quite difficult to explain these ideas succinctly, did I make any sense to you?
Yep, I think we are totally on the same wavelength
However, it also seems to be not too prescriptive,
One last question, what do you mean by that?
This line
None
Notice that Triton (and therefore clearml-serving) needs the PyTorch model to be converted into TorchScript, so that the Triton backend can load it
StorageHelper is used internally.
I'll make sure we remove it from the examples/docs
Hi StraightDog31
I am having trouble using the StorageManager to upload files to GCP bucket
Are you using the StorageManager directly? Or are you using task.upload_artifact?
Did you provide the GS credentials in the clearml.conf file, see example here:
https://github.com/allegroai/clearml/blob/c9121debc2998ec6245fe858781eae11c62abd84/docs/clearml.conf#L110
I might have found it, tqdm is sending { 1b 5b 41 } unicode arrow up?
https://github.com/horovod/horovod/issues/2367
SourOx12
Hmmm. So if last iteration was 75, the next iteration (after we continue) will be 150 ?
Hi UnevenDolphin73
You mean this part?
https://github.com/allegroai/clearml-agent/blob/5afb604e3d53d3f09dd6de81fe0a494dacb2e94d/docs/clearml.conf#L212
(In other words, the Task's Environment section is a bit unclear)
Yes we should expand, but generally you are correct it should work as you described
apologies @<1798887585121046528:profile|WobblyFrog79> somehow I missed your reply,
My workflow is based around executing code that lives in the same repository, so it's cumbersome having to specify repository information all over the place, and changing commit hash as I add new code.
It automatically infers the repo as long as the pipeline code itself is inside the repo, by that I mean the pipeline logic. When you run it the first time (think development etc), if it s...
Hi @<1715900788393381888:profile|BitingSpider17>
Notice that you need __ (double underscore) for converting "." in the clearml.conf file, this means agent.docker_internal_mounts.sdk_cache will be CLEARML_AGENT__AGENT__DOCKER_INTERNAL_MOUNTS__SDK_CACHE
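The naming rule above can be written as a tiny helper. `agent_env_var` is a hypothetical function, not part of clearml-agent; it only applies the conversion described in the message (prefix, upper-case, "." replaced by a double underscore).

```python
def agent_env_var(config_key, prefix="CLEARML_AGENT"):
    """Build the environment variable name for a dotted clearml.conf key."""
    # Each "." in the key becomes "__", and the prefix is joined the same way.
    parts = [part.upper() for part in config_key.split(".")]
    return "__".join([prefix] + parts)


name = agent_env_var("agent.docker_internal_mounts.sdk_cache")
# name == "CLEARML_AGENT__AGENT__DOCKER_INTERNAL_MOUNTS__SDK_CACHE"
```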
None
Actually it is better to leave it as is, it will just automatically mount the .ssh folder into the container, i will make sure the docs point to this option first
yup! that's what I was wondering if you'd help me find a way to change the timings of. Is there an option I can override to make the retry more aggressive?
you mean wait for less?
None
add to your clearml.conf:
api.http.retries.backoff_factor = 0.1
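To get a feel for what a smaller factor does, here is a rough sketch of an exponential backoff schedule. It assumes the factor is used in a `factor * 2 ** (attempt - 1)` formula, as in common HTTP retry helpers; the exact formula, the handling of the first retry, and the `max_delay` cap are assumptions and may differ from what the SDK actually does.

```python
def backoff_delays(backoff_factor, retries, max_delay=120.0):
    """Seconds to wait before each retry, assuming factor * 2 ** (attempt - 1)."""
    return [
        min(max_delay, backoff_factor * 2 ** (attempt - 1))
        for attempt in range(1, retries + 1)
    ]


aggressive = backoff_delays(0.1, 5)  # [0.1, 0.2, 0.4, 0.8, 1.6]
relaxed = backoff_delays(1.0, 5)     # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Under that assumption, dropping the factor to 0.1 shrinks every wait by the same ratio, so five retries finish in about 3 seconds instead of about 31.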
Is it not possible to serve a model with preprocessing pipeline from scikit-learn using clearml-serving?
of course it is, did you first try the example , here: None
If you need to run your own LogisticRegression call you can use this example:
None
Notice this is where the custom endpoint actually calls the prediction: [None](https...
This would be a good example?
https://github.com/allegroai/clearml/blob/master/examples/services/monitoring/slack_alerts.py
The only important thing for me is to know whether there is any way to get more information in the apiserver log
what do you mean by that ?
Long story short, this is done internally when you call the Task.init (I think, there is a chance it is called before)
One way of controlling it would be to have something like: Task.init(auto_connect_frameworks={'hydra': {'log_before_resolve': True}})
That said, I think it will be simpler to store both (in different section of course)
Maybe "Configuration Object: OmegaConf" and "Configuration Object: OmegaConfDefinition" ?
task=Task.current_task()
Will get me the task object. (right?)
PanickyMoth78 yes, always, from anywhere, this is a singleton object
Yeah.. that should have worked ...
What's the exact error you are getting ?