Makes total sense!
Interesting, you are defining the sub-component inside the function. I like that, it makes the code closer to how it is actually executed!
Hi @<1523701168822292480:profile|ExuberantBat52>
I am trying to execute a pipeline remotely,
How are you creating your pipeline? And are you referring to an issue with the pipeline logic, or is it a component that needs that repo installed?
I would just add git+None to your requirements (either in the requirements.txt, or even better as part of the pipeline/component where you also specify the repo to be used)
The agent will automatically push the credentials when it installs the repo as a wheel.
wdyt?
btw: you might also get away with adding -e .
into the requirements.txt (but you will need to test that one)
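For illustration, a minimal sketch of what that could look like on a pipeline component (the repo URLs, package and step names below are hypothetical):

from clearml import PipelineDecorator

@PipelineDecorator.component(
    # install the private package straight from git (hypothetical URL)
    packages=["git+https://github.com/my-org/my-package.git"],
    # the repo this component's code runs from (hypothetical URL)
    repo="https://github.com/my-org/pipeline-repo.git",
)
def preprocess(data):
    import my_package  # resolved from the git requirement above
    return my_package.process(data)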
The second problem that I am running into now, is that one of the dependencies in the package is actually hosted in a private repo.
Add your private repo to the extra index section in the clearml.conf:
None
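As a rough sketch, the relevant section of clearml.conf could look like this (the index URL below is just a placeholder for your private index):

agent {
    package_manager {
        # extra pip index servers, searched in addition to PyPI
        extra_index_url: ["https://pypi.my-company.com/simple"]
    }
}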
Hi ExuberantBat52
I do not think you can... I would use AWS Secrets Manager to push the entire user list config file, wdyt?
Hi @<1645597514990096384:profile|GrievingFish90>
You mean the agent itself inside a docker then the agent spins sibling dockers for the Tasks ?
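For reference, the usual way to get sibling containers is mounting the host's docker socket into the agent container, roughly like this (image name, queue and credential placeholders are illustrative):

docker run \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -e CLEARML_API_ACCESS_KEY=<key> -e CLEARML_API_SECRET_KEY=<secret> \
    allegroai/clearml-agent \
    clearml-agent daemon --docker --queue default

With the socket mounted, the Task containers the agent spins up are created on the host docker daemon, next to (not inside) the agent's own container.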
Great to hear SourSwallow36, contributions are always appreciated 🙂
Regarding (3), MongoDB was not built for large-scale logging; Elasticsearch, on the other hand, was built and designed to log millions of reports and give you the possibility to search over them. For this reason we use each DB for what it was designed for: MongoDB to store the experiment documents (a.k.a. env, meta-data etc.) and Elasticsearch to log the execution outputs.
Also, I would like to add some other plots t...
if they're mission critical, but rather the clearml cache folder?
hmmm... they are important, but only when starting the process. Any specific suggestion?
(and they are deleted after the Task is done, so they are temp)
Expected behaviour is that it reads the last iteration correctly. At least that is what is stated in the docs.
This is exactly what should happen, are you saying that for some reason it fails?
GiganticTurtle0
I'm assuming here that self.dask_client.map(read_and_process_file, filepaths)
actually does the multi process/node processing. The way it needs to work, it has to store the current state of the process and then restore it on any remote node/process. In practice this means pickling the local variables (Task included).
First I would try to use a standalone static function for the map; Dask might be able to deduce it does not need to pickle anything, as it is standalone.
A...
Hi GiganticTurtle0
Is there a simple way to make Task.init compatible with the Dask.distributed client?
Please tell me more 🙂
I think Dask is trying to pickle your Task object (which is not picklable).
You can however create the Task once with Task.init
and pass the Task ID to the child processes and then use Task.init(..., continue_last_task=task_id_here)
wdyt?
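A rough sketch of that pattern (project/task names are placeholders); the key point is that only the Task ID string gets shipped to the workers, never the Task object itself:

from clearml import Task

# create the Task once in the driver process
task = Task.init(project_name="examples", task_name="dask-demo")
task_id = task.id  # plain string, safe to pickle

def read_and_process_file(path, task_id=task_id):
    # re-attach to the same Task inside the Dask worker process
    worker_task = Task.init(
        project_name="examples",
        task_name="dask-demo",
        continue_last_task=task_id,
    )
    # ... actual processing goes here ...
    return path

# results = dask_client.map(read_and_process_file, filepaths)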
How can I make it show progress less often/rewrite?
I'm not sure this is configurable ... you mean like reports on the uploads right? (i.e. report every 5mb I think is the default)
while we are at it, maybe we should use tqdm if it is installed
wdyt?
JitteryCoyote63 the new wizard was pushed, you can check it out here:
https://github.com/allegroai/trains/blob/master/examples/services/aws-autoscaler/aws_autoscaler.py
BTW: next release to include it all is next week (hopefully :))
Thanks @<1719524641879363584:profile|ThankfulClams64> having a code that can reproduce it is exactly what we need.
One thing I might have missed, and it is very important: what is your tensorboard package version?
Hi @<1739818374189289472:profile|SourSpider22>
could you send the entire console log? maybe there is a hint somewhere there?
(basically what happens after that is the agent is supposed to be running from inside the container, but maybe it cannot access the clearml-server for some reason)
PompousParrot44
Check out task.execute_remotely()
You can call it right after Task.init, and it will enqueue your running Task and leave the process (if you want).
https://github.com/allegroai/trains/blob/65a4aa7aa90fc867993cf0d5e36c214e6c044270/trains/task.py#L1437
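In practice that looks roughly like this (project/task/queue names are placeholders):

from clearml import Task

task = Task.init(project_name="examples", task_name="remote-run")
# enqueue this Task on the given queue and terminate the local process
task.execute_remotely(queue_name="default", exit_process=True)
# everything below this line only runs on the remote worker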
Hi OutrageousGrasshopper93
When the Task is executed on a worker, the presence of spaces breaks the URLs, and from the UI I cannot access the resources on the bucket
You are saying the URLs generated in a remote execution are "broken" while on local execution they work, even though it is the same project/task name?
Sorry if it's something trivial. I recently started working with ClearML.
No worries, this has actually more to do with how you work with Dask
The Task ID is the unique id of any Task in the system (task.id will return the UID str)
Can you post a toy Dask code snippet here? I'll explain how to make it compatible with clearml 🙂
BTW: if you need to set env variables you can also add -e PYTHONPATH=/new/path
to the docker args
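In recent clearml versions that can also be set from code, something along these lines (image name and path are illustrative):

from clearml import Task

task = Task.init(project_name="examples", task_name="docker-env")
# the extra argument is passed verbatim to docker run
task.set_base_docker(
    docker_image="python:3.9",
    docker_arguments="-e PYTHONPATH=/new/path",
)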
First I would check the CLI command; it will basically prefill it for you:
https://clear.ml/docs/latest/docs/apps/clearml_task
Specifically to your question, working directory "." is the root of the git repo
But I would avoid adding it manually; use the CLI, it will either ask you to provide the info or take the git repo details from the local copy.
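An illustrative invocation (repo, script and queue below are placeholders):

clearml-task --project examples --name my-task \
    --repo https://github.com/my-org/my-repo.git --branch main \
    --script src/train.py --queue default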
We already redesigned the implementation, so it should be quite easy to extend to GCP and Azure. What are you planning?
Can't say I have noticed that. Is this a delay on the send, which for some reason is correlated with the epochs? What was the case with 0.17.5?
Hi OutrageousSheep60
Is there a way to instantiate a clearml-task while providing it a Dockerfile that it needs to build prior to executing the task?
Currently not really, as at the end the agent does need to pull a container.
But you can achieve basically the same by adding the "dockerfile" script as --docker_bash_setup_script
Notice of course that this is an actual bash script, not a Dockerfile, so no need for the "RUN" prefix.
wdyt?
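As a sketch, the Dockerfile's RUN lines would become a plain bash script (the packages below are just examples):

# setup.sh -- the Dockerfile RUN lines, minus the RUN prefix
apt-get update && apt-get install -y libgl1
pip install -r extra_requirements.txt

Then pass it along with the base image, e.g. clearml-task ... --docker python:3.9 --docker_bash_setup_script setup.sh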
correct on both.
notice that with upload you can specify any storage (S3/GS/Azure etc.)
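The surrounding context is truncated, but assuming upload here refers to dataset upload, a sketch could look like this (project, dataset and bucket names are hypothetical):

from clearml import Dataset

ds = Dataset.create(dataset_project="examples", dataset_name="my-data")
ds.add_files("local_data/")
# output_url can point at any supported backend (S3/GS/Azure/...)
ds.upload(output_url="s3://my-bucket/datasets")
ds.finalize()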
Hi AttractiveWoodpecker16
I think that is the correct channel for that question.
(any chance you can move your thread there?)
Specifically, just email billing@clear.ml and they will cancel (no need to worry about the beginning of the month, just explain and they will not charge over Nov).
EDIT: I know they are working on making it a one-click action in the UI; the main limit is what happens with data that was stored above the free tier threshold. Anyhow, I think the next version will sort that as well.
Hi QuaintJellyfish58
You can always set it inside the function, with Task.current_task().output_uri = "s3://"
I have to ask, I would assume the agents are pre-configured with "default_output_uri" in the clearml.conf, why would you need to set it manually?
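For completeness, a minimal sketch of both options (the bucket name is hypothetical):

from clearml import Task

task = Task.init(project_name="examples", task_name="demo")
# runtime override of the upload destination
Task.current_task().output_uri = "s3://my-bucket/models"
# the pre-configured alternative lives in clearml.conf:
#   sdk.development.default_output_uri: "s3://my-bucket/models"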
RoundMosquito25 good news, no need to open any ports 🙂
Basically the agents are always polling the server for "jobs", creating an http/s request from agent to server, so all connections are outgoing connections. Firewall stays intact 🙂
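Which is also why a plain agent start works from behind NAT/firewalls with no inbound ports, e.g.:

# the agent only opens outbound HTTP(S) connections to the server
clearml-agent daemon --queue default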