UpsetTurkey67 are you saying there is a symlink in the original repository, and when it copies it, the symlink breaks?
Hi ElegantCoyote26
what's the clearml version you are using?
Thanks MinuteGiraffe30, a fix will be pushed later today
It seems there is some async behavior going on. After ending a run, this prompt just hangs for a long time:
2021-04-18 22:55:06,467 - clearml.Task - INFO - Waiting to finish uploads
And there's no sign of updates on the dashboard
Hmm, that could point to an issue uploading the last images (which are larger than regular scalars). Could you try flushing and waiting?
i.e. task.flush() followed by sleep(45)
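A minimal sketch of that flush-then-wait pattern (project/task names are illustrative, and the 45-second sleep is just a generous guess):

from time import sleep
from clearml import Task

task = Task.init(project_name="examples", task_name="flush-before-exit")  # illustrative names
# ... training / reporting happens here ...
task.flush()  # push any pending reports and uploads to the server
sleep(45)     # give the background uploader time to finish the larger image files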
Can you let me know if I can override the docker image using template.yaml?
No, you cannot.
But you can pass the OS environment variable "CLEARML_DOCKER_IMAGE" to set a different default one
Do people generally update the same model "entry"? That feels so wrong to me… how do you reproduce an older model version or do a rollback etc.?
Correct, they do not 🙂 On the Task itself the output models will reflect the different filenames you saved; usually people just add a running number.
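A minimal sketch of that running-number convention, assuming PyTorch with clearml's automatic framework binding enabled (all names are illustrative):

import torch
import torch.nn as nn
from clearml import Task

task = Task.init(project_name="examples", task_name="versioned-checkpoints")  # illustrative names
model = nn.Linear(4, 2)

for epoch in range(3):
    # ... training step would go here ...
    # the running number in the filename means each save shows up as its own output model
    torch.save(model.state_dict(), f"model_epoch_{epoch:03d}.pt")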
So the "packages" are the packages you need in the steps themselves ?
This doesn't seem to be running inside a container...
What's the clearml-agent launch command you are using? (i.e. do you have the --docker flag?)
Hi SourSwallow36
- The same docker image is used for all three jobs, just because it is easier to manage and faster to download. The full code is available on the trains-server GitHub. If you want to spin the containers manually, check the docker-compose.yml on the main repo, it has all the commands there
- Fork the trains-server, commit the changes and don't forget to PR them ;)
- Elasticsearch is a database; we use it to log all the experiment outputs, console logs, metrics, etc. This...
Hi GiddyTurkey39
Is the config file connected to the Task via Task.connect_configuration?
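For reference, a minimal sketch of wiring a config file through Task.connect_configuration (the file path and names are illustrative):

from pathlib import Path
from clearml import Task

task = Task.init(project_name="examples", task_name="config-connect")  # illustrative names
# registers the file content on the Task; when an agent re-runs the Task,
# the (possibly edited) configuration is written back to a local file
local_config = task.connect_configuration(Path("config/train.yaml"), name="train config")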
Hi PleasantGiraffe85
Did you set git_host to only point to your host? Do you expect all the git clones to use SSH? What does the requirements.txt git link look like?
https://github.com/allegroai/clearml-agent/blob/bf07b7f76d3236c1118b81730c6d9718705a795a/docs/clearml.conf#L22
For .git-credentials remove the git_pass/git_user from the clearml.conf
If you want to use SSH you also need to add: force_git_ssh_protocol: true
https://github.com/allegroai/clearml-agent/blob/a2db1f5ab5cbf178840da736afdc370cfff43f0f/docs/clearml.conf#L25
Will the new fix avoid this issue, and does it still require the incremental flag?
It will avoid the issue, meaning even when incremental is not specified, it will work
That said, the issue is that any other logger will be cleared as well, so it's just good practice ...
From the logging documentation ...
Hmmm, so I guess Kedro should not use dictConfig?! I'm not sure about the exact use case, but just clearing all loggers seems like a harsh approach
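For context, a minimal sketch of the dictConfig behaviour being discussed (the logger name is illustrative, not Kedro's actual config):

import logging.config

# With the default incremental=False, dictConfig rebuilds the logging setup and,
# unless disable_existing_loggers is set to False, disables every previously
# created logger - including clearml's. With incremental=True only levels and
# propagation are updated and existing handlers are left alone.
logging.config.dictConfig({
    "version": 1,
    "incremental": True,
    "loggers": {"my_app": {"level": "INFO"}},  # illustrative logger name
})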
I mean, can you install it with something like pip install git+... ? Basically the agent will install the main repository, and any git submodules. But it cannot install multiple repositories, as the directory structure might get too complicated.
wdyt?
Can you clone the git repository with the .ssh credentials on the host machine?
If so, can you do the same manually inside a docker (i.e. spin up a docker with -v /home/hostuser/.ssh:/root/.ssh mounted)?
Weird?! I see this in the code:
https://github.com/allegroai/clearml/blob/382d361bfff04cb663d6d695edd7d834abb92787/clearml/automation/controller.py#L2871
Thanks SolidSealion72 !
Also, I found out that adding "pool.join()" after pool.close() seems to solve the issue in the minimal example.
This is interesting, I'm pretty sure it has something to do with the subprocess not "closing" properly (or too fast or something)
Let me see if I can reproduce
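A minimal standalone sketch of the close/join pattern SolidSealion72 mentioned (not the actual repro script):

from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == "__main__":
    pool = Pool(processes=4)
    results = pool.map(square, range(10))
    pool.close()  # no more work will be submitted to the pool
    pool.join()   # wait for the worker subprocesses to exit before the script ends
    print(results)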
Hi @<1523715429694967808:profile|ThickCrow29> , thank you for pinging!
We fixed the issue (hopefully). Can you verify with the latest RC, 1.14.0rc0?
Hi @<1523701066867150848:profile|JitteryCoyote63>
Setting redis from version 6.2 to 6.2.11 fixed it, but I have new issues now
Was the docker tag incorrect in the docker compose ?
Thanks PompousBaldeagle18 !
Which software did you use to create the graphics?
Our designer, should I send your compliments 🙂 ?
You should add which tech is being replaced by each product.
Good point! We are also missing a few products from the website, they will be there soon, hence the "soft launch"
Disable automatic model uploads
Disable the auto upload:
task = Task.init(..., auto_connect_frameworks={'pytorch': False})
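A slightly fuller sketch of the same call (project/task names are illustrative; other frameworks keep their automatic logging):

from clearml import Task

task = Task.init(
    project_name="examples",           # illustrative
    task_name="no-auto-model-upload",  # illustrative
    auto_connect_frameworks={'pytorch': False},  # disable automatic PyTorch model capture/upload
)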
Hi ScantChimpanzee51
How are you launching the code ?
Basically the easiest way is to do so with the example you just mentioned,
Can this issue be reproduced ?
BTW: dockerhub is free and relatively cheap to upgrade 🙂
(GitHub also offers docker registry)
but actually that path doesn't exist and it is giving me an error
So you are saying you only uploaded the "meta-data" i.e. a text file with links to the files, and this is why it is missing?
Is there a way to change the path inside the .txt file to clearml cache, because my images are stored in clearml cache only
I think a good solution would be to store the paths in the txt file as relative paths, i.e. instead of /Users/adityachaudhry/data/folder... use ./data/folder
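A minimal sketch of rewriting such a list file to relative paths (the base folder and file names are illustrative assumptions):

from pathlib import Path

base = Path("/Users/adityachaudhry")   # illustrative base folder to strip
src = Path("images_absolute.txt")      # illustrative input list
dst = Path("images_relative.txt")      # illustrative output list

with src.open() as fin, dst.open("w") as fout:
    for line in fin:
        p = Path(line.strip())
        # e.g. /Users/adityachaudhry/data/folder/img.png -> ./data/folder/img.png
        fout.write(f"./{p.relative_to(base)}\n")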
It's dead simple to install:
pip install trains-agent
then you can simply do:
trains-agent execute --id myexperimentid
This is odd, I was running the example code from:
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
It is stored inside a repo, but the steps that are created (i.e. checking the Task that is created) do not have any repo linked to them.
What's the difference ?
HugePelican43 sure you can; usually the limiting factor is memory, as it cannot be shared among processes, so if one process allocates all the memory the second process will crash with an out-of-memory error
it is shown in the recording above
It was so odd, I had to ask 🙂 Okay, let me see if we can reproduce
I don't have any error message in the browser console - just an empty array returned on events.get_task_logs. This bug didn't exist on version 1.1.0 and is quite annoying…
Meaning the RestAPI returns nothing, is that correct?
Is the clearml-agent queue not available in the open source?
Fully available in the open source; what is missing is the SLURM connection. In the open source version the daemon is installed per machine (node) and spins up containers/venvs on the machine. The enterprise version adds support so it uses SLURM to provision the node. I hope it helps 🙂
so do you think it would be possible to spin up another daemon, which listens to this daemon, which then runs a slurm job?
This is exactly what the ...