Ohh, so you are saying you can store it properly, but only editing in the UI is limited? (Maybe this is just a UI thing)
Hi AdventurousButterfly15
I am running cross_validation, training a bunch of models in a loop like this:
Use the wildcard or disable it altogether:
task = Task.init(..., auto_connect_frameworks={"joblib": False})
You can also do
task = Task.init(..., auto_connect_frameworks={"joblib": ["realmodelonly.pkl", ]})
And you are calling Task.init? And the scalars show under Scalars, but the images are not under Debug Samples?
Is it possible in ClearML to somehow allocate resources so that, after running a number of Alice's tasks, Bob's tasks get processed (maybe in a round-robin fashion)?
Hi DeliciousBluewhale87
A few options here (see the sketch after this list):
1. Set the agent up with high / low priority queues. Make sure Alice pushes into the low-priority queue (aka HPO), then Bob can push into the high-priority queue when he needs to. This makes a lot of sense when you have automation processes spinning many experiments.
2. Expanding on (1), you could set differe...
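A minimal sketch of option (1), assuming two queues named high_priority and low_priority already exist on your server (queue names and task IDs below are illustrative placeholders):

```python
from clearml import Task

# An agent listening on multiple queues pulls them in priority order, e.g.:
#   clearml-agent daemon --queue high_priority low_priority --docker
# Everything in "high_priority" is consumed before "low_priority".

# Hypothetical task IDs, stand-ins for Alice's HPO tasks and Bob's task
alice_task_ids = ["aaaa1111", "aaaa2222"]
bob_task_id = "bbbb1111"

# Alice pushes her experiments into the low-priority queue
for task_id in alice_task_ids:
    Task.enqueue(Task.get_task(task_id=task_id), queue_name="low_priority")

# Bob's task jumps ahead of anything still waiting in the low-priority queue
Task.enqueue(Task.get_task(task_id=bob_task_id), queue_name="high_priority")
```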
Hi DeliciousBluewhale87
You can achieve the same results programmatically with Task.create
https://github.com/allegroai/clearml/blob/d531b508cbe4f460fac71b4a9a1701086e7b6329/clearml/task.py#L619
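For example, a minimal sketch (the project, repo, and script values are placeholders):

```python
from clearml import Task

# Create a task from an existing repository without executing the code locally.
# All argument values below are illustrative.
task = Task.create(
    project_name="examples",
    task_name="created-programmatically",
    repo="https://github.com/user/repo.git",  # hypothetical repository
    branch="main",
    script="train.py",
)

# The created task is a draft; enqueue it for an agent to pick up and run
Task.enqueue(task, queue_name="default")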
In the UI, find the task (just search for the Task ID, it will find it), then right-click it and select "Reset".
Yes, there is no real limit. I think the only requirement is Docker v19+.
Hey IntriguedRat44 ,
Is this what you are after?
https://github.com/allegroai/trains/issues/181
For reporting the console logs you can use:
logger.report_text("my log line here", print_console=False)
https://github.com/allegroai/clearml/blob/b4942321340563724bc16f60ea5dd78c9161778d/clearml/logger.py#L120
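A minimal usage sketch (the project and task names are placeholders):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="report-text-demo")
logger = task.get_logger()

# Goes to the task's console log on the server, but is not echoed
# to local stdout because print_console=False
logger.report_text("my log line here", print_console=False)
```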
Maybe we should add it to Storage Manager? What do you think?
But the git apply failed; the error message is "xxx already exists in working directory" (xxx is the name of the untracked file)
DefeatedOstrich93 what's the clearml-agent version?
Not sure I follow, you mean to launch it on the Kubernetes cluster from the ClearML UI?
(like the clearml-k8s-glue?)
Hi AntsySeagull45
Any chance the original code was running with python2?
Which version of trains-agent are you using?
VivaciousWalrus99
Yes this is odd:
1608392232071 spectralab:gpu0 DEBUG New python executable in /cs/usr/gal.hyams/.trains/venvs-builds/3.7/bin/python2
So it thinks it has python v3.7 but it is using python2 in the venv...
In your trains.conf file, set agent.python_binary to the python3.7 binary. It should be something like: agent.python_binary=/path/to/python/python3.7
Each user creates a .env file for their needs or exports them in the shell running the python code. Currently I copy the environment variables to an S3 bucket and download it from there.
That is a great hack, but who carries the credentials for the S3 bucket? The reason for asking is I'm thinking maybe the code could do that directly (meaning download the .env file and apply it?!)
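A rough sketch of that idea, assuming the .env file lives at an illustrative S3 path and the bucket credentials are already configured in clearml.conf:

```python
import os
from clearml import StorageManager

# Download the .env file from S3 (cached locally by the StorageManager).
# The URL is a hypothetical placeholder.
local_env = StorageManager.get_local_copy(remote_url="s3://my-bucket/config/.env")

# Apply each KEY=VALUE line to the current process environment
with open(local_env) as f:
    for line in f:
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip()
```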
Hi GrotesqueOctopus42
Despite having reuse_last_task_id=True on Task.init, it always creates a new task ID. Anyone ever had this issue?
So the way "reuse_last_task_id=True" works is that if there are no artifacts on the Task it will reuse it, but when running inside jupyter it always has artifacts (the notebook itself), so it starts a new Task.
You can however pass a specific Task ID and it will reuse it "reuse_last_task_id=aabb11", would that help?
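A minimal sketch of that workaround (the task ID and names are placeholders):

```python
from clearml import Task

# Passing an explicit task ID (instead of True) forces reuse of that task,
# even when running inside Jupyter. "aabb11" is a placeholder ID.
task = Task.init(
    project_name="examples",
    task_name="notebook-experiment",
    reuse_last_task_id="aabb11",
)
```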
Let me check, it was supposed to be automatically aborted
I was just able to reproduce with "localhost"
The new parameter abort_on_failed_steps could be a list containing the names of the steps...
I like that, we can also have it as an argument per step (i.e. the decorator can say, abort_pipeline_on_fail or continue_pipeline_processing)
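Purely illustrative, since this was still under discussion at the time; neither parameter name below is confirmed released API, so treat both as hypothetical:

```python
from clearml import PipelineController

pipe = PipelineController(name="demo-pipeline", project="examples", version="1.0")

# Option A (hypothetical): a pipeline-level list of step names whose failure
# aborts the whole run:
#   PipelineController(..., abort_on_failed_steps=["train", "evaluate"])

# Option B (hypothetical): a per-step argument on the step definition /
# decorator, e.g. abort_pipeline_on_fail=True or continue_pipeline_processing=True
```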
I'm hoping we are ready to release
Hi SmallDeer34
I need some help: what is the difference between the manual one and the automatic one?
From your previous log, this is the bash command executed inside the container. Can you go through it step by step and try to catch who/what is messing it up?
docker run -it --gpus "device=1" -e CLEARML_WORKER_ID=Gandalf:gpu1 -e CLEARML_DOCKER_IMAGE=nvidia/cuda:11.4.0-devel-ubuntu18.04 -v /home/dwhitena/.git-credentials:/root/.git-credentials -v /home/dwhitena/.gitconfig:/root/.gitconfig -v /tmp/...
LovelyHamster1 Now I see... Interesting credentials ability. Specifically, all the S3 access on trains is derived from the ~/clearml.conf credentials section:
https://github.com/allegroai/clearml/blob/ebc0733357ac9ead044d0ed32d41447763f5797e/docs/clearml.conf#L73
( or the AWS S3 environment variables )
I'm not sure how this AWS feature works; I suspect it is changing the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY variables on the EC2 instance. If this is the case, it should work out of...
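For reference, the relevant section of ~/clearml.conf looks roughly like this (all values are placeholders; see the link above for the full reference file):

```
sdk {
    aws {
        s3 {
            # Default credentials, used when no bucket-specific entry matches
            key: "AWS_ACCESS_KEY_PLACEHOLDER"
            secret: "AWS_SECRET_PLACEHOLDER"
            region: ""

            credentials: [
                {
                    # Per-bucket credentials override the defaults above
                    bucket: "my-bucket"
                    key: "BUCKET_KEY_PLACEHOLDER"
                    secret: "BUCKET_SECRET_PLACEHOLDER"
                },
            ]
        }
    }
}
```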
Notice that you can embed links to a specific view of an experiment by copying the full address bar when viewing it.
LOL yes 🙂
just make sure it won't be part of the uncommitted changes of the AWS autoscaler 🙂