CourageousLizard33 Are you using the docker-compose to set up the trains-server?
but here I can tell them: return a dictionary of what you want to save
If this is the case you have two options: either store the dict as an artifact (this makes sense if this is not a standalone model you would like to use later), or store it as a model.
Artifact example:
https://github.com/allegroai/clearml/blob/master/examples/reporting/artifacts.py
getting them back
https://github.com/allegroai/clearml/blob/master/examples/reporting/artifacts_retrieval.py
Model example:
https:/...
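For the artifact option, a minimal sketch could look like the one below. It assumes `task` is a clearml Task object (for example the one returned by `Task.init`); the helper names `save_dict` / `load_dict`, the artifact name, and the project/task names are made up for illustration, so treat the linked examples as the reference.

```python
def save_dict(task, payload, name="my_dict"):
    """Store a plain dictionary as an artifact on the given Task."""
    # upload_artifact serializes the object and attaches it to the Task
    task.upload_artifact(name=name, artifact_object=payload)


def load_dict(task, name="my_dict"):
    """Fetch the artifact back and return the deserialized object."""
    # task.artifacts maps artifact names to Artifact objects; get() loads the content
    return task.artifacts[name].get()


def example():
    # Imported here so the helpers above can be read without clearml installed
    from clearml import Task

    # Hypothetical project and task names
    task = Task.init(project_name="examples", task_name="dict artifact")
    save_dict(task, {"threshold": 0.5, "labels": ["cat", "dog"]})

    # Later, possibly from another process, look the Task up by its ID
    source = Task.get_task(task_id=task.id)
    print(load_dict(source))
```

The helpers only take the Task as an argument, so the same two calls work whether the Task was just created or looked up later by ID.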
Why can't it be updated after creation?
You can but then you have to rerun it again. I mean technically this is obviously solvable, but the idea was to make it simple to use, and since we "assume" in most cases there is a single Task per execution, it made sense. wdyt?
Hi ClumsyElephant70
Is there a way to run all pipeline steps, not in isolation but consecutive in the same environment?
You mean as part of a real-time inference process ?
Hi WickedGoat98 ,
I think you are correct
I would guess it is something with the ingress configuration (i.e. ConfigMap)
Hi ScantChimpanzee51
btw: this seems like an S3 internal error
https://github.com/boto/s3transfer/issues/197
UnevenDolphin73 i would use apiclient:
APIClient().projects.edit(project=project_id, system_tags=[])
*I might have a few typos above but that should be the gist
Okay, I think I lost you...
DilapidatedDucks58 you mean detect at which "iteration" the max value was reported, and then extract all the other metrics for that iteration ?
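If that is what you mean, here is a rough sketch of the lookup in plain Python. It assumes the scalars are already in a nested dict of the form title -> series -> {"x": iterations, "y": values}, which I believe is the shape `Task.get_reported_scalars()` returns (please verify on your version); the function name is hypothetical.

```python
def metrics_at_best_iteration(scalars, title, series):
    """Find the iteration where scalars[title][series] peaks, then collect
    every metric value that was reported at that same iteration."""
    target = scalars[title][series]
    # Index of the maximum value; ties resolve to the earliest iteration
    best_idx = max(range(len(target["y"])), key=lambda i: target["y"][i])
    best_iter = target["x"][best_idx]

    snapshot = {}
    for t, series_map in scalars.items():
        for s, points in series_map.items():
            # A metric may not have been reported at that iteration; skip it then
            if best_iter in points["x"]:
                snapshot[(t, s)] = points["y"][points["x"].index(best_iter)]
    return best_iter, snapshot
```

Metrics that were reported on a different schedule (say, only every N iterations) simply do not show up in the result, rather than being interpolated.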
Hi @<1529633468214939648:profile|CostlyElephant1>
Is it possible to get user ID of the current user
On the Task.data object itself there should be a field named "user"; that's the user ID of the owner (creator) of the Task.
You can filter based on this id with
Task.get_tasks(..., task_filter={'user': ["user-id-here"]})
wdyt?
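Put together, it could look something like this sketch. The helper names are hypothetical, `task_cls` is expected to be `clearml.Task` (passed in so the helpers do not import anything themselves), and the project name is just a placeholder.

```python
def owner_id(task):
    """Return the user ID of the Task's owner (creator)."""
    return task.data.user


def tasks_of_user(task_cls, user_id, **query):
    """List Tasks owned by user_id; extra keyword arguments go to get_tasks."""
    return task_cls.get_tasks(task_filter={"user": [user_id]}, **query)


def example():
    # Imported here so the helpers above can be read without clearml installed
    from clearml import Task

    # Assumes this code runs inside a Task, otherwise current_task() returns None
    me = owner_id(Task.current_task())
    for t in tasks_of_user(Task, me, project_name="examples"):  # hypothetical project
        print(t.id, t.name)
```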
User/pass should be enough,
Could it be the specific commit ID is not pushed?
Hi @<1610083503607648256:profile|DiminutiveToad80>
do you have a full log? can you share the code you are trying to run?
Could it be that this is the callback that causes it?
Please hit Ctrl-F5 to refresh the entire page and see if it is still empty...
throw an error when running without clearml.conf which tells the user to run clearml-init first?
I would like potential users to be able to just run the example code and get the experience, or even integrate with their code, without the need to run a single configuration
(Basically to alleviate as many potential hurdles from getting users on board clearml)
I'm assuming the reason it fails is that the docker network is only available within the specific docker compose. This means that when you spin up another docker compose, the two do not share the same names. Just replace the name with the host name or IP and it should work. Notice this has nothing to do with clearml or serving; these are docker network configurations.
I'm trying to figure if this is reproducible...
Hi @<1578555761724755968:profile|GrievingKoala83>
Two tasks are created, but the training does not begin, both tasks are in perpetual running.
Can you print something after the task.launch_multi_node(args.nodes) call? I'm assuming the two Tasks are running and are blocked on the "Trainer" class
If specified args.gpus=2 and args.nodes=2, three tasks are created.
This is really odd, can you add some prints with task id and rank after the ...
Thanks Martin, so does it mean I won't be able to see the data hosted on S3 bucket in ClearML dashboard under datasets tab after registering it?
Sure you can. Let's assume you have everything in your local /mnt/my/data: you can just add this folder with add_files, then upload it to your S3 bucket with upload(output_uri=" None ",...)
make sense ?
This smells like a driver/image issue on the instance VM
What are you getting if you add this inside your code?
os.system('nvidia-smi')
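If you want the output captured as a string (so it can be logged, or so a missing binary is easy to spot), a small variant using only the standard library might be the following; `gpu_report` is a made-up helper name, not part of clearml.

```python
import shutil
import subprocess


def gpu_report():
    """Return the nvidia-smi output (stdout + stderr), or None if the binary is missing."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None
    # check=False: a non-zero exit (e.g. driver mismatch) still returns its message
    result = subprocess.run([exe], capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


print(gpu_report() or "nvidia-smi not found on PATH")
```

A missing binary and a driver error both point at the image/driver setup rather than at the training code.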
But I think this error has only appeared since I upgraded to version 1.1.4rc0
Hmm let me check something
btw: you can also configure --extra-index-url in the agent's clearml.conf