Hi @<1610808279263350784:profile|FriendlyShrimp96>
Is there a way to get a list of variants given a metric, or even just a full list of metrics and variants for a given task id?
Try this
from clearml.backend_api.session.client import APIClient
c = APIClient()
metrics = c.events.get_task_metrics(tasks=["TASK_ID_HERE"], event_type="training_debug_image")
print(metrics)
I think API ...
Hi ScatteredClams84
Is there any parameter that adjusts the "number of files that can be stored in the cache"? I am using clearml python version 1.0.3 to upload artifacts and get the artifacts back from a task. (edited)
Yes you are correct, the default value is 100 entries.
You can configure it in clearml.conf, just add:
sdk.storage.cache.default_cache_manager_size = 1000
or from code:
` from clearml.storage.cache import CacheManager
CacheManager.get_cache_manager(cache_file_...
LazyFish41 just making sure, you built a container from the docker file, and used it as base docker image for the Task, is that correct ?
Also notice that clearml-agent will not change the entry point of the docker image, meaning that if the entry point does not end with plain bash, it will not actually run anything
that embed seems to be slightly off with regards to where the link is actually pointing to
I think this is the Slack preview... 😞
Hi FierceHamster54
Sure, just do:
dataset = Dataset.get(dataset_project="project", dataset_name="name")
This will by default fetch the latest version
- Maybe we should add an option, archive components as well ...
The pod has an annotation with a AWS role which has write access to the s3 bucket.
So assuming the boto environment variables are configured to use the IAM role, it should be transparent, no? (I can't remember what the exact envs are, but google will probably solve it 🙂)
AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_SESSION_TOKEN. I was expecting clearml to pick them by default from the environment.
Yes it should, the OS env will always override the configuration file sect...
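To double-check that the pod actually exposes the three variables named above, here is a minimal sketch using only the standard library. The helper name `missing_aws_env` is made up for illustration and is not part of clearml or boto.

```python
import os

# The three boto environment variable names mentioned in the thread.
AWS_ENV_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")


def missing_aws_env(env=None):
    """Return the AWS credential variables that are unset or empty.

    Hypothetical helper: `env` defaults to the process environment,
    but any mapping can be passed in (handy for testing).
    """
    env = os.environ if env is None else env
    return [name for name in AWS_ENV_VARS if not env.get(name)]


# Report what the current process can see.
print("missing:", missing_aws_env())
```

Running this inside the pod (or inside the Task itself) shows whether the credentials are visible to the Python process at all, before looking at the clearml side.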
Hi JealousParrot68
no need for decorators, you can just pass the function to schedule_function=<function goes here>
🙂
See scheduler here
https://github.com/allegroai/clearml/blob/8708967a5ef4d8529a1a5ea417672e3ebbb258d7/clearml/automation/scheduler.py#L485
And triggers here:
https://github.com/allegroai/clearml/blob/8708967a5ef4d8529a1a5ea417672e3ebbb258d7/clearml/automation/trigger.py#L193
https://github.com/allegroai/clearml/blob/8708967a5ef4d8529a1a5ea417672e3ebbb258d7/clea...
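A rough sketch of passing a plain function to the scheduler, based on the `TaskScheduler.add_task` call in the scheduler source linked above. The names `nightly_report` and `build_scheduler` are made up, and the timing values are only illustrative; please check the `add_task` docstring for the exact meaning of `minute`, `hour` and `day`.

```python
def nightly_report():
    """A plain function; no decorator is needed to schedule it."""
    print("running scheduled job")
    return "done"


def build_scheduler():
    """Create a scheduler and register the function above (hypothetical wrapper)."""
    # Imported lazily so this file can be imported without clearml installed.
    from clearml.automation import TaskScheduler

    scheduler = TaskScheduler()
    scheduler.add_task(
        name="nightly report",
        schedule_function=nightly_report,  # pass the function object itself
        minute=30,  # illustrative timing values, see the add_task docstring
        hour=22,
        day=1,
    )
    return scheduler
```

Calling `build_scheduler().start()` would then run the scheduling loop in the current process; this needs a configured clearml installation.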
Hi SubstantialElk6
noted that clearml-serving does not support Spacy models out of the box and
So this is a good point.
To add any missing package to the preprocessing docker you can just add them in the following environment variable here: https://github.com/allegroai/clearml-serving/blob/d15bfcade54c7bdd8f3765408adc480d5ceb4b45/docker/docker-compose.yml#L83
EXTRA_PYTHON_PACKAGES="spacy>1"
Regarding a custom engine, basically this is supported with --engine custom
you c...
I have timeseries dataset with dimension 1,60,1 which the first dimension is number of data, the second one is timestep
I think it should be `--input-size 1 60` if the last dimension is the batch size?
(BTW: this goes directly to Triton configuration, it is the information Triton needs in order to run the model itself)
Hi @<1639799308809146368:profile|TritePigeon86>
Sounds awesome, how can we help?
AdventurousButterfly15 this one is quite self-contained:
https://github.com/allegroai/clearml/blob/master/examples/reporting/scalar_reporting.py
So I guess pip install finished working
But the task is evidently not being executed.
This is very odd ... you can run the agent with debugging with --debug --foreground to see all the outputs and logs
Hi GreasyRaven35
You should set the output_uri in Task.init, it will auto upload the model and register the remote location URL:
task = Task.init(..., output_uri=True)
You can also specify a target bucket, if you configured credentials (e.g. output_uri="s3://bucket")
so that you can get the latest artifacts of that experiment
what do you mean by "the latest artifacts"? do you have multiple artifacts on the same Task or is it the latest Task holding a specific artifact?
What's the host you have in the clearml.conf ?
is it something like "http://localhost:8008"?
And how did you connect your example.yaml?
you can run md5 on the file as stored in the remote storage (nfs or s3)
s3 is implementation specific (i.e. MinIO, Weka, Wasabi etc. might not support it) and I'm actually not sure regarding nfs (I mean you can run it, but it actually means you are reading the data; that said, I'm assuming nfs is by definition relatively fast access)
wdyt?
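For the nfs case, where the file can be read as a normal path, a minimal sketch of computing the md5 in chunks with the standard library. The function name `file_md5` and the chunk size are arbitrary choices, and as noted above this does read the whole file.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex md5 of a file, reading it in chunks to keep memory flat."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing this value against the hash of the local copy tells you whether the stored file changed.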
Hi
The Squash operation copies all the data and is no longer linked to previous commits?
Yes, basically the idea is if you have a data version that relies on many parents that need to be merged, the squash will create a merged copy and push it all as a single version, and then yes the parent versions are no longer needed
I thought this operation is like git squash but it seems to me
yeah... we did not want to actually delete the parents because unlike git, the operation is done ...
Hi @<1684735407637401600:profile|WonderfulJellyfish65>
BTW, the training script connects to apiserver via the internal IP address
That is a big issue, because as you noticed the links to data generated by the code will have the internal IP ...
You basically need every component to use the same address (url)
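A quick way to spot this kind of mismatch is to compare the host part of the server URLs each component is configured with. This is only a sketch: `hosts_match` is a made-up helper, the URLs are placeholders, and it assumes all services are exposed on one host with different ports (a deployment that uses separate subdomains would need a different check).

```python
from urllib.parse import urlparse


def hosts_match(urls):
    """Return True when every URL points at the same host name (ports may differ)."""
    hosts = {urlparse(url).hostname for url in urls}
    return len(hosts) == 1


# Placeholder values: one component on an internal IP, another on a public name.
training_script_urls = ["http://10.0.0.5:8008", "http://clearml.example.com:8080"]
print(hosts_match(training_script_urls))
```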
@<1569858449813016576:profile|JumpyRaven4> fyi clearml-serving was synced 🤞
NVIDIA_VISIBLE_DEVICES=0,1
Basically it is used "as is" and the Nvidia drivers do the rest
Same goes for all
or 0-3
etc.
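To illustrate the "as is" part, a small sketch that launches a child process with the variable set and shows the child sees exactly the same string. The helper `run_with_gpus` is made up for illustration, and the child here is just a Python one-liner rather than a real GPU job.

```python
import os
import subprocess
import sys


def run_with_gpus(gpus, command):
    """Run `command` with NVIDIA_VISIBLE_DEVICES set to `gpus`, unmodified."""
    env = dict(os.environ, NVIDIA_VISIBLE_DEVICES=gpus)
    return subprocess.run(command, env=env, capture_output=True, text=True, check=True)


# The child simply echoes the value it received.
child = [sys.executable, "-c", "import os; print(os.environ['NVIDIA_VISIBLE_DEVICES'])"]
result = run_with_gpus("0,1", child)
print(result.stdout.strip())
```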
Hi WickedGoat98
but is there also a way to delete them, or wipe complete projects?
https://github.com/allegroai/trains/issues/16
Auto cleanup service here:
https://github.com/allegroai/trains/blob/master/examples/services/cleanup/cleanup_service.py
My question is what happens if I launch in parallel multiple doit commands that create new Tasks.
Should work out of the box.
I would like to confirm that current_task ...
Correct.
Should work out of the box, as long as the task was started. You can forcefully start the task with: task.mark_started()
but when I removed output_uri from Task.init, the pickled model has path
When you run the job on the k8s pod?
ColorfulBeetle67 you might need to configure use_credentials_chain
see here:
https://github.com/allegroai/clearml/blob/a9774c3842ea526d222044092172980ae505e24f/docs/clearml.conf#L85
Regarding the Token, I did not find any reference to "AWS_SESSION_TOKEN" in the clearml code, my guess is it is used internally by boto?!
Oh if this is the case, then by all means push it into your Task's docker_setup_bash_script
It does not seem to have to be done after the git clone, the only part that I can see is setting the PYTHONPATH to the additional repo you are pulling, and that should work.
The main hurdle might be passing credentials to git, but if you are using SSH it should be transparent
wdyt?