Ooops 🙂
task.get_tags()
task.set_tags()
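e.g. a quick sketch (the task ID is a placeholder):
```
from clearml import Task

task = Task.get_task(task_id="<your_task_id>")  # placeholder task ID
tags = task.get_tags()              # current tags on the task
task.set_tags(tags + ["reviewed"])  # replace the tag list with an updated one
```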
JitteryCoyote63 that makes total sense!!
The reporting subprocess is not being updated with the new value! Let me check how we can pass it along...
The other order (with the custom decorator above the pipeline decorator) fails - just for your info.
This is on purpose - the pipeline decorator has to be the top decorator.
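For example, a minimal sketch (my_custom_decorator is a hypothetical user decorator):
```
import functools
from clearml import PipelineDecorator

def my_custom_decorator(func):
    # hypothetical custom decorator, just wraps the pipeline logic
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

# the pipeline decorator must be the topmost (outermost) decorator
@PipelineDecorator.pipeline(name="example pipeline", project="examples", version="0.1")
@my_custom_decorator
def pipeline_logic():
    pass
```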
Glad it works!
The second problem that I am running into now, is that one of the dependencies in the package is actually hosted in a private repo.
Add your private repo to the extra index section in the clearml.conf:
None
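Something along these lines (the index URL and credentials are placeholders):
```
agent {
    package_manager {
        # extra pip index URLs the agent will use when installing packages
        extra_index_url: ["https://<user>:<token>@my-private-pypi.example.com/simple"]
    }
}
```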
Hmm could you try to upload to your files server (not the S3)
Maybe some credentials error ?
Actually this should be a flag
Yes this is Triton failing to load the actual model file
If you want each "main" process as a single experiment, just don't call Task.init in the scheduler
(Also can you share the clearml.conf, without actual creds 🙂 )
Do you think this is better ? (the API documentation is coming directly from the python doc-string, so the code will always have the latest documentation)
https://github.com/allegroai/clearml/blob/c58e8a4c6a1294f8acec6ed9cba81c3b91aa2abd/clearml/datasets/dataset.py#L633
Yes, could you send the full log? A screen grab?
Hi @<1590152178218045440:profile|HarebrainedToad56>
Yes you are correct, all TB logs are stored in the ELK in the clearml backend. This scales really well and rarely has issues, as long of course as the clearml-server is running on a strong enough machine. How much RAM / HD do you have on the clearml-server?
I would just add git+
None to your requirements (either in the requirements.txt or even better as part of the pipeline/component where you also specify the repo to be used)
The agent will automatically push the credentials when it installs the repo as a wheel.
wdyt?
btw: you might also get away with adding -e .
into the requirements.txt (but you will need to test that one)
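e.g. a requirements.txt along these lines (the repo URL is a placeholder for your private package):
```
# requirements.txt (sketch)
git+https://github.com/your-org/your-private-package.git
# or, as mentioned above, possibly just (needs testing):
# -e .
```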
@<1657918706052763648:profile|SillyRobin38> out of curiosity did you compare performance of tensorrt-llm vs vllm ?
(the jury is still out on that, just wondered if you had a chance)
I see... We could definitely add an argument to control it. I'll update here once there is an RC
Hi @<1661542579272945664:profile|SaltySpider22>
question 1: are parallel writes to a dataset with the same version possible?
When you say parallel, what do you mean? From multiple machines?
What's the recommended way to append to the dataset in a future version?
Once a dataset was finalized the only way to add files is to add another version that inherits from the previous one (i.e. the finalized version becomes the parent of the new version)
If you are worried about multip...
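For the "add files in a new version" part, a minimal sketch (project/dataset names and paths are placeholders):
```
from clearml import Dataset

# get the finalized (parent) dataset
parent = Dataset.get(dataset_project="examples", dataset_name="my_dataset")

# create a new version that inherits from it
child = Dataset.create(
    dataset_project="examples",
    dataset_name="my_dataset",
    parent_datasets=[parent.id],
)
child.add_files("/path/to/new/files")
child.upload()
child.finalize()
```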
Hi CynicalBee90
Always great to have people joining the conversation, especially if they are the decision makers a.k.a can amend mistakes 🙂
If I can summarize a few points here (and feel free to fill in / edit any mistake or leftovers)
Open-Source license: This is basically the MongoDB license, which is as open as possible while still offering some protection against giants like Amazon taking the APIs (as they did with both MongoDB and Elasticsearch) Platform & language agno...
Hi @<1601386194774528000:profile|AmusedPanda8>
I think the project name is ./model_training/trained_models/yolov8n-TEST_OKTODELETE/
and for some reason you have "." as a project?
(notice nested projects are automatically created based on the project name, with '/' as separator)
Nice SubstantialElk6 !
BTW: you can configure your clearml client to store the changes from the latest pushed commit (and not the default, which is the latest local commit)
see store_code_diff_from_remote:
in clearml.conf:
https://github.com/allegroai/clearml/blob/9b962bae4b1ccc448e1807e1688fe193454c1da1/docs/clearml.conf#L150
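i.e. something along these lines in clearml.conf (following the linked example):
```
sdk {
    development {
        # take the code diff against the remotely pushed commit
        # instead of the latest local commit
        store_code_diff_from_remote: true
    }
}
```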
Would this be best if it were executed in the Triton execution environment?
It seems the issue is unrelated to the Triton ...
Could I use the clearml-agent build command and the Triton serving engine task ID to create a docker container that I could then use interactively to run these tests?
Yep, that should do it 🙂
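Roughly something like this (the task ID is a placeholder, check clearml-agent build --help for the full set of flags):
```
# build a docker image from the serving task's environment
clearml-agent build --id <serving_task_id> --docker
```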
I would start simple, no need to build the docker container itself - it seems like a clearml credentials issue?!
If you passed the correct path it should work (if it fails it would have failed right at the beginning).
BTW: I think it is clearml-agent --config-file <file here> daemon ...
and the clearml server version ?
So can you verify it can download the model ?
Hi @<1691620877822595072:profile|FlutteringMouse14>
Do I have to use Hydra
You can, and then the entire configuration is fully captured by ClearML (automatically) while you can still override values with the manual "key.sub=value" both in the UI and in the CLI
Otherwise you can connect a nested dict with task.connect (these will be flattened with '/' for sub keys).
Or you can connect configuration files ( task.connect_configuration ) and edit them as is in the UI (with override of...
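A quick sketch of both options (project name and values are placeholders):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="config demo")

# nested dict - keys are flattened with '/' (e.g. "model/lr") and editable in the UI
params = {"model": {"lr": 0.001, "layers": 4}, "data": {"batch_size": 32}}
params = task.connect(params)

# full configuration file - shown (and editable) as-is in the UI
task.connect_configuration("config.yaml", name="training config")
```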
Hi JitteryCoyote63
Signal 9 is the kill signal (SIGKILL), could it be someone killed the process? Do you have other logs to share? Is this reproducible?
Yes, I think the API is probably the easiest:
from clearml.backend_api.session.client import APIClient
client = APIClient()
project_list = client.projects.get_all()
print(project_list)
I can read them programmatically using tensorboard and then log them using the clearml logger,
StaleButterfly40 this will be a great script to put somewhere (I'm sure you are not the only one with this problem). Maybe put it as a GitHub issue ? wdyt ?
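A rough sketch of such a script (the log dir, project and task names are placeholders):
```
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator
from clearml import Task

task = Task.init(project_name="examples", task_name="re-log TB scalars")
logger = task.get_logger()

# load the existing TensorBoard event files
ea = EventAccumulator("/path/to/tb/logdir")
ea.Reload()

# re-report every scalar series through the ClearML logger
for tag in ea.Tags().get("scalars", []):
    for event in ea.Scalars(tag):
        logger.report_scalar(title=tag, series=tag, value=event.value, iteration=event.step)
```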
The class documentation itself is also there under "References" -> "Trains Python Package"
Notice that due to a bug in the documentation (we are working on a fix) the reference part is not searchable in the main search bar
So maybe the path is related to the fact I have venv caching on?
hmmm could be...
Can you quickly disable the caching and try ?