What does spin mean in this context?
This line:
docker-compose --env-file example.env -f docker-compose-triton-gpu.yml up
But these have: different task ids, same endpoints (from looking through the tabs)
So I am not sure why they are here and why not somewhere else
You can safely ignore them for the time being.
but is it true that I can have multiple models on the same docker instance with different endpoints?
Yes! This is exactly the idea (and again I'm not sure ...
Epochs are still round numbers ...
Multiply by 2?!
Tried it and restarted the agent, but it's not working properly
What do you mean by "not working"? Can you provide logs?
Hi VexedCat68
So if I understand correctly, the issue is this argument:
parameter_override={'Args/dataset_id': '${split_dataset.split_dataset_id}', 'Args/model_id': '${get_latest_model_id.clearml_model_id}'}
I think what is missing is telling it this is an artifact:
parameter_override={'Args/dataset_id': '${split_dataset.artifacts.split_dataset_id.url}', 'Args/model_id': '${get_latest_model_id.clearml_model_id}'}
You can see the example here:
https://clear.ml/docs/latest/docs/ref...
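To make the reference format above concrete, here is a minimal sketch of just the override dict (the step names split_dataset and get_latest_model_id come from the snippet in this thread; everything else is hypothetical):

```python
# Hypothetical sketch: referencing another pipeline step's output in parameter_override.
# '${split_dataset.artifacts.split_dataset_id.url}' resolves at pipeline runtime
# to the URL of the 'split_dataset_id' artifact produced by the 'split_dataset' step.
parameter_override = {
    # artifact reference: ${<step_name>.artifacts.<artifact_name>.url}
    'Args/dataset_id': '${split_dataset.artifacts.split_dataset_id.url}',
    # plain field reference from another step
    'Args/model_id': '${get_latest_model_id.clearml_model_id}',
}

# sanity check of the reference format
assert parameter_override['Args/dataset_id'].startswith('${split_dataset.artifacts.')
```

The key difference from the failing version is the `.artifacts.<name>.url` path, which tells the pipeline to substitute the artifact's storage URL rather than look up a plain parameter.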
Found it
GiganticTurtle0 you are 🧨! Thank you for stumbling across this one as well.
Fix will be pushed later today.
Notice the args will be set on the connect call, so the check on whether they are empty should come after
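The ordering point above can be sketched with a plain-Python stand-in for the connect call (connect_stub is hypothetical; the real call is clearml's Task.connect):

```python
# Hypothetical stand-in for Task.connect(): when a task is executed by an agent,
# connect() injects the values overridden in the UI into the args dict.
def connect_stub(args, ui_overrides):
    args.update({k: v for k, v in ui_overrides.items() if k in args})
    return args

args = {'dataset_id': '', 'model_id': ''}

# WRONG place to validate: before connect() the dict still holds the empty defaults
assert args['dataset_id'] == ''

connect_stub(args, {'dataset_id': 'abc123'})

# RIGHT place to validate: after connect() the overridden value is visible
if not args['dataset_id']:
    raise ValueError('dataset_id was not provided')
```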
I think I'm missing the connection between the hash ids and the txt file; in other words, why does the txt file contain the full path rather than a relative path?
You can also see the code there. Could you run a quick test against the demo server? This might be a UI issue that was fixed in the last version.
It seems to fail when trying to download the model:
local_download = StorageManager.get_local_copy(uri, extract_archive=False)
  File "/opt/venv/lib/python3.7/site-packages/clearml/storage/manager.py", line 47, in get_local_copy
    cached_file = cache.get_local_copy(remote_url=remote_url, force_download=force_download)
  File "/opt/venv/lib/python3.7/site-packages/clearml/storage/cache.py", line 55, in get_local_copy
    if helper.base_url == "file://":
And based on the error I suspect the...
Also, finally the columns will be movable and resizable. I can't wait for the next version ;)
No, ClearML uses boto; this is an internal boto error, which points to a bucket size limit. See the error itself.
YEY!
I think I understand what the issue is: you have installed the agent on your Python 3.8, but it is running and trying to install on Python 3.10.
To verify:
pip uninstall clearml-agent
python3.10 -m pip install clearml-agent
python3.10 -m clearml_agent daemon...
Well, it should fail, but I think the error message should be fixed.
Maybe:
ValueError: dataset 'tmp_datset' not found in project 'lavi-testing'
wdyt?
Hi @<1697056701116583936:profile|JealousArcticwolf24> just saw the reply
Image looks okay?! What is the query? Basically I'm trying to understand if Grafana is connected to Prometheus, and if Prometheus has any data in it.
Secondly, just to make sure, the Kafka service should be able to connect directly to the container running the actual inference.
Hi PompousBeetle71, what exactly is the scenario / problem we are trying to solve?
As I understand it, providing this param at Task.init() inside the subtask is too late, because the step has already started.
If you are running the task on an agent (which I assume you do), then one way would be to configure the "default_output_uri" in the agent's clearml.conf file.
The other option is to change the task at creation time: task.storage_uri = 's3://...'
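A minimal clearml.conf sketch of the first option (the bucket path is a hypothetical placeholder; the key sits under the sdk.development section in the standard config layout):

```
sdk {
    development {
        # default target storage for all task outputs (models, artifacts)
        default_output_uri: "s3://my-bucket/clearml-outputs"
    }
}
```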
I have a question regarding running the code on a remote machine. Each time I run the code, I see in the console on the ClearML server that it starts downloading all the libraries I used in the code, and when I run another piece of code the same thing happens. Why does it have to download all the libraries again, so many times?
I'm assuming you are referring to the installation; the downloaded Python packages are cached.
You can turn on full caching by uncommenting the following line:
https://github.com/alleg...
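For reference, a sketch of the relevant block in the agent's clearml.conf (the path shown is the commented-out default; treat the exact keys as an assumption to check against your agent version):

```
agent {
    venvs_cache: {
        # uncomment the path to enable full virtual-environment caching
        path: ~/.clearml/venvs-cache
    }
}
```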
Hi GrotesqueDog77
and after some time I want to delete artifact with
You can simply upload with the same local file name and the same artifact name; it will override the target storage. wdyt?
Guys FYI:
params = task.get_parameters_as_dict()
Based on what I see, when the EC2 instance starts it installs the latest version. Could it be this instance is still running?
CrookedWalrus33 any chance you can think of a sample code to reproduce?
but I don't see any change... where is the link to the file removed from?
In the meta data section, check the artifacts "state" object
How are these two datasets different?
Like comparing two experiments :)
My bad, you have to pass it to the container itself:
https://github.com/allegroai/clearml-agent/blob/a5a797ec5e5e3e90b115213c0411a516cab60e83/docs/clearml.conf#L149
extra_docker_arguments: ["-e", "CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1"]