It looks like it is running. Did the status of the experiment change?
CluelessElephant89 , I think the RAM requirement for Elastic might be 2GB. You can try the following hack and maybe it will work.
In the machine that it's running on there should be a docker-compose.yml
file (I'm guessing at home directory).
At the following line https://github.com/allegroai/clearml-server/blob/master/docker/docker-compose.yml#L41 you can try changing the value to ES_JAVA_OPTS: -Xms1g -Xmx1g
and this might limit the Elastic memory to 1 GB, however please note this might ...
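For reference, the change would look something like this inside the docker-compose.yml (a sketch only; the surrounding keys depend on your server version, so match it against your actual file):

```yaml
services:
  elasticsearch:
    environment:
      # Cap the Elastic JVM heap at 1 GB (both the initial and max heap size)
      ES_JAVA_OPTS: -Xms1g -Xmx1g
```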
Hi @<1590514584836378624:profile|AmiableSeaturtle81> , what version is your server? Can you provide logs from the apiserver container?
SharpHedgehog60 hi!
All passwords/secrets can be kept in ~/clearml.conf on the relevant machine.
Is this what you meant?
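For example, storage credentials can live in ~/clearml.conf like this (a sketch; the key names follow the standard sdk.aws.s3 layout of clearml.conf, and the values here are placeholders):

```
# ~/clearml.conf on the machine running the code
sdk {
    aws {
        s3 {
            key: "YOUR_ACCESS_KEY"
            secret: "YOUR_SECRET_KEY"
        }
    }
}
```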
CluelessElephant89 , the relevant command should be something like: sudo docker logs clearml-apiserver
Try removing the region, it might be confusing it
Please note that this configuration is on the client side, not the clearml server side 🙂
VexedCat68 , You can add labels to your model to keep track of what each model ID means (for example cat=1, dog=2). Please take a look here 🙂
https://github.com/allegroai/clearml/blob/master/examples/reporting/model_config.py#L41
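A minimal sketch of that pattern (the project/task names are placeholders, and actually running it assumes clearml is installed and a server is configured):

```python
# Sketch: attach a label enumeration to a task's output model so each
# numeric model ID is documented alongside the model itself.
label_enumeration = {"background": 0, "cat": 1, "dog": 2}

def attach_labels(labels):
    # imports kept inside the function so the sketch stands on its own
    from clearml import Task, OutputModel
    task = Task.init(project_name="examples", task_name="model labels demo")
    model = OutputModel(task=task)
    model.update_labels(labels)  # stores the ID <-> class-name mapping with the model
    return model
```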
Hi @<1524922424720625664:profile|TartLeopard58> , projects and many other internals like tasks are all saved in internal databases of the ClearML server, specifically in mongo & elastic
What version of clearml and clearml-agent are you using, and on what OS? Can you add the command line you're running for the agent?
Can you please give an example?
Also, can you try with dataset.upload(output_url="/home/user/server_local_storage/clearml_training_dataset/")
(note the added '/' at the end of the line)
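End to end, that would look something like this (a sketch; the dataset/project names and local paths are placeholders, and running it assumes clearml is installed and a server is configured):

```python
# Sketch: create a dataset and upload it to local server storage.
def build_dataset(files_path, output_url):
    from clearml import Dataset
    ds = Dataset.create(dataset_project="examples", dataset_name="training data")
    ds.add_files(path=files_path)
    # note the trailing '/' on output_url
    ds.upload(output_url=output_url)
    ds.finalize()
    return ds
```

For instance: build_dataset("./data", "/home/user/server_local_storage/clearml_training_dataset/")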
CurvedHedgehog15 , isn't the original experiment you selected to run against the basic benchmark?
Hi @<1544853695869489152:profile|NonchalantOx99> , as the error states:
raise ValueError('Could not find queue named "{}"'.format(queue_name))
ValueError: Could not find queue named "services"
It couldn't find the queue services. This is for the pipeline controller to run on. Pipelines consist of steps and the controller and they all can run on different machines. By default, the controller will try to run on the services queue. Makes sense? 🙂
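To illustrate the split between controller and steps, here is a sketch (the pipeline/step/queue names are placeholders; PipelineController and its methods are the standard SDK classes, and running this assumes a configured server with a "services" queue):

```python
# Sketch: the controller itself is enqueued on the "services" queue,
# while the pipeline steps execute on a different queue.
def run_pipeline():
    from clearml.automation import PipelineController
    pipe = PipelineController(name="demo pipeline", project="examples", version="1.0")
    pipe.add_step(
        name="step_1",
        base_task_project="examples",
        base_task_name="base task",      # placeholder task to clone as the step
        execution_queue="default",       # agents pulling "default" run the step
    )
    pipe.start(queue="services")         # the controller runs on "services"
```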
Next release of clearml-serving, correct
SmallDeer34 Hi! 🙂
I'm afraid we currently support only up to 10 experiments in a comparison at once. However, you can add/remove experiments mid-comparison. Is there a specific reason you need to compare 20?
Regarding the coloring, if you do 5/5 it would be supported.
It looks like ES is unable to write to the volume that you provided.
Did you provide access to the folder where ES is trying to write? On what OS are you running the server?
Hi @<1617693654191706112:profile|HelpfulMosquito8> , can you please elaborate? Did you self deploy on Redshift? I don't think it really matters. What is the workflow you're trying to achieve?
How about this by the way?
https://clear.ml/docs/latest/docs/references/sdk/model_outputmodel#outputmodelset_default_upload_uri
Hi @<1717350332247314432:profile|WittySeal70> , just to clarify, are you talking about the ClearML server itself or about agents?
From my understanding the AMI is simply an image with the ClearML server preloaded to it.
Hi @<1800699527066292224:profile|SucculentKitten7> , you can control the docker image on the task level, either through code (Task.set_base_docker) or through the webUI in the 'Execution' section of the task
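For example (a sketch; the image and docker arguments are placeholders, the keyword arguments follow recent SDK versions, and running it assumes clearml is installed and configured):

```python
# Sketch: set the docker image a task will run in when picked up by an
# agent in docker mode.
def configure_docker(task_id):
    from clearml import Task
    task = Task.get_task(task_id=task_id)
    task.set_base_docker(
        docker_image="nvidia/cuda:11.8.0-runtime-ubuntu22.04",  # placeholder image
        docker_arguments="--ipc=host",                          # placeholder args
    )
```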
YummyLion54 , please try the following:
```python
from clearml import Task

base_task = Task.get_task(task_id="base task id")
cloned_task = Task.clone(source_task=base_task)
cloned_task.connect_configuration(name="OmegaConf", configuration="path/to/conf/file")
```
You would also need to somehow edit the links that are connected to the task
It seems that the SDK can't reach the API server. Are you seeing anything in the API server logs? Is it possible you're being blocked by an internal firewall?
What version is the server? Do you see any errors in the API server or webserver containers?
and also take a look into development.apply_environment
Hi @<1708653001188577280:profile|QuaintOwl32> , you can control all of this on the task level. For example through code you can use Task.set_base_docker
You can add all of these as arguments
SoreDragonfly16 , let me take a look 🙂
ClumsyElephant70 , I understand it's already possible to combine Python and Triton, however it is quite difficult to do and requires a lot of work.
We're working on making ClearML-Serving easier to use and this is one of our next plans in the to do list.
We hoped to release it earlier but got caught with some other pressing issues we had to take care of.
So to sum up the long answer, yes, it is in our plans and will be supported eventually 🙂