Hi @<1523701283830108160:profile|UnsightlyBeetle11> , I think you can store text artifacts, so you could simply store the string there. If it's not too long, you can even view it in the artifact preview
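For example, roughly (a minimal sketch; the names here are just placeholders):
from clearml import Task

task = Task.init(project_name='examples', task_name='store text artifact')
# store the string as an artifact; it shows up under the task's artifacts
task.upload_artifact(name='my_text', artifact_object='the string you want to keep')
# later, from another script, fetch it back
stored = Task.get_task(task_id=task.id).artifacts['my_text'].get()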
Hi RoughTiger69 ,
If you create a child version and add only the delta of the files to the child, fetching the child version will fetch the parent's files as well
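A rough sketch of what I mean (the dataset names/IDs are placeholders):
from clearml import Dataset

# create a child version on top of the existing parent dataset
child = Dataset.create(
    dataset_name='my_dataset',
    dataset_project='datasets',
    parent_datasets=['<parent_dataset_id>'],
)
child.add_files(path='new_files/')  # add only the delta
child.upload()
child.finalize()

# fetching the child version also pulls the files inherited from the parent
local_path = Dataset.get(dataset_id=child.id).get_local_copy()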
Hi ObedientToad56 🙂
My question is on how the deployment would be once we have verified the endpoints are working in a local container.
Isn't the deployment just running the inference container? You just open up the endpoints towards wherever you want to serve, no?
ClearML should log all OmegaConf automatically according to this: https://clear.ml/docs/latest/docs/fundamentals/hyperparameters#hydra
You might want to take a look at this example as well 🙂
https://github.com/allegroai/clearml/blob/master/examples/frameworks/hydra/hydra_example.py
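In short, it's along these lines (a trimmed-down sketch of that example; the paths/names are placeholders):
import hydra
from omegaconf import OmegaConf
from clearml import Task

@hydra.main(config_path='config_files', config_name='config')
def my_app(cfg):
    # once the task is initialized, the OmegaConf configuration is logged automatically
    task = Task.init(project_name='examples', task_name='hydra example')
    print(OmegaConf.to_yaml(cfg))

if __name__ == '__main__':
    my_app()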
JitteryCoyote63 , I think so.
config = OmegaConf.load(train_task.connect_configuration(config_path))
Should work
I have no idea, but considering that the version of http://app.clear.ml was updated recently (last week, from what I noticed), I'd guess that the self-hosted server release is right around the corner 😉
In the webUI, when you go to the dataset, where do you see it is saved? You can click on 'full details' in any version of a dataset and see that in the artifacts section
Hmmm, maybe you could save it as an env var. There isn't a 'default' server per se, since you can deploy it anywhere yourself. Regarding checking if it's alive, you can either ping it with curl or check the docker status of the server 🙂
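For example, something like this (just a sketch - it assumes the API server URL lives in the CLEARML_API_HOST env var and that the server's debug.ping endpoint is reachable):
import os
import requests

# assumed env var holding the API server URL, e.g. http://localhost:8008
api_server = os.environ.get('CLEARML_API_HOST', 'http://localhost:8008')
resp = requests.get(f'{api_server}/debug.ping', timeout=5)
print('alive' if resp.ok else f'API server returned {resp.status_code}')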
@<1664079296102141952:profile|DangerousStarfish38> , can you provide logs please?
For example:
import argparse
from clearml import Task

task = Task.init(project_name='examples', task_name='PyTorch MNIST train', output_uri=True)
# Training settings
parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
parser.add_argument('--ds-name', default="blabla")
args = parser.parse_args()
Regarding connect_configuration(), reading the docs I see that this method needs to be called before reading the config file
https://clear.ml/docs/latest/docs/references/sdk/task#connect_configuration
Can you provide a snippet to try and reproduce?
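Roughly, this is the order I'd expect to work (a sketch, assuming a YAML config loaded with OmegaConf):
from omegaconf import OmegaConf
from clearml import Task

task = Task.init(project_name='examples', task_name='config example')
# call connect_configuration() first, then load the path it returns
config_path = task.connect_configuration('config.yaml')
config = OmegaConf.load(config_path)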
Hi @<1569496075083976704:profile|SweetShells3> , do you mean to run the CLI command via python code?
ClearML has a built-in model repository, so together I think they make a "feature store" - again, it really depends on your definition
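Just to illustrate the model repository side (a minimal sketch; the file name is a placeholder):
from clearml import Task, OutputModel

task = Task.init(project_name='examples', task_name='register model')
# register a weights file in the model repository, attached to this task
output_model = OutputModel(task=task, framework='PyTorch')
output_model.update_weights(weights_filename='model.pt')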
Hi @<1582179661935284224:profile|AbruptJellyfish92> , how do the histograms look when you're not in comparison mode?
Can you provide a self-contained snippet that creates such histograms and reproduces this behavior, please?
VexedCat68 , I was about to mention it myself. Maybe keeping only the last few or the best checkpoints would be best in this case. I think the SDK supports this quite well 🙂
Yes, however I think you might be able to expose this via an env variable on the Task object itself
Is Ubuntu on the client side, or did you change the OS on the server side?
Sounds like some issue with queueing the experiment. Can you provide a log of the pipeline?
From what I understand, by default ES has a low disk watermark set at 95% of the disk capacity. Once it is reached, the shard is transitioned to read-only mode. Since you have a large 1.8 TB disk, the remaining 85 GB is below that 5% threshold.
Basically you need to set the following env vars in elasticsearch service in the docker compose:
- cluster.routing.allocation.disk.watermark.low=10gb
- cluster.routing.allocation.disk.watermark.high=10gb
- cluster.routing.allocation.disk.wate...
I assigned both the pipeline controller and the component to this worker. Do I rather need to create two agents, one in services mode for the controller and then another one (not in services mode) for the component (which does training and predictions)? But, this seems to defeat the point of being able to run multiple tasks in services mode...
Yes. Again, the services mode is for special 'system' services if you will. The controller can run on the services agent (although not necessary...
Hi ScantChimpanzee51 , I think you can get it via the API - it sits on task.data.output.destination . Retrieve the task object via the API and play with it a bit to see where this sits 🙂
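Something like this, as a rough sketch (the task ID is a placeholder):
from clearml import Task

task = Task.get_task(task_id='<your_task_id>')
# the configured output destination (e.g. an s3:// or file:// URI) sits here
print(task.data.output.destination)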
Hi @<1523701260895653888:profile|QuaintJellyfish58> , can you please provide a standalone snippet that reproduces this?
Hi ElegantCoyote26 ,
It doesn't seem that using port 8080 is mandatory and you can simply change it when you run ClearML-Serving - e.g. docker run -v ~/clearml.conf:/root/clearml.conf -p 8085:8085
My guess is that the example uses port 8080 because usually the ClearML backend and the Serving would run on different machines
Hi @<1710827340621156352:profile|HungryFrog27> , what seems to be the issue?
Hi MoodyCentipede68 ,
What version of ClearML / ClearML-Agent are you using? Is it a self-hosted server or the SaaS?
Also, can you explain what step 7 was trying to do? Is it running locally or distributed?