Something is weird... It is showing workers which are not running now...
This is my example: iteration is 10, so there are 10 runs. Looking at the 4th run, it shows 60% of the jobs, 91% iteration, 94% time... What does that mean?
No no, I mean that now I can export a CSV file into clearml-data. I was wondering if it is possible to export directly from a SQL database.
AgitatedDove14 Not creating but more for orchestrating...
Currently, we manually push a dataset to clearml-dataset.
Have a pipeline controller Task which (takes in data from clearml-dataset, runs preprocessing, runs training) and publishes a model (if a certain threshold is met).
We have a ClearML monitor which watches all Published models. It pushes the URI of each published model to a RabbitMQ queue.
We have a subscriber (Python code) listening to the RabbitMQ queue. This takes in the URI from t...
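The monitor-to-subscriber hand-off above can be sketched with the stdlib alone. This is a minimal illustration, not our actual code: `queue.Queue` stands in for the RabbitMQ channel (in the real setup that would be a pika connection), and all names here are hypothetical.

```python
import json
import queue

# Stand-in for the RabbitMQ queue; in the real deployment this would
# be a pika channel, but queue.Queue keeps the sketch self-contained.
model_queue = queue.Queue()

def publish_model_uri(uri):
    # Monitor side: push the published model's URI as a JSON message.
    model_queue.put(json.dumps({"model_uri": uri}))

def handle_message(body):
    # Subscriber side: parse the message and return the URI so
    # downstream serving code can fetch the model.
    return json.loads(body)["model_uri"]

publish_model_uri("s3://models/iris/model.pkl")
uri = handle_message(model_queue.get())
```

The same parse/handle split works unchanged if the `queue.Queue` is later swapped for a real broker callback.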
AgitatedDove14 We also self-host (on-prem) the Helm charts in our local k8s ecosystem.
Triggering - will be a nice feature indeed; currently we are using clearml.monitors to address this.
Is it the UI presenting the entire workflow? - This portion will also be nice. (Let's say someone uses 1) clearml-dataset -> 2) Pipeline Controller (contains preprocessing, training, hyperparameter tuning) -> 3) clearml-serving.) If they can see the entire thing in one flow.
We are using seldon f...
Yeah, within ClearML we use the PipelineController. We are now mainly looking for a single tool to stitch together the other products.
But of course, we will give first precedence to tools which work best with ClearML. Hence asking if anyone has had similar experience setting up such systems.
Hi, another qn:
dataset_upload_task = Task.get_task(task_id=args['dataset_task_id'])
iris_pickle = dataset_upload_task.artifacts['dataset'].get_local_copy()
How would I replicate the above for a Dataset? Like how to get the iris_pickle file. I did some hacking like below:
ds.get_mutable_local_copy(target_folder='data')
Subsequently, I have to load the file by name as well. I wonder whether there is a more elegant way.
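Assuming the dataset was pulled down with `ds.get_mutable_local_copy(target_folder='data')`, a small stdlib helper can avoid hard-coding the file path inside the copied folder. The function name and layout here are illustrative, not part of the ClearML API:

```python
from pathlib import Path

def find_dataset_file(folder, name):
    # Recursively look up a file by name under the folder returned
    # by ds.get_mutable_local_copy(target_folder='data').
    matches = sorted(Path(folder).rglob(name))
    if not matches:
        raise FileNotFoundError(f"{name} not found under {folder}")
    return matches[0]
```

This keeps the loading code working even if the dataset's internal folder structure changes between versions.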
Okie, now I get it. I set up the clearml-agent on an EC2 instance and it works now.
Thanks
Essentially, while running on k8s_glue, I want to pull the docker image/container, then pip install the additional requirements.txt into it...
However, I am able to get it to work, if I launch a clearml-agent outside the kubernetes ecosystem.
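One possible workaround (hedged; not ClearML-specific and untested in our setup): bake the extra requirements into the image the agent pulls, so nothing needs to be pip-installed at task start-up inside k8s. The base image and paths below are placeholders:

```dockerfile
# Placeholder base image; substitute the image your tasks actually use
FROM tensorflow/tensorflow:latest-gpu
COPY requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt
```

The trade-off is that dependency changes require rebuilding and pushing the image, but pod start-up becomes deterministic.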
Hi AgitatedDove14, just updated that flag, but the problem continues..
agent.package_manager.system_site_packages = true
.....
Environment setup completed successfully
Starting Task Execution:
ClearML results page: files_server:
Traceback (most recent call last):
File "base_template_keras_simple.py", line 15, in <module>
import tensorflow as tf # noqa: F401
File "/root/.clearml/venvs-builds/3.6/lib/python3.6/site-packages/clearml/binding/import_bind.py", line 59, in __pat...
Just figured it out..
Seems like the docker image below didn't have the tensorflow package.. 😮 tensorflow/tensorflow:latest-devel-gpu
I should have checked prior... My bad..
Thanks for the help
It'll be good if there was a yaml file to deploy clearml-agents into the k8s system.
This is where I downloaded the log. Seems like some docker issue, though I can't seem to figure it out. As an alternative, I spawned a clearml-agent outside the k8s environment and it was able to execute well.
Hi, will proceed to close this thread. We found some issue with the underlying docker in our machines. We have now shifted to another k8s cluster of EC2 instances in AWS.
When I push a job to an agent node, I get this error:
"Error response from daemon: network None not found"
Hi, a workaround I thought of.. Btw, I haven't tried it yet. AnxiousSeal95, your comments?
1) Attach a clearml-task id to each new dataset-id.
So in the future, when new data comes in, get the last data commit from the project (Dataset) and get the clearml-task for it. Then clone the clearml-task and pass in the new data. The only downside is the need to clone the clearml-task.
Or alternatively
2) Attach a git SHA of the processing code to each new dataset-id.
This can't give the exact code ...
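A minimal sketch of ideas 1) and 2) as a plain JSON lineage registry. Every name here is hypothetical, and ClearML's own dataset tags/metadata may well be the cleaner route; this is only to make the idea concrete:

```python
import json
from pathlib import Path

# Hypothetical location of the lineage registry file
REGISTRY = Path("dataset_lineage.json")

def record_lineage(dataset_id, task_id=None, git_sha=None):
    # Record which clearml-task and/or code git SHA produced this dataset
    data = json.loads(REGISTRY.read_text()) if REGISTRY.exists() else {}
    data[dataset_id] = {"task_id": task_id, "git_sha": git_sha}
    REGISTRY.write_text(json.dumps(data, indent=2))

def last_lineage():
    # Fetch the most recently recorded entry (the "last data commit")
    data = json.loads(REGISTRY.read_text())
    dataset_id, entry = list(data.items())[-1]
    return dataset_id, entry
```

When new data comes in, `last_lineage()` gives the task id to clone (idea 1) or the git SHA to check out (idea 2).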
No, the agent can be in any machine.
But the agent has to be running on the machine with the GPU.
nice... we need moarrrrrrrr !!!!!!!!
It would be really helpful if you could do the next episode on setting up ClearML in Kubernetes.. 😇
Anyway, keep up the good work for the community.
Yup, I used the values file for the agent. However, I manually edited the agentservices part (as there was no example for it in the GitHub repo).. Also I am not sure what CLEARML_HOST_IP is (left it empty).
Hi FriendlySquid61, the clearml-agent section got filled in from the values.yaml file. However, agentservices was empty, so I filled it in manually..
Yup, tried that.. Same error also