Check the first steps here:
https://clear.ml/docs/latest/docs/getting_started/ds/ds_first_steps
(Basically you have to generate credentials / configure your machine so it knows where the server is and how to access it)
Make sense?
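If it helps, here is a minimal sketch of doing the same thing programmatically instead of running clearml-init and editing clearml.conf; the host URLs, key and secret below are placeholders you generate in the web UI:
` from clearml import Task

# hedged sketch: set the server endpoints and credentials from code
# (all values here are placeholders, use the ones from your own workspace)
Task.set_credentials(
    api_host="https://api.clear.ml",
    web_host="https://app.clear.ml",
    files_host="https://files.clear.ml",
    key="YOUR_ACCESS_KEY",      # generated in the web UI (Settings -> Workspace)
    secret="YOUR_SECRET_KEY",
)

# once the machine knows where the server is, Task.init just works
task = Task.init(project_name="examples", task_name="first steps") `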
I can install pytorch just fine locally on the agent, when I do not use clearml(-agent)
My thinking is the issue might be in the env file we are passing to conda; I can't find any other diff.
BTW:
@<1523701868901961728:profile|ReassuredTiger98> Can I send a specific wheel with more debug prints for you to check (basically it will print the conda env YAML it is using)?
EnviousStarfish54 Sure, see scatter2d
https://allegro.ai/docs/examples/reporting/scatter_hist_confusion_mat_reporting/#2d-scatter-plots
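For reference, a minimal sketch along the lines of that example (project/task names and the data are just illustrative):
` import numpy as np
from clearml import Task

task = Task.init(project_name="examples", task_name="2d scatter reporting")
logger = task.get_logger()

# Nx2 array of (x, y) points to plot
scatter = np.random.randint(10, size=(10, 2))
logger.report_scatter2d(
    title="example_scatter",
    series="series_xy",
    scatter=scatter,
    iteration=0,
    xaxis="title x",
    yaxis="title y",
    mode="markers",
) `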
Hi TrickyRaccoon92
BTW: check out the HP optimization example, it might make things even easier 🙂 https://github.com/allegroai/trains/blob/master/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py
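For context, a hedged sketch of what that example boils down to (using the current clearml imports; the base task ID, parameter names and metric names are placeholders for your own experiment):
` from clearml import Task
from clearml.automation import (
    DiscreteParameterRange,
    HyperParameterOptimizer,
    RandomSearch,
    UniformIntegerParameterRange,
)

task = Task.init(project_name="examples", task_name="HP optimizer", task_type=Task.TaskTypes.optimizer)

optimizer = HyperParameterOptimizer(
    base_task_id="<base_task_id_to_optimize>",   # placeholder: the experiment to clone and optimize
    hyper_parameters=[
        UniformIntegerParameterRange("General/epochs", min_value=5, max_value=20, step_size=5),
        DiscreteParameterRange("General/batch_size", values=[32, 64, 128]),
    ],
    objective_metric_title="validation",         # placeholder scalar reported by the base task
    objective_metric_series="accuracy",
    objective_metric_sign="max",
    optimizer_class=RandomSearch,
    execution_queue="default",
    max_number_of_concurrent_tasks=2,
    total_max_jobs=10,
)
optimizer.start()
optimizer.wait()
optimizer.stop() `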
Hi WackyRabbit7
I have a pipeline controller task, which launches 30 tasks. Semantically there are 10 applications, and I run 3 tasks for each (those 3 are sequential, so in the UI it looks like 10 lines of 3 tasks).
🙂
In one of those 3 tasks that run for every app, I save a dataframe under the name "my_dataframe".
I'm assuming as an artifact:
What I want to achieve is once all tasks are over, to collect all those "my_dataframe" artifacts (10 in number), extract a sin...
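Something along these lines should work once the tasks are completed (a hedged sketch; the project name filter is a placeholder, and it assumes the dataframe was stored with upload_artifact):
` import pandas as pd
from clearml import Task

# grab all completed tasks (placeholder project filter)
tasks = Task.get_tasks(
    project_name="my_project",
    task_filter={"status": ["completed"]},
)

frames = []
for t in tasks:
    artifact = t.artifacts.get("my_dataframe")
    if artifact is not None:
        frames.append(artifact.get())   # downloads and deserializes the DataFrame

combined = pd.concat(frames, ignore_index=True) `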
Hi @<1601386194774528000:profile|AmusedPanda8>
I think the project name is ./model_training/trained_models/yolov8n-TEST_OKTODELETE/
and for some reason you have "." as the project prefix?
(notice nested projects are automatically created based on the project name with '/' as separator)
LovelyHamster1 verified, this is a UI bug with an old limitation still being enforced.
I will make sure they know about it, it should be fixed in the upcoming release 🙂
Wait @<1523701066867150848:profile|JitteryCoyote63>
If you reset the Task you would have lost the artifacts anyhow, how is that different?
Hi UpsetCrocodile10
execute them and return scalars.
This should be a good start (I hope 🙂)
` for child in children:
    # put the Task into an execution queue
    Task.enqueue(child, queue_name='my_queue_here')
    # wait for the task to finish
    child.wait_for_status(status=['completed'])
    # reload all the metrics
    child.reload()
    # get the metrics
    print(child.get_last_scalar_metrics()) `
Yep, this will work. BTW, check the new pipeline, it might offer a more flexible solution:
https://github.com/allegroai/clearml/blob/master/examples/pipeline/full_custom_pipeline.py
But this will require some code changes...
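To give a feel for the kind of code changes, here is a hedged sketch using the newer PipelineController with function steps; the step names, return values and queue are placeholders:
` from clearml import PipelineController

def step_one():
    return 42

def step_two(value):
    print("got", value)

pipe = PipelineController(name="custom pipeline", project="examples", version="0.0.1")

# each function becomes its own task in the pipeline DAG
pipe.add_function_step(name="step_one", function=step_one, function_return=["value"])
pipe.add_function_step(
    name="step_two",
    function=step_two,
    function_kwargs={"value": "${step_one.value}"},   # reference step_one's return value
    parents=["step_one"],
)

pipe.start(queue="services")   # the pipeline logic itself runs on the services queue `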
The reason is that it is logged as an image, not a plot 🙂
- Could we add a comparison feature directly from the search results (Dashboard view -> search -> highlight some experiments for comparison)?
Totally forgot about the global search feature. Hmm, I'm not sure the webapp is in the correct "state" for that, i.e. I think selection only works in "table view", which is the "all experiments" flat table
- Could we add a filter on the project name in the "All Experiments" project?
You mean "filter by project" ?
Could we ad...
First I would check the CLI command, it will basically prefill it for you:
https://clear.ml/docs/latest/docs/apps/clearml_task
Specifically to your question, working directory "." is the root of the git repo
But I would avoid adding it manually; use the CLI, it will either ask you to provide the info or take the git repo details from the local copy
Hi @<1523701079223570432:profile|ReassuredOwl55> let me try to add some color here:
Basically we have two parts: (1) pipeline logic, i.e. the code that drives the DAG, (2) pipeline components, e.g. model verification
The pipeline logic (1), i.e. the code that creates the DAG, the tasks, and enqueues them, will be running in the git actions context, i.e. this is the automation code. The pipeline components themselves (2), e.g. model verification, training, etc., are running using the clearml agents...
Hi SmallDeer34
The clearml-agent has its own clearml.conf file; there you should put the S3 credentials and they will be passed to any Task the agent executes:
https://github.com/allegroai/clearml-agent/blob/176b4a4cdec9c4303a946a82e22a579ae22c3355/docs/clearml.conf#L234
However, SNPE performs quantization with precompiled CLI binary instead of python library (which also needs to be installed). What would be the pipeline in this case?
I would imagine a container with preinstalled SNPE compiler / quantizer, and a python script triggering the process ?
one more question: in case of triggering the quantization process, will it be considered as separate task?
I think this makes sense, since you probably want a container with the SNPE environment, m...
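A rough sketch of what such a task could look like (the binary name and flags below are illustrative placeholders for the actual SNPE quantizer invocation, not verified options):
` import subprocess
from clearml import Task

task = Task.init(project_name="snpe", task_name="quantize model")

# call the precompiled CLI quantizer inside the container (placeholder binary/flags)
result = subprocess.run(
    ["snpe-dlc-quantize", "--input_dlc", "model.dlc", "--output_dlc", "model_quantized.dlc"],
    capture_output=True, text=True, check=True,
)

task.get_logger().report_text(result.stdout)   # keep the CLI output in the task log
task.upload_artifact("quantized_model", artifact_object="model_quantized.dlc") `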
Hmm I see. If this is the case, would it make sense to run the pipeline logic locally? (notice the pipeline compute, i.e. the components, will be running on remote machines with the agents)
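For example, a hedged sketch of that setup; the project, base task name and queue are placeholders, and the step is assumed to be an experiment already registered in the system:
` from clearml import PipelineController

pipe = PipelineController(name="ci pipeline", project="examples", version="0.0.1")
pipe.add_step(
    name="train",
    base_task_project="examples",
    base_task_name="training task",   # an experiment already registered in the UI
    execution_queue="default",        # the agents' queue the step is enqueued to
)

# run the DAG logic in the local process (e.g. the CI machine),
# while the steps themselves execute on remote agents
pipe.start_locally(run_pipeline_steps_locally=False) `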
That would be great! Might have to use
2>/dev/null
in some of my bash scripts
Feel free to test and PR :)
One other question regarding connecting. We have setup sshd inside the docker image we are using.
Actually the remote session opens port 10022 on the host machine (so it does not collide with the default ssh port)
It actually runs an additional sshd inside the docker, setting its port.
And the clearml-session will ssh directly into the container sshd...
Hi GiddyTurkey39
Glad to see that you are already diving into the controllers (the stable release will be out early next week)
A bit of background on how the pipeline controllers are designed:
All steps in the pipeline are experiments already registered in the system (i.e. you can see them in the UI). Regardless of how you created those experiments, they have to be there prior to the pipeline launch. The pipeline itself can be executed on any machine (it does very little, and...
This only talks about bugs reporting and enhancement suggestions
I'll make sure this is fixed 🙂
Let me know if I can be of help 🙂
The issue is upload progress reporting for http uploads (object storage will report upload progress). Basically the http upload is a POST with urllib, which does not support upload callbacks for progress reporting. If you have an idea here, we will gladly add it (as you mentioned, it can be quite annoying to have to open the network manager to verify the upload is progressing)
Hi @<1523701079223570432:profile|ReassuredOwl55>
I want to kick off the pipeline and then check completion
outside
of the pipeline task.
Basically the pipeline is a Task (of a certain type).
You do the "standard" thing: you clone the pipeline Task, you enqueue it, and you wait for its status
task = Task.clone(source_task="<pipeline ID here>")
Task.enqueue(task, queue_name="services")
task.wait_for_status(...)
wdyt?
DefeatedOstrich93 can you verify lightning actually only stored it once?
Hi @<1523701337353621504:profile|FlutteringSheep58>
are you asking how to convert a worker IP into a dns resolved host name ?