Hi SmugDolphin23
Do you have a timeline for fixing this? https://clearml.slack.com/archives/CTK20V944/p1661260956007059?thread_ts=1661256295.774349&cid=CTK20V944
BTW - is the `CLEARML_HOST_IP` environment variable relevant for the `clearml-agent`?
I can see that we can create a worker with this environment variable, e.g. `CLEARML_WORKER_NAME=MY-WORKER CLEARML_WORKER_ID=MY-WORKER:0 CLEARML_HOST_IP=X.X.X.X clearml-agent daemon --detached`
My mistake - it doesn't use it to create a dedicated IP.
We need to convert it to a DataFrame, since otherwise we get: `Displaying metadata in the UI is only supported for pandas Dataframes for now. Skipping!`
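i.e. a minimal sketch of what we do now (the dataset id and the metadata values are placeholders):
```python
import pandas as pd
from clearml import Dataset

# Placeholder metadata; the UI only renders pandas DataFrames
metadata = {"split": ["train", "val"], "num_samples": [8000, 2000]}

ds = Dataset.get(dataset_id="<dataset-id>")  # placeholder id
ds.set_metadata(pd.DataFrame(metadata))      # convert before attaching
```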
upgrading to 1.12.1 didn't help
I think the issue is that when I create the dataset I used `use_current_task=True`. If I change it to `use_current_task=False`, then it finalizes.
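For reference, a minimal sketch of the flow that finalizes cleanly (dataset name, project, and folder are placeholders):
```python
from clearml import Dataset

# Create the dataset as its own task rather than attaching it to the
# currently running Task, so finalize() isn't blocked by the parent task
ds = Dataset.create(
    dataset_name="my_dataset",     # placeholder
    dataset_project="my_project",  # placeholder
    use_current_task=False,
)
ds.add_files("data/")              # placeholder local folder
ds.upload()
ds.finalize()
```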
Thx for your reply
Here is the screenshot - we deleted all the workers - except for the one that we couldn't
Hi SweetBadger76 -
Am I misunderstanding how this test worker runs?
clearml-3.5.0
Sorry - I'm a Helm newbie
When running `helm search repo clearml --versions` I can't see version 3.6.2 - the highest is 3.5.0. This is the repo that we used to get the helm chart: `helm repo add allegroai`. What am I missing?
Hi AgitatedDove14
OK - the issue was the firewall rules that we had.
Now both the jupyter lab and vscode servers are up.
But now there is an issue with the `Setting up connection to remote session` step. After the
```
Environment setup completed successfully
Starting Task Execution:
ClearML results page:
```
lines there is a WARNING:
```
clearml - WARNING - Could not retrieve remote configuration named 'SSH'...
```
AgitatedDove14 -
I'm getting the following error when running the following code within the mp_worker:
```python
import subprocess

# Transcode the input URL to H.264 with ffmpeg
command = ["ffmpeg", "-i", f"{url}", "-vcodec", "libx264", "output.mp4"]
subprocess.run(command, stderr=subprocess.STDOUT)
```
```
TypeError: fork_exec() takes exactly 21 arguments (17 given)
```
Any suggestions?
```
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.4 LTS
Release:        20.04
Codename:       focal
```
Are we supposed to use the "Extra Configurations" from the https://clear.ml/docs/latest/assets/images/ClearML_Server_Diagram-7ea19db8e22a7737f062cce207befe38.png ?
https://docs.google.com/drawings/d/11f-AWVmIq7P0e8bP5OnMUz0hguXm2T_Xqq7iNMA-ANA/edit?usp=sharing
Hi,
you will have to configure the credentials there (in a local `clearml.conf`, or using environment variables)
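If it's easier, there is also a programmatic route - a minimal sketch, where every host/key value below is a placeholder:
```python
from clearml import Task

# Programmatic alternative to a local clearml.conf or the
# CLEARML_API_* environment variables; all values are placeholders
Task.set_credentials(
    api_host="https://api.clear.ml",
    web_host="https://app.clear.ml",
    files_host="https://files.clear.ml",
    key="<access_key>",
    secret="<secret_key>",
)
```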
This is the part that confuses me - is there a way to configure `clearml.conf` from the `values.yaml`? I would like GKE to load the cluster with the correct credentials without logging into the pods and manually updating the `clearml.conf` file
Feeling that we are nearly there ....
One more question - is there a way to configure ClearML to store all the artifacts, the Plots, etc. in a bucket instead of manually uploading/downloading the artifacts from within the client's code? Specifying the `output_uri` in `Task.init` saves the checkpoints, what about the rest of the outputs?
https://clear.ml/docs/latest/docs/faq#git-and-storage
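For reference, a minimal sketch of what we do today with `output_uri` (project, task name, and bucket URI are placeholders):
```python
from clearml import Task

# Send models/artifacts/checkpoints the task registers to a bucket;
# the URI below is a placeholder (gs:// and azure:// work similarly)
task = Task.init(
    project_name="my_project",           # placeholder
    task_name="my_experiment",           # placeholder
    output_uri="s3://my-bucket/clearml",
)
```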
Hi SweetBadger76
Further investigation showed that the worker was created with a dedicated `CLEARML_HOST_IP` - so running `clearml-agent daemon --stop` didn't kill it (but it did appear in the `clearml-agent list`). But once we added the `CLEARML_HOST_IP`:
```
CLEARML_HOST_IP=X.X.X.X clearml-agent daemon --stop
```
it finally killed it
using the helm charts: https://github.com/allegroai/clearml-helm-charts
we want to use the dataset output_uri as a common ground to create additional dataset formats such as https://webdataset.github.io/webdataset/
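i.e. something along these lines - a rough sketch where the dataset name/project are placeholders and the actual conversion step is left out:
```python
from clearml import Dataset

# Pull a finalized dataset to a local cache, then repack it into
# another format (e.g. webdataset shards); names are placeholders
ds = Dataset.get(dataset_name="my_dataset", dataset_project="my_project")
local_root = ds.get_local_copy()  # read-only local copy of the files
# ... iterate over files under local_root and write the new format ...
```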
Looking in the repo I was not able to see an example - only a reference to https://github.com/allegroai/clearml/blob/b9b0a506f35a414f6a9c2da7748f3ec3445b7d2d/docs/clearml.conf#L13 - do I just need to add `company.id` or `user.id` in the credential dict?
Well it seems that we have a similar issue: https://github.com/allegroai/clearml-agent/issues/86
currently we are just creating a new worker on a separate queue
We reinstalled the clearml-agent:
```
$ clearml-agent --version
CLEARML-AGENT version 1.2.3
```
Running `top | grep clearml` we can see the agent running. Running `clearml-agent list` we can see 2 workers. Before running `clearml-agent daemon --stop` we updated the `clearml.conf` with the `worker_id` and `worker_name` taken from `clearml-agent list`, and we get:
```
Could not find a running clearml-agent instance with worker_name=<clearml_worker_na...
```
Hi,
Thx for your response,
Yes - we are using the above repo.
We would like to have easy/cheaper access to the artifacts etc. that the experiments will output
But this is not on the pods, is it? We're talking about the Python code running from Colab or locally...?
Correct - but where is the `clearml.conf` file?
Ok - I can see that if I run `finalize(auto_upload=True)` on the dataset I get all the information in the UI. Why is this necessary?
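For context, this is the flow I'm referring to - a minimal sketch (names and folder are placeholders):
```python
from clearml import Dataset

ds = Dataset.create(dataset_name="my_dataset",     # placeholder
                    dataset_project="my_project")  # placeholder
ds.add_files("data/")                              # placeholder folder

# One-call form: upload the pending files, then close the dataset
ds.finalize(auto_upload=True)
# Roughly equivalent to the two-step form:
#   ds.upload()
#   ds.finalize()
```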