Hi SweetBadger76 ,
Well - apparently I was mistaken.
I still have a ghost worker that I'm not able to remove (I had 2 workers on the same queue - that caused my confusion).
I can see it in the UI and when I run clearml-agent list
And although I'm stopping the worker specifically with
clearml-agent daemon --stop <worker_id>
I'm getting:
Could not find a running clearml-agent instance with worker_name=<worker_id> worker_id=<worker_id>
Hi AgitatedDove14 ,
I'm having a similar issue.
Also notice the clearml-agent will not change the entry point of the docker, meaning if the entry point does not end with plain bash, it will not actually run anything
Not sure I understand how to run a docker_bash_setup_script and then run a Python script - do you have an example? I could not find one.
Here is our CLI command
clearml-task --name <TASK NAME> \
--project <PRJ NAME> \
--repo git@gi...
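For reference, here is a sketch of what I think the Python-API equivalent would look like - Task.create mirrors the clearml-task CLI, and as I understand it docker_bash_setup_script takes the bash snippet to run inside the docker before the task starts (all repo/queue/docker values below are placeholders, not our real setup):

```python
# A minimal sketch (placeholder values): create a task with a docker setup
# script that runs before the entry point, then enqueue it for an agent.
from clearml import Task

task = Task.create(
    project_name="<PRJ NAME>",
    task_name="<TASK NAME>",
    repo="git@github.com:org/repo.git",   # placeholder repo
    branch="main",
    script="pytorch/main.py",
    docker="nvidia/cuda:11.8.0-runtime-ubuntu22.04",
    # runs inside the container before the task environment is set up
    docker_bash_setup_script="apt-get update && apt-get install -y ffmpeg",
)
Task.enqueue(task, queue_name="default")
```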
ClearML key/secret provided to the agent
When is this provided? Is this during the build?
Is this running from the same Linux user under which you checked the git ssh clone on that machine?
yes
The only thing that could account for this issue is somehow the agent is not getting the right info from the ~/.ssh folder
maybe -
Question - if we change the clearml.conf do we need to stop and start the daemon?
yes - the pre_installations.sh runs and completes - but the pytorch/main.py file doesn't run.
so the Task completes successfully but without running the script
BTW - is CLEARML_HOST_IP relevant for the clearml-agent?
I can see that we can create a worker with this environment variable, e.g.
CLEARML_WORKER_NAME=MY-WORKER CLEARML_WORKER_ID=MY-WORKER:0 CLEARML_HOST_IP=X.X.X.X clearml-agent daemon --detached
My mistake - it doesn't use it to create a dedicated IP
Is the current relationship only available via the _get_parents() method?
AgitatedDove14 -
I'm getting the following error when running the following code within the mp_worker:
import subprocess

command = ["ffmpeg", "-i", url, "-vcodec", "libx264", "output.mp4"]
subprocess.run(command, stderr=subprocess.STDOUT)
TypeError: fork_exec() takes exactly 21 arguments (17 given)
Any suggestions?
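In case it helps anyone hitting the same thing - a diagnostic sketch (an assumption on my side, not a confirmed root cause): a fork_exec() arity TypeError typically shows up when the Python-level subprocess module and the interpreter's _posixsubprocess C module come from different Python installations, so logging both from inside the worker is a quick way to check for an interpreter mismatch:

```python
import subprocess
import sys

def mp_worker(url: str) -> None:  # hypothetical worker signature
    # If these point at different Python installations/versions,
    # that mismatch would explain the fork_exec() TypeError.
    print(sys.executable, sys.version)
    print(subprocess.__file__)
    command = ["ffmpeg", "-i", url, "-vcodec", "libx264", "output.mp4"]
    subprocess.run(command, stderr=subprocess.STDOUT)
```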
CostlyOstrich36 - but we will use any method that allows us to save the files as parquet.
We are not yet using clearml Dataset - I'm not sure if this is a solution
Great - Thx TimelyPenguin76 for your input
google.storage {
    credentials = [
        {
            bucket: "clearml-storage"
            project: "my-project"
            credentials_json: "/path/to/creds.json"
        },
    ]
}
No - just emulating - it is more of /home/... /creds.json
Thx again for your time -
Where the experiment is being executed
Not sure I understand what you mean by this -
Assuming that we are running ClearML on GKE (we have succeeded in doing so) and running the Python code from Colab or locally - where do we configure the Google Storage? How can the Helm / k8s deployment dynamically load the clearml.conf? Is it only from values.yaml?
Where you view your experiment
In MLflow I was able to view the artifact directly (a...
Well, it seems that we have a similar issue: https://github.com/allegroai/clearml-agent/issues/86
We are not able to reference this orphan worker (it does not show up with ps -ef | grep clearml-agent)
but it still appears with clearml-agent list
and we are not able to stop it with clearml-agent daemon --stop clearml-server-agent-group-cpu-agent-5df4476cfc-j54gh:0
getting:
Could not find a running clearml-agent instance with worker_name=clearml-server-agent-group-cpu-agent-5df4476cfc-j54gh:0 wo...
Thx CostlyOstrich36 for your reply
I can't see the reference to parquet. We are currently using the above functionality, but the pd.DataFrame is only saved as csv compressed by gz
Looking in the repo I was not able to see an example - only a reference to https://github.com/allegroai/clearml/blob/b9b0a506f35a414f6a9c2da7748f3ec3445b7d2d/docs/clearml.conf#L13 - do I just need to add company.id or user.id in the credentials dict?
Using the https://allegro.ai/clearml/docs/rst/references/clearml_python_ref/task_module/task_task.html?highlight=upload_artifact#clearml.task.Task.upload_artifact method. It works well, but it only saves the DataFrame as a CSV (which is very problematic, since when loading the artifact none of the column data types are preserved...)
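A workaround sketch, assuming upload_artifact stores a file passed by path as-is (file names here are placeholders): write the parquet file ourselves and upload the path, so both the format and the column dtypes survive the round trip.

```python
import pandas as pd
from clearml import Task

task = Task.current_task()  # assumes a task was already initialized via Task.init
df = pd.DataFrame({"a": [1, 2], "b": [0.1, 0.2]})

# Serialize to parquet ourselves; uploading a file path (rather than the
# DataFrame object) should store the file verbatim, preserving dtypes.
df.to_parquet("metadata.parquet")
task.upload_artifact(name="metadata", artifact_object="metadata.parquet")

# Round trip on the consumer side:
# local_path = Task.get_task(task_id=task.id).artifacts["metadata"].get_local_copy()
# df2 = pd.read_parquet(local_path)
```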
Dataset.list_datasets(dataset_project='XXXX')
Always returns an empty list
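One thing worth checking (an assumption on my side, if I read the SDK correctly): list_datasets filters to completed datasets by default, so datasets that were never finalized won't be returned.

```python
from clearml import Dataset

# only_completed defaults to True; flipping it may surface datasets
# that were never finalized with dataset.finalize()
datasets = Dataset.list_datasets(
    dataset_project="XXXX",
    only_completed=False,
)
print(datasets)
```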
Just for the record - for whoever will be searching for a similar setup with Colab:
Prerequisites:
1. Create a dedicated Service Account (I was not able to authenticate with regular user credentials, only with an SA)
2. Get the SA key (credentials.json)
3. Upload the JSON to an ephemeral location (e.g. the root of the Colab runtime)
4. Log into the ClearML Web UI and create an access key for the user - https://clear.ml/docs/latest/docs/webapp/webapp_profile#creating-clearml-credentials
5. Prepare credentials:
%%bash
export api=`ca...
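To make the last step concrete, here is a minimal sketch of the idea (server URLs, keys and paths are all placeholders): write a clearml.conf into the Colab runtime so the SDK picks up both the ClearML credentials and the GCS bucket.

```python
# A minimal sketch (all values are placeholders): write ~/clearml.conf in the
# Colab runtime so the SDK finds the ClearML credentials and the GCS bucket.
from pathlib import Path

conf = """
api {
    web_server: https://app.clear.ml
    api_server: https://api.clear.ml
    files_server: https://files.clear.ml
    credentials {
        access_key: "<ACCESS_KEY>"
        secret_key: "<SECRET_KEY>"
    }
}
sdk.google.storage {
    credentials = [
        {
            bucket: "clearml-storage"
            project: "my-project"
            credentials_json: "/content/credentials.json"
        },
    ]
}
"""
Path.home().joinpath("clearml.conf").write_text(conf)
```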
Here is the screenshot - we deleted all the workers, except for the one that we couldn't
shape -> tuple([int],[int])
I decided to use
._task.upload_artifact(name='metadata', artifact_object=metadata)
where metadata is a dict:
metadata = {**metadata, **{"name": f"{Path(file_tmp_path).name}", "shape": f"{df.shape}"}}
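For completeness, a self-contained version of the same idea (project/task names, file_tmp_path and df are stand-ins for our real objects):

```python
from pathlib import Path

import pandas as pd
from clearml import Task

task = Task.init(project_name="demo", task_name="metadata-upload")  # placeholder names

file_tmp_path = "data.csv"  # stand-in for our temp file
df = pd.read_csv(file_tmp_path)

# Attach a small metadata dict as an artifact next to the data itself
metadata = {
    "name": Path(file_tmp_path).name,
    "shape": f"{df.shape}",  # e.g. "(100, 4)"
}
task.upload_artifact(name="metadata", artifact_object=metadata)
```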