JitteryCoyote63
Moderator
215 Questions, 1023 Answers
  Active since 10 January 2023
  Last activity 3 months ago

Reputation

0

Badges 1

981 × Eureka!
0 Votes 2 Answers 2K Views
Hi, where can I find the logs of trains-agent by default?
5 years ago
0 Votes 3 Answers 2K Views
Hi, in a subproject, would it be possible to hide the parent project if it is empty?
3 years ago
0 Votes 2 Answers 2K Views
Hi, is it possible to start a clearml-agent (not in docker mode) on a machine with a gpu, but enforce the clearml-agent to not “see” the gpu? So that the exp...
4 years ago
0 Votes 2 Answers 1K Views
4 years ago
0 Votes 20 Answers 2K Views
Is it possible to run an agent, listen to the services queue without using docker?
5 years ago
0 Votes 5 Answers 2K Views
How can I do the following? (basically, filtering by task type) Task.get_tasks(project_name="my-project", task_name="my-task", task_filter=dict(type="trainin...
5 years ago
0 Votes 2 Answers 2K Views
First link in hyperparameter optimization page is broken > https://allegro.ai/docs/examples/examples_hyperparam_opt/
5 years ago
0 Votes 6 Answers 2K Views
Hi, how does agent.enable_git_ask_pass work? I am using the clearml-agent in docker mode and my experiment is stuck at downloading a private dependency: Clo...
2 years ago
0 Votes 2 Answers 2K Views
Looks like trains-agent 0.16 doesn't support --install-globally documented parameter -> Only available for trains-agent build command. Would it be possible t...
5 years ago
0 Votes 5 Answers 2K Views
Hi, I would like to use pytorch3d==0.5.0 with torch==1.9.1 on cuda version 110, locally it works, but the clearml agent fails setting up the environment with...
4 years ago
0 Votes 1 Answer 2K Views
4 years ago
0 Votes 3 Answers 2K Views
Hi, I see that there is a new parameter in aws autoscaler: max_spin_up_time_min - What is the difference with max_idle_time_min ?
aws
4 years ago
0 Votes 30 Answers 2K Views
Hi, in one of my agents with CUDA Version: 11.1 (from nvidia-smi), clearml agent 0.17.1 detects version 100 (I can see from experiments logs: agent.cuda_vers...
4 years ago
0 Votes 1 Answer 2K Views
Hey, just wanted to mention: in the docs, Task.get_parameter does not say: Different sections with key prefix "section/", as Task.get_parameters does. Also there ...
5 years ago
0 Votes 12 Answers 2K Views
Hi, where can I find the server parameter to control when the server is unregistering an agent after not receiving updates? Currently it's quite long (30mins...
2 years ago
0 Votes 18 Answers 2K Views
Hi guys, the following error has now popped up several times in agents while executing a task: trains_agent: ERROR: Failed applying git diff: I attached the ...
4 years ago
0 Votes 5 Answers 2K Views
Hi, from within an experiment, how can I intercept the signal that the experiment was aborted and execute a cleanup function? I tried to intercept SIGINT and...
3 years ago
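For the question above about running cleanup when an experiment is aborted, here is a minimal sketch using only Python's standard `signal` module. It assumes the process actually receives SIGINT or SIGTERM on abort, which this listing does not confirm. `cleanup`, `handle_stop`, `install_handlers` and `cleanup_calls` are hypothetical names chosen for illustration.

```python
import signal
import sys

# Records cleanup invocations so the handler runs the cleanup only once.
cleanup_calls = []


def cleanup():
    """Hypothetical cleanup hook: release resources, flush files, etc."""
    cleanup_calls.append("cleaned")


def handle_stop(signum, frame):
    """Run cleanup once, then exit with the conventional 128 + signum code."""
    if not cleanup_calls:
        cleanup()
    sys.exit(128 + signum)


def install_handlers():
    """Register the handler for SIGINT and SIGTERM.

    Must be called from the main thread. Whether an abort is delivered
    as one of these signals is an assumption, not a documented fact here.
    """
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, handle_stop)
```

Calling `install_handlers()` near the start of the training script would be enough to arm it. If the abort arrives through another mechanism, such as SIGKILL, a handler like this never runs.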
0 Votes 12 Answers 2K Views
Hi, I deleted some archived experiments in clearml server 1.0 and the popup in the dashboard showed “the following artifacts were not deleted”, with a list o...
4 years ago
0 Votes 1 Answer 2K Views
Hi there, is it safe to use ClearML (trains >= 0.17) with the trains ignite handler? Should we wait for the update on their side?
4 years ago
0 Votes 3 Answers 2K Views
Hey there, I see that in the autoscaler configuration, the queues param accepts dictionaries with values of type list of lists (see e.g. below). What does it me...
4 years ago
0 Votes 2 Answers 2K Views
Hi, Is it still true that --services-mode only supports docker mode?
4 years ago
0 Votes 30 Answers 2K Views
Hi, is it possible to pass environment variables to agents created by the AWS AutoScaler service?
4 years ago
0 Votes 30 Answers 2K Views
Hi, I just updated clearml-server to 1.1.0 and got the following error when starting it with docker-compose: clearml-apiserver | [2021-08-02 13:37:09,852] [8...
4 years ago
0 Votes 1 Answer 2K Views
Hi, how can I easily start a shell script from within an experiment and have its logs (stdin/err) logged in clearml?
3 years ago
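One way to approach the shell-script question above is to launch the script with `subprocess` and echo its output line by line through `print()`. This is only a sketch. It assumes the experiment's own stdout is what ends up in the ClearML console log (the `sdk.development.worker.log_stdout` setting quoted elsewhere on this page hints at that, but it is not confirmed here). `run_and_stream` is a hypothetical helper name.

```python
import subprocess
import sys


def run_and_stream(cmd):
    """Run cmd, echo every output line via print(), return (exit_code, lines).

    stderr is merged into stdout so both streams go through the same path.
    """
    lines = []
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        text=True,
        bufsize=1,  # line-buffered reads on our side
    ) as proc:
        for line in proc.stdout:
            line = line.rstrip("\n")
            print(line)  # re-emit on the experiment's own stdout
            lines.append(line)
    return proc.returncode, lines


# Example call (a child Python process stands in for the shell script):
# run_and_stream([sys.executable, "-c", "print('hello from child')"])
```

For an actual script the call would look like `run_and_stream(["bash", "my_script.sh"])`, where `my_script.sh` is a placeholder. Because the two streams are merged, the relative order of stdout and stderr lines depends on the child's buffering.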
0 Votes 7 Answers 2K Views
Hi, is there a way to get some stats about the use of workers? I would like to know, over the past 3 months: Number of training hours per user Number of trai...
4 years ago
0 Votes 4 Answers 2K Views
Hi, are the experiments logs stored in s3 or in the trains-server? (When using s3 as artifact storage)
4 years ago
0 Votes 30 Answers 2K Views
Could you please explain a bit more how trains adapt the torch version depending on the installed cuda version? Here is my setup: cuda 102 installed and corr...
5 years ago
0 Votes 1 Answer 2K Views
Hi, would it be possible to parse torch requirement when it’s part of the extras_require dict? In my code, I have the following: train_task._update_requireme...
4 years ago
0 Votes 5 Answers 2K Views
Hi, I have a long running experiment that was running on AWS instance that got killed after ~4 days with the following reason: STATUS REASON: Forced stop (no...
3 years ago
0 Votes 0 Answers 1K Views
(sorry I pinned the message accidentally 😅 )
5 years ago
0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

I’ve reindexed the data for the logs. The mappings are now correct, but I am missing one month of data, and I have literally no idea where this data is or how it disappeared

4 years ago
0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

Here are the logs of the agent :)
` (base) user@worker:~$ tail -f /tmp/.clearml_agent_daemon_outjdups8t2.txt
sdk.development.worker.log_stdout = true
sdk.development.worker.report_global_mem_used = false

+----------------------------------+--------+-------+
| id | name | tags |
+----------------------------------+--------+-------+
| 54e4a62a402d5135612ba7b12cfe4e57 | docker | |
+----------------------------------+--------+-------+

Starting infinite tas...

4 years ago
0 Hi There

btw task._get_task_property('hyperparams') also gives me ValueError: Task has no hyperparams section defined

5 years ago
0 Hi, Kudos For The 0.15 Guys! I Am Having An Issue Related To Git Auth: I Have An Issue With Trains-Agent (0.15): It Does Not Use Git Creds While Trying To Clone A Private Repo:

(I didn't have this problem so far because I was using ssh keys globally, but I now want to switch to git auth using a Personal Access Token for security reasons)

5 years ago
0 Hi, If I Am Starting My Training With The Following Command:

btw in the pytorch_distributed_example I see that you average_gradients, but the PyTorch overview at https://pytorch.org/tutorials/beginner/dist_overview.html says:
DDP takes care of gradient communication to keep model replicas synchronized and overlaps it with the gradient computations to speed up training.

3 years ago
0 Hello, ~3 Months Ago I Created A Trains-Server In A Machine With 30Gb Of Disk Space. Today I Wasn't Able To Connect To Trains-Server, So I Checked The Server And Found That The Disk Was Full. I Ran:

Stopping the server, editing the docker-compose.yml file to add the logging section to all services, then restarting the server: docker-compose freed 10 GB of logs

4 years ago
0 Hi, Some Properties Of The Task Object Are Not Listed In The Documentation (Such As Task.Parent, Which Is Not Clear Whether It Is The Parent Task Object Itself Or The Id Of The Parent Task).

The parent task is a data_processing task, so I retrieve it in order to then do data_processed = parent_task.artifacts["data_processed"]

5 years ago
0 Hello, I Am Trying To Retrieve A Simple Dict Artifact Uploaded In A Previous Task With

and saved locally, which is why the second task, not executed in the same machine, cannot access the file

5 years ago
2 years ago
0 Hi There, It Seems Like There Is A Bug With The Visualization Of Debug Samples On The Ui (Server V1.2.0, Self-Hosted): When Clicking On A Debug Sample Then On The Download Button, If The Sample Is Stored In S3, The Download Button Opens A Blank Page With

Sure, it’s because of a very annoying bug that I shared in this thread: https://clearml.slack.com/archives/CTK20V944/p1648647503942759 , and that I haven't been able to solve so far.

I’m not sure you can downgrade that easily ...

Yea that’s what I thought, that’s a bit of a pain for me now, I hope I can find a way to fix the bug somehow

3 years ago
0 Hi Again, Is There A Way To Pass Secrets As Parameters Of A Task? I Have An Experiment That Requires Connecting To A Database, And I Need To Be Able To Pass The Creds As Task Params (Or In Another Way, I Don't Know Yet). But I Don't Want To Expose My Cred

Thanks for your input TenseOstrich47 , I was considering using a secret manager now, I guess that's the best option. I can move the secrets wherever I need them to be to make it work 🙂

4 years ago
0 Hi, With Clearml-Agent 1.5.1, I Tried To Run An Experiment Within A Docker With Image Python3:8 And It Failed Executing The Task While Trying To Call Python3.9. I Am Not Sure Why It's Using Python3.9, Since The Agent.Default_Python Is 3.8 And The Image Is

Hi SmugDolphin23 thanks for the input! Will try now but that seems hacky: to have it working I have to specify python3.8 twice:
once in the agent config file (agent.default_python is already python3.8, but seems to be ignored), and once more by making sure it is available (using the python:3.8 docker image). Is there a way to prevent this redundancy? I.e., if I want to change the python version, can I control it from a single place?

2 years ago
0 Hi, I Would Like To Follow-Up In This

Another error that just popped up:

3 years ago
0 Hi, I Deleted Some Archived Experiments In Clearml Server 1.0 And The Popup In The Dashboard Showed “The Following Artifacts Were Not Deleted”, With A List Of Files That Are Under

SuccessfulKoala55 They do have the right filepath, eg:
https://***.com:8081/my-project-name/experiment_name.b1fd9df5f4d7488f96d928e9a3ab7ad4/metrics/metric_name/predictions/sample_00000001.png

4 years ago
0 Hi, Although

SuccessfulKoala55 I can try to make one, let’s see 🙂

4 years ago
0 Hi Guys For The Aws Auto-Scaler I Need To Access Aws Ssm Or Create .Env File Locally When Using The Init Script. Has Anyone Done This?

Try to spin up the instance of that type manually in that region to see if it is available

4 years ago
0 Got Some Errors While Running Migration Script From Es5 To Es7:

Thanks! Unfortunately still not working, here is the log file:

5 years ago
0 Hi, Although

Add carriage return flush support using the sdk.development.worker.console_cr_flush_period configuration setting (GitHub trains Issue 181)

4 years ago
0 Hi Clearml Team Members! Is There Any Progress Made On The Clearml-Serving Repo? I’d Love To Start Using It But I Lack A Straightforward Get Started Example. My Use Case Is The Following:

Hi AgitatedDove14 , that’s super exciting news! 🤩 🚀
Regarding the two outstanding points:
In my case, I’d maintain a client python package that takes care of the pre/post processing of each request, so that I only send the raw data to the inference service and I post process the raw output of the model returned by the inference service. But I understand why it might be desirable for the users to have these steps happening on the server. What is challenging in this context? Defining how t...

3 years ago
2 years ago
0 Hi, I Deleted All Archived Experiments In A Project And I Just Realized All Experiments Of All Projects Were Deleted (Clearml Server V1.0.0)

And now that I restarted the server and went back into the project where I initially deleted the archived experiments, some of them are still there - I will leave them alone, too scared to do anything now 😄

4 years ago
0 Hey, I Have A Problem With The Following Task:

AgitatedDove14 So what you are saying is that since I have trains-server 0.16.1, I should use trains>=0.16.1? And what about trains-agent? Only version 0.16 is released atm, this is the one I use

5 years ago
0 Hi, How Can I Get The Logs From The Pytorch Ignite Early Stopping Handler To Be Logged In Clearml?

AgitatedDove14 yes but I don't see in the docs how to attach it to the logger of the earlystopping handler

4 years ago