JitteryCoyote63
Moderator
214 Questions, 1021 Answers
  Active since 10 January 2023
  Last activity 7 months ago

Reputation

0

Badges 1

979 × Eureka!
0 Votes
18 Answers
1K Views
Hi, I just updated clearml server 1.0 using docker-compose down & docker-compose pull & docker-compose up -d , it worked and it looks amazing! I found two pr...
3 years ago
0 Votes
23 Answers
1K Views
Hi, I started a trains-agent (0.15) in services mode (full command: trains-agent daemon --services-mode --detached --queue services --create-queue --docker u...
4 years ago
0 Votes
12 Answers
928 Views
Hey, would it be possible to add an option to make task.upload_artifact() blocking? (Not running in background)
4 years ago
0 Votes
1 Answers
967 Views
Hi, I have a clearml-agent (1.1.2) in a g4dn.4xlarge AWS instance (with one T4 GPU), that reports agent.cuda_version = 0 agent.cudnn_version = 0 and does not ...
2 years ago
0 Votes
18 Answers
1K Views
Hi, kudos for the 0.15 guys! I am having an issue related to git auth: I have an issue with trains-agent (0.15): it does not use git creds while trying to cl...
4 years ago
0 Votes
20 Answers
1K Views
Is it possible to run an agent that listens to the services queue without using docker?
4 years ago
0 Votes
2 Answers
1K Views
Hi, is it possible to start a clearml-agent (not in docker mode) on a machine with a gpu, but enforce the clearml-agent to not “see” the gpu? So that the exp...
3 years ago
0 Votes
4 Answers
1K Views
2 years ago
0 Votes
1 Answers
982 Views
Hi, is there a way to update the setup shell script via the SDK?
one year ago
0 Votes
30 Answers
1K Views
3 years ago
0 Votes
11 Answers
1K Views
Hi, I have a question regarding the aws-autoscaler: am I understanding correctly that: max_idle_time_min=5 max_spin_up_time_min=10 polling_interval_time_min=...
3 years ago
0 Votes
12 Answers
1K Views
2 years ago
0 Votes
6 Answers
1K Views
3 years ago
0 Votes
30 Answers
981 Views
Could you please explain a bit more how trains adapt the torch version depending on the installed cuda version? Here is my setup: cuda 102 installed and corr...
4 years ago
0 Votes
1 Answers
967 Views
Hi there, is it safe to use ClearML (trains >= 0.17) with the trains ignite handler? Should we wait for the update on their side?
3 years ago
0 Votes
30 Answers
1K Views
Hello, I tried the clearml-session CLI to start a jupyter instance on an agent, but got an error with the password, here is the full CLI log: $ clearml-session -...
3 years ago
0 Votes
30 Answers
1K Views
3 years ago
0 Votes
13 Answers
1K Views
2 years ago
0 Votes
22 Answers
1K Views
Hi there, I used clearml-task to send a script to be executed remotely. When being executed remotely Task.current_task() returns None, how should I get the c...
2 years ago
0 Votes
3 Answers
989 Views
Hi, in the context of multi-gpu training, is Model.get_local_copy() multi-process safe? Or should I make sure only the first process calls it first, then the others?
3 years ago
0 Votes
4 Answers
897 Views
Is there a way to report a simple series with X and Y coords, X and Y being two lists of same length?
4 years ago
0 Votes
2 Answers
1K Views
How can I filter out archived tasks with Task.get_tasks?
3 years ago
0 Votes
13 Answers
985 Views
Hello, in the following context: controller_task = Task.init(...) # This will clone the parent task, enqueue and wait for finished status data_processing_tas...
4 years ago
0 Votes
17 Answers
1K Views
Hello, I am trying to retrieve a simple dict artifact uploaded in a previous task with task.upload_artifact("my_dict", dict(foo="bar")) in a second task. I t...
4 years ago
0 Votes
11 Answers
1K Views
Hi, coming back with the venv caching: with the following setting: I call Task._update_requirements(["."]) setup.py has the following install_requires=["my-p...
3 years ago
0 Votes
12 Answers
1K Views
Hi there! Is there an easy way to retrieve the site-package directory that was created by an agent from inside a task? Eg. task = Task.init(...) task.add_req...
2 years ago
0 Votes
13 Answers
1K Views
Hey there, Is it possible for a clearml pipeline step to log a folder instead of numpy/pickle objects? Looking at the docs, monitor_artifacts could be what I...
2 years ago
0 Votes
5 Answers
1K Views
Hi, I would like to report something else weird in the clearml-agent 1.5.1 running in docker mode: In the logs, when it dumps its config, it writes: docker_c...
one year ago
0 Votes
5 Answers
937 Views
Hi, I have a long running experiment that was running on AWS instance that got killed after ~4 days with the following reason: STATUS REASON: Forced stop (no...
2 years ago
0 Votes
5 Answers
1K Views
3 years ago
0 Hi, I Have Another Problem

I don't know why it didn't detect it in the first place

4 years ago
0 Hi, I Am Giving Another Try To Clearml-Session And I Am Blocked At The Current Error Shown When The Cli Try To Establish The Tunneling:

Sorry, what I meant is that it is not documented anywhere that the agent should run in docker mode, hence my confusion

2 years ago
0 Hi, How Can I Change The Project.Default_Output_Destination? I Tried Setting It To None But It Is Not Updated

Thanks AgitatedDove14! I created a project with a default output destination set to an S3 bucket, but I don't have local access to this bucket (only agents have access to it, for security reasons). Because of that, I cannot create a task in this project programmatically from my local machine, because it tries to access the bucket and fails. And there is no easy way to change the default output location (not in the web UI, not in the SDK)

2 years ago
0 Hello, I Have An Error While Installing Git Dependencies Of Local Package: So Far I Used Task.

yes, the only thing I changed is:
install_requires=[ ... "my-dep @ git+ ]
to:
install_requires=[ ... "git+ "]

3 years ago
0 Hi, I Am Trying To Use Omegaconf With Task.Connect_Configuration And I Get The Following Error:

Hey SuccessfulKoala55, unfortunately this doesn’t work, because the dict contains other dicts, and only the first-level dict becomes a dict; the inner dicts are still ProxyDictPostWrite and will make OmegaConf.create fail
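
One possible workaround, as a minimal sketch only: recursively copy the structure into built-in dicts and lists before passing it to OmegaConf.create. This assumes the nested wrapper objects behave like standard mappings. The helper name to_plain is hypothetical, and FakeProxyDict is only a stand-in dict subclass used for illustration, not the ClearML class.

```python
from collections.abc import Mapping


def to_plain(obj):
    """Recursively copy mappings into dicts and lists/tuples into lists."""
    if isinstance(obj, Mapping):
        return {key: to_plain(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_plain(item) for item in obj]
    return obj  # scalars and other objects are returned unchanged


class FakeProxyDict(dict):
    """Stand-in for a dict-like wrapper type (assumption for this sketch)."""


wrapped = FakeProxyDict(model=FakeProxyDict(lr=0.1, layers=[FakeProxyDict(units=8)]))
plain = to_plain(wrapped)  # every nesting level is now a built-in dict or list
```

The resulting plain object could then be handed to OmegaConf.create, since it no longer contains any wrapper instances at any depth.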

2 years ago
0 Hi, I Have An Agent That Is Running Two Experiments At The Same Time: One That Was Running For A Long Time (11H) And One That The Agent Picked Up Afterwards, While The First One Was Still Running. Context: I Have 3 Agents Up (Not In Docker Mode) And All O

This is consistent: Each time I send a new task on the default queue, if trains-agent-1 has only one task running (the long one), it will pick another one. If I add one more experiment in the queue at that point (trains-agent-1 running two experiments at the same time), that experiment will stay in queue (trains-agent-2 and trains-agent-3 will not pick it because they also are running experiments)

4 years ago
0 Hi Quick Question: Does Task.Connect_Configuration Support Omegaconf Dictconfig Objects? Ie. Can I Do:

Hi CostlyOstrich36 , I am not using Hydra, only OmegaConf, so you mean just calling OmegaConf.load should be enough?

2 years ago
0 Hi, I Am Trying To Use Omegaconf With Task.Connect_Configuration And I Get The Following Error:

with open(path, "r") as stream:
    return yaml.load(stream, Loader=yaml.FullLoader)

2 years ago
0 Hi, I Am Trying To Use Omegaconf With Task.Connect_Configuration And I Get The Following Error:

it would be nice if Task.connect_configuration could support custom yaml file readers for me

2 years ago
0 Hi Guys, I Got A Very Unexpected Error Today On One Of My Agents:

mmmmh I just restarted the experiment and it seems to work now. I am not sure why that happened. According to this SO post, it could be related to the size of the repo. Might it be a good idea to clone with --depth 1 in the agents?
Or more generally, try to catch this error and retry a few times?
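
To make the suggestion concrete, here is a rough sketch of what "shallow clone plus a few retries" could look like. It is not how clearml-agent clones repositories; the helper names retry and shallow_clone, the number of attempts, and the back-off are all assumptions for illustration.

```python
import subprocess
import time


def retry(func, attempts=3, delay_s=1.0, exceptions=(Exception,)):
    """Call func(); on a listed exception, wait and retry up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: let the last error propagate
            time.sleep(delay_s * attempt)  # linear back-off between tries


def shallow_clone(url, dest):
    """Clone only the latest commit; check=True raises CalledProcessError on failure."""
    subprocess.run(["git", "clone", "--depth", "1", url, dest], check=True)
```

A call would then look like retry(lambda: shallow_clone(url, dest), exceptions=(subprocess.CalledProcessError,)), so that only clone failures are retried and any other error surfaces immediately.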

4 years ago
0 Hi There, I Have Several Experiments Hanging/Stuck In The Middle Or At The End Of The Training, With The Last Message Logged Being:

Hi SuccessfulKoala55, I was able to find the issue: I was creating a queue and worker subprocess that were not properly cleaned up

7 months ago
0 Hi Guys, I Got A Very Unexpected Error Today On One Of My Agents:

Unfortunately this is difficult to reproduce... Nevertheless, it is important to me to be robust against it, because if this error happens in a task in the middle of my pipeline, the whole process fails.

This ties into another, wider topic, I think: how to "skip" tasks if they have already run (a mechanism similar to what Luigi, https://luigi.readthedocs.io/en/stable/ , offers). That would allow restarting the pipeline and skipping tasks up to the point where the task failed
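
The idea can be sketched in a few lines of plain Python. This is only an illustration of the skip-if-already-done mechanism, not an existing ClearML feature; run_pipeline and the JSON state file are hypothetical. Luigi decides completeness from each task's output targets, whereas this sketch simply records finished step names.

```python
import json
from pathlib import Path


def run_pipeline(steps, state_file):
    """Run (name, func) steps in order, skipping names already recorded as done."""
    state_path = Path(state_file)
    done = set(json.loads(state_path.read_text())) if state_path.exists() else set()
    executed = []
    for name, func in steps:
        if name in done:
            continue  # finished in a previous run, so skip it
        func()  # an exception here aborts the run; earlier steps stay recorded
        done.add(name)
        state_path.write_text(json.dumps(sorted(done)))  # persist after each step
        executed.append(name)
    return executed
```

Re-running run_pipeline with the same state file after a failure resumes at the step that failed, because every step that completed earlier is already recorded.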

4 years ago
0 Hello, I Would Like To Use Spot Instances Together With The Aws Autoscaler To Train Models With Pytorch/Ignite And I Am Wondering How To Support Interruptions During The Training (In Case The Instance Is Terminated By Aws). Is There Anything Already Built

ClearML has a task.set_initial_iteration, which I used as follows:
checkpoint = torch.load(checkpoint_fp, map_location="cuda:0")
Checkpoint.load_objects(to_load=self.to_save, checkpoint=checkpoint)
task.set_initial_iteration(engine.state.iteration)
But I still get the same issue. I am not sure whether I am using it correctly, or whether it’s a bug, AgitatedDove14? (I am using clearml 1.0.4rc1, clearml-agent 1.0.0)

3 years ago
0 Hi, With Clearml-Agent 1.5.1, I Tried To Run An Experiment Within A Docker With Image Python:3.8 And It Failed Executing The Task While Trying To Call Python3.9. I Am Not Sure Why It'S Using Python3.9, Since The Agent.Default_Python Is 3.8 And The Image Is

Hi SmugDolphin23, thanks for the input! Will try now, but that seems hacky: to have it working I have to specify python3.8 twice:
once in the agent config file (agent.default_python is already python3.8, but seems to be ignored), and once by making sure it is available (using the python:3.8 docker image).
Is there a way to prevent this redundancy? I.e., if I want to change the python version, can I control it from a single place?

one year ago