JitteryCoyote63
Moderator
215 Questions, 1023 Answers
  Active since 10 January 2023
  Last activity 3 months ago

Reputation

0

Badges 1

981 × Eureka!
0 Votes 2 Answers 1K Views
Hi there, I have several experiments hanging/stuck in the middle or at the end of the training, with the last message logged being: train INFO: Engine run co...
one year ago
0 Votes 5 Answers 2K Views
Hi, I am using clearml with pytorch-ignite and its EarlyStopping handler. I would like to log the counter of the patience of this handler, how can I do that?
4 years ago
0 Votes 4 Answers 2K Views
The “Manage queue” option in the right tab on a queued experiment is broken in v1.0 (it does nothing)
4 years ago
0 Votes 20 Answers 2K Views
Is it possible to run an agent that listens to the services queue without using docker?
5 years ago
0 Votes 13 Answers 2K Views
4 years ago
0 Votes 2 Answers 2K Views
Is there an option to make trains-agent create experiment virtualenvs with --system-site-packages parameter?
5 years ago
0 Votes 17 Answers 2K Views
4 years ago
0 Votes 2 Answers 2K Views
Another one: What is the difference between Task.connect() and Task.set_parameter?
5 years ago
0 Votes 3 Answers 2K Views
Hey! Would it be possible to tag the RC releases in the different repos? So that one knows what is inside?
5 years ago
0 Votes 1 Answer 2K Views
Hi, would it be possible to parse torch requirement when it’s part of the extras_require dict? In my code, I have the following: train_task._update_requireme...
4 years ago
0 Votes 6 Answers 2K Views
Hi, I cannot manage to start trains-server 0.16 with the docker-compose file, the trains-elastic container fails with the following error:
5 years ago
0 Votes 18 Answers 2K Views
Hi guys, several times now I have had the following errors popping up in agents while executing a task: trains_agent: ERROR: Failed applying git diff: I attached the ...
4 years ago
0 Votes 1 Answer 2K Views
Hi there, is it safe to use ClearML (trains >= 0.17) with the trains ignite handler? Should we wait for the update on their side?
4 years ago
0 Votes 30 Answers 2K Views
Hi guys, with the new venv caching available in clearml, I have the following problem: I force my pip requirements to be: torch==1.7.1 pytorch-ignite clearml...
4 years ago
0 Votes 2 Answers 2K Views
Congrats on the clearml-serving 0.9.0 release! I’ll try it for sure!
3 years ago
0 Votes 2 Answers 2K Views
Hi, how can I search an old experiment based on its commit hash?
2 years ago
0 Votes 4 Answers 2K Views
Hey there, happy new year to all of you 🍾 I have several tasks that are stuck while training a model with pytorch/ignite, more precisely right after uploadi...
4 years ago
0 Votes 3 Answers 2K Views
2 years ago
0 Votes 1 Answer 2K Views
Hey, just wanted to mention: in the docs, Task.get_parameter does not say: Different sections with key prefix "section/", as Task.get_parameters does. Also there ...
5 years ago
0 Votes 13 Answers 3K Views
Hi, I am trying to use the clearml-agent in docker mode to run an experiment, but it seems to fail passing the clearml.conf file to the docker container: Exe...
2 years ago
0 Votes 20 Answers 2K Views
Hello, I have an error while installing git dependencies of a local package: So far I used task._update_requirements("[.]") with my local package referencing ...
4 years ago
0 Votes 2 Answers 2K Views
Hi, Is it still true that --services-mode only supports docker mode?
4 years ago
0 Votes 3 Answers 2K Views
Hi, I am considering making automated backups of my clearml-server using Amazon EBS snapshots. Should I be concerned with the same problem described here > h...
4 years ago
0 Votes 9 Answers 2K Views
Hi, I want to upgrade clearml server from 1.1 to 1.2 (self hosted). I have the following setup: /dev/nvme0n1p1 30G 21G 8.9G 70% / <- This is where /opt/clear...
3 years ago
0 Votes 2 Answers 2K Views
Hi, I recently updated my clearml to 1.1.2 and a code that was working before now behaves completely differently: I am using the following to log debug sampl...
4 years ago
0 Votes 11 Answers 2K Views
Hi, I have a question regarding the aws-autoscaler: am I understanding correctly that: max_idle_time_min=5 max_spin_up_time_min=10 polling_interval_time_min=...
4 years ago
0 Votes 3 Answers 2K Views
Hi, I have several long running experiments failing with Process failed, exit code -9 and no other error with clearml 1.0.4 and clearml-agent 1.0.0, what cou...
4 years ago
0 Votes 2 Answers 1K Views
Hi all, how can I have a global variable used in a pipeline step? I have to define them in each pipeline step, otherwise they are not included in the pipelin...
one year ago
0 Votes 2 Answers 1K Views
Hi, in the Metric Snapshot graph, is it possible to scale the Y axis to [y_min * 0.9, y_max * 1.1]? Currently all my values are flat at the same ~y and it is...
3 years ago
0 Votes 30 Answers 2K Views
Hi, Together with ElegantKangaroo44 we found two unexpected behaviors in task.models['output'] : The input model of the task is included in the list The best...
5 years ago
0 Hi, One More Question: When Creating A Task With Task.Init(), We Can Specify The

I still don't see why you would change the type of the cloned Task, I'm assuming the original Task had the correct type, no?

Because it is easier for me that I create a training task out of the controller task by cloning it (so that parameters are prefilled and I can set the parent task id)

5 years ago
0 Could You Please Explain A Bit More How Trains Adapt The Torch Version Depending On The Installed Cuda Version? Here Is My Setup:

What happens is a different error, but it was so weird that I thought it was related to the installed version

5 years ago
0 Could You Please Explain A Bit More How Trains Adapt The Torch Version Depending On The Installed Cuda Version? Here Is My Setup:

agent.package_manager.type = pip
...
Using base prefix '/home/machine1/miniconda3/envs/py36'
New python executable in /home/machine1/.trains/venvs-builds/3.6/bin/python3.6
Also creating executable in /home/machine1/.trains/venvs-builds/3.6/bin/python
Installing setuptools, pip, wheel...

5 years ago
0 Hi, Together With

The experiment finished completely this time again

5 years ago
0 Hi There

Yes, this is correct. I am trying to create a minimal reproducible example

5 years ago
0 Hey, I Hope This Is The Right Place To Ask. We're A Small Data Science Team That Wants To Log Everything About Our Ml Models. Looking Around On The Internet, Mostly Mlflow Is Being Recommended, But Occasionally The Name Trains Pop-Ups. According To You,

I would let the trains team answer this in detail, but as a user moving from MLflow to trains, I can share the following insights:

MLflow and trains overlap when it comes to having a system with a nice web UI to compare/log experiments/models/metrics. But MLflow lacks a crucial feature IMO, which is ML/DevOps: using MLflow, you will have to take care of the whole maintenance of your machines, design interactions between them, etc. This is where trains shines, it provides these features out-of-t...

5 years ago
0 Hey There, Since A Bit I Often Find Experiments Being Stuck While Training A Model. It Seems To Happen Randomly And I Could Not Find A Reproducible Scenario So Far, But It Happens Often Enough To Be Annoying (I'd Say 1 Out Of 5 Experiments). The Symptoms

You mean you "aborted the task" from the UI?

Yes exactly

I'm assuming from the leftover processes?

Most likely yes, but I don't see how clearml would have an impact here, I am more inclined to think it would be a pytorch dataloader issue, although I don't see why

From the log I see the agent is running in venv mode
Hmm please try with the latest clearml-agent (the others should not have any effect)

yes in venv mode, I'll try with the latest version as well

2 years ago
0 Hi, I Am Getting The Following Errors In The Experiments I Am Currently Running:

can it be that the merge op takes so much filesystem cache that the rest of the system becomes unresponsive?

4 years ago
0 Hi, In One Of My Agents With Cuda Version: 11.1 (From Nvidia-Smi), Clearml Agent 0.17.1 Detects Version 100 (I Can See From Experiments Logs:

I am running on bare metal, and cuda seems to be installed at /usr/lib/x86_64-linux-gnu/libcuda.so.460.39

4 years ago
0 Hi, If I Am Starting My Training With The Following Command:

AgitatedDove14 I think it’s on me to take the pytorch distributed example in the clearml repo and try to reproduce the bug, then pass it over to you 🙂

3 years ago
0 Hi There,

clearml doesn't change the matplotlib backend under the hood, right? Just making sure 😄

2 years ago
0 Hi There

Yes, in the Task being executed in the agents, I have:
from trains import Task
task = Task.init(...)
task.get_logger().report_text(str(task.get_parameters()))

5 years ago
0 Hi Guys, I Had Several Times Now The Following Errors Popping In Agents While Executing A Task:

yes, here is the error (the space at the end of the line is there)
` Applying uncommitted changes
Executing: ('git', 'apply'): b'error: corrupt patch at line 13\n'
Failed applying diff
trains_agent: ERROR: Failed applying git diff:
diff --git a/configs/2.2.2_from_scratch.yaml b/configs/2.2.2_from_scratch.yaml
index 9fece48..5816f78 100644
--- a/configs/2.2.2_from_scratch.yaml
+++ b/configs/2.2.2_from_scratch.yaml
@@ -136,7 +136,7 @@ data_processing:
optimizer:
type: 'RMSprop'
args:
- lr: 2.5e...
4 years ago
0 Hi Guys, Is A Task Updating Its Status To 'Complete' Before Finishing To Upload Its Artifacts/Metrics In The Background?

No, I want to launch the second step after the first one is finished and all its artifacts are uploaded

5 years ago
0 Hey, Often I Want To Compare Scalars Of Two Experiments With The Same Name But With Different Tags. In The Scalars Comparison Tab, I Cannot See Which Experiment Is Which Because I Don’t See The Tags. Usually, I Rename The Experiments So That I Can Identif

Usually one or two tags. Indeed, task IDs are not so convenient, but only because they are not displayed on the page, so I have to go back to another page to check the ID of each experiment. Maybe just showing the ID of each experiment on the SCALARS page would already be great, wdyt?

3 years ago
0 I Am Wondering Is It Possible To Schedule A Task To Run At Certain Time In Periodic Fashion Aka. Cron Style... Thinking Of Having A Monitoring Task To Be Run Routinely ... I Could Use A Cron On One Of The Server But Prefer To Run It On Trains As Then I Am

I don't think there is an example for this use case in the repo currently, but the code should be fairly simple (below is a rough draft of what it could look like)
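` import time
from clearml import Task  # the package was named "trains" when this was written

controller_task = Task.init(...)
# Re-launch this controller on the services queue and exit the local process
controller_task.execute_remotely(queue_name="services", clone=False, exit_process=True)

while True:
    # template_task_id and TRIGGER_TASK_INTERVAL_SECS are placeholders to define
    periodic_task = Task.clone(template_task_id)
    # Change parameters of {periodic_task} if necessary
    Task.enqueue(periodic_task, queue_name="default")
    time.sleep(TRIGGER_TASK_INTERVAL_SECS) `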
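
To try the scheduling logic of the draft above without a server, the loop can be pulled into a small helper that takes the "clone and enqueue" step as a callable. This is only a sketch using the standard library: `run_periodically`, `launch`, `max_runs` and the injectable `sleep` are hypothetical names introduced here for illustration and are not part of the trains/clearml API.

```python
import time
from typing import Callable, List, Optional


def run_periodically(
    launch: Callable[[], str],
    interval_secs: float,
    max_runs: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call launch() every interval_secs seconds and collect what it returns.

    max_runs=None loops forever, like the draft above; a finite value
    makes the loop easy to dry-run.
    """
    launched: List[str] = []
    runs = 0
    while max_runs is None or runs < max_runs:
        launched.append(launch())
        runs += 1
        # Skip the final sleep so a bounded dry run returns immediately.
        if max_runs is None or runs < max_runs:
            sleep(interval_secs)
    return launched
```

In the controller task, `launch` would wrap the clone and enqueue calls from the draft and return the ID of the new task, while a dry run can pass a stub and a fake `sleep`.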

5 years ago