JitteryCoyote63
Moderator
215 Questions, 1023 Answers
  Active since 10 January 2023
  Last activity 3 months ago

Reputation

0

Badges 1

981 × Eureka!
0 Votes
12 Answers
2K Views
Hi, I deleted some archived experiments in clearml server 1.0 and the popup in the dashboard showed “the following artifacts were not deleted”, with a list o...
4 years ago
0 Votes
2 Answers
2K Views
Hi, Is it still true that --services-mode only supports docker mode?
4 years ago
0 Votes
2 Answers
1K Views
4 years ago
0 Votes
27 Answers
2K Views
Hi there, I found a memory leak in Logger.report_matplotlib_figure . I was constantly running out of memory when training my models so I decided to spend som...
2 years ago
0 Votes
30 Answers
2K Views
Hi again, my clearml api-server is having a memory leak. Each time I restart it, its ram consumption grows until getting OOM, is not killed and make the ec2 ...
4 years ago
0 Votes
4 Answers
2K Views
Hi, what happens exactly when I execute the following command: trains-agent daemon --gpus 0 --queue default & In my code, how to know which GPU to choose insi...
5 years ago
0 Votes
1 Answers
2K Views
3 years ago
0 Votes
10 Answers
2K Views
Hi, I have a local package that I use to train my models. To start training, I have a script that calls task._update_requirements([".", "torch==1.11.0"]) . I...
3 years ago
0 Votes
11 Answers
2K Views
Hi, coming back with the venv caching: with the following setting: I call Task._update_requirements(["."]) setup.py has the following install_requires=["my-p...
4 years ago
0 Votes
5 Answers
2K Views
2 years ago
0 Votes
3 Answers
2K Views
Hi quick question: does Task.connect_configuration support OmegaConf DictConfig objects? ie. Can I do: config = train_task.connect_configuration(OmegaConf.lo...
3 years ago
0 Votes
5 Answers
2K Views
Hi guys, I would like to start using the AWS autoscaler shipped in trains. I need to create an IAM user to get and I would like to know what are the minimal p...
4 years ago
0 Votes
6 Answers
2K Views
Hi, is it possible to specify the required version of python for a Task that is different from the python running the clearml-agent? Example: my clearml-agen...
2 years ago
0 Votes
2 Answers
1K Views
Hi there, I have several experiments hanging/stuck in the middle or at the end of the training, with the last message logged being: train INFO: Engine run co...
one year ago
0 Votes
13 Answers
2K Views
4 years ago
0 Votes
3 Answers
2K Views
Hey! Would it be possible to tag the RC releases in the different repos? So that one knows what is inside?
5 years ago
0 Votes
6 Answers
2K Views
Hi, I cannot manage to start trains-server 0.16 with the docker-compose file, the trains-elastic container fails with the following error:
5 years ago
0 Votes
1 Answers
2K Views
Hi there, is it safe to use ClearML (trains >= 0.17) with the trains ignite handler? Should we wait for the update on their side?
4 years ago
0 Votes
2 Answers
2K Views
Congrats on the clearml-serving 0.9.0 release! I’ll try it for sure!
3 years ago
0 Votes
2 Answers
2K Views
Hi, how can I search an old experiment based on its commit hash?
2 years ago
0 Votes
1 Answers
2K Views
Hey, just wanted to mention: in docs, Task.get_parameter does not say: Different sections with key prefix "section/" , as Task.get_parameters does. Also there ...
5 years ago
0 Votes
3 Answers
2K Views
Hi, I am considering making automated backups of my clearml-server using Amazon EBS snapshots. Should I be concerned with the same problem described here > h...
4 years ago
0 Votes
3 Answers
2K Views
Hi, I have several long running experiments failing with Process failed, exit code -9 and no other error with clearml 1.0.4 and clearml-agent 1.0.0, what cou...
4 years ago
0 Votes
3 Answers
432 Views
3 months ago
0 Votes
3 Answers
2K Views
Hi ClearML team members! Is there any progress made on the clearml-serving repo? I’d love to start using it but I lack a straightforward get started example....
3 years ago
0 Votes
1 Answers
1K Views
Hi there, would it be possible to add some Neural Architecture Search example, as for the HyperParameter Optimizer examples?
4 years ago
0 Votes
6 Answers
2K Views
3 years ago
0 Votes
5 Answers
2K Views
Hi, from within an experiment, how can I intercept the signal that the experiment was aborted and execute a cleanup function? I tried to intercept SIGINT and...
3 years ago
0 Votes
11 Answers
2K Views
Are the various task types available in 0.15? I am getting > 2020-06-09 12:58:53,287 - trains.Task - WARNING - Retrying, previous request failed : 'custom' i...
5 years ago
0 Votes
5 Answers
2K Views
Hi, It seems that the package_manager.pip_version has been removed from the https://allegro.ai/docs/references/trains_ref/#agent , although still being shown...
5 years ago
0 Hi, I Started A Trains-Agent (0.15) In Services Mode (Full Command:

I will try to isolate the bug, if I can, I will open an issue in trains-agent 🙂

5 years ago
0 Hi There! I Have A Question Regarding S3 Access: I Created A S3 User With Read/Write Access But Not Delete, And Trains Seems To Requires Delete Permissions (See Errors Below). Why Does It Need Delete Permissions?

I actually need to be able to overwrite files, so in my case it makes sense to give the Deleteobject permission in s3. But for other cases, why not simply catch this error, display a warning to the user and store internally that delete is not possible?
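The "catch the error, warn, and remember" behaviour suggested above could look roughly like the sketch below. It is only an illustration in plain Python: `DeleteNotPermitted`, `ReadWriteOnlyBackend` and `TolerantStorage` are hypothetical names invented here, not part of trains or of any S3 client.

```python
import warnings


class DeleteNotPermitted(Exception):
    """Hypothetical error raised when the backend refuses a delete."""


class ReadWriteOnlyBackend:
    """Hypothetical stand-in for a bucket whose credentials lack delete rights."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def delete(self, key):
        raise DeleteNotPermitted(f"delete denied for {key!r}")


class TolerantStorage:
    """Wraps a backend and remembers when deletes are not allowed."""

    def __init__(self, backend):
        self.backend = backend
        self.delete_allowed = True  # assumed allowed until a delete fails

    def delete(self, key):
        if not self.delete_allowed:
            return False  # skip the call, we already know it will fail
        try:
            self.backend.delete(key)
            return True
        except DeleteNotPermitted as exc:
            self.delete_allowed = False  # store the result internally
            warnings.warn(f"Delete not permitted, skipping future deletes: {exc}")
            return False


storage = TolerantStorage(ReadWriteOnlyBackend())
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    first = storage.delete("a.txt")   # fails once and emits a warning
    second = storage.delete("b.txt")  # skipped silently
```

With this shape the user sees a single warning instead of a hard failure, and later delete attempts are skipped without hitting the backend again.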

5 years ago
5 years ago
0 Hi, I Am Trying To Use Omegaconf With Task.Connect_Configuration And I Get The Following Error:

it would be nice if Task.connect_configuration could support custom yaml file readers for me

3 years ago
0 Hi, Similar To Task.Set_Offline(True), Is There A Way To Simulate An Execution In An Agent? (For Testing Purposes)

In my CI tests, I want to reproduce a run in an agent, because the env changes and some things break in agents but not locally

3 years ago
0 Hi

Awesome! (Broken link in migration guide, step 3: https://allegro.ai/docs/deploying_trains/trains_server_es7_migration/ )

5 years ago
0 Hi Guys, Following Up On This

AgitatedDove14 This looks awesome! Unfortunately this would require a lot of changes in my current code, for that project I found a workaround 🙂 But I will surely use it for the next pipelines I will build!

5 years ago
0 Hi, I Am Trying To Use Omegaconf With Task.Connect_Configuration And I Get The Following Error:

with open(path, "r") as stream:
    return yaml.load(stream, Loader=yaml.FullLoader)

3 years ago
0 Hello, I Would Like To Use Spot Instances Together With The Aws Autoscaler To Train Models With Pytorch/Ignite And I Am Wondering How To Support Interruptions During The Training (In Case The Instance Is Terminated By Aws). Is There Anything Already Built

AgitatedDove14 I made some progress:
In clearml.conf of the agent, I set: sdk.development.report_use_subprocess = false (because I had the feeling that Task._report_subprocess_enabled = False wasn’t taken into account). I’ve also set task.set_initial_iteration(0). Now I was able to get the following graph after resuming -

4 years ago
0 Hi, I Would Like To Follow-Up In This

SuccessfulKoala55 , This is not the exact corresponding request (I refreshed the tab since then), but the request is an events.get_task_logs , with the following content:

3 years ago
0 Hi There,

I think that somehow somewhere a reference to the figure is still living, so plt.close("all") and gc cannot free the figure and it ends up accumulating. I don't know where yet
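To illustrate the kind of situation described here, the sketch below uses only the standard library and a dummy class, not matplotlib or clearml. `FakeFigure` and `registry` are hypothetical names; the point is just that one stray reference is enough to keep an object alive through `gc.collect()`, and that `gc.get_referrers` can help find the holder.

```python
import gc
import weakref


class FakeFigure:
    """Hypothetical stand-in for a figure object."""


registry = []  # hypothetical place where a stray reference might live

fig = FakeFigure()
probe = weakref.ref(fig)  # checks liveness without keeping fig alive
registry.append(fig)      # the stray reference

del fig
gc.collect()
alive_with_stray_ref = probe() is not None  # still alive: registry holds it

# Ask the garbage collector who still refers to the object
holders = [r for r in gc.get_referrers(probe()) if r is registry]

registry.clear()  # drop the stray reference
gc.collect()
alive_after_clear = probe() is not None  # now the object can be freed
```

The same probing approach (a `weakref.ref` plus `gc.get_referrers`) might help narrow down where a real figure is still referenced, although the actual holder in this case is not known from the post.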

2 years ago
0 Hi, Kudos For The 0.15 Guys! I Am Having An Issue Related To Git Auth: I Have An Issue With Trains-Agent (0.15): It Does Not Use Git Creds While Trying To Clone A Private Repo:

I also don't understand what you mean by unless the domain is different... The same way ssh keys are global, I would have expected the git creds to be used for any git operation

5 years ago
0 Hi, I Would Like To Bring Awareness

Ha I just saw in the logs:

WARNING:py.warnings:/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/torch/cuda/__init__.py:145: UserWarning:
NVIDIA A10G with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA A10G GPU with PyTorch, please check the instructions at 
2 years ago
0 Hi, If I Am Starting My Training With The Following Command:

Hi AgitatedDove14 , I investigated further and got rid of a separate bug. I was able to get ignite’s events fired, but still no scalars logged 😞
There is definitely something wrong going on with the reporting of scalars using multi processes, because if my ignite callback is the following:

` def log_loss(engine):
    idist.barrier()  # Sync all processes
    device = idist.device()
    print("IDIST", device)
    from clearml import Task
    Task.current_task().get_logger().r...

3 years ago
5 years ago
0 Hello There, I Would Like To Run Cleanup Code In Case The User Aborts One Task From The Dashboard (The Agent Is Not Using The Task In Docker). What Signal Should I Listen For In The Task?

The clean up service is awesome, but it would require having another agent running in services mode on the same machine, which I would rather avoid
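For reference, running cleanup code from inside the process itself could be sketched as below, using only Python's standard `signal` module. Which signal an aborted task actually receives is not stated in this thread, so handling SIGINT and SIGTERM here is an assumption; `make_abort_handler` and `install_abort_handlers` are hypothetical helper names.

```python
import signal
import sys


def make_abort_handler(cleanup):
    """Return a signal handler that runs cleanup once, then exits."""
    state = {"done": False}

    def handler(signum, frame):
        if not state["done"]:
            state["done"] = True
            cleanup()
        sys.exit(128 + signum)  # conventional exit status for "killed by signal"

    return handler


def install_abort_handlers(cleanup, signals=(signal.SIGINT, signal.SIGTERM)):
    """Register the same handler for each signal (must run in the main thread)."""
    handler = make_abort_handler(cleanup)
    for sig in signals:
        signal.signal(sig, handler)
    return handler
```

Calling `install_abort_handlers(my_cleanup)` early in the training script would register the handler. Note that SIGKILL cannot be intercepted this way, so a task that is killed outright would still skip the cleanup.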

4 years ago
0 I Guess One Experiment Is Running Backwards In Time

CostlyOstrich36 I don’t see such number, can you please share a screenshot of where to look at?

3 years ago
0 Hi, Where Can I Find The Logs Of Trains-Agent By Default?

Thanks, the message is not logged in GCloud instances logs when using startup scripts, this is why I did not see it. 👍

5 years ago
0 Hi, I Have Another Bug To Report For Clearml-Server 1.2 (Self Hosted) In The Console Logs Of An Experiment, I Cannot See The Latest Logs. Eg My Experiment Is Done, But I Can Only See The Logs Up To The Installation Of The Packages. If I Download The Log

CostlyOstrich36 , actually this only happens for a single agent. The weird thing is that I have a machine with two GPUs, and I spawn two agents, one per GPU. Both have the same version. For one, I can see all the logs, but not for the other

3 years ago