JitteryCoyote63
Moderator
215 Questions, 1023 Answers
  Active since 10 January 2023
  Last activity 3 months ago

Reputation

0

Badges 1

981 × Eureka!
0 Votes 27 Answers 2K Views
Hi there, I found a memory leak in Logger.report_matplotlib_figure . I was constantly running out of memory when training my models so I decided to spend som...
2 years ago
0 Votes 5 Answers 2K Views
Hello, I have a small question regarding UI: Currently, in the artifacts section of a task, the FILE PATH displayed for artifacts stored in s3 are displayed ...
5 years ago
0 Votes 4 Answers 2K Views
Hi there, I am trying to start an agent in services mode with trains-server being on localhost (but not started together with the docker-compose!). My trains...
5 years ago
0 Votes 3 Answers 2K Views
Hi, is clearml-server compatible with latest versions of ES ( > 7.6.2)?
4 years ago
0 Votes 3 Answers 2K Views
Hello there, is there a parameter to configure the number of columns rendered in the preview area of the CSV artifacts? (some of them are truncated with “…”)
4 years ago
0 Votes 19 Answers 2K Views
Hi again, I am trying to make the AWS autoscaler work with EC2 instances, but it fails to set up the agent on the machine: the logs of the user-data script sh...
4 years ago
0 Votes 5 Answers 2K Views
How can I do the following? (basically, filtering by task type) Task.get_tasks(project_name="my-project", task_name="my-task", task_filter=dict(type="trainin...
5 years ago
0 Votes 1 Answers 2K Views
Hi, is there a way to update the setup shell script via the SDK?
2 years ago
0 Votes 22 Answers 2K Views
Hi there, I used clearml-task to send a script to be executed remotely. When being executed remotely Task.current_task() returns None, how should I get the c...
3 years ago
0 Votes 7 Answers 2K Views
Hi, I think there is a small bug in the Experiment running time column of the workers-and-queues/workers page: they do not match the time reported in the exp...
3 years ago
0 Votes 1 Answers 2K Views
Hi, in the "Choose compared experiments" view of the WebUI, would it be possible to add a toggle to include archived experiments in the results of the search...
3 years ago
0 Votes 12 Answers 2K Views
Hi, where can I find the server parameter to control when the server is unregistering an agent after not receiving updates? Currently it's quite long (30mins...
2 years ago
0 Votes 18 Answers 2K Views
Hi, I just updated clearml server 1.0 using docker-compose down & docker-compose pull & docker-compose up -d , it worked and it looks amazing! I found two pr...
4 years ago
0 Votes 10 Answers 2K Views
Hi, I have a local package that I use to train my models. To start training, I have a script that calls task._update_requirements([".", "torch==1.11.0"]) . I...
3 years ago
0 Votes 3 Answers 2K Views
Hey there, Does trains support clicks ? (entry points defined with that library)
5 years ago
0 Votes 11 Answers 2K Views
Hi guys, following up on this https://allegroai-trains.slack.com/archives/CTK20V944/p1599135173096200?thread_ts=1599125260.076600&cid=CTK20V944 : I have a pi...
5 years ago
0 Votes 2 Answers 2K Views
First link in hyperparameter optimization page is broken > https://allegro.ai/docs/examples/examples_hyperparam_opt/
5 years ago
0 Votes 30 Answers 3K Views
Hi, I am giving clearml-session another try and I am blocked by the following error, shown when the CLI tries to establish the tunneling: Starting SSH tunnel W...
3 years ago
0 Votes 13 Answers 2K Views
Hello, in the following context: controller_task = Task.init(...) # This will clone the parent task, enqueue and wait for finished status data_processing_tas...
5 years ago
0 Votes 5 Answers 2K Views
aws
4 years ago
0 Votes 10 Answers 2K Views
Hi, just want to report a small bug in the clearml dashboard: after queuing an experiment, if I change the experiment queue, then go back to the experiment I...
4 years ago
0 Votes 6 Answers 2K Views
Hi, Is there a way to stop a clearml-agent from within an experiment? Or block it to prevent it running any other task?
4 years ago
0 Votes 9 Answers 2K Views
Another strange behavior of the python SDK CLI: after executing python my_task.py, where my_task.py creates an experiment and sends it to the queue, the command ...
4 years ago
0 Votes 10 Answers 2K Views
Hey, what is the exact difference between agent.package_manager.system_site_packages and trains-agent --install-globally ?
5 years ago
0 Votes 30 Answers 2K Views
Hi, if I am starting my training with the following command: python -u -m torch.distributed.launch --nproc_per_node=2 --use_env train.py --config configs/tra...
3 years ago
0 Votes 6 Answers 2K Views
4 years ago
0 Votes 2 Answers 2K Views
Hey there again, I am not sure I understand the difference between StorageManager and StorageHelper. Which one should I use?
5 years ago
0 Votes 27 Answers 2K Views
5 years ago
0 Votes 2 Answers 2K Views
Hi, is it possible to get an artifact from a Task and force not using local cache? The task itself updated the artifact in the meantime and I cannot get the ...
4 years ago
0 Votes 30 Answers 3K Views
Hello, I am getting ValueError: Could not get access credentials for ' s3://my-bucket ' , check configuration file ~/trains.conf but I did specify them in my...
5 years ago
0 Hi, I Am Giving Another Try To Clearml-Session And I Am Blocked At The Current Error Shown When The Cli Try To Establish The Tunneling:

(BTW: it will work with elevated credentials, but probably not recommended)

What does that mean? I am not sure I understand.

3 years ago
0 Hi, I Am Trying To Use Omegaconf With Task.Connect_Configuration And I Get The Following Error:

So I need to have this merging of small configuration files to build the bigger one

3 years ago
0 Hi, I Have Another Bug To Report For Clearml-Server 1.2 (Self Hosted) In The Console Logs Of An Experiments, I Cannot See The Latest Logs. Eg My Experiment Is Done, But I Can Only See The Logs Of To The Installation Of The Packages. If I Download The Log

CostlyOstrich36 I updated both agents to 1.1.2 and still got the same problem, unfortunately. Since I can download the full log file from the Web UI, I guess the agents are reporting correctly?
Could it be that the elasticsearch does not return all the requested logs when it is queried from the WebUI to display it in the console?
Now that I think about it, I remember that on the changelog of the clearml-server 1.2.0 the following is listed:
` Fix UI Workers & Queues and Experiment Table pages ...

3 years ago
5 years ago
0 Hi, I Just Updated Clearml-Server To 1.1.0 And Got The Following Error When Starting It With Docker-Compose:

Ok I have a very different problem now: I did the following to restart the ES cluster:
docker-compose down
docker-compose up -d
And now the cluster is empty. I think Docker simply created a new volume instead of reusing the previous one, which it had always reused so far.

4 years ago
0 Hi Guys, With The New Venv Caching Available In Clearml, I Have The Following Problem: I Force My Pip Requirements To Be:

I carry this code from older versions of trains to be honest, I don't remember precisely why I did that

4 years ago
0 Hey, I Have A Problem With The Following Task:

The cloning is done in another task, which has the argv parameters I want the cloned task to inherit from

5 years ago
3 years ago
0 Hi There,

Is it exactly agg or something different?

2 years ago
0 Hi, With Clearml-Agent 1.5.1, I Tried To Run An Experiment Within A Docker With Image Python3:8 And It Failed Executing The Task While Trying To Call Python3.9. I Am Not Sure Why It'S Using Python3.9, Since The Agent.Default_Python Is 3.8 And The Image Is

I have a mental model of the clearml-agent as a module that spins up my code somewhere, and the python version running my code should not depend on the python version running the clearml-agent (especially for experiments running in containers)

2 years ago
0 Hi, Although

Yes, I will try 🙂

4 years ago
0 Hello, I Would Like To Use Spot Instances Together With The Aws Autoscaler To Train Models With Pytorch/Ignite And I Am Wondering How To Support Interruptions During The Training (In Case The Instance Is Terminated By Aws). Is There Anything Already Built

The jump in the loss when resuming at iteration 31 is probably another issue -> for now I can conclude that:
I need to set sdk.development.report_use_subprocess = false
I need to call task.set_initial_iteration(0)

4 years ago
0 Hello, I Would Like To Use Spot Instances Together With The Aws Autoscaler To Train Models With Pytorch/Ignite And I Am Wondering How To Support Interruptions During The Training (In Case The Instance Is Terminated By Aws). Is There Anything Already Built

AgitatedDove14 I do continue an aborted Task, yes - So I shouldn’t even need to call the task.set_initial_iteration function, interesting! Do you have any idea what could be causing the behavior I am observing? I am trying to find ways to debug it

4 years ago
0 Hi There! Is There An Easy Way To Retrieve The Site-Package Directory That Was Created By An Agent From Inside A Task? Eg.

Now I'm curious, what did you end up doing ?

in my repo I maintain a bash script to setup a separate python env. then in my task I spawn a subprocess and I don't pass the env variables, so that the subprocess properly picks up the separate python env
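
A minimal sketch of that approach, using only the standard library. The helper name `run_in_clean_env` and the `keep` parameter are hypothetical, chosen for illustration; the interpreter path would be whatever the bash script creates. The idea is that the child process receives a near-empty environment, so variables set for the agent's environment (for example `PYTHONPATH`) do not leak into it:

```python
import os
import subprocess
import sys


def run_in_clean_env(python_executable, args, keep=("PATH", "HOME")):
    """Run a Python interpreter in a subprocess with a minimal environment.

    Only the variables named in `keep` are copied from the parent process;
    everything else (PYTHONPATH, VIRTUAL_ENV, ...) is dropped.
    """
    clean_env = {name: os.environ[name] for name in keep if name in os.environ}
    return subprocess.run(
        [python_executable, *args],
        env=clean_env,          # replaces the inherited environment entirely
        capture_output=True,
        text=True,
        check=True,             # raise CalledProcessError on a non-zero exit
    )


if __name__ == "__main__":
    # sys.executable stands in for the interpreter of the separate env.
    result = run_in_clean_env(
        sys.executable,
        ["-c", "import os; print(os.environ.get('PYTHONPATH'))"],
    )
    print(result.stdout.strip())
```

Which variables to keep is a judgment call: `PATH` and `HOME` are kept here only as an assumption about what most scripts need.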

2 years ago
0 Hey There, I Moved The Clearml S3 Bucket Where I Stored All My Clearml Data From One S3 Bucket To Another And Now I Realized That All The Models/Experiments Logged In The Clearml-Server Still Refer To The Old S3 Bucket. Is There A Way To Update All The Re

Yes, I would like to update all references to the old bucket, unfortunately… I think I’ll simply delete the old s3 bucket, wait for its name to become available again, recreate it on the other AWS account and move the data there. This way I don’t have to mess with clearml data - I am afraid of doing something wrong and losing data

4 years ago
0 Hi, If I Am Starting My Training With The Following Command:

So probably only the main process (rank=0) should attach the ClearMLLogger?
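
A rough sketch of what that could look like, assuming the launcher exports a `RANK` environment variable to every worker (which `torch.distributed.launch` does). The names `is_main_process`, `maybe_attach_logger` and `attach_fn` are hypothetical; `attach_fn` stands for whatever code creates and attaches the ClearMLLogger:

```python
import os


def is_main_process(environ=None):
    """Return True for global rank 0, or when RANK is not set at all."""
    env = os.environ if environ is None else environ
    return int(env.get("RANK", "0")) == 0


def maybe_attach_logger(attach_fn, environ=None):
    """Call attach_fn only on the main process; other ranks get None."""
    if is_main_process(environ):
        return attach_fn()
    return None


if __name__ == "__main__":
    # Placeholder callable; in a real script this would build the logger.
    logger = maybe_attach_logger(lambda: "logger-attached")
    print(logger)
```

Treating a missing `RANK` as rank 0 keeps the same script usable for single-process runs.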

3 years ago
0 Hi, I Think There Is A Small Bug In The

It could be yes but the difference between now and last_report_time doesn’t match the difference I observe

3 years ago
0 Hi, I Started A Trains-Agent (0.15) In Services Mode (Full Command:

Alright, I had a look in the /tmp/.trains_agent_daemon_outabcdef.txt logs, not many insights from here. For the moment, I simply started a new trains-agent daemon in services mode and I will wait to see what happens.

5 years ago
0 Hi There

basically:
` from trains import Task

# Controller task that uploads the artifact
task = Task.init("test", "test", "controller")
task.upload_artifact("test-artifact", dict(foo="bar"))
# Clone the controller and point the clone at a different entry point
cloned_task = Task.clone(task, name="test", parent=task.task_id)
cloned_task.data.script.entry_point = "test_task_b.py"
cloned_task._update_script(cloned_task.data.script)
# Pass the artifact name to the clone as a parameter, then enqueue it
cloned_task.set_parameters(**{"artifact_name": "test-artifact"})
Task.enqueue(cloned_task, queue_name="default") `

5 years ago
5 years ago