JitteryCoyote63
Moderator
215 Questions, 1023 Answers
  Active since 10 January 2023
  Last activity one month ago

Reputation

0

Badges 1

981 × Eureka!
0 Votes 3 Answers 2K Views
Hi, I see that there is a new parameter in aws autoscaler: max_spin_up_time_min - What is the difference with max_idle_time_min ?
aws
4 years ago
0 Votes 15 Answers 2K Views
Hi, I restarted my clearml-server (1.1.0) and the login page always redirects me to the login page. I am using fixed users in config files. In the logs of th...
4 years ago
0 Votes 1 Answer 2K Views
3 years ago
0 Votes 5 Answers 2K Views
3 years ago
0 Votes 6 Answers 2K Views
3 years ago
0 Votes 12 Answers 2K Views
Hi there! Is there an easy way to retrieve the site-package directory that was created by an agent from inside a task? Eg. task = Task.init(...) task.add_req...
2 years ago
0 Votes 1 Answer 1K Views
Quick question: Why does clearml-server 1.15.0 api-server python package require ES 8.12.0 but the docker-compose references ES 7.17.18?
one year ago
0 Votes 17 Answers 2K Views
4 years ago
0 Votes 4 Answers 2K Views
Hi, what happens exactly when I execute the following command: trains-agent daemon --gpus 0 --queue default & In my code, how to know which GPU to choose insi...
5 years ago
0 Votes 20 Answers 2K Views
Hello, I have an error while installing git dependencies of local package: So far I used task. update _requirements(“[.]“) with my local package referencing ...
4 years ago
0 Votes 19 Answers 2K Views
Hi again, I am trying to make the aws autoscaler work with ec2 instances, but it fails to setup the agent in the machine: the logs of the user-data script sh...
4 years ago
0 Votes 12 Answers 2K Views
3 years ago
0 Votes 8 Answers 2K Views
Hi, I would like to create backups of my trains-server periodically. I was thinking about creating a service task under the devops project. The backup task w...
4 years ago
0 Votes 11 Answers 2K Views
Hi, I have a question regarding the aws-autoscaler: am I understanding correctly that: max_idle_time_min=5 max_spin_up_time_min=10 polling_interval_time_min=...
3 years ago
0 Votes 0 Answers 2K Views
Hello, Pytorch 1.8 was released, bringing AMD wheels with it > pip install torch -f https://download.pytorch.org/whl/rocm4.0.1/torch_stable.html Is ClearML s...
4 years ago
0 Votes 5 Answers 2K Views
3 years ago
0 Votes 1 Answer 2K Views
Hi, there is a small bug with auto-refreshing in the DEBUG SAMPLES Tab of the Web UI: If it is ON, then it will always force the first series to be displayed...
3 years ago
0 Votes 5 Answers 1K Views
Hi there, I would like to report a bug with the resizing of the columns in the projects view: it doesn’t work as expected. Please look at the behavior of the...
3 years ago
0 Votes 15 Answers 2K Views
Hi, how can I get the logs from the pytorch ignite early stopping handler to be logged in clearml?
4 years ago
0 Votes 5 Answers 1K Views
Hey again! I am migrating my trains-server to AWS and I would like now to have secure accounts (with password). But I don't want to lose the current users...
4 years ago
0 Votes 2 Answers 2K Views
Another one: What is the difference between Task.connect() and Task.set_parameter?
5 years ago
0 Votes 17 Answers 2K Views
3 years ago
0 Votes 29 Answers 2K Views
Hi, although https://github.com/allegroai/clearml/issues/181 is resolved, clearml-agent (0.17.2) still logs tqdm iterations as different lines, is there some...
4 years ago
0 Votes 2 Answers 2K Views
Hi, I recently updated my clearml to 1.1.2 and a code that was working before now behaves completely differently: I am using the following to log debug sampl...
3 years ago
0 Votes 30 Answers 2K Views
Hi there 🙂 Task.get_parameters() returns an empty dict from within a trains-agent task being executed. When I execute it outside, it works properly. Is it i...
5 years ago
0 Votes 30 Answers 3K Views
Hi, I am giving another try to clearml-session and I am blocked at the current error shown when the CLI tries to establish the tunneling: Starting SSH tunnel W...
3 years ago
0 Votes 16 Answers 2K Views
Hello, ~3 months ago I created a trains-server in a machine with 30gb of disk space. Today I wasn't able to connect to trains-server, so I checked the server...
4 years ago
0 Votes 1 Answer 2K Views
Hi there, any plan/benefit to support virtualenv>=20?
5 years ago
0 Votes 5 Answers 2K Views
4 years ago
0 Votes 3 Answers 1K Views
Hey! Would it be possible to tag the RC releases in the different repos? So that one knows what is inside?
5 years ago
0 Hi, I Would Like To Follow-Up In This

Hi SuccessfulKoala55 , AgitatedDove14 ,
I updated to 1.4.0 (Web UI shows: WebApp: 1.5.0-186 • Server: 1.5.0-186 • API: 2.18)
Unfortunately the bug is still there 😞
I don’t see errors in the console anymore though!

I had another look and modified an events.get_task_logs request with a super old timestamp to try to retrieve all logs, this returned me only the few logs already displayed in the console. So I think the problem doesn’t come from the WebUI, but from the...

3 years ago
0 Hi, I Started A Trains-Agent (0.15) In Services Mode (Full Command:

Probably 6. I think because of some reason, it did not go back to main trains-agent. Nevertheless I am not sure, because a second task could start. It could also be that the second was aborted for some reason while installing task requirements (not system requirements, so executing the trains-agent setup within the docker container) and therefore again it couldn't go back to main trains-agent. But ps -aux shows that the trains-agent is stuck running the first experiment, not the second...

5 years ago
0 Hello, I Would Like To Use Spot Instances Together With The Aws Autoscaler To Train Models With Pytorch/Ignite And I Am Wondering How To Support Interruptions During The Training (In Case The Instance Is Terminated By Aws). Is There Anything Already Built

The jump in the loss when resuming at iteration 31 is probably another issue -> for now I can conclude that:
I need to set sdk.development.report_use_subprocess = false
I need to call task.set_initial_iteration(0)

4 years ago
0 Hi, I Am Getting The Following Errors In The Experiments I Am Currently Running:

SuccessfulKoala55 Thanks! If I understood correctly, setting index.number_of_shards = 2 (instead of 1) would create a second shard for the large index, splitting it into two shards? This https://stackoverflow.com/a/32256100 seems to say that it’s not possible to change this value after the index creation, is it true?

4 years ago
0 Hey There, I Would Like To Increase The

yes please, I think indeed that’s the problem

4 years ago
0 Hey There, Is It Possible For A Clearml Pipeline Step To Log A Folder Instead Of Numpy/Pickle Objects? Looking At The Docs,

I guess I can have a workaround by passing the pipeline controller task id to the last step, so that the last step can download all the artifacts from the controller task.

3 years ago
0 Hi, I Have Another Problem

python3 -m trains_agent --config-file "~/trains.conf" daemon --queue default --log-level DEBUG --detached --gpus 1 > ~/trains-agent.startup.log 2>&1

4 years ago
0 Hello, I Am Trying To Retrieve A Simple Dict Artifact Uploaded In A Previous Task With

Oops, I spoke too fast, the json is actually not saved in s3

5 years ago
0 Another Strange Behavior Of The Python Sdk Cli: After Executing Python My_Task.Py, Where My_Task.Py Creates And Send To The Queue An Experiment, The Command Returns But After Some Time Some Messages Are Printed In The Console, Such As

yes, so it does exit the local process (at least, the command returns), but another process is still running in the background and logs things from time to time, such as:
ClearML Monitor: Could not detect iteration reporting, falling back to iterations as seconds-from-start

4 years ago
0 Hi, If I Am Starting My Training With The Following Command:

For the moment this is what I would be inclined to believe

3 years ago
2 years ago
0 Hi Guys, With The New Venv Caching Available In Clearml, I Have The Following Problem: I Force My Pip Requirements To Be:

This is new, right? It detects the local package, uninstalls it and reinstalls it?

4 years ago
0 Hi Again, My Clearml Api-Server Is Having A Memory Leak. Each Time I Restart It, Its Ram Consumption Grows Until Getting Oom, Is Not Killed And Make The Ec2 Instance Crash

Something like that?
` curl "localhost:9200/events-training_stats_scalar-adx3r00cad1bdfvsw2a3b0sa5b1e52b/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "variant": "loss_model"
          }
        },
        {
          "match": {
            "task": "8f88e4b8cff84f23bde74ed4b7213ec6"
          }
        }
      ]
    }
  },
  "aggs": {
    "series": {
      "terms": { "field": "iter" }
    }
  }
}...
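
For anyone who prefers to build that request body in Python rather than by hand, here is a minimal sketch. It only reproduces the part of the query visible above (the original message is cut off, so anything after it is not covered). The helper name build_scalar_query is made up for illustration and is not part of ClearML or Elasticsearch.

```python
import json


def build_scalar_query(task_id: str, variant: str) -> dict:
    """Build the bool/must query plus a terms aggregation on "iter".

    Hypothetical helper: it mirrors only the visible part of the curl body.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"variant": variant}},
                    {"match": {"task": task_id}},
                ]
            }
        },
        "aggs": {"series": {"terms": {"field": "iter"}}},
    }


# Same task id and variant as in the curl command above.
body = build_scalar_query("8f88e4b8cff84f23bde74ed4b7213ec6", "loss_model")

# Print the JSON so it can be pasted after curl's -d flag.
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl with -d, or sent with any HTTP client to the same _search endpoint.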

4 years ago
0 Hi There

More context:
trains, trains-agent and trains-server all 0.16 Session.api_version -> 2.9 (both when executed in trains-agent and in local script)

5 years ago
0 Hi, In One Of My Agents With Cuda Version: 11.1 (From Nvidia-Smi), Clearml Agent 0.17.1 Detects Version 100 (I Can See From Experiments Logs:

I am still confused though - from the get started page of pytorch website, when choosing "conda", the generated installation command includes cudatoolkit, while when choosing "pip" it only uses a wheel file.
Does that mean the wheel file contains cudatoolkit (cuda runtime)?

4 years ago
0 Hey, I Hope This Is The Right Place To Ask. We'Re A Small Data Science Team That Wants To Log Everything About Our Ml Models. Looking Around On The Internet, Mostly Mlflow Is Being Recommended, But Occasionally The Name Trains Pop-Ups. According To You,

I would let the trains team answer this in detail, but as a user moving from MLflow to trains, I can share the following insights:

MLflow and trains overlap when it comes to having a system with nice web UI to compare/log experiments/models/metrics. But MLflow lacks a crucial feature IMO which is ML/DevOps: Using MLflow, you will have to take care of the whole maintenance of your machines, design interactions between them, etc. This is where trains shines, it provides these features out-of-t...

5 years ago
4 years ago
0 Hi Guys, Following Up On This

AgitatedDove14 This looks awesome! Unfortunately this would require a lot of changes in my current code, for that project I found a workaround 🙂 But I will surely use it for the next pipelines I will build!

4 years ago
0 Hello, I Am Trying To Retrieve A Simple Dict Artifact Uploaded In A Previous Task With

Yes, thanks! In my case, I was actually using TrainsSaver from pytorch-ignite with a local path, then I understood looking at the code that under the hood it actually changed the output_uri of the current task, that's why my previous_task.output_uri = "s3://my_bucket" had no effect (it was placed BEFORE the training)

5 years ago
0 Hello, I Am Trying To Retrieve A Simple Dict Artifact Uploaded In A Previous Task With

and saved locally, which is why the second task, not executed in the same machine, cannot access the file

5 years ago
0 Hello, I Am Trying To Retrieve A Simple Dict Artifact Uploaded In A Previous Task With

nvm, bug might be from my side. I will open an issue if I find any easy reproducible example

5 years ago