Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Profile picture
JitteryCoyote63
Moderator
215 Questions, 1023 Answers
  Active since 10 January 2023
  Last activity 3 months ago

Reputation

0

Badges 1

981 × Eureka!
0 Votes
16 Answers
2K Views
0 Votes 16 Answers 2K Views
Got some errors while running migration script from ES5 to ES7: 2020-08-11 15:21:50,130 Running on: Linux 2020-08-11 15:21:50,227 Docker allocated memory: 16...
5 years ago
0 Votes
30 Answers
2K Views
0 Votes 30 Answers 2K Views
Could you please explain a bit more how trains adapt the torch version depending on the installed cuda version? Here is my setup: cuda 102 installed and corr...
5 years ago
0 Votes
30 Answers
2K Views
0 Votes 30 Answers 2K Views
Hi, in one of my agents with CUDA Version: 11.1 (from nvidia-smi), clearml agent 0.17.1 detects version 100 (I can see from experiments logs: agent.cuda_vers...
4 years ago
0 Votes
4 Answers
2K Views
0 Votes 4 Answers 2K Views
3 years ago
0 Votes
7 Answers
2K Views
0 Votes 7 Answers 2K Views
Hi, I think there is a small bug in the Experiment running time column of the workers-and-queues/workers page: they do not match the time reported in the exp...
3 years ago
0 Votes
4 Answers
2K Views
0 Votes 4 Answers 2K Views
Hey, I would like my experiment to call at some point a CLI program installed as a dependency of the experiment. Here is what I do: myTask = Task.init(...) i...
5 years ago
0 Votes
5 Answers
2K Views
0 Votes 5 Answers 2K Views
4 years ago
0 Votes
28 Answers
2K Views
0 Votes 28 Answers 2K Views
Hi, I am trying to use omegaconf with task.connect_configuration and I get the following error: >>> OmegaConf.create(task.connect_configuration(config_dict))...
3 years ago
0 Votes
16 Answers
2K Views
0 Votes 16 Answers 2K Views
Hi guys, coming this time to share an idea of a killer feature for ClearML 🚀 I am pretty sure you guys already heard of https://www.streamlit.io/ , which is...
4 years ago
0 Votes
2 Answers
2K Views
0 Votes 2 Answers 2K Views
Hello, what is the default limit for global context ? https://allegro.ai/docs/storage_manager_storagemanager.html#trains.storage.manager.StorageManager.get_l...
5 years ago
0 Votes
5 Answers
2K Views
0 Votes 5 Answers 2K Views
Hi, I have a long running experiment that was running on AWS instance that got killed after ~4 days with the following reason: STATUS REASON: Forced stop (no...
3 years ago
0 Votes
30 Answers
2K Views
0 Votes 30 Answers 2K Views
Hi again, my clearml api-server is having a memory leak. Each time I restart it, its ram consumption grows until getting OOM, is not killed and make the ec2 ...
4 years ago
0 Votes
1 Answers
2K Views
0 Votes 1 Answers 2K Views
Hi there, I moved my ClearML server from US to EU and now I am trying to setup the AWS autoscaler with the different architecture that I have now. So far I u...
4 years ago
0 Votes
0 Answers
2K Views
0 Votes 0 Answers 2K Views
Hello, Pytorch 1.8 was released, bringing AMD wheels with it > pip install torch -f https://download.pytorch.org/whl/rocm4.0.1/torch_stable.html Is ClearML s...
4 years ago
0 Votes
30 Answers
2K Views
0 Votes 30 Answers 2K Views
4 years ago
0 Votes
7 Answers
2K Views
0 Votes 7 Answers 2K Views
Hi, I recently updated clearml-server to 1.7 and I am getting a lot of the following errors since today on any experiment (I didn't had this error before): 1...
3 years ago
0 Votes
6 Answers
2K Views
0 Votes 6 Answers 2K Views
Hi there, is it possible to configure the clearml-agent to run some commands before running each experiment it launches? Eg. echo "test" > "test.txt" && <-- ...
4 years ago
0 Votes
5 Answers
2K Views
0 Votes 5 Answers 2K Views
Hi, is it possible to disable some of the system metrics monitored? and also downsample the rate of logging?
4 years ago
0 Votes
3 Answers
2K Views
0 Votes 3 Answers 2K Views
Hi guys, since I am done with implementing the AWS autoscaler, I would like to share some pain points that I encountered in the process with the hope that th...
aws
4 years ago
0 Votes
4 Answers
2K Views
0 Votes 4 Answers 2K Views
Is there a way to report a simple series with X and Y coords, X and Y being two lists of same length?
4 years ago
0 Votes
2 Answers
2K Views
0 Votes 2 Answers 2K Views
Hi guys; another idea: would be very cool to have a mattermost alert (monitor task), just like the one for Slack. Have a nice week-end all 👋
4 years ago
0 Votes
4 Answers
2K Views
0 Votes 4 Answers 2K Views
Hi there, I think there is a bug with clearml sdk v0.17.5rc2: when running a task locally, the dashboard doesnt not shows the task as finished once the task ...
4 years ago
0 Votes
6 Answers
2K Views
0 Votes 6 Answers 2K Views
Hi, I am using the aws autoscaler and getting the following error while trying to spin up spot instances: 2021-08-16 17:18:48 Spinning new instance type=v100...
4 years ago
0 Votes
3 Answers
2K Views
0 Votes 3 Answers 2K Views
Hi, is clearml-server compatible with latest versions of ES ( > 7.6.2)?
4 years ago
0 Votes
30 Answers
2K Views
0 Votes 30 Answers 2K Views
Hi, I am getting the following errors in the experiments I am currently running: 2021-06-25 17:11:47,911 - clearml.Metrics - ERROR - Action failed <504/0: ev...
4 years ago
0 Votes
5 Answers
2K Views
0 Votes 5 Answers 2K Views
Hi, I would like to use pytorch3d==0.5.0 with torch==1.9.1 on cuda version 110, locally it works, but the clearml agent fails setting up the environment with...
4 years ago
0 Votes
12 Answers
2K Views
0 Votes 12 Answers 2K Views
Hi there! Is there an easy way to retrieve the site-package directory that was created by an agent from inside a task? Eg. task = Task.init(...) task.add_req...
2 years ago
0 Votes
0 Answers
1K Views
0 Votes 0 Answers 1K Views
(sorry I pinned the message accidentally 😅 )
5 years ago
0 Votes
1 Answers
2K Views
0 Votes 1 Answers 2K Views
3 years ago
0 Votes
18 Answers
2K Views
0 Votes 18 Answers 2K Views
Hi, I just updated clearml server 1.0 using docker-compose down & docker-compose pull & docker-compose up -d , it worked ant it looks amazing! I found two pr...
4 years ago
Show more results questions
0 Is It Possible To Run An Agent, Listen To The Services Queue Without Using Docker?

Alright, thanks for the answer! Seems legit then 🙂

5 years ago
0 Hi, Together With

Seems to works, I started a last one to confirm!

5 years ago
0 Hi, Where Can I Find The Server Parameter To Control When The Server Is Unregistering An Agent After Not Receiving Updates? Currently It'S Quite Long (30Mins) And This Prevents The Autoscaler From Launching A New Agent

, causing it to unregister from the server (and thus not remain there).

Do you mean that the agent actively notifies the server that it is going down? or the server infers that the agent is down after a timeout?

2 years ago
0 Hi, Is It Possible To Pass Environment Variables To Agents Created By The Aws Autoscaler Service?

extra_configurations = {'SubnetId': "<subnet-id>"}with brackets right?

4 years ago
0 Trains-Elastic | {"Type": "Server", "Timestamp": "2020-12-07T15:19:11,101Z", "Level": "Error", "Component": "O.E.B.Elasticsearchuncaughtexceptionhandler", "Cluster.Name": "Trains", "Node.Name": "Trains", "Message": "Uncaught Exception In Thread [Main]",

it worked for the other folder, so I assume yes --> I archived the /opt/trains/data/mongo, sent the archive via scp, unarchived, updated the rights and now it works

4 years ago
5 years ago
0 Hi, Where Can I Find The Server Parameter To Control When The Server Is Unregistering An Agent After Not Receiving Updates? Currently It'S Quite Long (30Mins) And This Prevents The Autoscaler From Launching A New Agent

Yes it would be very valuable to be able to tweak that param, currently it's quite annoying because it's set to 30 mins, so when a worker is killed by the autoscaler, I have to wait 30 mins before the autoscaler spins up a new machine because the autoscaler thinks there is already enough agents available, while in reality the agent is down

2 years ago
0 Hi, It Seems That The

Thanks SuccessfulKoala55 for the answer! One followup question:
When I specify:
agent.package_manager.pip_version: '==20.2.3'
in the trains.conf, I get:
trains_agent: ERROR: Failed parsing /home/machine1/trains.conf (ParseException): Expected end of text, found '=' (at char 326), (line:7, col:37)

5 years ago
0 Hello

Looking forward to seeing the clearml-deploy 🤩 you guys rock 🚀

4 years ago
0 Does Trains 0.16 Supports Pip >=20.2?

I would like to try it to see if it solves some dependencies not found eventhough they are installed when using --system-site-packages

5 years ago
0 Hi, Together With

(It would be nice to have all the Pypi releases tagged in github btw)

5 years ago
0 Hi, I Am Trying To Use Omegaconf With Task.Connect_Configuration And I Get The Following Error:

But I am not sure it will connect the parameters properly, I will check now

3 years ago
0 Hi, I Want To Upgrade Clearml Server From 1.1 To 1.2 (Self Hosted). I Have The Following Setup:

Hi, /opt/clearml is ~40Mb, /opt/clearml/data is about ~50gb

3 years ago
0 Hi, I Am Using Clearml With Pytorch-Ignite And Its Earlystopping Handler. I Would Like To Log The Counter Of The Patience Of This Handler, How Can I Do That?

I didn’t use ignite callbacks, for future reference:
` early_stopping_handler = EarlyStopping(...)

def log_patience(_):
clearml_logger.report_scalar("patience", "early_stopping", early_stopping_handler.counter, engine.state.epoch)

engine.add_event_handler(Events.EPOCH_COMPLETED, early_stopping_handler)
engine.add_event_handler(Events.EPOCH_COMPLETED, log_patience) `

4 years ago
0 Hi, If I Am Starting My Training With The Following Command:

The main issue is the task_logger.report_scalar() not reporting the scalars

3 years ago
0 Hi, I Would Like To Bring Awareness

and I didn't have this problem before because when cu117 wheels were not available, the agent was trying to get the wheel with the closest cu version and was falling back to 1.11.0+cu115, and this one was working

2 years ago
0 Hi, Coming Back With The Venv Caching: With The Following Setting:

Yes, I guess that's fine then - Thanks!

4 years ago
0 Hey There, Is It Possible For A Clearml Pipeline Step To Log A Folder Instead Of Numpy/Pickle Objects? Looking At The Docs,

I guess I can have a workaround by passing the pipeline controller task id to the last step, so that the last step can download all the artifacts from the controller task.

3 years ago
0 Hello, What Is The Default Limit For Global Context ?

Thanks! (Maybe could be added to the docs ?) 🙂

5 years ago
0 Hey, I Have A Problem With The Following Task:

I tried removing type=str but I got same problem 😕

5 years ago
Show more results compactanswers