
JitteryCoyote63
214
Questions,
1021
Answers
Active since 10 January 2023
Last activity
11 months ago
Reputation
0
Badges 1
979 × Eureka!Hi, in the context of multi-gpu training, is Model.get_local_copy() multi-process safe? or should make sure only the first process calls it first, then others
3 years ago
Hi guys, I got a very unexpected error today on in one of my agents: ... Collecting tqdm Using cached tqdm-4.48.2-py2.py3-none-any.whl (68 kB) Processing /ro...
4 years ago
Hi, Is it still true that --services-mode only supports docker mode?
3 years ago
Hi, just want to report a small bug in the clearml dashboard: after queuing an experiment, if I change the experiment queue, then go back to the experiment I...
3 years ago
Hi there, would it be possible to add some Neural Architecture Search example, as for the HyperParameter Optimizer examples?
4 years ago
Hi, how can I search an old experiment based on its commit hash?
one year ago
Hi guys, following up on this https://allegroai-trains.slack.com/archives/CTK20V944/p1599135173096200?thread_ts=1599125260.076600&cid=CTK20V944 : I have a pi...
4 years ago
Hi guys, with the new venv caching available in clearml, I have the following problem: I force my pip requirements to be: torch==1.7.1 pytorch-ignite clearml...
4 years ago
Hi, I am getting an error while running task.mark_stopped() , any idea why? (clearml 1.0.2, clearml-agent 1.0.0, python 3.6) File "/home/machine/.clearml/ven...
3 years ago
Hi, is clearml-server compatible with latest versions of ES ( > 7.6.2)?
3 years ago
Hi, would it be possible to parse torch requirement when it’s part of the extras_require dict? In my code, I have the following: train_task._update_requireme...
3 years ago
Hi, I think there is a small bug in the Experiment running time column of the workers-and-queues/workers page: they do not match the time reported in the exp...
3 years ago
Hi, from within an experiment, how can I intercept the signal that the experiment was aborted and execute a cleanup function? I tried to intercept SIGINT and...
3 years ago
Hello, what is the default limit for global context ? https://allegro.ai/docs/storage_manager_storagemanager.html#trains.storage.manager.StorageManager.get_l...
4 years ago
Hi, some properties of the Task object are not listed in the documentation (such as task.parent, which is not clear whether it is the parent task object itse...
4 years ago
(sorry I pinned the message accidentally 😅 )
4 years ago
Hi, I encounter the following bug with clearml 0.17.5rc2: When I start a task locally and that task raises cuda out of memory, the command returns but the pr...
4 years ago
Hi! I have a question regarding performances of the clearml-server: are the calls from the agents made asynchronously/in a non blocking separate thread? is t...
4 years ago
Hi there, congrats for releasing v1 😄 I observed that with pytorch ignite (4.2.0), the metrics of the validation engines are delayed by one epoch. I am not ...
3 years ago
Hi there, is it safe to use ClearML (trains >= 0.17) with the trains ignite handler? Should we wait for the update on their side?
4 years ago
Hi, another bug to report with the aws_auto_scaler using 1.1.2: Traceback (most recent call last): File "aws_autoscaler.py", line 297, in main() File "aws_au...
3 years ago
Hi Guys, I had several times now the following errors poping in agents while executing a task: trains_agent: ERROR: Failed applying git diff: I attached the ...
4 years ago
Hi, I am getting the following errors in the experiments I am currently running: 2021-06-25 17:11:47,911 - clearml.Metrics - ERROR - Action failed <504/0: ev...
3 years ago
Hi, where can I find the logs of trains-agent by default?
4 years ago
Hi guys, I would like to start using the AWS autoscaler shipped in trains. I need to create a IAM user to get and I would like to know what are the minimal p...
4 years ago
Hi, I am using clearml with pytorch-ignite and its EarlyStopping handler. I would like to log the counter of the patience of this handler, how can I do that?
3 years ago
Are the env variables passed to trains-agent available in experiments run by this trains-agent?
4 years ago
Hello, I would like to use spot instances together with the AWS autoscaler to train models with pytorch/ignite and I am wondering how to support interruption...
3 years ago
Hi, one more question: When creating a task with Task.init(), we can specify the task_type . Now when using Task.clone(), we cannot specify the task_type (is...
4 years ago
Hi all, how can I have a global variable used in a pipeline step? I have to define them in each pipeline step, otherwise they are not included in the pipelin...
12 months ago
Show more results
questions