Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Profile picture
JitteryCoyote63
Moderator
214 Questions, 1021 Answers
  Active since 10 January 2023
  Last activity 7 months ago

Reputation

0

Badges 1

979 × Eureka!
0 Votes
4 Answers
1K Views
0 Votes 4 Answers 1K Views
Hi there, I am trying to start an agent in services mode with trains-server being on localhost (but not started together with the docker-compose!). My trains...
4 years ago
0 Votes
10 Answers
1K Views
0 Votes 10 Answers 1K Views
3 years ago
0 Votes
4 Answers
955 Views
0 Votes 4 Answers 955 Views
Hi, in the Metric Snapshot section of the Overview tab of a project page, would it be possible to: Show running experiments Have the legend clickable, to hid...
2 years ago
0 Votes
3 Answers
976 Views
0 Votes 3 Answers 976 Views
Hi quick question: does Task.connect_configuration support OmegaConf DictConfig objects? ie. Can I do: config = train_task.connect_configuration(OmegaConf.lo...
2 years ago
0 Votes
3 Answers
1K Views
0 Votes 3 Answers 1K Views
Hi, in a subproject, would it be possible to hide the parent project if it is empty?
3 years ago
0 Votes
30 Answers
1K Views
0 Votes 30 Answers 1K Views
Hi there ๐Ÿ™‚ Task.get_parameters() returns an empty dict from within a trains-agent task being executed. When I execute it outside, it works properly. Is it i...
4 years ago
0 Votes
30 Answers
995 Views
0 Votes 30 Answers 995 Views
Hi guys, with the new venv caching available in clearml, I have the following problem: I force my pip requirements to be: torch==1.7.1 pytorch-ignite clearml...
3 years ago
0 Votes
7 Answers
951 Views
0 Votes 7 Answers 951 Views
Hi, is there a way to get some stats about the use of workers? I would like to know, over the past 3 months: Number of training hours per user Number of trai...
3 years ago
0 Votes
10 Answers
1K Views
0 Votes 10 Answers 1K Views
Hi guys, any plan to integrate the https://github.com/allegroai/trains-agent/blob/master/examples/dynamic_cloud_cluster.ipynb in trains-server? The code ther...
4 years ago
0 Votes
5 Answers
934 Views
0 Votes 5 Answers 934 Views
How can I do the following? (basically, filtering by task type) Task.get_tasks(project_name="my-project", task_name="my-task", task_filter=dict(type="trainin...
4 years ago
0 Votes
2 Answers
958 Views
0 Votes 2 Answers 958 Views
Hello, what is the default limit for global context ? https://allegro.ai/docs/storage_manager_storagemanager.html#trains.storage.manager.StorageManager.get_l...
4 years ago
0 Votes
19 Answers
1K Views
0 Votes 19 Answers 1K Views
I guess one experiment is running backwards in time ๐Ÿ˜„
2 years ago
0 Votes
0 Answers
931 Views
0 Votes 0 Answers 931 Views
(sorry I pinned the message accidentally ๐Ÿ˜… )
4 years ago
0 Votes
7 Answers
977 Views
0 Votes 7 Answers 977 Views
Hi, I think there is a small bug in the Experiment running time column of the workers-and-queues/workers page: they do not match the time reported in the exp...
3 years ago
0 Votes
2 Answers
991 Views
0 Votes 2 Answers 991 Views
Are the env variables passed to trains-agent available in experiments run by this trains-agent?
4 years ago
0 Votes
5 Answers
984 Views
0 Votes 5 Answers 984 Views
Hey there, since which version, clearml stops connecting to the demo server by default?
3 years ago
0 Votes
1 Answers
1K Views
0 Votes 1 Answers 1K Views
Hi there, any plan/benefit to support virtualenv= 20 ?
4 years ago
0 Votes
30 Answers
1K Views
0 Votes 30 Answers 1K Views
Hello, I tried the clearml-session CLI to start a jupyter instance on an agent, but an error with the password, here is the full CLI log: $ clearml-session -...
3 years ago
0 Votes
17 Answers
967 Views
0 Votes 17 Answers 967 Views
Hi there, I have a problem with PyJWT: I am using trains==0.16.4 and trains-agent==0.16.3 in my agents. I installed PyJWT==1.7.1 in the agent (through extra_...
3 years ago
0 Votes
14 Answers
1K Views
0 Votes 14 Answers 1K Views
3 years ago
0 Votes
8 Answers
936 Views
0 Votes 8 Answers 936 Views
3 years ago
0 Votes
6 Answers
1K Views
0 Votes 6 Answers 1K Views
Hi, is it possible to specify the required version of python for a Task that is different from the python running the clearml-agent? Example: my clearml-agen...
2 years ago
0 Votes
6 Answers
1K Views
0 Votes 6 Answers 1K Views
Hi there, is it possible to configure the clearml-agent to run some commands before running each experiment it launches? Eg. echo "test" > "test.txt" && <-- ...
3 years ago
0 Votes
11 Answers
1K Views
0 Votes 11 Answers 1K Views
Hi, coming back with the venv caching: with the following setting: I call Task._update_requirements(["."]) setup.py has the following install_requires=["my-p...
3 years ago
0 Votes
5 Answers
1K Views
0 Votes 5 Answers 1K Views
Hi again, it seems like the aws autoscaler is not spinning instances with the EBS configuration I configured. Here is the configuration: resource_configurati...
3 years ago
0 Votes
30 Answers
1K Views
0 Votes 30 Answers 1K Views
3 years ago
0 Votes
1 Answers
967 Views
0 Votes 1 Answers 967 Views
Hi, would it be possible to parse torch requirement when it’s part of the extras_require dict? In my code, I have the following: train_task._update_requireme...
3 years ago
0 Votes
30 Answers
1K Views
0 Votes 30 Answers 1K Views
Hi, I am giving another try to clearml-session and I am blocked at the current error shown when the CLI try to establish the tunneling: Starting SSH tunnel W...
2 years ago
0 Votes
4 Answers
1K Views
0 Votes 4 Answers 1K Views
2 years ago
0 Votes
4 Answers
1K Views
0 Votes 4 Answers 1K Views
2 years ago
Show more results questions
0 Hi There, I Used

The task object

2 years ago
0 Hi There, I Used

So I guess the problem is that the following snippet:
from clearml import Task Task.init()Should be added before the if __name__ == "__main__": ?

2 years ago
0 Hi There, I Used

AgitatedDove14 So I copied pasted locally the https://github.com/pytorch-ignite/examples/blob/main/tutorials/intermediate/cifar10-distributed.py from the examples of pytorch-ignite. Then I added a requirements.txt and called clearml-task to run it on one of my agents. I adapted a bit the script (removed python-fire since itโ€™s not yet supported by clearml).

2 years ago
0 Hi There, I Used

UnevenDolphin73 , task = clearml.Task.get_task(clearml.config.get_remote_task_id()) worked, thanks

2 years ago
0 Hi Guys, Coming This Time To Share An Idea Of A Killer Feature For Clearml

I also discovered https://h2oai.github.io/wave/ last week, would be awesome to be able to deploy it in the same manner

3 years ago
0 Hi, How Does

Also enable_git_ask_pass is not dumped into the logs when an experiment start btw

one year ago
0 Hi Guys, Coming This Time To Share An Idea Of A Killer Feature For Clearml

Hi AnxiousSeal95 , I hope you had nice holidays! Thanks for the update! I discovered h2o when looking for ways to deploy dashboards with apps like streamlit. Most likely I will use either streamlit deployed through clearml or h2o as standalone if ClearML won't support deploying apps (which is totally fine, no offense there ๐Ÿ™‚ )

3 years ago
0 Hello, I Am Trying To Retrieve A Simple Dict Artifact Uploaded In A Previous Task With

Setting it after the training correctly updated the task and I was able to store artifacts remotely

4 years ago
0 I Am Wondering Is It Possible To Schedule A Task To Run At Certain Time In Periodic Fashion Aka. Cron Style... Thinking Of Having A Monitoring Task To Be Run Routinely ... I Could Use A Cron On One Of The Server But Prefer To Run It On Trains As Then I Am

I don't think there is an example for this use case in the repo currently, but the code should be fairly simple (below is a rough draft of what it could look like)
` controller_task = Task.init(...)
controller_task.execute_remotely(queue_name="services", clone=False, exit_process=True)

while True:
periodic_task = Task.clone(template_task_id)
# Change parameters of {periodic_task} if necessary
Task.enqueue(periodic_task, queue="default")
time.sleep(TRIGGER_TASK_INTERVAL_SECS) `

4 years ago
0 Hi Again, I Am Trying To Make The Aws Autoscaler Work With Ec2 Instances, But It Fails To Setup The Agent In The Machine: The Logs Of The User-Data Script Show That It Fails Updating The Machine (See Below)

I think waiting for the apt locks to be released with something like this would work
startup_bash_script = [ "#!/bin/bash", "while sudo fuser /var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock >/dev/null 2>&1; do echo 'Waiting for other instances of apt to complete...'; sleep 5; done", "sudo apt-get update", ...Weirdly this throws an error in the autoscaler:
` Spinning new instance type=v100_spot
Error: Failed to start new instance, unexpected '{' in field...

3 years ago
0 Hi Guys, Coming This Time To Share An Idea Of A Killer Feature For Clearml

as it's also based on pytorch-ignite!

I am not sure to understand, what is the link with pytorch-ignite?

We're in the brainstorming phase of what are the best approaches to integrate, we might pick your brain later on

Awesome, I'd be happy to help!

3 years ago
0 Hey Again

Very cool! Run two train-agent daemons, one per GPU on the same machine, with default Nvidia/CUDA Docker This is close to my use case, I just would like to run these two daemons not with docker, would that be possible? I should just remove the --docker nvidia/cuda param right?

4 years ago
0 Hey Again

trains-agent daemon --gpus 0 --queue default & trains-agent daemon --gpus 1 --queue default &

4 years ago
0 Hello, I Am Trying To Retrieve A Simple Dict Artifact Uploaded In A Previous Task With

Ho the object is actually available in previous_task.artifacts

4 years ago
0 Hi, Did Anyone Experiment With Running On The Aws Autoscaler On Spots And Knows Whether There Is Configuration For Retry Policy When Spot Get Evacuated Mid-Job?

Hi there, yes I was able to make it work with some glue code:
Save your model, optimizer, scheduler every epoch Have a separate thread that periodically pulls the instance metadata and check if the instance is marked for stop, in this case, add a custom tag eg. TO_RESUME Have a services that periodically pulls failed experiments from the queue with the tag TO_RESUME, force marking them as stopped instead of failed and reschedule them with as extra-param the last checkpoint

3 years ago
0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

ย you mean โ€œdockerโ€ was not installed and it did not throw an error ?

Yes docker was not installed in the machine

Yes you must make sure the docker can mount a persistent folder for you to work on.

Ok, it would be nice to have a --user-folder-mounted that do the linking automatically

3 years ago
0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

So I installed docker, added user to group allowed to run docker (not to have to run with sudo, otherwise it fails), then ran these two commands and it worked

3 years ago
0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

I got some progress TimelyPenguin76 , Now the task runs and I get the error from docker:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

3 years ago
0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

So that I donโ€™t loose what I worked on when stopping the session, and if I need to, I can ssh to the machine and directly access the content inside the user folder

3 years ago
0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

Alright I have a followup question then: I used the param --user-folder โ€œ~/projects/my-projectโ€, but any change I do is not reflected in this folder. I guess I am in the docker space, but this folder is not linked to my the folder on the machine. Is it possible to do so?

3 years ago
0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

` Executing: ['docker', 'run', '-t', '--gpus', '"device=0"', '--network', 'host', '-e', 'CLEARML_WORKER_ID=office:worker-0:docker', '-e', 'CLEARML_DOCKER_IMAGE=nvidia/cuda:10.1-runtime-ubuntu18.04 --network host', '-v', '/home/user/.gitconfig:/root/.gitconfig', '-v', '/tmp/.clearml_agent.toc3_yks.cfg:/root/clearml.conf', '-v', '/tmp/clearml_agent.ssh.1dsz4bz8:/root/.ssh', '-v', '/home/user/.clearml/apt-cache.2:/var/cache/apt/archives', '-v', '/home/user/.clearml/pip-cache:/root/.cache/pip', '...

3 years ago
Show more results compactanswers