Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Profile picture
JitteryCoyote63
Moderator
215 Questions, 1023 Answers
  Active since 10 January 2023
  Last activity 3 months ago

Reputation

0

Badges 1

981 × Eureka!
0 Votes
27 Answers
2K Views
0 Votes 27 Answers 2K Views
Hi, similar to Task.set_offline(True), is there a way to simulate an execution in an agent? (for testing purposes)
3 years ago
0 Votes
2 Answers
2K Views
0 Votes 2 Answers 2K Views
Hi, a small bug (not really a bug) in the autoscaler: I have p3.2xlarge instances that take a long time to shutdown. With polling_interval_time_min=1 , the a...
4 years ago
0 Votes
14 Answers
2K Views
0 Votes 14 Answers 2K Views
4 years ago
0 Votes
6 Answers
2K Views
0 Votes 6 Answers 2K Views
Hi, Is there a way to stop a clearml-agent from within an experiment? Or block it to prevent it running any other task?
4 years ago
0 Votes
5 Answers
2K Views
0 Votes 5 Answers 2K Views
Hey again 😁 I am migrating my trains-server to AWS and I would like now to have secure accounts (with password). But I don't want to loose the current users...
4 years ago
0 Votes
11 Answers
2K Views
0 Votes 11 Answers 2K Views
Hi, I have a question regarding the aws-autoscaler: am I understanding correctly that: max_idle_time_min=5 max_spin_up_time_min=10 polling_interval_time_min=...
4 years ago
0 Votes
30 Answers
2K Views
0 Votes 30 Answers 2K Views
Hi guys, with the new venv caching available in clearml, I have the following problem: I force my pip requirements to be: torch==1.7.1 pytorch-ignite clearml...
4 years ago
0 Votes
2 Answers
2K Views
0 Votes 2 Answers 2K Views
Hi, is it possible to get an artifact from a Task and force not using local cache? The task itself updated the artifact in the meantime and I cannot get the ...
4 years ago
0 Votes
9 Answers
2K Views
0 Votes 9 Answers 2K Views
Another strange behavior of the python SDK CLI: after executing python my_task.py, where my_task.py creates and send to the queue an experiment, the command ...
4 years ago
0 Votes
13 Answers
2K Views
0 Votes 13 Answers 2K Views
3 years ago
0 Votes
10 Answers
2K Views
0 Votes 10 Answers 2K Views
Hey, what is the exact difference between agent.package_manager.system_site_packages and trains-agent --install-globally ?
5 years ago
0 Votes
5 Answers
2K Views
0 Votes 5 Answers 2K Views
2 years ago
0 Votes
2 Answers
2K Views
0 Votes 2 Answers 2K Views
Hi, in the AWS AutoScaler, I am getting the following warning: Warning! exception occurred: APIError: code 400/1004: Worker is not registered: worker=aws:A10...
4 years ago
0 Votes
12 Answers
2K Views
0 Votes 12 Answers 2K Views
Hey, would it possible to add an option to make task.upload_artifact() blocking? (Not running in background)
5 years ago
0 Votes
1 Answers
2K Views
0 Votes 1 Answers 2K Views
Small error in doc: https://allegro.ai/docs/references/trains_agent_ref/#daemon The detach parameter is shown in the command as --detached while it is listed...
5 years ago
0 Votes
2 Answers
2K Views
0 Votes 2 Answers 2K Views
Hi, I have a configuration file that I read and connect to my training tasks. I cannot use config = task.get_parameters_as_dict()["General"]["param"]["nested...
3 years ago
0 Votes
4 Answers
2K Views
0 Votes 4 Answers 2K Views
Hi all, I updated from clearml-server 1.14.1 to 1.15.0 and I am getting the following error while trying to start the server after running docker-compose pul...
one year ago
0 Votes
4 Answers
2K Views
0 Votes 4 Answers 2K Views
3 years ago
0 Votes
30 Answers
2K Views
0 Votes 30 Answers 2K Views
Hello, I tried the clearml-session CLI to start a jupyter instance on an agent, but an error with the password, here is the full CLI log: $ clearml-session -...
4 years ago
0 Votes
3 Answers
2K Views
0 Votes 3 Answers 2K Views
2 years ago
0 Votes
7 Answers
2K Views
0 Votes 7 Answers 2K Views
Hi, I am currently using CLEARML_AGENT_GIT_USER and CLEARML_AGENT_GIT_PASS when starting my clearml-agent and I would like to switch to using a single auth t...
3 years ago
0 Votes
4 Answers
2K Views
0 Votes 4 Answers 2K Views
Hi there, I think there is a bug with clearml sdk v0.17.5rc2: when running a task locally, the dashboard doesnt not shows the task as finished once the task ...
4 years ago
0 Votes
5 Answers
2K Views
0 Votes 5 Answers 2K Views
Hi, I would like to report something else weird in the clearml-agent 1.5.1 running in docker mode: In the logs, when it dumps its config, it writes: docker_c...
2 years ago
0 Votes
23 Answers
2K Views
0 Votes 23 Answers 2K Views
Hi, I started a trains-agent (0.15) in services mode (full command: trains-agent daemon --services-mode --detached --queue services --create-queue --docker u...
5 years ago
0 Votes
2 Answers
1K Views
0 Votes 2 Answers 1K Views
Hi all, how can I have a global variable used in a pipeline step? I have to define them in each pipeline step, otherwise they are not included in the pipelin...
one year ago
0 Votes
6 Answers
2K Views
0 Votes 6 Answers 2K Views
Hi, I am using the aws autoscaler and getting the following error while trying to spin up spot instances: 2021-08-16 17:18:48 Spinning new instance type=v100...
4 years ago
0 Votes
26 Answers
2K Views
0 Votes 26 Answers 2K Views
Hi, I attached an IAM role to an ec2 instance to grant access to an s3 bucket. The ec2 instance is running a clearml-agent (v1.1.0). I didn’t specify any key...
aws
4 years ago
0 Votes
11 Answers
2K Views
0 Votes 11 Answers 2K Views
Hi guys, following up on this https://allegroai-trains.slack.com/archives/CTK20V944/p1599135173096200?thread_ts=1599125260.076600&cid=CTK20V944 : I have a pi...
5 years ago
0 Votes
26 Answers
2K Views
0 Votes 26 Answers 2K Views
Hi, I would like to follow-up in this https://clearml.slack.com/archives/CTK20V944/p1646123127790389 happening on clearml server 1.2.0 (self hosted on a sing...
3 years ago
0 Votes
18 Answers
2K Views
0 Votes 18 Answers 2K Views
Hello there, I would like to do run cleanup code in case the user aborts one task from the dashboard (the agent is not using the task in docker). What signal...
5 years ago
Show more results questions
0 Hi, I Have Another Bug To Report For Clearml-Server 1.2 (Self Hosted) In The Console Logs Of An Experiments, I Cannot See The Latest Logs. Eg My Experiment Is Done, But I Can Only See The Logs Of To The Installation Of The Packages. If I Download The Log

Hi CostlyOstrich36 , this weekend I took a look at the diffs with the previous version ( https://github.com/allegroai/clearml-server/compare/1.1.1...1.2.0# ) and I saw several changes related to the scrolling/logging:
apiserver/bll/event/ http://log_events_iterator.py apiserver/bll/event/ http://events_iterator.py apiserver/config/default/services/_mongo.conf apiserver/database/model/ http://base.py apiserver/services/ http://events.pyI suspect that one of these changes might be responsible ...

3 years ago
0 Hi

Awesome! (Broken link in migration guide, step 3: https://allegro.ai/docs/deploying_trains/trains_server_es7_migration/ )

5 years ago
0 Hi, I Have Another Bug To Report For Clearml-Server 1.2 (Self Hosted) In The Console Logs Of An Experiments, I Cannot See The Latest Logs. Eg My Experiment Is Done, But I Can Only See The Logs Of To The Installation Of The Packages. If I Download The Log

Hi CostlyOstrich36 , one more observation: it looks like when I don’t open the experiment in the webUI before it is finished, then I get all the logs correctly. It is when I open the experiment in the webUI while it is running that I don’t see all the logs.
So it looks like there is an effect of caching (the logs are retrieved only once, when I open the experiment for the first time), and not afterwards (or rarely). Is that possible?

3 years ago
0 I Am Wondering Is It Possible To Schedule A Task To Run At Certain Time In Periodic Fashion Aka. Cron Style... Thinking Of Having A Monitoring Task To Be Run Routinely ... I Could Use A Cron On One Of The Server But Prefer To Run It On Trains As Then I Am

I don't think there is an example for this use case in the repo currently, but the code should be fairly simple (below is a rough draft of what it could look like)
` controller_task = Task.init(...)
controller_task.execute_remotely(queue_name="services", clone=False, exit_process=True)

while True:
periodic_task = Task.clone(template_task_id)
# Change parameters of {periodic_task} if necessary
Task.enqueue(periodic_task, queue="default")
time.sleep(TRIGGER_TASK_INTERVAL_SECS) `

5 years ago
0 Hi, If I Am Starting My Training With The Following Command:

I opened an https://github.com/pytorch/ignite/issues/2343 in ignite’s repo and a https://github.com/pytorch/ignite/pull/2344 , could you please have a look? There might be a bug in clearml Task.init in distributed envs

3 years ago
0 Hello, ~3 Months Ago I Created A Trains-Server In A Machine With 30Gb Of Disk Space. Today I Wasn'T Able To Connect To Trains-Server, So I Checked The Server And Found That The Disk Full. I Ran:

Thanks SuccessfulKoala55 !
Maybe you could add to your docker-compose file an option for limiting the size of the logs, since there is no limit by default, their size will grow for ever, which doesn't sound ideal https://docs.docker.com/compose/compose-file/#logging

4 years ago
0 Hello, I Would Like To Use Spot Instances Together With The Aws Autoscaler To Train Models With Pytorch/Ignite And I Am Wondering How To Support Interruptions During The Training (In Case The Instance Is Terminated By Aws). Is There Anything Already Built

Mmmh unfortunately not easily… I will try to debug deeper today, is there a way to resume a task from code to debug locally?
Something like replacing Task.init with Task.get_task so that Task.current_task is the same task as the output of Task.get_task

4 years ago
0 Hi, One More Question: When Creating A Task With Task.Init(), We Can Specify The

I still don't see why you would change the type of the cloned Task, I'm assuming the original Task had the correct type, no?

Because it is easier for me that I create a training task out of the controller task by cloning it (so that parameters are prefilled and I can set the parent task id)

5 years ago
0 Hi, How Does

There was no possible cache, the agent was running on a new ec2 instance

2 years ago
0 Hello, Is It Possible For The Clearml-Agent In Docker Mode To Not Pull A Specific Docker Image, But To Build One From The Experiment Repository Using The Dockerfile And .Dockerignore Of The Experiment Repository?

I don’t have a registry to push my image to.I think I can get around it actually - Will it work if I just build the image locally once, then start the agent? Docker would recognise that image locally and just use it right? I won’t need to update that image often anyway

3 years ago
0 Trains-Elastic | {"Type": "Server", "Timestamp": "2020-12-07T15:19:11,101Z", "Level": "Error", "Component": "O.E.B.Elasticsearchuncaughtexceptionhandler", "Cluster.Name": "Trains", "Node.Name": "Trains", "Message": "Uncaught Exception In Thread [Main]",

it worked for the other folder, so I assume yes --> I archived the /opt/trains/data/mongo, sent the archive via scp, unarchived, updated the rights and now it works

4 years ago
0 Hi There, I Used

AgitatedDove14 So I copied pasted locally the https://github.com/pytorch-ignite/examples/blob/main/tutorials/intermediate/cifar10-distributed.py from the examples of pytorch-ignite. Then I added a requirements.txt and called clearml-task to run it on one of my agents. I adapted a bit the script (removed python-fire since it’s not yet supported by clearml).

3 years ago
0 Hi There

Here is the minimal reproducable example.
Run test_task_a.py - It will register a dummy artifact, create a new task, set a parameter in that task and enqueue it test_task_b will try to retrieve parameter from parent task and fail

5 years ago
0 Hi, If I Am Starting My Training With The Following Command:

I fixed, will push a fix in pytorch-ignite 🙂

3 years ago
0 Hi, I Want To Upgrade Clearml Server From 1.1 To 1.2 (Self Hosted). I Have The Following Setup:

--- /data ---------- 48.4 GiB [##########] /elastic_7 1.8 GiB [ ] /shared 879.1 MiB [ ] /fileserver . 163.5 MiB [ ] /clearml_cache . 38.6 MiB [ ] /mongo 8.0 KiB [ ] /redis

3 years ago
Show more results compactanswers