Reputation
Badges 1
98 × Eureka!@<1523701205467926528:profile|AgitatedDove14> Then it isn't working at intended. To test it I started the scheduler and set a simple dead man snitch process to run once a day. In the web-app (on your site app.cleearml.ml), when looking at the scheduler process in the DevOps section, I was able to see a configuration file under artifacts but it was not as all obvious how you'd change that because it wasn't part of the configuration section, it was just an artifact. So I thought maybe it was b...
1707128614082 bigbrother:gpu0 INFO task 59d23c5919b04fd6947c1e463fa8c78c pulled from 9890a035b8f84872ab18d7ff207c26c6 by worker bigbrother:gpu0
Current configuration (clearml_agent v1.7.0, location: /tmp/.clearml_agent.vo_oc47r.cfg):
----------------------
agent.worker_id = bigbrother:gpu0
agent.worker_name = bigbrother
agent.force_git_ssh_protocol = true
agent.python_binary = /home/natephysics/anaconda3/bin/python
agent.package_manager.type = pip
agent.package_manager.pip_version.0 = ...
Results:
I first tried uncommenting enable_git_ask_pass: false
but it didn't resolve the issue.
I then cleared the cache in the vcs-cache
folder, and that did fix the issue. This is the second time the cache seemed to have been the root cause of the problem. At some point I did move from token-based auth to ssh keys. Would this require clearing the cache for any project that was cached prior to the auth change?
I think this error occurred for me because when I first authenticated with the project I was using username/password and later I transitioned to using ssh keys. That's why clearing the cache worked.
Did you validate that branch exists on remote?
Do you start the clearml agents on the server with the same user that has the credentials saved?
Is it possible the cached repository was cloned before you changed your agent settings?
Which settings are you referring to? I can't remember if I was using https auth when the project would have been first cached. Would that make a difference?
Also, did you set
agent.enable_git_ask_pass: true
?
The only instance of it in the config is commented out.
# if set, use GIT_ASKPASS to pass user/pass when cloning / fetch repositories
# it solves pas...
Thanks for always checking in @<1523701087100473344:profile|SuccessfulKoala55> 😛
Oh, duh. I'll test that out. But I did have the agent.force_git_ssh_protocol: true
It's verbatim from requirements as I pass that into ClearML.
It's even attempting to install omegaconf but not from the repo, likely because it's a dependency of hydra-colorlog.
Collecting omegaconf<2.4,>=2.2
Using cached omegaconf-2.2.3-py3-none-any.whl (79 kB)
Using cached omegaconf-2.2.2-py3-none-any.whl (79 kB)
Using cached omegaconf-2.2.1-py3-none-any.whl (78 kB)
Unfortunately, that doesn't seem to have solved the problem. I tried the same thing with https and it seems to skip the lines with the @ symbol like it did before. Honestly, it seems more like it just isn't parsing those lines during the install.
Collecting darts==0.25.0
Using cached darts-0.25.0-py3-none-any.whl (760 kB)
Collecting lightgbm
Using cached lightgbm-4.1.0-py3-none-manylinux_2_28_x86_64.whl (3.1 MB)
Collecting prophet
Using cached prophet-1.1.4-py3-none-manylinux_2_1...
Sure. I'm in Europe but we can also test things async.
Sounds good. Lmk if there's some changes that are required.
Thanks Eugen for the quick reply. If I can add a suggestion/comment from my perspective: Why is schedule_function
included in the .add_task()
method? As far as I can tell if you use schedule_function
it changes the very nature of the method, it's no longer adding a task but adding a function . It seems like it would make more sense if this was broken into something like an .add_function()
method. Also, if you call schedule_function
many of the other parameters in `.add...
@<1523701435869433856:profile|SmugDolphin23> Yeah, I just wanted to validate it was worth spending the time. Since there is already a parameter that takes callable (i.e. schedule_function
) it might make sense that we reuse the parameter. If it returns a str we validate that it's a task and if it does we can run the task as if we originally passed it as the task_id
in .add_task()
. This would only be a breaking change if the callable that was passed happened to return a task_id
...
I figured you'd say that so I went ahead with that PR. I got it working but I'm going to test it a bit further.
Are you self hosting a ClearML server?
I'm not self-hosting the server.
So far when I delete a task or dataset using the web interface that has artifacts on S3 it doesn't prompt me for credentials.
Ah, I think I see the issue. In my head I was crossing ID with URL.
Hi @<1523701205467926528:profile|AgitatedDove14> . I think I'm misunderstanding something here. I have the scheduler service running. Now that it's running how does one add a new task or remove an existing task from the scheduler? I get that I can add them before starting the scheduler service but once the service is running is there any way to connect to it and change the schedule?
I thought the advantage of this service would be we could schedule tasks just by connecting to the existing t...
Thanks, that's exactly what I was looking for.
Thanks for your reply @<1523701070390366208:profile|CostlyOstrich36> Is there an example where a pipeline is built from existing tasks? I'd like to experiment with it and I don' t see any examples of what you describe with my (clearly lacking) google-fu. What happens if you wrap a function with a task.init() with a pipeline decorator or is that the process you're speaking of?
Actually, clearing the cache on the other project might have fixed it. I just tested it out and it seems to be working.