
Yeah, it's because it's just hooking into the save operation and capturing the output, regardless of the parent call.
Why? That's not how I authenticate. Also, if it was simply an issue with authentication wouldn't there be some error message in the log?
The git credentials are stored in the agent config, and they worked when I tested them on another project (not for setting up the environment but for downloading the repo of the task itself).
Sorry I disappeared (went on a well-deserved vacation). The problem is happening because of the ordering of the install. If I install using `pip install -r ./requirements.txt`, then pip installs the packages in the order of the requirements file. However, during the installation process from ClearML, it installs the packages in order UNLESS there's a custom path provided, in which case that package is saved for last. The reason this breaks my code is that I have later packages that depend on the custom packages, as ...
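To make the ordering difference concrete, here is a minimal sketch of the behavior as I observed it. It is not ClearML's actual implementation; the function name, the rule for what counts as a "custom path", and the package names are all assumptions made for illustration.

```python
def reorder_like_described(requirement_lines):
    """Keep the listed order, but move custom-path entries to the end."""

    def has_custom_path(line):
        # Assumption: a "custom path" is a VCS, URL, or local-path requirement.
        prefixes = ("git+", "./", "/", "http://", "https://", "-e ")
        return line.startswith(prefixes) or " @ " in line

    # Drop blank lines and comments, as a requirements parser would.
    cleaned = [ln.strip() for ln in requirement_lines
               if ln.strip() and not ln.strip().startswith("#")]
    regular = [ln for ln in cleaned if not has_custom_path(ln)]
    custom = [ln for ln in cleaned if has_custom_path(ln)]
    return regular + custom


requirements = [
    "numpy",
    "git+https://example.com/org/custom_pkg.git",  # hypothetical custom package
    "depends_on_custom_pkg",  # hypothetical package that needs custom_pkg first
]
print(reorder_like_described(requirements))
```

In the reordered list, the hypothetical `depends_on_custom_pkg` ends up ahead of the custom package it depends on, which is the situation that breaks the install.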
The verbose output:
Generating SHA2 hash for 123 files
100%|██████████████████████████████████████████████████████████| 123/123 [00:00<00:00, 310.04it/s]
Hash generation completed
Add 2022-12.csv
Add 2020-10.csv
Add 2021-06.csv
Add 2022-02.csv
Add 2021-04.csv
Add 2013-03.csv
Add 2021-02.csv
Add 2015-02.csv
Add 2016-07.csv
Add 2022-05.csv
Add 2021-10.csv
Add 2018-04.csv
Add 2019-06.csv
Add 2017-11.csv
Add 2016-01.csv
Add 2013-06.csv
Add 2018-08.csv
Add 2020-05.csv
Add 2020-03.csv
Add 20...
Do you start the clearml agents on the server with the same user that has the credentials saved?
Will this return a list of datasets?
If I wanted to do this with the ID, how would I approach it?
Actually, clearing the cache on the other project might have fixed it. I just tested it out and it seems to be working.
@SmugDolphin23 Yeah, I just wanted to validate it was worth spending the time. Since there is already a parameter that takes a callable (i.e. `schedule_function`), it might make sense to reuse that parameter. If it returns a str, we validate that it's a task, and if it is, we can run the task as if we had originally passed it as the `task_id` in `.add_task()`. This would only be a breaking change if the callable that was passed happened to return a `task_id`
...
So far, when I use the web interface to delete a task or dataset that has artifacts on S3, it doesn't prompt me for credentials.
I'm not sure why the logs were incomplete. I think part of the reason it wasn't pulling from the repo was that it was pulling from cache. I cleared the clearml cache for that project and reran it. This should be the full log.
I'm using pro. Sorry for the delay, I didn't notice I never sent the response.
Alright, I'll try and put that together for Monday.
I'm not self-hosting the server.
The answer is simple but also not completely obvious to someone new to the platform. You can inject new command line args that Hydra will recognize. This is what the Hydra section of args is for. However, if you enable `_allow_omegaconf_edit_: True`, I think ClearML will “inject” the OmegaConf saved under the configuration object of the prior run, overwriting the overrides. I’ll experiment with this behavior a bit more to be sure.
Hi Jake 👍 ,
Maybe the content is cached? The repo isn't big. I didn't realize the log was missing content. I believe I copied everything but I'll double check in a moment.
Ah, I think I see the issue. In my head I was crossing ID with URL.
I might have found the answer. I'll reply if it works as expected.
Thanks again for the info. I might experiment with it to see first hand what the advantages are.
Yes, it indeed appears to be a regex issue. If I run:

    Dataset.list_datasets(
        dataset_project=self.task.get_project_name(),
        partial_name=re.escape('[LTV] Dataset Test'),
        only_completed=True,
    )

It works as expected. I'm not sure how raw you want to leave the `partial_name` features. I could create a PR to fix this, but would you want me to `re.escape` at the `list_datasets()` level? Or go deeper and do it at `Task._query_task...
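For anyone wondering why the unescaped name fails, here is a small standalone check using only Python's `re` module. It assumes the name ends up being interpreted as a regular expression, which is what the behavior above suggests; it does not call ClearML at all.

```python
import re

name = "[LTV] Dataset Test"

# Unescaped, "[LTV]" is a character class matching ONE of L, T, or V,
# so the pattern looks for e.g. "V Dataset Test", which is not in the name.
raw_match = re.search(name, name)

# Escaped, the brackets are treated as literal characters.
escaped_match = re.search(re.escape(name), name)

print(raw_match is None)
print(escaped_match is not None)
```

The unescaped pattern does not even match its own text, while the escaped one does.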
I found I was having this issue as well. I don't have an alias defined in the pipeline but in a task and I get the same error. I'm not hosting my own server but using the free web service at the moment.
Results:
I first tried uncommenting `enable_git_ask_pass: false`, but it didn't resolve the issue.
I then cleared the cache in the `vcs-cache` folder, and that did fix the issue. This is the second time the cache seemed to have been the root cause of the problem. At some point I did move from token-based auth to SSH keys. Would this require clearing the cache for any project that was cached prior to the auth change?
Depending on the framework you're using, it'll just hook into the save model operation, so it captures the model every time you save it, which will probably happen every epoch for some subset of the training. If you want to do it with the existing framework, you could change the checkpointing so that it only clones the best model in memory and saves the write operation for last. The risk with this is that if the training crashes, you'll lose your best model.
Optionally, you could also disable the ClearML integration with...
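A rough, framework-agnostic sketch of the "keep the best model in memory, write once at the end" idea follows. The training step, the scoring function, and the use of `pickle` are placeholders I made up for illustration; a real setup would use the framework's own save call, which is the one write that would get captured.

```python
import copy
import os
import pickle
import tempfile


def train(num_epochs, evaluate, checkpoint_path):
    state = {"weights": 0.0}  # stand-in for real model parameters
    best_score = float("-inf")
    best_state = None
    for _ in range(num_epochs):
        state["weights"] += 1.0  # stand-in for a real training step
        score = evaluate(state)
        if score > best_score:
            best_score = score
            # Clone in memory instead of writing to disk every epoch.
            best_state = copy.deepcopy(state)
    # Single write at the very end; a crash before this line loses the model.
    if best_state is not None:
        with open(checkpoint_path, "wb") as f:
            pickle.dump(best_state, f)
    return best_score


def toy_score(state):
    # Hypothetical metric that peaks when weights == 3.0.
    return -((state["weights"] - 3.0) ** 2)


path = os.path.join(tempfile.mkdtemp(), "best_model.pkl")
best = train(5, toy_score, path)
```

The trade-off is the one mentioned above: nothing is persisted until training finishes.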
I see. Thanks for the insight. That seems to be the case. I'm struggling a bit with datasets, for example when trying to trace the genealogy of a dataset that's used by traditional tasks and pipelines. I'll try and write something up about the challenges around that when I get the chance. But your comment revealed another issue:
It appears that the partial name matching isn't going well. I'm unclear why this wouldn't be matching. In the attached photo you can see the input for `partial_nam...
I'd like to provide the credentials to any ec2 instances that are spun up.
Oh, I get what's happening. That segment of the code is rerun when the task is enqueued remotely. So it's deleting itself. This also explains why it works fine locally. It's an ouroboros, the task is deleting itself.
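One way to guard against this kind of self-deletion is to filter the running task's own ID out of whatever is about to be deleted. The helper below is a hypothetical sketch, not part of ClearML; in a real script I'd assume the current ID comes from something like `Task.current_task().id`, and the IDs shown here are made up.

```python
def ids_safe_to_delete(candidate_ids, current_task_id):
    """Return the candidate IDs, excluding the task that is currently running."""
    return [task_id for task_id in candidate_ids if task_id != current_task_id]


# Hypothetical IDs: "bbb" stands for the task that was enqueued remotely.
candidates = ["aaa", "bbb", "ccc"]
print(ids_safe_to_delete(candidates, current_task_id="bbb"))
```

With a filter like this, the cleanup code can rerun on the remote agent without removing the task that is executing it.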
It hooks into the calls made by the code. If you never save the model to disk, add it to a tool like MLflow/Tensorboard, or manually add the artifact to ClearML, afaik it won't save the artifact.