I have manually verified that the line-by-line content of the CSV files is identical using hashlib.sha256(). The files are generated by the same process (literally just rerunning the same code twice), so why would ClearML treat them differently when the content is the same?
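A check along these lines can be sketched as follows. This is only a minimal illustration using the standard library; the helper names (`line_digests`, `file_digest`, `same_content`) are hypothetical and not necessarily the exact script that was run. Reading in binary mode means differences in line endings or encoding also show up in the hashes.

```python
import hashlib
from pathlib import Path


def line_digests(path):
    """Return a SHA-256 hex digest for every line of the file (raw bytes)."""
    with open(path, "rb") as handle:
        return [hashlib.sha256(line).hexdigest() for line in handle]


def file_digest(path):
    """Return a single SHA-256 hex digest over the whole file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_content(path_a, path_b):
    """True only if both the per-line digests and the whole-file digest match."""
    return (
        line_digests(path_a) == line_digests(path_b)
        and file_digest(path_a) == file_digest(path_b)
    )
```

If `same_content` returns True for two files, any difference in how they are treated has to come from something other than the bytes themselves (for example file paths or metadata), which is the question being asked here.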
Thanks for your reply @<1523701070390366208:profile|CostlyOstrich36> Is there an example where a pipeline is built from existing tasks? I'd like to experiment with it and I don't see any examples of what you describe with my (clearly lacking) google-fu. What happens if you wrap a function that calls Task.init() with a pipeline decorator? Or is that the process you're speaking of?
@<1523701070390366208:profile|CostlyOstrich36> ClearML: 1.10.1, I'm not self-hosting the server so whatever the current version is. Unless you mean the operating system?
@<1523701435869433856:profile|SmugDolphin23> Good to know.
I found I was having this issue as well. I don't have an alias defined in the pipeline but in a task and I get the same error. I'm not hosting my own server but using the free web service at the moment.
I'd like to provide the credentials to any ec2 instances that are spun up.
@<1523701070390366208:profile|CostlyOstrich36> Just pinging you 😄
The answer is simple but also not completely obvious to someone new to the platform. You can inject new command-line args that Hydra will recognize; this is what the Hydra section of args is for. However, if you enable _allow_omegaconf_edit_: True, I think ClearML will “inject” the OmegaConf saved under the configuration object of the prior run, overwriting the overrides. I’ll experiment with this behavior a bit more to be sure.
The git credentials are stored in the agent config and they worked when I tested them on another project (not for setting up the environment but for downloading the repo of the task itself).
I figured as much. This is basically what I was planning to do otherwise. I have questions around that.
- It appears that the 'extra' config is displayed in plain text on the web app and downloadable as JSON. I was just curious if this is best practice.
- I noticed that in the AWS instance that's spun up when starting the autoscaler there are three settings in the config:
use_credentials_chain: false, use_iam_instance_profile: false, use_owner_token: False
are these strictly for the credentials t...
Ah, I think I see the issue. In my head I was crossing ID with URL.
Sorry I disappeared (went on a well-deserved vacation). The problem is happening because of the ordering of the install. If I install using pip install -r ./requirements.txt then pip installs the packages in the order of the requirements file. However, during the installation process from ClearML, it installs the packages in order UNLESS there's a custom path provided, in which case that package is saved for last. The reason this breaks my code is I have later packages that depend on the custom packages, as ...
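To make the ordering difference concrete, here is a small sketch that models the behavior as described in this message. It is not ClearML agent code, and the rule for what counts as a "custom path" (`is_custom`) as well as the package names are assumptions made only for illustration.

```python
def is_custom(line):
    """Assumed rule: direct references, VCS URLs, editable installs, local paths."""
    return " @ " in line or line.startswith(("./", "../", "/", "git+", "-e "))


def deferred_install_order(requirement_lines):
    """Return the lines with 'custom' entries moved to the end, order otherwise kept."""
    cleaned = [
        line.strip()
        for line in requirement_lines
        if line.strip() and not line.strip().startswith("#")
    ]
    regular = [line for line in cleaned if not is_custom(line)]
    custom = [line for line in cleaned if is_custom(line)]
    return regular + custom


# Hypothetical requirements file: my-plugin depends on my-core-lib.
requirements = [
    "my-core-lib @ git+https://example.com/org/my-core-lib.git",
    "my-plugin",
    "numpy",
]

for line in deferred_install_order(requirements):
    print(line)
```

With the file order, `my-core-lib` is handled before `my-plugin`. With the deferred order, `my-plugin` comes first and its dependency on the custom package is not yet available, which matches the breakage described.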
Oh, duh. I'll test that out. But I did have agent.force_git_ssh_protocol: true set.
Well, if I stop the cron service and start it back up I don't have to re-register each schedule. If, for instance, I start the TaskScheduler, register a task, and stop the scheduler, how do I restart the TaskScheduler in a way that re-registers the tasks? In theory, they could have been registered by several users and I might be unaware of tasks that were previously scheduled. What is the best practice for preserving state?
It's even attempting to install omegaconf but not from the repo, likely because it's a dependency of hydra-colorlog.
Collecting omegaconf<2.4,>=2.2
Using cached omegaconf-2.2.3-py3-none-any.whl (79 kB)
Using cached omegaconf-2.2.2-py3-none-any.whl (79 kB)
Using cached omegaconf-2.2.1-py3-none-any.whl (78 kB)
That's what I was getting at. It wasn't clear to me from the documentation that it saves the state.
Thanks, that's exactly what I was looking for.
They will be related through the task. Get the task information from the dataset, then get the model information from the task.
Thanks for the reply @<1523701070390366208:profile|CostlyOstrich36> !
It says in the documentation that:
Add a folder into the current dataset. calculate file hash, and compare against parent, mark files to be uploaded
It seems to recognize the dataset as another version of the data but doesn't seem to be validating the hashes on a per-file basis. Also, if you look at the photo, it seems like some of the data does get recognized as the same as the prior data. It seems like it's the correct...
I think this error occurred for me because when I first authenticated with the project I was using username/password and later I transitioned to using ssh keys. That's why clearing the cache worked.
Did you validate that branch exists on remote?
I'm not self-hosting the server.
Do you start the clearml agents on the server with the same user that has the credentials saved?
Why? That's not how I authenticate. Also, if it was simply an issue with authentication wouldn't there be some error message in the log?
Provide a bit more detail. What framework are you using?
Unfortunately, that doesn't seem to have solved the problem. I tried the same thing with https and it seems to skip the lines with the @ symbol like it did before. Honestly, it seems more like it just isn't parsing those lines during the install.
Collecting darts==0.25.0
Using cached darts-0.25.0-py3-none-any.whl (760 kB)
Collecting lightgbm
Using cached lightgbm-4.1.0-py3-none-manylinux_2_28_x86_64.whl (3.1 MB)
Collecting prophet
Using cached prophet-1.1.4-py3-none-manylinux_2_1...
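One way to confirm which requirement lines never show up in a log like this is to compare the package names in requirements.txt against the "Collecting ..." lines. The sketch below uses only the standard library; the function names and the example custom package are hypothetical, and bare URL lines without a package name are ignored because no name can be read from them.

```python
import re

NAME_RE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:\[.*?\])?\s*(?:@|[=<>!~]=?|;|$)")
LOG_RE = re.compile(r"^\s*(?:Collecting|Requirement already satisfied:)\s+([A-Za-z0-9][A-Za-z0-9._-]*)")


def normalize(name):
    """Normalize a project name (lowercase, runs of -, _, . collapsed to -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(line):
    """Return the normalized package name of a requirement line, or None."""
    line = line.strip()
    if not line or line.startswith(("#", "-")):
        return None
    match = NAME_RE.match(line)
    return normalize(match.group(1)) if match else None


def names_in_log(pip_log):
    """Collect normalized package names that pip reported handling."""
    found = set()
    for line in pip_log.splitlines():
        match = LOG_RE.match(line)
        if match:
            found.add(normalize(match.group(1)))
    return found


def missing_from_log(requirement_lines, pip_log):
    """Return requirement names, in file order, that never appear in the log."""
    seen = names_in_log(pip_log)
    names = [requirement_name(line) for line in requirement_lines]
    return [name for name in names if name is not None and name not in seen]
```

Running `missing_from_log` on the requirements file and the agent's install log would list the packages whose lines appear to have been skipped, for instance the ones declared with the @ syntax.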
That's great! I look forward to trying this out.
Is there currently a way to bind the same GPU to multiple queues? I believe the agent complained the last time I tried (which was a while ago).
Thanks again for the info. I might experiment with it to see first hand what the advantages are.
Is it possible the cached repository was cloned before you changed your agent settings?
Which settings are you referring to? I can't remember if I was using https auth when the project would have been first cached. Would that make a difference?
Also, did you set agent.enable_git_ask_pass: true?
The only instance of it in the config is commented out.
# if set, use GIT_ASKPASS to pass user/pass when cloning / fetch repositories
# it solves pas...