so I end up having to clone the other ones manually in my code
Hi ConvolutedChicken69
Yes, the problem is that there is no standard for multi-repo environments.
The best solution I can come up with is using git submodules or packaging the auxiliary repo as wheels. wdyt?
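For the wheel route, a minimal sketch (assuming setuptools; the package name and layout are placeholders):

# setup.py at the root of the auxiliary repo
from setuptools import setup, find_packages

setup(
    name="my_aux_lib",  # placeholder name
    version="0.1.0",
    packages=find_packages(),
)

Then build the wheel with python setup.py bdist_wheel and list it in the task's requirements.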
Yes, offline mode got broken in 1.3.0 😞, the RC fixed it: pip install clearml==1.3.1rc0
Stable release later this week
Another question: do you have the argparse argument with type=str?
First let's verify with the manual change, but yes
command line 🙂
cmd.exe / bash
SubstantialElk6 if you call Task.init with continue_last_task=<task_id> it will automatically add the last_iteration of the previous run to any logging/report, so you never overwrite the previous reports 🙂
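A minimal sketch of that flow (project/task names and the task id are placeholders):

from clearml import Task

# Continue reporting into a previous run; iteration counters resume
# from the previous run's last_iteration, so earlier reports are kept
task = Task.init(
    project_name="examples",
    task_name="training",
    continue_last_task="aabbccdd11223344",  # placeholder: id of the run to continue
)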
SlipperyDove40
FYI: args = task.connect(args, name="Args")
Is "kind of" reserved section for argparse. Meaning you can always use it, but argparse will also push/pull things from there. Is there any specific reason for not using a different section name?
SlipperyDove40 following up on the missing section name, this seems like a backwards-compatibility issue. Try calling with backwards_compatibility=False:
my_params = task.get_parameters(backwards_compatibility=False)
This should always add the section name prefix.
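Roughly, the round-trip looks like this (parameter names are illustrative):

from clearml import Task

task = Task.init(project_name="examples", task_name="params-demo")

args = {"batch_size": 32, "lr": 0.001}
args = task.connect(args, name="Args")  # stored under the "Args" section

# with backwards_compatibility=False the keys come back prefixed,
# e.g. "Args/batch_size" instead of a bare "batch_size"
my_params = task.get_parameters(backwards_compatibility=False)
print(my_params)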
and the agent's default runtime mode is docker, correct?
Actually the default is venv mode; to run in docker mode add --docker to the command line.
So I could install all my system dependencies in my own docker image?
Correct. Inside the docker it will inherit all the preinstalled packages, but it will also install any missing ones (based on the Task requirements, i.e. the "Installed Packages" section).
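For example, you can pin your own image per task (the image name is a placeholder; passing a single image string is the simplest form of the call):

from clearml import Task

task = Task.init(project_name="examples", task_name="docker-demo")

# ask the agent (running with --docker) to execute this task inside
# your image; packages missing from the image are pip-installed on top
task.set_base_docker("my-registry/my-image:latest")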
Also, what is the purpose of the aws block in the clearml.conf?
ReassuredTiger98 regarding the agent error: can you see the package some_package in the "Installed Packages" in the UI? Was it installed? Are you using pip or conda as the package manager in the agent (check the clearml.conf)? Is the agent running in docker mode?
Thanks ShallowCat10 !
I'll make sure we fix it 🙂
Thank you! 😊
How can I reproduce it?
Hi @<1562610699555835904:profile|VirtuousHedgehong97>
I think you need to upgrade your self-hosted clearml-server, could that be the case?
Thanks! Let me check something
Hi GreasyPenguin14
Yes, I think you are right the series name should be next to the title. Let me check it...
Not yet 😞
It should not be complex to implement; the actual AWS auto-scaler class implements just two functions:
def spin_up_worker(self, resource, worker_id_prefix, queue_name):
https://github.com/allegroai/clearml/blob/e9f8fc949db7f82b6a6f1c1ca64f94347196f4c0/clearml/automation/auto_scaler.py#L104
def spin_down_worker(self, instance_id):
https://github.com/allegroai/clearml/blob/e9f8fc949db7f82b6a6f1c1ca64f94347196f4c0/clearml/automation/auto_scaler.py#L...
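A rough sketch of a custom scaler along those lines (the boto3 calls and the resource fields are assumptions for illustration, not the shipped implementation):

import boto3
from clearml.automation.auto_scaler import AutoScaler

class MyAutoScaler(AutoScaler):
    def spin_up_worker(self, resource, worker_id_prefix, queue_name):
        # assumption: launch an EC2 instance whose user-data starts a
        # clearml-agent that listens on queue_name
        boto3.client("ec2").run_instances(
            ImageId=resource["ami_id"],              # hypothetical resource field
            InstanceType=resource["instance_type"],  # hypothetical resource field
            MinCount=1,
            MaxCount=1,
        )

    def spin_down_worker(self, instance_id):
        # terminate the instance backing an idle worker
        boto3.client("ec2").terminate_instances(InstanceIds=[instance_id])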
Hi JealousParrot68
This is the same as:
https://clearml.slack.com/archives/CTK20V944/p1627819701055200
and,
https://github.com/allegroai/clearml/issues/411
There is something odd happening in the files-server as it replaces the header (i.e. guessing the content of the stream) and this breaks the download (what happens is the clients automatically ungzip the csv).
We are working on a hotfix for the issue (BTW: if you are using object-storage / shared folders, this will not happen)
Hi ResponsiveCamel97
Let me explain how it works: essentially it creates a new venv inside the docker, inheriting all the packages from the main system packages.
This allows it to use the installed packages if the versions match, and upgrade/change them if you need, all without rebuilding a new container. Make sense?
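The knob controlling the inheritance, assuming the stock agent configuration layout in clearml.conf, is roughly:

agent {
    package_manager {
        # let the venv see packages already installed in the docker / system python
        system_site_packages: true
    }
}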
Hi SubstantialElk6
ClearML-Serving is already out with a new version; the ETA for the next ClearML-Serving full 1.0 (which is the redesigned version) is the end of May.
In the "installed packages" section you should have "nvidia-dali-cuda110" In the agent's clearml.conf you should add:extra_index_url: ["
", ]
https://github.com/allegroai/clearml-agent/blob/master/docs/clearml.conf#L78
Should solve the issue
This makes no sense to me 😞
Both are reading the exact same file, and using the same session / flow ...
Maybe there is an error with the "verify_certificate" setting on the agent?
DM me the entire log, I would assume this is something with the configuration
GiganticTurtle0 found it, fix will be pushed tomorrow 🙂
Yep... they are pushing "heavy" users away from these instances. Nothing really you can do, maybe switch to Azure/GCP, but it might be the same there
Hmm, can you send the full log of the pipeline component that failed? Because this should have worked.
Also, could you test it with the latest clearml python version (i.e. 1.10.2)?