And the agent section on this machine is:
api_server:
web_server:
files_server:
Is that correct?
yes, or (because I deployed clearml using helm in kubernetes) from the same machine, but multiple pods (tasks).
Oh, now I see. Long story short, no. The correct way of doing that is for every node/pod to create its own dataset.
Then, when you are done, you create a new version with the X datasets that you created as parents. The newly created version is just "meta": it basically tells the system how to combine the previously generated datasets (i.e. no data is actually re-uploa...
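The flow described above (one dataset per pod, then a parent-only "meta" version) could be sketched roughly as follows. This is a minimal sketch, not a definitive recipe: `create_merged_version` and its `dataset_cls` argument are hypothetical names added for illustration, and it assumes `Dataset.create(..., parent_datasets=[...])`, `upload()` and `finalize()` behave as in recent `clearml` releases.

```python
from typing import Any, List, Optional


def create_merged_version(
    parent_ids: List[str],
    name: str,
    project: str,
    dataset_cls: Optional[Any] = None,
) -> Any:
    """Create a "meta" dataset version whose parents are the per-pod datasets.

    `dataset_cls` is a hypothetical injection point (not part of ClearML);
    when omitted, the real `clearml.Dataset` class is used.
    """
    if dataset_cls is None:
        # Imported lazily so the sketch can be read without a configured server.
        from clearml import Dataset

        dataset_cls = Dataset

    # No files are added here: the new version only references its parents.
    merged = dataset_cls.create(
        dataset_name=name,
        dataset_project=project,
        parent_datasets=parent_ids,
    )
    # Assumption: calling upload() with nothing pending is harmless and
    # keeps the usual create -> upload -> finalize order.
    merged.upload()
    merged.finalize()
    return merged
```

Each pod would record the ID of the dataset it created, and a final step would pass those IDs as `parent_ids`.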
Oh!
I see this is using the colab as remote agent (i.e. to launch jobs on it),
[ColabKernelApp] CRITICAL | Bad config encountered during initialization: The 'kernel_class' trait of <main.ColabKernelApp object at 0x7fa41b29e5c0> instance must be a type, but 'google.colab._kernel.Kernel' could not be imported
Can you send the full log?
Hi @<1523702307240284160:profile|TeenyBeetle18>
and the URL of the model refers to a local file, not to the remote storage.
Do you mean that in the Model tab when you look into the model details the URL points to a local location (e.g. file:///mnt/something/model) ?
And your goal is to get a copy of that model (file) from your code, is that correct ?
but the logger info is missing.
What do you mean? Can I reproduce it ?
BTW: The code sample you shared is very similar to how you create pipelines in ClearML, no?
(Also, could you expand on how you create the Kedro node? On the face of it, it looks like another function in the repo, but I have a feeling I'm missing something.)
Thanks EnviousStarfish54
Let me check if I can reproduce it
Why does my task execution freeze after pip installation (running agent in foreground mode)?
Hi AdventurousButterfly15
Are you running in agent docker mode or venv mode ?
What do you mean by freeze? Do you see anything in the Task console log in the UI? What's the host OS?
Hi PompousParrot44
What do you have in the Execution/"script path" ?
Checkout the trains-agent repo https://github.com/allegroai/trains-agent
It is fairly straightforward.
Are Kwargs supported in functions decorated as a pipeline component?
They are, but I think the main issue is the casting: without prior knowledge, everything will be a string.
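To illustrate the casting issue, here is a minimal pure-Python sketch of a function that receives its keyword arguments as strings and casts the ones it knows about. The helper names (`cast_kwargs`, `parse_bool`) and the parameter names are hypothetical, and nothing here is ClearML API; it only shows one way to handle untyped kwargs inside a component body.

```python
from typing import Any, Callable, Dict


def cast_kwargs(
    raw: Dict[str, Any], casters: Dict[str, Callable[[Any], Any]]
) -> Dict[str, Any]:
    """Cast known keys with the given callable; leave unknown keys untouched."""
    return {
        key: casters[key](value) if key in casters else value
        for key, value in raw.items()
    }


def parse_bool(value: Any) -> bool:
    # bool("False") is True in Python, so booleans need explicit parsing.
    return str(value).strip().lower() in ("1", "true", "yes")


def train(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical component body: cast the string kwargs before using them."""
    params = cast_kwargs(
        kwargs, {"epochs": int, "lr": float, "shuffle": parse_bool}
    )
    return params
```

Declaring explicit, typed arguments instead of `**kwargs` avoids the need for this kind of manual casting where the argument names are known in advance.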
Before exposing our IP to the world, I suggest going over the security advisory in the docs.
as a general note, do not expose your server, the open source version is not designed for it, just put it inside your VPN and it will be fine
Hi HollowFish37
I think I have good news for you, the clearml-agent is only communicating with the api endpoint, so as long as this is secure, you should be fine. Do notice that the default files server endpoint should be secure as well, as by default it will allow any upload/download
Well it is there, do you have it in your docker-compose as well?
https://github.com/allegroai/trains-server/blob/master/docker-compose.yml#L55
FreshReindeer51
Could you provide some logs ?
With pleasure
Oh you can definitely use the RestAPI, but in this specific case, I'm not sure there is something better.
(BTW: Look for APIClient, it is a Pythonic interface for the RestAPI)
Hi RoughTiger69
How about using the pipeline decorator as a way to run this logic?
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
I think I'm missing the context of where the code is executed....
btw: you can now set the configuration_objects directly when calling add_step
https://clearml.slack.com/archives/CTK20V944/p1633355990256600?thread_ts=1633344527.224300&cid=CTK20V944
- Be able to trigger the "pure" function (e.g. train()) locally, without any http://clear.ml code running, while driving it from a configuration e.g. path to the data.
When you say "without any http://clear.ml code", do you mean without the agent, or without using the Clearml.Dataset?
Be able to trigger the "decorator" (e.g. train_clearml()) while driving it from configuration e.g. dataset_id
Hmm I can think of:
` def train_clearml(local_folder=None, dataset_id=None):
...
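One way to read the idea above is sketched below. It is an assumption about intent, not the original snippet (which is cut off): a "pure" `train()` that only needs a folder, and a thin `train_clearml()` wrapper that resolves `dataset_id` to a local folder. The body of `train()` is a placeholder, and the wrapper relies on `Dataset.get(dataset_id=...).get_local_copy()` from `clearml`, imported only when a dataset ID is actually passed.

```python
from pathlib import Path
from typing import Optional


def train(data_folder: str) -> int:
    """The "pure" function: no ClearML imports, driven only by a path."""
    # Placeholder for real training logic: count the files in the folder.
    return sum(1 for p in Path(data_folder).iterdir() if p.is_file())


def train_clearml(
    local_folder: Optional[str] = None, dataset_id: Optional[str] = None
) -> int:
    """Wrapper: resolve a ClearML dataset to a local folder, then call train()."""
    if dataset_id is not None:
        # Lazy import, so running with local_folder needs no ClearML at all.
        from clearml import Dataset

        local_folder = Dataset.get(dataset_id=dataset_id).get_local_copy()
    if local_folder is None:
        raise ValueError("Pass either local_folder or dataset_id")
    return train(local_folder)
```

With this split, `train("/path/to/data")` runs locally without any ClearML code, while `train_clearml(dataset_id=...)` is the entry point driven by a dataset ID.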
Hi EagerOtter28
The agent knows how to do the http->ssh conversion on the fly. In your clearml.conf (on the agent's machine) set force_git_ssh_protocol: true
https://github.com/allegroai/clearml-agent/blob/42606d9247afbbd510dc93eeee966ddf34bb0312/docs/clearml.conf#L25
So the thing is, clearml automatically detects the last iteration of the previous run. My assumption is that you also add it, hence the double shift.
SourOx12 could that be it?
Sometimes it is working fine, but sometimes I get this error message
@<1523704461418041344:profile|EnormousCormorant39> can I assume there is a gateway at --remote-gateway <internal-ip>?
Could it be that this gateway has some network firewall blocking some of the traffic ?
If this is all local network, why do you need to pass --remote-gateway ?
When you clone the Task, it might be before it is done syncing git / packages.
Also, since you are using 0.16 you have to have a section name (Args or General etc.)
How will task b use the parameters ? (argparser / connect dict?)
What's the trains version / trains-server version ?
Yes, the one you create manually is not really of the same "type" as the one you create online, this is why you do not see it there
JitteryCoyote63 the new wizard was pushed, you can check it out here:
https://github.com/allegroai/trains/blob/master/examples/services/aws-autoscaler/aws_autoscaler.py
BTW: next release to include it all is next week (hopefully :))
is it possible to perform debugging operations with pycharm integration using remote session?
Sure, use clearml-session. It will open an SSH connection to the remote machine, then you can use PyCharm.
Apparently it ignores it and replaces everything...