I will try to collect the installation steps in a document and share it with the community once ready
Thank you! This will be awesome!
We're here if you need anything 🙂
WickedGoat98 nice!!
Can you also get past the login screen (i.e. can you access the API server)?
Hi PanickyMoth78
So do not tell anyone, but the next version will have reports built into ClearML, as well as the ability to embed graphs in 3rd-party tools (think Notion, GitHub, markdown, etc.)
Until then (ETA mid-Dec), the easiest is to download an image or just use the URL (it encodes the full view, so when someone clicks on it they see the exact view you are seeing)
but I still need the load balancer ...
No, you are good to go. As long as something registers the pods' IPs automatically on a DNS service (local/public), you can use the registered address instead of the IP itself (obviously with the port suffix).
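For example, a minimal sketch of a Kubernetes Service that would give the pods a stable DNS name (all names and ports here are hypothetical):
```
# Hypothetical Service: cluster DNS (CoreDNS/kube-dns) will register
# clearml-api.<namespace>.svc.cluster.local pointing at the matching pods
apiVersion: v1
kind: Service
metadata:
  name: clearml-api
spec:
  selector:
    app: clearml-apiserver   # must match the pod labels
  ports:
    - port: 8008             # API server port
      targetPort: 8008
```
Clients would then use clearml-api.<namespace>.svc.cluster.local:8008 instead of a pod IP.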
Thanks for your support
With pleasure!
PleasantGiraffe85
it took the repo from the cache. When I delete the cache, it can't get the repo any longer.
What error are you getting? (Are we talking about the internal repo?)
Hi PleasantGiraffe85
Did you set git_host to point only to your host? Do you expect all the git clones to use SSH? What does the requirements.txt git link look like?
https://github.com/allegroai/clearml-agent/blob/bf07b7f76d3236c1118b81730c6d9718705a795a/docs/clearml.conf#L22
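For reference, a sketch of the relevant clearml.conf entry (the host value is hypothetical):
```
agent {
    # restrict git credentials to this host only (hypothetical host)
    git_host: "git.example.com"
}
```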
Will such a docker image need a trains configuration file?
If you need to configure things other than credentials (see above), then yes, you might need to map trains.conf into the pod.
Specifically, if needed, map your trains.conf to /root/.trains inside the pod/container.
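A minimal sketch in docker terms, assuming the default config file location ~/trains.conf inside the container (the host path is hypothetical; in Kubernetes the equivalent would be a volume/volumeMount pair):
```
docker run -v /host/path/trains.conf:/root/trains.conf <your-image>
```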
We host in a private (self-hosted) GitLab, which can only be cloned through SSH, and we would like to import packages by compiling them from a public GitHub repo (or by installing them using wheels with find-links).
Try removing force_git_ssh_protocol: true
If you do not provide user/pass, and assuming the repo link on the task is internal (hence SSH), it should leave links as they are (SSH for SSH, HTTP for HTTP). This should solve the issue (I hope).
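For reference, this is the clearml.conf entry in question (a sketch; leaving it unset/false keeps links unchanged):
```
agent {
    # when true, rewrites all http(s) git links to ssh
    force_git_ssh_protocol: false
}
```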
We're lucky that they let the developers see their code...
LOL 🙂
and it is also set in the /clearml-agent/.ssh/config, and it still can't clone it. So it must be some security issue internally.
Wait, are you using docker mode or venv mode? In both cases your SSH credentials should be at the default ~/.ssh
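If memory serves, in docker mode the agent maps the host's ~/.ssh into the container for you; a manual sketch of the equivalent mount, just to illustrate (image name is hypothetical):
```
docker run -v $HOME/.ssh:/root/.ssh:ro <your-image>
```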
Sure, try to run the clearml-agent with:
```
clearml-agent daemon -O
```
https://clear.ml/docs/latest/docs/clearml_agent/clearml_agent_daemon
Hi ReassuredTiger98
I think it used to be the default and then it was removed. It has no real effect on performance, but it removes all asserts ... What is your use case? Do you see any performance gains?
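Assuming the flag maps to python's -O optimization mode, a quick way to see its only visible effect:
```
python -c "assert False"     # raises AssertionError
python -O -c "assert False"  # asserts are stripped, exits cleanly
```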
with tensorboard logging, it works fine when running from my machine, but not when running remotely in an agent.
This is odd, could you send the full Task log?
Hi TrickyRaccoon92
Are you sure plotly (the front-end module displaying the plots in the UI) supports it?
LOL, that's the spirit. Making your team happy is key to success in adoption 🙂
It does, I tested it 🙂 but you should test it as well
So we basically have two options. One is when you call Dataset.get_local_copy() we register it on the Task automatically; the other is more explicit, with something like:
```
ds = Dataset.get(...)
folder = ds.get_local_copy()
task.connect(ds, name="train")
...
ds_val = Dataset.get(...)
folder = ds_val.get_local_copy()
task.connect(ds_val, name="validate")
```
wdyt?
We are working hard on release 1.7. Once that is out we will push an RC for review (I hope) 🙂
You can also specify additional packages on the decorator:
```
@PipelineDecorator.component(..., packages=["tqdm>=2.1", "scikit-learn"])
def step_one(...):
    # code here
```
Oh, is your pipeline code part of a git repository?
so I end up having to clone the other ones manually in my code
Hi ConvolutedChicken69
Yes, the problem is that there is no standard for multi-repo environments.
The best solution I can come up with is using git-submodules or packaging the auxiliary repo as wheels. wdyt?
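For example, a sketch of the submodule route (the repo URL and path are hypothetical):
```
git submodule add git@github.com:org/aux-repo.git libs/aux-repo
git submodule update --init --recursive
```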
Hi ZippySheep23
Any ideas what might be happening?
I think you passed the upload limit (2.36 GB) 🙂
If you create an initial code base, maybe we can merge it?
Hi DepressedFox45
Basically move the import into the function; it will automatically detect the package:
```
@PipelineDecorator.component(...)
def step_one(...):
    import sklearn
    import pandas as pd
    # stuff
```
Make sense?
Hi PanickyMoth78
So the current implementation of the pipeline parallelization is exactly like python async function calls:
```
for dataset_conf in dataset_configs:
    dataset = make_dataset_component(dataset_conf)
    for training_conf in training_configs:
        model_path = train_image_classifier_component(training_conf)
        eval_result_path = eval_model_component(model_path)
```
Specifically here, since you are passing the output of one function to another, imagine that what happens is a wait operation, hence it ...
JitteryCoyote63 hacky, but sure 🙂
```
from trains.config import config_obj
print(config_obj)
```
Hmm, this means the step should have included the git repo itself, which means the code should have been able to import the .py
Can you see the link to the git repository on the Pipeline step Task?
Hmm SuccessfulKoala55 what do you think?
So there is a hack for it:
```
CLEARML_OFFLINE_MODE=1 python3 my_main.py
```
Which is the same as calling Task.set_offline
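For reference, the programmatic equivalent is a one-liner:
```
from clearml import Task

Task.set_offline(offline_mode=True)  # same effect as CLEARML_OFFLINE_MODE=1
```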
Then, inside the code, after the Task.init call:
```
task = Task.init(...)
Task.debug_simulate_remote_task(task_id="offline-1")
```
This will make things act as if this is running remotely, i.e. your Task.running_remotely() logic will kick in.
not sure what the if here is?!
Do notice that in remote mode, all the arguments / data are read from the clearml-server into the code...