This means it will always authenticate with SSH (force_git_ssh_protocol)
...
But it seems you need mixed behavior ?
Are you using github as git provider ?
We're lucky that they let the developers see their code...
LOL 😄
and it is also set in the /clearml-agent/.ssh/config and it still can't clone it. So it must be some security issue internally.
Wait, are you using docker mode or venv mode ? In both cases your SSH credentials should be at the default ~/.ssh
PleasantGiraffe85
it took the repo from the cache. When I delete the cache, it can't get the repo any longer.
what error are you getting ? (are we talking about the internal repo)
Hi ClumsyElephant70
Is there a way to run all pipeline steps, not in isolation but consecutively in the same environment?
You mean as part of a real-time inference process ?
Hi VexedCat68
One of my steps just finds the latest model to use. I want the task to output the id, and the next step to use it. How would I go about doing this?
When you say "I want the task to output the id" do you mean to pass it to the next step:
Something like this one:
https://github.com/allegroai/clearml/blob/c226a748066daa3c62eddc6e378fa6f5bae879a1/clearml/automation/controller.py#L224
but I can't seem to figure out a way to do something similar using a task in add_step
VexedCat68 With "add_step" it assumes the Task you are adding is self contained (i.e. there is no "return object" to serialize), this means you can only add arguments, or use the artifacts the Task (i.e. step) will create, assuming you know in advance what the step creates. Make sense ?
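If it helps, here is a rough sketch of the artifact pattern (project/task names and the lookup logic are placeholders):

from clearml import Task

# step 1: find the latest model and store its id as an artifact
task = Task.init(project_name='examples', task_name='find_model')
model_id = 'abc123'  # placeholder for your actual lookup logic
task.upload_artifact('model_id', artifact_object=model_id)

# step 2 (a separate Task): read the artifact back from step 1
prev = Task.get_task(task_id='<step 1 task id>')
model_id = prev.artifacts['model_id'].get()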
Hi MassiveBat21
CLEARML_AGENT_GIT_USER is actually the git personal token
The easiest is to have a read only user/token for all the projects.
Another option is to use the ClearML vault (unfortunately not part of the open source) to automatically take these configuration on a per user basis.
wdyt?
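For example, something along these lines (a sketch, values are placeholders):

import os, subprocess

# read-only credentials the agent will use when cloning repositories
os.environ['CLEARML_AGENT_GIT_USER'] = 'readonly-user'
os.environ['CLEARML_AGENT_GIT_PASS'] = 'personal-access-token'
# start the agent in this environment
subprocess.run(['clearml-agent', 'daemon', '--queue', 'default'])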
Hi UnsightlySeagull42
How can I reproduce this behavior ?
Are you getting all the console logs ?
Is it only the Tensorboard that is missing ?
Hi SubstantialElk6
ClearML-Data doesn't actually "load" the data, it brings it locally and returns a folder with all your data files; from that point onward, it's up to your code to load it into the framework. Make sense ?
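i.e. something like (dataset project/name are placeholders):

from clearml import Dataset

# fetch the dataset (cached locally) and get the folder with all the files
folder = Dataset.get(dataset_project='examples', dataset_name='my_dataset').get_local_copy()
# from here your code loads the files into the framework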
tf datasets is able to handle batch downloading quite well.
SubstantialElk6 I was not aware of that, I was under the impression tf dataset is accessed on a file level, no?
Hi ReassuredTiger98
So let's assume we call:
logger.report_image(title='training', series='sample_1', iteration=1, ...)
And we report every iteration (keeping the same title.series names). Then in the UI we could iterate back on the last 100 images (back in time) for this title / series.
We could also report a second image with:
logger.report_image(title='training', series='sample_2', iteration=1, ...)
which means that for each one we will have 100 past images to review ( i.e. same ti...
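For reference, a minimal sketch of that pattern (random images as stand-ins):

import numpy as np
from clearml import Task

task = Task.init(project_name='examples', task_name='image history')
logger = task.get_logger()
for i in range(200):
    # same title/series on every iteration; the UI keeps the past iterations to scroll back through
    img = np.random.randint(0, 255, (64, 64, 3), dtype=np.uint8)
    logger.report_image(title='training', series='sample_1', iteration=i, image=img)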
Hi MortifiedDove27
Looks like there is a limit of 100 images per experiment,
The limit is 100 images per unique combination of title/series.
This means that changing the title or the series name will add 100 more images (notice the 100 limit is for previous iterations)
you can also increase the limit here:
https://github.com/allegroai/clearml/blob/2e95881c76119964944eaa0289549617e8afeee9/docs/clearml.conf#L32
Are you aware of any other way then (other than the secure: false flag)?
Actually self-signing and providing a certificate file is already supported with boto (and thus clearml)
AWS_CA_BUNDLE
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html
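i.e. something like this before the process starts (path is a placeholder):

import os

# point boto3 (and everything built on top of it, clearml included) at the self-signed CA bundle
os.environ['AWS_CA_BUNDLE'] = '/path/to/ca-bundle.pem'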
Hi VexedCat68
Yes the serving is a bit complicated. Let me try to explain the underlying setup, before going into more details.
clearml-serving CLI is a tool to launch / setup (it does the configuration and enqueuing, not the actual serving).
Control plane Task -> stores the state of the serving (i.e. which endpoints need to be served, what models are used, collects stats). This Task has no actual communication with the serving requests/replies (running on the services queue).
Serving Task...
SmarmyDolphin68 sadly if this was not executed with trains (i.e. the offline option of trains), this is not really doable (I mean it is, if you write some code and parse the TB 😉 but let's assume this is way too much work)
A few options:
1. On the next run, use the clearml OFFLINE option (i.e. in your code call Task.set_offline() , or set env variable CLEARML_OFFLINE_MODE=1)
2. You can compress and upload the checkpoint folder manually, by passing the checkpoint folder, see https://github.com...
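For the first option, roughly (a sketch; the session zip path is whatever the offline run prints when it ends):

from clearml import Task

# record everything locally instead of sending it to the server
Task.set_offline(True)
task = Task.init(project_name='examples', task_name='offline run')
# ... training code ...
task.close()

# later, from a machine with connectivity:
Task.import_offline_session('/path/to/offline_session.zip')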
Hi ScantChimpanzee51
Is it possible to run multiple agent on EC2 machines started by the Autoscaler?
I think that by default you cannot,
having the Autoscaler start 1x p3.8xlarge (4 GPU) on AWS might be better than 4x p3.2xlarge (1 GPU) in terms of availability, but then we’d need one Agent per GPU.
I think that this multi-GPU setup is only available in the enterprise tier.
That said, the AWS pricing is linear, it costs the same having 2 instances with 1 GPU as 1 instanc...
Hi ReassuredOwl55
The easiest is to configure it as the default output_uri in the clearml.conf file of the agent, wdyt?
https://github.com/allegroai/clearml-agent/blob/ebb955187dea384f574a52d059c02e16a49aeead/docs/clearml.conf#L430
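The per-task equivalent in code (bucket path is a placeholder) would be:

from clearml import Task

# upload models/artifacts to this storage instead of keeping them on the local machine
task = Task.init(project_name='examples', task_name='train', output_uri='s3://my-bucket/models')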
Did you set an agent on a machine? (See clearml agent in docs for details)
I think what you are looking for is clearml-agent daemon
https://clear.ml/docs/latest/docs/clearml_agent
https://clear.ml/docs/latest/docs/getting_started/video_tutorials/agent_remote_execution_and_automation
You need to adjust it to your setup, specifically change the queue name to one you have. Does that make sense ?
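For example, on the worker machine:
clearml-agent daemon --queue my_queue --docker
(replace my_queue with a queue that exists on your server)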
ElegantKangaroo44 it seems to work here?!
https://demoapp.trains.allegro.ai/projects/0e152d03acf94ae4bb1f3787e293a9f5/experiments/48907bb6e870479f8b230e6b564cd52e/output/metrics/plots
Hmm @<1523701083040387072:profile|UnevenDolphin73> I think this is the reason, None
and this means that even without a full lock file poetry can still build an environment
How about this one:
None
Hi GiganticTurtle0
I have found that clearml does not automatically detect the imports specified within the function decorated
The pipeline decorator will automatically detect the imports inside the function, but not outside (i.e. global), to allow better control of packages (think for example one step needs the huge torch package, and the other does not).
Make sense ?
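i.e. the pattern is something like (a sketch, names are placeholders):

from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=['version'])
def step_one():
    # imports inside the function are detected for this step only
    import torch
    return torch.__version__

@PipelineDecorator.pipeline(name='example pipeline', project='examples', version='1.0')
def pipeline_logic():
    print(step_one())

if __name__ == '__main__':
    PipelineDecorator.run_locally()
    pipeline_logic()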
How can I tell clearml I will use the same virtual environment in all steps...
Metadata might be expensive, it's a REST API call, and we have found users putting hundreds of artifacts, with preview entries ...
Okay, so the idea behind the new decorator is not to group all the defined steps under the same script so that they share the same environment, but rather to simplify the process of creating scripts for each step and avoid manually calling Task.init on those scripts.
Correct, and allow users to more easily create Tasks from code.
Regarding virtual environment creation from caching, I will keep running benchmarks (from what you say it might be due to high workload ...
Hi @<1523701868901961728:profile|ReassuredTiger98>
is there something like a clearml context manager to disable automatic logging?
Sure, just do a wildcard with the files you actually want to autolog; the rest will be ignored:
None
task = Task.init(..., auto_connect_frameworks={'pytorch': '*.pt'})
PompousParrot44
It should still create a new venv, but inherit the packages from the system-wide (or specific venv) installed packages. Meaning it will not reinstall packages you already installed, but it will give you the option of just replacing a specific package (or installing a new one) without reinstalling the entire venv
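If I remember correctly, the relevant setting in the agent's clearml.conf is:
agent.package_manager.system_site_packages: true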