Hi WittyOwl57
I'm guessing ClearML is trying to unify the histograms for each iteration, but in this case the result is not useful.
I think you are correct, the TB histograms are actually 3D histograms (i.e. 2D histograms over time, which is the default for kernel/bias etc.)
Is there a way to ungroup the result by iteration, and is it possible to group it by something else (e.g. the tags of the two plots displayed below side by side)?
Can you provide a toy example...
Hi WittyOwl57
That's actually how it works (the original idea/design was borrowed from libcloud): basically you need to create a Driver, then the storage manager will use it.
Abstract class here:
https://github.com/allegroai/clearml/blob/6c96e6017403d4b3f991f7401e68c9aa71d55aa5/clearml/storage/helper.py#L51
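For illustration, a minimal skeleton of such a driver might look like this; the import path and method names here are assumptions based on the linked abstract class, so verify them against the actual interface before implementing:
```
from clearml.storage.helper import _Driver  # assumption: the abstract class linked above


class MyStorageDriver(_Driver):
    # hypothetical driver for a custom backend; implement the abstract
    # methods so the storage manager can dispatch to it
    scheme = "mystore"  # assumption: the URI scheme this driver handles

    def get_container(self, container_name, config=None, **kwargs):
        raise NotImplementedError

    def upload_object(self, file_path, container, object_name, extra=None, **kwargs):
        raise NotImplementedError

    def download_object(self, obj, local_path, overwrite_existing=True, **kwargs):
        raise NotImplementedError
```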
Is this what you had in mind ?
WittyOwl57 I think this is a great idea, can you open a feature issue on GitHub so this is not forgotten ?
BTW: regardless, if you have time to upgrade to the new azure package, that would be great 🙂 This has been on our to-do list for a while, but since not a lot of users complained, it got pushed ...
DeliciousBluewhale87 basically any solution that is compliant with the S3 protocol will work. An example:
`output_uri="s3://<host>:<port>/bucket/folder"`
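For instance, a minimal sketch of pointing a task's output at such storage (the host, port, and bucket below are placeholders):
```
from clearml import Task

# any S3-protocol-compatible endpoint works here, e.g. a local MinIO
task = Task.init(
    project_name="examples",
    task_name="s3-compatible-output",
    output_uri="s3://my-storage-host:9000/bucket/folder",
)
```
The matching credentials go under the sdk.aws.s3 section of clearml.conf.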
Are you sure Nexus supports this protocol ?
I "think" nexus sits on top of a storage solution (like am object storage), meaning we can use the same storage solution Nexus is using.
Just to clarify we do not support the artifactory protocol Nexus provides for storing models/artifacts. But we do support it as a source for python packages used by the a...
Hi ConvolutedSealion94
Just making sure, did you spin up the docker-compose of clearml-serving as well ?
I guess it won't due to the nature of services?
Correct, the k8s glue works differently; that said, I would actually use the helm chart to spin up a pod with the agent in services mode and venv mode.
I want to use services queue for running services, and I want to do it on k8s
So yes, as a standalone pod with the agent in venv mode (as opposed to docker mode)
Does that make sense to you?
GiddyTurkey39
I would guess your VM cannot access the trains-server, meaning an actual network configuration issue.
What are the VM IP and the trains-server IP? (the first two numbers are enough, e.g. 10.1.X.Y, 174.4.X.Y)
Hi UpsetCrocodile10
First, I perform many experiments in one process, ...
How about this one:
https://github.com/allegroai/trains/issues/230#issuecomment-723503146
Basically you could utilize create_function_task
This means you have Task.init() on the main "controller" and each "train_in_subset" as a "function task". Then the controller can wait on them and collect the data (like the HPO does).
Basically:
```
controller_task = Task.init(...)
children = []
for i, s in enumer...
```
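A slightly fuller sketch of the same idea (the queue name and the result collection are assumptions):
```
from clearml import Task


def train_in_subset(subset):
    # self-contained training logic for one data subset;
    # any imports it needs should live inside the function
    return {"subset": subset}


controller_task = Task.init(project_name="examples", task_name="controller")

children = []
for i, subset in enumerate(["a", "b", "c"]):
    # each call creates a new draft Task that will execute the function
    child = controller_task.create_function_task(
        func=train_in_subset,
        func_name="train_in_subset_{}".format(i),
        task_name="train subset {}".format(subset),
        subset=subset,
    )
    Task.enqueue(child, queue_name="default")  # assumption: an agent listens on "default"
    children.append(child)

for child in children:
    child.wait_for_status()  # block until the child Task finishes
    # then collect the reported scalars/artifacts, like the HPO does
```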
🙂 Let me know if it solved the issue 🙂
Hi CrookedWalrus33
the python version is auto-detected and registered at "manual execution" time (i.e. when you run your code on your machine).
That said, this is only a suggestion for the agent: it will use the matching Python version only if it can actually find it, otherwise it will use whatever is available (i.e. it looks through the PATH environment variable for a matching pythonX.Y executable).
The easiest way to support it is to make sure the python binary's path is added to the PATH environment variable.
Does...
@<1546303293918023680:profile|MiniatureRobin9>
, not the pipeline itself. And that's the last part I'm looking for.
Good point, any chance you want to PR this code snippet ?
```
def add_tags(self, tags):
    # type: (Union[Sequence[str], str]) -> None
    """
    Add Tags to this pipeline. Old tags are not deleted.
    When executing a Pipeline remotely (i.e. launching the pipeline from the UI/enqueuing it), this method has no effect.

    :param tags: A li...
```
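If this lands, usage would presumably look like the following sketch (assuming the method above is added to PipelineController):
```
from clearml import PipelineController

pipe = PipelineController(name="my pipeline", project="examples", version="1.0.0")
# hypothetical call, per the snippet above: appends tags, old tags are kept
pipe.add_tags(["nightly", "retrain"])
```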
Hi PerplexedCow66
I'm assuming an extension for this:
https://github.com/allegroai/clearml-serving/issues/32
Basically JWT can be used as a general allow/block for all endpoints, which is most efficiently handled by the k8s load balancer (nginx/envoy).
But if you want a per-endpoint check (or maybe to do something based on the JWT values):
See adding JWT to FastAPI here:
https://fastapi.tiangolo.com/tutorial/security/oauth2-jwt/?h=jwt#oauth2-with-password-and-hashing-bearer-with-jwt-tokens
T...
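Along the lines of that tutorial, a minimal per-endpoint check might look like this (the secret, algorithm, and route are placeholders):
```
from fastapi import Depends, FastAPI, HTTPException, status
from fastapi.security import OAuth2PasswordBearer
from jose import JWTError, jwt

SECRET_KEY = "change-me"  # placeholder: the shared secret used to sign tokens
ALGORITHM = "HS256"

app = FastAPI()
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")


def verify_token(token: str = Depends(oauth2_scheme)) -> dict:
    # reject the request unless the bearer token is a valid JWT
    try:
        return jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
    except JWTError:
        raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED)


@app.get("/serve/my-endpoint")  # placeholder route
def serve(claims: dict = Depends(verify_token)):
    # per-endpoint logic can branch on the decoded JWT claims
    return {"ok": True, "sub": claims.get("sub")}
```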
HappyDove3
see here https://github.com/allegroai/clearml-pycharm-plugin 🙂
Is this like a local minio?
What do you have under the sdk/aws/s3 section ?
So how do I solve the problem? Should I just relaunch the agents? Because they can't execute jobs now
Are you running in docker mode ?
If so you can actually delete mapped files (they will still be available inside the docker), just make sure you delete them X hours after they were created, and you should be fine.
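For example, a hedged cleanup sketch (the cache path and age threshold are placeholders):
```
import os
import time

CACHE_DIR = os.path.expanduser("~/.clearml/cache")  # placeholder: the mapped folder
MAX_AGE_HOURS = 24  # placeholder: the "X hours" grace period

cutoff = time.time() - MAX_AGE_HOURS * 3600
for root, _dirs, files in os.walk(CACHE_DIR):
    for name in files:
        path = os.path.join(root, name)
        # delete only files older than the threshold; running containers
        # still see their already-mapped copies
        if os.path.getmtime(path) < cutoff:
            os.remove(path)
```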
wdyt?
Hi RobustRat47
the easiest way to reproduce the entire environment on your local machine:
`clearml-agent build --id <task_id> --target ~/debug-full-env/`
This will install an entire venv, including the code, and apply the git changes.
You can also create a container with everything:
https://clear.ml/docs/latest/docs/clearml_agent#task-container
StaleButterfly40 just making sure I understand, are we trying to solve the "import offline zip file/folder" issue, where we create multiple Tasks (i.e. a Task per import)? Or are you suggesting the actual Task (the one running in offline mode) needs support for continuing a previous execution ?
Is there a quicker way to abort all running experiments in a project? I have over a thousand running anonymous data tasks in a specific project and I want to abort them before debugging them.
We are adding "select all" in the next UI version so you can do that as quickly as possible 🙂
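In the meantime, a hedged workaround via the SDK (the project name is a placeholder):
```
from clearml import Task

# fetch all currently running tasks in the given project
running = Task.get_tasks(
    project_name="my-project",
    task_filter={"status": ["in_progress"]},
)
for t in running:
    t.mark_stopped()  # abort the running task
```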
Hi LooseClams37
From the docker compose, I see the agent is running in venv mode, is that correct?
Also notice that when configuring the minio credentials you can specify whether this is an https connection (secure: true), which by default it is not.
See here: https://github.com/allegroai/clearml-agent/blob/5a6caf6399a0128ad81e8723d0a847e2ded5b75e/docs/clearml.conf#L287
When you set up the pod, make sure you mount the clearml local cache folder (basically /root/.clearml/cache/ ) to the PV.
I solved the issue by implementing my own ClearML logger
This is awesome! any chance you want to PR it to transformers ?
ElegantCoyote26 what is the model input layer definition? This determines the data format to pass to the serving endpoint
Hi @<1542316991337992192:profile|AverageMoth57>
Not sure I follow what you have in mind regarding the Gerrit integration
Sounds interesting ...
wdyt?
SmallDeer34 the function Task.get_models() incorrectly returned the input model "name" instead of the object itself. I'll make sure we push a fix.
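For reference, the expected behavior once the fix lands (the task id is a placeholder):
```
from clearml import Task

task = Task.get_task(task_id="abc123")
models = task.get_models()
# both lists should hold Model objects, not just names
print([m.name for m in models["input"]])
print([m.name for m in models["output"]])
```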
I found a different solution (hardcoding the parent tasks by hand),
I have to wonder, how does that solve the issue ?
where people can do @'s for experiments/projects/tasks and even comparisons ...
Ohhh I like that! For me this points directly at a Slack integration.
I think my main question is, "is the discussion ephemeral?" In other words, is this an ongoing discussion that later no one will care about, or are we creating some "knowledge base" that we want to share later?
Also, by "address bar at the top", I assume you mean the address URL, right?
yes... apologies for the phrasing, it was w...
This, however, requires that I slightly modify the clearml helm chart with the aws-autoscaler deployment, right?
Correct 🙂
Can you let me know if I can override the docker image using template.yaml?
No, you cannot.
But you can pass the OS environment variable "CLEARML_DOCKER_IMAGE" to set a different default one
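For example (the image name is a placeholder), set it in the environment of the process that spins up the tasks:
```
import os

# must be set before the relevant process starts, e.g. before Task.init()
os.environ["CLEARML_DOCKER_IMAGE"] = "nvidia/cuda:11.8.0-runtime-ubuntu22.04"
```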