Hi VivaciousWalrus99
Could you attach the log of the run?
By default it will use the python it is running with.
Any chance the original experiment was executed with Python 2?
Guys, any chance you can verify the RC solves the issue?
pip install clearml==1.0.2rc0
CourageousKoala93 when you call Task.close() it will mark the task as completed, there is no need to do that manually. The idea with mark_completed is that you can forcefully change the state if needed, or externally stop the task and mark it completed. Make sense?
Okay, this seems like a broken pip installation for Python 3.6.
Can you verify it fails in another folder? (Maybe it's a permissions thing; for example, if you run in docker mode the permissions will be root, as the docker is creating those folders.)
I also noticed that uploads are sometimes slow, and I see here max_connections=2
Makes sense to me, please go ahead and add that as well (basically the same thing on _AzureBlobServiceStorageDriver.upload_object, plus an additional variable on the AzureContainerConfigurations class).
Could you PR a tested draft? We will be able to take it from there.
Hi LazyTurkey38
What do you mean the git repo is not recognized? When execute_remotely leaves, you should see on the task a reference to the git repo with the exact commit ID you pulled locally. Do you see it under the Execution tab?
You might only see it when the upload is done
I would guess that for some reason loglevel is DEBUG, could that be the case?
... indicate the job needs to be run remotely? I'm imagining something like
clearml-task
and you need to specify the queue to push your Task into.
See here: https://clear.ml/docs/latest/docs/apps/clearml_task
This will mount the trains-agent machine's hosts file into the docker
I presume it is via the project_name and task_name parameters.
You are correct in your assumption, it only happens when you call Task.init, but two distinctions:
- ArgParser arguments are overridden (with trains-agent) even before Task.init is called.
- Task.init, when running under trains-agent, will totally ignore the project/task name; it receives a pre-made task id and uses it.
So the project name and experiment are meaningless if you are running the tas...
Owning the agent helps, but still it's much better if the credentials don't show up in logs.
They are not; they are always filtered out.
- How does force_git_ssh_protocol help, please? It doesn't solve the issue of the agent simply not having access.
It automatically maps the host's .ssh into the container, so that git can use SSH to clone.
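For reference, a minimal sketch of the relevant clearml.conf agent section (force_git_ssh_protocol is the real setting name; the surrounding file is abbreviated):

```
agent {
    # When true, the agent rewrites git URLs to SSH and maps the host's ~/.ssh
    # into the container, so cloning can authenticate with the host's keys
    force_git_ssh_protocol: true
}
```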
What exactly is not working?
and how are you configuring it?
RobustSnake79 let's assume that the trace figure above is probably too much to get into the WebUI; which simpler figures might still have value in your scenario?
Hi @<1610808279263350784:profile|FriendlyShrimp96>
Is there a way to get a list of variants given a metric, or even just a full list of metrics and variants for a given task id?
Try this:
from clearml.backend_api.session.client import APIClient
c = APIClient()
metrics = c.events.get_task_metrics(tasks=["TASK_ID_HERE"], event_type="training_debug_image")
print(metrics)
I think API ...
Welp, it's been a day with the new settings, and stats went up 140K for API calls
... going to check again tomorrow to see if any of that was spill over from yesterday
140K calls a day: how often are you sending scalars? How long is it running? How many experiments are running?
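As a back-of-the-envelope check, the kind of arithmetic behind that question can be sketched like this (all numbers below are illustrative assumptions, not measurements):

```python
# Rough sketch: how scalar reporting frequency adds up to daily API calls.
experiments = 10          # assumption: concurrently running experiments
reports_per_minute = 1    # assumption: scalar batches sent per experiment per minute

calls_per_day = experiments * reports_per_minute * 60 * 24
print(calls_per_day)  # 14400 calls/day at these illustrative rates
```

Even modest per-experiment reporting rates multiply quickly across many experiments, which is why reporting frequency and experiment count are the first things to check.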
BroadSeaturtle49 agent RC is out with a fix:
pip3 install clearml-agent==1.5.0rc0
Let me know if it solved the issue
@<1532532498972545024:profile|LittleReindeer37> nice!!!
Do you want to PR? It will be relatively easy to merge and test, and I think they might even push it to the next version (or worst case a quick RC).
HandsomeCrow5
BTW: out of curiosity, how do you generate the HTML reports? I remember a few users suggesting trains should have a report-generating functionality.
Hi @<1697056701116583936:profile|JealousArcticwolf24>
You have clearml Datasets.
It will version, catalog, and store the meta-data of your datasets.
Each version only stores the delta from the parent version, but delta is on a file granularity not a "block" granularity
Notice that under the hood of course it uses storage solutions to store and cache the underlying immutable copy of the data. What's your use case?
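The file-granularity delta idea above can be sketched like this (an illustrative toy model, not the clearml implementation; file names and hashes are made up):

```python
# Each dataset version maps file name -> content hash.
parent = {"a.csv": "h1", "b.csv": "h2"}
child = {"a.csv": "h1", "b.csv": "h2-new", "c.csv": "h3"}

# Only files that changed or are new get stored in the child version;
# unchanged files ("a.csv") are referenced from the parent, not re-stored.
delta = {name: h for name, h in child.items() if parent.get(name) != h}
print(sorted(delta))  # ['b.csv', 'c.csv']
```

Because the unit is a whole file, editing one byte of a large file means the new version stores that entire file again, not just the changed block.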
Hi PricklyJellyfish35
My apologies, this thread was forgotten.
What's the current status with the OmegaConf? (I'm not sure I understand what you mean by resolve=False.)
Are you hosting your own server? Is it on http://app.clear.ml?
SteadyFox10 With pleasure!
BTW: you can retrieve the Task id from its name with
Task.get_tasks(project_name='my project', task_name='my task name')
See https://allegro.ai/docs/task.html?highlight=get_tasks#trains.task.Task.get_tasks
E.g. I might need to have different N-numbers for the local and remote (ClearML) storage.
Hmm yes, that makes sense
That'd be a great solution, thanks! I'll create a PR shortly
Thank you!
Hi CheekyAnt38
However, now I would like to evaluate my machine learning model directly via API requests, directly over clearml. Is it possible?
This basically means serving the model, is this what you mean?
Let's start small. Do you have grafana enabled in your docker compose, and can you log in to your grafana web UI?
Notice grafana needs to access the prometheus container directly so easiest way is to have everything in the same docker compose
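A minimal sketch of keeping both in one compose file (service names, images, and ports here are illustrative defaults) so grafana can reach prometheus over the shared compose network:

```yaml
services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    depends_on:
      - prometheus
    # In grafana, add a data source pointing at http://prometheus:9090 --
    # the service name resolves directly on the compose network.
```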
ElegantKangaroo44 good question, that depends on where we store the score of the model itself. You can obviously parse the file name:
task.models['output'][-1].url
and retrieve the score from it. You can also store it on the model name:
task.models['output'][-1].name
And you can put it as a general-purpose blob of text in what is currently model.config_text
(for convenience you can have the model parse JSON-like text and use model.config_dict).
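The JSON-in-config idea above can be sketched as follows (the config contents and key names are made-up examples; only the "store a small JSON blob as text, read it back as a dict" pattern is the point):

```python
import json

# Assumption for illustration: the model's config text holds a small JSON blob
# with the score, as suggested above (what you would put in model.config_text).
config_text = '{"score": 0.87, "metric": "val_accuracy"}'

# Reading it back as a dict is what model.config_dict gives you for free.
config = json.loads(config_text)
print(config["score"])  # 0.87
```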