
I am not an expert on this, just started using torchmetrics.
I did not know about it, thanks!
it is typically used with PyTorch
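for what it's worth, the typical pattern looks roughly like this (a sketch from memory, assuming a recent torchmetrics version; the numbers are made up):
```python
import torch
import torchmetrics

# accumulate a metric over batches, then compute the aggregate
metric = torchmetrics.Accuracy(task="multiclass", num_classes=3)

preds = torch.tensor([0, 2, 1, 2])   # predicted class ids
target = torch.tensor([0, 1, 1, 2])  # ground-truth class ids
metric.update(preds, target)

print(metric.compute())  # tensor(0.7500)
```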
yes, I don't know whether you have access to the backend, but just in case, my experiment is this one:
ClearML results page: https://app.clear.ml/projects/45557a1ee1464631a9a18b0dcda4f682/experiments/01b77a220869442d80af42efce82c617/output/log
where is the endpoint located? I can't find it, I was only able to find this:
https://github.com/allegroai/clearml/blob/ccc8e83c58336928424ed14b176306b149258512/examples/services/monitoring/slack_alerts.py#L55
task.data.user is the user id; can I get it in text form?
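in case it helps someone else: I would guess something like this works via the backend API (that the backend exposes users.get_by_id, and the shape of the response, are my assumptions; I have not verified either):
```python
from clearml import Task
from clearml.backend_api.session.client import APIClient

task = Task.get_task(task_id="01b77a220869442d80af42efce82c617")

client = APIClient()
# assumption: the users service exposes get_by_id and the
# returned object carries a human-readable name field
user = client.users.get_by_id(user=task.data.user)
print(user.name)
```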
I share the problem of having large metrics without a tool to properly inspect where they are coming from.
thanks! is this documented? (I am wondering whether I could have avoided bothering you with my question in the first place)
SuccessfulKoala55 that worked, thanks a lot!
I created my own docker image with a newer python and the error disappeared
yes, I am calling Task.init
task status is running in the webui
traceback:
```
Traceback (most recent call last):
  File "/home/marek/nomagic/monomagic/ml/tiresias/calibrate_and_test.py", line 57, in <module>
    Task.add_requirements('requirements.txt')
  File "/home/marek/.virtualenvs/tiresias-3.9/lib/python3.9/site-packages/clearml/backend_interface/task/task.py", line 1976, in add_requirements
    for req in pkg_resources.parse_requirements(requirements_txt):
  File "/home/marek/.virtualenvs/tiresias-3.9/lib/python3.9/site-packages/pkg_resources/_init...
```
but it is a guess
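for completeness, this is how I call it (the ordering, add_requirements before Task.init, is what the docs ask for; the project/task names below are placeholders):
```python
from clearml import Task

# Task.add_requirements must be called before Task.init;
# the argument can be a package name or, as here, a path
# to a local requirements file.
Task.add_requirements("requirements.txt")
task = Task.init(project_name="tiresias", task_name="calibrate_and_test")
```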
ok, but do you know why it tried to reuse in the first place?
I circumvented the problem by putting a timestamp in the task name, but I don't think this should be necessary.
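I could probably also disable the reuse explicitly instead of the timestamp trick; a minimal sketch (reuse_last_task_id is a Task.init argument, the names are placeholders):
```python
from clearml import Task

# force a fresh task instead of reusing the previous draft
task = Task.init(
    project_name="tiresias",
    task_name="calibrate_and_test",
    reuse_last_task_id=False,
)
```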
I don't see such a method in the docs, but it seems so natural that I decided to ask.
ok, I will do a simple workaround for this (use an additional parameter that I can update using parameter_override and then, if it is set, update the configuration in Python myself)
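roughly what I have in mind, as a sketch (the parameter name and the configuration defaults are placeholders; task.connect is what applies remote overrides to the dict):
```python
import json
from clearml import Task

task = Task.init(project_name="tiresias", task_name="calibrate_and_test")

# extra parameter that remote runs can set via parameter_override;
# an empty string means "no override"
args = {"config_override": ""}
task.connect(args)

configuration = {"lr": 1e-3, "batch_size": 32}  # hypothetical defaults
if args["config_override"]:
    # apply the override to the configuration myself
    configuration.update(json.loads(args["config_override"]))
```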
which is probably why it does not work for me, right?
they are universal, I thought there was some interface to them in clearml, but probably not
now it stopped working locally as well
We have a training template that is a k8s job definition (YAML) that sets env variables inside the docker image used for training, and those env variables are credentials for ClearML. Since they are taken from k8s secrets, they are the same for every user.
I can create secrets for every new user and set env variables accordingly, but perhaps you see a better way out?
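for reference, this is roughly how the credentials end up being consumed on our side; a sketch assuming the standard CLEARML_API_* variable names (the SDK also picks them up automatically, so the explicit call is just to make the wiring visible; project/task names are placeholders):
```python
import os
from clearml import Task

# env variables injected into the container from a k8s secret;
# these are the standard ClearML variable names
Task.set_credentials(
    api_host=os.environ["CLEARML_API_HOST"],
    key=os.environ["CLEARML_API_ACCESS_KEY"],
    secret=os.environ["CLEARML_API_SECRET_KEY"],
)
task = Task.init(project_name="tiresias", task_name="training")
```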
report_scalar works, report_text does not; this is very weird
console output:
```
ClearML results page: https://app.clear.ml/projects/45557a1ee1464631a9a18b0dcda4f682/experiments/01b77a220869442d80af42efce82c617/output/log
some text
2022-03-21 22:47:16,660 - clearml.Task - INFO - Waiting to finish uploads
2022-03-21 22:47:28,217 - clearml.Task - INFO - Finished uploading
```
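a minimal repro of what I am doing, in case it helps (project/task names are placeholders):
```python
from clearml import Task

task = Task.init(project_name="tiresias", task_name="report_text repro")
logger = task.get_logger()

# this one shows up in the webui
logger.report_scalar(title="metric", series="val", value=0.5, iteration=0)
# this one prints to the console, but I don't see it in the webui
logger.report_text("some text")

task.close()
```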