Reputation
Badges 1
75 × Eureka!I created my own docker image with a newer python and the error disappeared
I circumvented the problem by putting timestamp in task name, but I don't think this is necessary.
@<1523701087100473344:profile|SuccessfulKoala55> I am using it as follows:
after calling clearml.Task.init() I create an object:
cache = Cache('/scidata/marek/diskcache')
and then in the loading function I do:
if cache_arg in load_and_crop.cache:
return load_and_crop.cache[cache_arg] ...
@<1523701435869433856:profile|SmugDolphin23> it took some time, but I was able to cut 90% of the code, just dataloading remains and the problem persists (which is fortunate, as it makes it easy to replicate). Please have a look.
but perhaps it is worth adding to the docs page a hint to avoid using the CLEARML_TASK_ID env variable, perhaps I am not the only one to ever try it
yes, I don't know whether you have access to the backend, but just in case my experiment is this one:
ClearML results page: https://app.clear.ml/projects/45557a1ee1464631a9a18b0dcda4f682/experiments/01b77a220869442d80af42efce82c617/output/log
I can hardcode it into program if you want
Just to let you know, it now works (obviously) in the k8s setting as well.
maybe not at the top but in the Task.init description
I am not an expert on this, just started using torchmetrics.
now it stopped working locally as well
this is part of repository
I am only getting one user for some reason, even though 4 are in the system
@<1523701435869433856:profile|SmugDolphin23> let me know if you need any help in reproducing
The problem started appearing when I started to use joblib with a simple memory caching mechanism.
I subscribe to the problem of having large metrics without a tool for proper inspection what is it coming from.
ok, understood, it was probably my fault, I was messing up with the services container and probably made the pipeline task interrupted, so the subtasks themselves have finished, but the pipeline task was not alive when it happened
I don't see such a method in the docs, but it seems so natural that decided to ask.
where is the endpoint located? I can't find it, were only able to find this:
https://github.com/allegroai/clearml/blob/ccc8e83c58336928424ed14b176306b149258512/examples/services/monitoring/slack_alerts.py#L55
@<1523701435869433856:profile|SmugDolphin23> will send later today
to avoid loading and cropping a big image
task.data.user is the user id, can I get it in the text form?
I did not know about it, thanks!
my code snippet
` from clearml import Task
import os
clearml_task_id = os.environ['CLEARML_TASK_ID']
Task.debug_simulate_remote_task(clearml_task_id)
clearml_task = Task.init(auto_connect_arg_parser=False, auto_resource_monitoring=False)
print(clearml_task.id)
clearml_task.logger.report_scalar(series='s', value='123', iteration=2, title='title')
clearml_task.logger.report_text("some text") `
I could have been more inventive as well 😄
I will try with sys.path.append('../../../../') ` later today and see what happens
traceback:
` Traceback (most recent call last):
File "/home/marek/nomagic/monomagic/ml/tiresias/calibrate_and_test.py", line 57, in <module>
Task.add_requirements('requirements.txt')
File "/home/marek/.virtualenvs/tiresias-3.9/lib/python3.9/site-packages/clearml/backend_interface/task/task.py", line 1976, in add_requirements
for req in pkg_resources.parse_requirements(requirements_txt):
File "/home/marek/.virtualenvs/tiresias-3.9/lib/python3.9/site-packages/pkg_resources/_init...