Thanks, yes, you are correct: the color is derived from the series name, so I guess the issue is that the name+ID is not kept in full screen
maybe I should use explicit reporting instead of Tensorboard
It will do just the same 😞
there is no method for setting `last iteration`, which is used for reporting when continuing the same task. Maybe I could somehow change this value for the task?
Let me double check that...
overwriting this value is not ideal though, because for :monitor:gpu and :monitor:machine ...
That is a very good point
but for the metrics, I explicitly pass th...
If this is the case I would do:
` # Add the collector steps (i.e. the 10 Tasks)
pipe.add_task(...
    post_execute_callback=Collector.collect_me
)
pipe.start()
pipe.wait()
Collector.process_results(pipe) `
wdyt?
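A minimal sketch of what such a `Collector` could look like. The class itself is hypothetical (it is only named in the snippet above), and the sketch assumes the callback is invoked as `(pipeline, node)` and that the node exposes `name` and `executed` (the executed Task ID) attributes:

```python
class Collector:
    """Hypothetical helper that gathers one record per finished pipeline step."""

    results = []  # shared across all steps of one pipeline run

    @classmethod
    def collect_me(cls, pipeline, node):
        # Assumed callback signature: (pipeline, node).
        # `name` and `executed` are assumed node attributes; missing ones become None.
        cls.results.append(
            {
                "step": getattr(node, "name", None),
                "task_id": getattr(node, "executed", None),
            }
        )

    @classmethod
    def process_results(cls, pipeline):
        # Meant to run once, after pipe.wait() returns.
        # Here it only maps step name -> Task ID; real post-processing would go here.
        return {record["step"]: record["task_id"] for record in cls.results}
```

The records live on the class, so every step's callback appends to the same list and `process_results` sees all of them at the end.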
Thanks JitteryCoyote63 let me double check if there is a reason for that (there might be one, not sure)
JitteryCoyote63, just making sure, does a refresh fix the issue?
But what I get with `get_local_copy()` is the following path: ...
`get_local_copy()` will return an immutable copy of the dataset; by definition this will not be the "source" storing the data.
(Also notice that the dataset itself is stored in zip files, and when you get the "local-copy" you get the extracted files)
Does that make sense?
Hi GiganticTurtle0
Let me check
not sure if this is considered a bug or not! but I’d happily make an issue on github if needed.
I think we should, at least for the sake of transparency and visibility 🙂
thanks again for all your help.
My pleasure 🙂
That must have been it. Here are the installed packages when not using `-m`:
Hmm yes, can you open a GitHub issue on that? (this seems like a bug)
BTW: could it be that Task.init is not called in the "module.name" entry point, but somewhere internally?
Sounds great! let me know what you find out 🙂
MuddySquid7 you mean you are creating them with TB ? or are you uploading them as debug images ?
Specifically in the ClearML UI, do you have it under "plots" tab or "debug samples" tab ?
MuddySquid7 I might have found something, and it is very odd: it seems it will not upload any new images past the history size, which is surprising considering the number of users actively using this feature...
Do you want to try a hack to see if it solves your issue?
Hi MuddySquid7 issue is verified, v1.1.1 will be released in a few hours with a fix.
Thank you for noticing!
oh...so is this a bug?
It was always a bug, only an elusive one 😉
Anyhow, I'll make sure we push a fix to GitHub, an RC is planned for later this week, it will contain it
I still wonder how no one noticed ... (maybe 100 unique title/series report is relatively high threshold)
any chance StorageManager could re-download files only if their size is different from file in cache (as an option)?
I think there is a `force` argument, to force download.
I think the main issue is getting the size from different backends (i.e. s3 /https / etc.)
Maybe we should add it as a GitHub feature request issue?
The main limitation is that the driver "list()" does not return file size.
For example it might be an issue with the default http files-server.
wdyt?
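To make the request concrete, here is a rough sketch of the size check being discussed. This is not an existing StorageManager option; `needs_download` and its `remote_size` parameter are hypothetical, and getting `remote_size` from each backend is exactly the open problem mentioned above:

```python
import os


def needs_download(cached_path, remote_size):
    """Return True when the cached file is missing or its size differs from remote_size.

    remote_size is assumed to be reported by the storage backend; None means "unknown".
    """
    if not os.path.isfile(cached_path):
        return True  # nothing in the cache yet
    if remote_size is None:
        # The backend could not report a size: keep the cached copy
        return False
    return os.path.getsize(cached_path) != remote_size
```

When the size is unknown the sketch reuses the cached file, which matches the current behavior of trusting the cache.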
Notice that you need to pass the returned scroll_id to the next call
scroll_id = response["scroll_id"]
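As a rough sketch of the full loop, assuming a hypothetical `fetch_page` callable that takes the previous `scroll_id` (None on the first call) and returns a dict with `"scroll_id"` and `"items"` keys (the `"items"` key name is an assumption, the real response field depends on the endpoint):

```python
def iterate_pages(fetch_page):
    """Yield every item from a scroll-style paginated API.

    fetch_page is a hypothetical callable: fetch_page(scroll_id) -> dict
    with "scroll_id" and "items" keys.
    """
    scroll_id = None
    while True:
        response = fetch_page(scroll_id)
        items = response.get("items") or []
        if not items:
            break  # an empty page means there is nothing left to fetch
        yield from items
        # Pass the returned scroll_id to the next call
        scroll_id = response["scroll_id"]
```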
Hmm UnevenDolphin73 I think this is the reason, None
and this means that even without a full lock file poetry can still build an environment
The AWS autoscaler will work with IAM rules as long as you have them configured on the machine itself. For SageMaker job scheduling (I'm assuming this is what you are referring to, and not the notebook) you need to select the instance as well (basically the same as EC2). What do you mean by using the k8s glue? Do you mean inheriting it and implementing the same mechanism, but for SageMaker instead of kubectl?
I think this is the discussion you are after:
https://clearml.slack.com/archives/C01H5VAUZ8R/p1612452197004900?thread_ts=1612273112.002400&cid=C01H5VAUZ8R
Hi SmoggyGoat53
What do you mean by "feature store" ? (These days the definition is quite broad, hence my question)
Is the team open to PRs from external people?
Yes please do! PRs are welcomed! I thought we fixed the GitHub readme to reflect it, anyhow I'll make sure we do 🙂
Hi AttractiveFrog67
- Make sure you stored the model's checkpoint (either pass `output_uri=True` in `Task.init` or manually upload)
- When you call `Task.init` pass `continue_last_task=True`
- Now you can do `last_checkpoint = task.models["output"][-1].get_local_copy()` and all you need is to load `last_checkpoint`
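Put together, the steps above could look roughly like this. The helper names `latest_checkpoint_path` and `resume_task` are made up for illustration, and the project and task names are placeholders supplied by the caller:

```python
def latest_checkpoint_path(task):
    """Return a local copy of the newest output model of `task`, or None if there is none."""
    output_models = task.models["output"]
    if not output_models:
        return None
    return output_models[-1].get_local_copy()


def resume_task(project_name, task_name):
    # Imported lazily so the helper above can be used without clearml installed
    from clearml import Task

    task = Task.init(
        project_name=project_name,  # placeholder, supplied by the caller
        task_name=task_name,        # placeholder, supplied by the caller
        continue_last_task=True,    # continue the previous run instead of starting a new one
        output_uri=True,            # keep uploading checkpoints
    )
    return task, latest_checkpoint_path(task)
```

Loading the returned path is framework specific (for example with your framework's own load function), so it is left out of the sketch.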
Oh if this is the case you can probably do
` import os
import subprocess
from time import sleep
from clearml import Task
from clearml.backend_api.session.client import APIClient

client = APIClient()
queue_ids = client.queues.get_all(name="queue_name_here")
while True:
    # Pull the next Task from the queue; wait and retry if the queue is empty
    result = client.queues.get_next_task(queue=queue_ids[0].id)
    if not result or not result.entry:
        sleep(5)
        continue
    task_id = result.entry.task
    # Mark the Task as started before launching it
    client.tasks.started(task=task_id)
    env = dict(**os.environ)
    env['CLEARML_TASK_ID'] = ta...
it works if I run the same command manually.
What do you mean?
Can you do: `docker run -it <my container here> bash`
and then immediately get an interactive bash?
Hi SubstantialElk6
I'm not sure what you are asking 🙂
Basically the `clearml-agent` will pull a Task from an execution queue and execute it (based on the definition on the Task, i.e. git repo, python packages, docker image, etc.)