If the choice is between skipping NaN values or logging them, I find it a tough call; it seems better to log than to skip, but it needs some thought.
So I "think" the issue is that plotly (the UI) doesn't like NaN, and elastic (which stores the scalar) is not a NaN fan either. We need to check whether they both agree on the representation; then it should be easy to fix...
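In the meantime, a possible workaround on the reporting side (just a sketch; the title/series names are placeholders, and replacing NaN with 0.0 is only one choice):
import math
from clearml import Logger

def report_value(step, value):
    # replace NaN with a sentinel so neither the UI nor the backend ever sees it
    if math.isnan(value):
        value = 0.0
    Logger.current_logger().report_scalar(title="loss", series="train", value=value, iteration=step)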
Maybe you could open a GitHub issue, so we do not forget?
Hmm I suspect the 'set_initial_iteration' does not change/store the state on the Task, so when it is launched, the value is not overwritten. Could you maybe open a GitHub issue on it?
ReassuredTiger98 All that said, how about opening an Issue on GitHub (feature request)? If we get a bit of support from users, we could definitely add it
Hi SarcasticSparrow10
You will need to have multiple trains-agents, but they will be sharing the same queue (i.e. pulling jobs from the same queue the HPO process is pushing to).
Make sense?
basically PVC for all the DBs 🙂
Hi @<1720249416255803392:profile|IdealMole15>
I'm assuming you mean on a remote machine with clearml-agent running ?
If you do, then you either use clearml-task to create a Task (Job) and specify the container and script, or click on "Create New Experiment" in the UI, and fill out the git repo / script and specify the docker image.
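For example, something along these lines (a sketch; the project, repo and image names are placeholders):
clearml-task --project MyProject --name my_job --repo https://github.com/user/repo.git --script train.py --docker nvidia/cuda:11.8.0-runtime-ubuntu22.04 --queue default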
Make sense?
Hi @<1726410010763726848:profile|DistinctToad76>
Why not just report scalars? You can use the x-axis as "iterations" if this is running in real time to collect the prompts.
If this is a summary, then just report a scatter plot (you can also specify the names of the axes and the series)
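Something like this (a minimal sketch; titles and values are placeholders):
from clearml import Logger

logger = Logger.current_logger()
# real time: one scalar per prompt, using the running prompt count as the iteration
logger.report_scalar(title="prompts", series="score", value=0.82, iteration=42)
# summary: a scatter plot with named axes
logger.report_scatter2d(title="prompt summary", series="scores",
                        scatter=[[0, 0.82], [1, 0.64]], iteration=0,
                        xaxis="prompt #", yaxis="score", mode="markers")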
GreasyPenguin14 could you test with the matplotlib lib example ? (I cannot reproduce it and it seems like something to do with pycharm and matplotlib backend)
https://github.com/allegroai/clearml/blob/master/examples/frameworks/matplotlib/matplotlib_example.py
BattyLion34 are you saying you do not have the "APP CREDENTIALS" section in the profile page?
I wonder if using our own containers, which should have most of the deps, will work better than a simpler container.
Why not, it's transparent, just run in --docker mode and provide a default docker image if the Task doesn't specify one.
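e.g. (the image name is just an example):
clearml-agent daemon --queue default --docker nvidia/cuda:11.8.0-runtime-ubuntu22.04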
If this is the case, then we do not change the matplotlib backend
Also
I've attempted converting the mpl image to PIL and using report_image to push the image, to no avail.
What are you getting? error / exception ?
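For reference, this is roughly what I would expect to work (a sketch; the figure content and titles are placeholders):
import io
import matplotlib.pyplot as plt
from PIL import Image
from clearml import Logger

fig, ax = plt.subplots()
ax.plot([1, 2, 3])
buf = io.BytesIO()
fig.savefig(buf, format="png")  # render the mpl figure into a buffer
buf.seek(0)
pil_image = Image.open(buf)
Logger.current_logger().report_image(title="debug", series="mpl", iteration=0, image=pil_image)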
ReassuredTiger98 I ❤ the DAG in ASCII!!!
port = task_carla_server.get_parameter("General/port")
This looks great, and will achieve exactly what you are after!
BTW: when you are done you can do: task_carla_server.mark_aborted(force=True)
And it will shut down the Carla Task 🙂
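Putting it together (a sketch; the task id is a placeholder, and mark_aborted(force=True) is the call from above):
from clearml import Task

task_carla_server = Task.get_task(task_id="<carla_server_task_id>")
port = task_carla_server.get_parameter("General/port")
# ... run against the server ...
task_carla_server.mark_aborted(force=True)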
however setting up the interpreter on PyCharm is different on Mac for some reason, and the video just didn't match what I see
MiniatureCrocodile39 Are you running on a remote machine (i.e. PyCharm + remote ssh) ?
Thanks ElegantCoyote26 I'll look into it. Seems like someone liked our automagical approach 🙂
I think the limit is a few GB, I'm not sure, I'll have to check
And yes the oldest experiments will be deleted first (with the exception of published experiments, they will be deleted last)
Can't say I have noticed that, is this a delay on the send ? Which for some reason is correlated with the epochs ? What was the case with 0.17.5?
Hi TightElk12
it would raise an error if the env where execution happens is not configured to track things on our custom server, to prevent logging to the public demo server?
What do you mean by that? catching the default server instead of the configured one ?
With remote_execution it is command="[...]", but on local it is command='train' like it is supposed to be.
I'm not sure I follow, could you expand ?
Ok, so it doesn't follow the exact same rules as Task.init?
Correct
I was afraid all the logs and outputs of a hyperparameter optimization task would be deleted just because no artifacts were created.
Should not happen 🙂
Hi @<1697419082875277312:profile|OutrageousReindeer5>
Is NetApp S3 protocol enabled or are you referring to NFS mounts?
Hi @<1719524641879363584:profile|ThankfulClams64>
I am using ClearML Pro and pretty regularly I will restart an experiment and nothing will get logged to ClearML.
I use ClearML with pytorch 1.7.1, pytorch-lightning 1.2.2 and Tensorboard auto
All ClearML packages have the latest stable updates (clearml 1.7.4, clearml-agent 1.7.2)
Is this still happening with the latest clearml (clearml==1.16.3rc2)?
What is the TB version?
I remember a fix regarding lightning support
Also just making s...
oh sorry my bad, then you probably need to define all the OS environment variables for the python temp folder for the agent (the Task process itself is a child process, so it will inherit them)
TMPDIR=/new/tmp TMP=/new/tmp TEMP=/new/tmp clearml-agent daemon ...
Sure thing, let me know ... 🙂
then I will have to rerun the pipeline code, then manually get the id and update the task.
Makes total sense to me!
Failed auto-generating package requirements: _PyErr_SetObject: exception SystemExit() is not a BaseException subclass
Not sure why you are getting this one?!
ValueError: No projects found when searching for
MyProject/.pipelines/PipelineName
hmm, what are you getting with:
task = Task.get_task(pipeline_uid_here)
print(task.get_project_name())
Hi EagerOtter28
The agent knows how to do the http->ssh conversion on the fly; in your clearml.conf (on the agent's machine) set force_git_ssh_protocol: true
https://github.com/allegroai/clearml-agent/blob/42606d9247afbbd510dc93eeee966ddf34bb0312/docs/clearml.conf#L25
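i.e. the relevant part of clearml.conf would look something like this (a sketch):
agent {
    # convert http git links to ssh on the fly
    force_git_ssh_protocol: true
}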