Is there a way to visualize the pipeline such that this step is "stuck" in executing?
Yes there is, the pipeline plot (see the Plots section on the Pipeline Task) will show the current state of the pipeline.
But I have a feeling you have something else in mind?
Maybe add a Tag on the pipeline Task itself (then remove it when it continues)?
I'm assuming you need something that is quite prominent in the UI, so someone knows ?
(BTW I would think of integrating it with the slack monitor, to p...
Hi @<1536518770577641472:profile|HighElk97>
Is there a way to change the smoothing algorithm?
Just like with TB, this is front-end, not really something you can control ...
That said you can report a smoothed value (i.e. via python) as additional series, wdyt ?
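To make the workaround concrete, here is a minimal sketch of reporting a pre-smoothed series alongside the raw one. The EMA weight and the series names are arbitrary choices for illustration; `Logger.report_scalar` is the standard ClearML scalar API, shown commented out since it needs a live Task:

```python
def ema(values, weight=0.6):
    """Exponential moving average, the same family of smoothing TB applies client-side."""
    smoothed, last = [], values[0]
    for v in values:
        last = weight * last + (1 - weight) * v
        smoothed.append(last)
    return smoothed

raw = [1.0, 3.0, 2.0, 5.0]
smooth = ema(raw)

# Report both series so they show up side by side in the UI
# (requires a running Task, so only sketched here):
# from clearml import Task
# logger = Task.current_task().get_logger()
# for i, (r, s) in enumerate(zip(raw, smooth)):
#     logger.report_scalar("loss", "raw", value=r, iteration=i)
#     logger.report_scalar("loss", "smoothed", value=s, iteration=i)
```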
That said, it might be a different backend, I'll test with the demo server
In case of scalars it is easy to see (maximum number of iterations is a good starting point)
BitterLeopard33
How to create a parent-child Dataset with the same dataset_id and only access the child?
Dataset ID is unique; the child will have a different UID. The name of the Dataset can be the same though.
Specifically to create a child Dataset:
https://clear.ml/docs/latest/docs/clearml_data#datasetcreate
child = Dataset.create(..., parent_datasets=['parent_dataset_id'])
Are there any ways to access the parent dataset (assuming it's large and I don't want to download it)?
...
OddAlligator72 just so I'm sure I understand your suggestion:
pickle the entire locals() on the current machine.
On remote machine, create a mock entry point python, restore the "locals()" and execute the function ?
BTW:
Making this actually work regardless of the machine is some major magic in motion ... 🙂
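To make the suggestion concrete, here is a toy sketch of that mechanism. All names here are made up for illustration, and part of the "magic" is that a real locals() usually contains unpicklable entries (modules, open files, etc.), so the snapshot has to filter them out:

```python
import pickle

def capture(namespace):
    # Keep only picklable, non-dunder entries from the namespace
    # (in practice you would pass locals() here).
    snapshot = {}
    for name, value in namespace.items():
        if name.startswith("__"):
            continue
        try:
            pickle.dumps(value)
            snapshot[name] = value
        except Exception:
            pass  # modules, sockets, etc. cannot travel
    return pickle.dumps(snapshot)

def run_remote(payload, func_name):
    # "Mock entry point" on the remote machine: restore the
    # namespace and call the requested function.
    ns = pickle.loads(payload)
    return ns[func_name](ns["x"], ns["y"])

# On the current machine:
def add(a, b):
    return a + b

payload = capture({"x": 2, "y": 3, "add": add})

# On the "remote" machine:
print(run_remote(payload, "add"))  # 5
```

Note that pickling the function itself only stores a reference to it, which is exactly why doing this across machines needs the heavier machinery hinted at above.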
So what is the difference ? both running from the same machine ?
data["encoded_lengths"]
This makes no sense to me, data is a numpy array, not a pandas frame...
Hi LazyTurkey38
Configuring these folders will be pushed later today 🙂
Basically you'll have in your clearml.conf
agent {
docker_internal_mounts {
sdk_cache: "/clearml_agent_cache"
apt_cache: "/var/cache/apt/archives"
ssh_folder: "/root/.ssh"
pip_cache: "/root/.cache/pip"
poetry_cache: "/root/.cache/pypoetry"
vcs_cache: "/root/.clearml/vcs-cache"
venv_build: "/root/.clearml/venvs-builds"
pip_download: "/root/.clearml/p...
I think it would be nicer if the CLI had a subcommand to show the content of ~/.clearml_data.json .
Actually, it only stores the last dataset id at the moment, so no, not much 🙂
But maybe we should have a command line that just outputs the current dataset id, this means it will be easier to grab and pipe
WDYT?
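In the meantime you can grab it with a one-liner. This sketch assumes the file holds a flat JSON object with the id under a `dataset_id` key; check your actual ~/.clearml_data.json, the key name may differ (a sample file is created here just so the demo is self-contained):

```shell
# Create a sample file for this demo (your real file lives at ~/.clearml_data.json)
echo '{"dataset_id": "abc123"}' > /tmp/clearml_data.json

# Extract the id so it can be piped onward, e.g. into other clearml-data calls
DATASET_ID=$(python3 -c "import json,sys; print(json.load(open(sys.argv[1]))['dataset_id'])" /tmp/clearml_data.json)
echo "$DATASET_ID"
```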
That's with the key at
/root/.ssh/id_rsa
You mean inside the container that the autoscaler spun up?
Notice that the agent by default mounts the host's .ssh over the existing .ssh inside the container; if you do not want this behavior you need to set agent.disable_ssh_mount: true in clearml.conf
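In clearml.conf that would look something like this (placement sketch, assuming the standard top-level agent section):

```
agent {
    # Do not mount the host's ~/.ssh over the container's ~/.ssh
    disable_ssh_mount: true
}
```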
Hi @<1564785037834981376:profile|FrustratingBee69>
It's the previous container I've used for the task.
Notice that what you are configuring is the Default container, i.e. if the Task does not "request" a specific container, then this is what the agent will use.
On the Task itself (see Execution Tab, down below Container Image) you set the specific container for the Task. After you execute the Task on an Agent, the agent will put there the container it ended up using. This means that ...
@<1699955693882183680:profile|UpsetSeaturtle37> good progress, regarding the error, 0.15.0 is supposed to be out tomorrow, it includes a fix to that one.
BTW: can you run with --debug
@<1542316991337992192:profile|AverageMoth57> it sounds like you should use SSH authentication for the agent, just set force_git_ssh_protocol: true
None
And make sure you have the SSH keys on the agent's machine
Should work with report_surface. Notice that this is not triangles; the assumption is a fixed sampling of the surface: the sample size is the numpy matrix, and the sample value (i.e. Z) is the value in the matrix. This means that if you have a set of mesh triangles, you have to project and sample it.
I think this is what you are after https://trimsh.org/trimesh.voxel.base.html?highlight=matrix#trimesh.voxel.base.VoxelGrid.matrix
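To illustrate the expected input shape: a fixed grid where each matrix cell holds the surface height Z. The grid resolution and the surface function below are arbitrary; the `report_surface` call is the standard ClearML Logger API, shown commented out since it needs a live Task:

```python
import numpy as np

# Fixed sampling of a surface: Z[i, j] is the height at grid cell (i, j).
xs = np.linspace(-2, 2, 50)
ys = np.linspace(-2, 2, 50)
X, Y = np.meshgrid(xs, ys)
Z = np.exp(-(X**2 + Y**2))  # a triangle mesh must first be projected/sampled onto such a grid

print(Z.shape)  # (50, 50)

# from clearml import Task
# Task.current_task().get_logger().report_surface(
#     "surface", "sample", iteration=0, matrix=Z)
```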
PungentLouse55 could you test with 0.15.2rc0 see if there is any difference ?
Thanks ReassuredTiger98 , yes that makes sense.
What's the python version you are using ?
So if I plot an image with matplotlib, it would not upload? I need to use the logger.
Correct, if you have no "main" task, no automagic 🙂
So how can I make it run with the "automagic"?
Automagic logs a single instance... unless those are subprocesses, in which case, the main task takes care of "copying" itself to the subprocess.
Again what is the use case for multiple machines?
PungentLouse55 from the screenshot I assume the experiment template you are trying to optimize is not the one from the trains/examples 🙂
In that case, and based on the screenshots, the prefix is "Args/" as this is the section name.
Regarding the objective metric, again based on your screenshots:
objective_metric_title="Accuracy"
objective_metric_series="Validation"
Make sense ?
Example use case:

an_optimizer = HyperParameterOptimizer(
    # This is the experiment we want to optimize
    base_task_id=args['template_task_id'],
    # here we define the hyper-parameters to optimize
    hyper_parameters=[
        UniformIntegerParameterRange('General/layer_1', min_value=128, max_value=512, step_size=128),
        UniformIntegerParameterRange('General/layer_2', min_value=128, max_value=512, step_size=128),
        DiscreteParameterRange('General/batch_size', values=[...
In order for the sample to work you have to run the template experiment once. Then the HP optimizer will find the best HP for it.
DepressedChimpanzee34 I cannot find cfg.py here
https://github.com/allegroai/clearml/tree/master/examples/frameworks/hydra/config_files
(or anywhere else)
That sounds like an internal tritonserver error.
https://forums.developer.nvidia.com/t/provided-ptx-was-compiled-with-an-unsupported-toolchain-error-using-cub/168292
No worries, let's assume we have:

base_params = dict(
    field1=dict(param1=123, param2='text'),
    field2=dict(param1=123, param2='text'),
    ...
)

Now let's just connect field1:

task.connect(base_params['field1'], name='field1')
That's it 🙂
However, that would mean passing back the hostname to the Autoscaler class.
Sorry, my bad, the agent does that automatically in real-time when it starts; no need to pass the hostname, it takes it from the VM (usually they have some random number/id)
So if you set it, then all nodes will be provisioned with the same execution script.
This is okay in a way, since the actual "agent ID" is by default set based on the machine hostname, which I assume is unique ?
Interesting question, should work and looks like an interesting combination, I'm curious what you come up with.
btw: grafana itself can already provide a lot of alerts for drift etc, this is basically their histogram delta feature
Okay, good news: there is a fix. Bad news: sync to GitHub will only be tomorrow
SmallBluewhale13
And the Task.init registers 0.17.2, even though it prints (while running the same code from the same venv) 0.17.2?