I can't seem to find a difference between the two, so why would matplotlib get listed while pandas does not... Is any other package missing?
BTW: as an immediate "hack", before your Task.init call add the following:
Task.add_requirements("pandas")
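A minimal sketch of that ordering (the project/task names are placeholders):

```python
from clearml import Task

# This must run *before* Task.init, otherwise the requirement is not
# picked up when the task's packages are recorded
Task.add_requirements("pandas")  # a version can also be pinned, e.g. Task.add_requirements("pandas", "1.5.3")

task = Task.init(project_name="examples", task_name="pandas requirement test")
```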
You mean I can use Epoch001/ and Epoch002/ to split them into groups, and get a 100 limit per group?
Yes, then the 100 limit is per "Epoch001", and there is another 100 limit for "Epoch002", etc. 🙂
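If these are reported debug samples, a hedged sketch of what that grouping could look like (all names here are illustrative):

```python
import numpy as np
from clearml import Task

task = Task.init(project_name="examples", task_name="grouped debug samples")
logger = task.get_logger()

# Each distinct title ("Epoch001", "Epoch002", ...) forms its own group,
# so the per-group history limit applies to each title separately
for i in range(5):
    img = (np.random.rand(64, 64, 3) * 255).astype("uint8")
    logger.report_image(title="Epoch001", series=f"sample_{i}", iteration=i, image=img)
```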
Oh, I was assuming you were passing the entire DB backups to the cloud.
Are you saying you just want the file server on the cloud? If that is the case, I would just use S3.
In the installed packages section it includes
pywin32 == 303
even though that is not in my requirements.txt.
So for some reason it is being detected (meaning your code base actually imports it in code)
But you can just remove it, either by manually editing the cloned Task (right click, reset, then you can edit the section), or via code:
Task.ignore_requirements("pywin32")
task = Task.init(...)
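A minimal sketch of the code route (assuming it runs before Task.init; project/task names are placeholders):

```python
from clearml import Task

# Drop "pywin32" from the automatically detected requirements
Task.ignore_requirements("pywin32")

task = Task.init(project_name="examples", task_name="no pywin32")
```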
Hi ShallowArcticwolf27
First of all:
If the answer to number 2 is no, I'd loveee to write a plugin.
Always appreciated ❤
Now actually answering the Q:
Any torch.save (or any other framework save) will either register or automatically upload the file (or folder) in the system. If it is a folder, it will be zipped and uploaded; if it is a file, it is just uploaded to the assigned storage output (the clearml-server, any object storage service, or a shared folder). I'm not actually sure I...
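For example, a minimal PyTorch sketch (the output_uri bucket is a placeholder):

```python
import torch
from clearml import Task

# any storage target works for output_uri: the clearml-server files server,
# s3://..., gs://..., azure://..., or a shared folder
task = Task.init(project_name="examples", task_name="auto model upload",
                 output_uri="s3://my-bucket/models")

model = torch.nn.Linear(4, 2)
# this save is intercepted automatically: the file is registered as an
# output model of the task and uploaded to the output_uri above
torch.save(model.state_dict(), "model.pt")
```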
Basically it locks the Task (so you cannot reset or change it). Usually it also marks it "ready to use", etc. It will also publish the models the Task created.
SkinnyPanda43 issue verified, this seems to be related to Python 3.9 and subprocesses.
Let me check what we can do
Hmm ReassuredTiger98 can you send the full log? I think it should have worked (but as you mentioned, it might be a conda/pip mix?!)
ReassuredTiger98 yes this is odd:
also:
Warning, could not locate PyTorch torch==1.12 matching CUDA version 115, best candidate 1.12.0.dev20220407
Seems like it found a matching version and did not use it...
Let me check that
So it makes sense it installs v8.0.1
(maybe originally you provided no version and it installed the latest one)
This is basically pip doing the package version resolving.
What do you have under the "installed packages" section?
Hi AstonishingWorm64
Is this the same ?
https://github.com/allegroai/clearml-serving/issues/1
(I think it was fixed on the latest branch; we are releasing 0.3.2 later today with a fix)
Can you try:
pip install git+
FileNotFoundError: [Errno 2] No such file or directory: 'tritonserver': 'tritonserver'
This is odd.
Can you retry with the latest from GitHub?
pip install git+
Bottom line: the driver version on the host machine does not support the CUDA version you have in the docker container.
Is this reproducible with the hpo example here:
https://github.com/allegroai/clearml/tree/400c6ec103d9f2193694c54d7491bb1a74bbe8e8/examples/optimization/hyper-parameter-optimization
What's your clearml version? (And is it possible you verify with the latest version?)
Hi AstonishingWorm64
I think you are correct, there is no external interface to change the docker.
Could you open a GitHub issue so we do not forget to add an interface for that?
As a temp hack, you can manually clone the "triton serving engine" Task and edit the container image (under the Execution tab).
wdyt?
How so? They are in one place. The creation of the venv is transparent, and the packages there are everything you have in the docker, plus the ability to override them from the UI.
What am I missing here ?
Hi ConfusedPig65
Any keras model will be automatically uploaded if you pass an upload url to the Task init:
task = Task.init('examples', 'keras upload test', output_uri="
")
(You can also pass output_uri="s3://bucket/folder" or change the default output_uri in the clearml.conf file)
After this line any keras model will be automatically uploaded (you will see it under the Artifacts Tab)
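For reference, a sketch of the clearml.conf alternative mentioned above (assuming the standard sdk.development section; the bucket is a placeholder):

```
# clearml.conf
sdk {
    development {
        # every Task.init call will default to this storage target
        default_output_uri: "s3://my-bucket/models"
    }
}
```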
Accessing models from executed tasks:
```
trains_task = Task.get_task('task_uid_here')
last_check...
```
If you are using the latest RC:
pip install clearml==0.17.5rc5
You can pass output_uri=True and it will use the "files_server" as configured in your clearml.conf.
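i.e. something like (project/task names are placeholders):

```python
from clearml import Task

# output_uri=True -> upload models to the files_server from clearml.conf
task = Task.init(project_name="examples", task_name="upload test", output_uri=True)
```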
I used the http link as a filler to point to the files_server.
Make sense?
Then check in your clearml.conf under files_server, and use what you have there (for example http://localhost:8081 ).
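For reference, the relevant section looks something like this (the values shown are the usual defaults for a local server):

```
# clearml.conf
api {
    api_server: http://localhost:8008
    web_server: http://localhost:8080
    files_server: http://localhost:8081
}
```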
You can check the keras example, run it twice, on the second time it will continue from the previous checkpoint and you will have input and output model.
https://github.com/allegroai/clearml/blob/master/examples/frameworks/keras/keras_tensorboard.py
So I have a task that just loads a model, but I don't see it as an artifact in the UI
You should see it under Artifacts → Input Model, if you are calling the Keras load function (or similar).
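A minimal sketch, assuming a TF/Keras environment and a hypothetical model path:

```python
from clearml import Task
from tensorflow import keras

task = Task.init(project_name="examples", task_name="load model test")

# ClearML patches the Keras load function, so this call registers the
# model as an Input Model on the task (visible in the Artifacts tab)
model = keras.models.load_model("my_model.h5")  # hypothetical path
```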
Hmm GreasyLeopard35 can you specify the range you are passing to the HPO, as well as the type of optimization class? (grid/random/optuna etc.)
```
from clearml.automation.parameters import LogUniformParameterRange
sampler = LogUniformParameterRange(name='test', min_value=-3.0, max_value=1.0, step_size=0.5)
sampler.to_list()
Out[2]:
[{'test': 1.0},
 {'test': 3.1622776601683795},
 {'test': 10.0},
 {'test': 31.622776601683793},
 {'test': 100.0},
 {'test': 316.22776601683796},
 {'test': 1000.0},
 {'test': 3162.2776601683795}]
```
GreasyLeopard35 I think you are on to something, I think UniformParameterRange just misses a min value:
https://github.com/allegroai/clearml/blob/fcad50b6266f445424a1f1fb361f5a4bc5c7f6a3/clearml/automation/parameters.py#L168
Should be:
[self.min_value + v*step_size for v in range(0, int(steps))]
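A quick standalone sketch of what the proposed fix would produce for the range above (the plain variables stand in for the class attributes):

```python
# stand-ins for self.min_value / self.max_value / self.step_size
min_value, max_value, step_size = -3.0, 1.0, 0.5
steps = (max_value - min_value) / step_size  # 8 steps

values = [min_value + v * step_size for v in range(0, int(steps))]
print(values)  # [-3.0, -2.5, -2.0, -1.5, -1.0, -0.5, 0.0, 0.5]

# For LogUniformParameterRange these are exponents, i.e. 10**v:
# 1e-3 .. ~3.16, instead of the 1.0 .. 3162.27 shown in the output above
```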