Does it work if I launch the clearml-agent in docker and pip doesn't know the packages to install?
Not sure I follow... the "detect_with_pip_freeze" flag (when set) will tell clearml (at runtime) to create the "installed packages" directly from pip freeze (instead of analyzing the code)
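For reference, a minimal sketch of where that flag lives in clearml.conf (assuming the standard sdk.development section):
```
sdk {
    development {
        # when true, "installed packages" is taken from `pip freeze`
        # instead of analyzing the code imports
        detect_with_pip_freeze: true
    }
}
```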
PricklyRaven28 did you set the IAM role support in the conf?
https://github.com/allegroai/clearml/blob/0397f2b41e41325db2a191070e01b218251bc8b2/docs/clearml.conf#L86
CluelessFlamingo93 I would also fix the pip version requirements to:
```
pip_version: ["<20.2 ; python_version < '3.10'", "<22.3 ; python_version >= '3.10'"]
```
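For context, a rough sketch of where this lives in clearml.conf (assuming the agent.package_manager section):
```
agent {
    package_manager {
        # pin pip per python version
        pip_version: ["<20.2 ; python_version < '3.10'", "<22.3 ; python_version >= '3.10'"]
    }
}
```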
Although I didn't understand why you mentioned torch in my case?
Just a guess 🙂 other frameworks do multi-process as well,
I would guess it relates to the parallelization of Task execution by the HyperParameterOptimizer class?
Yes, that might be it. It's basically a by-product of using Python's "Process" class for multiprocessing. We are working on a fix, not a trivial one unfortunately.
Now will these 10 experiments be of different names? How will I know these are part of the 'mnist1' HPO case?
Yes (they will have the specific HP name/value combination).
FYI names are not unique so in theory you could have multiple experiments with the same name.
If you look under the Configuration tab, you will find all the configuration arguments for the experiment. You can also add specific arguments to the experiment table (click the cogwheel at the top right corner, and select...
Hi OutrageousGiraffe8
Does anybody knows why this is happening and is there any workaround, e.g. how to manually report model?
What exactly is the error you are getting? And which clearml version are you using?
Regarding manual model reporting:
https://clear.ml/docs/latest/docs/fundamentals/artifacts#manual-model-logging
Assuming TensorFlow (which would be an entire folder):
```python
local_folder_or_files = model.get_weights_package()
```
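If you need to report a model manually, a minimal sketch using OutputModel (the weights path here is a placeholder):
```python
from clearml import Task, OutputModel

task = Task.init(project_name="examples", task_name="manual model logging")
output_model = OutputModel(task=task, framework="tensorflow")
# register (and optionally upload) an existing weights file
output_model.update_weights(weights_filename="/path/to/model.h5")
```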
Hi DashingHedgehong5
Is the text the labels on the histogram buckets?
Notice the xlabels argument, is this what you are looking for?
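A minimal sketch of passing xlabels to report_histogram (title/series/values are placeholders):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="histogram labels")
task.get_logger().report_histogram(
    title="error distribution",
    series="errors",
    values=[3, 7, 2],
    iteration=0,
    xlabels=["low", "medium", "high"],  # text labels for the buckets
)
```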
Hi @<1566596960691949568:profile|UpsetWalrus59>
you should call it before initializing the Task:
```python
Task.ignore_requirements("pywin32")
task = Task.init(...)
```
I think I found something,
https://github.com/allegroai/clearml/blob/e3547cd89770c6d73f92d9a05696018957c3fd62/clearml/storage/helper.py#L1442
What's the boto version you have installed?
That said, the arguments are passed inside the executed code (i.e. monkey-patched into the frameworks). This allows it to log and change all the arguments, including the default ones, and allows you to edit them.
Does that make sense ?
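For example, with argparse (a minimal sketch; project/task names are placeholders):
```python
from argparse import ArgumentParser
from clearml import Task

# clearml patches argparse at runtime, so arguments parsed after Task.init()
# are logged automatically, defaults included, and can be edited/overridden
# from the UI when the task is executed by an agent
task = Task.init(project_name="examples", task_name="argparse demo")

parser = ArgumentParser()
parser.add_argument("--lr", type=float, default=0.001)
args = parser.parse_args()
```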
PompousHawk82 unfortunately this is kind of binary, either you have full tracking of load/save operations or you do not.
This warning message will disappear in the next version as we will be able to log multiple models under the same Task :)
Can you put the task.connect line here? (btw: I would assume there is no need for an additional connect if using hydra+fire, no?)
I just cloned it from the examples that are available in the SaaS console upon account creation
Ohhh! that would explain it. Maybe it is broken there?! let me check a second
Questions
I want to trigger a retrain task when F1
That means that in inference you are reporting the F1 score, correct?
As part of the retraining I have to train all the models and then have to choose best one and deploy it
Are you passing output_uri to Task.init? Are you storing the model as an artifact?
You can tag your model/task with "best" tag (and untag the previous one). Then in production , look for the "best" task and get its model
Thoughts?
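A rough sketch of that flow (project/task names are placeholders):
```python
from clearml import Task

# after retraining: tag the winning task as "best"
winner = Task.get_task(task_id="winning_task_id_here")
winner.add_tags(["best"])

# in production: look up the "best" task and fetch its latest output model
tasks = Task.get_tasks(project_name="my_project", tags=["best"])
model_path = tasks[0].models["output"][-1].get_local_copy()
```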
Hi @<1684010629741940736:profile|NonsensicalSparrow35>
But the provided command is missing the url target for the curl so it is not complete.
Not sure I followed. Did you specify "NEW_ADDRESS"?
Or is it that in both cases the URL is localhost?
DefeatedOstrich93 can you verify lightning actually only stored it once?
tf datasets is able to handle batch downloading quite well.
SubstantialElk6 I was not aware of that, I was under the impression tf dataset is accessed on a file level, no?
ohh sorry, weights_url=path
Basically the url can be a local path to the weights file 🙂
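For example, assuming this is about InputModel.import_model (a minimal sketch, paths are placeholders):
```python
from clearml import InputModel

# weights_url accepts a local path to the weights file, not only a remote URL
model = InputModel.import_model(
    weights_url="/local/path/to/weights.pt",
    name="imported weights",
)
```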
EnormousWorm79 you mean to get the DAG graph of the Dataset (like you see in the plots section)?
Hi BlandPuppy7 , is this Trains related, are you trying to integrate it, and need help?
in the UI the installed packages will be determined through the code via the imports as usual ...
This is only in the case where a user manually executed their code (i.e. without trains-agent). Then in the UI, after they clone the experiment, they can click the "Clear" button (hover over "installed packages" to see it) and remove all the automatically detected packages. This will result in the trains-agent using the "requirements.txt".
Hi ConfusedPig65
Any Keras model will be automatically uploaded if you pass an upload URI to Task.init:
```python
task = Task.init('examples', 'keras upload test', output_uri="...")
```
(You can also pass output_uri="s3://bucket/folder", or change the default output_uri in the clearml.conf file)
After this line any Keras model will be automatically uploaded (you will see it under the Artifacts Tab)
Accessing models from executed tasks:
```python
trains_task = Task.get_task('task_uid_here')
last_check...
```
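As a rough sketch, one way to grab the last output model of an executed task (assuming the models property):
```python
from clearml import Task

trains_task = Task.get_task('task_uid_here')
last_model = trains_task.models["output"][-1]  # last output model on the task
local_weights = last_model.get_local_copy()
```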
If you are using the latest RC:
```
pip install clearml==0.17.5rc5
```
You can pass True, and it will use the "files_server" as configured in your clearml.conf.
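For example (a minimal sketch):
```python
from clearml import Task

# output_uri=True uploads models to the files_server from clearml.conf
task = Task.init(project_name='examples', task_name='upload test', output_uri=True)
```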
I used the http link as a filler to point to the files_server.
Make sense ?
It does not upload; the default behavior is to log the artifact (so you know where it is stored, but without enforcing unnecessary uploads).
If you were to change:
```python
task = Task.init(project_name='examples', task_name='Keras with TensorBoard example')
```
to:
```python
task = Task.init(project_name='examples', task_name='Keras with TensorBoard example', output_uri="...")
```
it would also upload the model.
Yey!
My pleasure 🙂