I figured out the problem...
Nice!
Unfortunately, the hyperparameters in the configuration object seem to take precedence over the hyperparameters in the Hyperparameters section
Hmm, what do you mean by that? How did you construct the code itself? (you should be able to "prioritize" one over the other)
Thanks VivaciousPenguin66 !
BTW: if you are running the local code with conda, you can set the agent to use conda as well (notice that if you are running locally with pip, the agent's conda env will use pip to install the packages, to avoid version mismatches)
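For example, a minimal clearml.conf sketch for the agent (treat the exact layout as illustrative):
```
agent {
    package_manager {
        # use conda to create the task's execution environment
        type: conda
    }
}
```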
Hi CurvedHedgehog15
I would like to optimize hparams saved in Configuration objects.
Yes, this is a tough one.
Basically the easiest way to optimize is with hyperparameter sections, as they are basically key/value pairs you can control from the outside (see the HPO process)
Configuration objects are, well, blobs of data that "someone" can parse. There is no real restriction on them, since there are many standards to store them (yaml, json, ini, dot notation, etc.)
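To illustrate the difference, a minimal sketch (project/parameter names here are just placeholders):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="hparams demo")

# key/value hyperparameters: each key becomes an individually
# editable field that the HPO process can override from the outside
params = {"lr": 0.001, "batch_size": 32}
task.connect(params)

# configuration object: stored as a single blob (yaml/json/etc.),
# opaque to the optimizer
config = {"model": {"layers": [64, 64], "activation": "relu"}}
task.connect_configuration(config, name="model config")
```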
The quickest way is to add...
I see, let me check something
Add '/', like you would with a file system:
Task.init(project_name='main_project/sub_project', task_name='test')
Also, can the image not be pulled from dockerhub but used from the local build instead?
If you have your docker configured to pull from a local artifactory, then the agent will do the same (it is calling the docker command just like you do)
agent.default_docker.arguments: "--mount type=bind,source=$DATA_DIR,target=/data"
Notice that the example uses the default docker arguments
If you want the mount to always be there, use extra_docker_arguments:
https://github.com/...
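For example, a clearml.conf sketch reusing the same bind-mount (exact layout is illustrative):
```
agent {
    # used only as the default docker arguments, can be overridden per task
    default_docker {
        arguments: ["--mount", "type=bind,source=$DATA_DIR,target=/data"]
    }
    # always appended to the docker command line
    extra_docker_arguments: ["--mount", "type=bind,source=$DATA_DIR,target=/data"]
}
```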
Hi AstonishingRabbit13
now I'm training yolov5 and I want to save all the info (model and metrics) with clearml to my bucket..
The easiest thing (assuming you are running YOLOv5 with python train.py) is to add the following env variable:
CLEARML_DEFAULT_OUTPUT_URI="gs://..." python train.py
Notice that you need to pass your GS credentials here:
https://github.com/allegroai/clearml/blob/d45ec5d3e2caf1af477b37fcb36a81595fb9759f/docs/clearml.conf#L113
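Alternatively, a minimal sketch using the output_uri argument of Task.init (project/task names are just placeholders):
```
from clearml import Task

# everything the task saves (models, artifacts) is uploaded to the bucket
task = Task.init(
    project_name="yolo",   # illustrative names
    task_name="train",
    output_uri="gs://..."  # your GS bucket
)
```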
Hi WorriedParrot51
Assuming you run the code "manually" once (i.e. without the agent), then when you call Task.init it will register the argparser.
When running with the agent, the first time you will call parse, it will automatically override the argparse defaults with the values stored in the Task.
Make sense?
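A minimal sketch of that flow (argument names are illustrative):
```
from argparse import ArgumentParser
from clearml import Task

# Task.init registers the argparser
task = Task.init(project_name="examples", task_name="argparse demo")

parser = ArgumentParser()
parser.add_argument("--lr", type=float, default=0.001)

# when executed by the agent, parse_args() returns the values
# stored on the Task instead of the defaults above
args = parser.parse_args()
```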
I am getting None for Task.current_task() at the beginning of my script.
Task.init() is doing the magic; only after this call will you have a current_task (either running manua...
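In other words, a minimal sketch:
```
from clearml import Task

print(Task.current_task())  # None - Task.init() was not called yet
task = Task.init(project_name="examples", task_name="demo")
print(Task.current_task())  # the task created above
```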
Hi UpsetBlackbird87
This is an Optuna decision on how many tests to run concurrently.
You limited it to 100, but remember that Optuna runs a Bayesian optimization process, where it decides on the best set of arguments based on the performance of the previous set. This means it will first try X trials, then decide on the next batch.
That said, you can add a pruner to Optuna specifying how it should start
https://optuna.readthedocs.io/en/v1.4.0/reference/pruners.html#optuna.pruners.Median...
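For example, a minimal sketch with Optuna's MedianPruner (the values are illustrative):
```
import optuna

# prune unpromising trials, but only after 5 "startup" trials have
# completed, so the pruner has a baseline to compare against
pruner = optuna.pruners.MedianPruner(n_startup_trials=5)
study = optuna.create_study(pruner=pruner)
```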
on the host machine or inside the containers that are spinning on the host machine ?
Sure thing, thanks FlutteringWorm14 !
to avoid downgrade to clearml==1.9.1
I will make sure this is solved in clearml==1.9.3 & clearml-session==0.5.0 quickly
PompousBeetle71 the code is executed without arguments; at run-time trains / trains-agent will pass the arguments (as defined on the task) to the argparser. This means that you get the ability to change them and also type checking
PompousBeetle71 if you are not using argparser how do you parse the arguments from sys.argv? manually?
If that's the case, then after parsing you can connect a dictionary to the Task and you will have the desired behavior:
task.connect(dict_with_arguments...
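A complete minimal sketch of that idea (the manual sys.argv parsing here is just illustrative):
```
import sys
from clearml import Task

task = Task.init(project_name="examples", task_name="manual args")

# manually parse sys.argv into a dict, e.g. "--lr 0.1" -> {"lr": "0.1"}
args = dict(zip(sys.argv[1::2], sys.argv[2::2]))
args = {k.lstrip("-"): v for k, v in args.items()}

# connecting the dict lets the agent override the values,
# just like it does with argparse
task.connect(args)
```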
Hmm I assume it is not running from the code directory...
(I'm still amazed it worked the first time)
Are you actually using "." ?
Yes, actually ensuring pip is there cannot be skipped (I think in the past it caused too many issues, hence the version limit etc.)
Are you saying it takes a lot of time when running? How long is the actual process that the Task is running (just to normalize times here)?
AdventurousRabbit79 are you passing cache_executed_step=False to the PipelineController?
https://github.com/allegroai/clearml/blob/332ceab3eadef4997e897d171957975a247a6dc1/clearml/automation/controller.py#L129
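For reference, a minimal sketch (step and task names are illustrative); cache_executed_step is passed per step:
```
from clearml import PipelineController

pipe = PipelineController(name="my pipeline", project="examples", version="1.0")

pipe.add_step(
    name="step_one",
    base_task_project="examples",
    base_task_name="step one task",
    cache_executed_step=False,  # force re-execution instead of reusing a cached run
)
```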
Could you send a usage example ?
my pipeline controller always updates to the latest git commit id
This will only happen if the Task the pipeline creates has no specific commit ID, and instead just uses the latest from the git repo. Is this the case ?
@<1523701083040387072:profile|UnevenDolphin73> it's looking for any of the files:
None
Hi @<1541954607595393024:profile|BattyCrocodile47>
is this on your self hosted machine ?
Hi PompousBeetle71
Could you test the latest RC? I think the warnings were fixed:
pip install trains==0.16.2rc0
Let me know...
FrothyShark37 any chance you can share snippet to reproduce?
Hi DeliciousBluewhale87
You mean per Task? Is it reporting? Is it like the project overview?
is everything on the same network?
Oh no, you are absolutely correct, it is broken (I mean I have no idea why it lists Hydra, or how it got there). I will let the guys know and fix it.
Bottom line, after you clone it, please edit the installed packages and remove the "Hydra" line and replace with just "hydra-core" (no need for version).
The format is the same as "requirements.txt" and will affect the venv created by the agent
I'm not sure if this was solved, but I am encountering a similar issue.
Yep, it was solved (I think v1.7+)
With spawn and forkserver (which is used in the script above), ClearML is not able to automatically capture PyTorch scalars and artifacts.
The "trick" is to have Task.init before you spawn your code, then (since your code will not start from the same state), you should call Task.current_task(), which would basically make sure everything is...
Are you using tensorboard or do you want to log directly to trains ?
Hi AverageBee39
What's the clearml-server and clearml package you are using?
(It looks like some capability that is missing from the server, i.e. needs an upgrade?!)
In the installed packages section it includes pywin32 == 303 even though that is not in my requirements.txt.
So for some reason it is being detected (meaning your code base actually imports it in code)
But you can just remove it, either by manually editing the cloned Task (right click, reset, then you can edit the section), or via code:
Task.ignore_requirements("pywin32")
task = Task.init(...)