it fails because pip cannot install my_package... so I have to manually edit the section and remove "my_package"
MagnificentSeaurchin79 did you manually add both "." and my_package ?
If so, what was the reasoning to add my_package if pip cannot install it ?
Found it, definitely a bug in the callback; it has no effect on the HPO process itself
OddShrimp85 you can see the full configuration at the top of the Task log. What do you have there? Also what is the clearml python version?
And is "requirements-dev.txt" in your git root folder?
What is your clearml-agent version?
I think EmbarrassedSpider34 is correct.
When you pass the requirements to clearml-task, the agent will actually do the installation, depending on how it was configured (conda / pip).
That said, maybe it is worth adding support to provide the env.yml in the CLI ?
(Notice that adding specific channels needs to be configured on the agent, they are not stored per Task)
AlertCamel57 wdyt?
Example use case:
```
an_optimizer = HyperParameterOptimizer(
    # This is the experiment we want to optimize
    base_task_id=args['template_task_id'],
    # here we define the hyper-parameters to optimize
    hyper_parameters=[
        UniformIntegerParameterRange('General/layer_1', min_value=128, max_value=512, step_size=128),
        UniformIntegerParameterRange('General/layer_2', min_value=128, max_value=512, step_size=128),
        DiscreteParameterRange('General/batch_size', values=[...
```
A few epochs is just fine
Hi PungentLouse55
it depends on the trains-server version you are running.
If the trains-server version is >= 0.16 then you have to add the "Args/" prefix. If you are running an older version, then you should not add any prefix.
PungentLouse55 you can find the metrics in the "original" (aka base template) experiment.
I'll make sure we fix the example, because as you pointed, it is broken :(
I want to optimize hyperparameters with trains.automation but: ...
Yes, you are correct: in the case of the example code it should be "General/...", and if you have ArgParser it should be "Args/...". And yes, it looks like the metric is wrong, it should be "epoch_accuracy" & "epoch_accuracy"
Hi PungentLouse55
Are you referring to the example code ?
PungentLouse55, make sure you fix the objective metric and args:
Add the "General/" prefix to the list of arguments to optimize, and change the objective metric from "Accuracy" to "epoch_accuracy"
In order for the sample to work you have to run the template experiment once. Then the HP optimizer will find the best HP for it.
you are correct, I was referring to the template experiment
Yes, sorry, that wasn't clear 🙂
Thanks StaleKangaroo85, bug is verified. Let me check to see exactly where the bug is.
Two points:
1. Notice that x_labels should be the same size as the histogram.
2. It seems that you have to pass the labels as well (otherwise you get "trace-0"), so if you add labels=['random histogram'] and labels=['random histogram2'], you'll get the correct legend.
Anyhow I'll make sure we also fix it in code so that labels automatically default to [series] if not specified, thanks!
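For reference, a minimal sketch of the kind of call this refers to (title/series/values are made up, and the parameter names assume the current Logger.report_histogram signature):
```
import numpy as np
from clearml import Task, Logger

task = Task.init(project_name='examples', task_name='histogram demo')

values = np.random.randint(10, size=10)
Logger.current_logger().report_histogram(
    title='histograms',
    series='random histogram',
    values=values,
    iteration=0,
    labels=['random histogram'],                # legend entry (otherwise you get "trace-0")
    xlabels=['bin %d' % i for i in range(10)],  # one x label per histogram entry
)
```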
JitteryCoyote63 no you should not (unless you already have the Task.init call in your code). clearml-data will add the Task.init call at the beginning of the code in the entry point. This means you should be able to call Task.current_task() and get back the object.
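A minimal sketch of what that looks like from inside the running code (the print is just illustration):
```
from clearml import Task

# Task.init was already called at the entry point,
# so the live Task object can be fetched anywhere in the code:
task = Task.current_task()
print(task.id, task.name)
```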
What do you have under the "uncommitted changes" on the Task that was created?
UnevenDolphin73 clearml.config.get_remote_task_id() will return the Task ID, not the Task object. In order to get automagic to work, one h...
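If you do need the object and not just the id, a small sketch (assuming the get_remote_task_id() helper referenced above is importable as shown):
```
from clearml import Task
from clearml.config import get_remote_task_id

task_id = get_remote_task_id()         # returns the Task ID string
task = Task.get_task(task_id=task_id)  # fetch the Task object from the ID
# note: fetching the object this way does not attach automatic (automagic) reporting to it
```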
WickedGoat98 give me a minute, I'm not sure it is not ClearML related
WickedGoat98 what's the clearml version you are using?
Will such a docker image need a trains configuration file?
If you need to configure things other than credentials (see above) then yes, you might need to map trains.conf into the pod.
Specifically, if you need to, map your trains.conf to /root/.trains inside the pod/container.
clearml should detect the "main" packages used in the repository (not just the main script); the derivatives will be installed automatically by pip when the agent is installing the environment. Once the agent is done setting up the environment, it updates the Task back with the full list of packages, including all required packages.
Hi UnsightlySeagull42
Just making sure, the two scripts are on your git repo ?
Three options:
1. In your code: Task.init(..., output_uri='s3://.../')
2. Configure a default output_uri to be used by all tasks: https://github.com/allegroai/clearml/blob/64042f6c4fdaaf15b6c5f816f2fbf50f89c313e2/docs/clearml.conf#L156
3. In the UI, after you clone a Task, set the "output destination" under the Execution tab.
In all cases output_uri can be:
- /mnt/share/folder (if you have a shared folder between all machines)
- http://trains-server:8081/
- gs://bucket
- azure://bucket/
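For option 1, a minimal sketch (the project/task names and the bucket path are placeholders):
```
from clearml import Task

task = Task.init(
    project_name='examples',
    task_name='training',
    output_uri='s3://my-bucket/models/',  # models & artifacts will be uploaded here
)
```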
WickedGoat98 no need to open any ports on the agent's machine, the agent is polling the clearml-server, so as long as it can reach it, we are good.
That wasn't scheduled by ClearML.
This means that from ClearML's perspective they are "manual", i.e. the job itself (by calling Task.init) creates the experiment in the system and fills in all the fields.
But for a k8s job, I'm still unsuccessful.
HelpfulDeer76 When you say "unsuccessful" what exactly do you mean ?
Could it be they are reported to the clearml demo server (the default server if no configuration is found) ?
SmarmyDolphin68
BTW: there is no automatic reporting when you have task = Task.get_task(task_id='your_task_id')
It's only active when you have one "main" task.
You can also check the continue_last_task argument in Task.init, it might be a good fit for your scenario
https://allegro.ai/docs/task.html#trains.task.Task.init
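A small sketch of both points (the task id and names are placeholders):
```
from clearml import Task

# Just fetching a task does not attach any automatic reporting to it
other_task = Task.get_task(task_id='your_task_id')

# To keep reporting into the previous run instead of starting a new one:
task = Task.init(
    project_name='examples',
    task_name='training',
    continue_last_task=True,  # continue the last executed task with this project/name
)
```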
I think this is great! That said, it only applies when you are spinning agents (the default helm is for the server). So maybe we need another one? or an option?
Any chance you can open a GitHub issue so we do not forget this feature ?
Hi JealousParrot68
This is the same as:
https://clearml.slack.com/archives/CTK20V944/p1627819701055200
and,
https://github.com/allegroai/clearml/issues/411
There is something odd happening in the files-server as it replaces the header (i.e. guessing the content of the stream) and this breaks the download (what happens is the clients automatically ungzip the csv).
We are working on a hotfix to the issue (BTW: if you are using object-storage / shared folders, this will not happen)