So just after the clone, before creating the env?
Hi FloppyDeer99,
In other words, the docs say that ClearML Open Source supports orchestration; where can I find the related code?
You can find many examples at https://clear.ml/docs/latest/docs/getting_started/mlops/mlops_first_steps/ , and if you have a specific use case you want to check, please share it and I can send an example.
And what is the role of clearml-agent in orchestration? A combination of kube-scheduler and kubelet?
ClearML agent is an ML-Ops tool for users to r...
Hi TrickySheep9,
ClearML does analyze your packages, but you can always add any package you like with Task.add_requirements('xlrd', '')
or, if it's a package you want the ClearML agent to always install (not per task), you can add it to the agent's configuration file: https://github.com/allegroai/clearml-agent/blob/master/docs/clearml.conf#L82
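For example, a minimal sketch of the first option (project and task names are placeholders):
```python
from clearml import Task

# make sure the agent installs xlrd even though the code never imports it directly;
# add_requirements should be called before Task.init()
Task.add_requirements('xlrd', '')
# placeholder project/task names
task = Task.init(project_name='examples', task_name='extra requirement')
```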
Can this do the trick?
Hi WackyRabbit7
When calling Task.init(), you can provide the output_uri parameter. This allows you to specify the location in which model snapshots will be stored.
Allegro-Trains supports shared folders, S3 buckets, Google Cloud Storage and Azure Storage.
For example (with S3):
Task.init(project_name="My project", task_name="S3 storage", output_uri="s3://bucket/folder")
You will need to add storage credentials in your ~/trains.conf file (you will need to add your aws in thi...
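For reference, a minimal sketch of the aws credentials section (assuming the standard trains.conf layout; key, secret and region values are placeholders):
```
sdk {
    aws {
        s3 {
            # default credentials used for all buckets (placeholder values)
            key: "YOUR_ACCESS_KEY"
            secret: "YOUR_SECRET_KEY"
            region: "us-east-1"
        }
    }
}
```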
Hi BrightElephant64, can you add an example? Also, the ClearML AWS autoscaler knows how to work with ClearML-agent queues.
👍 let me try to reproduce with it. Can you share the change you made in the docker-compose file?
Hi PompousHawk82,
Can you try this - https://clearml.slack.com/archives/CTK20V944/p1582334614043800?thread_ts=1582240539.039600&cid=CTK20V944 ?
What's the clearml version you are using?
Hi VexedCat68,
the scheduler print
Not sure what you mean here, can you add an example?
Let me check if I can think of something else (I know the enterprise edition has full support for such a thing, and for unstructured data too).
BTW, ClearML always uses a cache, so the big download is done only once.
👍
This is a message about configuration sync.
It allows you to change the scheduler at runtime by editing the Task configuration object.
Hi PompousHawk82, sorry for the delay, I missed the last message. Can you try having the spawned process call task = Task.get_task(task_id=<Your main task Id>) instead of the Task.init() call?
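A minimal sketch of what that could look like (project and task names are placeholders):
```python
from multiprocessing import Process
from clearml import Task

def worker(main_task_id):
    # attach to the existing main task instead of creating a new one
    task = Task.get_task(task_id=main_task_id)
    task.get_logger().report_text('reporting from the spawned process')

if __name__ == '__main__':
    # placeholder project/task names
    main_task = Task.init(project_name='examples', task_name='spawn demo')
    p = Process(target=worker, args=(main_task.id,))
    p.start()
    p.join()
```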
Hi BitterLeopard33,
You want to have the data section in the dataset task URI?
And the package versions don't match the Python version? Can you install Python on the system (not as a venv)?
Hi PompousParrot44, you mean delete the experiment?
Trying to understand what reset the task.
Hi ColossalDeer61,
Not from the UI, but you can run a simple script to do that (assuming you can parse your configuration file); here is an example:
```python
from trains import Task

configuration_file = {
    "stage_1": {
        "batch_size": 32,
        "epochs": 10,
    },
    "stage_2": {
        "batch_size": 64,
        "epochs": 20,
    },
}

template_task = Task.get_task(task_id=<YOUR TEMPLATE TASK>)
for name, params in configuration_file.items():
    # clone the template task into a new draft task
    # (the loop body below is a sketch completing the truncated example)
    cloned_task = Task.clone(source_task=template_task, name=name)
    # override the clone's parameters with this stage's values
    cloned_task.set_parameters(params)
    # queue name "default" is just an example
    Task.enqueue(cloned_task, queue_name="default")
```
and it should be the default for docker mode
Hi CleanPigeon16.
Do you get anything in the UI regarding this failure (in the RESULTS -> CONSOLE section)?
Hi ObliviousCrocodile95,
Trying to understand: you want to run the clearml-agent in docker mode with a pre-installed virtual environment?
Currently this setup means that I clone my repository in the docker image - so the commit & changes aren’t reflected in this environment. Any way to remedy this?
The clearml-agent should clone your repository and apply the changes from the parent task.
The PipelineController task? If so, you can get the task with pipeline_task = Task.get_task(task_id=<your pipeline task id>) and then call pipeline_task.get_output_destination(). Can this do the trick?
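Put together, a minimal sketch (the task id is a placeholder):
```python
from clearml import Task

# '<your pipeline task id>' is a placeholder
pipeline_task = Task.get_task(task_id='<your pipeline task id>')
print(pipeline_task.get_output_destination())
```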
Can you try installing the package on the docker's Python, but not in the venv?
Hi JitteryCoyote63,
You can get some stats (for the last month) under the workers section in your app; clicking a specific worker will give you some more options.
Those don't include stats per training, only per worker.
PanickyMoth78 are you getting this from the app or one of the tasks?
The state folder is not affected.
Is this the /mnt/machine_learning/datasets folder?
Hi MinuteWalrus85,
Looks like port 8081 is in use, can you check it? docker ps will list the running docker containers (with their ports).
Hi VexedCat68
You can use argparse, and all the parameters will be logged automagically to the hyperparameters section, as in the https://github.com/allegroai/clearml/blob/master/examples/frameworks/keras/keras_tensorboard.py#L56 example, or just connect any dict, as in the https://github.com/allegroai/clearml/blob/master/examples/frameworks/ignite/cifar_ignite.py#L23 example.
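A minimal sketch of both options (project/task names and parameter values are placeholders):
```python
import argparse
from clearml import Task

# placeholder project/task names
task = Task.init(project_name='examples', task_name='hyperparameter logging')

# option 1: argparse arguments are logged automatically once Task.init() has run
parser = argparse.ArgumentParser()
parser.add_argument('--batch_size', type=int, default=32)
parser.add_argument('--epochs', type=int, default=10)
args = parser.parse_args()

# option 2: explicitly connect any dict
config = {'lr': 0.001, 'optimizer': 'adam'}
task.connect(config)
```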
Hi KindBlackbird59
You can always clone the first task and change the parameters in the second one. Is this what you are looking for?
Hi JitteryCoyote63,
I tried to reproduce this issue but it passed for me. Which versions are you running (trains, trains-server and trains-agent)?