Plan is to have it out in the next couple of weeks.
Together with a major update in v0.16
They all "inherit" the same user / environment from one another
GrievingTurkey78, in your clearml.conf do you have the following?
agent.package_manager.type: conda
Or
https://github.com/allegroai/clearml-agent/blob/73625bf00fc7b4506554c1df9abd393b49b2a8ed/docs/clearml.conf#L59
Hi GrittyKangaroo27
Maybe check the TriggerScheduler, and have a function trigger something on k8s every time you "publish" a model?
https://github.com/allegroai/clearml/blob/master/examples/scheduler/trigger_example.py
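A minimal sketch of that idea, loosely following the linked trigger example. The callback body, the trigger name, and the "examples" project are placeholders (assumptions, not part of the original answer), and the k8s side is left as a stub:

```python
def on_model_published(model_id):
    """Hypothetical callback: called with the id of the model that fired the trigger."""
    # Placeholder for the k8s side (e.g. rolling out a new deployment).
    print(f"model {model_id} was published, trigger the k8s action here")
    return model_id


def start_publish_trigger(project="examples"):
    """Poll the server and call on_model_published for every published model."""
    # Imported lazily so the callback above can be tried without a ClearML setup.
    from clearml.automation import TriggerScheduler

    trigger = TriggerScheduler(pooling_frequency_minutes=3.0)
    trigger.add_model_trigger(
        name="deploy on publish",           # assumed trigger name
        schedule_function=on_model_published,
        trigger_project=project,            # assumed project name
        trigger_on_publish=True,
    )
    # Blocks and keeps polling; start_remotely() could be used to run it on an agent.
    trigger.start()
```

Calling start_publish_trigger() on a machine with a configured clearml.conf should start the polling loop.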
Or can I enable agent in this kind of local mode?
You just built a local agent
I'm not sure this is configurable from the outside
cannot schedule new futures after interpreter shutdown
This implies the process is shutting down.
Where are you uploading the model? What clearml version are you using? Can you check with the latest version (1.10)?
SoreDragonfly16 could you test with Task.init using reuse_last_task_id=False, for example:
task = Task.init('project', 'experiment', reuse_last_task_id=False)
The only thing I can think of is two experiments with the same project/name running on the same machine. Passing reuse_last_task_id=False ensures that every time you run the code, you create a new experiment.
Well done man!
Hi GentleSwallow91
I think this would be a good start:
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
wdyt?
- In a notebook, create a method and decorate it with fastai.script's @call_parse.
Any chance you have a very simple code/notebook to reference (this will really help in fixing the issue)?
The way ClearML thinks about it is the execution graph would be something like:
script_1 -> script_2 -> script_3 ->
Where each script would have in/out, so that you can trace the usage.
Trying to combine the two into a single "execution" graph might not represent the orchestration process.
That said visualizing them could be done.
I mean in theory there is no reason why we couldn't add those "datasets" as other types of building blocks, for visualization purposes only
(Of course this would o...
Would it also be possible to query based on multiple user properties
multiple key/value I think are currently not that easy to query,
but multiple tags are quite easy to do
tags=["__$all", "tag1", "tag2"],
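For reference, a small sketch of the multiple-tags query. The helper names and the project/tag values are made up for illustration; the "__$all" prefix is the one shown above, passed through Task.get_tasks:

```python
def all_tags_filter(*tags):
    """Build a tags filter that only matches tasks carrying every given tag."""
    return ["__$all", *tags]


def find_tasks_with_all_tags(project_name, *tags):
    """Return the tasks in project_name that have all of the given tags."""
    # Imported lazily so the filter helper works without a ClearML setup.
    from clearml import Task

    return Task.get_tasks(project_name=project_name, tags=all_tags_filter(*tags))
```

Something like find_tasks_with_all_tags("my project", "tag1", "tag2") would then return the matching tasks, assuming a configured clearml.conf.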
Pseudo-ish code:
1. create pipeline
pipeline = Task.create(..., task_type="controller")
pipeline.mark_started()
print(pipeline.id)
2. launch step A (pass arguments via command line argument / os environment)
` task = Task.init(...)
pipeline_id = os.environ['MY_MAIN_PIPELINE']
pipeline_task = Task.get_task(task_id=pipeline_id)
# send some metrics / reports etc.
pipeline_task.get_logger().report_scalar(...)
pipeline_task.get_logger().report_text(...) `
wdyt? (obviously you need to somehow pass th...
Could you give an example of such configurations ?
(e.g. what would be diff from one to another)
What is the recommended way of providing S3 credentials to cleanup task?
clearml.conf or OS environment (AWS_ACCESS_KEY_ID ...)
task.update({'script': {'version_num': 'my_new_commit_id'}})
This will update to a specific commit id, you can pass empty string '' to make the agent pull the latest from the branch
could you try this one:
frameworks = { 'tensorboard': True, 'pytorch': False }
This would log the TB (in the background), but no model registration (i.e. serial)
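A minimal sketch of where that dictionary would go, assuming it is meant for the auto_connect_frameworks argument of Task.init (the project and experiment names are placeholders):

```python
# Keep TensorBoard logging on, turn off automatic PyTorch model registration.
FRAMEWORKS = {"tensorboard": True, "pytorch": False}


def init_task(project="project", name="experiment"):
    """Create the task with the selective framework binding above."""
    # Imported lazily so the dictionary can be inspected without a ClearML setup.
    from clearml import Task

    return Task.init(
        project_name=project,
        task_name=name,
        auto_connect_frameworks=FRAMEWORKS,
    )
```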
The pod has an annotation with a AWS role which has write access to the s3 bucket.
So assuming the boto environment variables are configured to use the IAM role, it should be transparent, no? (I can't remember what the exact envs are, but google will probably solve it)
AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_SESSION_TOKEN. I was expecting clearml to pick them by default from the environment.
Yes it should, the OS env will always override the configuration file sect...
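A quick way to check what the process actually sees is to list which of those three variables are missing. This is only a debugging sketch with a made-up helper name, not something clearml provides:

```python
import os

# The three variables named in the question above.
AWS_ENV_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")


def missing_aws_env(env=None):
    """Return the names of the AWS variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in AWS_ENV_VARS if not env.get(name)]


print("missing AWS variables:", missing_aws_env())
```

Running it inside the pod (or at the top of the cleanup task) shows whether the credentials reach the Python process at all.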
Hi PungentRobin32
1732496915556 lab03:gpuall DEBUG docker: invalid reference format.
So it seems like the docker command is incorrect?! The error you are seeing is the agent failing to spin up the docker container. What's the OS of the host machine?
agent.cuda_driver_version = ...
agent.cuda_runtime_version = ...
Interesting idea! (I assume for reporting only, not configuration)
... The agent mentioned used output from nvcc (2) ...
The dependencies I shared are not how the agent works, but how Nvidia CUDA works
Regarding the cuda check with nvcc, I'm not saying this is a perfect solution, I just mentioned that this is how this is currently done.
I'm actually not sure if there is an easy way to get it from nvid...
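To illustrate what reading the version from nvcc can look like, here is a rough sketch. It is not the agent's actual code: the function names are made up, and it only assumes that nvcc --version prints a "release X.Y" string:

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract (major, minor) from nvcc --version output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def detect_cuda_version():
    """Run nvcc if it is on PATH and parse its reported release."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None  # nvcc not installed, nothing to report
    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)
```

On a machine without nvcc on the PATH, detect_cuda_version() simply returns None, which is one reason this check is not a perfect solution.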
See the log:
Collecting keras-contrib==2.0.8
File was already downloaded c:\users\mateus.ca\.clearml\pip-download-cache\cu0\keras_contrib-2.0.8-py3-none-any.whl
so it did download it, but it failed to pass it correctly ?!
Can you try with clearml-agent==1.5.3rc2 ?
Woo, what a doozy.
yeah those "broken" pip versions are making our life hard ...
Hmmm why don't you use "series" ?
(Notice that with iterations, there is a limit to the number of images stored per title/series, which is configurable in trains.conf, in order to avoid debug sample explosion)