Hi @<1544853695869489152:profile|NonchalantOx99>
I would assume the clearml-server configuration / access key is misconfigured in your copy of example.env
Hi @<1544128915683938304:profile|DepravedBee6>
You mean like backup the entire instance and restore it on another machine? Or are you referring to specific data you want to migrate?
BTW, if you are upgrading from an old version of the server, I would recommend upgrading through every intermediate version (a few of them include migration scripts that need to run)
Should work, follow the backup process, and restore into a new machine:
None
Hmm I think this is not doable ...
(the underlying data is stored in DBs and changing it is not really possible without messing about with the DB)
Hi @<1542316991337992192:profile|AverageMoth57>
Is this a follow-up of this thread? None
So you have two options
- Build the container from your Dockerfile and push it to your container registry. Notice that if you built it on the machine running the agent, that machine can use it as the Task's base container.
- Use the FROM container as the Task's base container and put the rest in the docker startup bash script (see the sketch below). Wdyt?
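If you go with the second option, here is a minimal sketch of wiring a startup script from code, assuming your clearml version's Task.set_base_docker supports the docker_setup_bash_script argument; the image name and packages are placeholders:
```py
from clearml import Task

task = Task.init(project_name="examples", task_name="base container + startup script")

# Assumed keyword names (docker_image / docker_setup_bash_script) - verify
# against your clearml version. The script lines run inside the container
# before the Task itself starts executing.
task.set_base_docker(
    docker_image="python:3.10-bullseye",  # placeholder base image
    docker_setup_bash_script=[
        "apt-get update",
        "apt-get install -y build-essential",  # placeholder packages
    ],
)
```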
restart_period_sec
I'm assuming development.worker.report_period_sec, correct?
The configuration does not seem to have any effect, scalars appear in the web UI in close to real time.
Let me see if we can reproduce this behavior and quickly fix it
Hi DisgustedDove53
Is redis used as permanent data storage or just cache?
Mostly cache (I think)
Would there be any problems if it is restarted and comes up clean?
Pretty sure it should be fine, why do you ask?
I suppose the same would need to be done for any client PC running clearml such that you are submitting dataset upload jobs?
Correct
That is, the dataset is perhaps local to my laptop, or on a development VM that is not in the clearml system, but from there I want to submit a copy of a dataset, then I would need to configure the storage section in the same way as well?
Correct
What's the python, torch, clearml version?
Any chance this is reproducible?
What's the full error trace/stack you are getting?
Can you try to debug it to see where exactly it fails here?
https://github.com/allegroai/clearml/blob/86586fbf35d6bdfbf96b6ee3e0068eac3e6c0979/clearml/binding/import_bind.py#L48
RoughTiger69 wdyt?
Will the new fix avoid this issue and does it still require the incremental flag?
It will avoid the issue, meaning even when incremental is not specified, it will work.
That said, the issue is that any other logger will be cleared as well, so it's just good practice ...
From the logging documentation ...
Hmmm, so I guess Kedro should not use dictConfig?! I'm not sure about the exact use case, but just clearing all loggers seems like a harsh approach.
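For context, a minimal sketch of the behavior being discussed: by default, logging.config.dictConfig disables (effectively clears) every logger that already exists, unless disable_existing_loggers is explicitly set to False. The logger names here are just illustrative:
```py
import logging
import logging.config

# a pre-existing logger, e.g. one set up by another library
other = logging.getLogger("some.other.library")
other.addHandler(logging.StreamHandler())

logging.config.dictConfig({
    "version": 1,
    # the default is True: every already-existing logger gets disabled,
    # which is the "clearing all loggers" behavior mentioned above
    "disable_existing_loggers": False,
    "handlers": {"console": {"class": "logging.StreamHandler"}},
    "root": {"handlers": ["console"], "level": "INFO"},
})

other.warning("still visible because disable_existing_loggers=False")
```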
Hi @<1628927672681762816:profile|GreasyKitten62>
Notice that in the GitHub Actions example this specific Task is executed on the GitHub backend; the Task it creates is executed on the clearml-agent.
So basically:
Action -> Git worker -> task_stats_to_comment.py -> Task Pushed to Queue -> Clearml-Agent -> Task execution is here
Does that make sense?
Thanks JitteryCoyote63 !
Any chance you want to open a GitHub issue with the exact details, or a fix with a PR?
(I just want to make sure we fix it as soon as we can)
Hi DangerousDragonfly8
Is it possible to somehow extract the information about the experiment/task whose status has changed?
From the docstring of add_task_trigger
```py
def schedule_function(task_id):
    pass
```
This means you are getting the Task ID that caused the trigger; now you can get all the info that you need with Task.get_task(task_id)
def schedule_function(task_id):
    the_task = Task.get_task(task_id)
    # now we have all the info on the Task tha...
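To put it together, a hedged sketch of wiring such a callback into clearml's TriggerScheduler; the exact keyword arguments (pooling_frequency_minutes, trigger_on_status, etc.) may differ between versions, so treat them as assumptions and check your version's docstrings:
```py
from clearml import Task
from clearml.automation import TriggerScheduler

def schedule_function(task_id):
    # called with the ID of the Task that fired the trigger
    the_task = Task.get_task(task_id)
    print(the_task.name, the_task.status)

# assumed constructor/argument names - verify against your clearml version
scheduler = TriggerScheduler(pooling_frequency_minutes=2.0)
scheduler.add_task_trigger(
    schedule_function=schedule_function,
    trigger_on_status=["completed", "failed"],
)
scheduler.start()
```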
Would this be best if it were executed in the Triton execution environment?
It seems the issue is unrelated to Triton ...
Could I use the clearml-agent build command and the Triton serving engine task ID to create a docker container that I could then use interactively to run these tests?
Yep, that should do it
I would start simple, no need to build the docker itself; it seems like a clearml credentials issue?!
Hi RotundHedgehog76
Notice that the "queued" is on the state of the Task, as well as the tag.
We tried to enqueue the stopped task at the particular queue and we added the particular tag
What do you mean by specific queue? This will trigger on any queued Task with the 'particular-tag'?
Great ASCII tree
GrittyKangaroo27 assuming you are doing:
@PipelineDecorator.component(..., repo='.')
def my_component(): ...
The function my_component will be running in the repository root, so in theory it could access the packages 1/2
(I'm assuming here that directory "project" is the repository root)
Does that make sense ?
BTW: when you pass repo='.' to @PipelineDecorator.component it takes the current repository that exists on the local machine running the pipel...
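A short sketch of the layout being described; the package name my_local_package is hypothetical and stands for whatever lives next to the pipeline script in the repository root:
```py
# pipeline.py, located at the repository root
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=["result"], repo=".")
def my_component(x):
    # runs with the repository root as the working directory,
    # so sibling packages in the repo should be importable
    from my_local_package import helpers  # hypothetical local package
    return helpers.process(x)

@PipelineDecorator.pipeline(name="example pipeline", project="examples", version="1.0")
def run_pipeline():
    print(my_component(42))

if __name__ == "__main__":
    PipelineDecorator.run_locally()
    run_pipeline()
```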
BTW: why use the CLI? The idea of clearml is that it becomes part of the code, even during development. This means adding "Task.init(...)" at the beginning of the code, which creates the Tasks and logs them as part of the development process. Executing them later is essentially cloning and enqueuing in the UI. Of course you can also automate it directly as part of the code.
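For the "automate it directly as part of the code" part, a minimal sketch using Task.clone and Task.enqueue; the project, task, and queue names are placeholders:
```py
from clearml import Task

# assume "my experiment" was already registered by a script
# that calls Task.init(project_name="examples", task_name="my experiment")
template = Task.get_task(project_name="examples", task_name="my experiment")

# clone it and push the clone to an execution queue for a clearml-agent
cloned = Task.clone(source_task=template, name="my experiment (clone)")
Task.enqueue(cloned, queue_name="default")  # queue name is a placeholder
```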
UnevenDolphin73
fatal: could not read Username for '': terminal prompts disabled .. fatal: clone of '' into submodule path '/root/.clearml/vcs-cache/xxx.60db3666b11ac2df511a851e269817ef/xxx/xxx' failed
It seems it tries to clone a submodule and fails due to missing keys for the submodule.
https://stackoverflow.com/questions/7714326/git-submodule-url-not-including-username
wdyt?
Hi @<1547028031053238272:profile|MassiveGoldfish6>
Is there a way for ClearML to simply save the model once training is done and to ignore the model checkpoints?
Yes, you can simply disable the automatic logging of the model and manually save the checkpoint:
task = Task.init(..., auto_connect_frameworks={'pytorch': False})
...
task.update_output_model("/my/model.pt", ...)
Or for example, just "white-label" the final model
task = Task.init(..., auto_connect_frameworks={'pyt...
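A fuller, hedged sketch of the "white-label" approach using OutputModel to register only the final weights; the project/task names and file path are placeholders:
```py
from clearml import Task, OutputModel

# disable automatic pytorch checkpoint logging (placeholder project/task names)
task = Task.init(
    project_name="examples",
    task_name="final model only",
    auto_connect_frameworks={"pytorch": False},
)

# ... training loop, intermediate checkpoints are NOT uploaded ...

# register only the final model with the task (placeholder file path)
final_model = OutputModel(task=task, name="final model")
final_model.update_weights(weights_filename="/my/model.pt")
```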
Do you have a roadmap which includes resolving things like this?
Security, SSO, etc. are usually out of scope for the open-source platform, as they really make the entire thing a lot harder to install and manage. That said, I know that the Enterprise solution does have SSO and LDAP support and probably many more security features. I hope it helps
Oh
Task.get_project_object().default_output_destination = None
This has no effect on the backend, meaning it does not actually change the value.
from clearml.backend_api.session.client import APIClient
c = APIClient()
c.projects.update(project="<project_id_here>", default_output_destination="s3://")
btw: how/what is it used for in your workflow?
JitteryCoyote63 how can I reproduce it? (obviously when I tested it was okay)
Happy new year @<1618780810947596288:profile|ExuberantLion50>
- Is this the right place to mention such bugs?
Definitely the right place to discuss them; usually, if verified, we ask you to also add them in GitHub for easier traceability / visibility.
m (i.e. there's two plots shown side-by-side but they're actually both just the first experiment that was selected). This is happening across all experiments, all my workspaces, and all the browsers I've tried.
Can you share a screenshot? is this r...
HealthyStarfish45 if I understand correctly, the trains-agent is running as a daemon (i.e. automatically pulling jobs and executing them); the only point is that cancelling a daemon will cause the Task executed by that daemon to be cancelled as well.
Other than that, sounds great!
@<1538330703932952576:profile|ThickSeaurchin47> can you try the artifacts example:
None
and in this line do:
task = Task.init(project_name='examples', task_name='Artifacts example', output_uri="")
HelplessCrocodile8
Basically the file URI might be different on a different machine (out of my control) but they point to the same artifact storage location
We might have thought of that...
in your clearml.conf file:
sdk {
    storage {
        path_substitution = [
            # Replace registered links with local prefixes,
            # Solve mapping issues, and allow for external resource caching.
            {
                registered_prefix = file:///mnt/data/...
and I install the tar
I think the only way to do that is to add it to the docker bash setup script (this is a bash script executed before the Task starts)