What exactly do you mean by docker run permissions?
Hi LazyLeopard18,
See details below, are you using the win10 docker-compose yaml?
https://github.com/allegroai/trains-server/blob/master/docs/install_win.md
I want to be able to delete only the logs since they are taking a lot of space in my case.
I see... I do not think this is possible 🙂
You can disable the auto logging though ... pass auto_connect_streams=False to Task.init
It was installed by 'pip install kwcoco' while my conda env was active.
Well, I guess my question is: how does conda know where to install it from, if it is not on the public channels? Is there a specific conda channel you added (or preconfigured)?
ThickDove42 sorry, it took some time 🙂

import json
from trains.backend_api.session.client import APIClient

client = APIClient()
events = client.events.get_task_plots(task='task_id_here')
table = json.loads(events.plots[0]['plot_str'])
print('column order', table['data'][0]['cells']['values'])

Not the most comfortable way, but at least it is there.
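For reference, the structure that snippet indexes into can be illustrated with a mock payload (a real 'plot_str' comes back from client.events.get_task_plots; the values below are purely illustrative):

```python
import json

# Hypothetical plot payload mirroring the structure the snippet above relies on;
# in real use, plot_str is events.plots[0]['plot_str'] from the API response.
plot_str = json.dumps({
    "data": [{"cells": {"values": [["a1", "a2"], ["b1", "b2"]]}}]
})
table = json.loads(plot_str)
print("column order", table["data"][0]["cells"]["values"])
```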
ReassuredTiger98
(for some reason it kind of jumps over PyTorch, but then installs torchvision?!)
Could you run with the latest with --debug
We just added it, but you will have to install from git:

pip3 install git+

Then run with --debug:

clearml-agent --debug daemon ...
You should have a download button when you hover over the table, I guess that would be the easiest.
If needed I can send an SDK code but unfortunately there is no single call for that
I still do not get why this leads to some 0.5 values when in my plot there should only be 0 and 1.
Smart sub-sampling (lowpass filter before, aka averaging on a window)
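Averaging on a window (the lowpass-then-subsample idea above) can be sketched in plain Python; it also shows why a 0/1 series ends up with values like 0.5 after sub-sampling:

```python
def subsample_mean(values, window):
    """Downsample by averaging consecutive windows (a crude lowpass filter).

    Averaging a 0/1 series is exactly how fractional values like 0.5
    appear in a sub-sampled plot."""
    return [
        sum(values[i:i + window]) / len(values[i:i + window])
        for i in range(0, len(values), window)
    ]

print(subsample_mean([0, 1, 0, 1, 1, 1], 2))  # -> [0.5, 0.5, 1.0]
```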
Thanks EnviousStarfish54 !
yes, or (because I deployed clearml using helm in kubernetes) from the same machine, but multiple pods (tasks).
Oh now I see. Long story short, no 🙂 The correct way of doing that is every node/pod creates its own dataset,
then when you are done, you create a new version with the X datasets that you created as parents, the newly created version is just "meta" it basically tells the system how to combine the previously generated datasets (i.e. no data is actually re-uploa...
Should have worked, the error you are getting is docker-compose parsing the yml file
Is this exactly the one from the trains-server repo ?
Hi SparklingHedgehong28
What would be the use for "end of docker hook" ? is this like an abort callback? completion ?
instance protection
Do you mean like when the instance just died (like spot in AWS)?
It can be a different agent.
If inside a docker then:

clearml-agent execute --id <task_id here> --docker

If you need venv do:

clearml-agent execute --id <task_id here>

You can run that on any machine and it will respin and continue your Task.
(obviously your code needs to be aware of that and be able to pull its own last model checkpoint from the Task artifacts / models)
Is this what you are after?
Hmm SuccessfulKoala55 any chance the nginx http was pushed to v1.1 on the latest cloud helm chart?
Notice: dataset_rgb.list_files() will list the content of the dataset, not the local files:
e.g. /folder/myfile.ext and not /home/user/cache/folder/myfile.ext
So basically I think you are just not passing actual files; you should probably do:

for local_file in Path(folder_rgb).rglob('*'):
    ...
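One caveat worth sketching: rglob('*') also yields directories, so filter with is_file() before handing the paths on (the temp folder and file name here are illustrative only):

```python
from pathlib import Path
import tempfile

# Build a tiny throwaway tree: a sub-folder containing one file.
tmp = Path(tempfile.mkdtemp())
(tmp / "sub").mkdir()
(tmp / "sub" / "image.rgb").write_text("data")

# rglob('*') returns files AND directories; keep only real files.
files = [p for p in tmp.rglob("*") if p.is_file()]
print([p.name for p in files])  # -> ['image.rgb']
```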
after generating a fresh set of keys
when you have a new set, copy-paste them directly into the 'clearml.conf' (should be at the top, can't miss it)
It's just the print (__repr__) not showing the data:

for w in client.workers.get_all():
    print(w.data)
Hi @<1570583227918192640:profile|FloppySwallow46>
Hey, I have a question: can you monitor the time for one pipeline?
you mean to see the start / end time of the pipeline?
Click on the details link on the right hand side and you will have all the details on the pipeline task, including running time
This will set more time before the timeout right?
Correct.
task.freeze_monitor()
download()
task.defrost_monitor()
Currently there isn't, but that's a good idea.
What would be the argument of using it vs increasing the timeout ?
btw: setting the resource timeout to 99999 will basically mean that it will wait until the first reported iteration, not that it will just sleep for 99999 sec 🙂
Hi BattyLion34
script_a.py generates file test.json in the project folder
So let's assume "script_a" generates something and puts it under /tmp/my_data
Then it can create a dataset from the folder /tmp/my_data, with Dataset.create() -> Dataset.sync -> Dataset.upload -> Dataset.finalize
See example: https://github.com/alguchg/clearml-demo/blob/main/process_dataset.py
Then "script_b" can get a copy of the dataset using "Dataset.get()", see examp...
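The "script_a" side can be sketched roughly as below. This is a hedged sketch, not the canonical implementation: the project/dataset names are illustrative, it assumes clearml is installed and configured, and the dataset_cls parameter is only there so the flow can be exercised without a live server (in real use you would just let it default to clearml.Dataset):

```python
def publish_folder_as_dataset(folder, dataset_cls=None,
                              project="examples", name="my_dataset"):
    """Create -> sync -> upload -> finalize, as described above."""
    if dataset_cls is None:
        from clearml import Dataset  # real dependency, assumed configured
        dataset_cls = Dataset
    ds = dataset_cls.create(dataset_project=project, dataset_name=name)
    ds.sync_folder(local_path=folder)  # stage files from e.g. /tmp/my_data
    ds.upload()                        # push the file content to storage
    ds.finalize()                      # close the version so "script_b" can Dataset.get() it
    return ds
```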
It seems like the naming Task.create causes a lot of confusion (we are always open to suggestions and improvements). ReassuredTiger98, from your suggestion it sounds like you would actually like more control in Task.init (let's leave Task.create aside, as its main function is not to log the currently running code, but to create an auxiliary Task).
Did I understand you correctly ?
the other repos i have are constantly worked on and changing too
Not only will it be cloned automatically, the git diff of the sub-modules is stored as well 🙂
WickedGoat98 Notice this is not the "clearml-agent-services" docker but "clearml-agent" docker image
Also the default docker image is "nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04"
Other than that quite similar :)
however, I don't think it's our code, since the trigger is not triggered at all, unless a new task is created :((
Yeah I think you are correct, I'm more interested in understanding how you use it ...
BTW can you test with the latest clearml python version (the trigger code is the important part)?
@<1720249421582569472:profile|NonchalantSeaanemone34>
dso = Dataset.create(
dataset_project= project_name,
dataset_name= dataset_name,
parent_datasets=[parent_datasets_id],
)
dso = Dataset.get(
dataset_project= project_name,
dataset_name= dataset_name,
only_completed=True,
only_published=False,
alias='latest',
)
Why are you creating a dataset and then getting a dataset on the same object?
It seems you are trying to upload...
This means that in your "Installed packages" you should see the line:
Notice that this is not a pypi artifactory (i.e. a server to add to the extra index url for pip), this is a direct pip install from a git repository, hence it should be listed in the "installed packages".
If this is the way the package was installed locally, you should have had this line in the installed packages.
The clearml agent should take care of the authentication for you (specifically here, it should do nothing).
If ...
K8s can schedule pod with different priorities.
I'm not sure I agree here, could you refer me to the docs on this ability in k8s ?
So maybe "no real scheduling" means there is no ClearML scheduling after applying the pod to k8s.
That is correct 🙂
Will this be implemented in the future?
Yes, but this is an enterprise feature; in the community version you can specify a --max-pods limit (which will cause it to never pull a job once it hits the max-pod limit)