I'll try to work on something that works on 1.7.2
I had a misconception that the conf comes from the machine triggering the pipeline.
But anyway, this will still not work, because fastai's TensorBoardCallback doesn't work in multi-GPU 😞
Using api.files_server? Not default_output?
My use case is developing the code; I don't want to spam the UI.
Don't have one ATM.
Also, I don't need to change it during execution; I want it for a specific run.
TimelyMouse69
Thanks for the reply. This is only regarding automatic logging, where I want to disable logging altogether (avoiding the task being added to the UI).
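For my dev runs, something like offline mode is what I have in mind — a minimal sketch, assuming Task.set_offline is the right knob for this, with placeholder project/task names:

from clearml import Task

# sketch: keep dev runs out of the UI by switching to offline mode before Task.init
Task.set_offline(True)
task = Task.init(project_name="dev", task_name="local-debug")  # placeholder names
# ... run the code as usual; nothing gets sent to the server ...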
You can get updates on the issue I opened:
https://github.com/fastai/fastai/issues/3543
But I think the better solution would probably be to create a custom ClearML callback for fastai, with whatever best practices you think are needed…
Or try to fix the TensorBoardCallback, because for now we can't use multi-GPU because of it 😪
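Something in this direction is what I mean by a custom callback — a very rough sketch, assuming fastai's Callback API and ClearML's Logger.report_scalar; the class name and the single reported metric are just placeholders:

from fastai.callback.core import Callback
from clearml import Logger

class ClearMLScalarCallback(Callback):
    # rough sketch: report the last recorded training loss to ClearML once per epoch
    def after_epoch(self):
        logger = Logger.current_logger()
        if logger is None or not len(self.recorder.losses):
            return
        logger.report_scalar(
            title="train", series="loss",
            value=float(self.recorder.losses[-1]),
            iteration=self.epoch,
        )

and then pass it in with learn.fit(..., cbs=ClearMLScalarCallback()).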
It's with decorators.
Interesting, I wasn't aware of this Python module for executing accelerate. I'll try to use that.
We used subprocess for it, but for some reason, only when invoked in the pipeline, the process freezes and doesn't close the main accelerate process. It works fine outside of ClearML. Any idea?
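For context, the invocation is roughly this (a simplified sketch; "train.py" is a placeholder for our entry point):

import subprocess

# simplified sketch of the call we make inside the pipeline component;
# under the pipeline this never returns for us
subprocess.run(["accelerate", "launch", "train.py"], check=True)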
I'm following this guide:
https://docs.fast.ai/distributed.html#Learner.distrib_ctx
so you run it like this: python -m fastai.launch <script>
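i.e. roughly the pattern from that guide (a paraphrased sketch; the pets dataset/model below are just the docs-style example, not our actual code):

# train.py -- run with: python -m fastai.launch train.py
from fastai.vision.all import *
from fastai.distributed import *

path = untar_data(URLs.PETS)
dls = ImageDataLoaders.from_name_re(
    path, get_image_files(path/"images"), pat=r'(.+)_\d+.jpg', item_tfms=Resize(224))
learn = vision_learner(dls, resnet34, metrics=error_rate)

# distrib_ctx sets up and tears down the distributed training context per GPU process
with learn.distrib_ctx():
    learn.fine_tune(1)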
Nothing that I think is relevant; I'm using the latest from master. It might be a new bug on their side, I wasn't sure.
Not sure about this; we really like being in control of reproducibility and not depending on the invoking machine… maybe that's not what you intend.
BTW, the code above is from the ClearML GitHub, so it's the latest.
ExasperatedCrab78 Sorry, only saw this now,
Thanks for checking it!
Glad to see you found the issue; hope you find a way to fix the second one. For now we will continue using the previous version.
Would be glad if you could post when everything is fixed so we can upgrade our version.
ExasperatedCrab78
Here is an example that reproduces the second error:
from clearml.automation import PipelineDecorator
from clearml import TaskTypes

@PipelineDecorator.component(task_type=TaskTypes.data_processing, cache=True)
def run_demo():
    from transformers import AutoTokenizer, DataCollatorForTokenClassification, AutoModelForSequenceClassification, TrainingArguments, Trainer
    from datasets import load_dataset
    import numpy as np
    import ...
Looks like the first issue has been solved 🙂
I think the second one still persists, still checking.
Thanks, I just can't use 1.7.1 because of the pipeline problem from before.
Regarding what AgitatedDove14 suggested, I'll try tomorrow and update.
AgitatedDove14 So it looks like it started to do something, but now it's missing parts of the configuration:
Missing key and secret for S3 storage access
(I'm using the boto credential chain, which is off by default…)
Why isn't the config being passed to the inner step properly?
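For reference, this is the relevant part of the clearml.conf I expect to be picked up (a sketch; the credential-chain flag is the only part I actually rely on, explicit key/secret left commented out):

sdk {
  aws {
    s3 {
      # off by default -- when true, boto3's credential chain is used
      # instead of an explicit key/secret
      use_credentials_chain: true
      # key: "..."
      # secret: "..."
    }
  }
}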
AgitatedDove14 So if I understand correctly, what I can possibly do is copy-paste the https://github.com/fastai/fastai/blob/master/fastai/launch.py code and add the Task.init there?
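i.e. something like this at the top of the copied launch script (just a sketch of the idea; project/task names are placeholders and the rest of fastai's launch.py stays unchanged):

# copy of fastai/launch.py, with ClearML initialized before the per-GPU
# worker processes are spawned (sketch of the idea, not a tested patch)
from clearml import Task

task = Task.init(project_name="my-project", task_name="fastai-distributed")  # placeholder names

# ... the rest of fastai's launch.py as-is: parse the CLI args, set the
# distributed env vars, and spawn one process per GPU running <script> ...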
Glad to hear you were able to reproduce it! Waiting for your reply 🙏
It's models, not datasets, in our case...
But we can also just tar the folder and return that... I was just hoping to avoid doing that.
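Something like this is the workaround I'd rather avoid (a sketch; the folder path and artifact name are placeholders):

import tarfile
from clearml import Task

def tar_and_upload_model(model_dir: str, name: str = "model"):
    # sketch of the workaround: archive the model folder and attach it
    # to the current task as an artifact
    archive = f"{name}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(model_dir, arcname=name)
    Task.current_task().upload_artifact(name=name, artifact_object=archive)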