this: from fastai.callbacks.tensorboard import LearnerTensorboardWriter
doesn’t exist anymore in fastai2
when you say use Task.current_task(), do you mean for logging? i’m guessing that’s something the fastai binding should do, right?
i believe this is because of transformer’s integration:
Automatic ClearML logging enabled.
ClearML Task has been initialized.
when a task already exists
hi, yes we tried with the same result
@ExasperatedCrab78
Here is an example that reproduces the second error
from clearml.automation import PipelineDecorator
from clearml import TaskTypes
@PipelineDecorator.component(task_type=TaskTypes.data_processing, cache=True)
def run_demo():
    from transformers import AutoTokenizer, DataCollatorForTokenClassification, AutoModelForSequenceClassification, TrainingArguments, Trainer
    from datasets import load_dataset
    import numpy as np
    import ...
This is the next step not being able to find the output of the last step
ValueError: Could not retrieve a local copy of artifact return_object, failed downloading
I'm working with the patch, and installing transformers from github
they also appear to be relying on the tensorboard callback which seems not to work on distributed training
reduced to a small snippet
from fastai.vision.all import *
from fastai.distributed import *
from clearml import Task
from fastai.callback.tensorboard import TensorBoardCallback
from wwf.vision.timm import timm_learner
task = Task.init(project_name='LIOR_TEST', auto_connect_arg_parser={'rank': False})
path = untar_data(URLs.PETS)
size = 460
batch_size = 32
dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                   get_items=get_image_files,
                   get_y=lambda x: ...
Note one difference: i’m using TensorBoardCallback, because i believe the clearml docs use an outdated fastai 1 version…
i’m following this guide
https://docs.fast.ai/distributed.html#Learner.distrib_ctx
so you run it like this: python -m fastai.launch <script>
Hi, yes it's running with autoscaler so it's for sure in docker mode
Are you saying that it should've worked? I got 'docker' attribute doesn't exist error. Maybe it's the version of the clearml server?
you can get updates on the issue i opened
https://github.com/fastai/fastai/issues/3543
but i think the probably better solution would be to create a custom ClearML callback for fastai with the best practices you think are needed…
Or try to fix the TensorBoardCallback, because for now we can’t use multi gpu because of it 😪
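A custom callback like the one suggested above would probably want to report metrics from a single process only. Here is a minimal sketch of that guard. It assumes the launcher exports a RANK environment variable (an assumption, check what your launcher actually sets), and the names get_rank, is_main_process and MainProcessOnly are made up for illustration. This is a common pattern for distributed logging, not a confirmed fix for the TensorBoardCallback issue.

```python
import os


def get_rank(env=None):
    """Return the process rank, assuming the launcher exports RANK (defaults to 0)."""
    env = os.environ if env is None else env
    try:
        return int(env.get("RANK", "0"))
    except ValueError:
        # Unparseable value: fall back to treating this as the main process
        return 0


def is_main_process(env=None):
    """True only for rank 0, i.e. the process that should do the logging."""
    return get_rank(env) == 0


class MainProcessOnly:
    """Wrap a logging function so that it is a no-op on non-zero ranks."""

    def __init__(self, log_fn, env=None):
        self.log_fn = log_fn
        self.enabled = is_main_process(env)

    def __call__(self, *args, **kwargs):
        if self.enabled:
            return self.log_fn(*args, **kwargs)
        return None
```

Inside a callback you would wrap whatever reporting call you use (for example the ClearML logger's scalar reporting) with MainProcessOnly, so that ranks other than 0 silently skip it.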
It's models not datasets in our case...
But we can also just tar the folder and return that... Was just hoping to avoid doing that
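For reference, the "tar the folder and return that" fallback could look roughly like this. It is a sketch using only the Python standard library; pack_model_dir and unpack_model_dir are hypothetical helper names, and it assumes the step returns the archive path and the next step extracts it.

```python
import tarfile
import tempfile
from pathlib import Path


def pack_model_dir(model_dir, archive_path=None):
    """Pack a model folder into a single .tar.gz file and return its path."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        raise NotADirectoryError(f"{model_dir} is not a directory")
    if archive_path is None:
        # Hypothetical default: put the archive in a fresh temp directory
        archive_path = Path(tempfile.mkdtemp()) / f"{model_dir.name}.tar.gz"
    archive_path = Path(archive_path)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps only the folder name, not the absolute path
        tar.add(model_dir, arcname=model_dir.name)
    return str(archive_path)


def unpack_model_dir(archive_path, dest_dir):
    """Extract the archive in the next step and return the restored folder path."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        # The first member is the top-level folder added by pack_model_dir
        top_level = tar.getnames()[0].split("/")[0]
        tar.extractall(dest_dir)
    return str(dest_dir / top_level)
```

The producing step would return pack_model_dir(...) and the consuming step would call unpack_model_dir(...) on whatever path it receives.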
AgitatedDove14 . so if i understand correctly, what i can possibly do is copy paste the https://github.com/fastai/fastai/blob/master/fastai/launch.py code and add the Task.init there?
but anyway, this will still not work because fastai’s tensorboard doesn’t work in multi gpu 😞
SmugSnake6 yep, that’s exactly it.
Hope the team is aware and will fix it
The pipeline is a bit complex, but it did that with a very dumb example
don’t have one ATM
Looks like the first issue has been solved 🙂
i think the second one still persists, still checking