
For datasets it's easily done with a dedicated project, a separate task per dataset, and the Artifacts tab within it
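A minimal sketch of that pattern, with placeholder project/task names and file path:
```
from clearml import Task

# dedicated project for datasets; one task per dataset (names are placeholders)
task = Task.init(project_name='my-datasets', task_name='dataset-v1')
task.upload_artifact(name='data', artifact_object='data.zip')  # shows up in the task's Artifacts tab
task.close()
```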
Very nice for small pipelines (where every step could be put into a function in a single repository)
For experiments with no artifacts, everything seems to work properly
And I don't see any new projects / subprojects where that dataset creation Task is stored
Previously I had a separate, manually created project where I stored all newly created datasets for my main project. Very neat
Now the task is visible only in the "All Experiments" section, but there is no separate project in the web UI where I could see it...
I launch everything in docker mode, and since it builds an image on every run, it uses the default nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04 image, which incurs heavy overhead. What if I want to give it my own custom lightweight image instead, the same way I do for all individual tasks?
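For an individual task, this is a minimal sketch of that idea, assuming a clearml version where set_base_docker accepts an image string; 'python:3.8-slim' is just an example of a lighter image:
```
from clearml import Task

# placeholder project/task names; the agent in docker mode will run the task
# inside this image instead of the default nvidia/cuda one
task = Task.init(project_name='my-project', task_name='my-step')
task.set_base_docker('python:3.8-slim')
```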
@<1523701205467926528:profile|AgitatedDove14> clearml 1.1.1
Yeah, of course it is in draft mode (Task.import_task creates a task in draft mode; it is the lower task in the screenshot)
The refactoring is to account for the new project names, and also to resolve the project name depending on the client version
where is it in the docs?
In principle, I can modify almost everything with task_overrides, omitting the export part, and it's fine. But it seems that by exporting I can change more things, for example project_name
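A minimal sketch of the task_overrides route, assuming a step added from a base task; the step name and the override path shown are only illustrative:
```
pipe.add_step(
    name='train',                                    # hypothetical step name
    base_task_project=cfg['train']['base_project'],
    base_task_name=cfg['train']['base_task'],
    task_overrides={
        'script.branch': 'main',                     # dotted path into the task definition
    },
    execution_queue='my-queue',
)
```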
But I still cannot launch my own pipelines
Yeah, classic. What helped me is this
https://clearml.slack.com/archives/CTK20V944/p1636452811364800?thread_ts=1635950908.285900&cid=CTK20V944
The pipeline is initialized like this:
```
pipe = PipelineController(
    project=cfg['pipe']['project_name'],
    name='pipeline-{}'.format(name_postfix),
    version='1.0.0',
    add_pipeline_tags=True,
)
pipe.set_default_execution_queue('my-queue')
```
Then for each step I have a base task which I want to clone
```
step_base_task = Task.get_task(project_name=cfg[name]['base_project'],
                               task_name=...
```
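A minimal sketch of how such a base task might then be wired into the controller, assuming the step and queue names:
```
pipe.add_step(
    name=name,                          # step name from the config loop
    base_task_id=step_base_task.id,     # clone of the base task fetched above
    execution_queue='my-queue',
)

pipe.start()                            # enqueue the controller and run the pipeline
```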
I don't think so. It is solved by installing openssh-client in the docker image or by adding a deploy token to the clone URL in the web UI
AgitatedDove14 yeah, that makes sense, thank you. That means I need to pass a single zip file to the path argument in add_files, right?
The files themselves are not on S3 yet, they are stored locally. That's what I want: register a new dataset and upload the data itself to S3
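A minimal sketch of that flow with the Dataset API, with placeholder names and bucket URL:
```
from clearml import Dataset

ds = Dataset.create(dataset_name='my-dataset', dataset_project='my-datasets')
ds.add_files(path='local_data/')                   # files are still local at this point
ds.upload(output_url='s3://my-bucket/datasets')    # push the data itself to S3
ds.finalize()
```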
AgitatedDove14 any ideas? 😀
When I launch tasks with a pipeline, they keep complaining about missing pip packages. I run it inside a docker container, and I'm sure these packages are present inside it (when I launch the container locally, run python3 and import them, it works like a charm). Any ideas how to fix this?
@<1523701435869433856:profile|SmugDolphin23> could you please review it further? Is it acceptable to be merged?
You are right, I had [None] as parents in one of the tasks. Now this error is gone
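For reference, a minimal sketch of the intended parents argument; the step and task names are placeholders:
```
pipe.add_step(
    name='train',
    parents=['preprocess'],     # list of step names; use [] or omit it for no parents, not [None]
    base_task_id=step_base_task.id,
    execution_queue='my-queue',
)
```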
The pipeline controller itself is stuck in running mode forever; all step tasks are created but never enqueued
I can share some code
OK, I managed to launch the example and it works
RotundHedgehog76 We were running clearml-server on a Kubernetes cluster, so I just reached out to our devops to change the nginx settings and re-deploy it
I specifically set it as empty with export_data['script']['requirements'] = {} in order to reduce overhead during launch. I have everything installed inside the container
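A minimal sketch of that export/modify/import flow, assuming placeholder project/task names and a clearml version that provides export_task / import_task:
```
from clearml import Task

base = Task.get_task(project_name='my-project', task_name='base-task')
export_data = base.export_task()
export_data['script']['requirements'] = {}   # agent will not try to (re)install packages
new_task = Task.import_task(export_data)     # the imported task is created in draft mode
```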