I have attached the screenshot of logs earlier
@<1523701435869433856:profile|SmugDolphin23> I was using clearml==1.13.2 and I am now upgrading to clearml==1.14.1. As extra information, the images in my ClearML Server docker-compose file are all on the latest version right now.
Oh I see. I think there is a mismatch between some clearml versions on your machine? How did you run these scripts exactly? (From the CLI, for example with python test.py?)
Or if you ran it via an IDE, what is the interpreter path?
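One way to answer both questions at once is to run a short check from the same IDE run configuration (or the same shell) that launches the scripts. This is a minimal sketch using only the standard library; the helper name describe_install is made up for illustration and is not part of clearml.

```python
import importlib.metadata
import importlib.util
import sys


def describe_install(package: str) -> dict:
    """Report which interpreter is running and which copy of `package` it sees."""
    try:
        version = importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        version = None  # the package is not installed for this interpreter

    # find_spec locates the top-level package without importing it
    spec = importlib.util.find_spec(package)
    return {
        "interpreter": sys.executable,
        "version": version,
        "location": spec.origin if spec else None,
    }


print(describe_install("clearml"))
```

If the interpreter path or the package location does not point into the conda environment you expect, the IDE is probably using a different environment than the one where you ran pip.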
@<1626028578648887296:profile|FreshFly37> can you also share the logs of the task? It may give us an idea.
I just used the "pip install clearml" command for the SDK.
I ran it via an IDE. I am using a conda environment, and when I list the clearml packages it looks like the output below. The interpreter matches the base environment.
Regarding pending pipelines: please make sure a free agent is bound to the queue you wish to run the pipeline in. You can check the queue information by accessing the INFO section of the controller (as in the first screenshot).
Then, by clicking on the queue, you should see the worker status. There should be at least one worker that has a blank "CURRENTLY EXECUTING" entry.
@<1523701435869433856:profile|SmugDolphin23> I retried the same scenario with the clearml==1.14.1 package, but the pipelines are still not showing in the UI :(
For the clearml-server installation I followed the documentation steps one by one. Link is: None
@<1657556312684236800:profile|ManiacalSeaturtle63> can you share how you are creating your pipeline?
@<1523701435869433856:profile|SmugDolphin23> I have attached two screenshots. One is the pipeline initialization & the other one is a task of the pipeline.
The pipeline.py includes the code to run the pipeline & the tasks of the pipeline.
The project's directory is as follows:
├── Makefile
├── README.md
├── ev_xxxxxx_detection
│   ├── __init__.py
│   ├── __pycache__
│   │   └── __init__.cpython-311.pyc
│   ├── clearml
│   │   ├── __pycache__
│   │   ├── clearml_wrapper.py
│   │   ├── constants.py
│   │   ├── data_loader.py
│   │   ├── ev_trainer.py
│   │   ├── pipeline.py
│   │   └── util.py
├── poetry.lock
├── pyproject.toml
@<1523701435869433856:profile|SmugDolphin23> Can you please help me out here
sure, I'll add those details & check. Thank you
Thank you @<1523701435869433856:profile|SmugDolphin23> It is working now after the addition of repo details into each task. It seems that we need to specify repo details in each task to pull the code & execute the tasks on the worker.
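For anyone hitting the same problem, here is a minimal sketch of what "repo details in each task" can look like. It assumes the steps are added with add_function_step (the original code was only shared as screenshots, so this may not match it exactly). The repository URL, branch, project and step names are placeholders, and load_data is a made-up step.

```python
# Hypothetical repository details; replace with your own repo URL and branch.
REPO_DETAILS = {
    "repo": "https://example.com/your-org/your-repo.git",
    "repo_branch": "main",
}


def load_data(source: str) -> str:
    """Placeholder step body, only here so the sketch is complete."""
    return source.upper()


def build_pipeline():
    # Imported lazily so this module can be read and tested without a
    # configured ClearML setup.
    from clearml import PipelineController

    pipe = PipelineController(name="demo pipeline", project="demo project", version="1.0.0")
    pipe.set_default_execution_queue("queue_remote")
    pipe.add_function_step(
        name="load_data",
        function=load_data,
        function_kwargs={"source": "raw"},
        function_return=["data"],
        # With the repo set on the step, the agent running it is expected to
        # clone the code first, so the project's own modules are importable.
        **REPO_DETAILS,
    )
    return pipe
```

Calling build_pipeline().start(queue="queue_remote") would then enqueue the controller, as in the original pipeline.py.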
When I run it from the command line, everything goes back to normal and the pipeline is visible now. Thank you very much for your help, time and feedback 🙂 @<1523701435869433856:profile|SmugDolphin23>
@<1523701435869433856:profile|SmugDolphin23> I ran the code in order: step1, step2 and step3. Then I ran the "pipeline_from_task.py" script. I followed the ClearML documentation, so all of the code is taken from the GitHub repo.
There are two tasks available in the experiments list, as you can see below. I clicked the INFO tab of step_1 and the information looks like this. There is no pipeline controller task available; maybe that's why the UI does not show the pipeline.
what do you get when you run this code?
from clearml.backend_api import Session
print(Session.check_min_api_server_version("2.17"))
how about this one?
import clearml
import os
print("\n".join(open(os.path.join(clearml.__path__[0], "automation/controller.py")).read().split("\n")[310:320]))
This prints a string like the one below. """
if not self._task:
    task_name = name or project or '{}'.format(datetime.now())
    if self._pipeline_as_sub_project:
        parent_project = (project + "/" if project else "") + self._pipeline_section
        project_name = "{}/{}".format(parent_project, task_name)
    else:
        parent_project = None
        project_name = project or 'Pipelines'
    # if user disabled the auto-repo, we force local script storage (repo="" or repo=False) """
@<1523701435869433856:profile|SmugDolphin23> I have tried another way, by including pipeline.py in the root directory of the code and executing “python3 pipeline.py”, & still faced the same issue
@<1626028578648887296:profile|FreshFly37> can you please screenshot this section of the task? Also, what does your project's directory structure look like?
@<1626028578648887296:profile|FreshFly37> how are you running this locally in the first place?
If you are running pipeline.py with cwd as ev_xx_detection/clearml, then I would not expect you to be able to do "from ev_xx_detection.clearml import constants" (for example), but "import constants" directly would work (as constants.py is in the same directory as pipeline.py). The reason your remote run doesn't work is basically because of this: cwd is ev_xx_detection/clearml and ev_xx_detection.clearml.constants is imported, but the module that should be imported is actually constants
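The behaviour described above can be reproduced without ClearML. The sketch below builds a throwaway package in a temporary directory (demo_pkg/steps and the file names are made up for the demo) and runs each import style in a fresh interpreter, the same way "python pipeline.py" would.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_python(args, cwd):
    """Run a fresh interpreter in `cwd` and report whether it exited cleanly."""
    # Drop variables that would change how sys.path is initialised.
    env = {k: v for k, v in os.environ.items() if k not in ("PYTHONPATH", "PYTHONSAFEPATH")}
    result = subprocess.run([sys.executable, *args], cwd=cwd, env=env,
                            capture_output=True, text=True)
    return result.returncode == 0


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        steps = root / "demo_pkg" / "steps"
        steps.mkdir(parents=True)
        (root / "demo_pkg" / "__init__.py").write_text("")
        (steps / "__init__.py").write_text("")
        (steps / "constants.py").write_text("QUEUE = 'queue_remote'\n")
        (steps / "flat.py").write_text("import constants\n")
        (steps / "qualified.py").write_text("from demo_pkg.steps import constants\n")
        return {
            # Script directory is on sys.path, so the sibling module is found.
            "flat_as_script": run_python(["flat.py"], cwd=steps),
            # The project root is not on sys.path, so demo_pkg is not found.
            "qualified_as_script": run_python(["qualified.py"], cwd=steps),
            # With -m from the root, the root is on sys.path and the import works.
            "qualified_as_module": run_python(["-m", "demo_pkg.steps.qualified"], cwd=root),
        }


print(demo())
```

So the package-qualified import only resolves when the project root is on sys.path, for example when running with -m from the root; running the script from inside its own directory only makes its sibling modules importable.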
@<1523701435869433856:profile|SmugDolphin23> I have tried the same method as suggested by you and the pipeline still failed, as it couldn't find "modules". Could you please help me here?
I would like to describe the process again, which I was following:
- I created a queue and assigned 2 workers to the queue.
- In the pipeline.py file, to start the pipeline I used pipe.start(queue="queue_remote") and for the tasks I used pipe.set_default_execution_queue('queue_remote')
- In the working_dir = ev_xxxx_xxtion/clearml I executed the code using python3 pipeline.py
- The pipeline was initiated on queue "queue_remote" on worker 01 & the next tasks were initiated on queue "queue_remote" on worker 02 and it failed, as it couldn't find the modules in worker 02.