How does a task specify which docker image it needs?
Either in the code itself via `task.set_base_docker()`, or with the CLI, or set it in the UI when you clone an experiment (everything becomes editable)
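For example, a minimal sketch (the project/task names and docker image below are placeholders, not from this thread):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="docker-image-demo")
# The agent will use this image when executing the task in docker mode
task.set_base_docker("nvidia/cuda:11.6.2-runtime-ubuntu20.04")
```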
Thanks FrothyShark37
I just verified, this works as well. I suspect what was missing is the `plt.show()` call; this is the actual call that triggers ClearML's automatic capture.
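For example (a minimal sketch; project/task names are placeholders):

```python
import matplotlib.pyplot as plt
from clearml import Task

task = Task.init(project_name="examples", task_name="matplotlib-demo")

plt.plot([1, 2, 3], [4, 5, 6])
# plt.show() is the call ClearML's matplotlib binding hooks into;
# without it the figure is never reported to the experiment.
plt.show()
```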
and the agent default runtime mode is docker correct?
Actually the default is venv mode; to run in docker mode add `--docker` to the command line, e.g. `clearml-agent daemon --queue default --docker`
So I could install all my system dependencies in my own docker image?
Correct. Inside the docker it will inherit all the preinstalled packages, but it will also install any missing ones (based on the Task requirements, i.e. the "Installed Packages" section)
Also, what is the purpose of the `aws` block in the clearml.c...
FlatStarfish45
In the parent task, the libs appear installed.
What do you mean by "parent Task"? Is this the base task we are optimizing (i.e. the experiment / model we are optimizing) ?
Or is it the "Optimization Task" itself?
Basically the idea is that you do not need to configure the experiment manually: it is created when you actually develop the code / run / debug it, or you have the CLI take everything from your machine and populate it
Hi FrothyShark37
Can you verify with the latest version?
pip install -U clearml
JitteryCoyote63
I agree that its name is not search-engine friendly,
LOL 😄
It was an internal joke; the guys decided to call it "trains" because, you know, it trains...
It was unstoppable, we should probably do a line of merch with AI 🚆 😉
Anyhow, this one definitely backfired...
@PipelineDecorator.component(repo="..")
The imports are not recognized - they are not on the pythonpath of the task that the agent starts.
RoughTiger69 add the imports inside the function itself; you can also specify them on the component: `@PipelineDecorator.component(..., packages=["package", "package==1.2.3"])`
or `@PipelineDecorator.component(...)` with `import pandas as pd  # noqa` inside the function body ...
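Putting both options together, a minimal sketch (the step function, package pin, and CSV path are placeholders, not from the original thread):

```python
from clearml.automation.controller import PipelineDecorator

# Imports used by the step live inside the function, so they are
# available when the agent runs the component on a remote worker.
@PipelineDecorator.component(return_values=["df"], packages=["pandas==1.5.3"])
def load_data(csv_path: str):
    import pandas as pd  # noqa
    return pd.read_csv(csv_path)
```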
I would like to force the usage of those requirements when running any script
How would you force it? Will you just ignore the "Installed Packages" section ?
(basically Python abusing types/casting, where the value can be both str/bool on the same argparse argument)
Whereas if I just download the right packages from the requirements.txt, then I don't need to think about that
I see your point; the only question is how come these packages are not automatically detected?
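If you do want to pin packages explicitly regardless of auto-detection, one option is `Task.add_requirements`, called before `Task.init()` (a sketch; the package names/versions below are placeholders):

```python
from clearml import Task

# Explicitly add entries to the Task's "Installed Packages",
# so the agent installs them even if auto-detection misses them.
# Must be called before Task.init().
Task.add_requirements("pandas", "1.5.3")
Task.add_requirements("scikit-learn")

task = Task.init(project_name="examples", task_name="forced-requirements")
```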
BTW: Docker Hub is free and relatively cheap to upgrade 🙂
(GitHub also offers a docker registry)
Make sure you have the S3 credentials in your agent's clearml.conf:
https://github.com/allegroai/clearml-agent/blob/822984301889327ae1a703ffdc56470ad006a951/docs/clearml.conf#L210
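For reference, the relevant section of clearml.conf looks roughly like this (all values below are placeholders; see the linked file for the full schema):

```
sdk {
    aws {
        s3 {
            # default credentials used for any S3 bucket
            key: "AWS_ACCESS_KEY"
            secret: "AWS_SECRET_KEY"
            region: "us-east-1"
        }
    }
}
```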
It should move you directly into the queue pages.
Let me double check (working on the community server)
Is it across the board for any Task?
What would you expect to happen if you clone a Task that used the requirements.txt? Would you ignore the full "pip freeze" and use the requirements.txt again, or is this the time we want to use the "installed packages"?
LovelyHamster1 from the top, we have two steps:
1. We run the code "manually" (i.e. without the agent). This step creates the experiment (Task) and automatically fills in the "Installed Packages" section (which is in the same format as a regular requirements.txt).
2. An agent runs a cloned copy of the experiment (Task). The agent creates a new venv on the agent's machine, then uses the "Installed Packages" section as a replacement for the regular requirements.txt and installs everything fro...
btw: you can also configure --extra-index-url in the agent's clearml.conf
Hi UnsightlyLion90
from my understanding the agent does the job of SLURM,
That is kind of correct (they overlap in some ways 🙂 )
Any guide of how to integrate both of them?
The easiest way is to just add the Task.init() call to your code, and use SLURM to schedule the job. This will make sure all jobs are fully logged (this can also include automatically uploading the models, artifact support, etc.)
Full SLURM support (i.e. similar to the k8s glue support), is currently ou...
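A minimal sketch of the integration (project/task names are placeholders; SLURM keeps doing the scheduling):

```python
from clearml import Task

# The only ClearML-specific line in a SLURM-scheduled training script:
# once the job starts, everything below is logged to the ClearML server.
task = Task.init(project_name="slurm-experiments", task_name="train-run")

# ... the existing training code runs unchanged; models saved through the
# usual framework calls are captured automatically ...
```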
PompousBeetle71 just making sure: changing the name solved it?
So I had to add it explicitly via a docker init script
Oh yes, that makes sense; can't think of a better hack other than `sys.path.append(os.path.join(os.path.dirname(__file__), "src"))`
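In context, the hack looks like this (a sketch; `my_local_module` is a hypothetical module living under `src/`):

```python
import os
import sys

# Make the repo's local "src" folder importable when the agent
# runs the entry script from the cloned repository root.
sys.path.append(os.path.join(os.path.dirname(__file__), "src"))

import my_local_module  # noqa: E402  (hypothetical module under src/)
```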
Hi VexedCat68
can you supply more details on the issue ? (probably the best is to open a github issue, and have all the details there, so we have better visibility)
wdyt?
Hi SlimyElephant79
As you can imagine, wandb's tracking code would be present across the code modules and I was hoping for a structured approach that would help me transition to ClearMLs experiment tracking.
Do you guys have a layer in between that does the reporting, or is the codebase riddled with direct reporting calls? If the latter, then I guess search and replace? Or maybe a module that "converts" wandb calls to clearml calls? wdyt?
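For example, a thin "converter" layer could look like this (a sketch; the project/task names and the `log_scalar` helper are hypothetical, not part of either library):

```python
from clearml import Task

task = Task.init(project_name="migration", task_name="wandb-to-clearml")
_logger = task.get_logger()

def log_scalar(title: str, series: str, value: float, iteration: int) -> None:
    # ClearML counterpart of wandb.log({f"{title}/{series}": value}, step=iteration)
    _logger.report_scalar(title=title, series=series, value=value, iteration=iteration)
```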
Hi TroubledJellyfish71
What do you have listed in the Task's execution "Installed Packages" section? (of the original Task)
How did it end up with an http link of pytorch ?
Usually it would be torch==1.11
...
EDIT:
I'm assuming the original Task was executed on a Mac M1; what are you getting when calling `pip freeze`?
And where is the agent running ? (and is it venv or docker mode?)
WackyRabbit7 I might be missing something here, but the pipeline itself should be launched on the "pipelines" queue. Is the pipeline itself running? Or is it the step itself that is stuck in the "queued" state?
Perhaps it is the imports at the start of the script only being assigned to the first task that is created?
Correct!
However when I split the experiment task out completely it seems to have built the cloned task correctly.
Nice!!
`Task.running_locally()` should do the trick
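For example (project/task names are placeholders):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="local-check")

if Task.running_locally():
    # executing on the development machine, not under a clearml-agent
    print("local run - e.g. switch to a small debug dataset")
else:
    print("running remotely under a clearml-agent")
```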
can i run it on an agent that doesn't have gpu?
Sure this is fully supported
when I run clearml-serving it throws me an error "please provide specific config.pbtxt definition"
Yes, this is a small file that tells the Triton server how to load the model:
Here is an example:
https://github.com/triton-inference-server/server/blob/main/docs/examples/model_repository/inception_graphdef/config.pbtxt
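A minimal config.pbtxt along the lines of that example (the tensor names, shapes, and batch size below are placeholders for your own model):

```
platform: "tensorflow_graphdef"
max_batch_size: 128
input [
  {
    name: "input"
    data_type: TYPE_FP32
    format: FORMAT_NHWC
    dims: [ 299, 299, 3 ]
  }
]
output [
  {
    name: "output_scores"
    data_type: TYPE_FP32
    dims: [ 1001 ]
  }
]
```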
they are just neighboring modules to the function I am importing.
So I think that if you specify the repo, on the remote machine you will end up with the code of the component sitting at the root folder of the repo; from there I assume you can import the rest, and the root git path should be part of your PYTHONPATH automatically.
wdyt?
Can you post here the actual line? seems like we can fix it to also support this scenario (if we could test it)