WackyRabbit7 interesting! Are those "local" pipelines all part of the same code repository? Do they need their own environment?
What would be the easiest pipeline interface to run them locally? (I ask because I'd like us to support this workflow; it seems you are not alone in this approach, and of course you can always use them remotely, i.e. clone the pipeline and launch it on an agent)
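For the remote route (clone the pipeline and launch it on an agent), a rough sketch could look like this; the project, task, and queue names are placeholders:
```
from clearml import Task

# fetch the existing pipeline (controller) task, clone it, and send it to an agent
pipeline_task = Task.get_task(project_name="my-project", task_name="my-pipeline")
cloned = Task.clone(source_task=pipeline_task, name="my-pipeline (clone)")
Task.enqueue(cloned, queue_name="services")
```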
I started running it again and it seems to have passed the phase where it failed last time
Yey!
Yes it is a common case....
I have the feeling ShinyLobster84 WackyRabbit7 you are not alone in this one 🙂 let me make sure we change the default value to False, so the code looks cleaner
I keep getting a "failed getting token" error
MiniatureCrocodile39 what's the server you are using ?
Sure, go to "All Projects" and filter by Task Type: application / service
Hi BattyLion34
I might have a solution, in order to make sure the two agents are not sharing the "temp" folder:
create two copies of ~/clearml.conf, let's call them:
~/clearml_service.conf and ~/clearml_agent.conf
Then in each one select a different venvs_dir, see here:
https://github.com/allegroai/clearml-agent/blob/822984301889327ae1a703ffdc56470ad006a951/docs/clearml.conf#L90
for example:
~/.clearml/venvs-builds1 and ~/.clearml/venvs-builds2
Now start the two agents with:
The service age...
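For illustration, the venvs_dir override in each copy could look something like this (a clearml.conf excerpt based on the example paths above):
```
# ~/clearml_service.conf
agent {
    venvs_dir: ~/.clearml/venvs-builds1
}

# ~/clearml_agent.conf
agent {
    venvs_dir: ~/.clearml/venvs-builds2
}
```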
Hmm that makes sense, I "think" the enterprise offering has a solution for that as well (i.e. full separation over static cluster), but probably the best way to pursue this avenue is to talk to Sales (I'm assuming they'll set up a call to discuss the details)
Going back to the open source, I think that adding the credentials as part of the source code might allow the "credentials" to auto-populate as part of the remote execution, wdyt?
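A minimal sketch of that idea, assuming Task.set_credentials() is called before Task.init() (the host URLs and key/secret values are placeholders):
```
from clearml import Task

# set the credentials from code so the remote execution can authenticate
Task.set_credentials(
    api_host="https://api.clear.ml",
    web_host="https://app.clear.ml",
    files_host="https://files.clear.ml",
    key="YOUR_ACCESS_KEY",
    secret="YOUR_SECRET_KEY",
)

task = Task.init(project_name="examples", task_name="credentials from code")
```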
first try the current setup using pip, and if it fails, use poetry if poetry.lock exists
I guess the order here is not clear to me (the agent does the opposite), why would you start with pip if you are using poetry ?
I pull all the parameters, and then manually filter on the HP keys (manually=I have to plug them in, they are not part of optimizer object)
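A rough sketch of that manual filtering (the task id and HP key names here are hypothetical):
```
from clearml import Task

HP_KEYS = {"General/lr", "General/batch_size"}  # hypothetical hyper-parameter keys

child_task = Task.get_task(task_id="child_task_id")
params = child_task.get_parameters()  # flattened dict of all parameters
hp_values = {k: v for k, v in params.items() if k in HP_KEYS}
```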
So is this an improvement to the optimizer._get_child_tasks_ids(...) interface?
e.g. return a structure like:
[
    {
        'id': task_id,
        'hp1': value, 'hp2': value, 'hp3': value,
        'objective': dict(title='title', series='series', value=42),
    },
]
It seems strange that only a single scalar is reported.
Hi @<1691620877822595072:profile|FlutteringMouse14>
Do I have to use Hydra
You can, and then the entire configuration is fully captured by ClearML (automatically), while you can still override values with the manual "key.sub=value" syntax, both in the UI and in the CLI
Otherwise you can connect a nested dict with task.connect (these will be flattened with '/' for sub keys).
Or you can connect configuration files ( task.connect_configuration ) and edit them as is in the UI (with override of...
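A short sketch of those last two options (the file name is just an example):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="config options")

# option 1: nested dict, flattened with '/' for sub keys, editable in the UI
config = {"model": {"lr": 0.001, "layers": 4}, "data": {"batch_size": 32}}
config = task.connect(config)

# option 2: attach a configuration file as-is, editable in the UI
config_path = task.connect_configuration("config.yaml", name="train config")
```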
I assume so 🙂 Datasets are kind of agnostic to the data itself, for the Dataset it's basically a file hierarchy
HealthyStarfish45 We are now working on improving the k8s glue (due to be finished next week) after that we can take a stab at slurm, it should be quite straight forward. Will you be able to help with a bit of testing (setting up a slurm cluster is always a bit of a hassle 🙂 )?
Hi @<1544128915683938304:profile|DepravedBee6>
You mean like backup the entire instance and restore it on another machine? Or are you referring to specific data you want to migrate?
BTW if you are upgrading old versions of the server I would recommend upgrading to every version in the middle (there are some migration scripts that need to be run in a few of them)
If you cannot change the "TrainerState" (i.e. inherit and pass it into the code)
you could also monkey-patch it, something like:
```
class OurTrainerState(TrainerState):
    def __init__(...):
        ...

    @classmethod
    def load_from_json(cls, json_path: str):
        # load the state, then report it back to ClearML as an artifact
        state = super().load_from_json(json_path)
        Task.current_task().upload_artifact(...)
        return state

trainer.state = OurTrainerState(trainer.state)
```
somehow set docker_args and docker_bash_setup_script equivalent??
task.set_base_docker(...)
# somehow setup repo and branch to download to remote instance before running
This is automatically detected based on your local commit/branch as well as uncommitted changes
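A hedged sketch of that, assuming the docker_image / docker_arguments / docker_setup_bash_script arguments available in recent clearml versions (the image, arguments, and queue name are placeholders):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="remote run")

# equivalent of docker_args / docker_bash_setup_script for the remote run
task.set_base_docker(
    docker_image="nvidia/cuda:11.8.0-runtime-ubuntu22.04",
    docker_arguments="--ipc=host -e MY_ENV=1",
    docker_setup_bash_script=["apt-get update", "apt-get install -y git"],
)

# repo / branch / uncommitted changes are auto-detected from the local git state
task.execute_remotely(queue_name="default")
```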
The .ssh is mounted, but the owner is my local user,
sudo -H clearml-agent ... to allow sudo to access home
Hi FrothyShark37
Can you verify with the latest version?
pip install -U clearml
Hi @<1603198134261911552:profile|ColossalReindeer77>
I would also check this one: None
Hi NastySeahorse61
Did you archive and then delete the experiments from the archive?
BTW: I think this question belongs to
But I have no idea what will be input of step2.
What do you mean by that? the assumption is that somehow the output of step 1 will be passed (a string reference) to step 2, what am I missing ?
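For example, with a PipelineController the string reference to step1's output can be wired into step2 along these lines (the artifact and parameter names are hypothetical):
```
from clearml import PipelineController

pipe = PipelineController(name="my pipeline", project="examples", version="1.0")

pipe.add_step(name="step1", base_task_project="examples", base_task_name="step 1")
pipe.add_step(
    name="step2",
    parents=["step1"],
    base_task_project="examples",
    base_task_name="step 2",
    # pass a string reference to step1's output artifact into step2's parameters
    parameter_override={"General/input_url": "${step1.artifacts.output.url}"},
)

pipe.start()
```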
For visibility, after close inspection of API calls it turns out there was no work against the saas server, hence no data
You mean for running a worker? (I think plain vanilla python / ubuntu works)
The only change would be pip install clearml / clearml-agent ...
GrotesqueOctopus42
The problem is that when I import some function from a file in another folder, the task doesn't catch the file dependencies.
Just to be clear, if this is another file, you have to have all the files in the same git repo for the agent to actually be able to fetch them on the remote machine.
If you have a mix of notebooks and code, you have to have the local code in a git repo,
Make sense ?
Hi TartBear70
I'm setting up reproducibility myself but when I call Task.init() the seed is changed
Correct
. Is it possible to tell clearml not to initialize any rng? It appears that task.set_random_seed() doesn't change anything.
I think this is now fixed (meaning should be part of the post weekend release)
. Is this documented?
Hmm I'm not sure (actually we should write it, maybe in the Task.init docstring?)
Specifically the function that is being called is:
https://gi...
GiganticTurtle0
I think that what you are looking for is:
param_dict = {'key': 1234}
task.connect(param_dict, name='general')
Notice that when this code runs manually (i.e. not by the agent), the dict is stored in the "general" parameter section of the Task.
But when the code is executed by the Agent, the opposite happens and the parameters from the "general" section of the Task are put back into param_dict; here the casting is done based on the type of the original values.
Generall...
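Putting it together, a minimal sketch (project/task names are placeholders):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="connect params")

param_dict = {"key": 1234}
# manual run: the dict is stored under the "general" section of the Task
# agent run: the Task's "general" section values are cast and written back into param_dict
task.connect(param_dict, name="general")

print(param_dict["key"])  # reflects any value overridden in the UI
```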
Do you have python 3.7 in the docker ?
Alright I have a followup question then: I used the param --user-folder "~/projects/my-project", but any change I do is not reflected in this folder. I guess I am in the docker space, but this folder is not linked to the folder on the machine. Is it possible to do so?
Yes you must make sure the docker can mount a persistent folder for you to work on.
Let me check what's the easiest way to do that
Most likely yes, but I don't see how clearml would have an impact here, I am more inclined to think it would be a pytorch dataloader issue, although I don't see why
These are most certainly dataloader processes. But clearml-agent, when killing the process, should also kill all subprocesses, and it might be there is something going on that prevents it from killing the subprocesses ...
Is this easily reproducible ? Can you verify it is still the case with the latest RC of clearml-agent ?