BoredPigeon26, it looks like the file isn't accessible through your browser. Are you sure the files on the remote machine are accessible?
How about when you view it in the datasets view? Also, what version of the clearml package do you have?
If you run an agent in docker mode ( --docker ), the agent will run a docker run command and the task will be executed inside a container. In that scenario, I believe, if you kill the daemon the container will stay up and finish the job (haven't tested this, though)
Looping in @<1523703436166565888:profile|DeterminedCrab71> & @<1523701435869433856:profile|SmugDolphin23> for visibility
Hi @<1523708340473958400:profile|SweetHippopotamus84> , can you please provide some code snippet that reproduces this?
Interesting! Do they happen to have the same machine name in the UI?
Everything in None
You can set up username & password, it's in the documentation 🙂
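For a self-hosted server, login is configured through fixed users in the apiserver configuration; a sketch (usernames/passwords below are placeholders):

```
auth {
    fixed_users {
        enabled: true
        users: [
            {
                username: "jane"
                password: "12345678"
                name: "Jane Doe"
            }
        ]
    }
}
```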
You should check the status of that container
You can add torch to the installed packages section manually to get it running, but I'm curious why it wasn't logged. How did you create the original experiment?
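You can also force the requirement from code; a minimal sketch (project/task names are made up):

```python
from clearml import Task

# Ensure torch is listed in the task's installed packages even if
# auto-detection missed it; must be called before Task.init().
Task.add_requirements("torch")
task = Task.init(project_name="examples", task_name="train")
```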
Hi SarcasticSquirrel56 ,
How are the agents running? On top of K8s or bare metal?
Also, can you do a diff between the ~/clearml.conf of your local machine and the one on the agent?
Hi PunyWoodpecker71 ,
It's best to run the pipeline controller in the services queue, because the assumption is that the controller doesn't require much compute power, as opposed to steps that can be resource-exhausting (depends on the pipeline, of course)
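A minimal sketch of sending the controller itself to the services queue (project, pipeline, and step names are placeholders):

```python
from clearml import PipelineController

pipe = PipelineController(name="my-pipeline", project="examples", version="1.0")
pipe.add_step(name="train", base_task_project="examples", base_task_name="train")

# Heavy steps go to their own execution queue; the lightweight
# controller itself runs on the services queue.
pipe.set_default_execution_queue("default")
pipe.start(queue="services")
```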
Can you please provide full logs of everything?
I'm guessing this is a self deployed server, correct?
Can you try with auto_connect_streams=True ? Also, what version of the clearml SDK are you using?
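i.e. something like this at the top of your script (project/task names are placeholders):

```python
from clearml import Task

# auto_connect_streams=True restores capture of stdout/stderr,
# and with it the console reporting for the task.
task = Task.init(
    project_name="examples",
    task_name="debug-streams",
    auto_connect_streams=True,
)
```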
Hi @<1708653001188577280:profile|QuaintOwl32> , you can control all of this on the task level. For example, through code you can use Task.set_base_docker
- None
You can add all of these as arguments
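For example, a sketch (the image and docker arguments below are placeholders):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="docker-config")

# Choose the container image and extra docker arguments used when
# this task is later executed by an agent running in docker mode.
task.set_base_docker(
    docker_image="nvidia/cuda:11.8.0-runtime-ubuntu22.04",
    docker_arguments="--ipc=host",
)
```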
I'm being silly. You're actually directing it to the file itself, where it resides
or /home/<USER_NAME>/clearml.conf
You can view all projects and search there 🙂
They do look identical, I think the same issue (If it's an issue) also affects https://clear.ml/docs/latest/docs/references/sdk/dataset/#list_added_files
@<1529271085315395584:profile|AmusedCat74> , what happens if you try to run it with clearml 1.8.0?
@<1719524641879363584:profile|ThankfulClams64> , if you set auto_connect_streams to false, nothing will be reported from your frameworks. Which frameworks are you working with, tensorboard?
So even if you abort it on the start of the experiment it will keep running and reporting logs?
I suggest you see this example - None
And see how you can implement an if statement in this sample to basically create 'branches' in the pipeline 🙂
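The branching idea can be sketched as plain decision logic that picks which steps to run (the step names, condition, and threshold here are made up for illustration; in a real pipeline each name would correspond to a controller step):

```python
def plan_steps(validation_accuracy: float, threshold: float = 0.9) -> list:
    """Return the ordered step names to add to the pipeline controller."""
    steps = ["prepare_data", "train"]
    # 'Branch': only add the deployment step when the model is good
    # enough, otherwise fall back to a hyperparameter-tuning step.
    if validation_accuracy >= threshold:
        steps.append("deploy")
    else:
        steps.append("tune_hyperparams")
    return steps
```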
Sure, if you can post it here or send in private if you prefer it would be great
Hi @<1570583227918192640:profile|FloppySwallow46> , can you please elaborate on what you were doing, what happened?
I don't think datasets have visualization out of the box, you need to add these previews manually. Only the HyperDatasets feature from the Scale & Enterprise versions truly visualizes all the data.
According to your code snippet, there isn't any visualization added on top of the dataset
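A sketch of attaching a preview manually (dataset names and file paths are placeholders):

```python
from clearml import Dataset

ds = Dataset.create(dataset_project="examples", dataset_name="my-dataset")
ds.add_files("data/")

# Previews are not generated automatically; report one explicitly
# through the dataset's logger so it shows up in the UI.
ds.get_logger().report_table(
    title="sample", series="head", csv="data/sample.csv"
)
ds.finalize(auto_upload=True)
```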
Hi SoreHorse95 , I'm not sure I understand. You know the reason for the failure, and you say that the machines are identical but they do have different versions cached. Why is the one with the different version actually failing?
Is there a reason it requires pytorch?
The script you provided has only clearml as a requirement
@<1722786138415960064:profile|BitterPuppy92> , we are more than happy to accept pull requests to our free open source version 🙂