So in which scenario do you want to keep those folders as artifacts and where would you like to store them?
Hi @<1826791494376230912:profile|CornyLobster42> , it looks like there might be an issue with the image. Have you tried other images? From what I see here - None
Many people report similar issues, with a variety of suggested solutions.
What if you try with this image? projects/ml-images/global/images/c0-deeplearning-common-cu121-v20231209-debian-11
Hi ShaggySquirrel23 , is this package inside some artifactory?
The agent needs access to the package while running, so you need to make it accessible on the remote machine as well. What is your setup?
GrievingTurkey78 , let me take a look into it 🙂
Hi PunyWoodpecker71 ,
It's best to run the pipeline controller in the services queue, because the assumption is that the controller doesn't require much compute power, as opposed to the steps, which can be resource-intensive (depending on the pipeline, of course)
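For illustration, a minimal sketch (project/task names are hypothetical) of enqueuing the controller to 'services' while the steps go to a compute queue:
```python
from clearml import PipelineController

pipe = PipelineController(
    name="my-pipeline",   # hypothetical names
    project="examples",
    version="1.0.0",
)
pipe.add_step(
    name="train",
    base_task_project="examples",
    base_task_name="train task",
    execution_queue="default",  # resource-heavy steps run on a compute queue
)

# The lightweight controller itself runs on the 'services' queue
pipe.start(queue="services")
```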
CluelessElephant89 , Hi!
It looks like there is a problem with the API server. Can you please look at the docker logs, see what errors it prints, and paste them here 🙂
BattyDove56 , the warning doesn't seem related. As I mentioned before, you need to check the Elastic logs to see what the issue is. Post them here so we can look together 🙂
Hi @<1590514584836378624:profile|AmiableSeaturtle81> , did you have a chance to try out the solution suggested in GitHub and play with it a bit?
Always good to use the latest 🙂
It's a setting on the autoscaler:
Hi ReassuredOwl55 , can you please elaborate on your use case or exactly what you're trying to achieve?
Hi @<1536881167746207744:profile|EnormousGoose35> , can you take a look at the webserver and apiserver container logs to see what errors are there?
@<1523701295830011904:profile|CluelessFlamingo93> , I'm not sure what you mean. Whenever you run pipeline code (a pipeline from decorators), if it's from a repository, that repo will be logged. Where are you importing "train" from? What if you import the entire package and point to the specific module?
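For example, something along these lines (package name is hypothetical):
```python
# Instead of importing the module directly, e.g. `from train import main`,
# import the whole package and reference the module through it:
import my_package.train

my_package.train.main()
```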
And from the error you get, like I mentioned, it looks like there is no 'services' queue. I would check the logs of the agent-services container to see if you get any errors, as this is the agent in charge of listening to the 'services' queue
What issues are you facing?
Hi @<1736556881964437504:profile|HelplessFly7> , I don't think there is such an integration. Currently Poetry, pip and Conda are supported. I think you could open a PR for this on the clearml-agent repository
Hi SwankyCrab22 ,
Regarding Task.init() - did you try passing docker_bash_setup_script, and it didn't work? Because according to the docs, it should be available with Task.init() as well. Also, after Task.init() you can use the following method:
https://clear.ml/docs/latest/docs/references/sdk/task#set_base_docker
to also add a docker_setup_bash_script in its args.
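For example, a rough sketch (image name and setup lines are just placeholders):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="docker demo")  # hypothetical names

# Attach a base docker image plus a bash script the agent runs
# inside the container before the task starts
task.set_base_docker(
    docker_image="nvidia/cuda:11.8.0-runtime-ubuntu22.04",
    docker_setup_bash_script=[
        "apt-get update",
        "apt-get install -y git",
    ],
)
```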
Regarding running the script after the repository is downloaded - I'm not sure. But certainly...
Hi @<1523701504827985920:profile|SubstantialElk6> , thanks for the heads up 🙂
Hi @<1820993248525553664:profile|DisturbedReindeer69> , I think you're looking for the --output-uri parameter in clearml-data create - None
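For reference, a rough SDK-side sketch, assuming a recent clearml version where Dataset.create() accepts output_uri (bucket path and names are placeholders):
```python
from clearml import Dataset

# Upload dataset files to a custom storage target
# instead of the default files server
dataset = Dataset.create(
    dataset_name="my-dataset",
    dataset_project="examples",
    output_uri="s3://my-bucket/datasets",
)
dataset.add_files(path="./data")
dataset.upload()
dataset.finalize()
```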
RotundSquirrel78 , do you have an estimate of how much RAM the machine running the ClearML server has? Is it dedicated to ClearML only, or are there other processes running?
By the way, is there a specific functionality you're looking for?
Hi @<1524560082761682944:profile|MammothParrot39> , I think you need to run the pipeline at least once (at least the first step should start) for it to "catch" the configs. I suggest you run once with pipe.start_locally(run_pipeline_steps_locally=True)
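i.e. something along these lines (controller object assumed to be your existing pipeline):
```python
from clearml import PipelineController

pipe = PipelineController(name="my-pipeline", project="examples", version="1.0.0")
# ... add your steps here ...

# Run the controller and all steps in the local process once,
# so the pipeline structure and step configurations get registered
pipe.start_locally(run_pipeline_steps_locally=True)
```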
Regarding the packages issue:
What Python version did you run on originally? Because it looks like 1.22.3 is only supported by Python 3.8. You can circumvent this entire issue by running in docker mode with an image that has 3.7 pre-installed
Regarding the data file loading issue - How do you specify the path? Is it relative?
Hi @<1546303277010784256:profile|LivelyBadger26> , how did you set the random seed? I think you can also disable ClearML's random seed override and set one with PyTorch
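For example, a sketch assuming your clearml version exposes Task.set_random_seed() (call it before Task.init()):
```python
import torch
from clearml import Task

# Disable ClearML's automatic seed override - must happen before Task.init()
Task.set_random_seed(None)

task = Task.init(project_name="examples", task_name="seed demo")  # hypothetical names

# Set your own deterministic seed with PyTorch
torch.manual_seed(42)
```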
Hi, how did you save the dataset so far?
Can you add the api section of your clearml.conf and also a log of a task?
Hi UnevenDolphin73 , is there a specific setting you're looking for?
GreasyLeopard35 , please try with the latest clearml-agent