Hi ReassuredTiger98 ,
I think it is something that was logged during the initial run, and the clearml-agent then simply recreates the environment 🙂
What ClearML version are you on?
Hi @<1546303293918023680:profile|MiniatureRobin9> , can you please add logs of the tasks + controller?
@<1523701181375844352:profile|ExasperatedCrocodile76> , I think you need to set agent.package_manager.system_site_packages: True in clearml.conf
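For reference, a minimal sketch of what that would look like in clearml.conf (key names are from the default agent config - double-check against your version):
```
agent {
    package_manager {
        # Let the agent's virtual environments see the system-wide site-packages
        system_site_packages: true
    }
}
```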
ClearML doesn't assume that you have all the necessary packages installed, so it does need some sort of basis for which packages to install
I'm afraid that would be the best method. You could probably hack something into the ClearML SDK yourself, since it's open source
BrightMosquito10 simply re-run it with the new version 🙂
Hi @<1544853695869489152:profile|NonchalantOx99> , it looks like your cache is filling up. You can control it through the various cache configurations in clearml.conf
None
Just search for 'cache' in that document and you'll find all the relevant configs 🙂
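As a rough sketch of the kind of settings you'll find there (key names from the default clearml.conf - verify them against your version; the values here are illustrative):
```
sdk {
    storage {
        cache {
            # Where artifacts/datasets are cached locally
            default_base_dir: "~/.clearml/cache"
            # Max number of cached files kept per cache context
            default_cache_manager_size: 100
        }
    }
}
```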
Hi 🙂
Please try specifying the file itself explicitly
So even if you abort it at the start of the experiment, it will keep running and reporting logs?
You can specify a different docker image per experiment, so the same agent can run many different docker images (as long as it is run in docker mode from the start) 🙂
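For example, something along these lines in the experiment code (a sketch - check set_base_docker() against your SDK version; the image name is a placeholder):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="docker per experiment")

# Request a specific docker image for this experiment when an agent
# (started with --docker) picks it up
task.set_base_docker(docker_image="nvidia/cuda:11.8.0-runtime-ubuntu22.04")
```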
Didn't have a chance to try and reproduce it, will try soon 🙂
Hi @<1590514584836378624:profile|AmiableSeaturtle81> , you need to setup your s3 key/secret in clearml.conf
I suggest following this documentation - None
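Roughly, the relevant section of clearml.conf looks like this (structure from the default config; bucket name, region and credentials are placeholders):
```
sdk {
    aws {
        s3 {
            # Default credentials used for all buckets
            key: "YOUR_ACCESS_KEY"
            secret: "YOUR_SECRET_KEY"
            region: "us-east-1"

            # Optional: per-bucket credentials
            credentials: [
                {
                    bucket: "my-bucket"
                    key: "BUCKET_ACCESS_KEY"
                    secret: "BUCKET_SECRET_KEY"
                }
            ]
        }
    }
}
```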
AbruptWorm50 , you can send it to me. Also, can you please answer the following two questions: When were they registered? Were you able to view them before?
Also, you mention plots but in the screenshot you show debug samples. Can I assume you're talking about debug samples?
A pipeline is a unique type of task, so it should be detected without issue
VirtuousFish83 Hi 🙂
What versions are you running? ClearML, ClearML-Agent, Torch, Lightning. Which OS are they running on, and with what Python version?
Do you maybe have a snippet to play around with to try and reproduce the issue?
Hi HappyDove3 , you mean when using app.clear.ml?
Hi @<1526734383564722176:profile|BoredBat47> , do you see any errors in the elastic container?
Hi @<1523701842515595264:profile|PleasantOwl46> , I'm not sure. Do you see any errors in the API server on such a startup?
My bad - if you set auto_connect_streams to False, you basically disable the console logging... Please see the documentation:
auto_connect_streams (Union[bool, Mapping[str, bool]]) – Control the automatic logging of stdout and stderr.
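So to keep the console output you'd leave it enabled, or pass a mapping, e.g. (a minimal sketch):
```
from clearml import Task

# Capture stdout/stderr in the console log, but skip the logging module
task = Task.init(
    project_name="examples",
    task_name="stream logging",
    auto_connect_streams={"stdout": True, "stderr": True, "logging": False},
)
```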
Hi @<1545216070686609408:profile|EnthusiasticCow4> , start_locally() has the run_pipeline_steps_locally parameter for exactly this 🙂
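Something like this (a sketch - the pipeline definition itself is just a placeholder):
```
from clearml import PipelineController

pipe = PipelineController(name="my-pipeline", project="examples", version="1.0.0")
# ... pipe.add_step(...) / pipe.add_function_step(...) ...

# Run the controller locally AND execute every step in the local process
pipe.start_locally(run_pipeline_steps_locally=True)
```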
Oh I see. Technically speaking, the pipeline controller is itself a task of a special type. So technically you could provide the task ID of the controller and clone that. You would need to make sure that the relevant system tags are also applied, so it shows up properly as a pipeline in the web UI.
In addition to that, you can also trigger it using the API.
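As a sketch using the SDK (the task ID and queue name are placeholders):
```
from clearml import Task

# Clone the pipeline controller task by its ID and enqueue the copy
controller = Task.get_task(task_id="<controller_task_id>")
cloned = Task.clone(source_task=controller, name="pipeline re-run")
Task.enqueue(cloned, queue_name="services")
```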
Hi @<1790915053747179520:profile|KindParrot86> , currently Slack alerts are available as an example for the open-source version - None
You can write an adapter for it to send emails instead of Slack alerts
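As a very rough sketch of such an adapter (the send_email helper and SMTP details are hypothetical - you'd call it wherever the example currently posts to Slack):
```
import smtplib
from email.message import EmailMessage

def send_email(subject: str, body: str) -> None:
    # Hypothetical replacement for the Slack post call in the example
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = "alerts@example.com"
    msg["To"] = "team@example.com"
    msg.set_content(body)

    with smtplib.SMTP("smtp.example.com", 587) as smtp:
        smtp.starttls()
        smtp.login("alerts@example.com", "app-password")
        smtp.send_message(msg)
```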
Hi @<1718799873618219008:profile|FunnyPeacock68> , you can set this up in the clearml.conf of the running agent
None
Adding a custom engine example is on the 'to do' list but if you manage to add a PR with an example it would be great 🙂
This is also used in automated scenarios, and to cope with possible network issues the retry is built in - it's a good compromise, basically making the SDK resilient to network issues. The error you're getting is a failure to connect, unrelated to the credentials...
OddShrimp85 , Hi 🙂
I'm afraid that the only way to load the contents of setup A into setup B is to perform a data merge.
This process basically requires merging the databases (MongoDB, Elasticsearch, files, etc.). I think it's something that can be done as a service in the paid version, but not in the open one.
Hi @<1618418423996354560:profile|JealousMole49> , why not just use different datasets? Just to make sure I'm understanding correctly - you have the data duplicated both on s3 and locally?
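If separate datasets do fit your case, a minimal sketch (names and paths are placeholders):
```
from clearml import Dataset

# One dataset per data source/version instead of duplicating the content
ds = Dataset.create(dataset_name="my-dataset", dataset_project="examples")
ds.add_files(path="/path/to/local/data")
ds.upload()    # uploads to the configured storage (e.g. s3)
ds.finalize()
```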