I'm not sure. Maybe @<1523701087100473344:profile|SuccessfulKoala55> can help 🙂
Hi @<1584716373181861888:profile|ResponsiveSquid49> , what optimization method are you using?
It would work from your machine as well, but the machine needs to be turned on - like an EC2 instance that is running.
Hi GentleSwallow91 ,
- When using Jupyter notebooks it's best to call `task.close()` - it will have the same effect you're interested in.
- If you would like to upload to the server you need to add the `output_uri` parameter to your `Task.init()` call. You can read more here - https://clear.ml/docs/latest/docs/references/sdk/task#taskinit
You can either set it to True or provide a path to a bucket. The simplest usage would be `Task.init(..., output_uri=True)`
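For example, a minimal sketch (project and task names are placeholders):
```python
from clearml import Task

# output_uri=True uploads models/artifacts to the ClearML file server;
# you could instead pass a bucket URI such as "s3://my-bucket/models"
task = Task.init(project_name="examples", task_name="notebook demo", output_uri=True)

# ... training code ...

# in a Jupyter notebook, close the task explicitly when you are done
task.close()
```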
Hi MagnificentWorm7 ,
I'm not sure I understand. You're trying to upload files to a dataset from different concurrent processes?
Is it the services docker that comes with the docker compose or did you run your own agent?
Hi @<1556450111259676672:profile|PlainSeaurchin97> , I think this is what you are looking for:
None
BTW - did the agent print out anything? Which version of clearml-agent are you using?
Adding a custom engine example is on the 'to do' list but if you manage to add a PR with an example it would be great 🙂
Hi ElegantCoyote26 ,
It doesn't seem that using port 8080 is mandatory and you can simply change it when you run ClearML-Serving - i.e. `docker run -v ~/clearml.conf:/root/clearml.conf -p 8085:8085`
My guess is that the example uses port 8080 because usually the ClearML backend and the Serving would run on different machines
RotundSquirrel78 , try going into localhost:8080/login
I would guess so. Try `sudo docker logs --follow trains-webserver`
Please filter by 'XHR' and see if there are any errors (any return codes that aren't 200)
Hi DeterminedCrocodile36 ,
To use a custom engine you need to change the preprocessing code (the Preprocess class).
https://github.com/allegroai/clearml-serving/tree/main/examples/pipeline
Section 3 is what you're interested in
And here is an example of the code you need to change. I think it's fairly straightforward.
https://github.com/allegroai/clearml-serving/blob/main/examples/pipeline/preprocess.py
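As a rough sketch of the kind of change involved (method names follow the linked example; the exact custom-engine interface may differ slightly between clearml-serving versions):
```python
from typing import Any

# Sketch of a clearml-serving Preprocess class with a custom "process" step
class Preprocess(object):
    def __init__(self):
        # called once when the endpoint is loaded
        pass

    def preprocess(self, body: dict, state: dict, collect_custom_statistics_fn=None) -> Any:
        # convert the incoming request body into the input your engine expects
        return body.get("x", 0)

    def process(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> Any:
        # with a custom engine, your own inference logic goes here
        return data * 2

    def postprocess(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> dict:
        # convert the engine output back into a JSON-serializable response
        return {"y": data}
```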
ReassuredTiger98 , I played with it myself a little bit - it looks like this happens for me when an experiment is running and reporting images, and changing the metric reproduces it. Maybe open a GitHub issue to follow up on this 🙂 ?
ReassuredTiger98 , does it happen on any experiment with debug images or only on running experiments?
You're totally right - if you managed to upload to a bucket, then the folder failure should be unrelated to permissions.
Hi LethalCentipede31 , I don't think there is an out of the box solution for this but saving them as debug samples sounds like a good idea. You can simply report them as debug samples and that should also work 🙂
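For example, a minimal sketch (titles, series and file paths are placeholders):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="debug samples demo")
logger = task.get_logger()

# report an image file as a debug sample; it shows up under DEBUG SAMPLES in the UI
logger.report_image(title="predictions", series="sample", iteration=0, local_path="pred_0.png")

# non-image files (html, audio, etc.) can be reported with report_media
logger.report_media(title="predictions", series="report", iteration=0, local_path="report.html")
```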
Hi @<1552101474769571840:profile|DepravedLion86> , do you have something that reproduces this behavior?
Hi @<1572032783335821312:profile|DelightfulBee62> , I think 1 TB should be enough. I would suggest maybe even having 2 TB just to be on the safe side
Hi @<1523701122311655424:profile|VexedElephant56> , can you please elaborate a bit more on how you set up the server? Is it on top of a VPN? Is there a firewall? Is it a simple docker compose or on top of K8s?
Hi @<1535069219354316800:profile|PerplexedRaccoon19> , the agent will try to use the relevant python version according to what the experiment ran on originally. In general, it's best to run inside dockers with a docker image specified per experiment 🙂
@<1523704157695905792:profile|VivaciousBadger56> , ClearML's model repository is exactly for that purpose. You can basically use InputModel and OutputModel for handling models in relation to tasks
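For example, a minimal sketch (model, project and file names are placeholders; it assumes a recent clearml SDK where InputModel can be looked up by name):
```python
from clearml import Task, InputModel, OutputModel

task = Task.init(project_name="examples", task_name="model repo demo")

# register an existing model from the repository as this task's input model
input_model = InputModel(name="my-pretrained-model", project="examples")
task.connect(input_model)

# register newly trained weights as this task's output model
output_model = OutputModel(task=task, name="my-finetuned-model")
output_model.update_weights(weights_filename="model.pt")
```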
Hi @<1698868530394435584:profile|QuizzicalFlamingo74> , Try compression=False
Hi @<1580367711848894464:profile|ApprehensiveRaven81> , I'm afraid this is the only option for the open source version. In the Scale/Enterprise licenses there are SSO/LDAP integrations
Hi, I think you can get that from support@clear.ml
What ClearML version are you on?
I think this is what you're looking for, tell me if it helps 🙂
I think it depends on your implementation. How are you currently implementing the top-X checkpoints logic?
Then I think `users.get_all` would be right up your alley 🙂
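For example, a rough sketch using the API client (assuming credentials are already configured in clearml.conf, and that the returned user objects expose id and name fields):
```python
from clearml.backend_api.session.client import APIClient

# query the users.get_all endpoint on the ClearML server
client = APIClient()
users = client.users.get_all()
for user in users:
    print(user.id, user.name)
```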