(torchvision vs. cuda compatibility, will work on that),
The agent will pull the correct torch based on the cuda version that is available at runtime (or configured via the clearml.conf)
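For reference, a minimal sketch of pinning the CUDA version in clearml.conf instead of relying on runtime auto-detection (the key sits under the agent section; the version value here is just an example):

```
agent {
    # force the CUDA version the agent assumes when resolving torch wheels
    # (by default it is auto-detected on the machine at runtime)
    cuda_version: "11.2"
}
```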
We're not using a load balancer at the moment.
The easiest way is to add ELB and have amazon add the httpS on top (basically a few clicks on their console)
Closing the dataset doesn't work: `dataset.close()` raises `AttributeError: 'Dataset' object has no attribute 'close'`
Hi @<1523714677488488448:profile|NastyOtter17> could you send the full exception?
logger.report_scalar("loss", "train", iteration=0, value=100)
logger.report_scalar("loss", "test", iteration=0, value=200)
I think I found something, let me dig deeper 🙂
BTW: how did it get there ?
I want to store only my raw data in my blob storage, and I want to create a Hyperdataset with all the artifacts, metrics, frames,
Yes that's exactly how it works.
None
This line adds a reference to raw file (local/remote)
[https://github.com/allegroai/clearml/blob/1b474dc0b057b69c76bc2daa9eb8be927cb25efa[…]es/hyperdatasets/data-registration/register_dataset_wit...
Thanks PompousBaldeagle18 !
Which software did you use to create the graphics?
Our designer, should I send your compliments 😉 ?
You should add which tech is being replaced by each product.
Good point! we are also missing a few products from the website, they will be there soon, hence the "soft launch"
let's call it an applicative project, which has experiments, and an abstract/parent project (or some other name) that groups applicative projects.
That was my way of thinking, the guys argued it will soon "deteriorate" into the first option :)
Sure LazyTurkey38 here's a nice hack for that:
```
# code here
task.execute_remotely(queue_name=None, clone=False, exit_process=False)
# patch the Task and actually send it for execution
if Task.running_locally():
    task.update_task(task_data={'script': {'branch': 'new_branch', 'repository': 'new_repo'}})
    # now to actually enqueue the Task
    Task.enqueue(task, queue_name='default')
```
You can also clear the git diff by passing "diff": ""
wdyt?
or by trains
We just upload the image as is ... I think this is SummaryWriter issue
okay that makes sense, if this is the case I would just use clearml-agent execute --id <task_id here> to continue the training Task.
Do notice you have to reload your last checkpoint from the Task's models/artifacts to continue 🙂
Last question: what is the HPO optimization algorithm? Is it just grid/random search, or BOHB/Optuna? If it is the latter, how do you make it "continue"?
Hi WickedStarfish97
As a result, I don’t want the Agent to parse what imports are being used / install dependencies whatsoever
Nothing to worry about here: even if the agent detects the python packages, they are installed on top of the preexisting packages inside the docker. That said, if you want to override it, you can also pass packages=[]
What do you have in "server_info['url']" ?
I see,
@<1571308003204796416:profile|HollowPeacock58> can you please send the full log?
(The odd thing is it is trying to install the python 3.10 version of torch, when your command line suggest it is running python 3.8)
Any plans to add unpublished state for clearml-serving?
Hmm OddShrimp85 do you mean like a flag, "not being served"?
Should we use archive ?
The publish state, basically locks the Task/Model so they are not to be changed, should we enable unlocking (i.e. un-publish), wdyt?
time.sleep(time_sleep)
You should not call time.sleep in async functions; it should be asyncio.sleep,
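A minimal self-contained sketch of why: `asyncio.sleep` yields to the event loop so concurrent tasks overlap, while `time.sleep` would block the whole loop (function names here are made up for the illustration):

```python
import asyncio
import time

async def good_wait(t):
    # non-blocking: yields control to the event loop while waiting
    # (time.sleep(t) here would block the loop instead)
    await asyncio.sleep(t)

async def main():
    start = time.monotonic()
    # two concurrent 0.1s sleeps overlap instead of running back-to-back
    await asyncio.gather(good_wait(0.1), good_wait(0.1))
    return time.monotonic() - start

elapsed = asyncio.run(main())
# with time.sleep the total would be ~0.2s; with asyncio.sleep it is ~0.1s
```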
None
See if that makes a difference
Hi @<1556812486840160256:profile|SuccessfulRaven86>
Please notice that clearml-serving is not designed for public exposure; it lacks a security layer and is designed for easy internal deployment. If you feel you need the extra security layer, I suggest either adding external JWT-like authentication, or talking to the ClearML people; their paid tiers include enterprise-grade security on top
Interesting, if this is the issue, a simple sleep after reporting should prove it. Wdyt?
BTW are you using the latest package? What's your OS?
Hi @<1694157594333024256:profile|DisturbedParrot38>
You mean how to tell the agent to pull only some submodules of your git?
If this is the case you can actually remove them on your git branch, submodule is a file with a soft link. Wdyt?
I'm saying that because in the task under "INSTALLED PACKAGES" this is what appears
This is exactly what I was looking for. Thanks!
Yes that makes sense, I think this bug was fixed a long time ago, and this is why I could not reproduce it.
I also think you can use a later version of clearml 🙂
Think I will have to fork and play around with it
NICE! (BTW: if you manage to get it working I'll be more than happy to help push the PR)
Maybe the quickest win is to store just the .py as model ?
Ohhh I see, yes this is regexp matching. If you want the exact match: `'^{}$'.format(name)`
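A quick illustration of why the anchors matter, in plain Python `re` (the model names are made up):

```python
import re

name = "my_model"

# without anchors, the pattern also matches any name that starts with it
loose = bool(re.match(name, "my_model_v2"))
# '^{}$'.format(name) anchors both ends, so only the exact name matches
exact_miss = bool(re.match('^{}$'.format(name), "my_model_v2"))
exact_hit = bool(re.match('^{}$'.format(name), "my_model"))
```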
You do not need the cudatoolkit package, this is automatically installed if the agent is using conda as package manager. See your clearml.conf for the exact configuration you are running
https://github.com/allegroai/clearml-agent/blob/a56343ffc717c7ca45774b94f38bd83fe3ce1d1e/docs/clearml.conf#L79
What do you mean by "tag" / "sub-tags"?
Hi PanickyMoth78
Hmm yes, I think the StorageManager (i.e. the google storage python client) also needs a json file with the credentials.
Let me check something
Basically it solves the remote-execution problem, so you can scale to multiple machines relatively easy :)
PleasantGiraffe85 you can disable the SSL verification on the client end:
https://github.com/allegroai/clearml-agent/blob/21c4857795e6392a848b296ceb5480aca5f98e4b/docs/clearml.conf#L12
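Concretely, that is the verify_certificate flag in the api section of clearml.conf (a sketch; the rest of the section is omitted):

```
api {
    # skip SSL certificate validation on client calls to the servers
    verify_certificate: false
}
```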
Basically you can just manually create the clearml.conf with only the following:
```
api {
    api_server:
    web_server:
    files_server:
    credentials {"access_key": "EGRTCO8JMSIGI6S39GTP43NFWXDQOW", "secret_key": "x!XTov_G-#vspE*Y(h$Anm&DIc5Ou-F)jsl$PdOyj5wG1&E!Z8"}
}
# verify...
```