Thanks Martin, so does it mean I won't be able to see the data hosted on the S3 bucket in the ClearML dashboard under the Datasets tab after registering it?
Sure you can. Let's assume you have everything in your local /mnt/my/data: you can just add this folder with add_files, then upload to your S3 bucket with upload() (see the sketch below).
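A minimal sketch of that flow with the Dataset API; the dataset/project names and the bucket are placeholders, and note the upload keyword is output_url in recent clearml releases:

```python
from clearml import Dataset

# names and bucket below are hypothetical placeholders
dataset = Dataset.create(dataset_name="my_dataset", dataset_project="my_project")
dataset.add_files(path="/mnt/my/data")                # register the local folder
dataset.upload(output_url="s3://my-bucket/datasets")  # push the data to S3
dataset.finalize()                                    # close this version
```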
make sense ?
But I think this error has only appeared since I upgraded to version 1.1.4rc0
Hmm let me check something
and they don't know how to write code, is this still possible?
well this means there is some standard for the data, right? what is that standard? unfortunately in our space there is no standard for data, it's just too generic, so everyone always ends up with custom parsing of some sort.
Does that make sense ?
Hi JitteryCoyote63
Just making sure, the package itself is installed as part of the "Installed packages", and it also installs a command line utility ?
Thanks PompousBeetle71
Quick question, what frameworks are you using?
Do you call the save method directly on a file stream (or any other direct storage)?
Hmm, I think the issue is here (the docker command mount): '-v', '/tmp/.clearml_agent.de0n48pm.cfg:/root/clearml.conf'
creating a dataset with parents worked very well and produced great visuals on the UI!
woot woot!
I tried the squash solution, however this somehow caused a download of all the datasets into my
so this actually works, kind of like git squash. Bottom line: it will download the data from all the squashed versions and repackage it into a single new version. Make sense ?
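For reference, a minimal sketch of the squash call; the dataset name and version ids are placeholders:

```python
from clearml import Dataset

# squash several dataset versions into a single new version
squashed = Dataset.squash(
    dataset_name="my_dataset_squashed",
    dataset_ids=["<version_a_id>", "<version_b_id>"],  # placeholder ids
)
```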
PompousBeetle71, basically resetting an experiment will clear all the outputs; the input model is, well, input, so it is not cleared. In the next execution it will be overridden. There is actually a way to change it from the UI and override the initial model weights.
Hi @<1556812486840160256:profile|SuccessfulRaven86>
I'm assuming this relates to the SaaS service.
API calls are a way to measure usage: basically metric reports are bunched into a single call, agent pings / queries are API calls, and so on and so forth.
How many hours did you have training tasks reporting data? How many agents were running, and so on?
DrabCockroach54 that is quite cool!
Basically here is what I would do
Query Tasks that are both Running and do not have the system tag "development" (that means they are running on agents), and filter only tasks that started, say, 10 min ago. Then go over the list and see if (1) they have a GPU scalar reported (meaning the GPU is accessible) and (2) the min/max values of GPU utilization are under 5%. wdyt?
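A rough sketch of what that could look like with the SDK; the scalar title/series names (":monitor:gpu", "gpu_0_utilization") and the started field are assumptions, so double-check them against your server:

```python
from datetime import datetime, timedelta
from clearml import Task

# fetch running tasks that are NOT tagged "development" (i.e. agent runs);
# in task_filter, a "-" prefix excludes a system tag
tasks = Task.get_tasks(
    task_filter={
        "status": ["in_progress"],
        "system_tags": ["-development"],
    }
)
for task in tasks:
    started = task.data.started  # assumption: backend object exposes a started datetime
    if started and datetime.now(tz=started.tzinfo) - started < timedelta(minutes=10):
        continue  # started less than ~10 minutes ago, skip
    scalars = task.get_last_scalar_metrics()
    gpu = scalars.get(":monitor:gpu", {}).get("gpu_0_utilization", {})
    if not gpu:
        print(f"{task.id}: no GPU scalar reported (GPU may not be accessible)")
    elif gpu.get("max", 0) < 5:
        print(f"{task.id}: GPU utilization under 5%, likely idle")
```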
Sure thing. Anyhow we will fix this bug, so in the next version there will be no need for a workaround (the workaround will still hold though, so you won't need to change anything)
Hi @<1532532498972545024:profile|LittleReindeer37>
Yes, you are correct, it should capture the entire Jupyter notebook in SageMaker Studio.
Just verifying this is the use case, correct ?
okay thanks! let's pick it up on GitHub
We have a wrapper over Task to ensure S3 usage, tags, version number etc., and the project name can be skipped so it is picked up from the env var
Cool. Notice that when you clone the Task and the agent executes it, the project is already defined, so this env variable is meaningless, no ?
if project_name is None and Task.current_task() is not None:
    project_name = Task.current_task().get_project_name()

This should have fixed it, no?
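In context, a hedged sketch of how such a wrapper could look; the MY_PROJECT_NAME env var and the fallback default are hypothetical:

```python
import os
from clearml import Task

def init_task(task_name, project_name=None):
    # when an agent runs a cloned Task, the project is already defined
    if project_name is None and Task.current_task() is not None:
        project_name = Task.current_task().get_project_name()
    # otherwise fall back to the env var (hypothetical name)
    if project_name is None:
        project_name = os.environ.get("MY_PROJECT_NAME", "default_project")
    return Task.init(project_name=project_name, task_name=task_name)
```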
Could it be it was never allocated to begin with ?
AstonishingSeaturtle47 I think there's a workaround for the GitHub multiple repo issue. See https://gist.github.com/gubatron/d96594d982c5043be6d4
Seems like something is not working with the server, i.e. it cannot connect with one of the docker containers.
May I suggest carefully going through all the steps here, making sure nothing was missed:
https://github.com/allegroai/trains-server/blob/master/docs/install_linux_mac.md
Especially number (4)
IrritableJellyfish76 if this is the case, my question is what is the reason to use Kubeflow? (spinning up a JupyterLab server is a good answer, for example; pipelines, in my opinion, are a lot less so)
Try removing this magic environment variable that tells the sub-process there was already an initialized Task:
import os
env = dict(**os.environ)
env.pop('TRAINS_PROC_MASTER_ID', None)
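For completeness, a minimal sketch of launching the sub-process with the cleaned environment; worker.py is a hypothetical placeholder for your child script:

```python
import os
import subprocess

# copy the parent environment and drop the marker so the child
# process does not think a Task was already initialized
env = dict(**os.environ)
env.pop('TRAINS_PROC_MASTER_ID', None)
subprocess.Popen(['python', 'worker.py'], env=env)  # worker.py is a placeholder
```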
Are you getting the error from boto failing to launch additional EC2 instances ?
What exactly do you get automatically on the "Installed Packages" (meaning the "my_package" line)?
Nooooooooooooooooooooooo
NICE! MoodyCentipede68 this is awesome
wouldn't it be possible to store this information in the ClearML server so that it can be implicitly added to the requirements?
I think you are correct, and if we detect that we are using pandas to upload an artifact, we should try and make sure it is listed in the requirements
(obviously this is easier said than done)
And if instead I want to force "get()" to return the path (e.g. I want to read the CSV with a library that is not pandas), do we have an option for that?
Yes, c...
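For reference, a minimal sketch of getting the raw path instead of the deserialized object; the task id and artifact name are placeholders. get() deserializes the object, while get_local_copy() returns a local file path you can open with any library:

```python
from clearml import Task

task = Task.get_task(task_id="<task-id>")           # placeholder id
csv_path = task.artifacts["data"].get_local_copy()  # local path, not a DataFrame
with open(csv_path) as f:
    header = f.readline()
```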
Hi @<1631102016807768064:profile|ZanySealion18>
I'm using SSH for authentication; however, known_hosts doesn't seem to be passed to the docker so it prompts for authentication/fingerprint. Any ideas?
Hmm, it is supposed to automatically mount your ~/.ssh folder into the docker container to solve that.
First try to set force_git_ssh_protocol: true
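That setting lives in the agent section of clearml.conf; a minimal sketch of the relevant snippet:

```
agent {
    # clone git repositories over SSH instead of HTTPS
    force_git_ssh_protocol: true
}
```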
If that does not he...