Hi SuccessfulKoala55 , would they need the fileserver to route to minio then? E.g.
This will ensure that anything written by clearml-data, as well as any models, is saved into the S3 object store.
api {
    files_server: s3://ecs.ai:80/clearml-data/default
}
aws {
    s3 {
        credentials {
            host: http://ecs.ai:80
            ## Insert the IAM credentials provided by your SAs here.
        }
    }
}
But if users forget to do the above, their outputs will be saved on the ClearML server. If I switch off f...
Hi,
Basically, I ran this block first and then ended the script.
task = Task.init(project_name="afro-nmt", task_name=args.taskname, continue_last_task=args.taskid)
Logger.current_logger().report_scalar(title="BLEU", series="JW300", value=args.jwbleu, iteration=args.lastiter)
Then I ran another script, with a different series.
task = Task.init(project_name="afro-nmt", task_name=args.taskname, continue_last_task=args.taskid)
Logger.current_logger().report_scalar(title="BLEU",series="SS900",value=arg...
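For reference, a minimal sketch of the two-script pattern described above, assuming the `clearml` package is installed and configured and that both runs pass the same task ID. `scalar_report` and `send_report` are hypothetical helper names, not part of ClearML; the project name, title, and series names come from the snippets above, and the numeric values are placeholders.

```python
from typing import Any, Dict


def scalar_report(series: str, value: float, iteration: int) -> Dict[str, Any]:
    """Build the keyword arguments for one BLEU point (hypothetical helper)."""
    return {"title": "BLEU", "series": series, "value": value, "iteration": iteration}


def send_report(task_name: str, task_id: str, report: Dict[str, Any]) -> None:
    """Re-open an existing task and append one scalar point to it.

    Assumes clearml is installed and configured; it is imported lazily so
    that scalar_report can be used without it.
    """
    from clearml import Task

    task = Task.init(
        project_name="afro-nmt",
        task_name=task_name,
        continue_last_task=task_id,  # task ID string of the run to continue
    )
    task.get_logger().report_scalar(**report)
    task.close()


# Placeholder values: same title, different series.
first = scalar_report("JW300", 31.2, 1000)
second = scalar_report("SS900", 28.7, 1000)
```

Each script would call `send_report` once with its own report; keeping the title identical and changing only the series is what distinguishes the two runs' points.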
Hi FriendlySquid61, AgitatedDove14, the problem and a possible fix are described in this issue: https://github.com/allegroai/clearml-agent/issues/51
Yeah, that sounds good. But from the user's perspective, especially for untrained users, they wouldn't know what to point to. For example, some may think it's an exe, some think it's a zip bundle, and others think it's any GitHub repo with the word vscode.
For example, it would be useful to integrate https://github.com/whylabs/whylogs#features into ClearML as part of data and model monitoring. WhyLogs would have its own static page that would preferably be displayed as a new custom tab (besides logs, scalars and plots).
If you can directly access the machine running the agent, yes you could. If not, a reverse proxy is in the works.
Hi AgitatedDove14, I might have misunderstood your previous comment above. Do you mean that, regardless of whether xforwarding is configured, clearml-session can only work if we have direct access to the Kubernetes worker when we run the K8S glue?
We did some testing today and clearml-session tried to tunnel with a k8s cluster ip, and thus failed.
If we set up an ingress with Me...
Hi AgitatedDove14 , thanks.
In this case I am running the k8s glue (machine glue), which will then spawn off pods on the Kubernetes worker (machine worker). So when you say direct access, are you referring to the Glue machine or the K8S Worker machine?
They don't have the same version. I do notice that if the client is using version 3.8, remote execution will try to use that same version even though the docker image does not have that version installed.
Unfortunately, due to security, clients can't have direct access to the nodes. Are there any possible workarounds at the moment?
Thanks TimelyPenguin76 , is there an env var for the S3 connection as well?
Ok. The problem was resolved with the latest versions of clearml-agent and clearml.
Hi TimelyPenguin76 ,
If you notice in the last screenshot, it states the bucket name to be http://ecs.ai . It then tries to open http://s3.amazonaws.com/ecs.ai/clearml-models/artifact/uploading_file?X-Amz-Algorithm= ....
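A small sketch of why the host may end up being read as the bucket name. This is not ClearML's actual parser; it only illustrates the common convention, assumed here, that an explicit port marks the authority as an endpoint (MinIO/ECS style), while without a port the authority is taken as an AWS bucket name. `split_s3_uri` is a hypothetical helper.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> dict:
    """Split an s3:// URI into endpoint, bucket, and key.

    Assumption: when the authority carries an explicit port it is an
    endpoint and the first path segment is the bucket; without a port
    the authority itself is taken as the bucket name.
    """
    parsed = urlparse(uri)
    segments = [s for s in parsed.path.split("/") if s]
    if parsed.port is not None:
        return {
            "endpoint": f"{parsed.hostname}:{parsed.port}",
            "bucket": segments[0] if segments else None,
            "key": "/".join(segments[1:]),
        }
    return {"endpoint": None, "bucket": parsed.netloc, "key": "/".join(segments)}


with_port = split_s3_uri("s3://ecs.ai:80/clearml-models/artifact")
without_port = split_s3_uri("s3://ecs.ai/clearml-models/artifact")
```

Under that assumption, the URI without a port yields `ecs.ai` as the bucket, which matches the request going to s3.amazonaws.com with `ecs.ai` as the first path segment.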
Hi, I will have to get back to you again. I need to check every client's repo to test your hypothesis.
Hi erez, I think I would want to reference the code that transformed the data. For example, say I receive 10k images, perform some transformation, and save the result as the next version before splitting it up for my ML training. Some time later, I receive a new set of 10k images and want to apply the same transformation and then append them to the previous 10k as another version. Clearml-data does well for the data-versioning part, but in terms of data provenance, it's not clear how I can associate t...
What type of pipeline steps are you running? From task, decorator or function?
We are trying with 'from task' at the moment. But the question applies to all methods.
If they're all running on the same container why not make them the same task and do things in parallel?
The tasks were created by different teams, and their content is rather independent and modular. Using them is usually optional. For example, task1 performs 'image whitening', task2 performs 'image resize'.
Ok, I get the logic now. extra_docker_shell_script executes before clearml-agent talks to the ClearML server.
AgitatedDove14 , will these be fixed?
Passing env via the code
Passing env via template yaml
What's the diff between template-yaml and --overrides-yaml? I used the latter to ensure the gpu is passed in.
I'm also noticing a lot of this while the k8s glue is running. Ex:
Expecting value: line 1 column 1 (char 0)
K8S Glue pods monitor: Failed parsing kubectl output:
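For context, "Expecting value: line 1 column 1 (char 0)" is the message Python's `json` module raises when asked to decode an empty string. The sketch below only reproduces that message; it is not the glue's code, and `parse_kubectl_json` is a hypothetical helper. It suggests, without proving, that the kubectl call returned empty or non-JSON output.

```python
import json


def parse_kubectl_json(raw: str):
    """Parse kubectl output as JSON; return (data, error_message)."""
    try:
        return json.loads(raw), None
    except json.JSONDecodeError as err:
        return None, str(err)


# Empty output reproduces the message seen in the glue log.
_, empty_error = parse_kubectl_json("")
# Well-formed JSON parses without an error.
data, ok_error = parse_kubectl_json('{"items": []}')
```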
thanks SuccessfulKoala55 . I verified your last comment and it works.
Hi, yes, still getting the SSL errors. It looks like some incompatibility with the OS SSL libraries.
From the ClearML perspective, how would we enable this, considering we don't have direct control over, or even the IPs of, the agents?
Hi CostlyOstrich36 , thanks. I will check with the Enterprise team then.
Hi, how can I call Task.init() within these subprocesses without write access to the 3rd party scripts and Python executables?
Does the enterprise version support this natively?
Thanks CostlyOstrich36, how do I know how the parts are indexed in the first place? Or rather, how are chunks and parts defined? Say in the context of images, videos, text documents, etc.
Hi, when i tried ip:port, it references the right host and bucket....BUT... the file is not found on the ECS S3 even though i can see from the logs that it states Completed model upload to s3://ecs.ai:80/clearml-models/artifacts/ ...