
Hi ItchyJellyfish73
The behavior should not have changed. "force_repo_requirements_txt" was always a "catch-all" option to force a behavior on an agent, and it should generally be avoided.
That said, I think there was an issue with v1.0 (clearml-server) where clearing the "Installed Packages" did not actually clear it, but set it to empty.
It sounds like the issue you are describing. Could you upgrade the clearml-server and test?
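For reference, a sketch of where that flag lives in the agent's clearml.conf (illustrative; the exact layout may vary between versions):
```
agent {
    package_manager {
        # when true, the agent ignores the Task's "Installed Packages"
        # and installs from the repository's requirements.txt instead
        force_repo_requirements_txt: true
    }
}
```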
How does ClearML select the reference branch? Could it be that ClearML only checks "origin"?
Yes 🙂 I think we can quickly fix that; I'm just trying to figure out whether there are downsides to running "git ls-remote --get-url" without "origin"
BeefyHippopotamus73 this error seems like it is coming from boto3. Are you sure the credentials are properly configured and that you have read permission?
ContemplativeGoat37
1. It seems the DNS resolution to the server fails ("Temporary failure in name resolution")?
2. Is this running on an agent, or manually?
3. "clearml.Task - WARNING - ### TASK STOPPED - USER ABORTED - STATUS CHANGED ###" Is this you manually aborting the Task, or is it aborting itself due to the connectivity failure?
4. What are the clearml/clearml-agent versions?
Archived is actually just a "flag" on the Task. If you actually want to delete it (including artifacts), right-click it in the archived view and select "Delete".
is removed from the experiment list?
You mean archived?
Hi @<1720249421582569472:profile|NonchalantSeaanemone34>
Sorry I missed this message. The reason it's not working is that the returned value is stored and passed using pickle, and unfortunately Python's pickle does not support storing lambda functions...
https://docs.python.org/3/library/pickle.html#id8
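A minimal standalone sketch of the limitation (nothing ClearML-specific here):
```
import pickle

def add_one(x):
    return x + 1

# a module-level function pickles fine: it is stored by qualified name
pickle.dumps(add_one)

# a lambda has no importable name, so pickling it fails
try:
    pickle.dumps(lambda x: x + 1)
except (pickle.PicklingError, AttributeError) as err:
    print(f"cannot pickle a lambda: {err}")
```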
That makes sense to me, what do you think about the following:
```
from clearml import PipelineDecorator

class AbstractPipeline(object):
    def __init__(self):
        pass

    @PipelineDecorator.pipeline(...)
    def run(self, run_arg):
        data = self.step1(run_arg)
        final_model = self.step2(data)
        self.upload_model(final_model)

    @PipelineDecorator.component(...)
    def step1(self, arg_a):
        # do something
        return value

    @PipelineDecorator.component(...)
    def step2(self, arg_b):
        # do ...
```
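If something along these lines were supported, usage might look like this (hypothetical, since the decorators do not currently target instance methods):
```
if __name__ == "__main__":
    pipeline = AbstractPipeline()
    pipeline.run(run_arg="some_input")  # placeholder argument
```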
Hi ContemplativePuppy11
This is a really interesting point.
Maybe you can provide a pseudo-class abstract of your current pipeline design; this will help in understanding what you are trying to achieve and how to make it easier to get there.
So "from foo.mod import" "translates" to "foo-mod @ git+..." ?
python version to be used and conda will install it
clearml does that automatically (albeit it is not shown in the UI, which should be fixed)
suspect permissions, but not entirely sure what and where
Seems like it.
Check the config file on the agent machine:
https://github.com/allegroai/clearml-agent/blob/822984301889327ae1a703ffdc56470ad006a951/docs/clearml.conf#L18
https://github.com/allegroai/clearml-agent/blob/822984301889327ae1a703ffdc56470ad006a951/docs/clearml.conf#L19
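Those two lines are the agent's git credentials; a sketch of setting them (values are placeholders):
```
agent {
    # used by the agent to clone private repositories
    git_user: "my-git-user"
    git_pass: "my-git-token-or-password"
}
```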
potential sources of slowdown in the training code
Is there one?
Hi @<1697056701116583936:profile|JealousArcticwolf24>
Awesome deployment 🤩
Yes, if you need another scalable model serving setup you can just run another instance of the clearml-serving-inference container:
https://github.com/allegroai/clearml-serving/blob/7ba356efc97a6ae2159283d198d981b3c1ab85e6/docker/docker-compose.yml#L77
So you end up with two of them, one per model's environ...
No they are not; they take the vscode backend and put it behind a webserver-ish layer.
Good question 🙂
```
from clearml import Task

Task.init('examples', 'test')
```
Hmm, yes, but then this is kind of a hacky solution... The original #340 was about packaging source code that was not in git... Now we want to add "data" (even if ephemeral) on top of it, no?
My thinking is to somehow make sure a Task can reference a "Dataset" that the agent downloads before the Task starts?!
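A rough sketch of the idea, using the existing Dataset API (the automatic "reference" resolution is the part that would be new; names are placeholders):
```
from clearml import Task, Dataset

task = Task.init(project_name="examples", task_name="uses ephemeral data")

# today this fetch is explicit in user code; the proposal is for the agent
# to resolve such a dataset reference automatically before the Task starts
local_copy = Dataset.get(
    dataset_project="examples", dataset_name="my_data"
).get_local_copy()
```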
Are you suggesting just taking the read_and_process_file function out of the read_dataset method?
Yes 🙂
As for the second option, you mean create the task in the __init__ method of the NetCDFReader class?
Correct
It would be a great idea to make the Task picklable,
Adding that to the next version's to-do list 🙂
I can then programmatically choose which file to import with importlib. Is there a way to tell clearml programmatically to analyze the files, so it can build up the requirements correctly?
Sadly no 😞
It analyzes the running code; then, if it decides it is not a self-contained script, it will analyze the entire repo...
I just saw that Task.create takes
Task.create is not Task.init. It is meant to allow you to create new Tasks (think Jobs) from ...
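A rough sketch of the distinction (all values are placeholders):
```
from clearml import Task

# Task.init registers the *currently running* script as an experiment
task = Task.init(project_name="examples", task_name="my experiment")

# Task.create only builds a new Task entry (a job definition) without
# executing anything locally; an agent can pick it up and run it later
job = Task.create(
    project_name="examples",
    task_name="remote job",
    repo="https://github.com/me/my_repo.git",
    script="train.py",
)
```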
Your git action's execution needs this file, just like your machine does, to know where the server is and how to authenticate. You have to manually pass it to your git action.
Another thought: see what happens if you remove the .save and .close calls and keep only the .show; maybe the close call somehow interferes with it?
Otherwise, if you can run one of the shap examples and see whether they fail in your setup, that is another avenue for reproducing the issue.
or
pip install -U trains
Are they ephemeral, or later used by other Tasks, executions etc.?
For example: configuration files are specific to an execution, and someone will edit them.
Initial weights files are something that multiple executions might need, and they will be used to restore an execution. Data, even if changing, is usually used by multiple executions/tasks etc.
It seems like you treat these files as "configurations", is that right?
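If they really are per-execution configurations, a minimal sketch of attaching one to the Task (the file name is a placeholder):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="with config")

# registers the file content with the Task, so it can be edited in the UI
# and the edited version is fetched when an agent re-runs the Task
config_path = task.connect_configuration("config.yaml", name="my config")
```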
The address is valid. If I just go to the file server's address in my browser,
@<1729309131241689088:profile|MistyFly99> what is the exact address of those files (including the http prefix), and what is the address of the web application?
EmbarrassedSpider34
sync_folder and upload, several times along the code, and then
Do notice they overwrite one another...
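For context, a sketch of that pattern (assuming the Dataset interface; names are placeholders):
```
from clearml import Dataset

dataset = Dataset.create(dataset_project="examples", dataset_name="my_data")

# each sync_folder() call makes the dataset mirror the folder's current
# state, so repeated calls replace rather than accumulate content
dataset.sync_folder(local_path="./data")
dataset.upload()
dataset.finalize()
```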
Yes, this is exactly how the clearml k8s glue works. Notice that the resource allocation (spinning nodes up/down) is done by k8s, which sometimes does take some time; if you only need "bare metal" nodes in the cloud, it might be more efficient to use the AWS autoscaler, which essentially does the same thing.
BTW: if you feel like writing a wrapper, it could be cool 🙂
Do I set the CLEARML_FILES_HOST to the endpoint instead of an s3 bucket?
Yes, you are right, this is not straightforward:
CLEARML_FILES_HOST="s3://minio_ip:9001"
Notice you must specify the port; this is how it knows it is not AWS. I would avoid using an IP and instead register the minio machine as a host in your local DNS / firewall. This way, if you change the IP the links will not get broken 🙂
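On the client side, the matching credentials section in clearml.conf might look like this (host, key and secret are placeholders):
```
sdk {
    aws {
        s3 {
            credentials: [
                {
                    # non-AWS endpoint: specify host with an explicit port
                    host: "minio_ip:9001"
                    key: "minio_access_key"
                    secret: "minio_secret_key"
                    multipart: false
                    secure: false
                }
            ]
        }
    }
}
```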
GiddyTurkey39 Hmm, I'm assuming that by default it cannot access that IP range.
Are you using VirtualBox for the VM?
EDIT:
Can I assume the machine running the VM (a.k.a. the host) can access the trains-server?