AgitatedDove14 I want to start with a CLI interface that allows me to add users to the trains server
Config == conf_obj
no?
whatttt? I looked at config_obj and didn't find any set method
the link to manual model registry doesn't work
Very nice thanks, I'm going to try the SA server + agents setup this week, let's see how it goes ✌
Do I need to copy this AWS scaler task to any project I want to have auto scaling on? What does it mean to enqueue the AWS scaler?
essentially editing apiserver.conf
section auth.fixed_users.users
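For reference, a minimal auth.fixed_users section in apiserver.conf might look like this (the usernames, passwords, and display names below are placeholders, not from the actual setup):

```
auth {
    fixed_users {
        enabled: true
        users: [
            {
                username: "jane"
                password: "change_me"
                name: "Jane Doe"
            },
            {
                username: "john"
                password: "change_me_too"
                name: "John Doe"
            }
        ]
    }
}
```

If I remember correctly, the server needs a restart to pick up the change.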
Martin: In your trains.conf, change the value files_server: 's3://ip:port/bucket'
Isn't this a client configuration (trains-init)? Shouldn't there be any change to the server configuration (/opt/trains/config...)?
I know I can configure the file server with trains-init - but that only touches the client side. What about the container on the trains server?
AgitatedDove14 clearml version on the Cleanup Service is 0.17.0
I'm trying it now
Sorry I meant this link
https://azuremarketplace.microsoft.com/en-us/marketplace/apps/apps-4-rent.clearml-on-centos8
The google storage package could be the cause, because indeed we have the env var set, but we don't use the google storage package
Makes sense
So I assume trains expects I have nvidia-docker installed on the agent machine?
Moreover, since I'm going to use Task.execute_remotely (and not go through the UI), is there any programmatic way to specify the docker image to be used?
Mmm maybe, let's see if I get this straight:
A static artifact is a one-time upload; a dynamic artifact is an object I can change during the experiment -> either way, at the end of the experiment the object is saved under a given name, regardless of whether it was dynamic or not?
So dynamic and static are basically the same thing, except that with dynamic I can edit the artifact while the experiment is running?
Second, why would it be overwritten if I run a different run of the same experiment? As I saw, each object is stored under a directory with the task ID which is unique per run, so I assume I won't be overriding artifacts which are saved under the same name in different runs (regardless of static or dynamic)
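To illustrate why I don't expect collisions: since each run gets its own task ID, the same artifact name should land in a different directory per run. A toy sketch of that layout (the path structure here is my assumption, not clearml's actual code):

```python
import posixpath
import uuid

def artifact_path(project: str, task_id: str, artifact_name: str) -> str:
    """Build a task-ID-scoped storage path, so identical artifact
    names from different runs never collide."""
    return posixpath.join(project, f"task_{task_id}", "artifacts", artifact_name)

# Two runs of the same experiment get distinct task IDs...
run_a, run_b = uuid.uuid4().hex, uuid.uuid4().hex

# ...so the same artifact name maps to two different locations.
path_a = artifact_path("my_project", run_a, "predictions.csv")
path_b = artifact_path("my_project", run_b, "predictions.csv")
assert path_a != path_b
```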
I was referring to the object returned by Task.artifacts['...'] when I call .get
I understand what I get; I'm asking because I want to see how the object I'm calling .get on behaves
Cool - what kind of objects are returned by .artifacts.__getitem__? I want to check their docs
I only found Project ID, which I'm not sure what this refers to - I have the project name
Okay, so in the first part of the code we define a callback that we add to our steps, so later we can collect them and attach the results to the pipeline task. It looks something like this:
```python
import clearml

class MedianPredictionCollector:
    _tasks_to_collect = list()

    @classmethod
    def collect_description_tables(cls, pipeline: clearml.PipelineController, node: clearml.PipelineController.Node):
        # Collect tasks
        cls._tasks_to_collect.append(node.executed)

    @classmethod...
```
AgitatedDove14 I really don't know how this is possible... I tried upgrading the server, tried whatever I could
As for small toy code to reproduce - I just don't have time for that, but I will paste the callback I'm using below. This is the overall logic, so you can replicate it and use my callback:
From the pipeline task, launch some sub-tasks, and put the .collect_description_tables method from my callback class (attached below) in their post_execute_callback. Run t...
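Stripped of the clearml dependency, the collector pattern I'm describing is roughly this (the FakeNode stand-in is hypothetical; the real callback receives a clearml.PipelineController.Node):

```python
from dataclasses import dataclass

@dataclass
class FakeNode:
    """Hypothetical stand-in for clearml.PipelineController.Node."""
    executed: str  # the executed task's ID

class MedianPredictionCollector:
    # Shared class-level list: every step's callback appends here.
    _tasks_to_collect = list()

    @classmethod
    def collect_description_tables(cls, pipeline, node):
        # Called as each step's post_execute_callback: remember the
        # executed task so results can be attached to the pipeline
        # task once all steps have finished.
        cls._tasks_to_collect.append(node.executed)

# Simulate two pipeline steps finishing.
for task_id in ("task-aaa", "task-bbb"):
    MedianPredictionCollector.collect_description_tables(None, FakeNode(task_id))

assert MedianPredictionCollector._tasks_to_collect == ["task-aaa", "task-bbb"]
```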
I just tried setting the conf in the section Martin said, it works perfectly
even though I apply append
could be 192.168.1.255?
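If the subnet is the common 192.168.1.0/24 (an assumption - substitute your actual network), the broadcast address can be double-checked with the stdlib:

```python
import ipaddress

# Broadcast address of a /24 network: all host bits set to 1.
net = ipaddress.ip_network("192.168.1.0/24")
print(net.broadcast_address)  # → 192.168.1.255
```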