Hi GrievingTurkey78
First, I would look at the CLI clearml-data
as a baseline for implementing such a tool:
Docs:
https://github.com/allegroai/clearml/blob/master/docs/datasets.md
Implementation:
https://github.com/allegroai/clearml/blob/master/clearml/cli/data/main.py
Regarding your questions:
(1) No, a new dataset version will only store the diff from the parent (if files are removed, it stores metadata saying the file was removed)
(2) Yes, any get operation will downl...
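For reference, a minimal sketch of how this maps to the Python Dataset API (the project/dataset names and paths below are just illustrative):

from clearml import Dataset

# a new version stores only the diff from its parent
parent = Dataset.get(dataset_project="examples", dataset_name="my_dataset")
child = Dataset.create(
    dataset_project="examples",
    dataset_name="my_dataset",
    parent_datasets=[parent.id],
)
child.add_files("data/new_files/")  # only these files are uploaded
child.upload()
child.finalize()

# any "get" downloads (and caches) the full content of that version locally
local_copy = Dataset.get(dataset_id=child.id).get_local_copy()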
Hi SharpDove45
what … suggested about how it fails on bad/missing credentials
Yes, this is correct; since you specifically set the hosts, worst case you will end up with wrong credentials 🙂
SharpDove45 FYI:
if you set the environment variable CLEARML_NO_DEFAULT_SERVER=1, it will make sure never to default to the demo server
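For example, a minimal sketch of setting it from Python, assuming it is set before clearml is imported (the project/task names are just placeholders):

import os

# never fall back to the demo server if no server is configured
os.environ["CLEARML_NO_DEFAULT_SERVER"] = "1"

from clearml import Task

task = Task.init(project_name="examples", task_name="no-default-server")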
GrotesqueDog77 one issue with this design: in order to run a sub-component, the call must be done from the parent component. Does that make sense?
def step_one(data):
    return data

def step_two(path):
    return model

def both_steps():
    path = step_one("stuff")
    return step_two(path)

def pipeline():
    both_steps()

Which would make both_steps a component and step_one and step_two sub-components
wdyt?
PungentLouse55 hmmm
Do you have an idea on how we could quickly reproduce it?
Hi @<1535069219354316800:profile|PerplexedRaccoon19>
What do you mean by simulate?
You can manually set up and run a Task if you need:
clearml-agent execute --id task_id (add --docker for docker mode)
This will set up the env and run the task
Would that go under arguments?
yes 🙂
Also what is the base path where the git repo is cloned? So if my repo is called myProject.git, what would the full path be?
For example https://github.com/<user>/myProject.git
btw: how come you do not have this field auto-populated from running the code locally or using the clearml-task CLI?
With default settings, to upload 2 datasets of 120 GB and 70 GB it took more than 6 hours!
SmugSnake6 at the end, is this an outcome of limited bandwidth or limited CPU?
You can already sort and filter experiments based on any hyperparameter or metric that the experiment reports, so there is no need for a custom query language. Also, any filtered/sorted table can be shared exactly as it is, so you can create leaderboards and share specific filters. You can also use the search bar to filter based on experiment name / comment. Tags will be added soon as well 🙂
Example of custom columns is here (the screen grab is a bit old, now there is als...
Thus, the return data from step 2 needs to be available somewhere to be used in step 3.
Yep 🙂
It will serialize the data on the dict?
I thought it will just point to a local file location where you have the data 🙂
I didn't know that each step runs in a different process
Actually! You can run them as functions as well, try:
if __name__ == '__main__':
    PipelineDecorator.debug_pipeline()
    # call pipeline function here
It will just run them as functions (ret...
Thanks GrievingTurkey78!
It seems that under the hood they use argparse
See here:
https://github.com/google/python-fire/blob/c507c093fa6622ab5efee21709ffbf25974e4cf7/fire/parser.py
Which means it might just work?!
What do you think?
GrievingTurkey78 sure, aws autoscaler can do that:
https://github.com/allegroai/clearml/blob/master/examples/services/aws-autoscaler/aws_autoscaler.py
Not yet 🙂
It should not be complex to implement; the actual AWS auto scaler class implements just two functions:
def spin_up_worker(self, resource, worker_id_prefix, queue_name):
https://github.com/allegroai/clearml/blob/e9f8fc949db7f82b6a6f1c1ca64f94347196f4c0/clearml/automation/auto_scaler.py#L104
def spin_down_worker(self, instance_id):
https://github.com/allegroai/clearml/blob/e9f8fc949db7f82b6a6f1c1ca64f94347196f4c0/clearml/automation/auto_scaler.py#L...
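As a rough sketch (not the official AWS implementation), a custom scaler could subclass that base class and fill in those two methods; the print calls below are placeholders for the actual cloud API calls:

from clearml.automation.auto_scaler import AutoScaler

class MyCloudAutoScaler(AutoScaler):
    def spin_up_worker(self, resource, worker_id_prefix, queue_name):
        # here you would call your cloud provider's API to launch an instance
        # and start a clearml-agent on it, listening on `queue_name` and using
        # a worker id that starts with `worker_id_prefix`
        print("spin up {} for queue {} ({})".format(resource, queue_name, worker_id_prefix))

    def spin_down_worker(self, instance_id):
        # here you would call your cloud provider's API to terminate the
        # instance running that worker
        print("spin down instance {}".format(instance_id))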
GrievingTurkey78 MagnificentSeaurchin79 do you guys want to start a PR branch we can all work on?
Hi WhimsicalLion91
You can always explicitly send a value:
from trains import Logger
Logger.current_logger().report_scalar("title", "series", iteration=0, value=1337)
A full example can be found here:
https://github.com/allegroai/trains/blob/master/examples/reporting/scalar_reporting.py
The main question I have is why is the ALB not passing the request, I think you are correct it never reaches the serving server at all, which leads me to think the ALB is "thinking" the service is down or is not responding, wdyt?
Notice that if you are using TB, everything you report to the TB will appear as well 🙂
NastyOtter17 can you provide some more info ?
WhimsicalLion91
What would you say the use case is for running an experiment with iterations?
That could be the loss value per iteration, or accuracy per epoch (iteration is just a name for the x-axis; in a sense this is equivalent to a time series)
Make sense?
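For instance, a small sketch of reporting a loss per iteration and an accuracy per epoch with the same report_scalar call shown above (the values are obviously made up):

from trains import Task, Logger

task = Task.init(project_name="examples", task_name="scalar reporting sketch")
logger = Logger.current_logger()
for epoch in range(10):
    loss = 1.0 / (epoch + 1)       # placeholder values
    accuracy = 0.5 + 0.04 * epoch
    logger.report_scalar("loss", "train", iteration=epoch, value=loss)
    logger.report_scalar("accuracy", "validation", iteration=epoch, value=accuracy)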
Hi StraightDog31
I am having trouble using the StorageManager to upload files to GCP bucket
Are you using the StorageManager directly? Or are you using task.upload_artifact?
Did you provide the GS credentials in the clearml.conf file, see example here:
https://github.com/allegroai/clearml/blob/c9121debc2998ec6245fe858781eae11c62abd84/docs/clearml.conf#L110
Notice both need to be str
btw, if you need the entire folder just use StorageManager.upload_folder
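For reference, a minimal sketch of both calls (the bucket name and local paths are just illustrative, and assume the GS credentials above are configured):

from clearml import StorageManager

# upload a single file
remote_url = StorageManager.upload_file(
    local_file="models/model.pkl",
    remote_url="gs://my-bucket/models/model.pkl",
)

# or upload an entire folder
StorageManager.upload_folder(
    local_folder="models/",
    remote_url="gs://my-bucket/models/",
)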
Can you share the StorageManager usage and the error you are getting?
Hi DepressedChimpanzee34
if you try to extend it more than the width of the column to the right, it doesn't do anything...
You mean outside of the window? or are you saying you cannot extend it?
Just verifying, we are talking about the latest version of clearml-server ?
Yes, no reason to attach the second one (imho)
And is Task.init called on all processes ?
Wait, how do I reproduce it on the community server? Maybe it has something to do with the number of columns? Or whether it is already wider than the screen? What's your browser / OS?