GreasyPenguin66 you can pass AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY
as the default Azure access/secret 🙂
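A minimal sketch of setting the two variables from Python before any storage call is made. Only the variable names come from the answer above; the helper names and the placeholder values are hypothetical.

```python
import os

# The two variable names come from the answer above; the helpers are illustrative.
AZURE_VARS = ("AZURE_STORAGE_ACCOUNT", "AZURE_STORAGE_KEY")


def set_azure_defaults(account, key, env=None):
    """Store the Azure account/key pair in the given environment mapping."""
    env = os.environ if env is None else env
    env["AZURE_STORAGE_ACCOUNT"] = account
    env["AZURE_STORAGE_KEY"] = key
    return env


def missing_azure_vars(env=None):
    """Return the names of the Azure variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in AZURE_VARS if not env.get(name)]
```

Calling missing_azure_vars() before starting a run is a cheap way to catch a missing key early, rather than at the first upload.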
I'm assuming TF was not part of the original requirements, and was automatically pulled by one of the packages, hence the latest version ....
Awesome! Any way to hear the talk w/o registering for the whole conference?
CloudySwallow27 Anyway we will make sure we upload the talk to the clearml youtube channel after the Talk
No worries, and I hope you manage to get that backup.
Is there a way to filter experiments in a hyperparameter sweep based on a given range of a parameter/metric in the UI?
Are you referring to the HPO example, or the Task comparison?
Can you run the entire thing on your own machine (just making sure it doesn't give this odd error) ?
So would this pseudo code solve the issue?
def pipeline_creator():
    pipeline_a_id = os.system("python3 create_pipeline_a.py")
    print(f"pipeline_a_id={pipeline_a_id}")
Something like that?
(Obviously the question is how you would get the return value of the new pipeline ID, but I'm getting ahead of myself)
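On the return-value question: os.system returns the exit status of the command, not anything it prints, so the pseudo code above would print a status code rather than an ID. One possible workaround, sketched below, is to capture the child's stdout with subprocess.run. It assumes (this is not confirmed anywhere in the thread) that create_pipeline_a.py prints the new pipeline ID as its last stdout line; run_and_capture_id is a hypothetical helper name.

```python
import subprocess
import sys


def run_and_capture_id(command):
    """Run a command and return the last non-empty line it printed to stdout."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    lines = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    if not lines:
        raise RuntimeError("the child process printed nothing to stdout")
    return lines[-1]


def pipeline_creator(script="create_pipeline_a.py"):
    # Assumption: the script prints the new pipeline ID as its final stdout line.
    pipeline_a_id = run_and_capture_id([sys.executable, script])
    print(f"pipeline_a_id={pipeline_a_id}")
    return pipeline_a_id
```

With check=True a failing child raises CalledProcessError instead of silently returning a bogus ID.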
Hi TrickySheep9
So basically the idea is you can quickly code a scheduler with your own logic, then launch it on the "services queue" to run basically forever 🙂
This could be a good example:
https://github.com/allegroai/clearml/blob/master/examples/services/monitoring/slack_alerts.py
https://github.com/allegroai/clearml/blob/master/examples/automation/task_piping_example.py
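To give a rough idea of the shape such a scheduler takes, here is a minimal polling loop. It is a sketch, not the code from the linked examples: should_launch and launch are hypothetical callbacks standing in for your own logic (in a ClearML setup, launch would typically clone a template Task and enqueue it).

```python
import time


def run_scheduler(should_launch, launch, interval_sec=60.0, max_cycles=None, sleep=time.sleep):
    """Poll should_launch() and call launch() whenever it returns True.

    max_cycles=None loops forever, which is the intended mode for a service;
    a finite value is only useful for testing the logic locally.
    """
    launched = 0
    cycle = 0
    while max_cycles is None or cycle < max_cycles:
        if should_launch():
            launch()
            launched += 1
        cycle += 1
        sleep(interval_sec)
    return launched
```

Keeping the decision logic in callbacks means the loop can be dry-run locally with max_cycles set before it is sent to the services queue.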
Thanks GentleSwallow91
That's a good tip, where in the docs would you add it?
Ohh, clearml is designed so that you should not worry about that: download_dataset = StorageManager.get_local_copy()
This is cached, meaning the second time a machine runs that line it will not re-download the file.
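A small sketch of that call wrapped in a helper. StorageManager.get_local_copy and its remote_url argument are part of the clearml package; the wrapper name is hypothetical, and the URL in the usage comment is a placeholder.

```python
def fetch_cached_copy(remote_url):
    """Return a local path for remote_url, downloading only on a cache miss."""
    # Imported inside the function so this sketch loads even without clearml installed.
    from clearml import StorageManager

    return StorageManager.get_local_copy(remote_url=remote_url)


# Usage (placeholder URL): the first call downloads, a repeated call on the
# same machine is served from the local cache.
# path = fetch_cached_copy("s3://my-bucket/data/train.zip")
```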
This means step 1 is redundant, no?
Usually when data is passed between components it is automatically uploaded as an artifact to the Task (stored on the files server, object storage, etc.), then downloaded and passed to the next steps.
How large is the data that you are wo...
I set up the alert rule on this metric by defining a threshold to trigger the alert. Did I understand correctly?
Yes exactly!
Or the new metric should...
basically combining the two, yes looks good.
PompousParrot44 I think the website should address that:
https://allegro.ai/
But the TL;DR is that the enterprise version adds full Dataset Versioning on top, with end-to-end integration from code to DLOps (e.g. data sampling, database query capabilities, data visualization, multi-site support, permissions, etc.)
It should also work with host IP and two docker compose files.
I'm not sure where to push for a unified docker compose?
Basically it solves the remote-execution problem, so you can scale to multiple machines relatively easily :)
Hi SubstantialElk6
you can do:
from clearml.config import config_obj
config_obj.get('sdk')
You will get the entire configuration tree of the SDK section (if you need subsections, you can access them with '.' notation, e.g. sdk.storage)
Verified, you are correct: "." in label enumeration will break the clone.
I'll make sure this bug is passed to backend guys to fix. Thanks TenseOstrich47 !
meanwhile maybe "_" instead ? 😁
AbruptHedgehog21 looking at the error, seems like you are out of storage 😅
RipeGoose2 models are automatically registered, i.e. added to the models artifactory, but the entry only points to where the files are stored.
Only if you pass the output_uri argument to Task.init will they actually be uploaded.
If you want to disable this behavior you can pass:
Task.init(..., auto_connect_frameworks={'pytorch': False})
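The two switches can be sketched together as follows. output_uri and auto_connect_frameworks are the Task.init arguments named in the answer above; the helper names, project/task names and the bucket URL are hypothetical.

```python
def task_init_kwargs(output_uri=None, disable_frameworks=()):
    """Build keyword arguments for Task.init that control model handling."""
    kwargs = {}
    if output_uri is not None:
        # With output_uri set, model files are uploaded, not only registered.
        kwargs["output_uri"] = output_uri
    if disable_frameworks:
        # e.g. {'pytorch': False} turns off automatic model logging for PyTorch.
        kwargs["auto_connect_frameworks"] = {name: False for name in disable_frameworks}
    return kwargs


def start_task(project_name, task_name, **options):
    """Call Task.init with the model-handling options built above."""
    # Imported inside the function so this sketch loads even without clearml installed.
    from clearml import Task

    return Task.init(project_name=project_name, task_name=task_name, **task_init_kwargs(**options))


# Usage (placeholder names):
# task = start_task("examples", "train", output_uri="s3://my-bucket/models")
# task = start_task("examples", "train", disable_frameworks=["pytorch"])
```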
MysteriousBee56 that is very strange, but it definitely explains it. Kudos on debugging it!!!
tried it and restarted the agent, but not working properly
What do you mean not working? can you provide logs ?
For me it sounds like the starting of the service is completed but I don't really see if the autoscaler is actually running. Also I don't see any output in the console of the autoscaler.
Do notice the autoscaler code itself needs to run somewhere: by default it will be running on your machine, or on a remote agent.
GiganticTurtle0 fix was pushed 🙂
you can test with: pip install git+
🤞
I'm assuming you cannot directly access port 10022 (default ssh port on the remote machine) from your local machine, hence the connection issue. Could that be?
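A quick way to check that assumption from the local machine is to try opening a TCP connection to the port. This is a generic sketch using only the standard library; the helper name is hypothetical, and the default port is the 10022 mentioned above.

```python
import socket


def can_reach(host, port=10022, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved host names.
        return False
```

If this returns False for the remote machine's address, the port is most likely blocked by a firewall or not forwarded, which would match the connection issue.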
it was uploading fine for most of the day
What do you mean by uploading fine most of the day? Are you suggesting the upload to GS got stuck? Are you seeing the other metrics (scalars, console logs, etc.)?
Hi MoodyOx45
I have a task A that creates another task B via subprocess.
So the thing about the agent: when it runs the code, there is only one Task to rule them all. Basically any fork/spawn of a subprocess will automatically be logged as the parent Task.
I think that what you want is to build a pipeline from those Tasks? Or create a Task and enqueue it manually directly from Task A?
(btw: you can forcefully cause the subprocess to create its own Task b...