
You might need to play around a bit; it might be that you need StorageHelper.get('gs://bucket') and then helper.list('folder/*')
Let me know what worked 🙂
Yes, offline mode got broken in 1.3.0 🙂 , the RC fixed it:
pip install clearml==1.3.1rc0
Stable release later this week
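If it helps, this is roughly how offline mode is used (a sketch; project/task names are placeholders):
from clearml import Task

Task.set_offline(offline_mode=True)  # record everything locally instead of sending to the server
task = Task.init(project_name="examples", task_name="offline-run")
# ... your training code ...
task.close()
# later, on a machine with server access, import the produced session zip:
# Task.import_offline_session("/path/to/offline_session.zip")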
If the right properties are set can the profile tab be added?
I guess that is doable; that said, some of the graphs are not straightforward to support, like this one:
https://www.tensorflow.org/guide/images/tf_profiler/trace_viewer.png
GrievingTurkey78 short answer: no 🙂
Long answer: the files are stored as differential sets (think change sets from the previous version(s)). The collection of files is then compressed and stored as a single zip. The zip itself can be stored on Google, but on their object storage (GCS), not on GDrive. Notice that the default storage for clearml-data is the clearml-server; that said, you can always mix and match (even between versions).
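For example, storing the compressed zip on GCS instead of the clearml-server could look roughly like this (a sketch; bucket and names are placeholders):
from clearml import Dataset

ds = Dataset.create(dataset_name="my_dataset", dataset_project="data")
ds.add_files("local_folder/")
ds.upload(output_url="gs://my-bucket/datasets")  # the single zip goes to your GCS bucket
ds.finalize()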
Hi GrievingTurkey78,
Yes, this is a per-file download, but I think you can list the bucket and download everything.
Try:
from trains import StorageManager
from trains.storage.helper import StorageHelper

helper = StorageHelper.get('gs://bucket/folder')
remote_files = helper.list('*')
for f in remote_files:
    StorageManager.get_local_copy(f)
Containers are not running
But you are running the docker-compose, so how come no containers are running?
@<1560074028276781056:profile|HealthyDove84> if you want you can PR a fix, it should be very simple basically:
None
# fragment to add inside the np-dtype-to-string conversion chain (np is numpy):
elif np_dtype == str:
    return "STRING"
elif np_dtype == np.object_ or np_dtype.type == np.bytes_:
    return "BYTES"
return None
Lambdas are designed to be short-lived, I don't think it's a fine idea to run it in a loop TBH.
Yeah, you are right, but maybe it would be fine to launch, have the lambda run for 30-60 sec (i.e. checking idle time for 1 min, stateless, only keeping track inside the execution context), then take it down.
What I'm trying to solve here is (1) a quick way to understand if the agent is actually idling or just between Tasks, and (2) still keep the "idle watchdog" short-lived, so that it can...
Hi @<1691620877822595072:profile|FlutteringMouse14>
In the latest project I created, Hydra conf is not logged automatically.
Any chance the Task.init call is not in the main script (where the Hydra decorator is)?
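For reference, this is roughly the layout that should work (a sketch; project/config names are placeholders):
import hydra
from omegaconf import DictConfig
from clearml import Task

@hydra.main(config_path="conf", config_name="config")
def main(cfg: DictConfig) -> None:
    # Task.init runs in the same main script as the @hydra.main entry point
    task = Task.init(project_name="examples", task_name="hydra-run")
    print(cfg)

if __name__ == "__main__":
    main()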
So this should be easier to implement, and would probably be safer.
You can basically query all the workers (i.e. agents) and check if they are running a Task, then if they are not (for a while) remove the "protection flag"
wdyt?
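e.g. a rough sketch with the APIClient (the exact field holding the running task is an assumption, hence the getattr):
from clearml.backend_api.session.client import APIClient

client = APIClient()
for worker in client.workers.get_all():
    running_task = getattr(worker, "task", None)  # set when the worker is executing a Task (assumption)
    if not running_task:
        print(f"worker {worker.id} appears idle")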
No, I just want to register a new model in the storage.
If the model file is already uploaded, you can register it without a Task: InputModel.import_model(...)
https://github.com/allegroai/clearml/blob/b3a2b3425c5098ebfc0598c9dfb3e670d4a87706/clearml/model.py#L521
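i.e. something like this (a sketch; the URL and name are placeholders):
from clearml import InputModel

model = InputModel.import_model(
    weights_url="s3://my-bucket/models/model.pt",
    name="my-pretrained-model",
)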
I need to create a separate task for this right?
If you want the model to be uploaded, then yes, you have to create a Task.
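A rough sketch of that flow (names are placeholders):
from clearml import Task, OutputModel

task = Task.init(project_name="examples", task_name="register-and-upload")
output_model = OutputModel(task=task, name="my-model")
output_model.update_weights(weights_filename="model.pt")  # uploads the weights file and registers it on the Task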
So far my local and remote GitLab repositories are synchronized. I suspect that the
Failed applying git diff, see diff above
error is caused by a cached repository from which clearml tries to run the process. I've cleaned the cache, but it hasn't helped.
Hmm, can you test with empty "uncommitted changes"?
Just making sure: when you say it still doesn't work, you are not trying to run the Task with the git diff that includes the binary data, right?
Yes, but where can I fi...
Bad news, there isn't a nice interface to get the table from the Optimizer object (I will make sure we add it, no reason not to).
But you can very easily get all the information you need and more:
all_the_tasks = an_optimizer.get_top_experiments(top_k=100)
Then for every task in the list you can get all the information:
for task in all_the_tasks:
    task_params_as_dict = task.get_parameters()
    task_scalars = task.get_last_scalar_metrics()
Basically the Task object enables you to que...
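For example, a quick sketch for collecting it all into a flat table (assuming pandas is installed):
import pandas as pd

rows = []
for task in all_the_tasks:
    rows.append({"task_id": task.id, **task.get_parameters()})
print(pd.DataFrame(rows))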
Hi @<1523702868694011904:profile|AbruptCow41>
Check what you are getting when running git status
inside the working directory; this is essentially how it works. Are you expecting to later run it with an agent?
Hi GrievingTurkey78
I'm assuming similar to https://github.com/pallets/click/ ?
Auto-connect and store/override all the parameters?
Hmmm, can you view the settings? That's the only thing I can think of at the moment that would be different between your setup and the working one...
Also, is there a way for you to put the trains-server behind HTTPS (on your GCP)?
EnviousStarfish54
it seems that if I don't use plt.show() it won't show up in Allegro, is this a must?
Yes, at plt.show / plt.savefig Trains will capture the plot and send it to the backend.
BTW: when you hover over the empty plot area, do you see the plotly objects, or is it all blank?
EnviousStarfish54 Sure, see scatter2d
https://allegro.ai/docs/examples/reporting/scatter_hist_confusion_mat_reporting/#2d-scatter-plots
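Something along these lines (a minimal sketch; project/task names are placeholders):
from trains import Task

task = Task.init(project_name="examples", task_name="scatter-demo")
task.get_logger().report_scatter2d(
    title="example_scatter",
    series="series_xy",
    iteration=0,
    scatter=[(0, 1), (1, 3), (2, 7)],
    xaxis="x",
    yaxis="y",
)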
TenseOstrich47 notice:
task.logger.report_matplotlib_figure(
    title=f"Performance Heatmap - {name}",
    series="Device Brand Predictions",
    iteration=0,
    figure=figure,
    report_image=True,
)
report_image=True means it will be uploaded as an image, not a plot (like imshow); the default is False, which would put it under the Plots section.
Could you add a few prints and see where it hangs? There's no reason for it to hang (even the plot upload is done ...
Hi UnsightlyShark53, I think you are absolutely right, there is no reason for the trains.errors.UsageError: ArgumentParser.parse_args() ... error.
As you mentioned, if auto_connect_arg_parser=False, it should just ignore what it picked up automatically.
I will make sure the error is resolved, and I will also make sure you will still be able to connect the argparse manually with task.connect(parser)
after the Task has been created. Thanks for the reference! I took a look o...
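i.e. roughly (a sketch; names are placeholders):
from argparse import ArgumentParser
from trains import Task

parser = ArgumentParser()
parser.add_argument("--lr", type=float, default=0.01)

# disable the automatic argparse binding, then connect manually
task = Task.init(project_name="examples", task_name="manual-argparse", auto_connect_arg_parser=False)
task.connect(parser)
args = parser.parse_args()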
Check on which queue the HPO puts the Tasks, and if the agent is listening to these queues
Assuming the git repo looks something like:
.git
readme.txt
module
 |
 +---- script.py
The working directory should be "."
The script path should be: "-m module.script"
And under Configuration/Args you should have:
args1 = value
args2 = another_value
Make sense?
Why can I only call import_model
import_model actually creates a new Model object in the system.
InputModel(id) will "load" a model based on the model id.
Make sense?
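i.e. (a sketch; the id is a placeholder):
from clearml import InputModel

model = InputModel("<model_id>")  # "load" an existing model by its id
local_weights = model.get_weights()  # fetch a local copy of the weights file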
Do you have docker installed on all the slurm agent/worker machines?
Docker support?
What is the proper way to change a clearml.conf?
Inside a container you can mount an external clearml.conf, or override everything with OS environment variables:
https://clear.ml/docs/latest/docs/configs/env_vars#server-connection
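For example (a sketch; keys and paths are placeholders):
docker run \
  -v $HOME/clearml.conf:/root/clearml.conf \
  -e CLEARML_API_HOST=https://api.clear.ml \
  -e CLEARML_API_ACCESS_KEY=<access_key> \
  -e CLEARML_API_SECRET_KEY=<secret_key> \
  my-image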
Make sense 🙂
Just make sure you configure the git user/pass in the docker-compose so the agent has your credentials for the repo clone.