So are you saying the large download size is the issue (i.e. network issues)?
BitterStarfish58 could you open a GitHub issue on it? I really want to make sure we support it (and I think it should not be very difficult)
Hi UnevenDolphin73
This differentiable storage - does it only work on file additions/removal, or also on intra-file changes?
This is on a file level, meaning that if you change a single byte in a file, the entire file will be packaged in the new version.
Make sense?
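To illustrate what "file level" means here, a minimal sketch in plain Python (not the ClearML implementation; `file_hash` and `files_to_package` are hypothetical names made up for this example). It assumes each version is a mapping of file name to content and compares whole-file hashes, so a one-byte change marks the entire file for packaging:

```python
import hashlib


def file_hash(data: bytes) -> str:
    # Hash the whole file content; there is no intra-file diffing.
    return hashlib.sha256(data).hexdigest()


def files_to_package(parent: dict, current: dict) -> dict:
    """Return which files a new version must store, at file granularity."""
    parent_hashes = {name: file_hash(data) for name, data in parent.items()}
    changed = sorted(
        name for name, data in current.items()
        if parent_hashes.get(name) != file_hash(data)
    )
    removed = sorted(name for name in parent if name not in current)
    return {"packaged": changed, "removed": removed}


parent_version = {"a.bin": b"\x00" * 1024, "b.txt": b"hello"}
# Flip a single byte in a.bin, leave b.txt untouched, add c.txt.
current_version = {
    "a.bin": b"\x01" + b"\x00" * 1023,
    "b.txt": b"hello",
    "c.txt": b"new",
}
result = files_to_package(parent_version, current_version)
print(result)
```

In this sketch the unchanged b.txt is not re-packaged, while a.bin is packaged in full even though only one byte differs.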
Looks great, let me see if I can understand what's missing, because it should have worked ...
What do you have under the "installed packages" ?
But only one node will copy it.
The others can only copy it after the first has finished, and they are not aware that it is setting up the exact same venv, hence the race.
Hi @<1743079861380976640:profile|HighKitten20>
but when I try to use code stored in a GIT (Bitbucket) repo I got a repository cloning error, specifically
Did you configure the git repo application password here: None
Oh sorry: pip install clearml-agent==1.2.0rc4
It also automatically detects if you have an active venv inside the container and uses it instead of the system-wide Python.
Yes, I find myself trying to select "points" on the overview tab. And I find myself wanting to see more interesting info in the tooltip.
Yep that's a very good point.
The Overview panel would be extremely well suited for the task of selecting a number of projects for comparing them.
So what you are saying is that this could be a way to multi-select experiments for detailed comparison (i.e. selecting the "dots" on the overview graph). Is this what you had in mind?
GiganticTurtle0, let me add some background. The idea is that at some point you had your code running on your machine (when developing it, for example).
When you actually executed the code itself in development, you called Task.init (to track the development process, for example). This Task.init call analyzed the code and its Python package dependencies and stored the result on the Task. Then when you clone the Task, it already lists all the Python packages your code directly imports (see "In...
Well, if the "video" from TB is not in mp4/gif format, then someone will have to encode it.
I was just pointing out that for the encoding part we might need an additional package.
Is the agent itself registered on the clearml-server (i.e. can you see it in the UI)?
In your trains.conf, change the value:
files_server: 's3://ip:port/bucket'
TBH ClearML doesn't seem to be picking the model up so I need to do it manually
This is odd, clearml will pick up framework-level serialization, but not just any pickle call.
Why do I need an output_uri for the model saving? The dataset API can figure this out on its own
So that it knows where to upload it. If you are setting it to True,
this will be the default files server; you can also set it to a shared file system, S3, GCP storage, etc.
If no value is passed, it will just log th...
None
No they are not, they are taking the vscode backend and putting it behind a webserver of sorts.
You can always click on the name of the series and remove it for display.
Why would you need three graphs?
Hi FierceFly22
You called execute_remotely a bit too soon. If you have any manual configuration calls, they have to be made before it, so that they are stored in the Task. This includes task.connect and task.connect_configuration.
Hmm I wonder, can you try with this line before?
Task._report_subprocess_enabled = False
frameworks = { 'tensorboard': True, 'pytorch': False }
Task.init(...)
C will be submitted to a different queue and I don't care as much
Is there a way to define "task affinity" in this way?
Hi RoughTiger69 ,
When you say task affinity, do you mean you want C to be executed next to A/B? Affinity as a concept doesn't really exist; it can be abstracted to a queue, where you have agents pulling from multiple queues. Then C can be pushed to one of the queues (in theory you might be able to programmatically control the queue of C), wdyt?
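As a rough illustration of the "agent pulling from multiple queues" idea, here is a toy simulation in plain Python. It does not use the ClearML API; the queue names `gpu_box` and `shared` and the `next_task` helper are assumptions made up for the sketch. One agent serves a machine-specific queue first and a shared queue second, so C ends up on the same machine as A and B:

```python
from collections import deque


def next_task(agent_queues, queues):
    """Pop the first task from the first non-empty queue the agent listens to."""
    for name in agent_queues:
        if queues[name]:
            return name, queues[name].popleft()
    return None


# Hypothetical queues: A and B were pushed to a machine-specific queue,
# C was pushed to a different (shared) queue.
queues = {"gpu_box": deque(["A", "B"]), "shared": deque(["C"])}

# Hypothetical agent on one machine, listening to both queues in priority order.
agent_listens_to = ["gpu_box", "shared"]

order = []
while (item := next_task(agent_listens_to, queues)) is not None:
    order.append(item)
print(order)
```

The placement of C is controlled purely by which queue it is pushed to and which agents listen to that queue, which is the abstraction described above.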
SmarmySeaurchin8 what's the mount command you are using?
Hi @<1544853721739956224:profile|QuizzicalFox36>
http:/34.67.35.46:8081/...
notice there is a / missing in the link, how is that possible? it should be http://
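A quick way to see why the missing slash matters, using only the Python standard library (the path after the port is left out here because it is elided in the link above; `has_host` is a hypothetical helper name):

```python
from urllib.parse import urlparse


def has_host(url: str) -> bool:
    # A URL only has a network location if "//" follows the scheme.
    return bool(urlparse(url).netloc)


broken = "http:/34.67.35.46:8081/"
fixed = "http://34.67.35.46:8081/"

print(has_host(broken), urlparse(broken).path)
print(has_host(fixed), urlparse(fixed).netloc)
```

With a single slash the address is parsed as part of the path rather than as the host, so any client built on this kind of parsing would not know where to connect.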
Essentially the example provided just prints out IDs to the log file,
What do you mean?
PungentLouse55 , make sure you fix the metric objective and args:
Add "General/" prefix to the list of arguments to optimize, and change the objective metric from "Accuracy" to "epoch_accuracy"
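The renaming step can be sketched like this in plain Python. `with_section` is a hypothetical helper, not part of ClearML, and the argument names are only examples:

```python
def with_section(names, section="General"):
    """Prefix each argument name with its section, unless already prefixed."""
    prefix = section + "/"
    return [n if n.startswith(prefix) else prefix + n for n in names]


# Example argument names (assumed for illustration).
args_to_optimize = ["layer_1", "General/layer_2", "batch_size"]
prefixed = with_section(args_to_optimize)

# The objective metric name suggested above, replacing "Accuracy".
objective_metric = "epoch_accuracy"

print(prefixed, objective_metric)
```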
@<1523701868901961728:profile|ReassuredTiger98> if you use the latest RC I sent and run with --debug,
in the log you will see the full content of /tmp/conda_envaz1ne897.yml
Here it is copied from your log, do you want to see if this one works:
channels:
- defaults
- conda-forge
- pytorch
dependencies:
- blas~=1.0
- bzip2~=1.0.8
- ca-certificates~=2020.10.14
- certifi~=2020.6.20
- cloudpickle~=1.6.0
- cudatoolkit~=11.1.1
- cycler~=0.10.0
- cytoolz~=0.11.0
- dask-core~=2021.2.0
- de...
EnviousStarfish54 Yes, I'm not sure what happens there. We will have to dive deeper, but now that you got us a code snippet to reproduce the issue, it should not be very complicated to fix (I hope).
ShakyJellyfish91 what exactly are you passing to Task.create?
Could it be you are only passing script= and leaving repo=None?
DeliciousBluewhale87 not on the open source version, for some reason it is not passed
Could you explain the use case ?
`
Example use case:
an_optimizer = HyperParameterOptimizer(
    # This is the experiment we want to optimize
    base_task_id=args['template_task_id'],
    # here we define the hyper-parameters to optimize
    hyper_parameters=[
        UniformIntegerParameterRange('General/layer_1', min_value=128, max_value=512, step_size=128),
        UniformIntegerParameterRange('General/layer_2', min_value=128, max_value=512, step_size=128),
        DiscreteParameterRange('General/batch_size', values=[...