It will also allow you to pass them to Hydra (either as overrides, or by directly editing the entire Hydra config)
This is what I just used:
` import os
from argparse import ArgumentParser
from tensorflow.keras import utils as np_utils
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Activation, Dense, Softmax
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint
from clearml import Task
parser = ArgumentParser()
parser.add_argument('--output-uri', type=str, required=False)
args =...
Hmm SuccessfulKoala55 what do you think?
I just cloned it from the examples that are available in the SaaS console upon account creation
Ohhh! that would explain it. Maybe it is broken there?! let me check a second
I wonder if I just need to join 2 docker-compose files to run everything in one session
Actually that could also work
But for reference, when I said IP I meant the actual host network IP, not 127.0.0.1 (which is the same as localhost)
Hi EagerOtter28
The agent knows how to do the http->ssh conversion on the fly; in your clearml.conf (on the agent's machine) set force_git_ssh_protocol: true
https://github.com/allegroai/clearml-agent/blob/42606d9247afbbd510dc93eeee966ddf34bb0312/docs/clearml.conf#L25
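For reference, this is roughly what that section of the agent's clearml.conf would look like (just a sketch, only the force_git_ssh_protocol key is taken from the linked config):
` agent {
    # convert http(s) git links to ssh on the fly
    force_git_ssh_protocol: true
} `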
Hi @<1569496075083976704:profile|SweetShells3>
Are you using the standard docker-compose ? are using the default elastic container ?
What exactly changed ?
The 'on-premise' server fails to connect to the ClearML server because of the VPN I think
I think you are correct.
You can quickly test it, try to run curl http://local-server:8008 and see if that works
Hi BoredSquirrel45
as of today, my required packages aren't being recognized in cloned
Are you saying you are editing the code directly in the cloned Task, then enqueue the Task and the agent does not "auto recognize" the packages ?
Hi @<1566959357147484160:profile|LazyCat94>
So it seems the arg parser is detecting the configuration YAML
The first thing I would suggest is changing it to a relative path (so that when launched on remote machines it will find the YAML file)
Regardless how are you launching the HPO ? are you spinning a new agent ?
(as background, argparse arguments are injected in real time by the agent, or by the HPO when running as subprocesses)
Now in case I needed to do it, can I add new parameters to cloned experiment or will these get deleted?
Adding new parameters is supported 🙂
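For example, from code it would look something like this (just a sketch, the task ID, section/parameter names and queue name are placeholders):
` from clearml import Task

# clone the original experiment
cloned = Task.clone(source_task='SOURCE_TASK_ID_HERE', name='clone with extra params')

# add / override parameters on the clone
cloned.set_parameters_as_dict({'General': {'new_param': 123}})

# send it to an agent
Task.enqueue(cloned, queue_name='default') `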
Hi AverageBee39
What's the clearml-server and clearml package version you are using ?
(It looks like some capability that is missing from the server, i.e. needs an upgrade ?!)
So without the flush I got the error apparently at the very end of the script -
Yes... it's a python thing, background threads might get killed in random order, so when something needs a background thread that was already killed you get this error, which basically means you need to do the work in the calling thread.
This actually explains why calling Flush solved the issue.
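(for anyone hitting the same thing, a minimal sketch of forcing the flush explicitly at the end of the script, project/task names are placeholders:)
` from clearml import Task

task = Task.init(project_name='examples', task_name='flush example')
# ... your code ...
# make sure all pending reports / uploads are sent before the interpreter shuts down
task.flush(wait_for_uploads=True) `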
Nice!
Do you happen to know if there are any plans for an implementation with the logger variable, so that, if needed, it would be possible to write to different tables?
CheerfulGorilla72 what do you mean by "an implementation with the logger variable" ? pytorch-lightning defaults to the TB logger, which clearml will automatically catch and log into the clearml-server. You can always add additional logs with the clearml interface, Logger.current_logger().report_???
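For example, a minimal sketch of adding an explicit scalar / table report next to whatever TB already logs (titles, series and values are placeholders):
` from clearml import Logger

logger = Logger.current_logger()
# explicit scalar report
logger.report_scalar(title='custom', series='my_metric', value=0.95, iteration=10)
# explicit table report (first row is the header)
logger.report_table(title='results', series='summary', iteration=10,
                    table_plot=[['name', 'value'], ['accuracy', 0.95]]) `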
What am I mis...
Is there a way to do this all elegantly?
Oh yes there is, this is how TaskB's code will look:
` task = Task.init(..., 'task b')
param = {'TaskA': 'TaskAs ID HERE'}
task.connect(param)
# grab TaskA's latest output model
taska_model = Task.get_task(param['TaskA']).models['output'][-1]
model = torch.load(taska_model.get_local_copy())
# ... train ...
torch.save(model, 'modelb') `I might have missed something there, but generally speaking this will let you:
Select TaskA as a parameter of TaskB's training process, and it will automagically register Task A's...
I reached over 1M API calls in about one week using clearml-serving
Oh that makes sense now 🙂
If I remember correctly, adding an additional model to a single clearml-serving instance should not actually change the number of API calls; they are mostly affected by the number of clearml-serving instances / containers, not by the number of models.
My only point is, if we have no force_git_ssh_port or force_git_ssh_user we should not touch the SSH link (i.e. less chance of us messing with the original URL if no one asked us to)
RipeGoose2 you mean to have the preview html on S3 work as expected (i.e. click on it, add credentials, open in a new tab) ?
My pleasure 🙂
Maybe we should do a webinar... I have a feeling the MLOps aspects are not as straight forward as we would like to think ...
Hi NonchalantDeer14
In multi-gpu, can you still see the logs on the local Tensorboard ?
Are you running manually or with an agent ?
Hmm that makes sense. BTW the PYTHONPATH set by the agent would be the working dir listed under the Task, but if you set agent.force_git_root_python_path the agent would also add the root of the git repo to the python path
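i.e. something like this in the agent's clearml.conf (sketch, only the force_git_root_python_path key name is the actual setting):
` agent {
    # add the git repository root to PYTHONPATH, not only the Task's working dir
    force_git_root_python_path: true
} `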
Gitlab has support for S3 based cache btw.
This might still be considered "slow" compared to local-dist/cluster mount
Would adding support for some sort of post task script help? Is something already there?
Interesting, can you expand on the use case? (currently there is only a pre-task script, for setup)
I'm checking the preview HTML and it seems like it was not uploaded...
Also, how do pipelines compare here?
Pipelines are a type of Task, so like Tasks you can clone and enqueue them, or set them as the target of the trigger.
the most flexible solution would be to have some way of triggering the execution of a script in the parent task environment,
This is the exact idea of the TriggerScheduler
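A minimal sketch of what that could look like (argument names follow the TriggerScheduler interface as far as I remember, worth double-checking against your clearml version; the project name and callback are placeholders):
` from clearml.automation import TriggerScheduler

# poll the server every few minutes for matching events
trigger = TriggerScheduler(pooling_frequency_minutes=3.0)
# call a function (in this process) whenever a task in the project completes
trigger.add_task_trigger(
    trigger_project='examples',
    trigger_on_status=['completed'],
    schedule_function=lambda task_id: print('triggered by task', task_id),
)
trigger.start() `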
What am I missing here?
RattySeagull0 I think you are correct, python 3.6 is the one installed inside the docker. Is it important to have 3.7 ? You might need another docker image (or change the installation script and install python 3.7 inside)