Ohh sorry I missed that and answered on the original message, nvm 🙂 all is well now
Nicely done DeterminedToad86 🙂
Wasn't this issue resolved by torch?
The --template-yaml allows you to use foll k8s YAML template (the overrides is just overrides, which do not include most of the configuration options. we should probably deprecate it
ThickDove42 you need the latest cleaml-agent RC for the docker setup script (next version due next week)pip install clearml-agent==0.17.3rc0
named as
venv_update
(I believe it's still in beta). Do you think enabling this parameter significantly helps to build environments faster?
This is deprecated... it was a test to use the a package that can update pip venvs, but it was never stable, we will remove it in the next version
Yes, I guess. Since pipelines are designed to be executed remotely it may be pointless to enable an
output_uri
parameter in the
PipelineDecorator.componen...
If you passed the correct path it should work (if it fails it would have failed right at the beginning).
BTW: I think it is clearml-agent --config-file <file here> daemon ...
which is probably why it does not work for me, right?
Correct, you need to pass the entire configuration (it is stored as a blob, as opposed to the hyperparameters that are stored as individual values)
` :param configuration_overrides: Optional, override Task configuration objects.
Expected dictionary of configuration object name and configuration object content.
Examples:
{'General': dict(key='value')}
{'General': 'config...
why are there indefinitely growing anonymous tasks, even after i've closed the main schedulers.
The anonymous Tasks are The Dataset you are creating (a Dataset version is also a Task of a certain type with artifacts, the idea is usually Datasets are created from code, hence the need to combine the two).
Make sense ?
SarcasticSquirrel56
if I configure manually the pods for the different nodes, how do I make clearml server aware that those agents exist?
Basically the agent register themselves on your cleaml-server, and they register on which Queue(s) they listen to. In other words the interface to choose the different types of machines/gpus is by enqueue the Task to different queues.
For example: Queue(1): "CUDA11_GPUx1" , Queue(2): "CUDA10_GPUx1"
Make sense ?
EDIT:
I guess to achieve what I w...
WickedGoat98 until the next RC release (should not take long) this will solve it:df = pd.concat([tickerDf.Close, tickerDf_Change.Close_pcent], axis=1) df = df[1:] df.index = df.index.astype(str) setattr(df, 'ticker', args.symbol)Basically removing the nan and converting the datetime to string representation (so plotly.js likes it)
CrookedWalrus33 from the log it seems the code is trying to use "kwcoco" but it is not listed under any "Installed packages" nor do you see any attempt to install it. Can you confirm ?
In theory yes, in practice you will be using the same docker image for all the services, and they will never interfere with one another. and you have the option to do more sophisticated stuff, like map the file-server data for a clean up service (should be out in a few days :)) so a balance. Also remember that relatively speaking docker are quite light weight, this is not like saying a VM per service...
You mean I can do Epoch001/ and Epoch002/ to split them into groups and make 100 limit per group?
yes then the 100 limit is per "Epoch001" and another 100 limit for "Epoch002" etc. 🙂
BoredHedgehog47 could it be "python" python points to python 2.7 inside your container, as opposed to python3 on your machine
(this error is python2 trying to run python 3 code)
https://stackoverflow.com/questions/20555517/using-multiple-versions-of-python"Training classifier with command:\n python -m sfi.imagery.models.bbox_predictorv2.train
Thanks PompousBeetle71
Quick question, what frameworks are you using?
Do you use save method directly to file stream (or any other direct storage)?
dataset catalogue as advertised.
Creating the Dataset on ClearML, is the catalog, you can move datasets around, put in sub-folders add tags add meta-data, search etc. I think this qualifies as a dataset catalog , no?
Hi @<1533620191232004096:profile|NuttyLobster9>
Hi All, is there a way to clone a pipeline from the web UI like you can with a task?
Right click on the pipeline and select Run (it is basically the same thing as cloning it)
Maybe failed pipelines with zero steps count as completed
zero steps counts as successful.
That said, how could it have zero steps if one of the steps failed? no?
Hmm I think the approach in general would be to create two pipeline tasks, then launch them from a third pipeline or trigger externally? If on the other hand it makes sense to see both pipelines on the same execution graph, then the nested components makes a lot of sense. Wdyt?
Hi @<1600299043865497600:profile|MagnificentSeaurchin90>
Any chance you can provide more info on the error?
if I want to compare two experiments the scalar plots do not load ( loading forever ).
I'm assuming the issue is the Plots tab? or is it the Scalars? what do you have in the Plots? can you send an image of the single experiment ?
I'm guessing the extra index URL can be a URL to the github repo of interest?
The extra index URL is exactly what you would be passing to pip install, meaning it has to comply to pypi artifactory api.
Make sense ?
I got everything working using the default queue. I can submit an experiment, and a new GPU node is provisioned, all good
Nice!
My next question, how do I add more queues?
You can create new queues in the UI and spin a new glue for the queue (basically think of a queue as an abstraction for a specific type of resource)
Make sense ?
What do you mean by "modules first and find a way to install that package" ?
Are those modules already in wheels ? are they part a git repository?
(the pipeline component can also start inside a git repository it clones)
We're wondering how many on-premise machines we'd like to deprecate.
I think you can see that in the queues tab, no?