When I say accessing, I mean I want to use the data for training (without actually getting a local copy of it).
How can you "access" it without downloading it?
Do you mean train locally on a subset, then on the full dataset remotely?
Hi SillyPuppy19
I think I lost you half way through.
I have a single script that launches training jobs for various models.
Is this like the automation example on GitHub, i.e. cloning/enqueuing experiments?
flag which is the model name, and dynamically loading the module to train it.
A Model also has a UUID in the system, so you can use that instead of the name (which is not unique). Would that solve the problem?
This didn't mesh well with Trains, because the project a...
so "add_external_files" means the files are not actually uploaded, they are "registered" as external links. This means the upload is basically doing nothing , because there is nothing to upload
Where exactly are you getting the error?
You mean is one solution better than combining, maintaining and automating 3+ solutions (dvc/lakefs + mlflow + kubeflow/airflow)?
Yes, I'd say it is. BTW if you have airflow running for other automations you can very easily combine the automation with clearml and have a single airflow automation for everything, but the main difference is that now airflow only launches logic, never actual compute/data (which are launched and scaled via clearml).
Does that make sense?
I want the model to be stored in a way that clearml-serving can recognise it as a model
Then OutputModel or task.update_output_model(...)
You have to serialize it, in a way that later your code will be able to load it.
With XGBoost, when you call model.save_model(...) clearml automatically picks it up and uploads it for you,
assuming you created the Task with Task.init(..., output_uri=True).
You can also manually upload the model with task.update_output_model or the equivalent OutputModel class.
if you want to dis...
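For example, a minimal sketch (assuming a scikit-learn model; project/file names are just placeholders):
import joblib
from clearml import Task, OutputModel
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

task = Task.init(project_name="examples", task_name="train model", output_uri=True)

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

# serialize the model yourself, in a format your serving code will be able to load
joblib.dump(model, "model.pkl")

# register (and upload) the file as the task's output model, so clearml-serving can find it
output_model = OutputModel(task=task, framework="ScikitLearn")
output_model.update_weights(weights_filename="model.pkl")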
Hi SuperiorCockroach75
You mean like turning on caching? What do you mean by taking too long?
Yeah the docstring is always the most up to date 🙂
Seems the apiserver is out of connections, this is odd...
SuccessfulKoala55 do you have an idea ?
Yeah that makes sense. I mean it will probably be a bit more than that per month when it's up, but half when it's down (just FYI, when AWS instances are down you still pay for the EBS storage).
If you are trying to save a buck here, double check on that, otherwise you will end up at the same cost level but after spending resources on migrating.
If you want a good hack you can always download the data and then just store it locally (i.e. half the migration job) and just reduce the number of users whe...
Hi SourSwallow36
What do you mean by Log each experiment separately? How would you differentiate between them?
FileNotFoundError: [Errno 2] No such file or directory: 'tritonserver': 'tritonserver'
This is odd.
Can you retry with the latest from GitHub? pip install git+
I want to build a real time data streaming anomaly detection service with clearml-serving
Oh, so the way it currently works, clearml-serving will push the data in real-time into Prometheus (you can control the stats/input/output), then you can build the anomaly detection in Grafana (for example, alerts on histograms over time are out-of-the-box, and clearml creates the histograms over time).
Would you also need access to the stats data in Prometheus? Or are you saying you need to process it ...
Please attach the log 🙂
That makes total sense, this is exactly an OS scenario for signal 9 (SIGKILL) 🙂
Ohh I see.
In your web app, look for the "?" icon (bottom left corner), click on it, it should open the full platform documentation
VexedCat68 actually a few users already suggested we auto log the dataset ID used as an additional configuration section, wdyt?
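Until then, a minimal sketch of recording it manually (dataset name and section name are just placeholders):
from clearml import Task, Dataset

task = Task.init(project_name="examples", task_name="train")
dataset = Dataset.get(dataset_name="my_dataset")

# store the dataset ID as an extra configuration/parameter section on the task
task.connect({"dataset_id": dataset.id}, name="Datasets")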
yes, looks like. Is it possible?
Sounds odd...
What's the exact project/task name?
And what is the output_uri?
When you install using pip <filename> you should end up with something like: minerva @ file://... or minerva @ https://...
Hi VexedCat68
Are we talking youtube videos? docs? courses?
Hi LazyTurkey38
Configuring these folders will be pushed later today 🙂
Basically you'll have in your clearml.conf
agent {
    docker_internal_mounts {
        sdk_cache: "/clearml_agent_cache"
        apt_cache: "/var/cache/apt/archives"
        ssh_folder: "/root/.ssh"
        pip_cache: "/root/.cache/pip"
        poetry_cache: "/root/.cache/pypoetry"
        vcs_cache: "/root/.clearml/vcs-cache"
        venv_build: "/root/.clearml/venvs-builds"
        pip_download: "/root/.clearml/p...
Any chance you can share the Log?
(feel free to DM it so it will not end up public)
Interesting...
We could follow up on the .env configuration, and allow clearml-task to add configuration files from the cmd line. This will be relatively easy to add. We could expand the Environment support (that somewhat exists), and add the ability to read variables from .env and add them to a "hyperparameter" section, named Environment. wdyt?
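To illustrate the idea (this is just a sketch of the proposal, not an existing clearml-task flag; it assumes python-dotenv is available):
from clearml import Task
from dotenv import dotenv_values

task = Task.init(project_name="examples", task_name="env demo")

# read KEY=VALUE pairs from the .env file and log them as an "Environment" section
env_vars = dict(dotenv_values(".env"))
task.connect(env_vars, name="Environment")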
Hmm, you mean how long it takes for the server to time out on a registered worker? I'm not sure this is easily configured.
Hi SmallDeer34
Can you try with the latest RC? I think we fixed something with the jupyter/colab/vscode support! pip install clearml==1.0.3rc1
How did you define the decorator of "train_image_classifier_component" ?
Did you define: @PipelineDecorator.component(return_values=['run_model_path', 'run_tb_path'], ... ? Notice the two return values.
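i.e. something along these lines (the function body is only illustrative):
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=['run_model_path', 'run_tb_path'])
def train_image_classifier_component(dataset_id):
    # ... actual training happens here ...
    run_model_path = "/tmp/model.pt"
    run_tb_path = "/tmp/tb_logs"
    # two returned values, matching the two names declared in return_values
    return run_model_path, run_tb_path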
SarcasticSparrow10 LOL there is a hack around it 🙂
Run your code with python -O
Which basically skips over all assertion checks
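Tiny illustration (hypothetical script name):
# demo.py
assert False, "this only fires without -O"
print("assertions were skipped")

# python demo.py     -> raises AssertionError
# python -O demo.py  -> prints "assertions were skipped"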
So it should cache the venvs right?
Correct,
path: /clearml-cache/venvs-cache
Just making sure, this is the path to the host cache folder
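i.e. something like this in the agent section of clearml.conf (double check the key names against your clearml.conf template):
agent {
    venvs_cache {
        # host folder that stores the cached virtual environments
        path: /clearml-cache/venvs-cache
        max_entries: 10
        free_space_threshold_gb: 2.0
    }
}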
ClumsyElephant70 I think I lost track of the current issue 😞 what's exactly not being cached (or working)?
Can you send the full log as attachment?
Hi @<1590514584836378624:profile|AmiableSeaturtle81>
I think you should use add_external_files instead of add_files (which is for local files).