So for this...
Sorry, what exactly is "this"?
Ohh, yes that makes sense, so just send them as a list of links in a single call:
dataset.source_url(["s3://", "s3://"], ...)
This will be a single update
https://github.com/allegroai/clearml/blob/ff7b174bf162347b82226f413040ff6473401e92/clearml/datasets/dataset.py#L430
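For reference, a minimal sketch of what that single update could look like (assuming the Dataset API accepts a list of links for source_url; project, dataset, and bucket names are placeholders):

from clearml import Dataset

dataset = Dataset.create(dataset_project="my_project", dataset_name="my_dataset")
# register several remote files in one call, i.e. a single update
dataset.add_external_files(source_url=["s3://bucket/file_1.csv", "s3://bucket/file_2.csv"])
dataset.upload()
dataset.finalize()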
Hi CourageousDove78
Not the cleanest, but you can basically pass everything here:
https://allegro.ai/clearml/docs/rst/references/clearml_api_ref/index.html#post--tasks.get_all
Reasoning is that it is passed almost as is to the server for the actual query.
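For example, a hedged sketch using the SDK (project name and filter values are placeholders; the task_filter dict is forwarded to tasks.get_all):

from clearml import Task

tasks = Task.get_tasks(
    project_name="my_project",
    task_filter={"status": ["completed"], "order_by": ["-last_update"]},
)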
That is a good question ... let me check 🙂
Hi SarcasticSparrow10
which database services are used to...
Mongo & Elastic
You can query everything using the ClearML interface, or talk directly with the databases.
Full RestAPI is here:
https://clear.ml/docs/latest/docs/references/api/endpoints
You can use the APIClient for an easier pythonic interface:
See example here
https://github.com/allegroai/clearml/blob/master/examples/services/cleanup/cleanup_service.py
What is the exact use case you have in mind?
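For example, a minimal sketch with the APIClient (project id, status, and paging values are placeholders):

from clearml.backend_api.session.client import APIClient

client = APIClient()
# this maps directly to the tasks.get_all REST endpoint
tasks = client.tasks.get_all(
    project=["<project_id>"],
    status=["completed"],
    order_by=["-last_update"],
    page=0,
    page_size=100,
)
for t in tasks:
    print(t.id, t.name)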
EnviousPanda91 notice that when passing these arguments to clearml-agent you are actually passing default args. If you want an additional argument to always be used, set extra_docker_arguments
here:
https://github.com/allegroai/clearml-agent/blob/9eee213683252cd0bd19aae3f9b2c65939d75ac3/docs/clearml.conf#L170
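For example, a sketch of the relevant clearml.conf section (the actual arguments are up to your setup):

agent {
    extra_docker_arguments: ["--ipc=host"]
}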
EnviousPanda91 the host checks if you have a .ssh folder on the machine; if you do, it will copy and mount it into the container, then delete the copy when the container is shut down.
Specifically, /tmp/clearml_agent.ssh.rbw8o0t7 is the copy of the .ssh folder that the agent created, and it is now mounting it into the container.
Hi IrritableJellyfish76
If you are running code that uses clearml from kubeflow, you have out-of-the-box integration between the two. What am I missing?
we will try to use Triton, but it's a bit hard with transformer models.
Yes ...
(All extra packages we add in serving)
So it should work. You can also run your preprocess class manually from your own machine (for debugging): if you pass it a local file (basically the model file downloaded from the UI), it should work.
it. But it's maybe not the best solution
Yes... it is not; separating the pre/post-processing to a CPU instance and letting Triton do the GPU serving is a lot more effici...
Thanks!
I think this one will cover both cases (the issue is with files on the root of the dataset):
if not (fnmatch(k, path) and fnmatch(k if '/' in k else '/{}'.format(k), '*/' + wildcard))}
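To illustrate the matching logic (a standalone sketch, not the actual Dataset code; file names are made up):

from fnmatch import fnmatch

files = ['readme.txt', 'root_image.png', 'images/cat.png']
wildcard = '*.png'
# files in the dataset root have no '/', so prefix them with '/' so that '*/<wildcard>' still matches them
matched = [k for k in files if fnmatch(k if '/' in k else '/{}'.format(k), '*/' + wildcard)]
print(matched)  # ['root_image.png', 'images/cat.png']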
Hi SubstantialBaldeagle49
2. Sure, follow the backup procedure and restore on the new server
3. Yes
# report a scalar to an existing task (title/series/value/iteration are placeholders)
task = Task.get_task(task_id='aa')
task.get_logger().report_scalar(title="metric", series="series", value=0.5, iteration=0)
Hi @<1730033904972206080:profile|FantasticSeaurchin8>
You mean in the UI, or when reporting from the SDK?
I see, if this is the case try to set:
output_uri="file:///full/path/to/dir"
Notice it has to have the full path there and the file:// prefix
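For example (a sketch; project/task names and the path are placeholders):

from clearml import Task

task = Task.init(
    project_name="my_project",
    task_name="my_task",
    output_uri="file:///full/path/to/dir",  # full path, including the file:// prefix
)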
No, it is zipped and stored, so in order to open the zipfile and read the files you have to download them.
That said, everything is cached, so if the machine already downloaded the dataset there is zero download / unzipping.
Make sense?
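For example (a sketch; project and dataset names are placeholders):

from clearml import Dataset

ds = Dataset.get(dataset_project="my_project", dataset_name="my_dataset")
# downloads and unzips once; subsequent calls on the same machine are served from the local cache
local_path = ds.get_local_copy()
print(local_path)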
SmilingFrog76 this is not a weird mechanism at all, this is a proper HPC scheduler 🙂
trains-agent is not actually aware of other nodes; it is responsible for launching a Task on its own hardware (with whatever configuration it was set up with). What can be done is to use trains-agent inside a 3rd party scheduler and have the scheduler allocate the node and trains-agent spin the experiment. There is a k8s example here: basically pulling jobs from the trains-server queue and pushing ...
Hi @<1561885941545570304:profile|PunyKangaroo87>
What do you mean by storing data locally?
Like clearml-data? I.e. Dataset?
You can always use file:///root/path/folder as the destination; this will store everything in the local folder. Is that it?
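For example, a rough sketch (paths and names are placeholders):

from clearml import Dataset

dataset = Dataset.create(dataset_project="my_project", dataset_name="local_dataset")
dataset.add_files("/data/to/version")
# store the dataset content under a local folder instead of a remote object store
dataset.upload(output_url="file:///root/path/folder")
dataset.finalize()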
Hi @<1566596960691949568:profile|UpsetWalrus59>
you should call it before initializing the Task:
Task.ignore_requirements("pywin32")
task = Task.init(...)
Example use case:
an_optimizer = HyperParameterOptimizer(
    # This is the experiment we want to optimize
    base_task_id=args['template_task_id'],
    # here we define the hyper-parameters to optimize
    hyper_parameters=[
        UniformIntegerParameterRange('General/layer_1', min_value=128, max_value=512, step_size=128),
        UniformIntegerParameterRange('General/layer_2', min_value=128, max_value=512, step_size=128),
        DiscreteParameterRange('General/batch_size', values=[...
PungentLouse55 from the screenshot I assume the experiment template you are trying to optimize is not the one from the trains/examples 🙂
In that case, and based on the screenshots, the prefix is "Args/" as this is the section name.
Regarding the objective metric, again based on your screenshots:
objective_metric_title="Accuracy"
objective_metric_series="Validation"
Make sense ?
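For example, on top of the optimizer example above (a sketch; the parameter section and values should come from your own experiment):

an_optimizer = HyperParameterOptimizer(
    base_task_id='<template_task_id>',
    hyper_parameters=[
        UniformIntegerParameterRange('Args/batch_size', min_value=16, max_value=128, step_size=16),
    ],
    objective_metric_title="Accuracy",
    objective_metric_series="Validation",
    objective_metric_sign="max",
)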
so first, yes, I totally agree. This is why clearml-serving has a dedicated statistics module that creates histograms over time; we then push them into Prometheus and connect Grafana to it for dashboards and alerts.
To be honest, I would just use it instead of reporting manually, wdyt?
I'm not familiar with this one, but I think you should be able to control it with:
CLEARML_AGENT__API__HTTP__RETRIES__BACKOFF_FACTOR
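For example (assuming the standard env-var override convention, where the double underscores map to the api.http.retries.backoff_factor configuration entry; the value and queue name are placeholders):

CLEARML_AGENT__API__HTTP__RETRIES__BACKOFF_FACTOR=2.0 clearml-agent daemon --queue default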
I'm getting a lot of bizarre errors running without a docker image attached
I think there is a mix-up in terminology.
ClearML Agent can run in two different modes:
- virtual env mode - where it creates a new venv for every Task executed
- docker mode - where it spins up a docker as the base environment, then inside the docker (in real time) it will fetch the code, install missing python packages etc. There is no need to build a specific docker container; for example you can use the "python:3.10-bullseye" d...
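For example, to launch the agent in docker mode with that image (queue name is a placeholder):

clearml-agent daemon --queue default --docker python:3.10-bullseye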
Hi SkinnyPanda43
I realized that the params are not being saved anymore
Could you test with clearml==1.0.4 ?
Yes, sorry, that wasn't clear 🙂
NICE! MagnificentSeaurchin79 could you PR this fix?
Archived is actually just a "flag" on the Task. If you actually want to delete it (incl artifacts), in the archived view, right click and select delete
Oh, fork the repository (this will create a copy on your GitHub account), this is done from GitHub's web page
Then commit to your repository (on the master branch)
Then in the GitHub page of the repository on your account, you will have a green button suggesting you open a PR 🙂
SmugSnake6 I think the latest version (1.8.0) tries to parallelize it
You can also control max_workers
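For example (a sketch, assuming your clearml version exposes max_workers on upload; names and paths are placeholders):

from clearml import Dataset

dataset = Dataset.create(dataset_project="my_project", dataset_name="my_dataset")
dataset.add_files("/data/to/version")
# limit the number of parallel upload workers
dataset.upload(max_workers=4)
dataset.finalize()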
Hi @<1523704757024198656:profile|MysteriousWalrus11>
in the pipeline quickly between pipeline.add_step() functions?
You mean you want to get access to the parent Task ids and query them directly ?
I think the easiest way is to pass it as one of the parameters
(you can get to the pipeline Task itself from the running component, then get the DAG, but these are internal functions; maybe we should make them external for easier querying?)
pipe.add_step(
    name="stage_process",
    ...
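For example, a sketch of passing a parent step's task id into a step as a parameter (step names, task names, and the parameter key are placeholders; "${stage_data.id}" uses the pipeline's step-reference syntax):

pipe.add_step(
    name="stage_process",
    parents=["stage_data"],
    base_task_project="my_project",
    base_task_name="process_template",
    parameter_override={"General/parent_task_id": "${stage_data.id}"},
)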
Hi @<1523704207914307584:profile|ObedientToad56>
What would be the right way to extend this with, let's say, a custom engine that is currently not supported?
as you said, 'custom' 🙂
This is actually a custom engine (see (3) in the readme, and the preprocessing.py implementing it). I think we should actually add a specific example for custom so this is more visible. Any thoughts on what would...
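For reference, a rough sketch of what such a preprocessing.py for a custom engine could look like (based on the custom-engine example in the clearml-serving readme referenced above; method names and signatures may differ slightly between clearml-serving versions):

from typing import Any


class Preprocess(object):
    def load(self, local_file_name: str) -> Any:
        # load your model from the file fetched from the model repository (placeholder logic)
        self._model = open(local_file_name, "rb").read()
        return self._model

    def preprocess(self, body: dict, state: dict, collect_custom_statistics_fn=None) -> Any:
        # convert the request body into model input
        return body.get("x", [])

    def process(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> Any:
        # run the actual inference yourself (this is what makes it a "custom" engine)
        return data

    def postprocess(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> dict:
        # convert the model output into the response body
        return {"y": data}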