Reputation
Badges 1
25 × Eureka!Great, you can test directly from the master 🙂pip3 install -U git+
The release was supposed to be out this week, got delayed by some py2 support issue, anyhow the release will be almost exactly like the latest we now have on the GitHub repo (and I'm assuming it will be out just after the weekend)
Hmmm maybe
I thought that was expected behavior from poetry side actually
I think this is the expected behavior, hence bug?!
hi ElegantCoyote26
but I can't see any documentation or examples about the updates done in version 1.0.0
So actually the docs are only for 1.0... https://clear.ml/docs/latest/docs/clearml_serving/clearml_serving
Hi there, are there any plans to add better documentation/example
Yes, this is work in progress, the first Item on the list is custom model serving example (kind of like this one https://github.com/allegroai/clearml-serving/tree/main/examples/pipeline )
about...
Thanks SparklingHedgehong28
So I think I'm missing information on what you call "Instance protection" ?
You mean like respining spot instances ? or is it away to review the performance of AWS ASG (i.e. like a watchdog of a sort) ?
CloudyHamster42 what's the trains-server version ?
I should manually copy it to the remote services agents?
The code itself needs to run somewhere, currently this has to be your machine, either you manually run the AWS autoscaler or an agents runs it for you. Make sense ?
Hi RobustRat47
What do you mean by "log space for hyperparameter" , what would be the difference ? (Notice that on the graph itself you can switch to log scale when viewing in the UI) ?
Or are you referring to the hyper parameter optimization, allowing you to add log space ?
let me check a sec
I guess i need to do something like the following after the task was created:
...
Yes!
Why use the "post" callback and not the "pre" callback?
The post get's back the Model object. The pre allows you to decide if you actually want to log in the first place (come to think about it, maybe you want that as well 🙂 )
Just to make sure I understand, running locally creates the Args/command correctly, then when actually executed on the remote machine (i.e. execute_remotely creates the correct Args/command But when the agent actually executes it) it updates back the Args/command as a list. Is that a correct description ?
WickedGoat98 Actually the fileserver replied, so it all looks fine to me.
Try to run the text example again, see if you are still getting the fileserver error .
I try to add it to ClearML Serving, but it call
forward
method by default
If this is the case, then the statement above is odd to me, if this is a custom engine, who exactly is calling " forward
" ?
(in you code example you specifically call generate, as you should)
LittleShrimp86 did you try to run the pipeline form the UI on remote machines (i.e. with the agents)? Did that work?
AstonishingSeaturtle47 How would the code run without the sub-modules? And what is the problem we are trying to solve? (Because unfortunately there is no switch to disable it)
BTW: if you want to sync between artifacts / settings, I would recommend calling task.reload() to get the latest values back from the server.
Hi DrabOwl94
I think that if I understand you correctly you have a Lot of chunks (which translates to a lot of links to small 1MB files, because this is how you setup the chunk size). Now apparently you have reached the maximum number of chunks per specific Dataset version (at the end this meta-data is stored in a document with limited size, specifically 16MB).
How many chunks do you have there?
(In other words what's the size of the entire dataset in MBs)
Hi @<1523704152130064384:profile|SmallGiraffe94>
Yes it is possible!
set the User Properties of a dataset when creating a Dataset with
A bit hackish but should work.
dataset = Dataset.create(dataset_project="project", dataset_name="name")
dataset._task.set_user_properties(key="value")
dataset_ids = Task.query_tasks(
project_name=["project/.datasets/name"],
task_filter=dict(
type=[str(Task.TaskTypes.data_processing)],
exact_match_regex_flag=False,
...
Hi QuaintJellyfish58
like we want set using UI but limited option on dropdown list
You mean like limit the ability of users to choose specific values? (so they do not mess things up?)
JitteryCoyote63
I am setting up a new machine with two rtx 3070 GPU
Nice! you are one of the lucky few who managed to buy them 🙂
Which makes me think that the wrong torch package is installed
I think that torch 1.3.1 is does not support cuda 11 😞
DistressedGoat23 check this example:
https://github.com/allegroai/clearml/blob/master/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.pyaSearchStrategy = RandomSearch
It will collect everything on the main Task
This is a curial point for using clearml HPO since comparing dozens of experiments in the UI and searching for the best is just not manageable.
You can of course do that (notice you can actually order them by scalars they report, and even do ...
Hi ReassuredOwl55
How would I find Tasks that have the same code with different inputs/parameters?
Assuming you have the git repo
you can do:Task.query_tasks(..., task_filter={'_all_'=dict(fields=['script.repository'], pattern='github.com/user/repo'))
wdyt?
great 🙂
two things:
I'm not sure argparse supports dict as a type (I mean it will take anything but I'm not sure it will parse your arguments as dict) I know there was an issue with argparsing, but I think it was solvedbtw: Basically the way clearml-agent works, it does not actually pass the arguments in commandline but directly to the argparser at runtime
What happens if you clone the Task (the one with Args showing and without the explicit task.connect(_args)
and send it to the age...
Hi @<1523706266315132928:profile|DefiantHippopotamus88>
The idea is that clearml-server acts as a control plane and can sit on a different machine, obviously you can run both on the same machine for testing. Specifically it looks like the clearml-sering is not configured correctly as the error points to issue with initial handshake/login between the triton containers and the clearml-server. How did you configure the clearml-serving docker compose?
Hi @<1547028031053238272:profile|MassiveGoldfish6>
The issue I am running into is that this command does not give me the dataset version number that shows up in the UI.
Oh no, I think you are correct, it will not return the version per dataset 😞 (I will make sure we add it)
But with the dataset ID you can grab all the properties:Dataset.get(dataset_id="aabbcc").version
wdyt
GrievingTurkey78 notice that when enqueuing an aborted Task, the agent will not deleted the previously reported metrics/logs