Hi GleamingGrasshopper63
How well can the ML Ops component handle job queuing on a multi-GPU server
This is fully supported :)
You can think of queues as a way to simplify resources for users (you can do more than that, but let's start simple).
Basically you can create a queue per type of GPU, for example a list of queues could be: on_prem_1gpu, on_prem_2gpus, ..., ec2_t4, ec2_v100
Then when you spin up the agents, per type of machine you attach the agent to the "correct" queue.
Int...
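A minimal sketch of the queue-per-GPU-type idea above. The machine inventory and the helper name `agent_command` are hypothetical; the only ClearML piece assumed is the `clearml-agent daemon` command with its `--queue` and `--gpus` options. The script only builds the command strings, it does not launch anything.

```python
# Hypothetical inventory: queue name -> GPU indices the agent on that machine should use.
MACHINES = {
    "on_prem_1gpu": [0],
    "on_prem_2gpus": [0, 1],
}


def agent_command(queue, gpus):
    """Build the shell command that attaches one agent to one queue."""
    gpu_list = ",".join(str(g) for g in gpus)
    return f"clearml-agent daemon --queue {queue} --gpus {gpu_list}"


# One command per machine type; run each on the matching machine.
commands = [agent_command(queue, gpus) for queue, gpus in MACHINES.items()]
for cmd in commands:
    print(cmd)
```

Users then only pick a queue name when enqueuing, and whichever agent listens on that queue picks the job up.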
Hi IrateBee40
What do you have in your ~/clearml.conf ?
Is it pointing to your clearml-server ?
No worries :) glad it worked
Right click on the experiment and select Reset; now you can edit it.
The only downside is that you cannot see it in the UI (or edit it).
You can now do:
data = {'datatask': 'idhere'}
task.connect(data, 'DataSection')
This will create another section named "DataSection" on the configuration tab. Then you will be able to see/edit the input Task.id
JitteryCoyote63 what do you think?
Question - why is this the expected behavior?
It is :) I mean the original python version is stored, but pip does not support replacing the python version. It is doable with conda, but then you have to use conda for everything...
These are both specific cases of the glue, and yes both need to be fixed.
(1) I think is actually a feature, nonetheless we should support it.
FriendlySquid61 could you verify specifically on (2)
Is this caused by running the script with the arguments
Yep :)
Thank you WackyRabbit7, please feel free to remind me if it slips away during my night time (yes I do sleep, contrary to common belief :))
for future reference this is indeed a PEP-610 related bug, f
:)
can we also set the poetry version used?....
Actually the agent assumes poetry is preinstalled (so whatever you already have on the docker) ...
That said, maybe we should install a specific version (after installing pip, we could do that if poetry is selected)
wdyt ?
Hmm... That's what happens with the exception of None/'' if type is str... There is no way to differentiate in the UI.
This is why we opted for this: type=str will "cast" everything to str so you always get str, while not specifying a type will leave the variable as is... If you have an idea on how to support both, feel free to suggest :)
Hi JitteryCoyote63 ,
These properties are usually not available in the UI and are used internally, hence the lack of documentation. Regarding the parent property, it will hold a parent Task.id (str); that said, it has no real effect on the Task itself. You can however search for Tasks with a specific parent ID (for example, this is how the hyper parameter class is using this property).
Hi StormyOx60
Yes, by default it assumes any "file://" or local files are accessible (which makes sense, because if they are not, it will not be able to download them).
Is there some way to force it to download the dataset to a specified location that is actually on my local machine?
You can specify that a specific folder is not "local"; it will then copy the zip locally and unzip it.
Is this what you are after ?
Hi StoutGorilla30
Is it necessary to serve keras model using triton engine?
It is not, but it is the most efficient way to serve keras models, and this is why by default clearml-serving is using Nvidia Triton (we are talking 10x factors)
I would start with the keras example, see that it works, and then work your way into your example (notice you always need to provide the layers from the in/out of the model)
[None](https://github.com/allegroai/clearml-s...
DisgustedBear75 I think this was a UI bug; they are just releasing a new version that fixes it (i.e. a server version). Are you running a self-hosted server?
ReassuredTiger98 yes this is exactly it :)
agent.package_manager.type selects whether the agent should use conda or pip to do the installation. Basically if you develop on conda you should select conda.
The agent will first try to install packages using conda, then it will collect the missing packages and install them into the same environment using only pip.
Hi ReassuredTiger98
It's clearml that needs to support subparser, and it does support it.
What are you seeing in the Args section ?
(Notice that at the end all the parsed args are stored on the global "args" variable after you call parse_args(); clearml will basically take those variables and put them into the Args section)
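A small sketch of what "all the args end up on one variable" looks like with subparsers. It uses only the standard `argparse` module; the subcommand and option names (`train`, `eval`, `--lr`, `--checkpoint`) are made up for illustration, and ClearML itself is not imported here.

```python
import argparse

parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers(dest="command")

train = subparsers.add_parser("train")
train.add_argument("--lr", type=float, default=0.01)

evaluate = subparsers.add_parser("eval")
evaluate.add_argument("--checkpoint", default="last.pt")

# Parse an explicit list so the example does not depend on sys.argv.
args = parser.parse_args(["train", "--lr", "0.1"])

# The chosen subcommand and its options land on one flat namespace.
flat = vars(args)
print(flat)
```

Only the selected subparser contributes options, so `checkpoint` is absent from the result; a flat set of name/value pairs like this is what would be expected to show up in the Args section.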
Thanks CloudyArcticwolf80 ! Let me see if we can reproduce it
And command is a list instead of a single str
"command list", you mean the command argument ?
With remote_execution it is command="[...]" , but on local it is command='train' like it is supposed to be.
I'm not sure I follow, could you expand ?
Thanks ReassuredTiger98 , yes that makes sense.
What's the python version you are using ?
CloudyArcticwolf80 what are you seeing in the Args section ?
what exactly is not working ?
Hi EagerOtter28
Let's say we query another time and get 60k images. Now it is not trivial to create a new dataset B but only upload the diff: ...
Use Dataset.sync (or clearml-data sync) to check which files were changed/added.
All files are already hashed, right? I wonder why clearml-data does not keep files in a semi-flat hierarchy and groups them together to datasets?
It kind of does, it has a full listing of all the files with their hash (SHA2) values, ...
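To illustrate why a per-file hash listing is enough to find the diff between two versions, here is a minimal sketch using only `hashlib`. It is not how clearml-data is implemented; the function names and the in-memory "files" are hypothetical stand-ins.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest of a file's content."""
    return hashlib.sha256(data).hexdigest()


def diff_listing(old, new):
    """Compare two {relative_path: hash} listings."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return added, removed, modified


# Hypothetical content standing in for dataset A and the new query result.
dataset_a = {"img/1.jpg": sha256_of(b"cat"), "img/2.jpg": sha256_of(b"dog")}
new_query = {
    "img/1.jpg": sha256_of(b"cat"),
    "img/2.jpg": sha256_of(b"dog v2"),
    "img/3.jpg": sha256_of(b"bird"),
}

added, removed, modified = diff_listing(dataset_a, new_query)
print(added, removed, modified)
```

Only the added and modified entries would need uploading for a child dataset; unchanged files can be referenced from the parent.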
SteepDeer88
Try the following:
` from clearml import Task
Task.add_requirements("pycocotools-windows", '; platform_system == "Windows"')
Task.add_requirements("pycocotools", '; platform_system != "Windows"')
Task.init(...) `
You should see in your "installed packages" something like:
` pycocotools-windows ; platform_system == "Windows"
pycocotools ; platform_system != "Windows" `
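For reference, the `platform_system` environment marker is compared against the value of Python's `platform.system()`. The sketch below mimics that selection with a hypothetical helper, `pycocotools_requirement`; it is only an illustration of which of the two lines applies on a given OS, not what pip or the agent actually runs.

```python
import platform


def pycocotools_requirement(system=None):
    """Return the requirement name whose marker matches the given OS."""
    system = system or platform.system()
    return "pycocotools-windows" if system == "Windows" else "pycocotools"


# On the current machine:
print(pycocotools_requirement())
```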
i'm Jax, not Manoj! lol.
I know :) I just mentioned that this issue is being actively discussed
Run clearml-agent and enqueue the pipeline ? What am I missing?
Yeah you can ignore those, this is some python GC stuff, seems to be related with the OS and python version
No, Task.create is for creating an external Task, not for logging your own process.
That said, you can probably override the git repo with env vars.
This seems more complicated than I thought... I think you are correct, and it fails to load the entire module, let me check what I can do