Hi ConvincingSwan15
A few background questions:
Where is the code that we want to optimize? Do you already have a Task of that code executed?
"find my learning script"
Could you elaborate? Is this connected to the first question?
Hmm, maybe the original Task was executed with older versions? (before the section names were introduced)
Let's try: DiscreteParameterRange('epochs', values=[30]),
Does that give a warning?
You put it there 🙂 so the assumption is that you know what you are looking for, or use glob? wdyt?
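A minimal sketch of the glob option, assuming you know the file-name pattern of the script but not the folder it ended up in. The function name `find_scripts` and the default pattern `train*.py` are hypothetical, picked only for illustration:

```python
import glob
import os


def find_scripts(root, pattern="train*.py"):
    """Return the sorted paths under root (searched recursively) matching pattern."""
    # With recursive=True, "**" matches any number of sub-folders, including none
    return sorted(glob.glob(os.path.join(root, "**", pattern), recursive=True))
```

Calling `find_scripts(".")` from the working directory would list every matching script below it; if more than one comes back you still need to decide which one you meant.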
EnthusiasticCoyote30 you can register an existing Model with: from clearml import InputModel; model = InputModel.import_model(weights_url="
"...)
Archived is actually just a "flag" on the Task. If you actually want to delete it (incl artifacts), in the archived view, right click and select delete
ReassuredTiger98 yes this is exactly it 🙂
agent.package_manager.type selects whether the agent should use conda or pip to do the installation. Basically, if you develop on conda you should select conda.
The agent will first try to install packages using conda, then it will collect the missing packages and install them into the same environment using only pip.
SoreDragonfly16, in the hyper parameters tab you have "parallel coordinates": next to "add experiment" there is a button saying "values", press on it and there should be "parallel coordinates".
Is that it?
DeliciousBluewhale87
You could also just upload the data (i.e. do not call close). Then you will be able to change it later; obviously, this will make it intractable.
BTW: the clearml-data stores delta changes, so if you only change a few files it will only store those.
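To illustrate the idea of storing only delta changes, here is a rough sketch that hashes every file and compares the result against the parent version. This is not the actual clearml-data implementation; the function names `file_hashes` and `changed_files` are made up for the example:

```python
import hashlib
import os


def file_hashes(folder):
    """Map each file path (relative to folder) to the SHA-256 of its content."""
    hashes = {}
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            with open(full_path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            hashes[os.path.relpath(full_path, folder)] = digest
    return hashes


def changed_files(parent_hashes, current_hashes):
    """Return the sorted relative paths that are new or modified versus the parent."""
    return sorted(
        path
        for path, digest in current_hashes.items()
        if parent_hashes.get(path) != digest
    )
```

Only the paths returned by `changed_files` would need to be stored for the new version; everything else can be referenced from the parent.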
What's the trains server IP? It seems everything is configured with localhost?
GiddyTurkey39 I think I need some more details, what exactly is the scenario here?
Let me know if there is an issue 🙂
the only thing that is missing is some plots on the clearml server (app): when I go to the details of the training I cannot see the matrix confusion, for example (but it exists on the bucket)
How do you report the "matrix confusion" ? (I might have an idea on what's the difference)
I think that what you need is the triggers, check this one:
https://clear.ml/docs/latest/docs/references/sdk/trigger
Yep 🙂
Also maybe worth changing the entry point of the agent docker to always create a queue if it is missing?
This smells like a driver/image issue on the instance VM
What do you get if you add this inside your code?
os.system('nvidia-smi')
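If it helps, a slightly more defensive variant of the same check is sketched below, assuming you would rather capture the output than have it written straight to the console. The helper name `gpu_report` is hypothetical:

```python
import shutil
import subprocess


def gpu_report(executable="nvidia-smi"):
    """Return the tool's output, or None if it is missing or exits with an error."""
    path = shutil.which(executable)
    if path is None:
        # The binary is not on PATH at all (e.g. no driver in the image)
        return None
    result = subprocess.run([path], capture_output=True, text=True)
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    print(gpu_report() or "nvidia-smi is missing or failed")
```

A `None` result would be consistent with a driver/image problem, although it does not tell you which of the two it is.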
okay this points to an issue with the k8s glue, I think it somehow failed to launch the pod. Can you send me the log of the clearml-k8s-glue ?
Hi UnsightlySeagull42
Do you mean how to pass user/pass (user/token) to the clearml-agent so it can clone your repository?
https://github.com/allegroai/clearml-agent/blob/a2db1f5ab5cbf178840da736afdc370cfff43f0f/docs/clearml.conf#L18
NICE! MagnificentSeaurchin79 could you PR this fix?
OddAlligator72 okay, that is possible, how would you specify the main python script entry point? (wouldn't that make more sense rather than a function call?)
How do you determine which packages to require now?
Analysis of the actual repository (i.e. it will actually look for imports 🙂 ); this way you get the exact versions you have, but not the clutter of the entire virtual environment
That is a good point: when the UI returns a popup error you have all the info, but on "toaster" messages, I guess you do not?!
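To give a feel for what "looking for imports" can mean, here is a small sketch using the standard `ast` module. It only shows the import-discovery step and is an illustration, not the code ClearML runs; the function name `imported_packages` is hypothetical, and resolving the installed version of each package would be a separate step:

```python
import ast


def imported_packages(source):
    """Return the sorted top-level package names imported by Python source code."""
    packages = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b as c" contributes "a"
            packages.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # "from a.b import c" contributes "a"; relative imports are skipped
            packages.add(node.module.split(".")[0])
    return sorted(packages)
```

Running this over every `.py` file in a repository gives the set of packages the code really uses, which is typically much smaller than a full `pip freeze` of the environment.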
As long as it does not make the toaster message too large, seems like a good idea to add
AdventurousRabbit79 you mean like minio / ceph ?
Oh that's definitely off 🙂
Can you send a quick toy snippet to reproduce it ?
Hmm I see your point.
Any chance you can open a github issue with a small code snippet to make sure we can reproduce and fix it?
Can't figure out what made it get to this point
I "think" this has something to do with loading the configuration and setting up the "StorageManager".
(in other words setting the google.storage)... Or maybe it is the lack of google storage package?!
Let me check
CrookedWalrus33 can you post the clearml.conf you have on the agent machine?
basically use the template 🙂 we will deprecate the override option soon
But essentially Prefect also has agents to run jobs on machines where the processes run (which seems to be exactly the same model as in ClearML),
Yes, it is conceptually very similar
this data is highly regulated data, ...
The main difference is that with ClearML the agents are running on Your machines (either local or on Your cloud account), so the clearml-server does not actually have access to the data streaming through it.
Does that make sense ?