BattyDove56 , that was my suspicion as well, that's why I wanted to see the logs 🙂
Doesn't look like anything critical 🙂
Hi IdealGorilla64 , can you please provide a screenshot? How much smaller than 200 have you tried? And finally, which version of ClearML are you using? 🙂
Hi DistressedGoat23 , can you please elaborate a bit on what you'd like to do?
`clearml-agent` is for orchestration - remote execution. `clearml` is the python package you need to install and add the magic lines of code:
https://github.com/allegroai/clearml
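Roughly, those "magic lines" look like this (the project/task names here are just placeholders):
```python
from clearml import Task

# Creates the experiment and starts automatic logging
task = Task.init(project_name="examples", task_name="my experiment")
```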
It depends on what you use K8s for
What is your use case though? I think the point of local/remote is that you can debug in local
Hi @<1523701523954012160:profile|ShallowCormorant89> , I think you can simply spin down all the containers and copy everything in /opt/clearml/
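Something along these lines (a rough sketch, assuming the default docker-compose deployment with everything under /opt/clearml - adjust paths to your setup):
```python
import shutil
import subprocess

# Stop the server containers first (assumes the docker-compose file lives in /opt/clearml)
subprocess.run(
    ["docker-compose", "-f", "/opt/clearml/docker-compose.yml", "down"],
    check=True,
)

# Copy all the server data (MongoDB, Elastic, Redis, fileserver data and configs)
shutil.copytree("/opt/clearml", "/opt/clearml-backup")
```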
I'm sorry. I think I wrote something wrong. I'll elaborate:
The SDK detects all the packages that are used during the run - The Agent will install a venv with those packages.
I think there is also an option to specify a requirements file directly in the agent.
Is there a reason you want to install packages from a requirements file instead of just using the automatic detection + agent?
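If you do want the requirements file route, a rough sketch (assuming a clearml version that supports it, and an example file path) would be:
```python
from clearml import Task

# Log the packages from this file instead of the auto-detected ones.
# Must be called before Task.init(); the path here is just an example.
Task.force_requirements_env_freeze(requirements_file="requirements.txt")

task = Task.init(project_name="examples", task_name="pinned requirements")
```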
Hi FierceHamster54 , I'm afraid currently this is not possible. Maybe open a Github issue to track this 🙂
but I want to change what is shown by the GUI so that would need to be a setting on the server itself?
Can you please elaborate?
I would suggest googling that error
Hi PerplexedElk26 , it seems you are correct. This capability will be added in the next version of the server.
Check the docker containers logs to see if there are any errors in them when you try to view the worker stats
I think this is because you're working on a "local" dataset. The dataset only closes up after it is finalized. Can you describe your scenario and what your expected behavior was?
Hi @<1734020162731905024:profile|RattyBluewhale45> , are they running anything? Can you see machine statistics on the experiments themselves?
Can you share a screenshot of the workers page?
StickyCoyote36 , I think that is the solution. Is there a reason you want to ignore the "installed packages"? After all, those are the packages that the task was run with.
Hi 🙂
Regarding the input issue - try setting `sdk.development.default_output_uri` in your ~/clearml.conf to wherever you want it uploaded. I'm guessing that when you're running the original, the input model is created through the script and downloaded?
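For the default_output_uri part, a rough sketch (bucket/path are placeholders):
```python
# In ~/clearml.conf (rough sketch):
#   sdk.development.default_output_uri: "s3://my-bucket/clearml"
#
# Or per task, through the SDK:
from clearml import Task

task = Task.init(
    project_name="examples",
    task_name="upload outputs",
    output_uri="s3://my-bucket/clearml",  # placeholder destination
)
```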
Regarding tagging - I think you need to connect tags individually to output models if you wanna connect it only to outputs
So if I manually add a dataset (many Excel files) in a folder, and copy that folder to NFS
How would you do that?
Also one more question: can we have more than one storage option, a secondary storage maybe? If yes, which changes need to be performed?
You can. But that would entail creating a new dataset with output_uri pointing to the new location
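Roughly like this (names and URIs are placeholders; in older clearml versions you may need to pass the destination as output_url to upload() instead):
```python
from clearml import Dataset

new_dataset = Dataset.create(
    dataset_name="my_dataset",
    dataset_project="datasets",
    parent_datasets=["<previous_dataset_id>"],  # optional, to keep lineage
    output_uri="s3://secondary-storage/datasets",  # the new storage location
)
new_dataset.add_files("/path/to/local/folder")
new_dataset.upload()
new_dataset.finalize()
```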
I'm guessing this is a self deployed server, correct?
DilapidatedDucks58 , regarding internal workings - MongoDB: all experiment objects are saved there. Elastic: console logs, debug samples and scalars are all saved there. Redis: some stuff regarding agents, I think
If you go into the settings, at the bottom right you will see the version of the server
And were the experiments run on agents or locally (i.e. PyCharm/terminal/VSCode/Jupyter/...)?
Also, if you open Developer Tools, do you see any errors in the console?
What version is the server? Do you see any errors in the API server or webserver containers?
DistressedGoat23 , how are you running this hyperparameter tuning? Ideally you need to have
```python
from clearml import Task

task = Task.init()
```
In your running code, from that point onwards you should have tracking
VictoriousPenguin97 , I managed to reproduce the issue with 1.1.3 as well. It should be fixed in the next version 🙂
Meanwhile, as a workaround, please try using a shorter file name. The file name you provided is almost 200 characters long.
Keeping it under 150 characters will still work (I made sure to test it).
GrievingTurkey78 , can it be a heavy calculation that takes time? ClearML has a fallback to time instead of iterations if a certain timeout has passed. You can configure it with task.set_resource_monitor_iteration_timeout(seconds_from_start=<TIME_IN_SECONDS>)
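For example (the timeout value here is just an illustration):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="slow iterations")

# If an iteration takes longer than this many seconds, resource monitoring
# falls back to reporting against time instead of iterations
task.set_resource_monitor_iteration_timeout(seconds_from_start=1800)
```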