Hi VexedCat68
One of my steps just finds the latest model to use. I want the task to output the id, and the next step to use it. How would I go about doing this?
When you say "I want the task to output the id", do you mean passing it to the next step?
Something like this one:
https://github.com/allegroai/clearml/blob/c226a748066daa3c62eddc6e378fa6f5bae879a1/clearml/automation/controller.py#L224
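A minimal sketch of how that could look, assuming the steps are added with PipelineController.add_step and wired together through parameter_override. The project, task, step and parameter names below are hypothetical. The `${find_model.id}` reference resolves to the Task ID of the earlier step (not a model ID), so the second step would still need to fetch whatever the first step stored on that Task.

```python
def id_override(step_name, parameter="General/source_task_id"):
    """Build a parameter_override entry pointing at the Task ID of `step_name`.

    `parameter` is a hypothetical hyperparameter name on the consuming step.
    """
    return {parameter: "${%s.id}" % step_name}


def build_pipeline():
    # Imported lazily so the helper above can be used without clearml installed.
    from clearml import PipelineController

    # All names here are placeholders for illustration.
    pipe = PipelineController(name="model pipeline", project="examples", version="1.0.0")
    pipe.add_step(
        name="find_model",
        base_task_project="examples",
        base_task_name="find latest model",
    )
    pipe.add_step(
        name="use_model",
        parents=["find_model"],
        base_task_project="examples",
        base_task_name="use model",
        # The second step receives the Task ID of the first step as a parameter.
        parameter_override=id_override("find_model"),
    )
    return pipe
```

Calling build_pipeline() needs a configured ClearML server and the two base tasks to exist; the helper on its own only builds the override dictionary.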
Sorry @<1524922424720625664:profile|TartLeopard58> 😞 we probably missed it
clearml-session is still being developed 🙂
Which issue are you referring to ?
Hi TrickyRaccoon92
Are you sure plotly (the front-end module displaying the plots in the UI) supports it ?
WickedGoat98 this is awesome! Let me know how I could help 🙂
BTW: I checked regarding the plot comparison, this is a BE issue due to the size of the plot, I was told a fix will be deployed in a day or two.
Hi @<1643060801088524288:profile|HarebrainedOstrich43>
try this RC let me know if it works 🙂
pip install clearml==1.13.3rc1
Queues can have multiple workers, and that implies multiple instances of a task can run concurrently.
@<1533619716533260288:profile|SmallPigeon24> as long as these are the Exact same instances you can have them running simultaneously (think multi node training). That said, each one should "know" not to report over the others, because of course it will overwrite the reports.
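A rough sketch of the "know not to report over the others" idea, assuming the node rank is exposed through the OMPI_COMM_WORLD_NODE_RANK environment variable (which variable carries the rank depends on the launcher, so treat it as an assumption). The helper names are hypothetical; `logger` is expected to offer a report_scalar(title, series, value, iteration) method like the ClearML Logger does.

```python
import os


def node_rank(env=None):
    """Return this node's rank, defaulting to 0 when the variable is missing."""
    env = os.environ if env is None else env
    return int(env.get("OMPI_COMM_WORLD_NODE_RANK", "0"))


def should_report(env=None):
    """Only the rank-0 node reports, so the nodes do not overwrite each other."""
    return node_rank(env) == 0


def report_scalar(logger, title, series, value, iteration, env=None):
    """Forward the scalar to `logger` on rank 0 only; return True if it was sent."""
    if should_report(env):
        logger.report_scalar(title=title, series=series, value=value, iteration=iteration)
        return True
    return False
```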
Back to your point on multiple agents:
You cannot have two Tasks in the same queue, that means that a single agen...
I think I'm missing the connection between the hash-ids and the txt file, or in other words, why does the txt file contain the full path rather than a relative path?
SteadyFox10 TRAINS_CONFIG_FILE or CLEARML_CONFIG_FILE
What happens when you call:
from clearml.backend_interface.task.repo import ScriptInfo
print(ScriptInfo._ScriptInfo__legacy_jupyter_notebook_server_json_parsing(None))
That is awesome!
If you feel like writing a bit about the use-case and how you solved it, I think AnxiousSeal95 will be more than happy to publish something like that 🙂
Could you see if that makes a difference ?
The configuration tab -> configuration objects -> pipeline is empty
That's the reason it is doing nothing 😞
How come it is empty if you Cloned the local one?
In any case, do you have any suggestion of how I could at least hack tqdm to make it behave? Thanks
I think I know what the issue is: it seems tqdm is using Unicode for the CR, this is the 1b 5b 41 sequence I see in the binary log.
Let me see if I can hack something for you to test 🙂
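For reference, the bytes 1b 5b 41 decode to ESC [ A, the ANSI "cursor up" control sequence. Below is a minimal sketch of one way to strip such sequences from a captured log before parsing progress lines. It is only an illustration with hypothetical helper names, not the fix that went into clearml.

```python
import re

# The three bytes seen in the binary log: ESC, '[', 'A' (ANSI "cursor up").
CURSOR_UP = bytes.fromhex("1b5b41")

# Matches simple CSI sequences such as ESC [ A (cursor up) or ESC [ 2 K (erase line).
ANSI_CSI = re.compile(rb"\x1b\[[0-9;]*[A-Za-z]")


def strip_cursor_codes(raw: bytes) -> bytes:
    """Remove ANSI CSI control sequences from a raw byte stream."""
    return ANSI_CSI.sub(b"", raw)


def last_progress_line(raw: bytes) -> str:
    """Return the last non-empty line, treating CR as a line break."""
    cleaned = strip_cursor_codes(raw).replace(b"\r", b"\n")
    text = cleaned.decode("utf-8", errors="replace")
    lines = [line for line in text.split("\n") if line.strip()]
    return lines[-1] if lines else ""
```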
I see TightElk12
You can always set the OS environment variables CLEARML_API_HOST, CLEARML_WEB_HOST and CLEARML_FILES_HOST with the correct configuration. Or you can simply set CLEARML_NO_DEFAULT_SERVER=1, which will prevent any usage of the default demo server. wdyt?
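A small sketch of both options done from Python, assuming the variables are set before clearml is imported and Task.init is called. The helper names are hypothetical, and the host URLs in the usage note are placeholders for your own server.

```python
import os


def configure_clearml_env(api_host, web_host, files_host, env=None):
    """Point ClearML at an explicit server through environment variables."""
    env = os.environ if env is None else env
    env["CLEARML_API_HOST"] = api_host
    env["CLEARML_WEB_HOST"] = web_host
    env["CLEARML_FILES_HOST"] = files_host
    return env


def disable_default_server(env=None):
    """Prevent any fallback to the default demo server."""
    env = os.environ if env is None else env
    env["CLEARML_NO_DEFAULT_SERVER"] = "1"
    return env
```

For example, configure_clearml_env("http://localhost:8008", "http://localhost:8080", "http://localhost:8081") followed by disable_default_server() would cover both suggestions at once.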
Hi TrickyRaccoon92
Yes please update me once you can, I would love to be able to reproduce the issue so we could fix for the next RC 🙂
When you say status, what do you mean? Is it active? Running a task?
Hmm that is odd, can you send an email to support@clear.ml ?
If this is the case, there is nothing you need to change, just provide the docker image (no need to pass packages)
ClumsyElephant70
Could it be virtualenv package is not installed on the host machine ?
(From the log it seems you are running in venv mode, is that correct?)
So the agents on different nodes will probably require different cuda-version images.
That makes sense SarcasticSquirrel56
I would edit the helm chart (or deploy manually) based on a selector that will select the different nodes/gpus and assign the correct containers (i.e. matching CUDA versions to the diff GPUs / drivers)
BTW: you can also play around with the k8s glue, which would dynamically spin up pods based on clearml Tasks.
wdyt?
AdventurousButterfly15
Despite having manually installed this torch version, during task execution agent still tries to install it somehow and fails:
Are you running the agent in venv mode? or docker mode?
Notice that in docker mode it inherits the python packages from the container, and adds/reinstalls missing packages. In venv mode it creates a New clean venv (there is no way to inherit a venv, venv can only inherit from system wide installed packages)
The idea is that you cannot e...
Yes MuddySquid7, it automatically detects it (regardless of you uploading the DF as an artifact).
How are you saving the dataframe ?
(it will auto log any joblib.dump call, is that it?)
That means I need to pass a single zip file to the path argument in add_files, right?
Actually the opposite: you pass a folder (of files) to add_files. Then add_files remembers the files' location (and pre-calculates the hash of the files' content). When you call upload it will actually compress the files that changed into a zip file (or files, depending on the chunk size), and upload the files to the destination (as specified in the upload call...
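To make the idea concrete, here is a stdlib-only sketch of "hash the folder, then zip only what changed". It is an illustration of the concept and not ClearML's implementation: the function names are hypothetical, SHA-256 is an assumed hash choice, and chunking and the actual upload are left out.

```python
import hashlib
import zipfile
from pathlib import Path


def hash_folder(folder):
    """Map each file's relative path to the SHA-256 of its content."""
    root = Path(folder)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(previous, current):
    """Names of files that are new or whose content hash differs."""
    return sorted(name for name, digest in current.items() if previous.get(name) != digest)


def pack_changed(folder, previous, zip_path):
    """Zip only the changed files; zip_path should live outside `folder`."""
    current = hash_folder(folder)
    changed = changed_files(previous, current)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in changed:
            archive.write(Path(folder) / name, arcname=name)
    return current, changed
```

The first call with an empty `previous` packs everything; later calls, given the hashes returned by the earlier one, pack only the files whose content changed.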
The fact is that I use docker for running clearml server both on Linux and Windows.
My question was on running the agent, is it running with the --docker flag, i.e. docker mode
Also, just forgot to note, that I'm running clearml-agent and clearml processes in virtual environment - conda environment on Windows and venv on Linux.
Yep that answers my question above 🙂
Does it make any sense to change system_site_packages to true if I r...
Could you give an example of such configurations ?
(e.g. what would be diff from one to another)
Hi GiganticTurtle0
Sure, OutputModel can be manually connected:
model = OutputModel(task=Task.current_task())
model.update_weights(weights_filename='localfile.pkl')
Hi DangerousDragonfly8
You mean you want to trigger something when users archive a Task ?
Correct,
Notice that the glue has its own defaults and the ability to override containers from the UI
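import os
from trains import Task
# Simulate a non-master node (rank 1) attached to an existing master Task
os.environ['TRAINS_PROC_MASTER_ID'] = '1:da0606f2e6fb40f692f5c885f807902a'
os.environ['OMPI_COMM_WORLD_NODE_RANK'] = '1'
task = Task.init(project_name="examples", task_name="Manual reporting")
print(type(task))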
Should be: <class 'trains.task.Task'>
I'm assuming those errors are from the triton containers? Were you able to run the simple pytorch mnist serving example from the repo?