JitteryCoyote63 with pleasure 🙂
BTW: the Ignite TrainsLogger will be fixed soon (I think it's on a branch already by SuccessfulKoala55 ) to fix the bug ElegantKangaroo44 found. should be RC next week
But what should I do? It does not work, it says incorrect password as you can see
How are you spinning the agent machine ?
Basically port 10022 on the host (the agent machine) is routed into the container, but it still needs to be open on the host machine. Could it be that it is behind a firewall? Are you (the client side running clearml-session) on the same network as the machine running the agent ?
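A quick way to check the firewall question from the client side is to try opening a TCP connection to the port yourself. This is only a minimal sketch using the Python standard library: `port_is_reachable` is a hypothetical helper (not part of clearml-session), and the host name in the comment is a placeholder for the agent machine's address.

```python
import socket


def port_is_reachable(host: str, port: int = 10022, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike
        return False


# Example (placeholder address, replace with the agent machine):
#   port_is_reachable("agent-host.example", 10022)
```

If this returns False from the client but True when run on the agent machine itself, a firewall or a different network is the likely culprit.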
BitingKangaroo95 can you post here the entire console output of clearml-session (including full command line) ?
SillyPuppy19 I think this is a great idea, basically having the ability to have a callback function called before aborting/exiting the process.
Unfortunately today abort will give the process 2 seconds to gracefully quit and then it kills the process. It was not designed to just send an abort signal, as these, more often than not, will not actually terminate the process.
Any chance I can ask you to open a GitHub Issue and suggest the callback feature. I have a feeling a few more users ...
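Until something like that exists, here is a rough sketch of what a user-side "callback before exit" could look like. It assumes the process receives a catchable SIGTERM before it is killed, which this thread does not confirm, and `install_abort_callback` is a hypothetical helper, not a ClearML API. The 2-second constant only mirrors the grace period mentioned above.

```python
import signal
import sys
import time

GRACE_SECONDS = 2.0  # grace period mentioned in the thread


def install_abort_callback(callback, signum=signal.SIGTERM, exit_process=True):
    """Run `callback` when `signum` arrives, then optionally exit the process.

    Returns the previously installed handler so it can be restored.
    """
    def _handler(received_signum, frame):
        started = time.monotonic()
        callback()
        elapsed = time.monotonic() - started
        if elapsed > GRACE_SECONDS:
            # The process may already have been killed by now; this is best effort
            print(f"cleanup took {elapsed:.1f}s, longer than the grace period",
                  file=sys.stderr)
        if exit_process:
            sys.exit(0)

    # Signal handlers can only be installed from the main thread
    return signal.signal(signum, _handler)
```

The callback has to be fast (for example, flushing a small state file), since anything slower than the grace period will most likely be cut off.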
DisgustedDove53 , TrickySheep9
I'm all for it!
I can think of two options here, (1) use the k8s glue + apply template with ports mode see discussion https://clearml.slack.com/archives/CTK20V944/p1628091020175100
(2) create an interface (queue) to launch arbitrary job on the k8s cluster, with the full pod definition on the Task. This will allow the clearml-session to setup everything from the get go.
How would you interface with the k8s operator, and what exactly will it do?
(BTW: the reas...
Ssh is used to access the actual container, all other communication is tunneled on top of it. What exactly is the reason to bind to 0.0.0.0 ? Maybe it could be a flag, but I'm not sure what the scenario is and what we are solving, thoughts?
Hi ThoughtfulBadger56
If I clone and enqueue the cloned task on the webapp, does the clearml server execute the whole cmd above?
You mean agent will execute it? Do you have Task.init inside your code ?
agentservice...
Not related, the agent-services job is to run control jobs, such as pipelines and HPO control processes.
GiganticTurtle0 we had this discussion in the wrong thread, I moved it here.
Moved from the wrong thread
Martin.B [1:55 PM]
GiganticTurtle0 the sample mock pipeline seems to be running perfectly on the latest code from GitHub, can you verify ?
Martin.B [1:55 PM]
Spoke too soon, sorry 🙂 issue is reproducible, give me a minute here
Alejandro C [1:59 PM]
Oh, and which approach do you suggest to achieve the same goal (simultaneously running the same pipeline with differen...
... training script was set to upload every epoch. Seems like this resulted in a torrent of metrics being uploaded.
oh that makes sense, so basically you were bombarding the server with requests, and ending with kind of denial of service
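One generic way to avoid this kind of flood is to throttle on the client side, so that at most one upload goes out per time window. The sketch below is an assumption-laden illustration: `ThrottledReporter` and its `send` callable are hypothetical and not part of the ClearML SDK. It keeps only the latest skipped payload, so it fits things like checkpoint uploads where only the newest one matters, not metrics where every point counts.

```python
import time


class ThrottledReporter:
    """Forward payloads to `send`, but at most once per `min_interval_sec`."""

    def __init__(self, send, min_interval_sec=30.0, clock=time.monotonic):
        self._send = send
        self._min_interval = min_interval_sec
        self._clock = clock          # injectable for testing
        self._last_sent = None       # time of the last forwarded payload
        self._pending = None         # latest payload that was held back

    def report(self, payload):
        """Send now if the interval has passed; otherwise keep it as pending."""
        now = self._clock()
        if self._last_sent is None or now - self._last_sent >= self._min_interval:
            self._send(payload)
            self._last_sent = now
            self._pending = None
            return True
        self._pending = payload
        return False

    def flush(self):
        """Send the held-back payload, if any (call once at the end of training)."""
        if self._pending is not None:
            self._send(self._pending)
            self._last_sent = self._clock()
            self._pending = None
```

Calling `flush()` at the end makes sure the final state is not lost just because it arrived inside a throttled window.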
Awesome ! thank you so much!
1.0.2 will be out in an hour
Hi CrookedAlligator14
or is underlying data also accessible?
What do you mean by "underlying data" ?
Here is a nice hack for you:
Task.add_requirements(package_name='carla', package_version="> 0 ; python_version < '2.7' # this hack disables the pip install")
This will essentially make sure the agent will skip the installation of the package, but at least you will know it is there.
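Why the hack works, as a minimal sketch: everything after the `;` is a PEP 508 environment marker, and `python_version < '2.7'` is false on any Python 3 interpreter, so pip skips the line. The snippet assumes the package name and version string are simply joined into one requirement line (not confirmed in this thread), and `python_version_marker_lt` is a hypothetical helper that only mimics this single marker. The trailing `# ...` comment from the hack is left out here for brevity.

```python
import sys


def python_version_marker_lt(bound: str) -> bool:
    """Mimic the marker `python_version < '<bound>'` for the running interpreter."""
    current = sys.version_info[:2]                      # e.g. (3, 11)
    limit = tuple(int(part) for part in bound.split("."))
    return current < limit


package_name = "carla"
package_version = "> 0 ; python_version < '2.7'"

# Assumed form of the resulting requirements line
requirement_line = f"{package_name} {package_version}"

# pip installs a requirement only when its marker evaluates to True
will_install = python_version_marker_lt("2.7")
print(f"{requirement_line!r} -> installed: {will_install}")
```

So the package still shows up in the Task's requirements, but the install is skipped on any Python 3.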
Hi ObedientTurkey46
Why do tags only show on a version level, but not on the dataset-level? (see images)
Tags of datasets are tags on "all the dataset versions" i.e. to help someone locate datasets (think locating projects as an analogy). Dataset Version tags are tags on a specific version of the dataset, helping users to locate a specific version of the dataset. Does that make sense ?
Hi ResponsiveCamel97
The agent generates a new configuration file to be mounted into the docker, with all the new folders as they will be seen inside the docker itself. One of the changes is the system_site_packages as inside the docker we want the new venv to inherit everything from the docker system installed packages.
Make sense ?
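To make the system_site_packages part concrete, here is how the same idea looks with plain Python venvs. This is only an illustration of the general mechanism using the standard `venv` module, not the agent's actual implementation; `create_inheriting_venv` and `inherits_system_packages` are hypothetical helper names.

```python
import tempfile
import venv
from pathlib import Path


def create_inheriting_venv(target_dir):
    """Create a venv that can also see the system-installed packages."""
    # with_pip=False keeps creation fast and avoids needing ensurepip
    builder = venv.EnvBuilder(system_site_packages=True, with_pip=False)
    builder.create(target_dir)
    return Path(target_dir) / "pyvenv.cfg"


def inherits_system_packages(cfg_path):
    """Read pyvenv.cfg and report whether system site-packages are included."""
    for line in Path(cfg_path).read_text().splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "include-system-site-packages":
            return value.strip().lower() == "true"
    return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_inheriting_venv(Path(tmp) / "env")
        print(inherits_system_packages(cfg))
```

Inside a docker image that already ships the heavy packages, a venv created this way can reuse them instead of reinstalling everything.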
Hi LudicrousParrot69
Not sure I follow, is this pyfunc running remotely ?
Or are you looking for interfacing with previously executed Tasks ?
Hmm are you running from inside the Kaggle jupyter thing ?
How can I track in clearML that this and that row was part of experiment x because it belonged to test/training data set y?
Hi SorePelican79
The experiments themselves will have a link to the Dataset they were using. From a dataset perspective, the idea is not to limit you, so essentially it will package all your files, and retrieve them when you fetch the dataset. In terms of specifying a row / sample, my suggestion is to mark those rows when training a...
Hmm I assume it is not running from the code directory...
(I'm still amazed it worked the first time)
Are you actually using "." ?
JitteryCoyote63 could you test the latest RC 😉
pip install clearml-agent==0.17.2rc4
Just fixed, will be merged later. Basically there are some fields you are not supposed to change post execution (but system tags should be exempt from that). The SDK checks before the backend does, so you get a nice error 🙂 anyhow the backend will obviously allow it
Sorry I missed the additional "." in the _update_requirements
Let me check ....
Task.add_requirements('.')
Should work
Great, you can test directly from the master 🙂
pip3 install -U git+