Reputation
Badges 1
25 × Eureka!If the load balancer it Gateway can do the computation and leverage caching,
Oh that's True. But unfortunately out of scope for the open-source (well at the end someone needs to pay our salaries π )
Iβd prefer not to have our EC2 instance directly exposed to the public Internet.
Yep, I tend to agree π
(with older clearml versions thoughβ¦).
Yes, we added content type header for the files when uploading to S3 (so it is easier for users to serve them back). But it seems the python 3.5 casting from Path to str breaks it mimetype call....
Hmm, #790 should be solved in 1.7.2
Yes, I always see the "model uploaded completed" for such stuck tasksAny chance this is reproducible ?
How many processes do you see running (i.e. ps -Af | grep python) ?
What is the training framework? is it multiprocess ? how are you launching the process itself? is it Linux OS? is it running inside a specific container ?
@PipelineDecorator.component(repo="..")
The imports are not recognized - they are not on the pythonpath of the task that the agent starts.
RoughTiger69 add the imports inside the functions itself, you cal also specify the, on the component@PipelineDecorator.component(..., package=["pcakge", "package==1.2.3"])
or@PipelineDecorator.component(...): import pandas as pd # noqa ...
With pleasure, I'll make sure we officially release RC1 soon :)
Please send the full log, I just tested it here, and it seems to be working
Hi ExcitedCat13
Sure, download the plugin from the git repo (Install instructions in the repo).
Regarding remote debugging, are referring to ssh ?
The plugin itself is designed to make sure that when you work on a remote machine with pycharm clearml will log the local git repo and changes (as the .git folder is not synced to the remote machine)
BTW: seems like conda doesn't support git+git:// packages
How about switching to pip ? you can still run the entire thing from conda env, it will just use pip & venv to install everything, other than that it should work as expected.
Decorators are good π
Something along the lines of
` @PipelineDecorator.pipeline(...)
def pipeline(skip_a=False):
if not skip_a:
a = step_a()
else:
# somehow get a previous A?
# let's call it cached A
a = "replace with real'
step_b(a)
... `Is this the gist?
If it is, this looks like, "how can I control whether A is cached or not", is that correct?
I want to run only that sub-dag on all historical data in ad-hoc manner
But wouldn't that be covered by the caching mechanism ?
ok, yes, but this will install the package of the branch specified there.
Correct
So If im working on my own branch and want to run an experiment, I would have to manually put in the git path my current branch name.
When you say your own branch you mean local (i.e. not pushed to remote git repo) ?
Sigint (ctrl c) only
Because flushing state (i.e. sending request) might take time so only when users interactively hit ctrl c we do that. Make sense?
Hmm there was a commit there that was "fixing" some stuff , apparently it also broke it
Let me see if we can push an RC (I know a few things are already waiting to be released)
Agent works when I am running it from virtual environment but stucks in the same place all the time when I using Docker
Can you please provide a log? I'm not sure what it means stuck
Yes, actually ensuring pip is there cannot be skipped (I think in the past it cased to many issues, hence the version limit etc.)
Are you saying it takes a lot of time when running? How long is the actual process that the Task is running (just to normalize times here)
PunySquid88 RC1 is out with a fix:pip install trains-agent==0.14.2rc1
they are just neighboring modules to the function I am importing.
So I think that is you specify the repo,, on the remote machine you will end with the code of the component sitting at the root folder of the repo, from there I assume you can import the rest, the root git path should be part of your PYTHONPATH automatically.
wdyt?
So I had to add it explicitly via a docker init script
Oh yes, that makes sense, can't think of a better hack other than sys.path.append(os.path.join(os.path.dirname(__file__), "src"))
So the only difference is how I log in into machine to start clear-ml
the only different that I can think of is the OS Environments in the two login types:
can you run export
in the two cases and check the diff between them?export
This is why we recommend using pip and not conda ...
PunySquid88 after removing the "//gihub" package is it working ?
with conda ?!
VirtuousFish83 I remember an issue on github with something similar , what's the cleamrl- server version you are using ?
HurtWoodpecker30
The agent uses the
requirements.txt
)
what do you mean by that? aren't the package listed in the "Installed packages" section of the Task?
(or is it empty when starting, i.e. it uses the requirements.txt from the github, and then the agent lists them back into the Task)
Can the user be overwritten during task configuration (I don't see such an option in the documentation)?
Hm, not really π this is tied with security feature on top.
That said,
stored them as a k8s secret and they are reused whenever anyone from our ML team starts a new ML model training
Does that mean you are running an Agent on the k8s cluster? what's exactly the flow that causes your k8s credentials to be used
CheerfulGorilla72
yes, IP-based access,
hmm so this is the main downside of using IP based server, the links (debug images, models, artifacts) store the full URL (e.g. http://IP:8081/ http://IP:8081/... ) This means if you switched IP they will no longer work. Any chance to fix the new server to the old IP?
(the other option is somehow edit the DB with the links, I guess doable but quite risky)
Hi FrothyShark37
is the task scheduler only acessible through the SDK?
yes, in the open source version this is strictly code based. I know the enterprise tier has a UI for it, but in terms of features I believe this is equivalent
Hi @<1687653458951278592:profile|StrangeStork48>
secrets manager per se,
Quick question, are you running the trains-server over http or https ?
Weβd be using https in production
Nice π
@<1687653458951278592:profile|StrangeStork48> , I was reading this thread trying to understand what exactly is the security concern/fear here, and I'm not sure I fully understand. Any chance you can elaborate ?
Assuming it was hashed, the seed would be stored on the same server, so knowing both would allow me the same access, no?