ReassuredTiger98 could you provide more information? (versions, scenario, etc.)
Hi @<1600661423610925056:profile|StrongMouse81>
using serving base url and also other endpoint of model we add using:
clearml-serving model add
we get the attached response:
And other model endpoints are working for you?
you mean to spin a pod with the agent inside it (daemon in services mode).
Or connect the services queue to the k8s cluster (i.e. define the pod template that uses cpu with not a lot of ram)?
GloriousPenguin2 could you open a GitHub issue on it? Just making sure this will actually get fixed.
Yey!
Out of curiosity, what's the workflow with snowflake?
not sure if this is considered a bug or not! but I'd happily make an issue on github if needed.
I think we should, at least for the sake of transparency and visibility.
thanks again for all your help.
My pleasure
In the "installed packages" section you should have "nvidia-dali-cuda110". In the agent's clearml.conf you should add: extra_index_url: ["", ]
https://github.com/allegroai/clearml-agent/blob/master/docs/clearml.conf#L78
Should solve the issue
DeliciousSeal67 the agent will use the "installed packages" section in order to install packages for the code. If you clear the entire section (you can do that in the UI or programmatically) then it will revert to requirements.txt
Make sense?
do I need to create a brand new dataset with a new name that inherits from the original?
Yes, you just create a new version, specify the parent one, add changes and close it.
If you later need to, you can squash a version (same idea as git squash). Make sense?
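A minimal sketch of the "new version with a parent" flow described above, assuming the clearml package is installed and configured. The function name create_child_version and its arguments are made up for illustration; Dataset.create, add_files, upload and finalize are the SDK calls this relies on.

```python
def create_child_version(parent_id, changed_path, name, project):
    """Create a new dataset version on top of an existing one.

    Hypothetical helper: only the files under `changed_path` are added,
    everything else is inherited from the parent version.
    """
    # Imported lazily so the sketch can be read without clearml installed.
    from clearml import Dataset

    child = Dataset.create(
        dataset_name=name,
        dataset_project=project,
        parent_datasets=[parent_id],  # the version we inherit from
    )
    child.add_files(path=changed_path)  # add the changes
    child.upload()                      # push the new/changed files
    child.finalize()                    # "close" the version
    return child.id
```

Calling it would look like create_child_version("PARENT_DATASET_ID_HERE", "./new_data", "my_dataset", "my_project"), where all four values are placeholders.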
Thank you MuddyCrab47 !
Regarding model versioning:
All models are logged automatically by trains (no need to specify it, as long as you are using one of the automagically connected frameworks: PyTorch/keras/TF/SKlearn)
You can see how it looks on the demoapp:
https://demoapp.trains.allegro.ai/projects/5371015f43f043b1b4ad7203c1ff4a95/models
Regarding Dataset management, we have a simple workflow demonstrated below, basically using artifacts as dataset storage, with very easy int...
Okay the type is inferred from the default value of the function step itself, that means that both:
data_frame = step_one(pickle_url, extra=1337)
and
data_frame = step_one(pickle_url, 1337)
will pass extra as int.
That said if the default value of the argument is missing, it will revert to str
In order to use the type hints as casting hint, we actually need to improve the task.connect to support the type casting (they are stored)
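To make the "type comes from the default value" rule concrete, here is a small pure-Python sketch. It is not the actual ClearML implementation, just an illustration of the behaviour described above; cast_like_default is a made-up helper name.

```python
import inspect


def cast_like_default(func, name, raw_value):
    """Cast a string value to the type of the argument's default value.

    Illustrative only: falls back to str when there is no default.
    """
    default = inspect.signature(func).parameters[name].default
    if default is inspect.Parameter.empty or default is None:
        return str(raw_value)  # nothing to infer from, stays a string
    if isinstance(default, bool):
        # bool("False") would be True, so parse booleans explicitly
        return str(raw_value).strip().lower() in ("1", "true", "yes")
    return type(default)(raw_value)


def step_one(pickle_url, extra=1337):
    return pickle_url, extra


casted = cast_like_default(step_one, "extra", "42")         # default is an int
uncasted = cast_like_default(step_one, "pickle_url", "42")  # no default
```

Here extra ends up as an int because its default is 1337, while pickle_url has no default and stays a str.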
RoughTiger69 the easiest thing would be to use the override option of Hydra: parameter_override={'Args/overrides': '[the_hydra_key={}]'.format(a_new_value)}
wdyt?
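As a sketch, the override dict from the message above can be built with a tiny helper. The name hydra_override is hypothetical, and the 'Args/overrides' key and the bracketed format are taken as-is from the suggestion, so treat them as assumptions about your setup.

```python
def hydra_override(hydra_key, new_value):
    """Build a parameter_override dict with a single Hydra override."""
    return {"Args/overrides": "[{}={}]".format(hydra_key, new_value)}


# The resulting dict is what would be passed as parameter_override=...
overrides = hydra_override("the_hydra_key", 0.01)
```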
GrotesqueOctopus42
The problem is that when I import some function from a file in another folder, that task doesn't catch the file's dependencies.
Just to be clear, if this is another file, you have to have all the files in the same git repo for the agent to actually be able to fetch them on the remote machine.
If you have a mix of notebooks and code, you have to have the local code in a git repo.
Make sense ?
So General would have created a General instead of Args?
yes,
This is a must, you have to specify the hyperparameters section you are referencing.
https://github.com/allegroai/clearml/blob/5a9155b2039413280f13dfded1121470c4c4323d/examples/pipeline/step2_data_processing.py#L21
This is actually: task.connect(args, name='General')
Basically there is no "random_state" only "General/random_state"
Make sense ?
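To illustrate why the section prefix is needed, here is a pure-Python sketch of how connected parameters end up named. connect_section is a made-up stand-in for what task.connect(args, name='General') does to the keys, not the real implementation.

```python
def connect_section(params, name="General"):
    """Return the parameters keyed as '<section>/<name>'."""
    return {"{}/{}".format(name, key): value for key, value in params.items()}


args = {"random_state": 42, "test_size": 0.2}
stored = connect_section(args, name="General")

# An override has to use the full, section-prefixed name.
parameter_override = {"General/random_state": 7}
matches = [key for key in parameter_override if key in stored]
```

An override keyed on plain "random_state" would not match anything here, only "General/random_state" does.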
Could you send the logs?
Assuming you are using docker-compose, the console output is a good start
Hi ColossalAnt7
Try ctrl-F5 and refresh the page?!
It seems you are missing a few buttons
TenseOstrich47 / PleasantGiraffe85
The next version (I think releasing today) will already contain scheduling, and the next one (probably RC right after) will include triggering. That said currently the UI wizard for both (i.e. creating the triggers), is only available in the community hosted service. That said I think that creating it from code (triggers/schedule) actually makes a lot of sense,
pipeline presented in a clear UI,
This is actually actively worked on, I think Anxious...
Woot woot
Hi @<1689446563463565312:profile|SmallTurkey79>
This call sets the requirements of an existing (already created) Task. Since it was just created, it waits for the automatic package detection before overriding it.
What you want is "Task.force_requirements_env_freeze" (notice it is class level, and needs to be called before Task.init)
Task.force_requirements_env_freeze(requirements_file="requirements.txt")
task = Task.init(...)
Hi @<1547028031053238272:profile|MassiveGoldfish6>
hmm yeah you need to remove the "hidden" system_tag from the project
from clearml.backend_api.session.client import APIClient
c = APIClient()
print(c.projects.get_by_id("PROJECT_ID_HERE").to_dict())
c.projects.update(project="PROJECT_ID_HERE", system_tags=["test"])
print(c.projects.get_by_id("PROJECT_ID_HERE").to_dict())
Notice you can get the project ID from the URL
`/projects/1974af8ccdac454b836c47349c4e826e/experiments/84...
I've seen parameters connect and task create in seconds and other times it takes 4 minutes.
This might be your backend (clearml-server) replying slowly because of load?
Is there a way (at the class level) to control the retry logic on connecting to the API server?
The difference in the two screenshots is literally only the URLs in clearml.conf and it went from 30s down to 2-3s.
Yes that could be network, also notice that there is aut...
yup! that's what I was wondering if you'd help me find a way to change the timings of. Is there an option I can override to make the retry more aggressive?
you mean wait for less?
add to your clearml.conf:
api.http.retries.backoff_factor = 0.1
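To get a feel for what a smaller backoff_factor does, here is a rough pure-Python sketch. It assumes the setting feeds a urllib3-style exponential backoff (no sleep before the first retry, then factor * 2^(n-1), capped at a maximum); the function name, the 1.0 comparison value and the 120s cap are illustration values, not confirmed defaults.

```python
def backoff_schedule(backoff_factor, retries, backoff_max=120.0):
    """Sleep time (seconds) before each retry, urllib3-style."""
    delays = []
    for consecutive_errors in range(1, retries + 1):
        if consecutive_errors <= 1:
            delays.append(0.0)  # the first retry is immediate
        else:
            delay = backoff_factor * (2 ** (consecutive_errors - 1))
            delays.append(min(backoff_max, delay))
    return delays


slow = backoff_schedule(1.0, 5)  # hypothetical larger factor
fast = backoff_schedule(0.1, 5)  # the value suggested above
```

Under these assumptions, five retries wait 30 seconds in total with a factor of 1.0 and about 3 seconds with 0.1.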
Seems like everything is in order. Can you curl to the API/web/files server?
or do you mean the machine I ran the experiment locally?
Yes this one
Oh I get it now, can you test: git ls-remote --get-url github
and then: git ls-remote --get-url
And the agent section on this machine is:
api_server:
web_server:
files_server:
Is that correct?