
Hi MelancholyElk85
I think you are right, OutputModel is missing a remove method.
Maybe we should have a class method on Model, something like:

@classmethod
def remove(cls, model: Union[str, Model], delete_weights_file: bool, force: bool):
    # actually remove the model and the weights file
wdyt?
Okay found it, ElegantCoyote26 the step name is changed but the Task name remains the same ...
I'll make sure we fix it on the next version
In the side bar you get the titles of the graphs, then when you click on them you can see the different series on the graphs themselves
I would clone the first experiment, then in the cloned experiment, I would change the initial weights (assuming there is a parameter storing that) to point to the latest checkpoint, i.e. provide the full path/link. Then enqueue it for execution. The downside is that the iteration counter will start from 0 and not the previous run.
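A minimal sketch of that flow, assuming the experiment exposes its initial weights as a hyperparameter (the parameter name, checkpoint URL, and queue name below are hypothetical):

from clearml import Task

# clone the source experiment (creates a draft copy)
source = Task.get_task(task_id="<source-experiment-id>")
cloned = Task.clone(source_task=source, name=source.name + " (resume)")

# point the clone at the latest checkpoint; "Args/initial_weights" is a
# hypothetical parameter name - use whatever your code actually reads
cloned.set_parameter("Args/initial_weights", "s3://bucket/path/last_checkpoint.pt")

# send it off for execution by an agent
Task.enqueue(cloned, queue_name="default")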
JitteryCoyote63
So there will be no concurrent cached files access in the cache dir?
No concurrent creation of the same entry. It is optimized...
Hmmm why don't you use "series" ?
(Notice that with iterations there is a limit to the number of images stored per title/series, which is configurable in trains.conf, in order to avoid debug sample explosion)
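For reference, a hedged sketch of the relevant trains.conf entry (the key name is taken from the default SDK configuration, so verify it against your version):

sdk {
  metrics {
    # number of debug files (e.g. images) kept per title/series combination
    file_history_size: 100
  }
}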
using this, is it possible to add to the requirements of a task with task_overrides?
Correct, but you will be replacing (not adding) requirements
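A hedged sketch of what that looks like with PipelineController.add_step; the override path and the package pins are assumptions, and note that the value replaces the whole pip requirements section:

pipe.add_step(
    name="train",
    base_task_project="examples",
    base_task_name="train model",
    task_overrides={
        # replaces (does not extend) the pip requirements of the cloned step Task
        "script.requirements.pip": "torch==1.13.1\nclearml\n",
    },
)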
EmbarrassedSpider34 I can update you that an RC with a fix should be out later today
BTW copying the cmd line assumes that you are running it on the same machine...
sets up the venv correctly, prints
Starting Task Execution:
then does nothing
Can you provide a log?
Do you see the code/git reference in the Pipeline Task details - Execution Tab ?
Okay let me check if we can reproduce, definitely not the way it is supposed to work
or by trains
We just upload the image as is ... I think this is a SummaryWriter issue
clearml-agent daemon --detached --queue manual_jobs automated_jobs --docker --gpus 0
If the user running this command can run "docker run", then you should be fine
So I see this in the build, which means it works and compiles, what is missing?
Building wheels for collected packages: leap
  Building wheel for leap (setup.py) ... done
  Created wheel for leap: filename=leap-0.4.1-cp38-cp38-linux_x86_64.whl size=1052746 sha256=1dcffa8da97522b2611f7b3e18ef4847f8938610180132a75fd9369f7cbcf0b6
  Stored in directory: /root/.cache/pip/wheels/b4/0c/2c/37102da47f10c22620075914c8bb4a9a2b1f858263021...
Thanks RipeGoose2 !
clearml logging starts from n+n (that's how it seems) for non-explicit
I have to say it looks like the expected behavior, I think.
Basically matching the TB, no?
EnviousStarfish54 a fix is already available in the latest RC
Could you verify it solves your issue as well?
pip install trains==0.16.2rc0
Hi QuizzicalDove0
I guess the reason is that the idea is that integration is literally 2 lines, and it will take less time to execute the code on a system with a working env (we assume there is one) than to configure all the git, python packages, arguments, etc...
All that said, you can create an experiment from code, using Task.import_task https://allegro.ai/docs/task.html#trains.task.Task.import_task
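A minimal sketch of that, assuming an exported task definition is available (the export step and the ids here are illustrative):

from trains import Task

# export an existing experiment to a plain dict...
task_data = Task.get_task(task_id="<source-id>").export_task()
# ...then create a new draft experiment from it
new_task = Task.import_task(task_data)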
GrievingTurkey78 can you send the entire log?
Hi @<1716987933514272768:profile|SuccessfulPuppy43>
How to make remote ClearML agent do
pip install -e .
In theory there is no need to do that, clearml-agent adds the repo root folder to the python path.
If you insist on actually installing it, try adding a "requirements.txt"-compatible line to your "installed packages" section:
-e .
Hi ObedientDolphin41
I keep bumping against the "ModuleNotFoundError: No module named" exception.
Import the package inside the component function (the one you decorated), it will make sure it lists it in the requirements section automatically.
You can also set it manually by passing it as the "packages" argument of the decorator function:
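A minimal sketch of both options, assuming the pipeline decorators (package names and versions are illustrative):

from clearml import PipelineDecorator

@PipelineDecorator.component(packages=["pandas==1.5.3", "scikit-learn"])
def preprocess(data_path: str):
    # importing inside the component lets it be auto-detected as well
    import pandas as pd
    return pd.read_csv(data_path)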
RoughTiger69 what's the clearml version you are using?
btw: you are running it locally, then enqueuing and running it remotely via the agent?
clearml-task seems to not allow passing the run argument without a value
EnviousStarfish54 did you try --args run=True
I'm assuming run is a boolean of a sort?
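For instance, a hypothetical invocation (project and script names are illustrative):

clearml-task --project examples --name demo --script train.py --args run=True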
a. The submitted job would automatically download data from an internal data repository, but it will be time consuming if the data is re-downloaded every time. Does ClearML cache the data somewhere?
What do you mean by "the agent will download the data"? Are you referring to Dataset?
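If it is a ClearML Dataset, a minimal sketch of the cached access pattern (project and dataset names are illustrative):

from clearml import Dataset

ds = Dataset.get(dataset_project="examples", dataset_name="my_dataset")
# downloads once, then serves the files from the local ClearML cache on later calls
local_path = ds.get_local_copy()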
OddShrimp85 you can see the full configuration at the top of the Task log. What do you have there? Also what is the clearml python version?
tasks.add_or_update_artifacts/v2.10 (Invalid task status: expected=created, status=completed)>
Hi UpsetCrow72
How come you are trying to sync a "completed" (finalized) dataset?
The api server by default spins up multiple processes (they might all be busy at the same time with a huge flood of requests, but it is still multi-process). Let me check if there is an easy way to set more processes
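For reference, a hedged sketch of how this is commonly tuned on a self-hosted server; these environment variable names are an assumption on my part, so verify them against your clearml-server version:

# hypothetical apiserver container environment (docker run / compose)
CLEARML_USE_GUNICORN=1        # run the apiserver under gunicorn
CLEARML_GUNICORN_WORKERS=8    # number of worker processes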
Just curious about the timeout, was it configured by ClearML or GCS? Can we customize the timeout?
I'm assuming this is GCS, at the end the actual upload is done by the GCS python package.
Maybe there is an env variable ... Let me google it
Hi @<1600661423610925056:profile|StrongMouse81>
using the serving base url and also another model endpoint we added using:
clearml-serving model add
we get the attached response:
And other model endpoints are working for you?
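For context, a typical clearml-serving model add invocation looks roughly like this (the engine, endpoint, and project values are illustrative; check clearml-serving --help for the exact flags in your version):

clearml-serving --id <service-id> model add --engine sklearn --endpoint "test_model_sklearn" --name "train sklearn model" --project "serving examples"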
Too late for what?
To update the task.requirements before it actually creates it (the requirements are created in a background thread)
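A minimal sketch of the ordering that works, assuming Task.add_requirements (the package pin is illustrative):

from clearml import Task

# must be called BEFORE Task.init, since the requirements are collected
# by a background thread during init
Task.add_requirements("scikit-learn", "1.2.2")
task = Task.init(project_name="examples", task_name="demo")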