I think that what happened was you are running it on the host machine (not inside the docker)
I probably missed a "
somewhere
model_path/run_2022_07_20T22_11_15.209_0.zip , err: [Errno 28] No space left on device
Where was it running?
I take it that these files are also brought into the pipeline tasks' local disk?
Unless you changed the object, then no, they should not be downloaded (the "link" is passed)
Hmm can you try: --args overrides="['log.clearml=True','train.epochs=200','clearml.save=True']"
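Since a single missing quote breaks this argument, one option is to build the string in code instead of typing it by hand. A minimal sketch, assuming the overrides are plain `key=value` strings; `build_overrides_arg` is a hypothetical helper, not part of ClearML or Hydra:

```python
import ast


def build_overrides_arg(overrides):
    """Return the overrides list formatted as a single CLI argument value."""
    # Quote every item with single quotes and join without spaces, so the
    # whole list can be wrapped in one pair of double quotes on the shell.
    quoted = ",".join("'{}'".format(item) for item in overrides)
    return "overrides=[{}]".format(quoted)


arg = build_overrides_arg(["log.clearml=True", "train.epochs=200", "clearml.save=True"])
print(arg)

# Sanity check: the list part parses back to the original items.
parsed = ast.literal_eval(arg.split("=", 1)[1])
print(parsed)
```

The printed value is what goes inside the double quotes after `--args`.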
If you do not have a lot of workers, then I would guess console outputs
Hi MysteriousWalrus11
in the pipeline quickly between pipeline.add_step() functions?
You mean you want to get access to the parent Task ids and query them directly ?
I think the easiest way is to pass it as one of the parameters
(you can get to the pipeline Task itself from the running component, then get the dag, but these are internal functions, maybe we should make them external for easier querying ?)
pipe.add_step(
    name="stage_process",
    ...
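To make "pass it as one of the parameters" concrete, here is a rough sketch that only builds the keyword arguments, so it runs without a server. Assumptions: the `"${<step_name>.id}"` placeholder is the step-reference syntax as I recall it from the `add_step` docs (please double check), while `stage_data`, `General/parent_task_id` and `build_step_kwargs` are made-up names for illustration:

```python
def build_step_kwargs(name, parent):
    """Build keyword arguments for a step that receives its parent's Task ID."""
    return {
        "name": name,
        "parents": [parent],
        # "${<step_name>.id}" is assumed to resolve to the parent step's Task ID;
        # "General/parent_task_id" is a hypothetical parameter name.
        "parameter_override": {"General/parent_task_id": "${%s.id}" % parent},
    }


kwargs = build_step_kwargs("stage_process", "stage_data")
print(kwargs["parameter_override"])
```

With a real controller these would be passed to `pipe.add_step(...)` together with the base task arguments, and the step would read the ID back from its own parameters.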
I don't know whether you have access to the backend,
Creepy , no I do not
I can't make anything appear in the console part of the ui
clearml_task.logger.report_text("some text")
should work
RoughTiger69 I think this could work, a pseudo example:
` from time import sleep
from clearml import Task
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(...)
def the_last_step_before_external_stuff():
    print("doing some stuff")

@PipelineDecorator.pipeline()
def logic():
    the_last_step_before_external_stuff()
    # check_if_data_was_ingested_to_the_system is a placeholder for your own check
    if not check_if_data_was_ingested_to_the_system:
        print("aborting ourselves")
        Task.current_task().abort()
        # we will not get here, the agent will make sure we are stopped
        sleep(60)
        # better safe than sorry
        exit(0) `
wdyt? (the...
Out of curiosity, if Task flush worked, when did you get the error, at the end of the process ?
MassiveHippopotamus56
the "iteration" entry is actually the "max reported iteration over all graphs"; each graph has its own max iteration. Make sense ?
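A tiny illustration of the difference, with made-up graph names and iteration numbers (this is not how the server stores them, just the arithmetic):

```python
# Hypothetical reported iterations per graph (scalar title).
reported = {
    "train loss": [0, 100, 200, 300],
    "validation loss": [0, 100],
    "learning rate": [0, 50, 100, 150, 200],
}

# Each graph has its own last (max) iteration...
max_per_graph = {title: max(iters) for title, iters in reported.items()}

# ...while the task-level "iteration" entry is the max over all graphs.
task_iteration = max(max_per_graph.values())

print(max_per_graph)
print(task_iteration)
```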
Ohh I see, could you copy paste what you put there (instead of the secret and key *** will do)
Hi CheekyElephant36
First you need to run it once on your machine. Once this is done (only a few steps is enough), you can clone it and enqueue it. Then to actually connect the AWS autoscaler (the part that spins machines and runs tasks), go to applications and select the AWS autoscaler.
Btw i think the next video will be about YOLO + autoscaler
although ideally i'd like to tell it exactly where to unzip it.
Ohh you can use .get_mutable_local_copy()
It will unzip it to a specific folder
Hi AbruptWorm50
the second "epoch loss" is the scalar for the "validation" process (see "validation: epoch loss"; the series prefix is actually the TF file/folder prefix, added automatically)
Make sense ?
DepressedChimpanzee34 <character> will almost always be converted into \ because otherwise it will not support \t or \n etc.
What I'm looking here is some logic that will allow us not to break backwards compatibility on the one hand, but still will allow you to have something like "first\second" entry.
WDYT? any ideas? (I really want to make sure we fix it as soon as possible)
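One possible direction, just as a sketch and not what ClearML does today: interpret only a known set of escape sequences and leave any other backslash untouched. `unescape_known` and the `KNOWN_ESCAPES` table are hypothetical names:

```python
import re

# Escape sequences we choose to interpret; anything else is left untouched.
KNOWN_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\"}


def unescape_known(text):
    """Interpret only known escapes, keeping unknown ones verbatim."""
    def replace(match):
        char = match.group(1)
        # Unknown escape: return the original two characters unchanged.
        return KNOWN_ESCAPES.get(char, match.group(0))

    return re.sub(r"\\(.)", replace, text)


print(repr(unescape_known(r"first\second")))
print(repr(unescape_known(r"col1\tcol2")))
```

This keeps \t and \n working and lets "first\second" through, but an entry like "first\next" would still be read as a newline, so it narrows the ambiguity rather than removing it.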
Is there a way I could move the JWT authentication (not authorization) logic into an API Gateway or Load Balancer?
Hmm in theory, but not in practice
if ClearML is following OAuth 2.0, t
This is for the SSO part, not for the API. The API only uses JWT for verification; the login process itself is done with external SSO (OAuth 2.0). But the open-source version does not support SSO
Why are you trying to add another ELB with JWT verification on it ? ...
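For reference, this is roughly what "JWT verification" at a gateway would boil down to. It is a generic sketch that assumes HS256 with a shared secret; the thread does not say which algorithm or key handling the ClearML API server actually uses, and `sign_token` / `verify_token` are hypothetical helpers:

```python
import base64
import hashlib
import hmac
import json


def _b64url(data):
    """Base64url-encode bytes without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_token(payload, secret):
    """Create an HS256-signed JWT for the given payload dict."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = "{}.{}".format(
        _b64url(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        _b64url(json.dumps(payload, separators=(",", ":")).encode("utf-8")),
    )
    signature = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return "{}.{}".format(signing_input, _b64url(signature))


def verify_token(token, secret):
    """Return True if the token's HS256 signature matches the secret."""
    try:
        signing_input, signature = token.rsplit(".", 1)
    except ValueError:
        return False
    expected = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    # Constant-time comparison to avoid leaking timing information.
    return hmac.compare_digest(_b64url(expected).encode("ascii"), signature.encode("utf-8"))


secret = b"demo-secret"  # hypothetical shared secret
token = sign_token({"sub": "user-1"}, secret)
print(verify_token(token, secret))
print(verify_token(token, b"other-secret"))
```

Note this only checks the signature; a real verifier would also validate the header's algorithm and claims such as expiry.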
Hi ReassuredTiger98
Could you send the logs of both runs ?
(I'm not sure this is a bug, or some misconfiguration , but the scenario should have worked...)
why are all defined components shown in the UI Results/Plots/PipelineDetails/ExecutionDetails section? Shouldn't it make more sense to show only the ones that are used in that pipeline?
They are listed there (because of the decorator, you basically "say" these are steps so they are listed), the actual resolving (i.e. which steps are actually being called) is done in "real-time"
Make sense ?
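A toy analogy in plain Python (not the actual PipelineDecorator implementation): the decorator registers every decorated function when the module is loaded, while which ones really run is only known once the logic executes. All names here are made up:

```python
import functools

# Filled at decoration time, i.e. when the module is imported.
REGISTERED_STEPS = []
# Filled only when a step is actually invoked by the pipeline logic.
EXECUTED_STEPS = []


def component(func):
    """Toy decorator: list the function as a step, record real calls separately."""
    REGISTERED_STEPS.append(func.__name__)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        EXECUTED_STEPS.append(func.__name__)
        return func(*args, **kwargs)

    return wrapper


@component
def step_a():
    return "a"


@component
def step_b():
    return "b"


def pipeline(use_b=False):
    """Which steps run is decided while the logic executes."""
    step_a()
    if use_b:
        step_b()


pipeline(use_b=False)
print(REGISTERED_STEPS)
print(EXECUTED_STEPS)
```

Both steps end up in the registered list, but only `step_a` shows up as executed.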
You can switch to docker-mode for better control over cuda drivers, or use conda and specify cudatoolkit (this feature will be part of the next RC, meanwhile it will install the cudatoolkit based on the global cuda_version).
Hi FancyBaldeagle86
You mean in the UI? i.e. clone an experiment, hover over the Configuration / Hyperparameter section and click edit ?
Hi CluelessElephant89
hey guys, I believe clearml-agent-services isn't necessary right?
Generally speaking, yes you are correct
Specifically, this is the "services" queue agent, running your pipeline logic, services etc.
But it is not a must to get the server to work, and you can also spin it on a different host
It takes 20mins to build the venv environment needed by the clearml-agent
You are Joking?!
it does apt-get install python3-pip , and pip install clearml-agent, how is that 20min?
So the issue is that you have two reference branches on the local git, one to gitlab and one to gitea, and it fails to understand which one is the correct remote ...
I wonder if "git ls-remote --get-url" will always work ?!
if project_name is None and Task.current_task() is not None: project_name = Task.current_task().get_project_name()
This should have fixed it, no?
5 seconds will be a sleep between two consecutive pulls where there are no jobs to process, why would you increase it to a higher pull freq ?
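To show what that interval does (a simplified sketch, not the clearml-agent code): jobs are pulled back to back, and the sleep only happens after a pull that found nothing. `run_worker`, `max_idle_polls` and the injected `sleep` are hypothetical and exist so the demo terminates and runs instantly:

```python
import time
from collections import deque


def run_worker(queue, handle, poll_interval_sec=5.0, max_idle_polls=3, sleep=time.sleep):
    """Pull jobs back to back; sleep only after a pull that found nothing."""
    idle_polls = 0
    while idle_polls < max_idle_polls:
        if queue:
            handle(queue.popleft())
            idle_polls = 0  # a job was found, so pull again immediately
        else:
            idle_polls += 1
            sleep(poll_interval_sec)


done, naps = [], []
# Record the requested sleeps instead of really sleeping.
run_worker(deque(["job-1", "job-2"]), done.append, sleep=naps.append)
print(done)
print(naps)
```

Both jobs are handled without any wait in between; the 5 second sleeps only appear once the queue is empty.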
I saw documentation, but I can't make the proper dict object for hyperparams
I see, this is what you are after (I think)
https://github.com/allegroai/clearml/blob/fb644fe9ec6be36b8f2f70a34256fbdc593d663a/clearml/backend_api/services/v2_20/tasks.py#L3138
FYI: These days TB became the standard even for pytorch (being a stand alone package), you can actually import it from torch.
There is an example here:
https://github.com/allegroai/trains/blob/master/examples/frameworks/pytorch/pytorch_tensorboard.py
HealthyStarfish45 did you manage to solve the report_image issue ?
BTW: you also have
https://github.com/allegroai/trains/blob/master/examples/reporting/html_reporting.py
https://github.com/allegroai/trains/blob/master/examples/reporting/...