NastyFox63, ask SuccessfulKoala55 tomorrow. I think there is a way to change the default settings even with the current version.
(I.e. increase the default 100 entries limit)
Thanks SmallDeer34, I think you are correct: the 'output' model is returned properly, but the 'input' models are returned as model names, not model objects.
Let me check something
I specifically set it as empty with
export_data['script']['requirements'] = {}
in order to avoid overhead during launch. I have everything installed inside the container.
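For context, a minimal sketch of how that could look (the task lookup and the export/update calls are my assumptions about the surrounding code):

from clearml import Task

# hypothetical: the task whose recorded requirements we want to blank out
task = Task.get_task(task_id="<task-id>")

# export the task definition, clear the detected requirements,
# and write the definition back so the agent skips the pip install step
export_data = task.export_task()
export_data['script']['requirements'] = {}
task.update_task(export_data)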
Do you have everything inside the container, inside a venv?
Hi RoughTiger69
but still get the semantics of knowing when an (external) file changed?
How would you know it changed?
This implies you have a way to verify the hash, which means you download the data, no?
Try removing this magic environment variable that tells the sub-process there was already an initialized Task:
import os
env = dict(**os.environ)
env.pop('TRAINS_PROC_MASTER_ID', None)
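For example, a rough sketch of passing that cleaned environment to the sub-process (the child command is just a placeholder):

import os
import subprocess

# copy the current environment and drop the marker that tells the
# sub-process a Task was already initialized in the parent
env = dict(**os.environ)
env.pop('TRAINS_PROC_MASTER_ID', None)

# placeholder command - launch the child script with the cleaned environment
subprocess.Popen(["python", "child_script.py"], env=env)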
What's the general pattern for running a pipeline - train model, evaluate metrics and publish the model if satisfactory (based on a threshold, for example)?
Basically I would do:
parameters for pipeline:
TaskA = Training model Task (think of it as our template Task)
Metric = title/series/sign we want to choose based on, where sign is max/min
Project = Project to compare the performance so that we could decide to publish based on the best Metric.
Pipeline:
Clone TaskA Change TaskA argu...
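As a rough sketch of that flow (the template task id, queue, metric title/series and threshold are all placeholders, and the comparison logic is up to you):

from clearml import Task

# placeholders: template training task, execution queue, and the metric we select on
TEMPLATE_TASK_ID = "<task-id-of-TaskA>"
QUEUE_NAME = "default"
METRIC_TITLE, METRIC_SERIES, SIGN = "validation", "accuracy", "max"

# clone the template training task, override its arguments, and enqueue it
cloned = Task.clone(source_task=TEMPLATE_TASK_ID, name="pipeline training step")
cloned.set_parameters({"Args/epochs": 20})  # example argument override
Task.enqueue(cloned, queue_name=QUEUE_NAME)
cloned.wait_for_status()  # block until the training run finishes
cloned.reload()           # refresh the local copy with the reported metrics

# read back the metric we care about
scalars = cloned.get_last_scalar_metrics()
value = scalars[METRIC_TITLE][METRIC_SERIES]["max" if SIGN == "max" else "min"]

# compare against a threshold (or the best value in the Project) and publish the
# output model if this run wins - placeholder comparison below
if value > 0.9:
    for model in cloned.models["output"]:
        model.publish()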
FiercePenguin76
So running Task.init from jupyter-lab works, but running Task.init from the VSCode notebook does not?
Hi JitteryCoyote63, when you run the trains-agent it tells you where it puts the logs; it's a temp auto-generated filename, usually under /tmp/:
Running TRAINS-AGENT daemon in background mode, writing stdout/stderr to /tmp/.trains_agent_daemon_out4uahki3i.txt
Hi DeliciousBluewhale87
When you say "workflow orchestration", do you mean like a pipeline automation ?
RipeGoose2 models are automatically registered
i.e. added to the models artifactory, but it only points to where the files are stored
Only if you are passing the output_uri argument to Task.init will they actually be uploaded.
If you want to disable this behavior you can pass Task.init(..., auto_connect_frameworks={'pytorch': False})
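For example, a minimal sketch (the project/task names and bucket are placeholders):

from clearml import Task

# upload registered model snapshots to external storage (placeholder bucket)
task = Task.init(
    project_name="examples",
    task_name="training",
    output_uri="s3://my-bucket/models",
)

# or, to disable automatic PyTorch model registration altogether:
# task = Task.init(project_name="examples", task_name="training",
#                  auto_connect_frameworks={"pytorch": False})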
... grab the model artifacts for each, put them into the parent HPO model as its artifacts, and then go through and archive everything.
Nice. Wouldn't it make more sense to "store" a link to the "winning" experiment, so you know how to reproduce it and the set of HP that were chosen?
Not that the model is bad, but how would I know how to reproduce it, or retrain it when I have more data, etc.?
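A rough sketch of what "storing a link" could look like on the parent HPO task (the task ids, artifact name and fields are assumptions):

from clearml import Task

# hypothetical: the HPO controller task and the winning child task
parent = Task.get_task(task_id="<hpo-task-id>")
best = Task.get_task(task_id="<winning-task-id>")

# keep a pointer to the winning experiment plus the hyper-parameters it used,
# so it can be reproduced or retrained later
parent.upload_artifact(name="best_experiment", artifact_object={
    "task_id": best.id,
    "web_link": best.get_output_log_web_page(),
    "hyper_parameters": best.get_parameters_as_dict(),
})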
Hi MotionlessSeagull22
Hmm, I'm not sure this is possible in the UI.
You can compare multiple experiments and view the images as thumbnails one next to the other, but full view will be a single image...
You can however right-click on the image and get a direct link, then open a new tab ... :(
I'm assuming these are the only packages that are imported directly (i.e. pandas requires other packages, but the code imports pandas, so that is what's listed).
The way ClearML detects packages: it first tries to understand if this is a "standalone" script; if it is, then only imports in the main script are logged. If it "thinks" this is not a standalone script, it will analyze the entire repository.
make sense ?
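If the auto-detected list is ever missing something you need, a small sketch of forcing a package in (called before Task.init; the package name is just an example):

from clearml import Task

# example: make sure a package that is only imported indirectly is still listed
Task.add_requirements("scikit-learn")
task = Task.init(project_name="examples", task_name="training")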
JitteryCoyote63
Should be added before the
if __name__ == "__main__":
?
Yes, it should.
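Assuming the "it" discussed here is the Task.init() call, a minimal sketch of that placement (project/task names are placeholders):

from clearml import Task

# initialize the task at module level, before the main guard
task = Task.init(project_name="examples", task_name="training")


def main():
    ...  # actual training code goes here


if __name__ == "__main__":
    main()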
From your code I understand it is not?
What's the clearml version you are using?
Hi GentleSwallow91
I think this would be a good start:
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py
wdyt?
Are you getting the error from boto failing to launch additional ec2 instances ?
(torchvision vs. cuda compatibility, will work on that),
The agent will pull the correct torch based on the cuda version that is available at runtime (or configured via the clearml.conf)
I guess. Or pipelines that you can compose after running experiments, to see that experiments are connected to each other.
hmm what do you mean by "compose after running experiments"? Like a way to group them? What is the relation between one "item" and another?
If this is a sequence of Tasks, are they executed by a controller?
It seems stuck somewhere in the python path... Can you check at runtime what's in os.environ['PYTHONPATH'] ?
That would match what add_dataset_trigger and add_model_trigger already have, so it would be good.
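For reference, a rough sketch of how those existing triggers are wired up (the project names, queue and scheduled task ids are placeholders; check the TriggerScheduler docs for the exact arguments):

from clearml.automation import TriggerScheduler

trigger = TriggerScheduler(pooling_frequency_minutes=3)

# launch a predefined task whenever a new dataset version appears in a project (placeholders)
trigger.add_dataset_trigger(
    name="retrain-on-new-data",
    schedule_task_id="<task-to-launch>",
    schedule_queue="default",
    trigger_project="datasets/my-project",
)

# similarly for newly registered models
trigger.add_model_trigger(
    name="deploy-on-new-model",
    schedule_task_id="<deploy-task>",
    schedule_queue="default",
    trigger_project="my-project",
)

trigger.start()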
Sounds good, any chance you can open a github issue, so that we do not forget?
Another parameter for when the task is deleted might also be useful
That actually might be more complicated, because there might be a race condition, basically missing the delete operation...
What would be the use case?
basically ExuberantBat24 you can think of hyper-datasets as a "feature-store for unstructured data"
I have mounted my S3 bucket at the location /opt/clearml/data/fileserver/ but I can see my data is not being stored in S3; it's being stored in EBS instead. How so?
I'm assuming the mount was not successful
What you should see is a link to the files server inside clearml, and actual files in your S3 bucket
works seamlessly throughout and in our current on premise servers...
I'm assuming via something close to what I suggested above with .netrc ?