Done HandsomeCrow5 +1 added 🙂
btw: if you feel you can share what your reports look like (a screenshot is great), this will greatly help us support this feature, thanks
Different question. How can I pass the PYTHONPATH env variable to a task run by the agent (so Python can find classes inside my subdirectories)?
Hi HelpfulHare30
By default the working directory will be added to the python path. This means that if, under the Execution tab, I have Working Dir: "." and Script: "src/script.py"
The root git repo will be added to the python path.
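For illustration, a minimal sketch with a made-up repo layout and module name (not from the original thread):

# Hypothetical repo layout:
#   my_repo/
#   ├── mylib/
#   │   └── helpers.py
#   └── src/
#       └── script.py   <- Script: "src/script.py", Working Dir: "."
#
# src/script.py
from mylib.helpers import load_data  # resolves because the repo root (Working Dir ".") is on the python path

if __name__ == "__main__":
    load_data()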
BTW: in the next RC you could add a flag to the agent to always add the git repo root to the python path
And it is not working? What's the Working Dir you have under the Execution tab?
Hi DilapidatedDucks58 just making sure, is the link the pytorch nightly artifactory? Or is it a direct link to the package? Reason for asking, I was not aware they have a proper artifactory... When the task runs, the trains agent will update the installed packages with all the installed packages it used. Could you verify you have the correct version?
Regarding the extra files, you are correct, the docker container is reset every run, so they will get lost. What are those files for? Could you add ...
Hi VivaciousBadger56
Basically you can think of MLRun as an "Amazon Lambda service without Amazon". It is designed to run a "function" at scale on multiple nodes.
ClearML on the other hand is an MLOps platform. It does the experiment tracking, it orchestrates Tasks (think jobs), it does data management, and lastly we recently released the serving. These are two different use cases.
Am I making sense here?
That depends on the HPO algorithm. Basically they will be pushed based on the limit of "concurrent jobs", so you do not end up exploding the queue. It also might be a Bayesian process, i.e. based on the previous set of parameters and runs, like how hyper-band works (optuna/hpbandster)
Make sense ?
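For illustration, a rough sketch of where that concurrency limit lives when using ClearML's HyperParameterOptimizer (the task id, parameter range and metric names are placeholders):

from clearml.automation import HyperParameterOptimizer, UniformIntegerParameterRange
from clearml.automation.optuna import OptimizerOptuna

optimizer = HyperParameterOptimizer(
    base_task_id="<base-task-id>",  # placeholder
    hyper_parameters=[UniformIntegerParameterRange("General/epochs", min_value=5, max_value=50)],
    objective_metric_title="validation",  # placeholder metric title/series
    objective_metric_series="loss",
    objective_metric_sign="min",
    optimizer_class=OptimizerOptuna,
    max_number_of_concurrent_tasks=4,  # the "concurrent jobs" limit - at most 4 Tasks in the queue at any time
    execution_queue="default",
)
optimizer.start()
optimizer.wait()
optimizer.stop()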
I have mounted my S3 bucket at the location /opt/clearml/data/fileserver/ but I can see my data is not being stored in S3; it is being stored in EBS instead. How so?
I'm assuming the mount was not successful
What you should see is a link to the files server inside clearml, and actual files in your S3 bucket
Well, that depends on you. What did you write there to know it is the best one? The file name? Did you add some metric?
Hmm, what's the clearml version? What's the python version, what's the OS? And the pytorch version?
Hi @<1657918706052763648:profile|SillyRobin38>
In the preprocess.py files, we will have so many similar lines, which is not good.
Actually clearml-serving also supports directories, i.e. you can package an entire module as part of the preprocess, which would be easier for your code
Another option is to package your code in a python package and have that installed on the container (there is a special env var that allows you to add those to the serving container)
...
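For example, a rough sketch of how that could look (the file names and the normalize helper are hypothetical, and the Preprocess signature here is from memory, so check it against the clearml-serving examples):

# Hypothetical layout packaged with the endpoint:
#   preprocess/
#   ├── preprocess.py   <- entry point loaded by clearml-serving
#   └── common.py       <- shared helpers reused by all endpoints
#
# preprocess/preprocess.py
from common import normalize  # hypothetical shared helper sitting next to preprocess.py

class Preprocess(object):
    def preprocess(self, body, state, collect_custom_statistics_fn=None):
        # reuse the shared code instead of repeating the same lines in every endpoint
        return normalize(body["data"])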
Should work out of the box, maybe the only thing to notice is that you will get a Task for every local_rank 0 process
does that make sense ?
However, regarding your recommendation of using the StorageManager class to delete the URL, it seems that this class only contains methods for checking the existence of files, downloading files and uploading files, but no method for actually deleting files based on their URL (see the docs).
Yes you are correct 😞 you should use a "deeper" class:
helper = StorageHelper.get(remote_url)
helper.delete(remo...
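For completeness, a minimal sketch of that flow (the URL is a placeholder):

from clearml.storage.helper import StorageHelper

remote_url = "s3://my-bucket/old_artifact.pkl"  # placeholder URL
helper = StorageHelper.get(remote_url)
helper.delete(remote_url)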
As I installed ClearML using pip,
Where is clearml-serving running? Usually your configuration file is in ~/clearml.conf
Notice that if it is not there, it means it is using the defaults, so just create a new one and add that line
PlainSquid19 yes, the link is available only in the actual paid product 😞
I don't think they have the documentation open yet...
My recommendation is to fill the contact us form, you'll get a free online tour as well 😉
Multi-threaded multi-processes multi-nodes 🙂
Hi @<1704304350400090112:profile|UpsetOctopus60>
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_kubernetes_helm
Just use the helm charts. It's the easiest
AFAICS it's quite trivial implementation at the moment, and would otherwise require parsing the text file to find some references, right?
Yes, but the main issue is the parsing; it needs to follow a specific standard. We use HOCON because it is great to read and edit (basically JSON would be a subset of HOCON)
the original pyhocon does support include statements as you mentioned -
Correct, my thinking was to expand them into "@configuration_section.key" or something of that nature
In our case, we have a custom YAML instruction !include, i.e.
Hmm interesting, in theory this might work, since configuration encoding (when passing dicts) is handled with HOCON, which does support referencing.
That said currently it is not aware of "remote configurations" only ENV variables and local files.
It would be cool to add. Do we have a github issue on that? (would you like to see if you can PR such a thing?)
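For reference, a tiny sketch of what HOCON referencing looks like through pyhocon (the keys and values are made up):

from pyhocon import ConfigFactory

# HOCON lets one value reference another via ${...} substitution
conf = ConfigFactory.parse_string("""
base { data_path = "/mnt/data" }
train { dataset = ${base.data_path} }
""")
print(conf.get("train.dataset"))  # -> /mnt/data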
like what all are important metric monitoring queries w.r.t. the serving tasks that can be visualized and shown in grafana?
Basically latency and requests per minute are automatically reported. Additional reports are based on your RestAPI in/out.
Imagine the following REST API request JSON payload {x=123, y=456} and a return JSON of {z=789}.
The metrics you can add to the monitoring are the keys in both these JSONs, i.e. "x", "y", "z".
These metrics can be both log...
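For context, a minimal sketch of calling such an endpoint (the URL and endpoint name are placeholders):

import requests

url = "http://localhost:8080/serve/my_endpoint"  # placeholder serving URL and endpoint name
response = requests.post(url, json={"x": 123, "y": 456})
print(response.json())  # e.g. {"z": 789}; "x", "y" and "z" are the keys you could monitor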
from clearml import TaskTypes
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(
    name="my step", return_values=['data_frame'], cache=True, task_type=TaskTypes.data_processing)
def step_one(pickle_data_url: str, extra: int = 43):
    ...  # stuff here

This seemed to work for me
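If it helps, a minimal sketch (not from the original thread) of wiring that component into a pipeline; the pipeline name, project and URL are placeholders:

from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.pipeline(name="my pipeline", project="examples", version="0.1")
def my_pipeline(url: str = "https://example.com/data.pkl"):  # placeholder URL
    data_frame = step_one(url, extra=43)  # the component defined above
    return data_frame

if __name__ == "__main__":
    PipelineDecorator.run_locally()  # run the whole pipeline in the local process for debugging
    my_pipeline()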
MassiveHippopotamus56
the "iteration" entry is actually the "max reported iteration over all graphs" per graph there is different max iteration. Make sense ?
Hi JuicyDog96
The easiest way is:
from trains.backend_api.session.client import APIClient
client = APIClient()
client.projects.get_all()
You can just run it from a python console and check what you are getting.
Full API is https://github.com/allegroai/trains/tree/master/trains/backend_api/services/v2_8
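For example, a quick hedged sketch of listing the projects (the attribute names are assumptions based on the REST response fields):

from trains.backend_api.session.client import APIClient

client = APIClient()
for project in client.projects.get_all():
    print(project.id, project.name)  # fields mirror the REST response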
Hi SubstantialElk6
but in terms of data provenance, it's not clear how I can associate the data versions with the processes that created them.
I think DeliciousBluewhale87's approach is what we are aiming for, but with code.
So using clearml-data from the CLI is basically storing/versioning of files (with differentiable based storage etc., but still).
What you are after (I think) is, in your preprocessing code, using the programmatic Dataset class to create the Dataset from code, this a...
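A minimal sketch of that programmatic flow (dataset/project names and the local folder are placeholders):

from clearml import Dataset

# create a new dataset version from code, right after the step that produced the files
dataset = Dataset.create(dataset_name="processed-data", dataset_project="data-provenance")
dataset.add_files(path="./output/")  # placeholder local folder with the generated files
dataset.upload()     # upload the file contents to the storage target
dataset.finalize()   # close the version so downstream Tasks can consume it
print(dataset.id)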