
EnviousPanda91 so which frameworks are being missed? Is it a request to support a new framework, or are you saying there is a bug somewhere?
I see, so basically pull a fixed set of configurations for everyone from the server.
Currently only the scale/enterprise version supports such a feature 😞
Hmm, you can delete the artifact with:
```python
task._delete_artifacts(artifact_names=['my_artifact'])
```
However this will not delete the file itself.
To delete the file itself I would do:
```python
from clearml.storage.helper import StorageHelper  # import path assuming the clearml package

remote_file = task.artifacts['delete_me'].url
h = StorageHelper.get(remote_file)
h.delete(remote_file)
task._delete_artifacts(artifact_names=['delete_me'])
```
Maybe we should have a proper interface for that? wdyt? what's the actual use case?
Any other port that could be open? (if SSH is already open we cannot launch another daemon on the same port)
Seems like an okay clearml.conf file.
Notice this is the error: 404
Can you curl to this address? Are you sure you have httpS and not http? Was the DNS configured?
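If it helps, a quick reachability check from Python (a sketch: the host below is a placeholder, and I'm assuming a self-hosted server where the apiserver answers on /debug.ping, port 8008 by default):
```python
import requests

# placeholder address: replace with your actual api server URL
resp = requests.get("https://your-clearml-server:8008/debug.ping", timeout=5)
print(resp.status_code, resp.text)
```
If this returns 200 but the browser still shows 404, the DNS/proxy in front is the likely suspect.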
The wheel you download from pip, for example torch-1.11.0-cp38-cp38-manylinux1_x86_64.whl,
is actually both CPU and CUDA 11.7.
Hi @<1575656665519230976:profile|SkinnyBat30>
Streamlit apps are backend-run (i.e. the Python code drives the actual web app).
This means running your Task's code and exposing the web app (i.e. http) via streamlit.
This is fully supported with ClearML, but unfortunately only in the paid tiers 😞
You can however run your Task with an agent, make sure the agent's machine is accessible, report the full IP+URL as a hyper-parameter or property, and then use that to access your streamlit app.
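A minimal sketch of that idea (project/task names and the streamlit port are assumptions, not a built-in app-serving API):
```python
import socket
from clearml import Task

task = Task.init(project_name="apps", task_name="streamlit-demo")
# 8501 is streamlit's default port; adjust if you launch it differently
app_url = "http://{}:8501".format(socket.gethostbyname(socket.gethostname()))
# store the URL as a user property so it shows up on the task's page in the UI
task.set_user_properties(app_url=app_url)
```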
Hi HealthyStarfish45
- is there an advantage in using tensorboard over your reporting?
Not unless your code already uses TB or has some built in TB loggers.
html reporting looks powerful, can one inject some javascript inside?
As long as the JS is self-contained in the HTML script, anything goes :)
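For example, something along these lines (a sketch; project/task names are placeholders):
```python
from clearml import Task

task = Task.init(project_name="debug", task_name="html report")
html = """<html><body>
<div id="out"></div>
<script>
  // self-contained JS, no external resources
  document.getElementById("out").innerText = "hello from embedded JS";
</script>
</body></html>"""
# write the page locally, then report it as a media/debug sample
with open("report.html", "w") as f:
    f.write(html)
task.get_logger().report_media("html", "js demo", iteration=0, local_path="report.html")
```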
Hi @<1657918706052763648:profile|SillyRobin38>
I'm curious to know if it's possible to prevent uploading a duplicate endpoint.
...and we attempt to upload it again without any changes to the command content,
Basically you overwrite it, and yes, possible 🙂
any other aspect, could the system prevent the duplicate upload?
so basically check the hash and say, no need to upload?
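Something like this on the client side would do it (a sketch; previously_uploaded_hash is hypothetical and stands for whatever digest the server stored for the existing endpoint):
```python
import hashlib

def file_hash(path, chunk_size=1 << 20):
    # sha256 of the file, read in chunks so large files don't blow up memory
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

if file_hash("model_endpoint.py") == previously_uploaded_hash:
    print("identical content, skipping upload")
```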
Import Error sounds so out of place it should not be a problem :)
Hi JitteryCoyote63
The new pipeline is almost ready for release (0.16.2),
and it actually contains support for this exact scenario.
Check out the example, and let me know if it fits what you are looking for:
https://github.com/allegroai/trains/blob/master/examples/pipeline/pipeline_controller.py
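The gist of that example, if it saves you a click (a sketch following the linked pipeline_controller.py; project/task names are placeholders):
```python
from trains.automation.controller import PipelineController

# build a two-step pipeline from existing template Tasks
pipe = PipelineController(default_execution_queue="default", add_pipeline_tags=True)
pipe.add_step(name="stage_data",
              base_task_project="examples", base_task_name="data preprocessing")
pipe.add_step(name="stage_train", parents=["stage_data"],
              base_task_project="examples", base_task_name="model training")

pipe.start()
pipe.wait()
pipe.stop()
```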
JitteryCoyote63 correct, you could also use Task.create
that creates a Task but does not do any automagic.
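i.e. something like (a minimal sketch; project/task names are placeholders):
```python
from trains import Task

# registers a new Task on the server without running it and without automagic
task = Task.create(project_name="examples", task_name="created, not init-ed")
```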
I also saw the PR for set_parent, will be merged shortly 🙂 thanks!
Now I see, the scenario is similar to the HyperParameter scenario, see the TrainsJob: https://github.com/allegroai/trains/blob/master/trains/automation/job.py
I still don't see why you would change the type of the cloned Task, I'm assuming the original Task had the correct type, no?
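For reference, the clone-and-enqueue flow that TrainsJob wraps looks roughly like this (a sketch; the task id and queue name are placeholders):
```python
from trains import Task

# clone an existing template Task and send the copy to an execution queue
template = Task.get_task(task_id="<template_task_id>")
cloned = Task.clone(source_task=template, name="clone of template")
Task.enqueue(task=cloned, queue_name="default")
```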
Hmm, so currently you can provide help, so users know what they can choose from, but there is no way to limit it.
I know the Enterprise version has something similar that allows users to create a custom "application" from a Task; there you can define a drop-down and such, but that might be overkill here, wdyt?
create a new file, copy-paste these lines into it, run it inside vscode, and tell me what you get in the console:
```python
from clearml import Task

Task.add_requirements("tensorflow")
task = Task.init(project_name="debug", task_name="requirements")
print("done")
```
is there GPU support
That basically depends on your template yaml resources. You can have multiple templates, each one "connected" to a different glue pulling from a different queue. This way the user can enqueue a Task in a specific queue, say single_gpu,
and the glue listening on that queue creates, for each clearml Task, a k8s job with the single GPU as specified in the pod template yaml.
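On the user side that is just a regular enqueue (a sketch; the project/task names are placeholders, and "single_gpu" is the example queue from above):
```python
from clearml import Task

task = Task.get_task(project_name="examples", task_name="train model")
# the k8s glue watching "single_gpu" wraps this Task in a k8s job
# using the pod template yaml that requests the single GPU
Task.enqueue(task=task, queue_name="single_gpu")
```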
JitteryCoyote63
are the calls from the agents made asynchronously/in a non-blocking separate thread?
You mean whether request processing on the apiserver is multi-threaded / multi-processed?
Sure thing, any vanilla AMI will work, as long as it has python3 and docker preinstalled (and obviously, if you need GPU support, the drivers preinstalled as well)
Quick update: Nexus supports direct http upload, which means that, as CostlyOstrich36 mentioned, just pointing to the Nexus http upload endpoint would work:
```python
output_uri="http://<nexus>:<port>/repository/something/"
```
See docs:
https://support.sonatype.com/hc/en-us/articles/115006744008-How-can-I-programmatically-upload-files-into-Nexus-3-
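i.e. in code (a sketch; the placeholders are the same as in the URL above):
```python
from clearml import Task

# all artifact/model uploads for this task will go to the Nexus endpoint
task = Task.init(
    project_name="examples",
    task_name="nexus upload",
    output_uri="http://<nexus>:<port>/repository/something/",
)
```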
This is strange, let me see if we can get around it, because I'm sure it worked 🙂
Hi FlatOctopus65
You are almost there:
```python
prev_task: Task = Task.get_task(task_id="<prev_task_id_here>")
model = prev_task.models['output'][-1]
my_check_point = model.get_local_copy()
```
Hi @<1554275779167129600:profile|ProudCrocodile47>
Do you mean @ clearml.io ?
If so, then this is the same domain (.ml is sometimes flagged as spam, I'm assuming this is why they use it)
SubstantialElk6
The ~<package name with the first letter dropped> == a.b.c
is a known conda/pip temporary-install issue (some leftover from a previous package install).
The easiest way is to find the site-packages folder and delete the package, or create a new virtual environment
BTW:
pip freeze will also list these broken packages
No should be fine... Let me see if I can get a windows box 🙂
Yes! Thanks so much for the quick turnaround
My pleasure 🙂
BTW: did you see this (it seems like the same bug?!)
https://github.com/allegroai/clearml-helm-charts/blob/0871e7383130411694482468c228c987b0f47753/charts/clearml-agent/templates/agentk8sglue-configmap.yaml#L14
but it is not possible to write to a private channel in which the bot is added.
Is this a Slack limitation?
None of them is problematic, this is what I'm trying to say 🙂
I think the minio browser gets confused.
if you want to test the upload time on the client you can try:
```python
from time import time

task.flush(wait_for_uploads=True)
tic = time()
task.upload_artifact('test', '/tmp/localfile')
task.flush(wait_for_uploads=True)
print(time() - tic)
```
MysteriousBee56 okay, look for the folder ~/.trains/vcs_cache; you will find the git repo there. Just overwrite the content with your local copy.