Reputation
Badges 1
25 × Eureka!Are you running it in venv mode or docker mode?
Hi JitteryCoyote63
If you want to stop the Task, click Abort (Reset will not stop the task or restart it, it will just clear the outputs and let you edit the Task itself) I think we witnessed something like that due to DataLoaders multiprocessing issues, and I think the solution was to add 'multiprocessing_context='forkserver' to the DataLoaderhttps://github.com/allegroai/clearml/issues/207#issuecomment-702422291
Could you verify?
It works. However, still, it sometimes takes a strangely long time for the agent to pick up the next task (or process it), even if it is only "Hello World".
The agent check every 2/5 seconds if there is a new Task to be launched, could that be it?
Hi @<1704304350400090112:profile|UpsetOctopus60>
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_kubernetes_helm
Just use the helm charts. It's the easiest
But a warning instead of an error would be good.
Yes, that makes sense, I'll make sure we do that
Does this sound like a reasonable workflow, or is there a better way maybe?
makes total sense to me, will be part of next RC 🙂
Hi ExcitedCat13
Sure, download the plugin from the git repo (Install instructions in the repo).
Regarding remote debugging, are referring to ssh ?
The plugin itself is designed to make sure that when you work on a remote machine with pycharm clearml will log the local git repo and changes (as the .git folder is not synced to the remote machine)
Test it on your local setup (I would hate to push a broken fix)
Is that possible?
UnsightlyShark53 See if this one solves the problem :)
BTW: the reasoning for the message is that when running the task with "trains-agent" if the parsing of the argparser happens before the the Task is initialized, the patching code doesn't know if it supposed to override the values. But this scenario was fixed a long time ago, and I think the error was mistakenly left behind...
Hi DeliciousBluewhale87
Yes that should have worked, can you verify the task status ?
Print(Task.get_task(...).get_status())
btw:
If you need to access it, just bash into the running dockerdocker exec -it <container_name> /bin/bash
I can probably have a python script that checks if there are any tasks running/pending, and if not, run docker-compose down to stop the clearml-server, then use boto3 to trigger the creating of a snapshot of the EBS, then wait until it is finished, then restarts the clearml-server, wdyt?
I'm pretty sure there is a nice way, let me check soemthing
. Are there any option to remove the example projects?
So sorry just realized I missed your message
Yes, but I'm not sure it will have an effect, see here
why the memory usage of the elastic search still persist on 32 gb after removing experiments?
did you restart the server after removing the experiments?
5 seconds will be a sleep between two consecutive pulls where there are no jobs to process, why would you increase it to a higher pull freq ?
The task pod (experiment) started reaching out to an IP associated with malicious activity. The IP was associated with 1000+ domain names. The activity was identified in AWS guard duty with a high severity level.
BoredHedgehog47 What is the pod container itself ?
EDIT:
Are you suggesting the default "ubuntu:18.04" is somehow contaminated ?
https://hub.docker.com/layers/library/ubuntu/18.04/images/sha256-d5c260797a173fe5852953656a15a9e58ba14c5306c175305b3a05e0303416db?context=explore
ReassuredTiger98 I can verify the code snippet reproduces the issues with packages missing from "installed package".
If you feel this is important, please open a GitHub issue.
Also, you can manually add packages:
Task.add_requirements('package_name_here', 'optional version here')
So when you manually load the package you can make sure it will be listed, do remember to call it before the Task.init call.
ConfusedPig65 could you send the full log (console) of this execution?
SubstantialElk6 I know they have full permission control in the enterprise edition, if this is something you need I suggest you contact http://allegro.ai 🙂
WackyRabbit7 I guess we are discussing this one on a diff thread 🙂 but yes, should totally work, that's the idea
Hi @<1593413673383104512:profile|MiniatureDragonfly17>
These are the specific model input/output layers name.
The way Triton analyses PyTorch model is usuallyinput__0 then input__1 for the input layers and output__0 and so on for the results:
You can see an example here:
None
--input-size 1 28 28 --input-name "INPUT__0" --input-type float32 --output-size -1 10 --output-name "OUTPUT__0" --outpu...
The package is just subdir by the way. So it should not be in installed packages anyways, right?
Correct, also when the agent is spinning the code it will automatically add the root of the git repository to the pythonpath so you should be able to load the package.
ReassuredTiger98 maybe we should add an option to send a text next to the abort?
(Actually it is just a matter of passing the argument)
wdyt?
Hi PerfectChicken66
every X iterations and delete the older ones with
I have to ask, why not just overwrite the artifact? it is basically the same, no ?!
older ones with
delete_artifacts
from
Task
I think you are correct, when you delete the entire Task you can specify, delete artifacts, but it does not do that on delete_artifact 😞
You can manually do that with:
` task._delete_uri(task.artifacts["artifact"].url)
task.delete_artifact() ...
ShallowGoldfish8 I believe it was solved in 1.9.0, can you verify?pip install clearml==1.9.0
You are correct, the agent will clone the git and install the requirements, as written in the task installed packages section. Regrading the git branch, notice it will pull the specific commit id as stated in the execution section, and it will apply any uncommitted changes. You can edit the execution section and change the commit to the latest in a specific version (you should probably also clear the uncommitted changes of you do that)
Hi CleanPigeon16
Yes there is, when you are cloning the pipeline in the UI, go to the Configuration/Pipeline/continue_pipeline and change it to True
Hi GreasyPenguin14
Quick question, any reason not to use a 2D scatter ? or a histogram (or any other non time-series plot)?
I did not start with python -m, as a module. I'll try that
I do not think this is the issue.
It sounds like anything you do on your specific setup will end with the same error, which might point to a problem with the git/folder ?