AgitatedDove14 True. Haha, not sure, now are we talking about a mini wiki kind of thing? And the Slack thing is actually a good workaround for this, since people can just comment easily. It would only be bad if Slack is not allowed to run in certain orgs or whatever.
AgitatedDove14 Wow, this can go many different ways actually! I didn't think too deeply about where to place it. I just thought a comment section under tasks might be helpful, but you're right, it might be too granular. An experiment-wide thread might be nice, same with the project level too haha. Maybe.... an entire separate "forum" module might be cool (a whole new icon on the left hand side below the worker/agent button) where people can do @'s for experiments/projects/tasks and even c...
SuccessfulKoala55 Thank you for your wonderfully helpful, fast replies as always. This is cool! So... I currently have the servers up via the docker-compose up command, which has a lot of containers up and running. I assume... I need to spawn a bash shell inside the clearml-webserver container (currently there is no /opt/clearml/config/apiserver.conf file), create an apiserver.conf inside /opt/clearml/config, add those usernames and passwords, save, and then restart ...
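For reference, a minimal fixed-users apiserver.conf usually looks along these lines (the username, password, and display name here are placeholders, not values from this thread):

```
auth {
    fixed_users {
        enabled: true
        users: [
            {
                username: "jane"
                password: "12345678"
                name: "Jane Doe"
            }
        ]
    }
}
```

After saving the file on the host under /opt/clearml/config, the apiserver container needs a restart to pick it up.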
CostlyOstrich36 Yep! Okay, so adding OutputModel().update_weights('my_best_model.bin') was enough! Now my outputs are stored in both the Models AND Artifacts tabs, which is what I want. Thanks!
AgitatedDove14 hello, I think I got this to work. But just to make sure I am doing this correctly... in order to get the task_id, do I have to upload and run a script with its default value first (since I don't have an initial task id), then clone it, edit the configuration inside that newly cloned one, get the id of the clone, pass this into my script as the task_id, and run it from my machine?
Is there a way to do this without running it on my machine? Like another user can e...
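A sketch of that clone-and-enqueue flow (illustrative only; it needs the clearml SDK and a running server, and the task ID, parameter name, and queue name below are all placeholders):

```
from clearml import Task

# Grab the task that was run once as a "template"
template = Task.get_task(task_id="<template_task_id>")

# Clone it and tweak the clone's configuration
cloned = Task.clone(source_task=template, name="cloned experiment")
cloned.set_parameter("General/learning_rate", 0.001)

# Hand the clone to an agent; the experiment itself never runs locally
Task.enqueue(cloned, queue_name="default")
```

Since cloning and enqueuing can also be done from the web UI, another user would not even need this script, only access to the server.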
SuccessfulKoala55 Thank you! It worked!!!
Awesome. Thank you !!!!
AgitatedDove14 I am an idiot. Thank you for your wonderful patience. Yes, you are absolutely right. So I just needed to take the generated API Access key and Secret key, export them as env variables, and... boom! Now I have a clearml-services worker/agent on my server ready for work!
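For anyone following along, the flow was roughly this (the key values are placeholders; the env var names are the standard ClearML ones, and the compose file path assumes the default /opt/clearml install):

```
# Credentials generated in the ClearML web UI (placeholders here)
export CLEARML_API_ACCESS_KEY="<access_key>"
export CLEARML_API_SECRET_KEY="<secret_key>"

# Bring the stack back up so the clearml-agent-services
# container picks the credentials up from the environment
docker-compose -f /opt/clearml/docker-compose.yml up -d
```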
CostlyOstrich36 Thanks for your quick replies as always. I sadly can't find where the Execution tab is... I'm testing with the self-hosted server and the free http://clear.ml account, and I don't see it.
CostlyOstrich36 Got it thanks!!
AgitatedDove14 Thanks for the follow-up as usual. And roger that. But I am curious, how would I get an agent to launch on the same instance as my clearml server? I figured I might as well set it up since it's there haha
This can be the "production" tag, which a single task with the best results should hold.
Sorry, I'll update that. I meant the best artifacts, i.e. multiple artifacts of the same task that I thought were the best. It may or may not be the latest task.
Thanks a lot AgitatedDove14 (Sorry for the late response) !!
Ah the latest tag of production sounds good to me. Then I guess I can retrieve the artifacts from that specific latest tagged task then.
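Something like this should retrieve it (a sketch only; the project name, tag, and artifact name are placeholders, and it assumes the "production" tag is kept on exactly the tasks that matter):

```
from clearml import Task

# Find tasks carrying the "production" tag, newest first
tasks = Task.get_tasks(
    project_name="my_project",
    tags=["production"],
    task_filter={"order_by": ["-last_update"]},
)
latest = tasks[0]

# Pull a named artifact from the latest tagged task
local_path = latest.artifacts["best_model"].get_local_copy()
```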
Yep, automatically moving a tag
I'll check them out for sure!
hi CostlyOstrich36 thanks for responding. So running that command I get
```
[2021-10-19 20:53:52,726] [9] [INFO] [clearml.app_sequence] ################ API Server initializing #####################
[2021-10-19 20:53:52,727] [9] [INFO] [clearml.database] Initializing database connections
[2021-10-19 20:53:52,727] [9] [INFO] [clearml.database] Using override mongodb host mongo
[2021-10-19 20:53:52,728] [9] [INFO] [clearml.database] Using override mongodb port 27017
[2021-10-19 20:53:52,729] ...
```
I believe I ran that vm command already
CostlyOstrich36 SuccessfulKoala55 super late update, but it turns out I needed to beef up the machine. Thanks for all the help!
Okay thanks CostlyOstrich36 and SuccessfulKoala55 I'll beef up my server first and then run this again.
CostlyOstrich36 uh oh... I think I need more memory...
```
There is insufficient memory for the Java Runtime Environment to continue.
Native memory allocation (mmap) failed to map 2060255232 bytes for committing reserved memory.
An error report file with more information is saved as:
logs/hs_err_pid59.log
error:
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
OpenJDK 64-Bit Server VM warning: IN...
```
CostlyOstrich36 Oh geez, you're going to laugh, but I'm using an EC2 free-tier instance and it only gives me 1 GiB of memory
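The numbers in that hs_err log line up with the diagnosis: the Elasticsearch JVM asked the OS for 2060255232 bytes of heap in a single mmap call. A quick sanity check (plain Python arithmetic, nothing ClearML-specific):

```python
# Size of the failed mmap request, taken from the error log above
requested_bytes = 2060255232

# Convert to GiB to compare against the instance's 1 GiB of RAM
requested_gib = requested_bytes / (1024 ** 3)
print(f"{requested_gib:.2f} GiB requested")  # prints "1.92 GiB requested"
```

Nearly 2 GiB requested on a 1 GiB box, so the allocation can only fail; hence the "beef up the machine" resolution below.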
oh wait, I don't see the 99-clearml.conf yet... let me try that before I kill this instance
And I think the stopped state allows me to keep my artifacts!
CostlyOstrich36 Oh, I'm just thinking: if it was super late at night and I'm at near 0% brain power, I might accidentally publish the wrong experiment (even though maybe I told myself not to publish until I'm not so tired) haha. It would be nice to undo it without having to delete it completely.
And just curious, I tried out the reset code, but I get an error.
Here's my code:
```
from clearml import Task
a_task = Task.get_task(task_id='7dae94daf28144b09011e0582bcd130e')
a_task.reset(force=True) ...
```
CostlyOstrich36 You're definitely not stupid! Thanks for all your help around here !!
Looks like the ElasticSearch service is down?
CostlyOstrich36 Okay, gotcha. Actually, I found a workaround! I just use a_task.mark_stopped(force=True), or even a_task.mark_started(force=True) works too!