We should probably have a section on that (i.e. running two agents on the same GPU, then explain how to use it)
and: " clearml_agent: ERROR: 'charmap' codec can't encode character '\u0303' in position 5717: character maps to <undefined> "
Ohh, that's the issue with LC_ALL missing in the docker itself (i.e. unicode characters will break it)
Add locales into the container; in your clearml.conf add the following:
agent.extra_docker_shell_script: ["apt-get install -y locales",]
Let me know if that solves the issue (as you pointed out, it has nothing to do with importing package X)
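For reference, a sketch of how that could look as a block in clearml.conf (the locale-gen line and the LC_ALL environment variable are my assumption for a complete fix, not something you necessarily need):
```
agent {
    # run inside the container before the task starts
    extra_docker_shell_script: [
        "apt-get install -y locales",
        "locale-gen en_US.UTF-8",
    ]
    # pass LC_ALL into the container (assumption)
    extra_docker_arguments: ["-e", "LC_ALL=en_US.UTF-8"]
}
```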
Sure GiddyTurkey39, check out the cleanup service:
https://github.com/allegroai/trains/blob/master/examples/services/cleanup/cleanup_service.py
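In essence the service periodically queries stale tasks and deletes them. A minimal sketch of the idea (the 30-day threshold and the status filter are placeholders; the linked script is the full reference implementation):
```python
from datetime import datetime, timedelta
from clearml import Task

# find completed tasks whose status did not change in the last 30 days
threshold = datetime.utcnow() - timedelta(days=30)
tasks = Task.get_tasks(
    task_filter={
        'status': ['completed'],
        'status_changed': ['<{}'.format(threshold.strftime('%Y-%m-%dT%H:%M:%S'))],
    }
)
for task in tasks:
    # remove the task together with its artifacts and models
    task.delete(delete_artifacts_and_models=True)
```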
Hi SmoothSheep78
Do you need to import the previous state of the trains-server, or are you starting from scratch?
UpsetCrocodile10
Does this method expect `my_train_func` to be in the same file as ...
As long as you import it and you can pass it, it should work.
Child exp gets aborted immediately ...
It seems it cannot find the file "main.py". It assumes all code is part of a single repository; is that the case? What do you have under the "Execution" tab for the experiment?
Hi UpsetCrocodile10
execute them and return scalars.
This should be a good start (I hope 🙂 )
```python
from clearml import Task

for child in children:
    # put the Task into an execution queue
    Task.enqueue(child, queue_name='my_queue_here')
    # wait for the task to finish
    child.wait_for_status(status=['completed'])
    # reload all the metrics
    child.reload()
    # get the metrics
    print(child.get_last_scalar_metrics())
```
Hi @<1551376687504035840:profile|StraightSealion9>
AWS Autoscaler to create a new instance when you enqueue a task to the relevant queue.
Does that mean you were able to enqueue a Task and have it launch on the remote EC2 machine?
Update us if it solved the issue (for increased visibility)
Hi FreshBat85
clearml_agent: ERROR: 'utf-8' codec can't decode byte 0xfc in position 38: invalid start byte
This is a notorious issue with Python and UTF-8/Unicode support.
Any chance there is "unicode"/utf8 code in the uncommitted changes section?
BTW you can set an environment variable before spinning the agent, telling it to always use UTF8:
set PYTHONUTF8=1
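For example, assuming a Windows agent (which matches the set syntax; on Linux use export PYTHONUTF8=1 instead, and the queue name is a placeholder):
```
:: set UTF-8 mode, then launch the agent in the same shell
set PYTHONUTF8=1
clearml-agent daemon --queue default
```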
Or maybe do you plan to solve this problem in the very short term?
Yes we will 🙂
WickedElephant66 this seems like a general network issue, like the docker service is missing your company's firewall certificate.
Can you pull any container from Docker Hub?
ZanyPig66 is this reproducible? This sounds like a bug. What's the TB version and OS you are using?
Is this example working for you (i.e. do you see debug images)?
https://github.com/allegroai/clearml/blob/master/examples/frameworks/pytorch/pytorch_tensorboard.py
But I think this error has only appeared since I upgraded to version 1.1.4rc0
Hmm let me check something
Ohh, so the setup.py is the one containing these requirements; oops, I totally missed that :( Let me check what PEP has to say about that ... (basically this is not a clearml issue but a pip one...)
Do you have a roadmap which includes resolving things like this?
Security, SSO, etc. are usually out of scope for the open-source platform, as they really make the entire thing a lot harder to install and manage. That said, I know that the Enterprise solution does have SSO and LDAP support and probably way more security features. I hope it helps 🙂
Hi LazyTurkey38
Is it possible to have the agents keep a local version and only download the diff of the job commit to speed things up?
This is what it does: it keeps a local cached copy and only pulls the latest changes
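For reference, the caching is controlled by the vcs_cache section of clearml.conf (these are, to my understanding, the defaults):
```
agent {
    vcs_cache {
        # keep a local clone per repository and only fetch new commits
        enabled: true
        path: ~/.clearml/vcs-cache
    }
}
```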
Hi @<1624941407783358464:profile|GrievingTiger47>
I think you should try to contact the sales guys here: None
Hi RoughTiger69
Is the pipeline in question based on decorators or is it based on existing Tasks?
Metadata might be expensive; it's a REST API call, and we have found users putting hundreds of artifacts, with preview entries ...
I want to run only that sub-dag on all historical data in ad-hoc manner
But wouldn't that be covered by the caching mechanism?
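For reference, assuming a decorator-based pipeline, per-step caching can be turned on like this (function name, argument, and body are placeholders):
```python
from clearml.automation.controller import PipelineDecorator

# cache=True: if the step code and its inputs are unchanged,
# the previously completed step run is reused instead of re-executed
@PipelineDecorator.component(cache=True)
def preprocess_subdag(dataset_id):
    # placeholder body for the sub-dag step
    return dataset_id
```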
I pull all the parameters, and then manually filter on the HP keys (manually=I have to plug them in, they are not part of optimizer object)
So is this an improvement to optimizer._get_child_tasks_ids(...) interface ?
e.g. return a structure like:
[
    {
        'id': task_id,
        'hp1': value, 'hp2': value, 'hp3': value,
        'objective': dict(title='title', series='series', value=42),
    },
]
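In the meantime, a hedged sketch of assembling something similar manually (assuming the children share the optimizer task as their parent; the hp names and the objective title/series are placeholders):
```python
from clearml import Task

# optimizer_task_id is the ID of the HPO controller task (placeholder)
children = Task.get_tasks(task_filter={'parent': optimizer_task_id})
rows = []
for child in children:
    params = child.get_parameters()  # flat dict, e.g. {'General/hp1': '0.1', ...}
    metrics = child.get_last_scalar_metrics()  # {title: {series: {'last': ...}}}
    rows.append({
        'id': child.id,
        'hp1': params.get('General/hp1'),
        'objective': dict(
            title='title', series='series',
            value=metrics.get('title', {}).get('series', {}).get('last'),
        ),
    })
```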
quick question:
CLEAR_DATA="./data/dataset_for_modeling"
Should I pass the folder of the extracted zip file (assuming train.txt is the training dataset)?
if the file is untracked by git, it is not saved by clearml
Yep 😞
Does clearml-agent install the repo with `pip install -e .`?
It is supported, but the path to the repo cannot be absolute (as it will probably be something else in the agent env)
You can add "git+ https://github.com ...." to the "installed packages" The root path of your repository is always added to the PYTHONPATH when the agents executes it, so in theory there is no need to install it wi...
Could it be the code is not in a git repository?
clearml supports either a single script or a git repository, but not a collection of standalone files. wdyt?
Hi GreasyPenguin66
Is this for the client side? If it is, why not set them in the clearml.conf?
Hi DilapidatedDucks58
e.g., we want max validation accuracy and all other metric values for the corresponding epoch
Is this the equivalent of nested sort ?
Wouldn't you get the requested behavior if you add all metric columns but sort based on the "accuracy" column?