
AgitatedTurtle16 could you check with the latest clearml RC (I remember a similar issue was fixed).
pip install clearml==0.17.5rc3
Then run again:
clearml-task ...
Yes, found the issue :) I'll see to it there is a fix in the next RC. ETA early next week
Fix pushed to github 🙂
pip install git+
MysteriousBee56 there is no way to tell the trains-agent to pull from a local copy of your repository...
You might be able to hack it, if you copy the entire local repo into the trains-agent version control cache. Would that help you?
task.connect is two-way, it does everything for you:
base_params = dict(param1=123, param2='text')
task.connect(base_params)
print(base_params)
If you run this code manually, then print shows exactly what you initialized base_params with. But when the agent is running it, it will take the values from the UI (including casting to the correct type), so print will show the values/types from the UI.
Make sense?
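To make this concrete, a minimal runnable sketch (project/task names are illustrative):
from clearml import Task

task = Task.init(project_name='examples', task_name='connect demo')

base_params = dict(param1=123, param2='text')
task.connect(base_params)  # registers the dict; under an agent, values are replaced with the UI values
print(base_params)  # manual run: the values above; agent run: the UI values, cast to the right types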
No worries, let's assume we have:
base_params = dict(
    field1=dict(param1=123, param2='text'),
    field2=dict(param1=123, param2='text'),
    ...
)
Now let's just connect field1:
task.connect(base_params['field1'], name='field1')
That's it 🙂
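Putting it together, a small sketch (the field contents are illustrative):
from clearml import Task

task = Task.init(project_name='examples', task_name='nested connect demo')

base_params = dict(
    field1=dict(param1=123, param2='text'),
    field2=dict(param1=456, param2='more text'),
)
# Each connect() call gets its own section in the UI, named via the `name` argument
task.connect(base_params['field1'], name='field1')
task.connect(base_params['field2'], name='field2')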
Hi ProudChicken98
task.connect(input) preserves the types based on the "input" dict types; on the flip side, get_parameters returns the string representation (as stored on the clearml-server).
Is there a specific reason for using get_parameters over connect?
Oh :)
task.get_parameters_as_dict()
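A quick sketch of the difference (assuming an initialized Task; the 'General/' prefix below reflects the default section connected parameters are stored under):
from clearml import Task

task = Task.init(project_name='examples', task_name='params demo')

params = {'epochs': 10, 'lr': 0.001}
task.connect(params)  # values keep their Python types (int, float)

as_strings = task.get_parameters()  # flat dict, e.g. {'General/epochs': '10', ...} - all strings
as_dict = task.get_parameters_as_dict()  # nested dict, grouped by section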
Can I change the parameters before executing the draft task?
Yes you can, after you clone the experiment everything becomes editable, so you can edit the config in the UI.
For example, let's assume I have config.yml, and in my code I do:
my_file = task.connect_configuration('config.yml')
with open(my_file, 'rt') as f:
    ...
Then after I clone it in the UI and edit the configuration, when it is executed remotely, my_file will contain the content of the configuration as s...
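For instance, a minimal sketch assuming config.yml holds YAML:
import yaml
from clearml import Task

task = Task.init(project_name='examples', task_name='config demo')

my_file = task.connect_configuration('config.yml')  # returns a path to the configuration file
with open(my_file, 'rt') as f:
    config = yaml.safe_load(f)  # local run: your file; remote run: the (possibly UI-edited) content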
Hi GiddyTurkey39
Is the config file connected to the Task via Task.connect_configuration?
One last thing: make sure you spin up the pod container in privileged mode, because the trains-agent docker will spin a sibling docker for your actual experiment.
If you need to change the values:
config_obj.set(...)
You might want to edit the object on a copy, not the original 🙂
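For example, a sketch of editing a copy (config_obj's exact API depends on the configuration object you are using; the key and value here are illustrative):
import copy

edited = copy.deepcopy(config_obj)  # work on a copy so the original stays intact
edited.set('some.key', 'new value')  # illustrative key/value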
it worked!!!!
YEY!
I pass the IDs to the docker container as environment variables, so this does require restarting the docker container, but I guess we can live with that for now
So this would help you decide which actual Model file to download? (Trying to understand how the argument is being used; meaning, should we have it stored somewhere? There is meta-data on the Model itself, so we can use that to store the data.)
BTW: the cloning error is actually the wrong branch. If you take a look at your initial screenshot, you can see on the line before last branch='default', which I assume should be branch='master'.
(The error itself is still weird, but I assume this is what git is returning.)
Okay
Try to reset the experiment and resend it for execution; let me know if you still get the error. If you do, could you send a screen grab of the Execution tab? Trains supports either a git repo or standalone code (jupyter), but not a mixture of the two. This means that if you want to run the jupyter/colab, the cloning will have to be part of the notebook itself (as you already have it). That said, due to the way CoLab works, Trains will log your execution history (as opposed to the entire jupy...
Simple git clone on that repo works well
On the machine running the trains-agent ?
I know there is an aux cfg with key-value pairs, but how can I use it in the python code?
This is actually for helping to configure Triton services; you cannot (I think) easily access it from the code
or shall I call the Task.init even from the agent?
WorriedParrot51 I think something is lost here.
Task.init() is always called, even when the agent is executing the code. The difference is in what happens inside the Task.init() call. When the codebase itself is executed by the trains-agent, it signals through OS environment variables to Task.init() that instead of creating a new task, it should use the already created one. From this point all data flows from the trains-server back into the c...
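In other words, the code is identical in both modes; a sketch:
from clearml import Task

# The exact same call whether you run it manually or via the agent:
task = Task.init(project_name='examples', task_name='my experiment')
# Manual run: Task.init() creates a new task on the server.
# Agent run: an OS environment variable tells Task.init() to reuse the
# already-created task, so data flows from the trains-server into the code.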
but it is still not able to run any task after I abort and rerun another task
When you "run" a task you are pushing it to a queue, so how come the queue is empty? What happens after you push your newly cloned task to the queue?
See "Add an experiment hyperparameter", and add gpu: True
Notice you should be able to override them in the UI (under the Args section).
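One way to add such a flag from code (a sketch; using connect() with name='Args' so it lands under the Args section is an assumption here):
from clearml import Task

task = Task.init(project_name='examples', task_name='gpu flag demo')

args = {'gpu': True}
task.connect(args, name='Args')  # shows up under the Args section, editable in the UI
if args['gpu']:
    pass  # GPU code path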
Hi @<1597762318140182528:profile|EnchantingPenguin77>
but it seems like clearml always creates a virtual environment
Yes that's correct, but the new venv inside the container inherits from the system packages (so if nothing changes it does nothing)
Is there a way that I can have clearml-task automatically use the activated custom virtual environment in my docker and run the scripts?
You can, but the "correct" way to work with python and co...
is it displaying that it is running anything?
Hi TightElk12
it would raise an error if the env where execution happens is not configured to track things on our custom server, to prevent logging to the public demo server?
What do you mean by that? Catching the default server instead of the configured one?
I see TightElk12
You can always set up the OS environment variables CLEARML_API_HOST, CLEARML_WEB_HOST, and CLEARML_FILES_HOST with the correct configuration. Or you can simply set CLEARML_NO_DEFAULT_SERVER=1, which will prevent any usage of the default demo server.
wdyt?
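For example, set them before clearml is used (a sketch; the host values are placeholders):
import os

# Point the SDK at your own server...
os.environ['CLEARML_API_HOST'] = 'http://your-server:8008'
os.environ['CLEARML_WEB_HOST'] = 'http://your-server:8080'
os.environ['CLEARML_FILES_HOST'] = 'http://your-server:8081'
# ...or refuse to fall back to the public demo server:
os.environ['CLEARML_NO_DEFAULT_SERVER'] = '1'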
Thanks for the details @<1597762318140182528:profile|EnchantingPenguin77>
clearml.Auto-Scaler - INFO - New instance b97e702d-e2b3-4f28-adab-be59648601ea listening to test-gpu queue
This looks like a new agent was spun up on your EC2 account; can you see it in the "Workers" page?
I think the part that is missing for me is the context, in other words how would one configure the execution_plan
and why would they configure it in a specific way?
My intuition, without fully understanding it, is that for some reason the internal DAG/decision is exposed to the user, and it feels like too much information. Basically I have a hunch that the users should not need to have such deep understanding to control the flow, and they should end up with an abstraction on top of it. ...
Where would I put these credentials? I don't want to expose them in the logs as environment variables or hard-code them.
Hi GleamingGrasshopper63
So basically you need a vault to store those credentials...
Unfortunately the open-source version does not contain vault support, but the paid tiers scale/enterprise do.
There you can have an environment variable defined in the vault, that each time the agent runs your code, it will pull it from the vault and set it on your process. wdyt ?
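From the code's perspective nothing changes; the variable simply shows up in the process environment (the variable name below is hypothetical):
import os

# The agent injects the vault-defined variable before the process starts
api_key = os.environ['MY_SERVICE_API_KEY']  # hypothetical name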