
What if I can’t use docker mode because the agent is already running inside docker?
no, without docker
And one more thing (it seems the agent didn’t pull all the necessary code):
I’m trying to run the task export.py with a remote agent. But in this script there are some imports from my other .py scripts:
import torch - this module should be installed with pip
from generate_triton_config import generate_configs - this is an import from generate_triton_config.py
And it also raises an error:
Traceback (most recent call last):
File "/root/.clearml/venvs-builds/3.10/code/export.py", line 8,...
Thank you!
That works
CLEARML_AGENT_SKIP_PIP_VENV_INSTALL & CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL
But I still have a problem: the agent didn’t pull all the necessary code
Do you have any ideas why that happens?
In “Installed packages” section, I see:
# Local modules found - skipping:
# generate_triton_config == ../generate_triton_config.py
Maybe this is related somehow?
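To check which imports of a script point at local files rather than pip packages, a rough sketch like the following can help. It is not how ClearML itself detects local modules; the helper name local_imports is hypothetical, and it assumes the local modules sit in the same folder as the script (in the log above the module is reported one level up, so the folder to check may need adjusting).

```python
import ast
from pathlib import Path


def local_imports(script_path):
    """Return imported top-level module names that exist as files next to the script."""
    script = Path(script_path)
    tree = ast.parse(script.read_text())

    # Collect the top-level name of every absolute import in the script.
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])

    # Keep only names that resolve to a module file or package in the script's folder.
    folder = script.parent
    return sorted(
        name
        for name in names
        if (folder / f"{name}.py").is_file() or (folder / name / "__init__.py").is_file()
    )
```

Every name it returns is a file that has to reach the agent together with export.py (for example through the same git repository), since pip cannot install it.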
It even failed to create the correct environment
And when it tried to execute import torch
It raised a large error:
ImportError: cannot import name '_get_sequence_nr' from 'torch._C._autograd' (unknown location)
@<1523701070390366208:profile|CostlyOstrich36> maybe any other ideas on how to fix this? 🫠
from clearml import Task
from clearml.automation import TaskScheduler
# Create the scheduler and make it poll quickly for demo purposes
scheduler = TaskScheduler(
    sync_frequency_minutes=1,
    force_create_task_project='name',
    force_create_task_name='Scheduler'
)
# Get the task that we want to rerun
# task_to_schedule = Task.get_task(project_name='name', task_name='')
task_to_schedule = Task.get_task(task_id='8ee7b88505a...')
# Add the scheduler based on task above and overri...
Yes, if I manually put any task in this queue, it starts without problems
Can you provide information on how to do that?
Okay thank you so much
But I think I solved the problem with credentials by using clearml_agent v1.8.1rc2
But now I have an issue with local python modules 🫠
Even when I set
agent.skip_pip_venv_install = 1
agent.skip_python_env_install = /usr/bin/python
In worker logs I see:
Environment setup completed successfully
Starting Task Execution:
@<1698868530394435584:profile|QuizzicalFlamingo74> did you find a solution?
@<1523701070390366208:profile|CostlyOstrich36> yes, I use the same clearml.conf file
sdk{
environment{
test_var1: "var1_value"
test_var2: "var2_value"
Sorry, I didn’t really understand the part about the top level
What do you mean?
@<1523701070390366208:profile|CostlyOstrich36> any updates here?
I get the same result
But then I will have empty credentials inside S3BucketConfigurations, here
Update: if I delete the keys ‘key’ and ‘secret_key’ from the top section, I get
2024-05-27 16:29:51,597 - clearml.storage - ERROR - Failed creating storage object
Reason: Missing key and secret for S3 storage access (
)
So I need to use export before spinning up the clearml-agent daemon?
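Exporting variables in the shell before starting the daemon amounts to launching it with a modified environment. A minimal sketch of the same idea from Python, assuming the agent is started with the clearml-agent daemon --queue command; the helper name agent_launch is hypothetical and the variable in the usage line is the placeholder from the config snippet above:

```python
import os


def agent_launch(queue, extra_env, base_env=None):
    """Build the daemon command and the environment it should inherit."""
    # Start from the current process environment unless another one is given.
    env = dict(os.environ if base_env is None else base_env)
    # Variables added here are visible to the daemon process once it is started with env=env.
    env.update(extra_env)
    cmd = ["clearml-agent", "daemon", "--queue", queue]
    return cmd, env


if __name__ == "__main__":
    cmd, env = agent_launch("default", {"test_var1": "var1_value"})
    print(" ".join(cmd))
    # To actually start the agent: subprocess.run(cmd, env=env, check=True)
```

The queue name "default" is only an example value.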
@<1709740168430227456:profile|HomelyBluewhale47> @<1523701070390366208:profile|CostlyOstrich36> any updates here?
I also face the same problem: envs that I set in clearml.conf can’t be used when I run the agent
Thank you this works 🙌
But by the way, why didn’t the agent read the env variables from the clearml.conf file?
Yes, I see these tasks in the queue, but they are just ‘pending’
But as I said before, it seems that the agent got them:
And in agent log I can see:
agent.environment.login = ****
agent.environment.pass = ****
@<1523701070390366208:profile|CostlyOstrich36> any updates?
Is it possible to avoid using docker mode?
I tried:
1. output_url = "
"
2. output_url = "
"
3. output_url = "s3://"
Now I use the same config as in the original question:
s3 {
    key: "key"
    secret: "sec_key"
    credentials: [
        {
            host: "host:port"
            bucket: "bucket_name"
            key: "key"
            secret: "sec_key"
            multipart: false
            secure: false
        }
    ]
}
And pass output...
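For a non-AWS S3 endpoint, the output URL is, as far as I understand the ClearML storage docs, expected in the form s3://host:port/bucket, so that it matches the host entry in the credentials list. A small sketch that builds such a URL from the config values above; the helper name s3_output_uri and the prefix argument are hypothetical, and the values are the placeholders from the config:

```python
def s3_output_uri(host, bucket, prefix=""):
    """Build s3://host:port/bucket[/prefix] from the pieces of one credentials entry."""
    # Strip stray slashes so the parts join with exactly one separator.
    uri = f"s3://{host.strip('/')}/{bucket.strip('/')}"
    prefix = prefix.strip("/")
    return f"{uri}/{prefix}" if prefix else uri


if __name__ == "__main__":
    print(s3_output_uri("host:port", "bucket_name"))
```

The host part has to be written exactly as in the host field of the credentials entry (including the port); otherwise the entry may not be matched.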
@<1523701070390366208:profile|CostlyOstrich36>
And if I use docker mode, will the agent read env variables from the clearml.conf file?
@<1689446563463565312:profile|SmallTurkey79> did you solve this issue with fatal: could not read Username?
@<1523701070390366208:profile|CostlyOstrich36> looks the same to me, I used this example to set up my config
Or is there any difference?
I tried None :port/bucket
But the result is still the same 🥲
I think the SDK is aware that this is not Amazon, because when I specify the ‘region’ field it ignores my host and uses the Amazon host