
I was asking to exclude this possibility from my debugging journey 😁
CostlyOstrich36 I don’t see such a number, can you please share a screenshot of where to look?
Actually it was not related to clearml; the higher-level error causing this one was (somewhere in the stack trace): RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd
-> wrong numpy version
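A minimal sanity check for this, assuming the mismatch comes from an extension module (e.g. torch) built against a newer numpy C API than the one currently installed:

```python
# Check which numpy is actually being imported (assumption: an extension module
# was compiled against a newer numpy C API than the installed version).
import numpy
print(numpy.__version__)
# Upgrading numpy, e.g. `pip install --upgrade numpy`, typically resolves the
# "module compiled against API version ..." error.
```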
I’m not too fond of many user configurations; it’s confusing.
100% agree; nevertheless, how much is too many? Currently there are only two settings in the user preferences category, so one more wouldn’t hurt?
however, clearml is open source; nothing stops you from adding the code and sending a PR
I’d be super happy to contribute, yes! Nevertheless, I am not sure where to start: the clearml-server repo? The clearml-web repo?
I now have a different question: when installing torch from wheel files, am I guaranteed to get the corresponding cuda and cudnn libraries along with it?
I was rather wondering why clearml was taking up space when I had configured it to use the /data volume. But as you described AgitatedDove14, it looks like an edge case, so I don’t mind 🙂
Now it starts; I’ll see if this solves the issue
torch==1.7.1 git+
.
so what worked for me was the following startup userscript:
```
#!/bin/bash
sleep 120
while sudo fuser /var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock >/dev/null 2>&1; do echo 'Waiting for other instances of apt to complete...'; sleep 5; done
sudo apt-get update
while sudo fuser /var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock >/dev/null 2>&1; do echo 'Waiting for other instances of apt to complete...'; sleep 5; done
sudo apt-get install -y python3-dev python3-pip gcc git build-essential...
```
So in my use case each step would create a (potentially big) folder and store it as an artifact. The last step should “merge” all the previous folders. The idea is to split the work among multiple machines (in parallel). I would like to avoid these potentially big folder artifacts also being stored in the pipeline task, because that task will be running on the services queue of the clearml-server instance, which will definitely not have enough space to handle all of them
So if all artifacts are logged in the pipeline controller task, I need the last step to access all the artifacts from the pipeline task, i.e. execute something like PipelineController.get_artifact() in the last step task
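As a rough sketch of what the last step could do instead, assuming the previous steps’ task IDs are passed in as a parameter (the names step_task_ids and "out_folder" are made up for this example) and each step uploaded its folder with upload_artifact():

```python
from clearml import Task

# Hypothetical helper for the final "merge" step: download the folder artifact
# uploaded by each previous step directly from its own task (stored on s3),
# without going through the pipeline controller task.
def collect_step_outputs(step_task_ids):
    local_folders = []
    for task_id in step_task_ids:
        step_task = Task.get_task(task_id=task_id)
        # get_local_copy() downloads the artifact and caches it locally
        local_folders.append(step_task.artifacts["out_folder"].get_local_copy())
    return local_folders
```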
with the CLI, on a conda env located in /data
Oh I see, I think we are now touching on a very important point:
I thought that torch wheels already included the cuda/cudnn libraries, so you don’t need to care about the system cuda/cudnn version because eventually only the cuda/cudnn libraries extracted from the torch wheels are used. Is this correct? If not, does that mean that one should use conda to install the correct cuda/cudnn via cudatoolkit?
As you can see: more hard waiting (the initial sleep), and then before each apt action, making sure there is no lock
From https://discuss.pytorch.org/t/please-help-me-understand-installation-for-cuda-on-linux/14217/4 it looks like my assumption is correct: there is no need for cudatoolkit to be installed, since the wheels already contain all the cuda/cudnn libraries required by torch
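A quick way to verify this in a given environment (assuming a CUDA-enabled torch wheel is installed) is to print the versions bundled with the wheel:

```python
import torch

# The CUDA/cuDNN versions reported here come from the libraries shipped inside
# the torch wheel, independent of any system-wide CUDA toolkit installation.
print(torch.__version__)
print(torch.version.cuda)               # CUDA runtime bundled with the wheel
print(torch.backends.cudnn.version())   # cuDNN bundled with the wheel
print(torch.cuda.is_available())        # still requires a compatible NVIDIA driver
```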
there is no error from this side; I think the aws autoscaler just waits for the agent to connect, which will never happen since the agent won’t start because the userdata script fails
That’s why I said “not really” 😄
CostlyOstrich36 How is clearml-session setting the ssh config?
There it is: https://github.com/allegroai/clearml/issues/493
AgitatedDove14 I see at https://github.com/allegroai/clearml-session/blob/main/clearml_session/interactive_session_task.py#L21 that a key pair is hardcoded in the repo. Is it being used to ssh into the instance?
Will it freeze/crash/break/stop the ongoing experiments?
sorry, the clearml-session. The error is the one I shared at the beginning of this thread
In all the steps I want to store them as artifacts on s3 because it’s very convenient.
The last step should merge them all, i.e. it needs to know about all the artifacts from the previous steps
AgitatedDove14 Yes, with the command you shared I can now ssh manually to the agent again, but clearml-agent will still raise the same error
(BTW: it will work with elevated credentials, but that’s probably not recommended)
What does that mean? I’m not sure I understand
So this message appears when I try to ssh directly into the instance