AgitatedDove14 ⬆ please help 🙏
no need to do it again, I have all the settings in place; I'm sure it's not a settings thing
So just to correct myself and sum up, the credentials for AWS are only in the cloud_credentials_*
does the services mode have a separate configuration for base image?
FriendlySquid61
Just updating: I still haven't touched this... I didn't consider the time it would take me to set up the auto-scaling, so I must attend to other issues now. I hope to get back to this soon and make it work
a machine that had a previous installation, but I deleted the /opt/trains directory beforehand
Oh... from the docs I understood that I don't have to run the script, that I can either configure it in the UI or with the script (wizard), so I ignored it up until now
🤔 is the "installed packages" part editable? good to know
Isn't it a bit risky to manually change a package version? What if it isn't compatible with the rest?
Also, being able to separate their configuration files would be good (maybe there's a way and I just don't know it?)
Martin: In your trains.conf, change the value `files_server: "s3://ip:port/bucket"`
Isn't this a client configuration ( trains-init )? Shouldn't the change be in the server configuration ( /opt/trains/config... ) instead?
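For context, the files_server key Martin refers to lives in the client-side trains.conf (the file trains-init creates), under the api section. A minimal sketch, with the endpoints as placeholders:

```
api {
    web_server: "http://<server-ip>:8080"
    api_server: "http://<server-ip>:8008"
    # point artifact/model uploads at S3 instead of the built-in fileserver
    files_server: "s3://<ip>:<port>/bucket"
}
```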
```
ClearML launch - launch any codebase on remote machine running clearml-agent
Creating new task
Error: requirements.txt not found [/home/elior/Dev/Projects/XXXXX/requirements.txt] Use --requirements or --packages
```
inference table is a pandas dataframe
the google storage package could be the cause, because indeed we have the env var set, but we don't use the google storage package
Is there a more elegant way to find the process to kill? Right now I'm doing pgrep -af trains, but if I have multiple agents, I will never be able to tell them apart
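One way to tell agents apart is to start each daemon with a distinguishable name on its command line and filter the process table on that name, rather than a bare pgrep -af trains. A small sketch under that assumption (the "gpu0" worker name and the ps invocation are illustrative, not anything trains mandates):

```python
import subprocess

def find_agent_pids(ps_lines, worker_name):
    """Filter `ps -eo pid,args` output lines down to trains-agent
    processes whose command line mentions the given worker name."""
    pids = []
    for line in ps_lines:
        line = line.strip()
        if not line:
            continue
        pid, _, cmd = line.partition(" ")
        if "trains-agent" in cmd and worker_name in cmd:
            pids.append(int(pid))
    return pids

if __name__ == "__main__":
    ps = subprocess.run(["ps", "-eo", "pid,args"],
                        capture_output=True, text=True).stdout
    # skip the header line, then filter
    print(find_agent_pids(ps.splitlines()[1:], "gpu0"))
```

The filtering is kept in a pure function so it's easy to test without spawning agents.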
sorry I think it trimmed it
I want to collect the dataframes from the red tasks, and display them in the pipeline task
I think you are talking about separate problems: the "WARNING DIFF IS TOO LARGE" is only a UI issue, meaning you can't see the diff in the UI. Correct me if I'm wrong on this
Maria seems to be saying that the execution FAILS when she has uncommitted changes, which is not the expected behavior - am I right, Maria?
(I'm working with Maria)
essentially, what Maria is saying is: when she has a script with uncommitted changes and executes it remotely, the script that actually runs on the remote machine is without the uncommitted changes
e.g.:
Starting from a clean git status, she makes some changes to script.py and executes it remotely. What gets executed remotely is the original script.py, not the modified version she has locally
I'm not, I just want to be very precise and concise about them when I do ask... but bear with me, it's coming 🙂
The only way to change it is to convert apiserver_conf to a dictionary object ( as_plain_ordered_dict() ) and edit it
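The convert-then-edit dance can be wrapped in a small helper; once as_plain_ordered_dict() has produced a plain dict, setting a dotted key is ordinary Python. The key names in the usage comment are placeholders, not real apiserver settings:

```python
def override_key(conf_dict, dotted_key, value):
    """Set a dotted key like 'auth.fixed_users.enabled' in a nested dict,
    creating intermediate levels as needed."""
    node = conf_dict
    keys = dotted_key.split(".")
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return conf_dict

# usage sketch:
#   plain = apiserver_conf.as_plain_ordered_dict()
#   override_key(plain, "some.section.flag", True)
```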
Could be. My point is that, in general, the ability to attach a named scalar (without an iteration/series dimension) to an experiment is valuable and basic when you want to track a metric across different experiments
the Task object has a method called Task.execute_remotely
Look it up here:
https://allegro.ai/docs/task.html#trains.task.Task.execute_remotely
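A minimal usage sketch of that method (the project and queue names are placeholders):

```python
def main():
    from clearml import Task  # `from trains import Task` on older installs

    task = Task.init(project_name="examples", task_name="remote demo")
    # Stops the local run, enqueues this task for an agent, and the
    # script resumes from this point on the remote machine.
    task.execute_remotely(queue_name="default", exit_process=True)

    # anything below only runs on the agent's machine
    print("running remotely")
```

Call main() from your script's entry point; locally it exits right after enqueuing.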
So just to be clear - the file server has nothing to do with the storage?
AgitatedDove14 permanent. I want to start with a CLI interface that allows me to add users to the trains server
Gotcha, I didn't think of an external server, since Service Containers are part of GitHub's offering; I'll consider that
I might; I'll look at the internals later, because at a glance I didn't really get the logic inside get_local_copy ... there's an if branch ending with if ... not cached_file: return cached_file, which on a first read doesn't make much sense
is it possible to access the children tasks of the pipeline from the pipeline object?
I set it to true and restarted my agent
