WickedGoat98 Same for me, let me ask the UI guys, I think this is a UI bug.
Also maybe before you post the article we could release a fix to both, what do you think?
EDIT:
Never mind 🙂 I just saw the medium link, very cool!!!
function and just seem to be getting an "IsADirectory" error?
Can you post here what you are getting? Which clearml version are you using?
also tried manually adding
leap==0.4.1
in the task UI which didn't work.
That has to work, if it did not, can you send the log for the failed Task (or the Task that did not install it)?
The environment in the logs does show that leap is being installed potentially from a cache?
- leap @ file:///opt/keras-hannd...
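If the cached wheel keeps getting picked up, one thing worth trying (just a sketch, the version below mirrors the one you tried in the UI) is forcing the requirement from code with Task.add_requirements before Task.init:
```python
from clearml import Task

# Force a specific pip requirement for the remotely executed Task.
# Note: call add_requirements() *before* Task.init()
Task.add_requirements("leap", "0.4.1")  # version taken from the UI attempt above

task = Task.init(project_name="examples", task_name="leap requirement test")
```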
Hi BattyLion34
script_a.py generates file test.json in project folder
So let's assume "script_a" generates something and puts it under /tmp/my_data
Then it can create a dataset from the folder /tmp/my_data with Dataset.create() -> Dataset.sync -> Dataset.upload -> Dataset.finalize
See example: https://github.com/alguchg/clearml-demo/blob/main/process_dataset.py
Then "script_b" can get a copy of the dataset using "Dataset.get()", see examp...
Welp, it's been a day with the new settings, and stats went up 140K for API calls
... going to check again tomorrow to see if any of that was spill over from yesterday
140K calls a day, how often are you sending scalars? How long is it running? How many experiments are running?
The agent is using Bash (but when you add command line to the docker run, .bashrc is not executed, hence no conda
in PATH)
Maybe add the full path to the conda executable:
docker_setup_bash_script= ["export PATH=/workspace/miniconda/bin:$PATH", "export LOCAL_PYTHON=/workspace/miniconda/bin/python3", "/workspace/miniconda/bin/conda activate /PATH_GOES_HERE"]
Let's try:
` echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/docker-clean ; chown -R root /root/.cache/pip ; export DEBIAN_FRONTEND=noninteractive ; export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL libsm6 libxext6 libxrender-dev libglib2.0-0" ; [ ! -z $(which git) ] || export CLEARML_APT_INSTALL="$CLEARML_APT_INSTALL git" ; declare LOCAL_PYTHON ; for i in {10..5}; do which python3.$i && python3.$i -m pip --version && export LOCAL_PYTHON=$(which python3.$i) && b...
Hi SkinnyPanda43
On your local machine do not pass output_uri at all, so nothing will be uploaded.
In the agent's configuration file, set default_output_uri to the S3 bucket.
(Notice you can always override them in the UI, see the bottom of the execution Tab)
https://github.com/allegroai/clearml-agent/blob/e93384b99bdfd72a54cf2b68b3991b145b504b79/docs/clearml.conf#L312
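As a rough sketch (the bucket name is a placeholder): locally you call Task.init() without output_uri, and when the agent runs the same code its clearml.conf default_output_uri (or a destination set in the UI) decides where models go. You can still force it from code if you ever need to:
```python
from clearml import Task

# Local run: no output_uri passed, so nothing gets uploaded
task = Task.init(project_name="examples", task_name="train")

# One-off override from code (placeholder bucket), instead of the agent's default_output_uri:
# task = Task.init(project_name="examples", task_name="train",
#                  output_uri="s3://my-bucket/clearml")
```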
That is correct.
Obviously once it is in the system, you can just clone/edit/enqueue it.
Running it once is a means to populate the trains-server.
Makes sense?
DepressedChimpanzee34 any string serialization package I tried will convert r"some\blah" into "some\\blah" (json, yaml, hocon), otherwise you end up with \b as an escape character. I'm really not sure what to do here. (And reinventing the standard seems unhealthy)
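A quick illustration of what happens with the standard json module (minimal sketch):
```python
import json

s = r"some\blah"               # the raw string contains a single backslash
print(json.dumps(s))            # -> "some\\blah"  (the backslash gets escaped on serialization)
print(json.loads(json.dumps(s)) == s)  # -> True, the round-trip still preserves the value
```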
ETA for the next release is end of the month / early March; it is planned to include many other improvements 🙂
Hi @<1526371965655322624:profile|NuttyCamel41>
. I do that because I do not know how to get the pickle file into the docker container
What would the pickle file do?
and load the MinMaxScaler within the script, as the sklearn dependency is missing
What do you mean by that? Are you getting an error when loading your model?
Looking at the supervisor method of the base AutoScaler class, where are the worker IDs kept?
Is it in the class attribute queues?
Actually the supervisor passes a fixed prefix, then it asks the clearml-server for workers starting with this name.
This way we can have a fixed init script for all agents, while we can still differentiate them from the other agent instances in the system. Makes sense?
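For reference, the same lookup can be sketched from Python; this assumes the prefix is part of the worker id, and the prefix value below is just a placeholder (APIClient / workers.get_all from the clearml backend API):
```python
from clearml.backend_api.session.client import APIClient

client = APIClient()
prefix = "aws_autoscaler:"  # placeholder, the supervisor uses its own fixed prefix

# List all registered workers and keep the ones whose id starts with the prefix
workers = client.workers.get_all()
my_workers = [w for w in workers if w.id.startswith(prefix)]
print([w.id for w in my_workers])
```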
So clearml server already contains an authentication layer (JWT Token), and you do have a full user management on top:
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_config#web-login-authentication
Basically what I'm saying is: if you add HTTPS on top of the communication and only open the 3 ports, you should be good to go. Now if you really need SSO (AD included) for user login etc., unfortunately this is not part of the open source, but I know they have it in the scale/ent...
Hi MagnificentSeaurchin79
Yes this is a bit confusing 🙂
Datasets are stored as delta changes from parent versions.
A dataset contains a list of files and a list of artifacts where these files exist. This means that when we add a new file to a dataset, we create a new dataset from a parent version, add a link to the file, and store a new artifact containing just the delta (i.e. the new file) relative to the parent version. When you delete a file you just remove the li...
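As a small sketch of what a delta version looks like in code (names and paths are placeholders; it assumes a finalized parent dataset already exists):
```python
from clearml import Dataset

# Get the parent version we want to build on top of
parent = Dataset.get(dataset_name="my_dataset", dataset_project="examples")

# Create a child version; only the delta is stored as a new artifact
child = Dataset.create(
    dataset_name="my_dataset",
    dataset_project="examples",
    parent_datasets=[parent.id],
)
child.add_files("/tmp/new_file.csv")    # new file -> goes into the child's delta artifact
child.remove_files("obsolete/old.csv")  # removal -> only drops the link, parent is untouched
child.upload()
child.finalize()
```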
Okay I found it, this is due to the fact the newer versions are sending the events/images in a subprocess (it used to be a thread).
The creation of the object is done on the main process, updating the file index (round-robin manner), but the check itself happens in the subprocess, which is not "aware" of the used indexes (i.e. it is always 0, hence when exceeding the history size, it skips it)
EnviousPanda91 notice that when passing these arguments to clearml-agent you are actually passing default args; if you want an additional argument to always be used, set the extra_docker_arguments
here:
https://github.com/allegroai/clearml-agent/blob/9eee213683252cd0bd19aae3f9b2c65939d75ac3/docs/clearml.conf#L170
Hi DeliciousBluewhale87
I think we had a docker that does exactly that, and then you would spin the docker as a k8s service , is this what you are referring to?
I see, you can manually do that with add steps, i.e.
for elem in map:
    pipeline.add_step(..., elem)
or you can do that with full logic:
from clearml import PipelineDecorator

@PipelineDecorator.component(...)
def square_num(num):
    return num**2

@PipelineDecorator.pipeline(...)
def map_flow(nums):
    res = []
    # This will run in parallel
    for num in nums:
        res.append(square_num(num))
    # this is where we actually wait for the results
    for r in res:
        print_nums(r)

map_flow([1, 2, 3, 5, 8, 13])
`...
it would be clearml-server's job to distribute to each user internally?
So you mean the user will never know their own S3 access credentials?
Are those credentials unique per user, or "hidden" once for all of them?
So the way it works, anything in the extra_docker_shell_script section is executed inside the container every time the container spins. I'm thinking that the extra_docker_shell_script will pull the environment file from an S3 bucket and apply all "secrets" (or the secrets are embedded into the startup bash script, like "export AWS_SECRET=abcdef"), that said this will not be on a per-user basis 🙂
Does that help?
Hi CrookedWalrus33
docker_setup_bash_script= ["export PATH=""/workspace/miniconda/bin:$PATH"])
Oh I think you are correct, this should do the trick:
docker_setup_bash_script= ["export PATH=/workspace/miniconda/bin:$PATH", "export LOCAL_PYTHON=/workspace/miniconda/bin/python3"]
This will make sure both agent and script execute on the same python
but to run a script inside a docker which already has the environment built in.
If this is already activated, the latest agent w...
RoughTiger69
Apparently, ... doesn't populate that dict with any keys that don't already exist in it.
Are you saying new entries are not added to the dict even if they are on the Task (i.e. only entries that already exist in the dict are populated)?
But you already have all the entries defined here:
https://github.com/allegroai/clearml/blob/721569bb77d89d89e5b4f32a0ed98311c4574650/examples/services/aws-autoscaler/aws_autoscaler.py#L22
Since all this is ha...
Using agent v1.01r1 in k8s glue.
I think a fix was recently committed, let me check it
MysteriousBee56 and please this one: "when you run the trains-agent with --foreground, before it starts the docker it prints the full command line"
Thanks JuicyFox94 for letting us know.
I'm checking what's the status with it
Hi DepressedChimpanzee34
Why do you need to have the configuration added manually? Isn't the clearml.conf easier? If not, I think OS environment variables are easier, no? I ran the above code, everything worked with no exception/warning... What does the try/except solve exactly?
however when I clone or reset said task after completion and then enqueue it again, I get the above error.
This part is somewhat confusing... There is no magic happening behind the scenes, cloning a Task and creating it is basically the same... Do you have a reference to the YOLOv5 code base itself, maybe I can figure out what's the issue?
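For reference, the clone/enqueue flow can also be done from code; a minimal sketch (the task id and queue name are placeholders):
```python
from clearml import Task

# Get the completed task, clone it, and send the clone to an execution queue
original = Task.get_task(task_id="<task-id>")           # placeholder id
cloned = Task.clone(source_task=original, name="yolov5 clone")
Task.enqueue(cloned, queue_name="default")              # placeholder queue name
```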
No, an old experiment changed, nothing was rerun
ohh, that is odd. I think the max iteration value is stored on the DB, which is odd if it changed after an update.
BTW: just making sure, could it be these Tasks were imported? (i.e. offline execution + import)
Hi LazyFox65
So the idea is that you add two lines of code to your codebase:
from clearml import Task
task = Task.init(project_name='examples', task_name='change me')
And you run it once, then it will create the experiment, environment arguments etc.
Now that you have it in the UI you can clone / change all the fields and send for execution.
That said you can also create an experiment from CLI (basically pointing to a repo and entry point)
You can read here:
https://github.com/allegroa...