It seems to me that the source of the mismatch is the str(tuple())
Hmm DepressedChimpanzee34 my bad, it seems the loading is done via a YAML loader, but the dumping is straightforward str casting...
https://github.com/allegroai/clearml/blob/6e6271fb91f2aeb2aa7a13c6d07d4e635baaa670/clearml/backend_interface/task/task.py#L934
What would you expect to get? (BTW "value\blah" is not a valid string assignment in Python as there is no \b escape character, it should be "value\\blah" which translates into the text "value\blah")
AgitatedDove14 , in my use case, the strings are regular expressions. If the reported regular expression string changes when reported, it messes up my run.
I can't define the "legal regular expression space", off the top of my head. The expected behavior is reported_string == original_string..
No worries, I would love for us to come up with a nice solution 🙂
AgitatedDove14 , agree \ disagree?
DepressedChimpanzee34 \<character> will almost always be converted into \\<character> because otherwise it will not support \t or \n etc.
What I'm looking for here is some logic that will allow us not to break backwards compatibility on the one hand, but will still allow you to have something like a "first\second" entry.
WDYT? any ideas? (I really want to make sure we fix it as soon as possible)
AgitatedDove14 , I see, someone must have faced the issue of dumping regular expression strings in a tuple before?
I'll have a think and a look too, unfortunately not today
there seems to be additional logic beyond: str_value = str(value)
str('\.') -> '\\.'
however in the configuration I see: '\.' for non-nested strings
BTW:
In [4]: str('\.')
Out[4]: '\\.'
In [5]: str(('\.', ))
Out[5]: "('\\\\.',)"
This is just python str casting
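A minimal sketch of the str casting described above, using plain Python only (no ClearML calls; the variable names are illustrative). It shows that str() on a tuple renders each element with repr(), which appears to be where the doubled backslash comes from:

```python
pattern = r"\."  # two characters: a backslash and a dot

as_string = str(pattern)     # str() of a string leaves it unchanged
as_tuple = str((pattern,))   # str() of a tuple uses repr() for each element

print(as_string)                 # \.
print(as_tuple)                  # ('\\.',)
print(as_tuple.count("\\"))      # number of backslash characters in the tuple text
```

So the same regular expression is reported with one backslash on its own and with two backslashes once it sits inside a tuple.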
ohh actually I think I remember, when you connect a dictionary, the local dtype is used for the casting of the remote matching key (probably more nuanced)
It would be very very useful for my use case, and I believe a relatively popular use case in general for example when using regular expression configurations
Hi DepressedChimpanzee34
How do I reproduce the issue ?
What are we expecting to get there ?
Is that a Colab issue or hyper-parameter encoding issue ?
AgitatedDove14: "What I'm looking for here is some logic that will allow us not to break backwards compatibility on the one hand, but will still allow you to have something like a "first\second" entry. WDYT? any ideas? (I really want to make sure we fix it as soon as possible)"
Do you mean something like the dictionary structure?
I am not sure about a clean elegant solution yet.. but this patch does some of the job: str('(' + ', '.join([str(elm) for elm in tuple_value]) + ')')
I am actually curious about the way you cast it back to the original dtype of the individual elements.. because the patch above only solves the display issue as far as I understand
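To illustrate the concern about casting back, here is a hedged sketch comparing plain str() with the patch above. It uses ast.literal_eval purely as a stand-in parser for the comparison; it is not a claim about how ClearML actually restores values, and tuple_value is just an example input:

```python
import ast

tuple_value = (r"\.", 1)

# Plain str(): each element is rendered with repr(), so the backslash is doubled.
dumped = str(tuple_value)
# The patch from the message above: join str() of each element instead.
patched = "(" + ", ".join([str(elm) for elm in tuple_value]) + ")"

print(dumped)   # ('\\.', 1)
print(patched)  # (\., 1)

# The repr-style text can be parsed back into the original tuple.
restored = ast.literal_eval(dumped)
print(restored == tuple_value)

# The patched text drops the quotes, so it is no longer a parseable literal.
try:
    ast.literal_eval(patched)
    patched_parses = True
except (SyntaxError, ValueError):
    patched_parses = False
print(patched_parses)
```

In other words, the doubled backslash in the dumped text is what makes it reversible, while the patched form reads correctly but loses the information needed to rebuild the element types.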
It should preserve the order, as the order of the update back (i.e. when executed by the agent) is the same as the order of the keys (obviously py3.7+, because it creates dicts, not OrderedDicts)
I understand that to report any value should be presented as string, how does the "inverse casting" work when I pull some value from the config?
if preferred I can open a Github issue about this
so a different behavior between a string and a string in a tuple is by design? I find it confusing, I guess this is the YAML convention?
https://colab.research.google.com/drive/1w5lQGxsblnLGlhJEDH_b0aIiUvjGeLjy?usp=sharing
AgitatedDove14 , TimelyPenguin76 , a small blast from the past
Unfortunately it seems like this is not working for backslash escape character
https://demoapp.demo.clear.ml/projects/7eaa1749475d4ad4bd21a5456fd2e157/experiments/3efe981238e543c8b6ad682dd13c72bc/output/hyper-params/hyper-param/General
https://colab.research.google.com/drive/1w5lQGxsblnLGlhJEDH_b0aIiUvjGeLjy?usp=sharing
well, kind of, I linked the other topic, but it was completely unrelated
this topic is about the issue with reporting a configuration with a string inside a tuple that has backslash
DepressedChimpanzee34
so parsing back is done via a yaml reader:
https://github.com/allegroai/clearml/blob/49fcbd7bbf3236f4175cdff29fa951847b0923cc/clearml/backend_interface/task/args.py#L506
We could add an extra test here, checking for \ in the string; that should solve it and will be backwards compatible (I think)
https://github.com/allegroai/clearml/blob/49fcbd7bbf3236f4175cdff29fa951847b0923cc/clearml/backend_interface/task/task.py#L935
AgitatedDove14 , when I test using the yaml python package:
I see the following:
import yaml
yaml.dump({'\.': ('a', '\.')})
Out: '\\.: !!python/tuple\n- a\n- \\.\n'
YAML treats both strings in tuples and outside the same, however this is not the behavior you get in clearml task.connect
I am actually not sure specifically about \b myself, but even when replacing it with \. I am getting \\. (double backslash) instead of the single backslash (for the tuple case), which in the case of a regular expression changes the meaning of the expression. The expected behavior would be registering it with a single backslash
DepressedChimpanzee34 any string serialization package I tried will convert r"some\blah" into "some\\blah" (json yaml hocon) otherwise you end up with \b as an escape character. I'm really not sure what to do here. (And reinventing the standard seems unhealthy)
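As a small illustration of that point, here is a sketch using the standard-library json module (chosen only because it needs no extra install; the variable names are illustrative). The encoded text doubles the backslash, but decoding returns the original string:

```python
import json

original = r"some\blah"  # contains a single backslash

encoded = json.dumps(original)  # serialized text, backslash escaped
decoded = json.loads(encoded)   # parsed back into a Python string

print(encoded)              # "some\\blah"
print(decoded == original)
```

The doubling is therefore only visible in the serialized form; a full dump and load cycle gives back the single-backslash string.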
So the encoding itself is done YAML style, and based on your example \b has to be encoded to \\b because this is string encoding, like \n will become "new line"
Make sense ?
AgitatedDove14 , by the way, can you take a look at https://clearml.slack.com/archives/CTK20V944/p1625558368001600
maybe you'll have other ideas? at the moment it seems like a dead end
Sorry found the code on the Task, duh 🙂
# get_ipython().magic('pip install clearml')
import clearml
from clearml import Task

task = Task.init(project_name='examples', task_name='test param', reuse_last_task_id=False)
param = {
    'tuple_double_quotes_r': (r"value\blah", 1),
    'tuple_double_quotes': ("value\blah", 1),
    'tuple_single_quotes': ('value\blah', 1),
    "double_quotes_r": r"value\blah",
    'double_quotes': "value\blah",
    'single_quotes': 'value\blah'
}
task.connect(param)
print('done')
r"value\blah"
what are you expecting to get here ? '\b' is not a valid escaping (I think)
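For reference, a quick check in plain Python of how the two spellings from the snippet differ. In standard Python string literals \b is the backspace escape (\x08), so only the raw string keeps an actual backslash; the variable names are illustrative:

```python
plain = "value\blah"   # \b is interpreted as a single backspace character
raw = r"value\blah"    # raw string: backslash and 'b' are kept as two characters

print(len(plain), len(raw))       # 9 10
print(plain == "value\x08lah")    # the non-raw literal contains a backspace
print("\\" in plain, "\\" in raw) # only the raw string contains a backslash
```

This suggests that, of the six entries in the snippet, only the r"..." ones carry a backslash into task.connect at all; the others already hold a backspace before any reporting happens.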
I thought this is the issue on the thread you linked, did I miss something ?