#3105 SSSD can't parse GPO if the AD server uses the Russian language
Closed: cloned-to-github 3 years ago by pbrezina. Opened 7 years ago by slavon.

(Sat Jul 23 03:00:29 2016) [[sssd[gpo_child[4579]]]] [ad_gpo_parse_ini_file] (0x0400): ini_filename:/var/lib/sss/gpo_cache/open-bs.local/Policies/{867F611D-CF79-4E4E-83F7-DF3EA3A36D05}/GPT.INI
(Sat Jul 23 03:00:29 2016) [[sssd[gpo_child[4579]]]] [ad_gpo_parse_ini_file] (0x0020): ini_config_file_open failed [84][Invalid or incomplete multibyte or wide character]
(Sat Jul 23 03:00:29 2016) [[sssd[gpo_child[4579]]]] [ad_gpo_parse_ini_file] (0x0020): Error encountered: 84.
(Sat Jul 23 03:00:29 2016) [[sssd[gpo_child[4579]]]] [perform_smb_operations] (0x0020): Cannot parse ini file: [84][Invalid or incomplete multibyte or wide character]
(Sat Jul 23 03:00:29 2016) [[sssd[gpo_child[4579]]]] [main] (0x0020): perform_smb_operations failed.[84][Invalid or incomplete multibyte or wide character].
(Sat Jul 23 03:00:29 2016) [[sssd[gpo_child[4579]]]] [main] (0x0020): gpo_child failed!


GPT.INI contains the line
displayName = TEXT
where TEXT is encoded in the CP1251 code page.

Please attach the test file that fails.

If it has sensitive information please send it directly to dpal@redhat.com.

GPO - ini_config_file_open failed [84][Invalid or incomplete multibyte or wide character]
GPT.INI

Done.

'displayName' is autogenerated and can't be changed in the GPO UI. The English translation is "New Group Policy Object".

I will try to find time to look into the code and the issue this week. However, I suspect that this is not a proper UTF-8 file: it uses double-byte characters without a BOM, and the BOM is the indicator that a file contains double-byte characters. As a result, the iconv function tries to do a UTF-8 to UTF-8 conversion (for validation purposes) and fails. I suspect the way to go would be to suppress the check, but I need to see whether and how that is possible.

Russian Windows uses CP1251 encoding in all ini files.
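
A small Python illustration of the failure mode described above (not SSSD or libini code): CP1251 bytes without a BOM fail strict UTF-8 validation, which is the same class of error as the EILSEQ (84 on Linux, "Invalid or incomplete multibyte or wide character") seen in the log. The Russian display name and the file layout used here are assumptions made for the example.

```python
# Hypothetical GPT.INI content as a Russian-language server might write it.
# The exact display name is an assumption for this illustration.
DISPLAY_NAME = "Новый объект групповой политики"
GPT_INI_CP1251 = (
    "[General]\r\ndisplayName=" + DISPLAY_NAME + "\r\nVersion=1\r\n"
).encode("cp1251")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The CP1251 bytes are rejected when treated as UTF-8 ...
cp1251_ok = is_valid_utf8(GPT_INI_CP1251)
# ... while the same text re-encoded as UTF-8 passes.
utf8_ok = is_valid_utf8(GPT_INI_CP1251.decode("cp1251").encode("utf-8"))
```

Here `cp1251_ok` is `False` and `utf8_ok` is `True`, which matches the observation that the parser only fails when the value contains non-ASCII CP1251 text.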

I took a look at the issue.

Problem:
If there is no BOM, the converter assumes that the file is in UTF-8.

Solution:

We can't use the local locale, especially in the case of GPO, since the GPO files are delivered from a different machine. I also do not think we can just ignore the characters that do not convert; this can lead to unpredictable results.
We need to allow the file to be in different encodings and thus allow the application to set a "default" encoding.

In INI:

In ini_config_priv.h
In struct ini_cfgfile
add a field that will be a pointer to string.

In ini_fileobj.c  
In functions:
 ini_config_file_open
 ini_config_file_from_mem
Initialize this new member to NULL
In function
ini_config_file_reopen
strdup the value that exists in the originating structure
If dup fails - error out

In ini_config_file_destroy check that this new member is not NULL
and free it before freeing the structure.

Add two new functions:
(make sure they are added to the public header with all the comments and symbols list)
ini_config_file_set_default_encoding(struct ini_cfgfile *file_ctx, char *encoding);
const char *ini_config_file_get_default_encoding(struct ini_cfgfile *file_ctx);
These functions will set and return the encoding string from the file context structure. 
The setting function will dup the input if it is not NULL.

Change function common_file_convert() to accept another parameter: an encoding string.
It will be taken from the member of the struct defined above.

Pass this encoding string to initialize_conv() function in a new argument.
Rewrite check_bom() function to return an indicator that there was 
no BOM in the file. 
Inside initialize_conv() after calling check_bom() before calling iconv_open() 
check the indicator.
If there was no BOM and the default encoding string is set, then use the
default encoding string that is passed in as the "from" encoding (second argument of iconv_open()).
Otherwise use what is already there.
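
The selection logic in the steps above can be sketched in Python as follows. This is only an illustration of the proposed behaviour, not libini code: the function names mirror `check_bom()` and the conversion step, while the `default_encoding` parameter stands in for the proposed new struct member and is hypothetical.

```python
import codecs

# UTF-32 must be checked before UTF-16 because the UTF-32-LE BOM
# starts with the same two bytes as the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def check_bom(data: bytes):
    """Return (encoding, bom_length); encoding is None if there is no BOM."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return None, 0


def convert_to_utf8(data: bytes, default_encoding=None) -> bytes:
    """Convert data to UTF-8, honouring a BOM first, then the default."""
    encoding, skip = check_bom(data)
    if encoding is None:
        # No BOM: use the application-supplied default if it is set,
        # otherwise keep the existing behaviour and assume UTF-8.
        encoding = default_encoding or "utf-8"
    return data[skip:].decode(encoding).encode("utf-8")
```

With this shape, a file that carries a BOM is never reinterpreted using the default, so the default only affects files that would otherwise be assumed to be UTF-8.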

I do not know if we need to do anything with the saving functionality. I doubt it; at the least, that can be a separate thing tracked by a separate ticket.
I will be glad to review a patch.

In SSSD we need to have a configuration value in sssd.conf that will define the encoding of the GPO files. Use that in SSSD when we allocate the file context: allocate the context and set the value using the new set function added above.

CC-ing Guenther..

Guenther, do you know if Samba already handles the different encodings somehow?

cc: => gc

Fields changed

cc: gc => gd

It occurred to me that we can solve things differently outside of INI.
We can do it in SSSD directly. AFAIK, in the GPO design the GPO data is fetched from AD and then stored in a memory buffer that is currently passed to INI.

For example we can add a setting to sssd.conf: gpo_code_page = <codepage>
If this value is set then SSSD code will do the conversion of the memory buffer from this code page to UTF8 before passing it to INI.
The function inside INI that does the conversion can be used as an inspiration for an SSSD function to perform the conversion of the GPO memory buffer before passing it to INI.

I think a setting for this is overhead.

Variants:
1. Get codepage info from AD/LDAP
2. Use an autoconverter to the locale, like iconv
3. Grep only the needed parameter lines from the ini
4. Replace bad values with BAD_VALUE

None of these variants needs additional settings.

Hmm, how would one go about retrieving the codepage from AD?

I am not sure that code page for the user (a preference) and the code page of the GPO file would be the same.
I suspect that GPO is encoded in the code page of the server or expected code page of the client.

I do not know how it is done or whether it is done in AD at all. The locale does not help either. I think we can try to detect the encoding of the buffer as iconv does, but it will be more work and I have not found any good example of how to do it in code. Everything just points to the fact that the iconv utility can do it. But this seems to be a bit of dark magic.

The third option would be possible, but then we would need to change the code to not do the conversion in some cases. It is unclear (since INI is a generic library) in which cases the conversion is desired and in which cases it is not. That would lead to pretty much the same amount of code that I proposed above.

Same with the last option.

The issue is that INI converts the whole buffer even before it started parsing so options 3 & 4 would not work until we manage to suppress conversion and this suppression code would be quite close to what I proposed originally.

That said I agree that if we can find a solution that does not require a configuration option we should use it.

IMO we should document what encodings SSSD can work with.

The problem with the new SSSD option is that nothing guarantees that all GPOs are encoded in the same way. This would be problematic especially if we want to convert the buffer before we pass it to libini functions, because we could damage valid UTF (that we can detect and work with) and create a mess (that we cannot do anything with).
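
The risk mentioned here can be shown with a short Python sketch. It assumes a hypothetical `convert_gpo_buffer()` helper that unconditionally reinterprets the buffer in the configured code page, which is one possible reading of the `gpo_code_page` proposal rather than actual SSSD code.

```python
def convert_gpo_buffer(buf: bytes, gpo_code_page: str) -> bytes:
    """Unconditionally treat buf as gpo_code_page and re-encode as UTF-8."""
    return buf.decode(gpo_code_page).encode("utf-8")


TEXT = "displayName=Тест\n"
legacy_gpo = TEXT.encode("cp1251")   # GPO written in the legacy code page
utf8_gpo = TEXT.encode("utf-8")      # GPO that is already valid UTF-8

# The legacy file is converted correctly.
converted = convert_gpo_buffer(legacy_gpo, "cp1251")

# The UTF-8 file is silently damaged: every byte of each multibyte
# sequence is mapped to a separate CP1251 character, and no error is raised.
damaged = convert_gpo_buffer(utf8_gpo, "cp1251")
```

Because the second conversion succeeds without any error, the damage would only become visible later, when the value is used.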

We could add an option to change the default 'fallback' encoding that will be used in case libini can not detect type of UTF used (currently we fallback to UTF-8). This will require changes in libini (as Dmitri described), but we could change the default to whatever iconv understands.

Would it help if we had an sssctl command (or some other tool) that detects GPOs on the server with problematic encodings? The intermediate step would be to check if all necessary GPOs and their attributes in AD are readable by the client machine (we already had a user with permission issues, and it was a server-side misconfiguration).

Then we would print a report that AD admins can use to change the encoding or permissions for specific GPOs. Is this an acceptable compromise to avoid adding a new option and changes to libini?

I think the proper way to solve this problem is to have a setting in sssd.conf that would specify the expected encoding of the GPO files to match what people have on the server. I do not think asking people to change the encoding of the GPO files on the server is a legitimate solution. I do not even know if a Russian (or any other language) Windows server has an option to control it. Having a tool that would show what encoding a GPO has would help, but in most cases it will be possible to determine it based on the main default code page of the server.

Lukas thinks we should fall back to ANSI if the UTF-8 parsing (which is already the fallback) fails. I think that is not bad, but it will not work for all possible encodings. We can still do it as an enhanced fallback IMO.

I think adding the new option as Dmitri suggested will be a good solution to cover the cases where falling back to ansi will not be sufficient. Having just the enhanced fallback is not enough IMO.
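
A rough Python sketch of the enhanced fallback discussed above, and of why it is not sufficient on its own. It assumes that "ANSI" means a single-byte Windows code page such as CP1252; that reading, and the `decode_with_fallback()` helper itself, are assumptions for illustration.

```python
def decode_with_fallback(buf: bytes, fallbacks=("cp1252",)):
    """Try strict UTF-8 first, then each fallback encoding in order.

    Returns (text, encoding_used).
    """
    try:
        return buf.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        pass
    for encoding in fallbacks:
        try:
            return buf.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("buffer is not valid in any configured encoding")


cp1251_buf = "displayName=Тест\n".encode("cp1251")

# Falling back to CP1252 makes parsing succeed, but the value is wrong,
# because the bytes are interpreted as Western European characters.
wrong_text, wrong_enc = decode_with_fallback(cp1251_buf)

# With the expected encoding supplied (the role of the proposed option),
# the value is recovered correctly.
right_text, right_enc = decode_with_fallback(cp1251_buf, fallbacks=("cp1251",))
```

The fallback alone lets the file be parsed, which may be enough when only ASCII values such as the version are needed, while the configurable encoding is what recovers the actual text.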

OK, I'm filing this ticket to the sssd-1.16 bucket (the next milestone to be triaged, honestly..)

I also filed and deferred ticket #3196, which tracks the option. If the recovery method isn't successful, we can revive that ticket from Deferred.

milestone: NEEDS_TRIAGE => SSSD 1.16 beta

Fields changed

rhbz: => todo

Metadata Update from @slavon:
- Issue set to the milestone: SSSD Future releases (no date set yet)

7 years ago

If I understood the code in common_file_convert() correctly, we know where we are in the input stream when the error occurs because src points to the offending byte. So we can check the preceding bytes to see whether there is an '=' character before a '\n' character, in which case we can assume that the error happens in a value and not in a key. Since MS-GPOL says "The gpt.ini file MUST be encoded in UTF-8 ...", I think we can safely assume that if the offending character is in a key, the file is corrupted.

It looks like for values the UTF-8 restriction is a bit more relaxed, especially for the displayName. Unfortunately MS-GPOL does not say anything about the allowed or expected encodings or codepages here. My feeling is that the specific code page is only used for certain values, e.g. displayName, and the remainder of the file is encoded in UTF-8 as required by MS-GPOL. So it would not make sense to convert the whole file based on this code page, because then we might fail on UTF-8 multibyte characters used in other areas of the file.

My suggestion would be to copy the value with the offending byte verbatim to the output stream manually and let iconv continue starting with the next key. This should of course only happen if ini_config_file_open or similar is called with a specific flag.
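
The idea of being strict for keys and lenient for values can be sketched in Python as below. This is a simplified, line-based illustration, not the streaming approach inside common_file_convert(); the `parse_lenient()` name and its return shape are hypothetical.

```python
def parse_lenient(buf: bytes) -> dict:
    """Parse key=value lines; keys must be UTF-8, values may be raw bytes.

    A value that is valid UTF-8 is returned as str. A value that is not
    is kept verbatim as bytes instead of failing the whole file.
    """
    result = {}
    for raw_line in buf.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and section headers.
        if not line or line[:1] in (b";", b"#", b"["):
            continue
        key, sep, value = line.partition(b"=")
        if not sep:
            raise ValueError("line without '=': %r" % (line,))
        try:
            key_text = key.strip().decode("utf-8")
        except UnicodeDecodeError:
            # An offending byte in a key means the file is corrupted.
            raise ValueError("invalid UTF-8 in a key")
        value = value.strip()
        try:
            result[key_text] = value.decode("utf-8")
        except UnicodeDecodeError:
            result[key_text] = value  # copied verbatim
    return result
```

With a CP1251 displayName, the version can still be read as text while the display name is preserved untouched for callers that know how to interpret it.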

Metadata Update from @sbose:
- Custom field design_review reset (from 0)
- Custom field mark reset (from 0)
- Custom field patch reset (from 0)
- Custom field review reset (from 0)
- Custom field sensitive reset (from 0)
- Custom field testsupdated reset (from 0)
- Issue close_status updated to: None

5 years ago

Metadata Update from @jhrozek:
- Custom field rhbz adjusted to https://bugzilla.redhat.com/show_bug.cgi?id=1661055 (was: todo)

5 years ago

Metadata Update from @thalman:
- Custom field design_review reset (from false)
- Custom field mark reset (from false)
- Custom field patch reset (from false)
- Custom field review reset (from false)
- Custom field sensitive reset (from false)
- Custom field testsupdated reset (from false)
- Issue tagged with: bugzilla

4 years ago

SSSD is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in SSSD's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/SSSD/sssd/issues/4138

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @pbrezina:
- Issue close_status updated to: cloned-to-github
- Issue status updated to: Closed (was: Open)

3 years ago
